Article

BiMPADR: A Deep Learning Framework for Predicting Adverse Drug Reactions in New Drugs

Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin 150081, China
* Authors to whom correspondence should be addressed.
Molecules 2024, 29(8), 1784; https://doi.org/10.3390/molecules29081784
Submission received: 22 March 2024 / Revised: 8 April 2024 / Accepted: 11 April 2024 / Published: 14 April 2024

Abstract

Detecting the unintended adverse reactions of drugs (ADRs) is a crucial concern in pharmacological research. The experimental validation of drug–ADR associations often entails expensive and time-consuming investigations. Thus, a computational model that predicts ADRs from known associations is essential for improving efficiency and cost-effectiveness. Here, we propose BiMPADR, a novel model that integrates drug gene expression into adverse reaction features using a message passing neural network on a bipartite graph of drugs and adverse reactions, leveraging publicly available data. By combining the computed adverse reaction features with the structural fingerprints of drugs, we predict the association between drugs and adverse reactions. Our models obtained high AUC (area under the receiver operating characteristic curve) values ranging from 0.861 to 0.907 on an external drug validation dataset under different experimental conditions. The case study on multiple BET inhibitors also demonstrated the high accuracy of our predictions, and our model’s exploration of potential adverse reactions for NHWD-870 has contributed to its research and development for market approval. In summary, our method provides a promising tool for ADR prediction and drug safety assessment in drug discovery and development.

1. Introduction

Adverse drug reactions (ADRs), according to the WHO, are any harmful or unintended responses to a medication occurring at normal doses used for disease prevention, diagnosis, or treatment [1]. ADRs pose a substantial challenge in contemporary drug discovery and are a major contributor to illness and mortality in healthcare [2]. They have been identified as the fourth leading cause of death in the United States, where nearly 100,000 fatalities each year are attributed to ADRs resulting from the use of medications at their recommended dosages [3,4,5]. ADRs also impose a significant financial burden on public health systems. Studies have shown that the incremental total cost per patient attributed to ADRs ranges from approximately EUR 702 to EUR 7318 [6,7]. Moreover, ADRs play a prominent role in the failure of drug research. Safety-related concerns are responsible for 35% of drug failures in Phase I and 28% in Phase II, significantly impacting the progression to the drug submission stage [8,9]. The identification of ADRs for numerous drugs often occurs several years after their market introduction. Each year, the FDA withdraws drugs from the market due to adverse effects, with prominent instances including Vioxx, Fen-Phen, and Rosiglitazone [9,10]. Hence, early evaluation of potential adverse drug reactions is vital to minimize health risks for participants and to reduce drug development costs.
The conventional approach to predicting ADRs typically entails pharmacological experiments or clinical observations. These processes require extensive in vitro screening and in vivo preclinical animal studies. Even though these methods are time-intensive and resource-heavy, numerous ADRs of novel drugs frequently remain undiscovered [11,12]. In recent years, there has been significant progress in the development of computational prediction methods, particularly deep learning techniques, for predicting adverse drug reactions using drug-related databases.
A commonly used group of methods for predicting adverse drug reactions treats the problem as the inference of missing connections within a bipartite network that links drugs and side effects. Cami et al. (2011) developed a model named PPNs (predictive pharmacosafety networks), which integrates the network structure formed by known adverse drug event (ADE) relationships with specific drug information and adverse event data to predict potential unidentified ADEs [13]. Zhang et al. (2016) investigated the prediction of potential drug side effects by utilizing two recommender methods and integrating their proposed approaches with existing methods to develop ensemble models [14]. Galeano et al. (2018) proposed a recommender system that predicts drug side effects for marketed drugs using collaborative filtering algorithms [15]. Lin et al. (2013) proposed a network-based external link prediction method that utilizes the neighborhood of a drug in a bipartite network to infer potential adverse drug reactions [16].
Another group of widely adopted methods employs multisource data to predict the associations between drugs and adverse reactions. Yamanishi et al. (2012) presented a drug side effect prediction approach that integrates chemical and biological spaces based on kernel regression models [17]. Liu et al. (2012) utilized five machine learning algorithms for predicting adverse drug reactions by leveraging the chemical, biological, and phenotypic properties of drugs [18]. Zhang et al. (2015) proposed a feature selection-based multi-label k-nearest neighbor method, which adopts ensemble learning techniques to combine various drug-related features [19]. Ding et al. (2018) identified drug–side effect associations using a combination of a semi-supervised model and multiple kernel learning. Their approach enabled the integration of multiple sources of drug-related information, including the known relationships between drugs and side effect terms [20].
Although previous methods have yielded promising predictive outcomes, they encounter challenges when applied to new drugs with limited pre-existing information. Specifically, approaches that rely on known neighbor nodes in the constructed heterogeneous graph cannot predict the potential ADRs of a drug that has no known neighbors. Moreover, the early stages of drug development mainly offer information on the chemical structure of the drug candidate, so certain biological information cannot be incorporated into the prediction model. Consequently, these methods do not provide prediction frameworks suitable for new drug molecules.
Several methods have also been developed specifically for predicting the adverse reactions of new drugs. Pauwels et al. (2011) employed a sparse canonical correlation analysis model that relied on chemical structures to predict potential drug side effects [21]. Niu et al. (2015) developed a web service called DSEP, which utilizes chemical substructures to predict potential ADRs without relying on other factors [22]. Dimitri et al. (2017) introduced DrugClust, a method that clusters drugs based on their features and subsequently predicts side effects using Bayesian scores [23]. Xuan et al. (2022) explored the effective utilization of graph structures and attribute information in drug-related data for predicting drug side effects. By considering the relationships between drugs, drug features, and side effect labels, they proposed a novel approach to enhance the accuracy of side effect prediction [24].
However, these methods exhibit limitations, including the random allocation of drug–adverse reaction pairs into training and testing sets. This approach leads to the inadvertent use of information from test set drugs during training and a deficiency in external validation. Furthermore, these methods have not fully utilized the potential of drug gene expression profile data. Some studies indicate that drug-induced alterations in gene expression may contribute to systemic off-target effects and subsequent adverse effects [25,26,27,28]. This highlights the potential significance of transcriptomic data, where alterations in gene expression can act as early markers of toxicity. These changes are frequently detectable before the appearance of histopathological or clinical signs, offering crucial insights into drug adverse reactions [29].
To overcome the limitations of the previously mentioned methods, we propose BiMPADR, a deep learning framework designed for predicting adverse drug reactions (ADRs) in new drugs. We hypothesized that compounds with similar structures are likely to elicit analogous adverse reactions and that differences in drug-induced gene expression can lead to different adverse reactions. Our framework incorporates a bipartite network-based message passing neural network that integrates the drug expression signatures related to each ADR into that ADR’s feature representation. These features are subsequently merged with compound structural data, represented by fingerprints, and a fully connected neural network is used to predict the associations between drugs and ADRs. Extensive evaluations on various representative datasets confirm the high accuracy of our method. Furthermore, the performance on external validation data demonstrates the utility of our model as a tool for predicting ADRs in new drugs.

2. Results and Discussion

2.1. Performance on Different Datasets

We present all the results of our model in Table 1, which includes the performance on the training set, test set, and external validation dataset. Regardless of the fingerprint used, the model demonstrates stable and satisfactory predictive performance across all four datasets. On the external validation dataset, the AUC exceeds 0.85. The Precision of the model on the test set ranges from 0.785 to 0.855. In purely external validation, the Precision drops because this part of the data is extremely imbalanced. However, the AUC reflects the overall performance of the classifier across all thresholds, not just the performance at a single threshold. Therefore, the AUC remains relatively high even when the Precision is low, indicating that the model still ranks minority-class (positive) pairs above majority-class (negative) pairs well, so the lower Precision does not compromise the usefulness of our model in clinical application.
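The relationship between Precision and class imbalance can be made explicit with a short derivation. The sketch below assumes a fixed decision threshold with true-positive rate TPR and false-positive rate FPR; the numeric values are purely illustrative and are not taken from Table 1.

```latex
\mathrm{Precision}
  = \frac{TP}{TP + FP}
  = \frac{\mathrm{TPR}\cdot P}{\mathrm{TPR}\cdot P + \mathrm{FPR}\cdot N}
  = \frac{\mathrm{TPR}}{\mathrm{TPR} + \mathrm{FPR}\cdot (N/P)}

% Illustrative values: TPR = 0.8, FPR = 0.1
% Balanced data,   N/P = 1:  Precision = 0.8 / (0.8 + 0.1)  \approx 0.889
% Imbalanced data, N/P = 50: Precision = 0.8 / (0.8 + 5.0)  \approx 0.138
```

Because the ROC curve is built only from (FPR, TPR) pairs, the AUC is unchanged when the ratio N/P grows, whereas Precision falls even though the classifier itself is the same.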
To further explore the factors influencing the model’s performance and its applicability range, we depict the results of the model under different input conditions (AUC on the external validation dataset) using a box plot in Figure 1. The following results can be derived from the analysis:

2.1.1. Performance on Different Fingerprints

Different types of drug fingerprints may have different calculation methods and thus different representational capabilities. Based on the results shown in Figure 1A, we observed that the choice of different compound fingerprints as drug structural features during model training did not significantly impact the model’s performance. Therefore, we can conclude that the widely applied fingerprints that represent compound structural features can be effectively utilized in our model without excessive consideration of specific fingerprint selection or conversion. This finding also highlights the robustness of our model in handling diverse types of compound data.

2.1.2. Performance on Different GE

Accurate prediction results can be obtained regardless of the type of cell line used for modeling, but the narrower box for normal cell lines in Figure 1B indicates greater stability in the results. This suggests that certain gene perturbations after drug treatment may lead to adverse reactions and that these perturbations are relatively similar between normal and tumor cell lines. Therefore, in the absence of gene expression data from normal cell lines, gene perturbation data from tumor cell lines can also be used in adverse reaction prediction research.

2.1.3. Performance on ADR Selection

When we selected all adverse reactions from SIDER, the AUC was above 0.9, while choosing adverse reactions that appeared in the ADReCS dataset resulted in an AUC of around 0.86 (Figure 1C). One possible reason for this result could be that there is less association between the adverse reactions provided by ADReCS and the 978 core landmark genes, with most associations being filled with zeros. Another reason could be that constructing a dataset by directly selecting all adverse reactions from SIDER provides more drug–adverse reaction pairs, a larger sample size, and a better fitting of the model. Whether the initial information related to adverse reaction genes contributes to the prediction needs to be further explored through ablation experiments.

2.2. Ablation Study

We conducted ablation experiments to explore how the choice of initial information related to adverse reactions and the use of the MPNN module affect the predictive performance of the model. Since the choice of compound fingerprint had a minimal impact on the model, we did not consider the role of fingerprints in this part of the study.
To explore whether using ADR–gene association information as the initial input feature can improve the model’s performance, we conducted two variant studies:
  • The first variant involved replacing the initial feature vectors of adverse reactions with zero vectors, completely excluding the use of ADR–gene association information.
  • The second variant maintained the same input as the original model but used this information only when computing the attention coefficients during information propagation on the bipartite network, without incorporating the initial adverse reaction features in the update function, i.e., $h_{v_j} = \mathrm{ReLU}(m_{v_j})$. The difference lies in the self-loop, which is enabled (TRUE) in the original method and disabled (FALSE) in the ablation experiment.
Table 2 and Table 3 present the results of the two ablation experiments in the external dataset, and Figure 2 provides a comparison between our method and the results of the ablation experiments. From Figure 2A, it can be observed that replacing the original features with zero vectors did not significantly degrade the model’s performance. However, the AUC values fluctuated more, and the stability slightly decreased under different conditions. Figure 2B also demonstrates a similar trend, but when the sample size is sufficiently large, such as when training the model using the GEn-SIDER and GEt-SIDER datasets, the impact of adding self-loops is not substantial. Therefore, we can infer that the adverse reaction–gene association information obtained from the ADReCS database can improve the predictive accuracy and stability of the model to some extent. However, when a particular adverse reaction does not exist in that database and we still want to understand its likelihood of occurrence, we can use a zero feature vector as its input in the model.
In order to investigate whether the MPNN module effectively utilizes the gene expression information of drugs and its impact on model performance, we directly concatenated the compound structure features with the adverse reaction–gene association features and used a fully connected neural network (FCNN) for prediction. From Table 4 and Figure 3, it can be observed that the predictive performance of the model significantly decreases without utilizing the MPNN module to integrate the gene expression information of drugs into the adverse reaction features. Additionally, compared to the original method, using a dataset constructed with all adverse reactions from the SIDER database, although having a larger sample size, yields poorer prediction results. This experiment demonstrates the crucial role of drug-induced cell line gene expression information in predicting associations between drugs and adverse reactions. Furthermore, the information integration method used in our model effectively utilizes the relevant information.

2.3. Performance of BiMPADR Compared with State-of-the-Art Methods

To ensure comparability between models, we select existing methods that can predict adverse reactions based solely on compound structure, including Pauwels’s method (SCCA) [21] and DrugClust [23]. These two comparison methods and the predictive performance of our model are shown in Table 5.
In a comprehensive comparison, the AUC value of the SCCA algorithm is above 0.89, slightly higher than the 0.86 of BiMPADR, but its ACC value is only about 0.5, which is far lower than that of our model. The accuracy of that model is also low, with a minimum of 0.38. The AUC value of the DrugClust algorithm is about 0.6, which is much lower than that of the other two methods. Although its Precision is relatively high, we pay more attention to the AUC, which reflects the ranking ability that matters in clinical practice. We randomly selected 50 drugs and 50 adverse reactions from the predicted values of each method on the GEn-SIDER dataset to draw heat maps, and the results are shown in Figure 4. As can be seen from the figure, the SCCA and DrugClust predictions contain multiple identical rows. This reflects a major drawback of the two comparison models: multiple drugs often share the same prediction vector, and the predictions for a given adverse reaction may be identical across multiple drugs, which greatly reduces the practicality of these models in clinical research.
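The drawback visible in the heat maps, namely several drugs receiving the same prediction vector, can also be checked numerically. The following is a minimal sketch; the function name, the rounding tolerance, and the toy scores are hypothetical and are not outputs of any of the compared methods.

```python
from collections import Counter
from typing import Dict, List


def count_drugs_with_shared_predictions(
    predictions: Dict[str, List[float]], ndigits: int = 6
) -> int:
    """Count drugs whose prediction vector is identical to that of another drug.

    Scores are rounded to `ndigits` decimals (an assumed tolerance) before comparison.
    """
    keys = {
        drug: tuple(round(score, ndigits) for score in row)
        for drug, row in predictions.items()
    }
    counts = Counter(keys.values())
    return sum(1 for key in keys.values() if counts[key] > 1)


# Hypothetical toy predictions: rows are drugs, columns are adverse reactions.
toy_predictions = {
    "drug_1": [0.10, 0.90],
    "drug_2": [0.10, 0.90],  # same vector as drug_1
    "drug_3": [0.30, 0.20],
}
n_shared = count_drugs_with_shared_predictions(toy_predictions)
```

A value close to the total number of drugs would indicate that the model barely distinguishes between compounds.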

2.4. Case Study

We performed a case study to evaluate the accuracy of our model’s novel predictions by conducting a literature-based assessment of the newly identified associations. NHWD-870 [30] is a novel and potent BET inhibitor intended for the treatment of various solid tumors. We used the best-performing model to predict the adverse reactions of NHWD-870 and nine other BET inhibitors that have undergone Phase I/II clinical trials: Alobresib [31], INCB0576543 [32], Mivebresib [33], Pelabresib [34], Birabresib [35], Molibresib [36], TEN010 [37], PLX51107 [38], and BMS-986158 [39]. The selected drugs were not present in our modeling dataset. The complete prediction results can be found in Supplementary Section S1. Figure 5 shows the number of adverse drug reactions with predicted values higher than 0.99. From the graph, it can be observed that NHWD-870 is associated with relatively few adverse reactions, and with fewer than BMS-986158.
We present the top ten adverse reactions for each drug and validate the accuracy of our predictions against publicly available clinical trial results from the NIH (https://ncbi.nlm.nih.gov/, accessed on 12 December 2023). Additionally, the adverse reactions of the blood and lymphatic systems recorded by the NIH are important factors that affect the development and application of BET inhibitors. Therefore, we discuss the predicted values obtained through our model for the blood and lymphatic system-related adverse reactions documented by the NIH. The results for BMS-986158 [39], the drug most similar to NHWD-870, are shown below. Other detailed results supported by NIH evidence can be found in Supplementary Section S2.
From Table 6, it can be observed that for BMS-986158, almost all of the predicted top ten adverse reactions were found among the adverse events of the corresponding clinical reports. The model also predicts that BMS-986158 may lead to rhabdomyolysis, although no supporting literature has been found. Regarding BMS-986158 potentially causing hyperlipidemia, relevant research suggests that the BET inhibitor Apabetalone can increase HDL-C, which contradicts our predicted results. We therefore used our model to calculate the association score between Apabetalone and hyperlipidemia, which was 0.46. Consequently, BMS-986158 may carry a higher cardiovascular risk than other BET inhibitors. From Table 7, the adverse reactions related to the blood and lymphatic systems also had predicted values mostly exceeding 0.5, with some above 0.9.
Since NHWD-870 is a structural modification of BMS-986158, we provide an overview of the adverse reactions produced by these two drugs in different organ systems, as shown in Figure 6 (results for other drugs can be found in Supplementary Section S2). The more clustered the points are at the top, the more likely the drug is to generate a greater number of adverse reactions within that system. It can be observed that NHWD-870 exhibits fewer adverse reactions in the blood and lymphatic system than BMS-986158. However, it may potentially cause more adverse reactions in the liver and renal system.
For NHWD-870, we selected adverse reactions with predicted values > 0.99 and created the association network shown in Figure 7 using the software ‘Cytoscape 3.6.1’. According to our predictions, NHWD-870 is associated with common blood and lymphatic system disorders, such as Anemia, Thrombocytopenia, Coagulopathy, Neutropenia, and Leukopenia. It may also cause other severe adverse reactions in different systems, such as Acute Renal Failure, Upper Respiratory Tract Infection, and Hypertension.

3. Materials and Methods

3.1. Datasets

In this study, we use four types of data sources: (1) ground truth for drug–ADR pair labels, (2) gene expression profiling of the compounds (GE), (3) the chemical structure of the compounds (CS), and (4) ADR–gene associations (AS).
We obtained the ADR labels from the SIDER 4.1 database [40], which includes data on marketed medications and their reported ADRs obtained from public documents. SIDER 4.1 contains approximately 1430 drugs, 5868 ADRs, and 139,756 drug–ADR associations. The MedDRA concept type was used to specify ADR terms and phrases. The preferred term (PT) level in SIDER was used as the standard ADR vocabulary to avoid semantic redundancy.
The Library of Integrated Network-based Cellular Signatures (LINCS) database has a large collection of gene expression profiles that show how different human cell lines respond to 20,413 compounds at the transcriptomic level [41,42]. Considering that adverse reactions often occur within the normal organs of the human body, we categorized the expression data of drugs into perturbations in normal/primary cell lines and tumor cell lines, named GEn and GEt in our research. To avoid information redundancy, we selected the strongest signatures for each drug, irrespective of the cell type, dosage, or time point, utilizing level 5 data. The signatures for the 978 directly measured landmark genes were selected in this study.
The 2D chemical structures of small-molecule compounds are represented in the SMILES format. SMILES strings for marketed drugs were collected from PubChem [43] using PubChem Compound IDs from SIDER. Drug chemical structures were mapped to three types of fingerprints: PubChem, MACCS, and ECFP using the PyBioMed [44] Python library. PubChem fingerprints consist of 881 chemical substructures derived from the PubChem database. MACCS fingerprints consist of 166 structural keys representing molecular features. ECFP fingerprints capture local and global molecular features through atom neighborhood enumeration and hashing. The fingerprint size used here is 1024 bits.
The ADReCS-Target [45] database offers extensive information regarding ADRs resulting from drug interactions with proteins, genes, genetic variations, and gene–ADR associations. There are 1156 ADRs, 8571 genes, and 2,443,256 gene–ADR pairs included. We organized the associations between ADRs and the 978 landmark genes mentioned in the LINCS database into a binary profile. If an ADR–gene association was documented in the ADReCS-Target database, we marked that position as 1; otherwise, it was filled with 0.
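The binary ADR–gene profile described above can be sketched as follows. The function name and the toy gene and ADR identifiers are hypothetical; a real profile would have one column for each of the 978 landmark genes.

```python
from typing import Dict, List, Set


def build_adr_gene_profile(
    adr_to_genes: Dict[str, Set[str]], landmark_genes: List[str]
) -> Dict[str, List[int]]:
    """Return one 0/1 vector per ADR, with entries ordered like `landmark_genes`."""
    return {
        adr: [1 if gene in genes else 0 for gene in landmark_genes]
        for adr, genes in adr_to_genes.items()
    }


# Hypothetical toy input: three landmark genes instead of 978.
landmark_genes = ["GENE_A", "GENE_B", "GENE_C"]
adr_to_genes = {
    # GENE_X is documented but is not a landmark gene, so it is ignored.
    "ADR_1": {"GENE_A", "GENE_C", "GENE_X"},
    # No documented association: the row is filled with zeros.
    "ADR_2": set(),
}
profile = build_adr_gene_profile(adr_to_genes, landmark_genes)
```

An ADR without any documented landmark-gene association yields an all-zero vector, which is the same input the model receives for ADRs that are absent from the database.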
The drugs in SIDER that have perturbation profiles in the above two categories of cell lines (normal/primary and tumor) number 656 and 766, respectively (duplicates are avoided by using the unique drug IDs). Drugs in SIDER lacking gene expression information were used as external validation data. Only ADRs observed with at least one drug were included. The numbers of adverse reactions retained for these two sets of drugs are therefore 3616 and 3695, respectively, of which 751 and 762 are also recorded in the ADReCS-Target database. In the end, we obtained four datasets with varying numbers of drugs and adverse reactions (Figure 8 and Table 8).
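The dataset construction just described amounts to a few set operations. The sketch below is an illustration under stated assumptions: the function name and the toy identifiers are hypothetical, and the ADR filter is applied to the modeling drugs only.

```python
from typing import Dict, Set, Tuple


def split_drugs_and_filter_adrs(
    sider: Dict[str, Set[str]],  # drug ID -> set of reported ADR terms
    drugs_with_ge: Set[str],     # drug IDs that have a gene expression signature
) -> Tuple[Dict[str, Set[str]], Dict[str, Set[str]], Set[str]]:
    """Split SIDER drugs into modeling and external sets and collect the ADRs to keep."""
    modeling = {d: adrs for d, adrs in sider.items() if d in drugs_with_ge}
    external = {d: adrs for d, adrs in sider.items() if d not in drugs_with_ge}
    # Keep only ADRs observed with at least one modeling drug.
    kept_adrs: Set[str] = set().union(*modeling.values())
    return modeling, external, kept_adrs


# Hypothetical toy data.
toy_sider = {
    "D1": {"Anemia", "Nausea"},
    "D2": {"Nausea"},
    "D3": {"Rash"},  # no expression profile -> external validation
}
modeling, external, kept_adrs = split_drugs_and_filter_adrs(toy_sider, {"D1", "D2"})
```

Running the same split once with the drugs profiled in normal/primary cell lines and once with those profiled in tumor cell lines would give the two drug sets mentioned above.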

3.2. Methods

3.2.1. MPNNs

Message passing neural networks [46] (MPNNs) are a class of general frameworks used for supervised learning on graphs. They are commonly applied to undirected graphs, where node features are represented as $x_v$ and edge features as $e_{vw}$. The usage of such models primarily consists of two stages: the message passing stage and the readout stage. During the message passing stage, the model iteratively updates the hidden layer features of each node, using a message function $M_t$ and a vertex update function $U_t$, for a total of $T$ iterations. The updated hidden layer features $h_v^{t+1}$ of each node, based on the message $m_v^{t+1}$ and the previous hidden layer features $h_v^t$, can be expressed by the following formulas:
$$m_v^{t+1} = \sum_{w \in N(v)} M_t(h_v^t, h_w^t, e_{vw})$$
$$h_v^{t+1} = U_t(h_v^t, m_v^{t+1})$$
In the summation, $N(v)$ represents all neighboring nodes of the node $v$ in the graph. During the readout stage, a readout function $R$ is used to calculate a feature vector for the entire graph, according to the following formula:
$$\hat{y} = R(\{h_v^T \mid v \in G\})$$
The message functions $M_t$, vertex update functions $U_t$, and readout function $R$ are all learned differentiable functions. We can define these functions according to our purposes.
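To make the two stages concrete, the following minimal sketch runs one message passing iteration and a sum readout on a three-node graph. It is an illustration only: the learned functions are replaced by fixed hand-written ones, and all names and toy values are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def message_passing_step(
    h: Dict[str, Vector],
    edges: Dict[Tuple[str, str], Vector],
    message_fn: Callable[[Vector, Vector, Vector], Vector],
    update_fn: Callable[[Vector, Vector], Vector],
) -> Dict[str, Vector]:
    """One iteration: m_v = sum over neighbors w of M(h_v, h_w, e_vw); h_v' = U(h_v, m_v)."""
    new_h = {}
    for v, h_v in h.items():
        m_v = [0.0] * len(h_v)
        for (a, b), e_vw in edges.items():
            # Undirected graph: each stored edge is visible from both endpoints.
            if v == a:
                w = b
            elif v == b:
                w = a
            else:
                continue
            msg = message_fn(h_v, h[w], e_vw)
            m_v = [x + y for x, y in zip(m_v, msg)]
        new_h[v] = update_fn(h_v, m_v)
    return new_h


def sum_readout(h: Dict[str, Vector]) -> Vector:
    """Readout R: element-wise sum of the final node features."""
    dim = len(next(iter(h.values())))
    return [sum(vec[k] for vec in h.values()) for k in range(dim)]


# Fixed stand-ins for the learned functions (assumptions for illustration).
scale_neighbor = lambda h_v, h_w, e_vw: [e_vw[0] * x for x in h_w]
add_message = lambda h_v, m_v: [x + y for x, y in zip(h_v, m_v)]

h0 = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
toy_edges = {("a", "b"): [1.0], ("b", "c"): [0.5]}
h1 = message_passing_step(h0, toy_edges, scale_neighbor, add_message)
graph_vector = sum_readout(h1)
```

In a trained MPNN, `message_fn`, `update_fn`, and the readout would be differentiable modules with learned parameters rather than fixed lambdas.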

3.2.2. Overall Schema of the Deep Learning Network

In our study, we defined the task of predicting the association between drugs and adverse drug reactions (ADRs) as a binary classification problem. We extracted informative features from both drugs and ADRs and used these features to train the model to predict novel associations. Figure 9 shows the framework of our method. We generated the features of ADRs via MPNNs and obtained a latent representation of the drug fingerprints via fully connected layers. After processing both the drug and ADR layers, we concatenated them and fed the result to a fully connected layer to produce the output. Every layer except the output layer was activated with the LeakyReLU function. The output layer was activated with the sigmoid function to predict whether the drug and the ADR are associated.
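The two-branch layout described above can be sketched as a tiny forward pass. This is a simplified illustration, not the architecture of the published model: the layer sizes, the single hidden layer, the LeakyReLU slope, and the hand-set weights are all assumptions, and the ADR feature is taken as already computed by the MPNN layer.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]  # one row of weights per output unit


def dense(x: Vector, weights: Matrix, bias: Vector) -> Vector:
    """Fully connected layer: weights @ x + bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def leaky_relu(x: Vector, slope: float = 0.01) -> Vector:
    return [v if v > 0 else slope * v for v in x]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict_association(
    drug_fp: Vector, adr_feat: Vector,
    w_drug: Matrix, b_drug: Vector, w_out: Matrix, b_out: Vector,
) -> float:
    """Encode the fingerprint, concatenate with the ADR feature, and output a probability."""
    drug_latent = leaky_relu(dense(drug_fp, w_drug, b_drug))
    joint = drug_latent + adr_feat  # list concatenation of the two branches
    logit = dense(joint, w_out, b_out)[0]
    return sigmoid(logit)


# Hypothetical toy weights: 3-bit fingerprint -> 2 latent units, 2-dim ADR feature.
p = predict_association(
    drug_fp=[1.0, 0.0, 1.0],
    adr_feat=[0.2, 0.0],
    w_drug=[[0.5, 0.0, 0.5], [-1.0, 0.0, 0.0]],
    b_drug=[0.0, 0.0],
    w_out=[[1.0, 0.0, 1.0, 0.0]],
    b_out=[-1.2],
)
```

With these hand-set weights the logit is zero, so the predicted association probability is 0.5.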

3.2.3. MPNN Layer with ADR Embedding Vector

We can view the association network between drugs and adverse reactions as a bipartite graph $BG(U, V, E)$, where $U$ represents the drug nodes in the graph; $V$ represents the adverse reaction nodes; $u_i$ and $v_j$ denote the $i$-th and $j$-th nodes in $U$ and $V$, respectively, with $i = 1, 2, \ldots, M$ and $j = 1, 2, \ldots, N$; $E$ is a set of edges representing an association between a drug and an adverse drug reaction, $e = \{(u, v) \mid u \in U, v \in V\}$; and $e_{ij}$ denotes the edge between $u_i$ and $v_j$. The gene expression feature matrix for drugs can be represented as $X_u \in \mathbb{R}^{M \times P}$, where $x_{u_i}$ represents the gene expression feature vector of each drug. The initial input feature matrix for adverse reactions can be represented as $X_v \in \mathbb{R}^{N \times Q}$, where $x_{v_j}$ represents the initial feature vector of each adverse reaction and $h_{v_j}$ represents the updated adverse reaction feature vector after information propagation.
To apply the MPNN framework on the bipartite graph, appropriate message functions and vertex update functions need to be selected for feature propagation and aggregation among the nodes. For simplicity, we perform only one iteration, i.e., $T = 1$. The process of propagating the gene expression information from drug nodes to the feature representations of adverse reaction nodes can be defined as
$$m_{v_j} = \sum_{u_i \in N^e_{v_j}} M(x_{v_j}, x_{u_i})$$
$$h_{v_j} = U(x_{v_j}, m_{v_j})$$
where $N^e_{v_j}$ represents all nodes connected to node $v_j$ through edges in the bipartite graph $BG(U, V, E)$. We apply the GAT (Graph Attention Network) [47] to the process of information propagation and aggregation, defining $W_u \in \mathbb{R}^{P \times S}$ and $W_v \in \mathbb{R}^{Q \times S}$ as two learnable weight matrices. Their purpose is to linearly transform the two types of input features so that the model acquires sufficient representation capacity. Thus, our message function $M$ and vertex update function $U$ can be expressed as
$$m_{v_j} = M(x_{v_j}, x_{u_i}) = \alpha_{u_i, v_j} W_v x_{v_j}$$
$$h_{v_j} = U(x_{v_j}, m_{v_j}) = W_v x_{v_j} + \mathrm{ReLU}(m_{v_j})$$
where $\alpha_{u_i, v_j}$ represents the attention coefficient, indicating the importance of drug node $u_i$ to node $v_j$. It can be calculated using the following formula, where $\rho$ is the non-linear function LeakyReLU and $\mathbf{a} \in \mathbb{R}^{2S}$ is the attention weight vector:
$$\alpha_{u_i, v_j} = \frac{\exp\left(\rho\left(\mathbf{a}^{T} [W_u x_{u_i} \,\|\, W_v x_{v_j}]\right)\right)}{\sum_{u_k \in N^e_{v_j}} \exp\left(\rho\left(\mathbf{a}^{T} [W_u x_{u_k} \,\|\, W_v x_{v_j}]\right)\right)}$$
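The attention-based update of a single adverse reaction node can be sketched in plain Python as follows. Several points are assumptions rather than confirmed details of the authors' code: the message is taken to be the attention-weighted transformed drug feature $W_u x_{u_k}$, in line with the stated aim of propagating gene expression from drug nodes (the printed message formula shows $W_v x_{v_j}$ in that position); the LeakyReLU slope of 0.2 is assumed; and the function names and toy values are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]  # stored with S rows, i.e. transposed w.r.t. the P x S shape in the text


def matvec(w: Matrix, x: Vector) -> Vector:
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


def leaky_relu(z: float, slope: float = 0.2) -> float:
    return z if z > 0 else slope * z


def update_adr_feature(
    x_v: Vector, neighbor_drugs: List[Vector],
    w_u: Matrix, w_v: Matrix, a: Vector, self_loop: bool = True,
) -> Vector:
    """One attention-weighted update of an ADR node from its connected drug nodes."""
    z_v = matvec(w_v, x_v)
    if not neighbor_drugs:
        m_v = [0.0] * len(z_v)
    else:
        z_us = [matvec(w_u, x_u) for x_u in neighbor_drugs]
        # Attention logits: LeakyReLU(a^T [W_u x_u || W_v x_v]).
        scores = [
            leaky_relu(sum(a_k * z_k for a_k, z_k in zip(a, z_u + z_v)))
            for z_u in z_us
        ]
        # Softmax over the neighboring drugs (shifted by the max for stability).
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        alphas = [e / sum(exps) for e in exps]
        # Assumed message: attention-weighted sum of transformed drug features.
        m_v = [
            sum(alpha * z_u[k] for alpha, z_u in zip(alphas, z_us))
            for k in range(len(z_v))
        ]
    relu_m = [max(0.0, m) for m in m_v]
    if self_loop:
        return [z + r for z, r in zip(z_v, relu_m)]  # h = W_v x_v + ReLU(m)
    return relu_m                                     # ablation: h = ReLU(m)


identity = [[1.0, 0.0], [0.0, 1.0]]
drugs = [[2.0, 0.0], [0.0, 2.0]]
h_full = update_adr_feature([1.0, -1.0], drugs, identity, identity, [0.0] * 4)
h_ablate = update_adr_feature([1.0, -1.0], drugs, identity, identity, [0.0] * 4, self_loop=False)
```

With a zero attention vector both drugs receive weight 0.5, so the message is `[1.0, 1.0]`; the `self_loop` flag switches between the full update and the second ablation variant.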

3.3. Experimental Setting

We employ 5-fold cross-validation to assess the performance of our models. The cross-validation folds are stratified based on drugs, ensuring that all experiments involving a particular drug are either entirely in the training set or completely in the test set. This setup enables our models to predict the side effects of previously unseen drugs during testing. To tackle data imbalance in the training datasets and test datasets, we consider all confirmed drug–adverse reaction associations as positive samples, and we randomly select unobserved associations as negative samples in a 1:1 ratio. In external validation datasets, we predict all possible associations between drugs and adverse events.
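The drug-level split and the 1:1 negative sampling can be sketched as follows. The function names, the random seeds, and the toy data are hypothetical, and the fold assignment shown here (shuffle, then round-robin) is only one possible way to keep each drug in a single fold.

```python
import random
from typing import Dict, List, Set, Tuple


def drug_grouped_folds(drugs: List[str], k: int, seed: int = 0) -> List[List[str]]:
    """Assign whole drugs to k folds so that no drug is split across folds."""
    shuffled = sorted(drugs)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def sample_pairs(
    drugs: List[str], known: Dict[str, Set[str]], all_adrs: List[str], seed: int = 0
) -> List[Tuple[str, str, int]]:
    """Return all known pairs (label 1) plus an equal number of unobserved pairs (label 0)."""
    rng = random.Random(seed)
    positives = [(d, a) for d in drugs for a in sorted(known[d])]
    candidates = [(d, a) for d in drugs for a in all_adrs if a not in known[d]]
    negatives = rng.sample(candidates, min(len(positives), len(candidates)))
    return [(d, a, 1) for d, a in positives] + [(d, a, 0) for d, a in negatives]


# Hypothetical toy data: 10 drugs, 4 ADRs, one known ADR per drug.
all_adrs = ["A0", "A1", "A2", "A3"]
drugs = [f"D{i}" for i in range(10)]
known = {d: {all_adrs[i % 4]} for i, d in enumerate(drugs)}

folds = drug_grouped_folds(drugs, k=5)
test_drugs = folds[0]
train_drugs = [d for fold in folds[1:] for d in fold]
train_pairs = sample_pairs(train_drugs, known, all_adrs)
```

Because the split is made on drugs rather than on drug–ADR pairs, no information about a test drug can leak into training through its other associations.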
We utilize the binary cross-entropy [48] (BCE) loss function to measure the discrepancy between predicted and true labels. An Adam optimizer [49] is used for training the neural networks. Additionally, we apply dropout to the hidden layer units in the MLP decoder, which helps to prevent overfitting and encourages the model to learn more robust and generalizable representations.
We measure the prediction performance using three criteria: the AUC, Precision, and ACC, which are widely used for drug indication prediction tasks. Let P and N represent the counts of positive and negative instances in the dataset, respectively. TP, FN, TN, and FP denote the counts of true positives, false negatives, true negatives, and false positives in the predictions. The following performance metrics are defined:
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
$$\mathrm{ACC} = \frac{TP + TN}{P + N}$$
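The three criteria can be computed directly from labels and scores, as in the minimal sketch below. The 0.5 threshold, the function names, and the toy values are assumptions for illustration; the AUC is computed through its pairwise-ranking interpretation, which is equivalent to the area under the ROC curve.

```python
from typing import List, Tuple


def precision_and_acc(
    y_true: List[int], y_score: List[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Precision = TP / (TP + FP) and ACC = (TP + TN) / (P + N) at a fixed threshold."""
    tp = fp = tn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    acc = (tp + tn) / len(y_true)
    return precision, acc


def auc(y_true: List[int], y_score: List[float]) -> float:
    """Probability that a random positive is scored above a random negative (ties count 0.5)."""
    pos = [s for label, s in zip(y_true, y_score) if label == 1]
    neg = [s for label, s in zip(y_true, y_score) if label == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy labels and scores.
y_true = [1, 1, 0, 0, 0]
y_score = [0.9, 0.4, 0.6, 0.3, 0.1]
precision, acc = precision_and_acc(y_true, y_score)
auc_value = auc(y_true, y_score)
```

The pairwise AUC loop is quadratic in the number of samples, so for datasets of realistic size a rank-based implementation from an established library would be preferable.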
Our method is implemented in Python 3.7.13 and PyTorch 1.7.1. We use a random search to determine the hyperparameters. The batch size is set to 10,000, and the Adam optimizer is used with a learning rate of $1 \times 10^{-4}$. We set the dropout rate for this work to 0.2. We allow the model to run for at most 300 epochs on all datasets. The best-performing model is selected at the epoch giving the best AUC score on the test set and is then used to evaluate the final performance on the external validation set.

4. Conclusions

We developed a novel ADR prediction model named BiMPADR based on a bipartite message passing neural network. The model fuses drug gene expression information with gene–ADR association information, enriching the practical significance of each adverse reaction feature vector. Extensive experiments have shown that our model achieves excellent performance in the task of drug–ADR prediction under different conditions. Furthermore, we conducted external validation to confirm the potential applicability of our approach to new drugs. Case studies provide concrete examples that support the practical utility of our approach. It can assist pharmacists and healthcare providers in understanding the potential risks of drug side effects and in addressing the problem of underreporting in spontaneous reports. In future work, we intend to employ geometric deep learning techniques to extract compound structural features and better utilize compound information to further enhance the predictive performance of our model. Additionally, we aim to identify suitable methods for assessing the contribution of genes to the occurrence of each adverse reaction.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/molecules29081784/s1, Predicted values for all ten BET inhibitors across 3616 adverse drug reactions. Table S1: Evidence for the top ten predicted ADRs. Table S2: Blood and lymphatic system disorder ADRs recorded by NIH and related literature. Figure S1: Adverse reactions produced by other BET inhibitors in different organ systems. Refs. [50,51,52,53] are cited in supplementary files.

Author Contributions

Conceptualization, K.L. and S.L.; methodology, S.L.; software, J.J.; validation, X.Z.; formal analysis, S.L.; investigation, L.Z.; resources, S.L.; data curation, L.W.; writing—original draft preparation, S.L.; writing—review and editing, L.C.; visualization, J.H.; supervision, K.L.; project administration, K.L.; funding acquisition, K.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant numbers 82273734 and 82304250.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets and code used in the current study are available in the GitHub repository https://github.com/Ls94wood/BiMPADR.git (accessed on 9 December 2021).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Nebeker, J.R.; Barach, P.; Samore, M.H. Clarifying adverse drug events: A clinician’s guide to terminology, documentation, and reporting. Ann. Intern. Med. 2004, 140, 795–801. [Google Scholar] [CrossRef]
  2. Pirmohamed, M.; James, S.; Meakin, S.; Green, C.; Scott, A.K.; Walley, T.J.; Farrar, K.; Park, B.K.; Breckenridge, A.M. Adverse drug reactions as cause of admission to hospital: Prospective analysis of 18 820 patients. BMJ (Clin. Res. Ed.) 2004, 329, 15–19. [Google Scholar] [CrossRef] [PubMed]
  3. Cocos, A.; Fiks, A.G.; Masino, A.J. Deep learning for pharmacovigilance: Recurrent neural network architectures for labeling adverse drug reactions in Twitter posts. J. Am. Med. Inform. Assoc. JAMIA 2017, 24, 813–821. [Google Scholar] [CrossRef] [PubMed]
  4. Chi, L.H.; Burrows, A.D.; Anderson, R.L. Can preclinical drug development help to predict adverse events in clinical trials? Drug Discov. Today 2022, 27, 257–268. [Google Scholar] [CrossRef] [PubMed]
  5. Gurwitz, J.H.; Field, T.S.; Avorn, J.; McCormick, D.; Jain, S.; Eckler, M.; Benser, M.; Edmondson, A.C.; Bates, D.W. Incidence and preventability of adverse drug events in nursing homes. Am. J. Med. 2000, 109, 87–94. [Google Scholar] [CrossRef] [PubMed]
  6. Batel Marques, F.; Penedones, A.; Mendes, D.; Alves, C. A systematic review of observational studies evaluating costs of adverse drug reactions. Clin. Outcomes Res. CEOR 2016, 8, 413–426. [Google Scholar] [CrossRef] [PubMed]
  7. Ernst, F.R.; Grizzle, A.J. Drug-related morbidity and mortality: Updating the cost-of-illness model. J. Am. Pharm. Assoc. 2001, 41, 192–199. [Google Scholar] [CrossRef] [PubMed]
  8. Arrowsmith, J.; Miller, P. Trial watch: Phase II and phase III attrition rates 2011–2012. Nat. Rev. Drug Discov. 2013, 12, 569. [Google Scholar] [CrossRef] [PubMed]
  9. Hughes, J.P.; Rees, S.; Kalindjian, S.B.; Philpott, K.L. Principles of early drug discovery. Br. J. Pharmacol. 2011, 162, 1239–1249. [Google Scholar] [CrossRef]
  10. da Silva, B.A.; Krishnamurthy, M. The alarming reality of medication error: A patient case and review of Pennsylvania and National data. J. Community Hosp. Intern. Med. Perspect. 2016, 6, 31758. [Google Scholar] [CrossRef]
  11. Tatonetti, N.P. The Next Generation of Drug Safety Science: Coupling Detection, Corroboration, and Validation to Discover Novel Drug Effects and Drug-Drug Interactions. Clin. Pharmacol. Ther. 2018, 103, 177–179. [Google Scholar] [CrossRef] [PubMed]
  12. Voskens, C.J.; Goldinger, S.M.; Loquai, C.; Robert, C.; Kaehler, K.C.; Berking, C.; Bergmann, T.; Bockmeyer, C.L.; Eigentler, T.; Fluck, M.; et al. The price of tumor control: An analysis of rare side effects of anti-CTLA-4 therapy in metastatic melanoma from the ipilimumab network. PLoS ONE 2013, 8, e53745. [Google Scholar] [CrossRef] [PubMed]
  13. Cami, A.; Arnold, A.; Manzi, S.; Reis, B. Predicting adverse drug events using pharmacological network models. Sci. Transl. Med. 2011, 3, 114ra127. [Google Scholar] [CrossRef] [PubMed]
  14. Zhang, W.; Zou, H.; Luo, L.; Liu, Q.; Wu, W.; Xiao, W. Predicting potential side effects of drugs by recommender methods and ensemble learning. Neurocomputing 2016, 173, 979–987. [Google Scholar] [CrossRef]
  15. Galeano, D.; Paccanaro, A. A recommender system approach for predicting drug side effects. In Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 8–13 July 2018; pp. 1–8. [Google Scholar]
  16. Lin, J.; Kuang, Q.; Li, Y.; Zhang, Y.; Sun, J.; Ding, Z.; Li, M. Prediction of adverse drug reactions by a network based external link prediction method. Anal. Methods 2013, 5, 6120–6127. [Google Scholar] [CrossRef]
  17. Yamanishi, Y.; Pauwels, E.; Kotera, M. Drug side-effect prediction based on the integration of chemical and biological spaces. J. Chem. Inf. Model. 2012, 52, 3284–3292. [Google Scholar] [CrossRef] [PubMed]
  18. Liu, M.; Wu, Y.; Chen, Y.; Sun, J.; Zhao, Z.; Chen, X.-w.; Matheny, M.E.; Xu, H. Large-scale prediction of adverse drug reactions using chemical, biological, and phenotypic properties of drugs. J. Am. Med. Inform. Assoc. 2012, 19, e28–e35. [Google Scholar] [CrossRef] [PubMed]
  19. Zhang, W.; Liu, F.; Luo, L.; Zhang, J. Predicting drug side effects by multi-label learning and ensemble learning. BMC Bioinform. 2015, 16, 365. [Google Scholar] [CrossRef]
  20. Ding, Y.; Tang, J.; Guo, F. Identification of drug-side effect association via semisupervised model and multiple kernel learning. IEEE J. Biomed. Health Inform. 2018, 23, 2619–2632. [Google Scholar] [CrossRef]
  21. Pauwels, E.; Stoven, V.; Yamanishi, Y. Predicting drug side-effect profiles: A chemical fragment-based approach. BMC Bioinform. 2011, 12, 169. [Google Scholar] [CrossRef]
  22. Niu, S.-Y.; Xin, M.-Y.; Luo, J.; Liu, M.-Y.; Jiang, Z.-R. Dsep: A tool implementing novel method to predict side effects of drugs. J. Comput. Biol. 2015, 22, 1108–1117. [Google Scholar] [CrossRef]
  23. Dimitri, G.M.; Lió, P. DrugClust: A machine learning approach for drugs side effects prediction. Comput. Biol. Chem. 2017, 68, 204–210. [Google Scholar] [CrossRef]
  24. Xuan, P.; Wang, M.; Liu, Y.; Wang, D.; Zhang, T.; Nakaguchi, T. Integrating specific and common topologies of heterogeneous graphs and pairwise attributes for drug-related side effect prediction. Brief. Bioinform. 2022, 23, bbac126. [Google Scholar] [CrossRef] [PubMed]
  25. Lee, S.; Lee, K.H.; Song, M.; Lee, D. Building the process-drug–side effect network to discover the relationship between biological Processes and side effects. BMC Bioinform. 2011, 12, S2. [Google Scholar] [CrossRef] [PubMed]
  26. Handschin, C.; Meyer, U.A. Induction of drug metabolism: The role of nuclear receptors. Pharmacol. Rev. 2003, 55, 649–673. [Google Scholar] [CrossRef]
  27. Toyoshiba, H.; Sawada, H.; Naeshiro, I.; Horinouchi, A. Similar compounds searching system by using the gene expression microarray database. Toxicol. Lett. 2009, 186, 52–57. [Google Scholar] [CrossRef]
  28. Babcock, J.J.; Du, F.; Xu, K.; Wheelan, S.J.; Li, M. Integrated analysis of drug-induced gene expression profiles predicts novel hERG inhibitors. PLoS ONE 2013, 8, e69513. [Google Scholar] [CrossRef]
  29. Zhang, J.D.; Sach-Peltason, L.; Kramer, C.; Wang, K.; Ebeling, M. Multiscale modelling of drug mechanism and safety. Drug Discov. Today 2020, 25, 519–534. [Google Scholar] [CrossRef] [PubMed]
  30. Kuhn, M.; Letunic, I.; Jensen, L.J.; Bork, P. The SIDER database of drugs and side effects. Nucleic Acids Res. 2016, 44, D1075–D1079. [Google Scholar] [CrossRef]
  31. Stathias, V.; Turner, J.; Koleti, A.; Vidovic, D.; Cooper, D.; Fazel-Najafabadi, M.; Pilarczyk, M.; Terryn, R.; Chung, C.; Umeano, A. LINCS Data Portal 2.0: Next generation access point for perturbation-response signatures. Nucleic Acids Res. 2020, 48, D431–D439. [Google Scholar] [CrossRef]
  32. Subramanian, A.; Narayan, R.; Corsello, S.M.; Peck, D.D.; Natoli, T.E.; Lu, X.; Gould, J.; Davis, J.F.; Tubelli, A.A.; Asiedu, J.K. A next generation connectivity map: L1000 platform and the first 1,000,000 profiles. Cell 2017, 171, 1437–1452.e17. [Google Scholar] [CrossRef] [PubMed]
  33. Kim, S.; Thiessen, P.A.; Bolton, E.E.; Chen, J.; Fu, G.; Gindulyte, A.; Han, L.; He, J.; He, S.; Shoemaker, B.A. PubChem substance and compound databases. Nucleic Acids Res. 2016, 44, D1202–D1213. [Google Scholar] [CrossRef] [PubMed]
  34. Dong, J.; Yao, Z.-J.; Zhang, L.; Luo, F.; Lin, Q.; Lu, A.-P.; Chen, A.F.; Cao, D.-S. PyBioMed: A python library for various molecular representations of chemicals, proteins and DNAs and their interactions. J. Cheminform. 2018, 10, 16. [Google Scholar] [CrossRef] [PubMed]
  35. Huang, L.-H.; He, Q.-S.; Liu, K.; Cheng, J.; Zhong, M.-D.; Chen, L.-S.; Yao, L.-X.; Ji, Z.-L. ADReCS-Target: Target profiles for aiding drug safety research and application. Nucleic Acids Res. 2018, 46, D911–D917. [Google Scholar] [CrossRef] [PubMed]
  36. Gilmer, J.; Schoenholz, S.S.; Riley, P.F.; Vinyals, O.; Dahl, G.E. Message passing neural networks. In Machine Learning Meets Quantum Physics; Springer: Cham, Switzerland, 2020; pp. 199–214. [Google Scholar]
  37. Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903. [Google Scholar]
  38. Ruby, U.; Yendapalli, V. Binary cross entropy with deep learning technique for image classification. Int. J. Adv. Trends Comput. Sci. Eng. 2020, 9, 5393–5397. [Google Scholar]
  39. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
  40. Yin, M.; Guo, Y.; Hu, R.; Cai, W.L.; Li, Y.; Pei, S.; Sun, H.; Peng, C.; Li, J.; Ye, R. Potent BRD4 inhibitor suppresses cancer cell-macrophage interaction. Nat. Commun. 2020, 11, 1833. [Google Scholar] [CrossRef] [PubMed]
  41. Bonazzoli, E.; Predolini, F.; Cocco, E.; Bellone, S.; Altwerger, G.; Menderes, G.; Zammataro, L.; Bianchi, A.; Pettinella, F.; Riccio, F. Inhibition of BET bromodomain proteins with GS-5829 and GS-626510 in uterine serous carcinoma, a biologically aggressive variant of endometrial cancer. Clin. Cancer Res. 2018, 24, 4845–4853. [Google Scholar] [CrossRef]
  42. Stubbs, M.C.; Maduskuie, T.; Burn, T.; Diamond-Fosbenner, S.; Falahatpisheh, N.; Volgina, A.; Zolotarjova, N.; Wen, X.; Feldman, P.; Rupar, M. Preclinical characterization of the potent and selective BET inhibitor INCB057643 in models of hematologic malignancies. Cancer Res. 2017, 77, 5071. [Google Scholar] [CrossRef]
  43. Faivre, E.J.; Wilcox, D.M.; Hessler, P.; Uziel, T.; Tapang, P.; Magoc, T.; Albert, D.H.; Fang, G.; Rosenberg, S.; McDaniel, K. ABBV-075, a novel BET family inhibitor, disrupts critical transcription programs that drive prostate cancer growth to induce potent anti-tumor activity in vitro and in vivo. Cancer Res. 2016, 76, 4694. [Google Scholar] [CrossRef]
  44. Albrecht, B.K.; Gehling, V.S.; Hewitt, M.C.; Vaswani, R.G.; Côté, A.; Leblanc, Y.; Nasveschuk, C.G.; Bellon, S.; Bergeron, L.; Campbell, R. Identification of a benzoisoxazoloazepine inhibitor (CPI-0610) of the bromodomain and extra-terminal (BET) family as a candidate for human clinical trials. J. Med. Chem. 2016, 59, 1330–1339. [Google Scholar] [CrossRef] [PubMed]
  45. Noel, J.; Iwata, K.; Ooike, S.; Sugahara, K.; Nakamura, H.; Daibata, M. Development of the BET bromodomain inhibitor OTX015. In Proceedings of the AACR-NCI-EORTC International Conference: Molecular Targets and Cancer Therapeutics, Boston, MA, USA, 19–23 October 2013. [Google Scholar]
  46. Nicodeme, E.; Jeffrey, K.L.; Schaefer, U.; Beinke, S.; Dewell, S.; Chung, C.-W.; Chandwani, R.; Marazzi, I.; Wilson, P.; Coste, H. Suppression of inflammation by a synthetic histone mimic. Nature 2010, 468, 1119–1123. [Google Scholar] [CrossRef] [PubMed]
  47. Firle, K.; Szymansky, A.; Witthauer, M.; Dorado-Garcia, H.; Toedling, J.; Schoenbeck, K.; Henssen, A.; Hertwig, F.; Eggert, A.; Schulte, J. Preclinical evaluation of BET-bromodomain inhibitor TEN-010 as monotherapy and combination therapy in MYC-driven neuroblastoma. Ann. Oncol. 2018, 29, iii12–iii13. [Google Scholar] [CrossRef]
  48. Ozer, H.G.; El-Gamal, D.; Powell, B.; Hing, Z.A.; Blachly, J.S.; Harrington, B.; Mitchell, S.; Grieselhuber, N.R.; Williams, K.; Lai, T.-H. BRD4 profiling identifies critical chronic lymphocytic leukemia oncogenic circuits and reveals sensitivity to PLX51107, a novel structurally distinct BET inhibitor. Cancer Discov. 2018, 8, 458–477. [Google Scholar] [CrossRef]
  49. Hilton, J.; Cristea, M.; Postel-Vinay, S.; Baldini, C.; Voskoboynik, M.; Edenfield, W.; Shapiro, G.I.; Cheng, M.L.; Vuky, J.; Corr, B. BMS-986158, a small molecule inhibitor of the bromodomain and extraterminal domain proteins, in patients with selected advanced solid tumors: Results from a phase 1/2a trial. Cancers 2022, 14, 4079. [Google Scholar] [CrossRef] [PubMed]
  50. Roboz, G.J.; Desai, P.; Lee, S.; Ritchie, E.K.; Winer, E.S.; DeMario, M.; Brennan, B.; Nüesch, E.; Chesne, E.; Brennan, L. A dose escalation study of RO6870810/TEN-10 in patients with acute myeloid leukemia and myelodysplastic syndrome. Leuk. Lymphoma 2021, 62, 1740–1748. [Google Scholar] [CrossRef]
  51. Senapati, J.; Fiskus, W.C.; Daver, N.; Wilson, N.R.; Ravandi, F.; Garcia-Manero, G.; Kadia, T.; DiNardo, C.D.; Jabbour, E.; Burger, J. Phase I Results of Bromodomain and Extra-Terminal Inhibitor PLX51107 in Combination with Azacitidine in Patients with Relapsed/Refractory Myeloid Malignancies. Clin. Cancer Res. 2023, 29, 4352–4360. [Google Scholar] [CrossRef] [PubMed]
  52. Blum, K.A.; Supko, J.G.; Maris, M.B.; Flinn, I.W.; Goy, A.; Younes, A.; Bobba, S.; Senderowicz, A.M.; Efuni, S.; Rippley, R. A phase I study of pelabresib (CPI-0610), a small-molecule inhibitor of BET proteins, in patients with relapsed or refractory lymphoma. Cancer Res. Commun. 2022, 2, 795–805. [Google Scholar] [CrossRef]
  53. Piha-Paul, S.A.; Sachdev, J.C.; Barve, M.; LoRusso, P.; Szmulewitz, R.; Patel, S.P.; Lara, P.N., Jr.; Chen, X.; Hu, B.; Freise, K.J. First-in-human study of mivebresib (ABBV-075), an oral pan-inhibitor of bromodomain and extra terminal proteins, in patients with relapsed/refractory solid tumors. Clin. Cancer Res. 2019, 25, 6309–6319. [Google Scholar] [CrossRef]
Figure 1. AUC of the external validation dataset under different conditions: (A) different compound fingerprint selections; (B) different drug cell line expression data selections; (C) different adverse reaction selections.
Figure 2. AUC of the external validation dataset under different ablations: (A) ablation experiments without ADR–gene information; (B) ablation experiments without self-loop.
Figure 3. AUC of the external dataset under ablation experiments without MPNN module.
Figure 4. Visualization of predicted values on GEn-SIDER datasets by three methods.
Figure 5. Count of ADRs with a predicted value greater than 0.99.
Figure 6. Adverse reaction predictions across different organ system classifications: (A) predicted values for BMS-986158 in different organ systems; (B) predicted values for HWD-870 in different organ systems.
Figure 7. Most relevant ADRs for NHWD-870.
Figure 8. Overview of the datasets used in this study: (A) the drugs selected for this study; (B) the adverse reactions selected for this study.
Figure 9. The workflow and architecture of BiMPADR: (A) the model receives three types of data: chemical structures (CSs), used to encode the features of drugs, and drug-induced gene expression (GE) and ADR–gene associations (ASs), used to encode the features of ADRs through the MPNN module; (B) message transfer direction in the MPNN module. Solid arrows represent the transmission of drug information to adjacent adverse reactions, while dashed arrows represent the self-transmission of adverse reaction information.
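The message transfer direction in Figure 9B can be illustrated with a small sketch of one update step on a bipartite drug–ADR graph. Several assumptions are made here that the caption does not confirm: messages are aggregated by an unweighted mean, drug and ADR vectors share the same dimension (for example, both indexed by genes), and the function and variable names are hypothetical. The actual BiMPADR update function may differ.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def update_adr_features(drug_feats: Dict[str, Vector],
                        adr_feats: Dict[str, Vector],
                        edges: List[Tuple[str, str]]) -> Dict[str, Vector]:
    """One message-passing step: each ADR aggregates its adjacent drugs plus itself.

    edges holds (drug, adr) pairs for known drug-ADR associations.
    Mean aggregation is an assumption made for illustration only.
    """
    incoming: Dict[str, List[Vector]] = {adr: [] for adr in adr_feats}
    for drug, adr in edges:
        # Solid arrows: drug information flows to the adjacent ADR.
        incoming[adr].append(drug_feats[drug])
    updated = {}
    for adr, own in adr_feats.items():
        # Dashed arrows: the ADR's own vector is included as a self-loop message.
        updated[adr] = mean(incoming[adr] + [own])
    return updated


# Toy graph: two drugs, two ADRs, two-dimensional features.
drugs = {"d1": [1.0, 0.0], "d2": [0.0, 1.0]}
adrs = {"a1": [0.0, 0.0], "a2": [1.0, 1.0]}
known = [("d1", "a1"), ("d2", "a1"), ("d1", "a2")]
new_adrs = update_adr_features(drugs, adrs, known)
print(new_adrs)
```

Because messages flow only from drugs to ADRs, drug vectors are left unchanged, which matches the one-directional solid arrows described in the caption.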
Table 1. The summary of model performance.

| Dataset | CS | Train AUC | Train Precision | Train ACC | Test AUC | Test Precision | Test ACC | External Validation AUC | External Validation Precision | External Validation ACC |
|---|---|---|---|---|---|---|---|---|---|---|
| GEn-ADReCS | ECFP2 | 0.948 ± 0.015 | 0.839 ± 0.032 | 0.877 ± 0.017 | 0.873 ± 0.018 | 0.796 ± 0.034 | 0.802 ± 0.015 | 0.861 ± 0.026 | 0.177 ± 0.028 | 0.77 ± 0.053 |
| GEn-ADReCS | MACCS | 0.958 ± 0.007 | 0.844 ± 0.016 | 0.889 ± 0.009 | 0.879 ± 0.019 | 0.798 ± 0.028 | 0.808 ± 0.015 | 0.871 ± 0.016 | 0.178 ± 0.017 | 0.774 ± 0.033 |
| GEn-ADReCS | PubChem | 0.97 ± 0.008 | 0.869 ± 0.017 | 0.907 ± 0.013 | 0.894 ± 0.01 | 0.815 ± 0.019 | 0.819 ± 0.007 | 0.874 ± 0.007 | 0.193 ± 0.012 | 0.802 ± 0.019 |
| GEn-SIDER | ECFP2 | 0.975 ± 0.012 | 0.89 ± 0.027 | 0.923 ± 0.025 | 0.898 ± 0.009 | 0.853 ± 0.011 | 0.831 ± 0.012 | 0.903 ± 0.003 | 0.109 ± 0.007 | 0.849 ± 0.013 |
| GEn-SIDER | MACCS | 0.983 ± 0.01 | 0.898 ± 0.028 | 0.937 ± 0.021 | 0.906 ± 0.006 | 0.852 ± 0.017 | 0.84 ± 0.003 | 0.903 ± 0.007 | 0.106 ± 0.013 | 0.842 ± 0.024 |
| GEn-SIDER | PubChem | 0.98 ± 0.011 | 0.892 ± 0.034 | 0.928 ± 0.027 | 0.909 ± 0.013 | 0.847 ± 0.003 | 0.84 ± 0.015 | 0.902 ± 0.003 | 0.105 ± 0.005 | 0.844 ± 0.01 |
| GEt-ADReCS | ECFP2 | 0.95 ± 0.024 | 0.852 ± 0.03 | 0.882 ± 0.032 | 0.878 ± 0.019 | 0.807 ± 0.027 | 0.803 ± 0.023 | 0.872 ± 0.015 | 0.188 ± 0.015 | 0.805 ± 0.015 |
| GEt-ADReCS | MACCS | 0.96 ± 0.014 | 0.842 ± 0.032 | 0.888 ± 0.022 | 0.877 ± 0.012 | 0.788 ± 0.029 | 0.798 ± 0.017 | 0.868 ± 0.01 | 0.168 ± 0.02 | 0.768 ± 0.042 |
| GEt-ADReCS | PubChem | 0.966 ± 0.011 | 0.873 ± 0.029 | 0.908 ± 0.018 | 0.877 ± 0.013 | 0.813 ± 0.019 | 0.801 ± 0.019 | 0.863 ± 0.01 | 0.189 ± 0.019 | 0.808 ± 0.029 |
| GEt-SIDER | ECFP2 | 0.982 ± 0.007 | 0.897 ± 0.024 | 0.934 ± 0.017 | 0.913 ± 0.008 | 0.849 ± 0.02 | 0.842 ± 0.009 | 0.907 ± 0.005 | 0.107 ± 0.013 | 0.85 ± 0.023 |
| GEt-SIDER | MACCS | 0.989 ± 0.005 | 0.917 ± 0.014 | 0.951 ± 0.01 | 0.91 ± 0.006 | 0.86 ± 0.01 | 0.842 ± 0.008 | 0.905 ± 0.007 | 0.11 ± 0.006 | 0.859 ± 0.012 |
| GEt-SIDER | PubChem | 0.99 ± 0.005 | 0.918 ± 0.016 | 0.951 ± 0.012 | 0.91 ± 0.005 | 0.865 ± 0.013 | 0.837 ± 0.011 | 0.907 ± 0.002 | 0.114 ± 0.008 | 0.864 ± 0.013 |
Table 2. Ablation experiments for BiMPADR models without ADR–gene information.

| Dataset | Train AUC | Train Precision | Train ACC | Test AUC | Test Precision | Test ACC | External Validation AUC | External Validation Precision | External Validation ACC |
|---|---|---|---|---|---|---|---|---|---|
| GEn-ADReCS | 0.953 ± 0.02 | 0.851 ± 0.033 | 0.887 ± 0.03 | 0.878 ± 0.018 | 0.804 ± 0.02 | 0.808 ± 0.017 | 0.864 ± 0.019 | 0.184 ± 0.018 | 0.789 ± 0.025 |
| GEn-SIDER | 0.984 ± 0.012 | 0.906 ± 0.034 | 0.939 ± 0.026 | 0.904 ± 0.009 | 0.855 ± 0.016 | 0.836 ± 0.006 | 0.904 ± 0.005 | 0.109 ± 0.01 | 0.849 ± 0.018 |
| GEt-ADReCS | 0.937 ± 0.026 | 0.823 ± 0.043 | 0.864 ± 0.032 | 0.871 ± 0.022 | 0.785 ± 0.035 | 0.8 ± 0.018 | 0.858 ± 0.026 | 0.167 ± 0.027 | 0.765 ± 0.052 |
| GEt-SIDER | 0.98 ± 0.017 | 0.897 ± 0.023 | 0.933 ± 0.026 | 0.911 ± 0.012 | 0.849 ± 0.012 | 0.843 ± 0.011 | 0.902 ± 0.01 | 0.103 ± 0.008 | 0.845 ± 0.016 |
Table 3. Ablation experiments for BiMPADR models without self-loop.

| Dataset | Train AUC | Train Precision | Train ACC | Test AUC | Test Precision | Test ACC | External Validation AUC | External Validation Precision | External Validation ACC |
|---|---|---|---|---|---|---|---|---|---|
| GEn-SIDER | 0.978 ± 0.016 | 0.892 ± 0.028 | 0.927 ± 0.027 | 0.906 ± 0.009 | 0.852 ± 0.015 | 0.839 ± 0.007 | 0.903 ± 0.007 | 0.108 ± 0.01 | 0.848 ± 0.018 |
| GEn-ADReCS | 0.953 ± 0.027 | 0.851 ± 0.04 | 0.888 ± 0.037 | 0.875 ± 0.019 | 0.801 ± 0.023 | 0.805 ± 0.016 | 0.863 ± 0.02 | 0.182 ± 0.019 | 0.785 ± 0.03 |
| GEt-SIDER | 0.982 ± 0.012 | 0.903 ± 0.033 | 0.937 ± 0.022 | 0.914 ± 0.01 | 0.856 ± 0.025 | 0.844 ± 0.007 | 0.904 ± 0.01 | 0.11 ± 0.018 | 0.854 ± 0.029 |
| GEt-ADReCS | 0.951 ± 0.018 | 0.847 ± 0.031 | 0.886 ± 0.026 | 0.878 ± 0.014 | 0.803 ± 0.02 | 0.81 ± 0.012 | 0.864 ± 0.015 | 0.179 ± 0.017 | 0.788 ± 0.026 |
Table 4. Ablation experiments for BiMPADR models without MPNN module.

| Dataset | Train AUC | Train Precision | Train ACC | Test AUC | Test Precision | Test ACC | External Validation AUC | External Validation Precision | External Validation ACC |
|---|---|---|---|---|---|---|---|---|---|
| GEn-SIDER | 0.802 ± 0.011 | 0.719 ± 0.009 | 0.716 ± 0.008 | 0.649 ± 0.023 | 0.659 ± 0.03 | 0.608 ± 0.02 | 0.634 ± 0.007 | 0.038 ± 0.003 | 0.755 ± 0.032 |
| GEn-ADReCS | 0.877 ± 0.016 | 0.753 ± 0.024 | 0.775 ± 0.015 | 0.716 ± 0.01 | 0.667 ± 0.014 | 0.643 ± 0.01 | 0.7 ± 0.009 | 0.103 ± 0.005 | 0.712 ± 0.033 |
| GEt-SIDER | 0.798 ± 0.011 | 0.718 ± 0.012 | 0.713 ± 0.008 | 0.651 ± 0.019 | 0.67 ± 0.034 | 0.606 ± 0.016 | 0.638 ± 0.008 | 0.039 ± 0.003 | 0.771 ± 0.041 |
| GEt-ADReCS | 0.879 ± 0.019 | 0.755 ± 0.018 | 0.777 ± 0.015 | 0.717 ± 0.012 | 0.67 ± 0.017 | 0.642 ± 0.01 | 0.701 ± 0.01 | 0.1 ± 0.006 | 0.712 ± 0.037 |
Table 5. Performance comparison of different approaches.

| Dataset | Method | AUC | Precision | ACC |
|---|---|---|---|---|
| GEn-SIDER | DrugClust | 0.6044 ± 0.0111 | 0.1877 ± 0.0177 | 0.9644 ± 0.003 |
| GEn-SIDER | SCCA | 0.9131 ± 0.0002 | 0.0392 ± 0.0008 | 0.4814 ± 0.0121 |
| GEn-SIDER | BiMPADR | 0.902 ± 0.003 | 0.105 ± 0.005 | 0.844 ± 0.01 |
| GEn-ADReCS | DrugClust | 0.615 ± 0.0169 | 0.2415 ± 0.0243 | 0.913 ± 0.0086 |
| GEn-ADReCS | SCCA | 0.8891 ± 0.0005 | 0.1091 ± 0.0014 | 0.5468 ± 0.0066 |
| GEn-ADReCS | BiMPADR | 0.874 ± 0.007 | 0.193 ± 0.012 | 0.802 ± 0.019 |
| GEt-SIDER | DrugClust | 0.6335 ± 0.0169 | 0.2087 ± 0.0283 | 0.9662 ± 0.0017 |
| GEt-SIDER | SCCA | 0.9137 ± 0.0005 | 0.0381 ± 0.0009 | 0.4736 ± 0.0128 |
| GEt-SIDER | BiMPADR | 0.907 ± 0.002 | 0.114 ± 0.008 | 0.864 ± 0.013 |
| GEt-ADReCS | DrugClust | 0.651 ± 0.0202 | 0.2498 ± 0.0195 | 0.9125 ± 0.0042 |
| GEt-ADReCS | SCCA | 0.8897 ± 0.0004 | 0.1061 ± 0.0005 | 0.5485 ± 0.0022 |
| GEt-ADReCS | BiMPADR | 0.863 ± 0.01 | 0.189 ± 0.019 | 0.808 ± 0.029 |
Table 6. Evidence for the top ten predicted ADRs in example drugs.

| Drug Name | ADR Name | Pred Value | NCT Number |
|---|---|---|---|
| BMS-986158 | Transaminases increased | 0.998 | NCT02419417 |
| BMS-986158 | Rhabdomyolysis | 0.998 | |
| BMS-986158 | Dermatitis | 0.997 | NCT02419417 |
| BMS-986158 | Intermittent claudication | 0.997 | NCT02419417 |
| BMS-986158 | Hypertriglyceridaemia | 0.997 | |
| BMS-986158 | Hyperglycaemia | 0.996 | NCT02419417 |
| BMS-986158 | Hyperlipidaemia | 0.996 | |
| BMS-986158 | Upper respiratory tract infection | 0.996 | NCT02419417 |
| BMS-986158 | Influenza-like illness | 0.996 | NCT02419417 |
| BMS-986158 | Gastroenteritis | 0.995 | NCT02419417 |
Table 7. Blood and lymphatic system disorders ADRs recorded by NIH.

| Drug Name | ADR Name | Pred Value | NCT Number |
|---|---|---|---|
| BMS-986158 | Anemia | 0.991 | NCT02419417 |
| BMS-986158 | Leukopenia | 0.983 | NCT02419417 |
| BMS-986158 | Lymphopenia | 0.689 | NCT02419417 |
| BMS-986158 | Neutropenia | 0.985 | NCT02419417 |
| BMS-986158 | Thrombocytopenia | 0.991 | NCT02419417 |
Table 8. Summary of datasets used in this study.

| Dataset | Number of Drugs | Number of ADRs | Number of Drugs in External Dataset |
|---|---|---|---|
| GEn-SIDER | 656 | 3616 | 774 |
| GEn-ADReCS | 656 | 751 | 774 |
| GEt-SIDER | 766 | 3695 | 664 |
| GEt-ADReCS | 766 | 762 | 664 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Li, S.; Zhang, L.; Wang, L.; Ji, J.; He, J.; Zheng, X.; Cao, L.; Li, K. BiMPADR: A Deep Learning Framework for Predicting Adverse Drug Reactions in New Drugs. Molecules 2024, 29, 1784. https://doi.org/10.3390/molecules29081784
