1. Introduction
Electronic Health Records (EHRs) are the original records of a patient's entire diagnosis and treatment process. They can greatly improve the efficiency and quality of care and provide a smoother patient experience [1,2,3,4]. As medical informatization has deepened, research efforts have been devoted to tasks such as medication recommendation and medical question answering [5,6,7].
Figure 1 illustrates the role of the medication recommendation system in the diagnosis and treatment process: it simplifies the medical workflow and assists doctors in formulating safe and effective prescriptions. The goal of medication recommendation is to recommend a personalized medication combination for a patient based on their current medical records and historical health conditions. Most early medication recommendation approaches relied on the prior knowledge of experts with rich clinical experience. In recent years, the continuous optimization of wireless sensor structures has enabled EHRs to be collected at scale [8,9]. Meanwhile, a large number of deep learning models have been applied to medication recommendation, which has significantly improved recommendation accuracy and practical feasibility [10,11,12,13]. However, although EHRs are cleaned and organized during preprocessing, many uncertainties remain, including missing information, inconsistencies across sources, and highly subjective and imprecise medical histories. Among them, obscure structural and temporal features and Drug–Drug Interactions (DDIs) [14,15] are two critical factors that complicate the subsequent medication recommendation task:
- (1) Obscure structural and temporal features: EHRs can be considered a continuous collection of medical entities consisting of diagnoses, procedures and medications, with obscure structural and temporal correlations among these entities. For example, a peptic ulcer may cause gastric perforation, chickenpox may cause pneumonia, and cerebral infarction often causes high blood pressure. Such implicit medical knowledge affects the accuracy of medication recommendation.
- (2) DDIs: two or more drugs taken at the same time or in sequence may produce a compound effect, which can enhance or reduce the drugs' efficacy and reduce or aggravate side effects. For example, combining a cholinesterase reactivator with atropine sulfate produces complementary effects, which reduces the required amount of atropine and its adverse effects. The antimalarial drug artemisinin is susceptible to resistance when used alone, and resistance can be delayed when it is used in combination with sulfamethoxine and pyrimethamine.
To enhance the learning of obscure structural and temporal features and reduce the DDI rate of recommendation results, early studies used k-means clustering [16], association rule methods [17] and expert systems [18]. With the development of deep learning, Graph Neural Networks (GNN) were introduced into medication recommendation. Studies [19,20,21] introduced Graph Convolutional Networks (GCN) to capture the structural and temporal features between medical events, which improved recommendation performance. However, they ignore the influence of DDIs on the recommendation results. Other models [22,23,24] account for DDIs in training, but they model the structural and temporal properties of EHRs poorly. Therefore, to simultaneously learn the implicit structural and temporal features and reduce the DDIs in recommendation results, we propose a medical ontology tree model (Med-tree) combined with Graph Attention Networks (GAT) for medication recommendation. Specifically, Med-tree applies the class hierarchy extracted from the medical ontology together with the GAT model to learn the representations of diagnoses and procedures, and it builds a DDI graph to reduce the DDI rate. As a result, the Med-tree model significantly improves the recommendation quality.
The contributions of our work can be summarized as follows:
- (1) We treat EHRs as continuous records with structural and temporal properties and propose a medical ontology tree combined with GAT. The proposed model uses the medical hierarchy to model the structural features of medical entities, and it applies two GRU models with attention mechanisms to capture the temporal characteristics of EHRs.
- (2) We model the medical knowledge database and construct the drug interaction graph. Specifically, the drug interaction graph is embedded into the memory component, which consists of a memory bank and a dynamic memory. A query generator then performs an attention-based memory search, which effectively reduces the DDI rate.
- (3) The proposed model is tested on the MIMIC-III datasets, where it outperforms all baselines in terms of the Jaccard Similarity Score, Average F1 score and Precision–Recall AUC. In addition, the model achieves a lower DDI rate among recommended medication combinations than previous recommendation models.
The rest of the paper is organized as follows: Section 2 introduces related work, and Section 3 describes the framework of the proposed model. In Section 4, the predictive performance of the proposed model is evaluated against baselines on the MIMIC-III datasets, and several analyses are presented. Finally, Section 5 presents the conclusions and future directions of the research.
2. Related Work
2.1. Graph Attention Networks
Attention mechanisms have been successfully applied to many sequence-based tasks, such as Machine Translation (MT) and Natural Language Understanding (NLU) [25,26]. Unlike Graph Convolutional Networks (GCN), which treat all neighbor nodes equally, Graph Attention Networks (GAT) use the attention mechanism to assign a different weight to each neighbor node and thereby identify the relatively important nodes. Incorporating attention into the GCN also makes the propagation step more intuitive, and the GAT can be regarded as a member of the GCN family.
The GAT model has been applied in many fields. Fang et al. [27] proposed a traffic network speed prediction model named L-GAT, which captures the spatial characteristics and the temporal dynamics of the traffic network. Based on GAT, Cai et al. [28] proposed an unsupervised model named DQ-GAT for scalable and proactive autonomous driving, which provides a better trade-off between safety and efficiency in both seen and unseen scenarios. Qin et al. [29] proposed a Co-Interactive Graph Attention Network (Co-GAT), which establishes connections between dialog act recognition and sentiment classification so as to capture speakers' intentions. Moreover, studies [22,30] applied GAT to model the internal relationships between medical events, which brought substantial progress in the medical field.
Similar to the GCN model, the GAT model operates on the local neighborhood of each node rather than on the whole graph structure, which improves computational efficiency and reduces memory usage. Unlike the GCN model, however, the GAT model assigns different weights to neighbor nodes according to their importance to the current node, which handles structural problems better. For this reason, the GAT model is widely used in social prediction, drug discovery and recommendation systems.
2.2. Medication Recommendation
Medication recommendation is one of the significant research directions in the field of intelligent medicine. It can assist doctors in making safe and effective prescriptions and is of great significance for drug synergism and safety. Because of the lack of available datasets and the difficulty of information sharing in early medical research [31,32], early medication recommendation methods were mainly based on expert prior rules and focused on the association and causality between diagnoses, procedures and drug combinations. Specifically, Chen et al. [33] described a physician-advisory system for Chronic Heart Failure (CHF) management, which encoded the entire set of clinical practice guidelines using answer-set programming and gave patients medical information like a human physician. Slavescu et al. [34] presented a rule-based system that assists medical doctors in routine tasks by suggesting a diagnosis and recommending treatment based on a patient's medical history and current symptoms. Moreover, Ajmi et al. [35] proposed an expert system that recommends medication combinations depending on where the patient lives and on the patient's symptoms. Although these rule-based methods can recommend drugs for patients, their recommendation accuracy is limited.
As the security of information collection and sharing has improved [36,37], medical data have been widely accumulated. Meanwhile, deep learning-based methods have gradually become the mainstream in medication recommendation. These approaches learn the relationships between medical entities from the statistical regularities of EHRs and apply these relationships to medical consultation and medication recommendation. Specifically, Wang et al. [23] constructed temporal information from medical records to obtain patient representations and built a key-value memory network to recommend medications. Wang et al. [38] obtained the target distribution associated with safe medication combinations from raw patient records, which shapes the distributions of patient representations and reduces the DDI rate of the recommendation. Furthermore, Shang et al. [22] pretrained the relationships between drugs and constructed a knowledge graph. Moreover, studies [39,40] used the GCN model to learn the relationships between medical events and built a medical relationship tree. Although these methods are more accurate than the earlier methods, they still have limitations, such as high complexity and a large number of training parameters.
Based on the above reasons, we propose a medical ontology tree model (Med-tree) that can simultaneously capture the complex correlation and temporal features in EHRs and effectively reduce the DDI rate in real medical datasets.
3. The Proposed Model
In this section, the training process of the proposed model is described in detail. To be specific, the description of Med-tree is divided into three parts. First, the data structures and the medication recommendation tasks are explained. Next, the framework of the proposed model is described in detail. Finally, the optimization of the Med-tree and the training algorithm are presented.
3.1. Problem Formulation
Medication recommendation models based on EHRs need to be trained on high-quality datasets. To improve recommendation accuracy, the EHRs therefore need to be standardized and preprocessed. The following parts describe the standardized EHRs, the medical ontology tree constructed during preprocessing, the definitions of the EHR graph and DDI graph, and the medication recommendation task. The notations used in the proposed model are shown in Table 1.
3.1.1. Definition of the Standardized EHRs
To improve the accuracy of medication recommendation, the EHRs are standardized into ICD-9 diagnosis codes, ICD-9 procedure codes and ATC medication codes, as shown in Figure 2. The standardized EHRs can be represented as a sequence of multivariate observations: $X = \{x_1^{(n)}, x_2^{(n)}, \dots, x_T^{(n)}\}_{n=1}^{N}$, where $N$ represents the total number of patients and $T$ represents the maximum number of visits of a patient. To avoid ambiguity, the superscript $(n)$ is omitted below, and the training process of Med-tree is described for a single patient. Specifically, the $t$th visit $x_t = \{c_d^t, c_p^t, c_m^t\}$ of a patient contains diagnosis codes $c_d^t$, procedure codes $c_p^t$ and medication codes $c_m^t$.
3.1.2. Definition of the Medical Ontology Tree
The structure of the ICD-9 coding system resembles the class hierarchy extracted from a medical ontology and can be represented as a directed acyclic graph. A leaf node represents an ICD-9 diagnosis code or an ICD-9 procedure code, and an ancestor node represents a level of the medical hierarchy with a specific classification meaning. The structure of the medical ontology tree is shown in Figure 3, and each node in Figure 3 is described in Table 2. For example, two leaf nodes represent angina decubitus (ICD-9 413.0) and Prinzmetal angina (ICD-9 413.1), respectively. Both belong to the sub-classification of angina pectoris (413), represented by their parent node, and to the general classification of ischemic heart disease (410–414), represented by a higher ancestor node. In the training of Med-tree, all diagnosis nodes and their ancestors are regarded as tree nodes, and the relationships between medical entities are regarded as edges between nodes and their neighbors. The medical ontology tree is constructed from these edges. Because diagnoses and procedures are different medical concepts, a diagnosis ontology tree and a procedure ontology tree are constructed separately.
3.1.3. Definition of the EHR Graph and DDI Graph
The EHR graph and the DDI graph can be represented as $G_e = \{\mathcal{M}, \varepsilon_e\}$ and $G_d = \{\mathcal{M}, \varepsilon_d\}$, respectively, where $\mathcal{M}$ is the set of medications, $\varepsilon_e$ is the edge set derived from the EHR datasets, and $\varepsilon_d$ is the edge set of known DDIs between pairs of drugs. Adjacency matrices $A_e, A_d \in \mathbb{R}^{|\mathcal{M}| \times |\mathcal{M}|}$ are constructed to specify these edges. Specifically, $A_e[i,j] = 1$ means that drug $i$ and drug $j$ appear in the same prescription and act synergistically, while $A_d[i,j] = 1$ indicates that drug $i$ and drug $j$ conflict with each other.
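The construction of the two adjacency matrices can be illustrated with a minimal sketch. It assumes medications are already mapped to integer indices; the function name `build_adjacency` and the toy prescriptions and DDI pairs are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def build_adjacency(num_meds, med_sets):
    """Return a symmetric 0/1 matrix with A[i][j] = 1 when i and j occur in the same set."""
    adj = [[0] * num_meds for _ in range(num_meds)]
    for meds in med_sets:
        # Every unordered pair inside one set becomes an undirected edge.
        for i, j in combinations(sorted(set(meds)), 2):
            adj[i][j] = adj[j][i] = 1
    return adj


# Toy data (hypothetical): four medications indexed 0..3.
prescriptions = [[0, 1, 2], [1, 3]]  # medication codes prescribed together in one visit
known_ddi_pairs = [[0, 3]]           # pairs listed in a DDI knowledge base

A_e = build_adjacency(4, prescriptions)    # EHR co-prescription graph
A_d = build_adjacency(4, known_ddi_pairs)  # DDI graph
```

Both matrices are symmetric with a zero diagonal, so a pair such as (0, 3) can be absent from the EHR graph yet present in the DDI graph.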
3.1.4. Medication Recommendation Task
The inputs are the diagnosis codes $c_d^t$ and procedure codes $c_p^t$ of the current $t$th visit and the historical visit representations $x_{1:t-1} = [x_1, x_2, \dots, x_{t-1}]$, where $x_i = \{c_d^i, c_p^i, c_m^i\}$ represents the set of diagnosis codes, procedure codes and medication codes of the $i$th visit. The goal of the medication recommendation task is to obtain a medication combination $\hat{y}_t \in \{0,1\}^{|\mathcal{M}|}$ at the $t$th visit based on the patient's current clinical events $c_d^t$, $c_p^t$ and historical visits $x_{1:t-1}$, where $|\mathcal{M}|$ represents the total number of medications that can be recommended. Since a medication combination contains more than one label, medication recommendation based on EHRs is considered a multi-label classification task.
3.2. Model Framework
The Med-tree model is divided into three parts: Ontology Embedding, Temporal Dependency and Knowledge Memory. The structure shown in Figure 4 and the training process of the proposed model are explained in detail below.
3.2.1. Ontology Embedding
The EHR of a single admission is considered a set of medical entities consisting of diagnoses, procedures and medications. These entities are internally correlated, and these correlations, which differ in meaning and degree, are called structural characteristics. For example, certain diagnoses may require specific procedures and medications. However, the vector representation of the standardized EHRs is unordered and does not reflect these relationships. Therefore, the medical ontology tree is used to enrich the vector representations of medical entities. As shown in Figure 5, the arrowheads indicate the direction of message passing in the medical ontology tree, and the tree aggregates and updates messages among medical nodes through a dedicated GAT. Leaf nodes, which represent specific medical entities, and ancestor nodes, which represent medical classification concepts, are handled by different operations.
For the non-leaf nodes, which carry medical classification concepts, the correlation with each child node is learned adaptively, and the node representation is updated by a weighted sum. In this update, $K$ represents the number of attention heads; $\sigma$ is the sigmoid function; $C(i)$ represents all child nodes of node $i$; $e_j$ is the vector representation of node $j$; and $\alpha_{ij}^{k}$ can be interpreted as the weight coefficient of the $k$th attention head between node $i$ and node $j$. The coefficient $\alpha_{ij}^{k}$ is calculated as in [41], where $a$ is a single-layer feedforward neural network, $\Vert$ is the concatenation operation, and the weight coefficient $\alpha_{ij}^{k}$ is obtained through the LeakyReLU activation function.
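For orientation, the multi-head update described above is commonly written as in the standard GAT formulation, which the text appears to follow via [41]. The block below is a sketch under that assumption, with $C(i)$ the set of children of node $i$ and $e_j$ the embedding of node $j$. The projection matrix $W^{k}$ and the per-head attention vector $a_k$ are notation introduced here, and the text does not state whether Med-tree concatenates or averages the $K$ heads.

```latex
\begin{align}
e_i' &= \Big\Vert_{k=1}^{K} \sigma\Big( \sum_{j \in C(i)} \alpha_{ij}^{k} W^{k} e_j \Big), \\
\alpha_{ij}^{k} &= \frac{\exp\big(\mathrm{LeakyReLU}\big(a_k^{\top} [\, W^{k} e_i \,\Vert\, W^{k} e_j \,]\big)\big)}
                        {\sum_{l \in C(i)} \exp\big(\mathrm{LeakyReLU}\big(a_k^{\top} [\, W^{k} e_i \,\Vert\, W^{k} e_l \,]\big)\big)}
\end{align}
```

The softmax in the second line guarantees that the coefficients over the children of one node sum to 1, so the update is a convex combination of projected child embeddings before the nonlinearity.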
For a leaf node $i$, the information of its ancestor nodes is aggregated into the node itself, so that the embedded representation of the leaf node is enriched with the information of its medical classifications. Graph attention mechanisms are used to adaptively learn the correlation coefficient of each classification concept and to update the representation of leaf node $i$. In this update, $A(i)$ represents all ancestor nodes of node $i$; $e_j$ represents the vector representation of node $j$; and $\alpha_{ij}^{k}$ is the corresponding attention coefficient between leaf node $i$ and ancestor node $j$.
The medical ontology tree uses the GAT model to pass information from bottom to top and then aggregates the results into all leaf nodes from top to bottom, which enriches the vector representations of the medical entities. As a result, the diagnosis codes $c_d^t$ and procedure codes $c_p^t$ of the current $t$th visit are transformed into the more comprehensive representations $e_d^t$ and $e_p^t$.
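The two passes can be sketched on a toy tree as follows. This is a simplified illustration rather than the authors' implementation: it uses a single attention head, omits the learned projection matrix and the output nonlinearity, and all embeddings and the attention vector `a` are hypothetical values. The node names reuse ICD-9 codes from the ischemic heart disease range purely as labels.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def attend(target, neighbors, a):
    """Weight each neighbor by softmax(LeakyReLU(a . [target || neighbor])) and sum."""
    scores = [leaky_relu(sum(w * v for w, v in zip(a, target + nb))) for nb in neighbors]
    alphas = softmax(scores)
    out = [sum(al * nb[d] for al, nb in zip(alphas, neighbors)) for d in range(len(target))]
    return out, alphas


# Toy embeddings (hypothetical), dimension 2.
emb = {
    "410-414": [0.0, 1.0],
    "413": [1.0, 0.0],
    "414": [0.0, 2.0],
    "413.0": [1.0, 1.0],
    "413.1": [0.5, -0.5],
}
children = {"413": ["413.0", "413.1"], "410-414": ["413", "414"]}
ancestors = {"413.0": ["413", "410-414"], "413.1": ["413", "410-414"], "414": ["410-414"]}
a = [0.1, -0.2, 0.3, 0.4]  # hypothetical attention parameters, length 2 * dim

# Bottom-up pass: refresh each non-leaf node from its children, deepest parent first.
for parent in ["413", "410-414"]:
    emb[parent], _ = attend(emb[parent], [emb[c] for c in children[parent]], a)

# Top-down pass: enrich each leaf with its already updated ancestors.
leaf_out = {}
leaf_alphas = {}
for leaf, anc in ancestors.items():
    leaf_out[leaf], leaf_alphas[leaf] = attend(emb[leaf], [emb[x] for x in anc], a)
```

Processing the deepest parents first ensures that a higher ancestor sees the refreshed representation of its children before the leaves read from it.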
3.2.2. Temporal Dependency
The records of a patient’s multiple visits can be regarded as a collection of time series. Meanwhile, the current diagnosis of patients may be influenced by their previous health conditions. For example, for patients with chronic conditions, such as diabetes or cerebral infarction, their current visits may be very similar to the historical visits. Moreover, stomach perforation and erysipelas may appear in a patient’s later visit when they have a peptic ulcer or chickenpox. Therefore, it is necessary to fully explore the temporal characteristics of the EHRs.
The more comprehensive representations $e_d^t$ and $e_p^t$ are input into GRU models to capture the temporal features of the EHRs. Taking the diagnosis representation $e_d^t$ as an example, two GRU models with attention mechanisms are used to calculate the influence of the historical diagnosis sequence on the current diagnosis representation, which yields the hidden layer representations $g_d$ and $h_d$. Two attention weights, $\alpha_d$ and $\beta_d$, are computed from these hidden representations using different activation functions. The weights $\alpha_d$ and $\beta_d$ are then applied to obtain the representation $\tilde{e}_d^t$, which carries the historical diagnostic information, through an element-wise multiplication denoted by ⊗.
Similarly, the procedure representation $e_p^t$ is input into the GRU models to capture the influence of the patient's historical procedure information and obtain a procedure representation with history dependency. In this way, the representations $e_d^t$ and $e_p^t$ are transformed into $\tilde{e}_d^t$ and $\tilde{e}_p^t$.
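One common way to realize two GRUs with two attention weights is the RETAIN-style scheme sketched below for the diagnosis sequence $e_d^{1}, \dots, e_d^{t}$. This is an assumption about the form of the update, not a reproduction of the paper's equations: the parameters $w_{\alpha}$, $b_{\alpha}$, $W_{\beta}$, $b_{\beta}$ and the choice of softmax and tanh as the two activation functions are introduced here for illustration.

```latex
\begin{align}
g_1, \dots, g_t &= \mathrm{GRU}_{\alpha}(e_d^{1}, \dots, e_d^{t}), &
h_1, \dots, h_t &= \mathrm{GRU}_{\beta}(e_d^{1}, \dots, e_d^{t}), \\
\alpha_1, \dots, \alpha_t &= \mathrm{softmax}\big(w_{\alpha}^{\top} g_1 + b_{\alpha}, \dots, w_{\alpha}^{\top} g_t + b_{\alpha}\big), &
\beta_i &= \tanh(W_{\beta} h_i + b_{\beta}), \\
\tilde{e}_d^{t} &= \sum_{i=1}^{t} \alpha_i \, \beta_i \otimes e_d^{i}
\end{align}
```

In this form the scalar $\alpha_i$ weights whole visits, while the vector $\beta_i$ weights individual embedding coordinates, which is why the final combination needs an element-wise product.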
3.2.3. Knowledge Memory
The Knowledge Memory module is constructed to make full use of medication knowledge. It not only embeds the EHR and DDI graphs as facts in the memory bank but also inserts the patients' history information into the dynamic memory, so that information is obtained from different views. Specifically, inspired by research [22], the EHR and DDI records in Section 3.1.3 are embedded into the memory bank to further improve the recommendation accuracy and reduce the DDI rate of the recommendation results. The Knowledge Memory module can be divided into the following five steps:
- (1) Convert the inputs into a representation vector of the $t$th visit. The representations $\tilde{e}_d^t$ and $\tilde{e}_p^t$ produced by the Temporal Dependency module are combined into a query $q^t$ by a transformation function $f$ that connects the diagnosis representation $\tilde{e}_d^t$ and the procedure representation $\tilde{e}_p^t$.
- (2) Design a memory bank. Graph-augmented memory representations are stored in the memory bank using the two adjacency matrices $A_e$ and $A_d$. Following the GCN procedure, each adjacency matrix $A$ ($A$ stands for $A_e$ or $A_d$) is first normalized, where $\tilde{D}$ is a diagonal matrix such that $\tilde{D}_{ii} = \sum_j A_{ij}$ and $I$ is the identity matrix. A two-layer GCN is then applied to the EHR graph and to the DDI graph to capture the medical relationships of drug co-usage and DDIs, respectively, and the output $M_b$ is generated as a weighted combination of the two graph embeddings, where $\beta$ is a trainable coefficient; $W_{e1}$ and $W_{e2}$ are the medication embeddings of the EHR graph and the DDI graph; and $W_1$ and $W_2$ are the weight matrices of the hidden layer.
- (3) Design a dynamic memory. The patient's historical information is inserted into the dynamic memory as key-value pairs to capture information from different perspectives. Based on the query $q^t$ and the medication representation $c_m^t$, the history cache $M_d^t$ of the $t$th visit is represented as key-value pairs, where $M_d^t$ is empty when $t = 1$ and $t' \in \{1, \dots, t-1\}$ indexes the historical visits before the $t$th visit. For convenience, $M_{d,k}^t$ denotes the key vectors and $M_{d,v}^t$ denotes the value vectors of the history cache of the $t$th visit.
- (4) Output the memory representation. The outputs $o_b^t$ and $o_d^t$ are obtained from the query $q^t$, the memory bank $M_b$ and the dynamic memory $M_d^t$. Attention mechanisms are applied to retrieve the information most relevant to the query $q^t$.
- (5) Obtain the multi-label recommended medication combination $\hat{y}_t$ by applying the activation function $\sigma$ to $q^t$, $o_b^t$ and $o_d^t$.
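Because the module is stated to be inspired by [22], the five steps plausibly take the form used in that line of work. The equations below are a sketch under that assumption and should not be read as the exact Med-tree formulas. The symbols $\tilde{e}_d^{t}$ and $\tilde{e}_p^{t}$ (temporal diagnosis and procedure representations), $\tilde{A}$ (normalized adjacency matrix), and the key and value views $M_{d,k}^{t}$ and $M_{d,v}^{t}$ of the dynamic memory are notation chosen here.

```latex
\begin{align}
q^{t} &= f\big([\tilde{e}_d^{t}, \tilde{e}_p^{t}]\big), \\
\tilde{A} &= \tilde{D}^{-\frac{1}{2}} (A + I)\, \tilde{D}^{-\frac{1}{2}}, \\
M_b &= \tilde{A}_e \tanh(\tilde{A}_e W_{e1}) W_1 - \beta\, \tilde{A}_d \tanh(\tilde{A}_d W_{e2}) W_2, \\
M_d^{t} &= \{ q^{t'} : c_m^{t'} \}_{t'=1}^{t-1}, \\
o_b^{t} &= M_b^{\top}\, \mathrm{softmax}(M_b\, q^{t}), \\
o_d^{t} &= M_b^{\top} (M_{d,v}^{t})^{\top}\, \mathrm{softmax}(M_{d,k}^{t}\, q^{t}), \\
\hat{y}_t &= \sigma\big([q^{t}, o_b^{t}, o_d^{t}]\big)
\end{align}
```

Under this reading, the minus sign in the third line lets the DDI graph embedding counteract the co-prescription embedding, which is how the memory bank can discourage interacting pairs.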
3.3. Optimization
The gap between the recommended medication combination and the actually prescribed combination determines the quality of the recommendation model. Because the medication recommendation task can be regarded as a multi-label classification problem, the multi-label margin loss $L_{multi}$ and the binary cross-entropy loss $L_{bce}$ are combined into the multi-label classification loss $L_p$. Here, $\pi$ denotes the mixture weights; $T$ represents the maximum number of visits of a patient; $y_t[i]$ and $\hat{y}_t[i]$ are the ground-truth and predicted values at the $i$th medication coordinate of the $t$th visit, respectively; and $\hat{y}_t[Y_t^{j}]$ represents the predicted value of the $j$th classification label indexed by the classification label set $Y_t$.
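For reference, the two component losses have well-known standard forms, and a mixture of them can be written as below. This is a sketch that assumes the standard definitions (the margin term follows the usual multi-label margin loss, where $i$ ranges over labels outside the target set $Y_t$) and treats $\hat{y}_t[i]$ as a raw score passed through the sigmoid $\sigma$. The indexing $\pi[0]$, $\pi[1]$ is notation introduced here.

```latex
\begin{align}
L_{bce} &= -\sum_{t=1}^{T} \sum_{i=1}^{|\mathcal{M}|}
          \Big[ y_t[i] \log \sigma(\hat{y}_t[i]) + (1 - y_t[i]) \log\big(1 - \sigma(\hat{y}_t[i])\big) \Big], \\
L_{multi} &= \sum_{t=1}^{T} \sum_{i \notin Y_t} \sum_{j=1}^{|Y_t|}
          \frac{\max\big(0,\; 1 - (\hat{y}_t[Y_t^{j}] - \hat{y}_t[i])\big)}{|\mathcal{M}|}, \\
L_p &= \pi[0]\, L_{bce} + \pi[1]\, L_{multi}
\end{align}
```

The cross-entropy term scores each medication independently, while the margin term pushes every prescribed medication to score at least 1 above every non-prescribed one.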
The DDI loss $L_{DDI}$ is designed to control the DDI rate of the medication recommendation result. It is computed from $\hat{y}_t^{\top} \hat{y}_t$, which gives the pair-wise probability of the recommendation result, combined with the DDI adjacency matrix $A_d$ through the element-wise product ⊙.
To achieve a lower DDI rate in the medication recommendation results, a balance between $L_p$ and $L_{DDI}$ must be struck. Inspired by study [42], we switch between $L_p$ and $L_{DDI}$ with a certain probability. On the one hand, $L_{DDI}$ is used with high probability when the DDI rate $s'$ of the recommendation is larger than the expected DDI rate $s$. On the other hand, a decay rate $\epsilon$ is applied to the temperature $Temp$ as training proceeds, so that once the model becomes stable, $L_p$ is used as the loss function.
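The switching rule can be sketched as follows. This is an illustrative assumption rather than the paper's exact rule: the acceptance probability $\exp(-(s' - s)/Temp)$ is the annealing-style form used in related work, and the function names `ddi_loss` and `choose_loss` and all numeric values are hypothetical.

```python
import math
import random


def ddi_loss(probs, A_d):
    """Sum of probs[i] * probs[j] over all pairs marked as interacting in A_d."""
    n = len(probs)
    return sum(A_d[i][j] * probs[i] * probs[j] for i in range(n) for j in range(n))


def choose_loss(pred_loss, ddi_term, ddi_rate, target_rate, temperature, u=None):
    """Pick the loss to back-propagate for one training sample.

    u is a uniform draw in [0, 1); it is sampled when not supplied.
    """
    if ddi_rate <= target_rate:
        return pred_loss  # recommendation is already safe enough
    if u is None:
        u = random.random()
    # Assumed acceptance probability: close to 1 for a small excess or a high
    # temperature, and close to 0 once the temperature has decayed.
    p = math.exp(-(ddi_rate - target_rate) / temperature)
    return ddi_term if u < p else pred_loss


# Toy usage: two medications that interact, each recommended with probability 0.5.
toy_ddi = ddi_loss([0.5, 0.5], [[0, 1], [1, 0]])
```

As the temperature is multiplied by the decay rate after each update, the acceptance probability shrinks, so late in training the prediction loss dominates.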
In summary, the training algorithm of the Med-tree model is detailed in Algorithm 1.
Algorithm 1: Training Algorithm.
4. Performance Analysis
This section has three parts. First, the experimental configuration is explained, including the data source, the baselines and the metrics. Second, the performance of Med-tree for medication recommendation is verified by comparing it with baseline models. Finally, the feasibility of the proposed model in practical applications is examined through a case study.
4.1. Experimental Setup
4.1.1. Data Source
The experiments are carried out on the MIMIC-III datasets, whose structure is described as follows:
MIMIC-III are open-source medical datasets based on intensive care unit patient monitoring and managed by MIT. They contain more than 50,000 admissions to intensive care units in large tertiary care hospitals between 2001 and 2012 and 7870 newborns admitted between 2001 and 2008. The MIMIC-III data components include vital signs, diagnoses, procedures, medications, and so on. To improve EHR standardization and availability, the data are transformed into time-series lists of diagnoses, procedures and medications that are easy to train on.
In addition, the representations of medical entities in the MIMIC-III datasets are converted according to medical standards. Specifically, medications recorded with NDC codes are converted to ATC codes, and diagnoses and procedures are expressed as ICD-9 codes. The statistics listed in Table 3 further illustrate the characteristics of the MIMIC-III datasets.
4.1.2. Baselines
Several baselines are considered for comparisons as follows:
4.1.3. Metrics
To measure the accuracy of the experimental results, the Jaccard Similarity Score (Jaccard), Average F1 (F1) and Precision–Recall AUC (PRAUC) are used as evaluation metrics. Jaccard is the size of the intersection of the true and recommended medication sets divided by the size of their union, averaged over all visits, where $N$ is the total number of patients and $T_k$ is the number of visits of the $k$th patient.
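A minimal sketch of this metric is given below, assuming each visit is stored as a pair of true and recommended medication sets and that the average runs over all visits of all patients. The function names and the toy data are hypothetical.

```python
def jaccard(true_meds, pred_meds):
    """Intersection over union of two medication sets (0.0 when both are empty)."""
    true_meds, pred_meds = set(true_meds), set(pred_meds)
    union = true_meds | pred_meds
    return len(true_meds & pred_meds) / len(union) if union else 0.0


def mean_jaccard(patients):
    """patients: list of patients, each a list of (true_meds, pred_meds) visits."""
    scores = [jaccard(y, p) for visits in patients for y, p in visits]
    return sum(scores) / len(scores)


# Toy visit: 15 true drugs, 14 of them recommended, plus 3 wrong recommendations.
true_visit = set(range(15))
pred_visit = set(range(14)) | {100, 101, 102}
toy_score = jaccard(true_visit, pred_visit)  # 14 shared drugs out of 18 distinct drugs
```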
The PRAUC is calculated by the trapezoidal integral for the area under the PR curve, which can be applied to datasets with imbalanced positive and negative sample numbers.
The F1 score decomposes the multi-label classification problem into $n$ binary classification problems and averages their scores to obtain the final evaluation index. In the F1 formula, $t$ denotes the $t$th visit and $k$ denotes the $k$th patient in the test set.
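The per-visit F1 and its average can be sketched as follows, assuming the standard harmonic mean of precision and recall computed on medication sets and averaged over all visits. The function names and the toy data are hypothetical.

```python
def visit_f1(true_meds, pred_meds):
    """Harmonic mean of precision and recall for one visit."""
    true_meds, pred_meds = set(true_meds), set(pred_meds)
    hits = len(true_meds & pred_meds)
    if hits == 0:
        return 0.0
    precision = hits / len(pred_meds)
    recall = hits / len(true_meds)
    return 2 * precision * recall / (precision + recall)


def average_f1(patients):
    """patients: list of patients, each a list of (true_meds, pred_meds) visits."""
    scores = [visit_f1(y, p) for visits in patients for y, p in visits]
    return sum(scores) / len(scores)


# Toy visit: 15 true drugs, 14 of them recommended, plus 3 wrong recommendations.
true_visit = set(range(15))
pred_visit = set(range(14)) | {100, 101, 102}
toy_f1 = visit_f1(true_visit, pred_visit)  # precision 14/17, recall 14/15
```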
To measure medication safety, the DDI rate is defined as the percentage of recommended medication pairs that contain DDIs. Specifically, the DDI rate counts each medication pair $(c_i, c_j)$ in a recommendation set $\hat{Y}$ if the pair belongs to the edge set $\varepsilon_d$ of the DDI graph. Furthermore, the ΔDDI rate is defined as the percentage change of the DDI rate relative to the DDI rate in the EHR test datasets.
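Both safety metrics can be sketched as follows. The sketch assumes the DDI rate is the fraction of unordered medication pairs in the recommendations that appear in the DDI edge set, and that the ΔDDI rate is the relative change with respect to the rate observed in the EHR test data. The function names and numbers are hypothetical.

```python
from itertools import combinations


def ddi_rate(recommendations, ddi_edges):
    """Fraction of recommended medication pairs that are known to interact."""
    edges = {frozenset(e) for e in ddi_edges}
    total = hits = 0
    for meds in recommendations:
        for pair in combinations(sorted(set(meds)), 2):
            total += 1
            hits += frozenset(pair) in edges
    return hits / total if total else 0.0


def delta_ddi_rate(model_rate, ehr_rate):
    """Relative change of the model's DDI rate with respect to the EHR test data."""
    return (model_rate - ehr_rate) / ehr_rate


# Toy usage: one recommendation of three drugs, one interacting pair.
toy_rate = ddi_rate([[0, 1, 2]], [(0, 1)])
toy_delta = delta_ddi_rate(0.07, 0.08)
```

A negative ΔDDI rate means that the recommendations contain proportionally fewer interacting pairs than the prescriptions recorded in the test data.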
4.1.4. Evaluation Strategies
The datasets are randomly divided into training, validation, and test sets at a ratio of . To be specific, the EHR graph and the DDI graph are constructed in pretraining, and the Adam is applied as an optimizer with an initial learning rate of 0.001. The best recommendation result can be fixed on the evaluation set in 40 epochs. All the methods are implemented in PyTorch 1.7.0 and trained on Ubuntu 18.04 with 12 GB memory and Nvidia 3090 GPU.
4.2. Experimental Results
The validity of the proposed model is examined in four parts. The first part compares the recommendation accuracy of the proposed model with the baselines. The second part compares the DDI rates of Med-tree and the baselines. The third part compares Med-tree with the baselines for medications with different occurrence frequencies. The last part compares Med-tree with the baselines on medical records of varying lengths.
4.2.1. Prediction Performance
Table 4 shows the performance comparison of the Jaccard Similarity Score, Average F1 score, Precision–Recall AUC, and DDI rate between Med-tree and the baselines on the MIMIC-III datasets. The experimental results show that the proposed model achieves the best results on all indicators. In terms of Jaccard Similarity Score, Average F1 score, and Precision–Recall AUC, the proposed model is 1.02%, 1.23%, and 1.09% higher than the latest method (PREMIER), respectively. Moreover, the average number of drugs recommended by Med-tree is 14.98, which has the smallest gap from the true value of 14.68. These indicators show that the proposed model can effectively improve the accuracy of medication recommendations. Meanwhile, Med-tree takes drug safety into account. LR and LEAP are machine learning-based methods, which are inferior to RETAIN and the other deep learning methods on all indicators. GAMENet and PREMIER are the latest methods that introduce DDI knowledge and reduce the DDI rate of the recommended medication combination. However, the DDI rates of these methods are all higher than that of the proposed model, which verifies that combining the medical ontology tree with learned medical feature representations can greatly improve the medication recommendation result.
4.2.2. Evaluation of the DDIs
To verify the ability of the model to reduce the DDI rate in recommended medication combinations, the experiments are grouped into Top 40, Top 60, Top 80, and Top 100 settings to compare Med-tree with the baselines, and Table 5 lists the comparison results.
LEAP is a rule-based approach in which recommended medication combinations are selected from the drugs prescribed by previous doctors, so the DDI rate of LEAP is lower than that of the other baselines. However, the LEAP model requires a large amount of manual work, which makes it unsuitable for complex datasets. The RETAIN model does not incorporate knowledge of DDIs, which leads to a higher DDI rate. The GAMENet and PREMIER models learn DDI knowledge by building the EHR graph and the DDI graph, so they still have low DDI rates in the Top 40 and Top 60 settings. However, their DDI rates inevitably increase in the Top 80 and Top 100 settings. Med-tree is superior to all previous deep learning methods. Even though its ΔDDI rate increases from −18.48% to −0.26%, the Med-tree model still keeps the ΔDDI rate below zero. These results further indicate that Med-tree can take drug safety into account in complex settings. They also show that introducing the medical ontology tree enriches the representations of medical entities and improves the quality of the downstream medication recommendation task.
4.2.3. Evaluation for Unbalanced Medications
Because of the nature of EHRs, drugs occur with very different frequencies, so it is difficult to recommend low-frequency drugs. The proposed model mitigates the influence of different medication frequencies by constructing the medical ontology tree and the memory bank.
Figure 6a counts the number of medications in different frequency bands. It can be seen that 58 of the 145 medication types appear fewer than 100 times, while nearly 40 types are recommended more than 1000 times. The average F1 scores of the recommendation results are calculated for the different frequency ranges, as shown in Figure 6b, which indicates that the proposed model significantly improves the recommendation of less frequent medications compared with other methods. In addition, several frequency bands have higher average F1 scores because they contain more dedicated medications for specific diseases, such as hypertension drugs and diabetes drugs.
4.2.4. Evaluation for EHRs of Different Length
As can be seen from Table 3, the maximum number of medical visits of a patient in the MIMIC-III datasets is 29. Since each patient has a different number of admissions, the influence of the sequence length should be considered.
Figure 7 shows the evaluations for EHRs with different temporal lengths on the MIMIC-III datasets. The LEAP model is a rule-based model, and the length of the EHRs has no effect on its recommendation results, which leads to low F1 scores. The GAMENet model uses a GRU to learn the temporal characteristics of a patient's medical records, which greatly improves the F1 scores compared with the LEAP model. However, its F1 scores gradually decrease as the sequence length of the EHRs increases. The PREMIER model improves on GAMENet by adding two attention mechanisms to promote the learning of temporal information in medical records, which results in relatively high F1 scores. Med-tree has the best performance among all methods, and its F1 score remains high as the sequence length increases, which indicates that the proposed model learns the temporal dependency of a patient's medical records better.
4.3. Case Study
To observe the recommendation ability of the proposed model concretely, the performance of Med-tree is compared with that of the baselines on case samples randomly selected from the MIMIC-III dataset. A typical case is shown in
Table 6, which presents the recommendation results for a patient’s last visit. In that visit, 15 drugs were actually prescribed, and the medication combinations recommended by Med-tree and the baseline methods are listed in
Table 6. Med-tree performs best: it correctly recommends 14 of the 15 drugs, misses only 1 medication, and mispredicts 3 types. The previous best method, PREMIER, recommends one fewer correct drug than Med-tree. Furthermore, none of the models hit the medication “Anxiolytics”, which may be due to the prescribing habits of individual doctors for this drug. In summary, compared with the other methods, Med-tree gives more accurate recommendations on the MIMIC-III dataset, and the case comparison supports the strong performance of the proposed model in practical medication recommendation.
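The counts reported for this case can be turned into set-based scores with a few lines of arithmetic. The sketch below assumes that the 3 mispredicted types are false positives and the 1 missed medication is a false negative; the function name is hypothetical, and Jaccard is included only because it is a set-overlap score often reported alongside F1 in this line of work.

```python
def set_metrics(tp, fp, fn):
    """Precision, recall, F1 and Jaccard from per-visit counts (assumes tp > 0)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    jaccard = tp / (tp + fp + fn)
    return precision, recall, f1, jaccard


# Med-tree on the case above: 14 correct, 3 mispredicted, 1 missed.
precision, recall, f1, jaccard = set_metrics(tp=14, fp=3, fn=1)
print(f"precision={precision:.3f} recall={recall:.3f} "
      f"f1={f1:.3f} jaccard={jaccard:.3f}")
# precision=0.824 recall=0.933 f1=0.875 jaccard=0.778
```

Under these assumptions, the F1 score for this single visit is 28/32 = 0.875, which makes the gap to a model with one fewer correct drug easy to quantify.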
5. Conclusion and Future Work
In this work, we propose a novel method for medication recommendation that enhances the learning of obscure structural and temporal features and reduces the DDIs in recommendation results. In terms of structural correlation, we construct the medical ontology tree and apply the GAT model to learn the target features, which effectively captures the internal correlation between medical events. Meanwhile, the GRU model combined with attention mechanisms is applied to capture the temporal characteristics, which improves the accuracy of recommendation. Finally, the memory bank and the dynamic memory mechanism are introduced to reduce the DDI rate of the recommendation results. Experimental results show that the proposed model effectively captures the structural correlation and temporal features and outperforms the existing methods. Furthermore, the case study shows that the proposed model can make more accurate and reasonable prescriptions for patients in practical applications.
In practice, EHRs remain inadequate for the task of personalized and accurate medication recommendation. Because of missing, inaccurate, and contradictory information, among other reasons, EHRs contain many uncertainties for medication recommendation. Moreover, we only consider the influences of diagnoses and procedures, so there is still great potential for improvement in the feature mining of EHRs. In the future, we will consider incorporating the original textual information of the EHRs and focus on how to effectively model the fine-grained temporal evolution in EHRs. Meanwhile, medical artificial intelligence studies based on EHRs can further expand their application scope. In addition to medication recommendation, we will apply EHRs to medical knowledge question answering, disease prediction, and other tasks.
Author Contributions
W.Y., L.Z. (Lei Zhang), L.Z. (Lijuan Zhang), J.H. and J.W. contributed to the conception of the study; W.Y. performed the experiment; W.Y., L.Z. (Lijuan Zhang) and N.X. contributed significantly to analysis and manuscript preparation; W.Y., L.Z. (Lijuan Zhang) and J.W. performed the data analyses and wrote the manuscript; L.Z. (Lei Zhang) and N.X. helped perform the analysis with constructive discussions. All authors have read and agreed to the published version of the manuscript.
Funding
The research was partially funded by Zhejiang Province Key Research and Development Project (2020C03071, 2021C03145).
Conflicts of Interest
The authors declare no conflict of interest.
References
- Wang, M.; Wang, M.; Yu, F.; Yang, Y.; Walker, J.; Mostafa, J. A systematic review of automatic text summarization for biomedical literature and EHRs. J. Am. Med. Inform. Assoc. 2021, 28, 2287–2297.
- Ramachandran, S.; Kiruthika, O.O.; Ramasamy, A.; Vanaja, R.; Mukherjee, S. A review on blockchain-based strategies for management of electronic health records (EHRs). In Proceedings of the 2020 International Conference on Smart Electronics and Communication (ICOSEC), Trichy, India, 10–12 September 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 341–346.
- Tutty, M.A.; Carlasare, L.E.; Lloyd, S.; Sinsky, C.A. The complex case of EHRs: Examining the factors impacting the EHR user experience. J. Am. Med. Inform. Assoc. 2019, 26, 673–677.
- Gao, Y.; Xiang, X.; Xiong, N.; Huang, B.; Lee, H.J.; Alrifai, R.; Jiang, X.; Fang, Z. Human action monitoring for healthcare based on deep learning. IEEE Access 2018, 6, 52277–52285.
- Miller, D.D.; Brown, E.W. Artificial intelligence in medical practice: The question to the answer? Am. J. Med. 2018, 131, 129–133.
- Xie, S.; Yu, Z.; Lv, Z. Multi-disease prediction based on deep learning: A survey. CMES Comput. Model. Eng. Sci. 2021, 127, 3.
- Yadav, P.; Steinbach, M.; Kumar, V.; Simon, G. Mining electronic health records (EHRs): A survey. ACM Comput. Surv. 2018, 50, 1–40.
- Lin, C.; He, Y.X.; Xiong, N. An energy-efficient dynamic power management in wireless sensor networks. In Proceedings of the 2006 Fifth International Symposium on Parallel and Distributed Computing, Timisoara, Romania, 6–9 July 2006; IEEE: Piscataway, NJ, USA, 2006; pp. 148–154.
- Xia, F.; Hao, R.; Li, J.; Xiong, N.; Yang, L.T.; Zhang, Y. Adaptive GTS allocation in IEEE 802.15.4 for real-time wireless sensor networks. J. Syst. Archit. 2013, 59, 1231–1242.
- Elhoseny, M.; Shankar, K.; Uthayakumar, J. Intelligent diagnostic prediction and classification system for chronic kidney disease. Sci. Rep. 2019, 9, 1–14.
- Erraguntla, M.; Zapletal, J.; Lawley, M. Framework for Infectious Disease Analysis: A comprehensive and integrative multi-modeling approach to disease prediction and management. Health Inform. J. 2019, 25, 1170–1187.
- Wu, C.; Luo, C.; Xiong, N.; Zhang, W.; Kim, T.H. A greedy deep learning method for medical disease analysis. IEEE Access 2018, 6, 20021–20030.
- Wu, C.; Ju, B.; Wu, Y.; Lin, X.; Xiong, N.; Xu, G.; Li, H.; Liang, X. UAV autonomous target search based on deep reinforcement learning in complex disaster scene. IEEE Access 2019, 7, 117227–117245.
- Nyamabo, A.K.; Yu, H.; Shi, J.Y. SSI–DDI: Substructure–substructure interactions for drug–drug interaction prediction. Briefings Bioinform. 2021, 22, bbab133.
- Ren, Z.H.; Yu, C.Q.; Li, L.P.; You, Z.H.; Guan, Y.J.; Wang, X.F.; Pan, J. BioDKG–DDI: Predicting drug–drug interactions based on drug knowledge graph fusing biochemical information. Briefings Funct. Genom. 2022, 21, 216–229.
- John, A.; Vasudevan, V.; Ilyas, H.A. Medication recommendation system based on clinical documents. In Proceedings of the 2016 International Conference on Information Science (ICIS), Dublin, Ireland, 11–14 December 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 180–184.
- Syed-Abdul, S.; Nguyen, A.; Huang, F.; Jian, W.S.; Iqbal, U.; Yang, V.; Hsu, M.H.; Li, Y.C. A smart medication recommendation model for the electronic prescription. Comput. Methods Programs Biomed. 2014, 117, 218–224.
- Ghasemi, S.H.; Etminani, K.; Dehghan, H.; Eslami, S.; Hasibian, M.R.; Vakili-Arki, H.; Saberi, M.R.; Aghabagheri, M.; Namayandeh, S.M. Design and Evaluation of a Smart Medication Recommendation System for the Electronic Prescription. In Proceedings of the dHealth, Vienna, Austria, 28–29 May 2019; pp. 128–135.
- Choi, E.; Bahadori, M.T.; Song, L.; Stewart, W.F.; Sun, J. GRAM: Graph-based attention model for healthcare representation learning. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; pp. 787–795.
- Gao, C.; Sun, H.; Wang, T.; Tang, M.; Bohnen, N.I.; Müller, M.L.; Herman, T.; Giladi, N.; Kalinin, A.; Spino, C.; et al. Model-based and model-free machine learning techniques for diagnostic prediction and classification of clinical outcomes in Parkinson’s disease. Sci. Rep. 2018, 8, 1–21.
- Choi, E.; Bahadori, M.T.; Sun, J.; Kulas, J.; Schuetz, A.; Stewart, W. Retain: An interpretable predictive model for healthcare using reverse time attention mechanism. Adv. Neural Inf. Process. Syst. 2016, 29, 3504–3512.
- Shang, J.; Xiao, C.; Ma, T.; Li, H.; Sun, J. Gamenet: Graph augmented memory networks for recommending medication combination. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 1126–1133.
- Wang, Y.; Chen, W.; Pi, D.; Yue, L. Adversarially regularized medication recommendation model with multi-hop memory network. Knowl. Inf. Syst. 2021, 63, 125–142.
- Yang, C.; Xiao, C.; Ma, F.; Glass, L.; Sun, J. SafeDrug: Dual Molecular Graph Encoders for Recommending Effective and Safe Drug Combinations. In Proceedings of the 30th International Joint Conference on Artificial Intelligence (IJCAI-21), Montreal, QC, Canada, 19–27 August 2021; pp. 3735–3741.
- Farinhas, A.; Martins, A.F.; Aguiar, P.M. Multimodal continuous visual attention mechanisms. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 1047–1056.
- Liang, Y.; Li, H.; Guo, B.; Yu, Z.; Zheng, X.; Samtani, S.; Zeng, D.D. Fusion of heterogeneous attention mechanisms in multi-view convolutional neural network for text classification. Inf. Sci. 2021, 548, 295–312.
- Fang, Y.; Jiang, J.; He, Y. Traffic Speed Prediction Based on LSTM-Graph Attention Network (L-GAT). In Proceedings of the 4th International Conference on Advanced Electronic Materials, Computers and Software Engineering (AEMCSE), Changsha, China, 26–28 March 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 788–793.
- Cai, P.; Wang, H.; Sun, Y.; Liu, M. DQ-GAT: Towards Safe and Efficient Autonomous Driving with Deep Q-Learning and Graph Attention Networks. IEEE Trans. Intell. Transp. Syst. 2022, 8, 1–11.
- Qin, L.; Li, Z.; Che, W.; Ni, M.; Liu, T. Co-gat: A co-interactive graph attention network for joint dialog act recognition and sentiment classification. In Proceedings of the AAAI Conference on Artificial Intelligence, Online, 2–9 February 2021; Volume 35, pp. 13709–13717.
- Su, C.; Gao, S.; Li, S. GATE: Graph-attention augmented temporal neural network for medication recommendation. IEEE Access 2020, 8, 125447–125458.
- Yao, Y.; Xiong, N.; Park, J.H.; Ma, L.; Liu, J. Privacy-preserving max/min query in two-tiered wireless sensor networks. Comput. Math. Appl. 2013, 65, 1318–1325.
- Zhao, J.; Huang, J.; Xiong, N. An effective exponential-based trust and reputation evaluation system in wireless sensor networks. IEEE Access 2019, 7, 33859–33869.
- Chen, Z.; Marple, K.; Salazar, E.; Gupta, G.; Tamil, L. A physician advisory system for chronic heart failure management based on knowledge patterns. Theory Pract. Log. Program. 2016, 16, 604–618.
- Slăvescu, R.R.; Groşan, A.C.; Slăvescu, K.C. Towards Assisting Medical Decisions by Using Rule Based Protocols and Semantic Resources. In Proceedings of the International Conference on Advancements of Medicine and Health Care through Technology, Cluj-Napoca, Romania, 5–7 June 2014; Springer: Berlin/Heidelberg, Germany, 2014; pp. 31–36.
- Al-Ajmi, N.; Almulla, M.A. Rule-Based Expert System for Headache Diagnosis and Medication Recommendation. Int. J. Health Med. Eng. 2020, 14, 388–391.
- Chen, Y.; Zhou, L.; Pei, S.; Yu, Z.; Chen, Y.; Liu, X.; Du, J.; Xiong, N. KNN-BLOCK DBSCAN: Fast clustering for large-scale data. IEEE Trans. Syst. Man Cybern. Syst. 2019, 51, 3939–3953.
- Huang, S.; Zeng, Z.; Ota, K.; Dong, M.; Wang, T.; Xiong, N.N. An intelligent collaboration trust interconnections system for mobile information control in ubiquitous 5G networks. IEEE Trans. Netw. Sci. Eng. 2020, 8, 347–365.
- Wang, Y.; Chen, W.; Pi, D.; Yue, L.; Wang, S.; Xu, M. Self-Supervised Adversarial Distribution Regularization for Medication Recommendation. In Proceedings of the International Joint Conferences on Artificial Intelligence Organization, Online, 19–26 August 2021.
- Shang, J.; Ma, T.; Xiao, C.; Sun, J. Pre-training of graph augmented transformers for medication recommendation. arXiv 2019, arXiv:1906.00346.
- Choi, E.; Xu, Z.; Li, Y.; Dusenberry, M.; Flores, G.; Xue, E.; Dai, A. Learning the graphical structure of electronic health records with graph convolutional transformer. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 606–613.
- Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903.
- Kirkpatrick, S.; Gelatt, C.D., Jr.; Vecchi, M.P. Optimization by simulated annealing. Science 1983, 220, 671–680.
- Esteban, C.; Tresp, V.; Yang, Y.; Baier, S.; Krompaß, D. Predicting the co-evolution of event and knowledge graphs. In Proceedings of the 19th International Conference on Information Fusion (FUSION), Heidelberg, Germany, 5–8 July 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 98–105.
- Bhoi, S.; Li, L.M.; Hsu, W. Premier: Personalized recommendation for medical prescriptions from electronic records. arXiv 2020, arXiv:2008.13569.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).