International Classification of Diseases Prediction from MIMIC-III Clinical Text Using Pre-Trained ClinicalBERT and NLP Deep Learning Models Achieving State of the Art
Abstract
1. Introduction
1.1. Background
1.2. Exploratory Data Analysis
1.3. Data Processing
2. Methodology
- NOTEEVENTS: This table contains the free-text clinical notes recorded for each admission, describing patient encounters.
- DIAGNOSES_ICD: This table lists the ICD-9 diagnosis codes assigned to the conditions diagnosed during the hospital stay.
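
The two tables can be linked through the admission identifier. The following is a minimal sketch of that step, assuming both tables expose `HADM_ID` and the diagnosis table exposes `ICD9_CODE`, as in the public MIMIC-III schema. The toy rows, the helper name `top_k_dataset`, and the simplification of one note per admission are illustrative assumptions, not the authors' pipeline.

```python
from collections import Counter

# Toy stand-ins for the notes and diagnosis tables (hypothetical rows).
notes = [
    {"HADM_ID": 1, "TEXT": "chest pain, elevated troponin"},
    {"HADM_ID": 2, "TEXT": "shortness of breath, edema"},
    {"HADM_ID": 3, "TEXT": "fever and productive cough"},
    {"HADM_ID": 4, "TEXT": "palpitations, irregular pulse"},
]
diagnoses = [
    {"HADM_ID": 1, "ICD9_CODE": "41401"},
    {"HADM_ID": 2, "ICD9_CODE": "4280"},
    {"HADM_ID": 3, "ICD9_CODE": "486"},
    {"HADM_ID": 4, "ICD9_CODE": "4280"},
    {"HADM_ID": 4, "ICD9_CODE": "42731"},
]


def top_k_dataset(notes, diagnoses, k):
    """Pair each note with its diagnosis codes that are among the k most frequent."""
    counts = Counter(d["ICD9_CODE"] for d in diagnoses)
    top_codes = {code for code, _ in counts.most_common(k)}
    # Assumption: one note per admission, so a plain dict lookup is enough.
    text_by_admission = {n["HADM_ID"]: n["TEXT"] for n in notes}
    pairs = []
    for d in diagnoses:
        text = text_by_admission.get(d["HADM_ID"])
        if text is not None and d["ICD9_CODE"] in top_codes:
            pairs.append((text, d["ICD9_CODE"]))
    return pairs


# Keep only (note, code) pairs whose code is the single most frequent one.
pairs = top_k_dataset(notes, diagnoses, k=1)
```

On the full dataset the same idea would be applied with `k=10` or `k=50`; real admissions usually have several notes each, so the lookup would need to hold a list of texts per `HADM_ID`.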
3. Experimental Setup
- Hyperparameter tuning to find optimal model configurations.
- Error analysis to identify prediction pain points.
- More aggressive data sampling strategies.
- Feature engineering such as text pre-processing.
- Regularization methods like dropout to prevent over-fitting.
- Early stopping to halt training when the result does not improve.
- Learning curves to determine whether more training data are required.
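
Of the techniques listed above, early stopping is the easiest to show in isolation. The sketch below is a generic patience-based version; the class name `EarlyStopping`, the `patience` and `min_delta` parameters, and the validation losses are hypothetical and are not taken from the paper.

```python
class EarlyStopping:
    """Signal a stop when the monitored validation loss stops improving."""

    def __init__(self, patience=2, min_delta=0.0):
        self.patience = patience      # epochs without improvement to tolerate
        self.min_delta = min_delta    # smallest decrease that counts as improvement
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical validation losses, one per epoch.
val_losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.60]
stopper = EarlyStopping(patience=2)
stopped_at = None
for epoch, loss in enumerate(val_losses, start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break
```

With `patience=2` the loop halts after two consecutive epochs without improvement, so the later, lower loss in the list is never reached; a larger patience trades extra training time for the chance to recover from such plateaus.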
4. Results and Discussion
5. Limitations and Future Work
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A
References
1. National Health Service. International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10), 5th Edition. 2022. Available online: https://classbrowser.nhs.uk/ref_books/ICD-10_2022_5th_Ed_NCCS.pdf (accessed on 10 January 2024).
2. PhysioNet. MIMIC-III Clinical Database (Version 1.4). 2016. Available online: https://physionet.org/content/mimiciii/1.4/ (accessed on 12 January 2024).
3. Mullenbach, J.; Wiegreffe, S.; Duke, J.; Sun, J.; Eisenstein, J. Explainable Prediction of Medical Codes from Clinical Text. arXiv 2018, arXiv:1802.05695.
4. Huang, J.; Osorio, C.; Sy, L.W. An empirical evaluation of deep learning for ICD-9 code assignment using MIMIC-III clinical notes. Comput. Methods Programs Biomed. 2019, 177, 141–153.
5. Biswas, B.; Pham, T.-H.; Zhang, P. TransICD: Transformer Based Code-wise Attention Model for Explainable ICD Coding. arXiv 2021, arXiv:2104.10652.
6. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. arXiv 2017, arXiv:1706.03762.
7. Han, K.; Xiao, A.; Wu, E.; Guo, J.; Xu, C.; Wang, Y. Transformer in transformer. Adv. Neural Inf. Process. Syst. 2021, 34, 15908–15919.
8. Li, W.; Fan, L.; Wang, Z.; Ma, C.; Cui, X. Tackling mode collapse in multi-generator GANs with orthogonal vectors. Pattern Recognit. 2021, 110, 107646.
9. Lee, J.; Shin, H.; Kim, Y. The Effects of Hyperparameters in Deep Learning on Medical Dataset: A Case Study on EMR. arXiv 2020, arXiv:2009.05451.
10. Alsentzer, E.; Murphy, J.R.; Boag, W.; Weng, W.; Jin, D.; Naumann, T.; McDermott, M.B.A. Publicly Available Clinical BERT Embeddings. arXiv 2019, arXiv:1904.03323.
11. Choi, Y.; Kang, S. A systematic review of deep learning-based automated diagnosis of neurologic disorders using EEG signals. BMC Med. Inform. Decis. Mak. 2022, 22, 1–18.
12. Hsu, C.C.; Chang, P.C.; Chang, A. Multi-Label Classification of ICD Coding Using Deep Learning. In Proceedings of the International Symposium on Community-Centric Systems (CcS), Tokyo, Japan, 23–26 September 2020; pp. 1–6.
13. Gangavarapu, T.; Krishnan, G.S.; Kamath, S.; Jeganathan, J. FarSight: Long-Term Disease Prediction Using Unstructured Clinical Nursing Notes. IEEE Trans. Emerg. Top. Comput. 2020, 9, 1151–1169.
14. Samonte, M.J.C.; Gerardo, B.D.; Fajardo, A.C.; Medina, R.P. ICD-9 tagging of clinical notes using topical word embedding. In Proceedings of the 2018 International Conference on Internet and e-Business, Taipei, Taiwan, 16–18 May 2018; pp. 118–123.
15. Obeid, J.S.; Dahne, J.; Christensen, S.; Howard, S.; Crawford, T.; Frey, L.J.; Stecker, T.; Bunnell, B.E. Identifying and Predicting intentional self-harm in electronic health record clinical notes: Deep learning approach. JMIR Med. Inform. 2020, 8, e17784.
16. Hsu, J.L.; Hsu, T.J.; Hsieh, C.H.; Singaravelan, A. Applying Convolutional Neural Networks to Predict the ICD-9 Codes of Medical Records. Sensors 2020, 20, 7116.
17. Xie, P.; Xing, E. A Neural Architecture for Automated ICD Coding. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Melbourne, Australia, 15–20 July 2018; Association for Computational Linguistics: Melbourne, Australia, 2018; pp. 1066–1076.
18. Singaravelan, A.; Hsieh, C.-H.; Liao, Y.-K.; Hsu, J.L. Predicting ICD-9 Codes Using Self-Report of Patients. Appl. Sci. 2021, 11, 10046.
19. Zeng, M.; Li, M.; Fei, Z.; Yu, Y.; Pan, Y.; Wang, J. Automatic ICD-9 coding via deep transfer learning. Neurocomputing 2019, 324, 43–50.
20. Masud, J.H.B.; Kuo, C.-C.; Yeh, C.-Y.; Yang, H.-C.; Lin, M.-C. Applying Deep Learning Model to Predict Diagnosis Code of Medical Records. Diagnostics 2023, 13, 2297.
21. Xu, K.; Lam, M.; Pang, J.; Gao, X.; Band, C.; Mathur, P.; Papay, F.; Khanna, A.K.; Cywinski, J.B.; Maheshwari, K.; et al. Multimodal Machine Learning for Automated ICD Coding. In Proceedings of the Machine Learning Research, Ann Arbor, MI, USA, 9–10 August 2019; Volume 106, pp. 1–17.
22. Biseda, B.; Desai, G.; Lin, H.; Philip, A. Prediction of ICD Codes with Clinical BERT Embeddings and Text Augmentation with Label Balancing using MIMIC-III. arXiv 2020, arXiv:2008.10492.
23. Edin, J.; Junge, A.; Havtorn, J.D.; Borgholt, L.; Maistro, M.; Ruotsalo, T.; Maaløe, L. Automated Medical Coding on MIMIC-III and MIMIC-IV: A Critical Review and Replicability Study. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, Taipei, Taiwan, 23–27 July 2023; pp. 2572–2582.
24. Gero, Z.; Singh, C.; Cheng, H.; Naumann, T.; Galley, M.; Gao, J.; Poon, H. Self-Verification Improves Few-Shot Clinical Information Extraction. arXiv 2023, arXiv:2306.00024.

Category | Number of Rows | Unique Values |
---|---|---|
Note events | 2,083,180 | 2,023,185 |
Diagnosis | 651,047 | 6,984 |

Category | Number of Rows | Unique Values | Share of Note Events (%) |
---|---|---|---|
Top 10 Diagnosis | 677,738 | 10 | 32.5 |
Top 50 Diagnosis | 1,058,988 | 50 | 52.8 |

Model | Diagnosis | Precision (%) | Recall/Accuracy (%) | F1 Score (%) |
---|---|---|---|---|
RNN | Top 10 | 24 | 26 | 25 |
LSTM | Top 10 | 81 | 81 | 81 |
BiLSTM | Top 10 | 78 | 78 | 78 |
BERT | Top 10 | 87 | 87 | 87 |
RNN | Top 50 | 8 | 8 | 5 |
LSTM | Top 50 | 68 | 68 | 66 |
BiLSTM | Top 50 | 65 | 65 | 65 |
BERT | Top 50 | 81 | 81 | 80 |
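
The results table reports recall and accuracy in a single column. One reading, offered here as an assumption because the averaging scheme is not stated in this section, is that the metrics are support-weighted averages over single-label predictions; in that setting weighted recall is mathematically equal to accuracy. The sketch below illustrates this with hypothetical labels and a helper named `weighted_scores`.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Support-weighted precision, recall and F1 for single-label predictions."""
    support = Counter(y_true)
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label in sorted(support):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / support[label]
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = support[label] / total  # share of true examples with this label
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return precision, recall, f1


# Hypothetical gold and predicted ICD-9 codes for six notes.
y_true = ["4019", "4019", "4280", "4280", "486", "486"]
y_pred = ["4019", "4280", "4280", "4280", "486", "4019"]

precision, recall, f1 = weighted_scores(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Because the F1 score is averaged per class rather than computed from the averaged precision and recall, it can fall below both of them, which is consistent with rows such as RNN on the top 50 codes.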

Model | Diagnosis | Hyperparameters |
---|---|---|
RNN | Top 10 | Batch_size=16, epochs=10, embedding_dim=128, hidden_dim=256, optimizer=‘AdamW’, activation=‘relu’, dropout=0.4, lr=0.00002 |
LSTM | Top 10 | Batch_size=16, epochs=10, embedding_dim=128, hidden_dim=256, optimizer=‘AdamW’, activation=‘relu’, dropout=0.2, lr=0.00 |
BiLSTM | Top 10 | Batch_size=16, epochs=10, embedding_dim=128, hidden_dim=256, optimizer=‘AdamW’, activation=‘relu’, dropout=0.2, lr=0.001 |
BERT | Top 10 | Batch_size=16, epochs=10, embedding_dim=128, hidden_dim=256, optimizer=‘AdamW’, activation=‘relu’, dropout=0.4, lr=0.001 |
RNN | Top 50 | Batch_size=16, epochs=10, embedding_dim=128, hidden_dim=256, optimizer=‘AdamW’, activation=‘relu’, dropout=0.4, lr=0.00002 |
LSTM | Top 50 | Batch_size=16, epochs=10, embedding_dim=128, hidden_dim=256, optimizer=‘AdamW’, activation=‘relu’, dropout=0.2, lr=0.001 |
BiLSTM | Top 50 | Batch_size=16, epochs=10, embedding_dim=128, hidden_dim=256, optimizer=‘AdamW’, activation=‘relu’, dropout=0.2, lr=0.001 |
BERT | Top 50 | Batch_size=16, epochs=10, embedding_dim=128, hidden_dim=256, optimizer=‘AdamW’, activation=‘relu’, dropout=0.4, lr=0.001 |

Work | Data | Method | Target Variable | Performance Measures |
---|---|---|---|---|
Hsu et al. [12] | Discharge summary | Deep learning | (I) 19 distinct ICD-9 chapter codes, (II) top 50 ICD-9 codes, (III) top 100 ICD-9 codes | (I) micro-F1 score of 0.76, (II) micro-F1 score of 0.57, (III) micro-F1 score of 0.51 |
Gangavarapu et al. [13] | Nursing notes | Deep learning | 19 distinct ICD-9 chapter codes | Accuracy of 0.833 |
Samonte et al. [14] | Discharge summary | Deep learning | 10 distinct ICD-9 codes | Precision of 0.780, recall of 0.620, F1 score of 0.678 |
Obeid et al. [15] | Clinical notes | Deep learning | ICD-9 codes from E950 to E959 | Area under the ROC curve of 0.882, F1 score of 0.769 |
Hsu et al. [16] | Subjective component | Deep learning | (I) 17 distinct ICD-9 chapter codes, (II) 2017 distinct ICD-9 codes | (I) Accuracy of 0.580, (II) Accuracy of 0.409 |
Xie et al. [17] | Diagnosis description | Deep learning | 2833 ICD-9 codes | Sensitivity score of 0.29, specificity score of 0.33 |
Singaravelan et al. [18] | Subjective component | Deep learning | 1871 ICD-9 codes | Recall of 0.57 for chapter code, 0.49 for block, 0.43 for three-digit code, 0.45 for full code |
Zeng et al. [19] | Discharge summary | Deep learning | 6984 ICD-9 codes | F1 score of 0.42 |
Huang et al. [4] | Discharge summary | Deep learning | (I) 10 ICD-9 codes, (II) 10 blocks 1131 ICD-10 codes | (I) F1 score of 0.69, (II) F1 score of 0.72 |
Current study | Discharge summary | Deep learning | (I) top 10 ICD-10 codes, (II) top 50 ICD-10 codes | (I) Precision of 0.88, recall of 0.88, F1 score of 0.88, (II) Precision of 0.81, recall of 0.81, F1 score of 0.80 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Aden, I.; Child, C.H.T.; Reyes-Aldasoro, C.C. International Classification of Diseases Prediction from MIMIIC-III Clinical Text Using Pre-Trained ClinicalBERT and NLP Deep Learning Models Achieving State of the Art. Big Data Cogn. Comput. 2024, 8, 47. https://doi.org/10.3390/bdcc8050047