Entropy-Based Time Window Features Extraction for Machine Learning to Predict Acute Kidney Injury in ICU
Abstract
:Featured Application
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Population
2.2. AKI Definition
2.3. Problem Formulation
- Time gap variation: The time gap from AKI onset to 48 h before onset, using a feature window of 24 h. In another word, we conducted a research where the time gap rolling from 0 h before onset to 48 h before onset. Coupled with.
- Feature window variation: The feature window size from 24 h to 48 h before onset, with a time gap of 24 h.
2.4. Input Feature
2.5. Entropy-Based Feature Engineering Framework
- Step 1: We evaluated the setting time , which is the time interval between the patient’s first admission time and their first data entry. This step is important, as the setting time dramatically affects the portion of missing data. In another word, we only consider data collected after setting time .
- Step 2: The Shannon entropy was used to evaluate all vital signs V. This measures the probability distribution that characterizes the amount of missing information and data quality.
- Step 3: We conducted missing value imputation on V. Both Steps 1 and 2 are critical for the data quality, as it is not measured on a frequent and consistent basis; yet, vital signs are crucial for AKI evaluation and indication. An overall workflow is shown in Figure 3.
2.5.1. Setting Time
2.5.2. Vital Sign Entropy Feature Generation
2.5.3. Missing Value Imputation
3. Results
3.1. Cohort Analysis
3.2. Setting Time and Missing Data Analysis Results
3.3. Classification and Evaluation Criteria
3.4. Classification Performance with Time Gap Variation
3.5. Classification Performance with Feature Window Variation
4. Discussion and Limitation
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Ostermann, M.; Bellomo, R.; Burdmann, E.A.; Doi, K.; Endre, Z.H.; Goldstein, S.L.; Kane-Gill, S.L.; Liu, K.D.; Prowle, J.R.; Shaw, A.D.; et al. Controversies in acute kidney injury: Conclusions from a Kidney Disease: Improving Global Outcomes (KDIGO) Conference. Kidney Int. 2020, 98, 294–309. [Google Scholar] [CrossRef]
- Hoste, E.A.; Bagshaw, S.M.; Bellomo, R.; Cely, C.M.; Colman, R.; Cruz, D.N.; Edipidis, K.; Forni, L.G.; Gomersall, C.D.; Govil, D.; et al. Epidemiology of acute kidney injury in critically ill patients: The multinational AKI-EPI study. Intensive Care Med. 2015, 41, 1411–1423. [Google Scholar] [CrossRef]
- Zabala-Blanco, D.; Mora, M.; Barrientos, R.J.; Hernández-García, R.; Naranjo-Torres, J. Fingerprint Classification through Standard and Weighted Extreme Learning Machines. Appl. Sci. 2020, 10, 4125. [Google Scholar] [CrossRef]
- Chen, J.; Zhao, H.; Cao, Z.; Guo, F.; Pang, L. A Customized Semantic Segmentation Network for the Fingerprint Singular Point Detection. Appl. Sci. 2020, 10, 3868. [Google Scholar] [CrossRef]
- Fujita, H.; Cimr, D. Decision support system for arrhythmia prediction using convolutional neural network structure without preprocessing. Appl. Intell. 2019, 49, 3383–3391. [Google Scholar] [CrossRef]
- Tomašev, N.; Glorot, X.; Rae, J.W.; Zielinski, M.; Askham, H.; Saraiva, A.; Mottram, A.; Meyer, C.; Ravuri, S.; Protsyuk, I.; et al. A clinically applicable approach to continuous prediction of future acute kidney injury. Nature 2019, 572, 116–119. [Google Scholar] [CrossRef] [PubMed]
- Shannon, C.E. A mathematical theory of communication. ACM SIGMOBILE Mob. Comput. Commun. Rev. 2001, 5, 3–55. [Google Scholar] [CrossRef]
- Lent, C.S. Information and entropy in physical systems. In Energy Limits in Computation; Springer: Berlin/Heidelberg, Germany, 2019; pp. 1–63. [Google Scholar]
- Li, P.; Karmakar, C.; Yearwood, J.; Venkatesh, S.; Palaniswami, M.; Liu, C. Detection of epileptic seizure based on entropy analysis of short-term EEG. PLoS ONE 2018, 13, e0193691. [Google Scholar]
- Chicote, B.; Irusta, U.; Aramendi, E.; Alcaraz, R.; Rieta, J.J.; Isasi, I.; Alonso, D.; Baqueriza, M.D.M.; Ibarguren, K. Fuzzy and sample entropies as predictors of patient survival using short ventricular fibrillation recordings during out of hospital cardiac arrest. Entropy 2018, 20, 591. [Google Scholar] [CrossRef] [Green Version]
- Chen, C.H.; Huang, P.W.; Tang, S.C.; Shieh, J.S.; Lai, D.M.; Wu, A.Y.; Jeng, J.S. Complexity of heart rate variability can predict stroke-in-evolution in acute ischemic stroke patients. Sci. Rep. 2015, 5, 1–5. [Google Scholar]
- Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef] [Green Version]
- Awad, A.; Bader-El-Den, M.; McNicholas, J.; Briggs, J. Early hospital mortality prediction of intensive care unit patients using an ensemble learning approach. Int. J. Med Inform. 2017, 108, 185–195. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Lin, K.; Hu, Y.; Kong, G. Predicting in-hospital mortality of patients with acute kidney injury in the ICU using random forest model. Int. J. Med Inform. 2019, 125, 55–61. [Google Scholar] [CrossRef] [PubMed]
- Celi, L.A.G.; Tang, R.J.; Villarroel, M.C.; Davidzon, G.A.; Lester, W.T.; Chueh, H.C. A clinical database-driven approach to decision support: Predicting mortality among patients with acute kidney injury. J. Healthc. Eng. 2011, 2, 97–110. [Google Scholar] [CrossRef] [Green Version]
- Khwaja, A. KDIGO clinical practice guidelines for acute kidney injury. Nephron Clin. Pract. 2012, 120, c179–c184. [Google Scholar] [CrossRef] [PubMed]
- Moor, M.; Rieck, B.; Horn, M.; Jutzeler, C.; Borgwardt, K. Early Prediction of Sepsis in the ICU using Machine Learning: A Systematic Review. medRxiv 2021, 8, 348. [Google Scholar]
- Pencina, M.J.; D’Agostino, R.B., Sr.; D’Agostino, R.B., Jr.; Vasan, R.S. Evaluating the added predictive ability of a new marker: From area under the ROC curve to reclassification and beyond. Stat. Med. 2008, 27, 157–172. [Google Scholar] [CrossRef]
- Makris, K.; Spanou, L. Acute kidney injury: Definition, pathophysiology and clinical phenotypes. Clin. Biochem. Rev. 2016, 37, 85. [Google Scholar]
- Ilaria, G.; Kianoush, K.; Ruxandra, B.; Francesca, M.; Mariarosa, C.; Davide, G.; Claudio, R. Clinical adoption of Nephrocheck® in the early detection of acute kidney injury. Ann. Clin. Biochem. 2021, 58, 6–15. [Google Scholar] [CrossRef]
- Wang, Y.; Wei, Y.; Yang, H.; Li, J.; Zhou, Y.; Wu, Q. Utilizing imbalanced electronic health records to predict acute kidney injury by ensemble learning and time series model. BMC Med. Inform. Decis. Mak. 2020, 20, 1–13. [Google Scholar] [CrossRef]
- Abdullah, S.S.; Rostamzadeh, N.; Sedig, K.; Garg, A.X.; McArthur, E. Predicting Acute Kidney Injury: A Machine Learning Approach Using Electronic Health Records. Information 2020, 11, 386. [Google Scholar] [CrossRef]
Notations | Definition |
---|---|
Target function | |
Set of features | |
I | Set of patients |
J | Set of admission ID |
T | Set of time stamps |
Time gap where h before AKI onset | |
Setting time | |
Feature window where h before AKI onset | |
V | Set of vital signs |
n | Total number of patients |
m | Total number of admission IDs |
k | Total number of vital signs |
t | Time stamp |
Entropy for Vital sign V at the time | |
H | Target entropy |
h | Total number of possible states |
AKI onset | |
24 h before AKI onset | |
48 h before AKI onset | |
72 h before AKI onset | |
Mean and variance of vital signs | |
Entropy of vital signs | |
The combination of both and | |
Mean of vital signs | |
Variance of vital signs | |
Feature window length | |
q | q is an iterative counter denoting the time, where |
N | Number of computation times |
Variable | Type | Features |
---|---|---|
Vital signs | Vital sign | SBP, DBP, Pulse Pressure, Oximetry, Respiratory Rate, Pulse Rate, Body Temperature |
Mean | ||
Variance |
Missing Proportion Mean(%) | |||||||
---|---|---|---|---|---|---|---|
SBP | DBP | Pulse Pressure | Oximetry | Respiratory Rate | Pulse Rate | Temperature | |
Entropy | 0.23(0.07) | 0.23(0.07) | 0.24(0.07) | 11.38(0.58) | 0.83(0.11) | 0.11(0.04) | 7.76(0.31) |
Mean | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Variance | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Merge | 0.08(0.12) | 0.08(0.12) | 0.08(0.12) | 3.79(5.41) | 0.28(0.4) | 0.04(0.06) | 2.59(3.69) |
Variable | Imputation Method | Variable |
---|---|---|
Vital Signs | Median | Shannon entropy, Variance |
Random imputation from normal range | Mean SBP, DBP, Pulse pressure, Oximetry, Respiratory rate, Pulse rate, Body temperature |
Variable | Mean | Training and Validation | p-Value | Testing | p-Value | ||
---|---|---|---|---|---|---|---|
(STD) | AKI | Non-AKI | AKI | Non-AKI | |||
Patient population, N | 4278 | 1492 | 2382 | – | 139 | 265 | – |
Time span in ICU(days) | 11.29 (12.5) | 17.55 (17.42) | 7.4 (5.95) | ** | 16.02 (10.61) | 8.43 (7.85) | ** |
Demographic | |||||||
Age | 60.61 (16.47) | 63.78 (16.5) | 58.68 (16.17) | ** | 63.52 (16.93) | 58.54 (15.88) | ** |
BMI | 23.55 (4.95) | 23.04 (4.89) | 23.92 (4.9) | ** | 23.2 (4.1) | 23.37 (5.81) | ** |
Male | 2806 (65.59%) | 975 (65.35%) | 1574 (66.08%) | 0.64 | 91 (65.47%) | 166 (62.64%) | 0.58 |
Female | 1472 (34.41%) | 517 (34.65%) | 808 (33.92%) | 0.64 | 48 (34.53%) | 99 (37.36%) | 0.58 |
Variable | Training and Validation | Testing | |||||
---|---|---|---|---|---|---|---|
Mean (STD) | All | AKI | Non-AKI | p-Value | AKI | Non-AKI | p-Value |
Vital signs, N | 4278 | 1492 | 2382 | — | 139 | 265 | — |
SBP(mmHg) | 131.4 (23.68) | 129.69 (25.04) | 132.41 (22.68) | ** | 130.75 (26.76) | 132.23 (22.48) | 0.56 |
DBP(mmHg) | 78 (16.77) | 76.51 (17.43) | 78.85 (16.26) | ** | 76.15 (18.59) | 79.59 (15.79) | 0.05 |
Pulse pressure | 53.35 (19.42) | 53.18 (20.63) | 53.49 (18.55) | 0.62 | 54.6 (23.38) | 52.42 (17.72) | 0.29 |
Oximetry(%) | 98.21 (2.87) | 97.93 (3.45) | 98.39 (2.37) | ** | 98.09 (3.51) | 98.3 (2.92) | 0.53 |
Respiratory rate | 18.86 (3.92) | 19.42 (4.36) | 18.44 (3.44) | ** | 20.54 (5.74) | 18.6 (3.49) | ** |
Pulse rate(/min) | 90.17 (19.98) | 94.48 (21.12) | 87.45 (18.68) | ** | 95.22 (21.37) | 87.68 (19.24) | ** |
Temperature(Celsius) | 36.63 (0.93) | 94.48 (21.12) | 36.58 (0.87) | ** | 36.76 (1.11) | 36.57 (0.82) | 0.05 |
Medication, Vasopressors | |||||||
Vasopressin | 24 (0.56%) | 13 (0.87%) | 7 (0.29%) | * | 1 (0.72%) | 3 (1.13%) | 0.69 |
Norepinephrine | 826 (19.31%) | 398 (26.68%) | 348 (14.61%) | ** | 38 (27.34%) | 42 (15.85%) | ** |
Dopamine | 451 (10.54%) | 156 (10.46%) | 253 (10.62%) | 0.87 | 14 (10.07%) | 28 (10.57%) | 0.88 |
Epinephrine | 278 (6.5%) | 145 (9.72%) | 102 (4.28%) | ** | 16 (11.51%) | 15 (5.66%) | * |
Dobutamine | 59 (1.38%) | 23 (1.54%) | 31 (1.3%) | 0.54 | 2 (1.44%) | 3 (1.13%) | 0.79 |
Variable | Training and Validation | Testing | |||||
---|---|---|---|---|---|---|---|
Mean (STD) | All | AKI | Non-AKI | p-Value | AKI | Non-AKI | p-Value |
Ventilatory Support | 4287 | 1492 | 2382 | – | 139 | 265 | – |
FIO | 70.95 (25.87) | 75.4 (26.07) | 67.7 (25.27) | ** | 78.62 (25.33) | 65.89 (24.73) | ** |
PEEPCPAP | 4.87 (1.69) | 5.06 (1.78) | 4.71 (1.59) | ** | 5.09 (1.72) | 4.86 (1.71) | 0.24 |
PAW | 23.13 (6.73) | 23.21 (6.72) | 22.89 (6.63) | 0.61 | 23.76 (8.1) | 24.38 (6.71) | 0.8 |
MAPS | 11.62 (2.55) | 11.93 (2.65) | 11.34 (2.52) | * | 10.97 (1.47) | 11.47 (2.06) | 0.47 |
TOTRR | 18.89 (5.36) | 19.76 (5.8) | 18.1 (4.81) | ** | 19.01 (5.31) | 17.71 (4.74) | 0.44 |
VTEXH | 0.52 (0.11) | 0.52 (0.11) | 0.53 (0.11) | 0.28 | 0.51 (0.11) | 0.49 (0.13) | 0.63 |
MVEXH | 9.57 (2.8) | 9.79 (3.03) | 9.38 (2.61) | 0.13 | 9.72 (2.7) | 9.08 (2.16) | 0.44 |
Vital Signs | Mean Setting Time (h) |
---|---|
Pulse pressure (SBP, DBP) | 2.93 |
Oximetry | 8.39 |
Respiratory rate | 3.00 |
Pulse rate | 2.99 |
Temperature | 3.05 |
Feature Types | AUROC Mean (STD) | Accuracy(%) Mean (STD) | NRI(%) Mean (STD) | ||||||
---|---|---|---|---|---|---|---|---|---|
Time Gap | Before Critical Point | After Critical point | pValue | Before Critical Point | After Critical Point | pValue | Before Critical Point | After Critical Point | pValue |
Entropy | 69.09 (0.82) | 67.36 (0.99) | ** | 65.66 (0.80) | 64.02 (1.08) | ** | −0.9 (1.86) | −4.19 (2.27) | ** |
Mean | 70.81 (0.31) | 70.35 (0.53) | 0.07 | 66.01 (0.42) | 65.48 (0.55) | 0.06 | −0.22 (1.09) | −1.29 (1.10) | 0.09 |
Variance | 67.68 (0.49) | 67.67 (0.56) | 0.96 | 64.13 (0.31) | 64.18 (0.60) | 0.79 | 0.19 (0.85) | 0.17 (1.23) | 0.95 |
Merge | 75.87 (0.30) | 75.58 (0.34) | 0.09 | 70.3 (0.31) | 69.89 (0.46) | * | 1.3 (0.30) | 0.2 (0.95) | * |
Feature Types | AUROC Mean (STD) | Accuracy(%) Mean (STD) | ||||
---|---|---|---|---|---|---|
Time Gap | Before Critical Point | After Critical Point | pValue | Before Critical Point | After Critical Point | pValue |
Shannon entropy | 55.74 (0.89) | 53.88 (1.44) | ** | 60.57 (1.04) | 58.42 (1.44) | ** |
Mean | 60.82 (1.37) | 59.49 (0.99) | * | 59.98 (0.46) | 59.51 (0.69) | 0.19 |
Variance | 55.72 (1.24) | 55.1 (0.94) | 0.25 | 58.31 (0.35) | 58.45 (0.82) | 0.65 |
Merge | 66.76 (0.67) | 66.33 (0.99) | 0.4 | 64.7 (0.44) | 64.18 (0.58) | * |
Feature Types | AUROC (%) | Accuracy (%) | NRI (%) | ||||||
---|---|---|---|---|---|---|---|---|---|
Feature Window | 24 h | 48 h | Improvement | 24 h | 48 h | Improvement | 24 h | 48 h | Improvement |
Shannon entropy | 70.01 | 76.8 | 6.79 | 65.82 | 69.33 | 3.5 | – | 8.32 | 8.32 |
Mean | 72.48 | 71.31 | −1.16 | 67.91 | 67.42 | −0.48 | – | −0.88 | −0.88 |
Variance | 67.08 | 69.51 | 2.42 | 64.7 | 65.96 | 1.26 | – | 3 | 3 |
Merge | 77.64 | 81.24 | 3.59 | 71.62 | 73.67 | 2.05 | – | 4.4 | 4.4 |
Feature Types | Recall(%) | F1-Score | NRI (%) | ||||||
---|---|---|---|---|---|---|---|---|---|
Feature Window | 24 h | 48 h | Improvement | 24 h | 48 h | Improvement | 24 h | 48 h | Improvement |
Shannon entropy | 54.55 | 64.59 | 10.03 | 55.37 | 59.32 | 3.94 | 54.96 | 61.84 | 6.88 |
Mean | 60.56 | 59.28 | −1.27 | 57.56 | 57.48 | −0.08 | 59.02 | 58.37 | −0.65 |
Variance | 52.2 | 54.97 | 2.76 | 53.97 | 55.9 | 1.92 | 53.07 | 55.43 | 2.35 |
Merge | 67.43 | 70.88 | 3.45 | 61.48 | 64.37 | 2.52 | 64.51 | 67.47 | 2.95 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Huang, C.-T.; Chang, R.-C.; Tsai, Y.-L.; Pai, K.-C.; Wang, T.-J.; Hsu, C.-T.; Chen, C.-H.; Huang, C.-C.; Wang, M.-S.; Chen, L.-C.; et al. Entropy-Based Time Window Features Extraction for Machine Learning to Predict Acute Kidney Injury in ICU. Appl. Sci. 2021, 11, 6364. https://doi.org/10.3390/app11146364
Huang C-T, Chang R-C, Tsai Y-L, Pai K-C, Wang T-J, Hsu C-T, Chen C-H, Huang C-C, Wang M-S, Chen L-C, et al. Entropy-Based Time Window Features Extraction for Machine Learning to Predict Acute Kidney Injury in ICU. Applied Sciences. 2021; 11(14):6364. https://doi.org/10.3390/app11146364
Chicago/Turabian StyleHuang, Chun-Te, Rong-Ching Chang, Yi-Lu Tsai, Kai-Chih Pai, Tsai-Jung Wang, Chia-Tien Hsu, Cheng-Hsu Chen, Chien-Chung Huang, Min-Shian Wang, Lun-Chi Chen, and et al. 2021. "Entropy-Based Time Window Features Extraction for Machine Learning to Predict Acute Kidney Injury in ICU" Applied Sciences 11, no. 14: 6364. https://doi.org/10.3390/app11146364
APA StyleHuang, C. -T., Chang, R. -C., Tsai, Y. -L., Pai, K. -C., Wang, T. -J., Hsu, C. -T., Chen, C. -H., Huang, C. -C., Wang, M. -S., Chen, L. -C., Sheu, R. -K., Wu, C. -L., & Lai, C. -M. (2021). Entropy-Based Time Window Features Extraction for Machine Learning to Predict Acute Kidney Injury in ICU. Applied Sciences, 11(14), 6364. https://doi.org/10.3390/app11146364