Systematic Review

Machine Learning Models for Predicting Mortality in Critically Ill Patients with Sepsis-Associated Acute Kidney Injury: A Systematic Review

by
Chieh-Chen Wu
1,
Tahmina Nasrin Poly
2,
Yung-Ching Weng
1,
Ming-Chin Lin
1,3,4,* and
Md. Mohaimenul Islam
5,*
1
Department of Healthcare Information and Management, School of Health and Medical Engineering, Ming Chuan University, Taipei 111, Taiwan
2
Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan
3
Taipei Neuroscience Institute, Taipei Medical University, Taipei 110, Taiwan
4
Department of Neurosurgery, Shuang Ho Hospital, Taipei Medical University, Taipei 110, Taiwan
5
Department of Outcomes and Translational Sciences, College of Pharmacy, The Ohio State University, Columbus, OH 43210, USA
*
Authors to whom correspondence should be addressed.
Diagnostics 2024, 14(15), 1594; https://doi.org/10.3390/diagnostics14151594
Submission received: 12 July 2024 / Revised: 22 July 2024 / Accepted: 22 July 2024 / Published: 24 July 2024
(This article belongs to the Special Issue Artificial Intelligence for Clinical Diagnostic Decision Making)

Abstract

While machine learning (ML) models hold promise for enhancing the management of acute kidney injury (AKI) in sepsis patients, creating models that are equitable and unbiased is crucial for accurate patient stratification and timely interventions. This study aimed to systematically summarize existing evidence to determine the effectiveness of ML algorithms for predicting mortality in patients with sepsis-associated AKI. A literature search was conducted across several electronic databases, including PubMed, Scopus, and Web of Science, employing specific search terms. This review included studies published from 1 January 2000 to 1 February 2024. Studies were included if they reported on the use of ML for predicting mortality in patients with sepsis-associated AKI. Studies not written in English or with insufficient data were excluded. Data extraction and quality assessment were performed independently by two reviewers. Five studies were included in the final analysis, reporting a male predominance (>50%) among patients with sepsis-associated AKI. Limited data on race and ethnicity were available across the studies, with White patients comprising the majority of the study cohorts. The predictive models demonstrated varying levels of performance, with area under the receiver operating characteristic curve (AUROC) values ranging from 0.60 to 0.87. Algorithms such as extreme gradient boosting (XGBoost), random forest (RF), and logistic regression (LR) showed the best performance in terms of accuracy. The findings of this study show that ML models have considerable potential to identify high-risk patients, predict the progression of AKI early, and improve survival rates. However, the lack of fairness in ML models for predicting mortality in critically ill patients with sepsis-associated AKI could perpetuate existing healthcare disparities. Therefore, it is crucial to develop trustworthy ML models to ensure their widespread adoption and reliance by both healthcare professionals and patients.

1. Introduction

Acute kidney injury (AKI) is a common public health threat worldwide, affecting one in five adults and one in three children [1]. The risk of AKI is even higher among patients with sepsis, which plays a critical role in 40–50% of cases [2]. Previous studies have highlighted that sepsis increases the likelihood of in-hospital mortality six- to eight-fold [3,4]. However, the timely identification and careful management of high-risk AKI patients may reduce mortality risk [5,6]. Recent technological advancements have paved the way for developing automated real-time monitoring tools, facilitating appropriate clinical interventions [7,8,9].
In recent decades, machine learning (ML) has emerged as a pivotal tool for predicting disease onset and effectively managing medical conditions [10,11,12]. By leveraging extensive clinical datasets, comprising patient demographics, previous drug and disease history, organizational factors, lifestyle variables, and biomarkers [13], ML algorithms can correctly identify patterns and risk factors associated with conditions, such as AKI [14,15,16]. Existing algorithms can analyze complex interactions among various factors, facilitating the development of predictive models that assess an individual’s likelihood of developing AKI [17]. ML models also hold promise in personalized treatment plans by stratifying individual responses to various interventions, optimizing strategies for disease management, and ultimately improving patient outcomes [18,19,20].
However, developing fair ML models for stratifying high-risk AKI patients is crucial to minimize potential biases and disparities in healthcare outcomes. There is a dearth of prior research examining the fairness of ML models (e.g., testing ML performance across racial and ethnic groups), a factor critical for ensuring equitable treatment and accurate classification across diverse demographic groups. This study aims to fill this gap by evaluating the performance of ML models in predicting mortality in patients with sepsis-associated AKI using insights gleaned from previously published research. By ensuring fairness in ML for mortality prediction, healthcare systems can offer more equitable and precise care to individuals, regardless of their racial and ethnic background or contextual orientations. Moreover, this study also provides strategies for reducing bias and developing fair models within real-world clinical settings.

2. Methods

2.1. Search Strategy

A systematic search was conducted across electronic databases, including PubMed, Web of Science Core Collection, and Scopus, for studies published between 1 January 2000 and 1 February 2024. The following keywords were used to retrieve all relevant articles: “machine learning” AND “sepsis-associated acute kidney injury” AND “mortality”. The structured search strategy was validated by an expert. No language restriction was applied during the search process. The bibliographies of retrieved studies were further scrutinized to identify additional relevant studies for inclusion.

2.2. Eligibility Criteria

Studies were only eligible for inclusion if they met the following criteria: (i) described ML algorithms, (ii) focused on mortality in patients with sepsis-associated AKI as the outcome of interest, (iii) reported the predictive performance of ML models, and (iv) were clinical studies. Studies were excluded if they (i) were not written in English or (ii) were reviews, editorials, letters, or conference abstracts. No specific study design or setting was prioritized in the inclusion criteria.

2.3. Study Selection

Two reviewers (M.M.I. and C.C.W.) independently screened the titles and abstracts to identify relevant studies for inclusion and exclusion, after duplicate studies had been removed. Studies that appeared to meet the inclusion criteria then underwent full-text review by the same two reviewers. Any discrepancies during the screening process were resolved through discussion between the two reviewers until a final consensus was reached.

2.4. Risk of Bias Assessment

The same two reviewers independently utilized the prediction model risk of bias assessment tool (PROBAST) to assess the methodological quality of each included ML model study [21]. This tool is specifically designed to assess the risk of bias and the applicability of diagnostic and prognostic prediction model studies. It is structured around four domains: participants, predictors, outcome, and analysis, which are crucial for evaluating the reliability and relevance of included studies. PROBAST rigorously assesses the methodologies to identify potential biases that might influence outcomes [22]. This tool is highly regarded for its systematic, structured method of assessing the integrity and clarity of the prediction model [23].

2.5. Data Extraction

A table was generated to record all relevant data. The extracted information was independently collected and cross-checked by the same two authors. Any discrepancies were resolved through consensus based on predefined criteria. All data items were collected according to the Cochrane guidance for data collection and the Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies (CHARMS) checklist [24]. This checklist assists in the appraisal and extraction of relevant data from prediction model studies for systematic reviews [25]. It guides reviewers through a comprehensive assessment of study methodologies, focusing on key aspects such as study participants, predictors, outcomes, and statistical analysis methods [26]. We extracted the following information from each study: first author, publication year, country, total number of patients, mean age, study design, AKI definition, patient inclusion and exclusion criteria, data partitioning method, name of ML model, internal and external validation details, and predictive performance metrics (including discrimination and calibration). If an included study reported multiple ML models, we recorded the predictive performance of each model separately.

2.6. Statistical Analysis

A descriptive summary was provided regarding the types of ML models used, the prediction of mortality in patients with sepsis-associated AKI, risk of bias assessment, and model validation. The results of ML studies were presented for each predicted outcome to illustrate the predictive performance of each type of ML model.

3. Results

3.1. Search Results

Initially, 85 studies were retrieved through the search process, of which 66 duplicates were excluded. The remaining 19 studies underwent initial screening, and 12 were excluded based on title and abstract review. The remaining seven studies underwent full-text evaluation, which led to the exclusion of two further studies. Finally, five studies met the criteria for inclusion in this systematic review [27,28,29,30,31] (Figure 1).
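The selection flow above is simple arithmetic, and it can be checked with a short script. The sketch below is illustrative only; the function name `selection_flow` is our own and is not part of any reporting tool.

```python
def selection_flow(retrieved, duplicates, excluded_title_abstract, excluded_full_text):
    """Return (screened, full_text_reviewed, included) for a study selection flow."""
    screened = retrieved - duplicates
    full_text_reviewed = screened - excluded_title_abstract
    included = full_text_reviewed - excluded_full_text
    return screened, full_text_reviewed, included


# Counts taken from the Search Results paragraph.
screened, full_text_reviewed, included = selection_flow(85, 66, 12, 2)
print(screened, full_text_reviewed, included)  # prints: 19 7 5
```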

3.2. Study Characteristics

A total of 57,769 patients were enrolled across five studies. Table 1 presents the baseline characteristics of these studies. All studies utilized MIMIC data to develop and evaluate the performance of ML models. Luo et al. [27], however, employed data from the eICU Collaborative Research Database (eICU-CRD) to test their ML models. The majority (>50%) of patients across studies were male. While only two studies reported on race and ethnicity, a predominance of White patients was included [27,28]. Vital signs such as heart rate, respiratory rate, and blood pressure, along with laboratory tests including serum creatinine, platelets, white blood cell count (WBC), blood urea nitrogen (BUN), and urine outputs, were consistently assessed across all studies to predict mortality. Additionally, the Sequential Organ Failure Assessment (SOFA) score and Simplified Acute Physiology Score (SAPS II) were commonly utilized for outcome prediction in patients with severe sepsis-associated AKI.
The studies included in the systematic review were predominantly conducted in ICU settings using Sepsis-3 criteria as the selection standard. Of the five studies reviewed, three utilized data partition methods for model development and evaluation, involving various approaches to dividing the dataset into training and testing subsets. Two studies employed cross-validation methods, iteratively training and testing the model on different data subsets to assess generalizability and performance. The proportion of missing data varied, and different imputation methods such as XGBoost, MiceForest, and K-nearest neighbor were employed. Notably, none of the studies tested their models across different racial groups, highlighting a significant gap in evaluating the models’ fairness and generalizability (Table 2).
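To make the distinction between data partitioning and cross-validation concrete, the following minimal sketch generates row indices for a single hold-out split and for k-fold cross-validation. It uses only the Python standard library; the function names, the 80/20 split, and k = 5 are assumptions chosen for illustration and are not taken from the included studies.

```python
import random


def holdout_split(n_samples, test_fraction=0.2, seed=0):
    """Single partition: shuffle indices once and hold out a fixed fraction for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]  # (train, test)


def k_fold_splits(n_samples, k=5, seed=0):
    """Cross-validation: every sample is used for testing exactly once across k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


train_idx, test_idx = holdout_split(10)
cv_splits = list(k_fold_splits(10, k=5))
```

With a single partition, the performance estimate depends on one random split; with k-fold cross-validation, it is averaged over k test folds that together cover the whole dataset. Neither approach replaces validation on an external dataset.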

3.3. Predictor Identification

Gao et al. [29] initially considered 51 original factors to predict mortality in patients with sepsis-associated AKI. They then applied ensemble stepwise feature ranking to identify the 11 most promising factors. Li et al. [30] collected 44 variables and identified 24 statistically significant variables associated with mortality using LASSO regression. Luo et al. [27] applied the XGBoost algorithm to assess the importance of various features in predicting mortality. For exploring the interpretability of these XGBoost models, the Shapley Additive Explanations (SHAP) method was also utilized. Yang et al. [31] initially conducted univariate regression analysis on all variables, excluding any with a p-value above 0.05. They then employed the Boruta algorithm, which relies on random forest techniques, to select the most important variables for predicting mortality. Finally, Zhou et al. [28] employed the recursive feature elimination (RFE) algorithm to pinpoint key variables for their ML models.

3.4. Risk of Bias

Figure 2 presents the overall risk of bias assessment based on PROBAST criteria. All studies were rated as having a low risk of bias in both the participant and predictor domains. However, in the “outcome” domain, three studies (60%) received an unclear risk of bias rating due to insufficient information on the outcome definition and assessment. Additionally, all studies were rated as having a high risk of bias due to the absence of an external validation and evaluation of machine learning performance across diverse groups. Consequently, the overall risk of bias for the included studies was considered high, primarily due to the unclear ratings in the outcome domain and the lack of external validation.
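The way domain-level ratings combine into an overall judgement can be sketched as follows. This is our reading of the PROBAST guidance (overall risk is high if any domain is high, unclear if no domain is high but at least one is unclear, and low only if all domains are low), not code from the tool itself. The example ratings are hypothetical; in particular, placing the high rating in the analysis domain is an assumption, since the text above does not name the domain.

```python
ALLOWED_RATINGS = {"low", "high", "unclear"}


def overall_risk_of_bias(domain_ratings):
    """Combine per-domain ratings into one overall rating (assumed PROBAST-style rule)."""
    ratings = set(domain_ratings.values())
    if not ratings <= ALLOWED_RATINGS:
        raise ValueError("ratings must be 'low', 'high', or 'unclear'")
    if "high" in ratings:
        return "high"
    if "unclear" in ratings:
        return "unclear"
    return "low"


# Hypothetical ratings for a single study (illustration only).
example_study = {
    "participants": "low",
    "predictors": "low",
    "outcome": "unclear",
    "analysis": "high",  # assumption: the domain is not named in the text
}
print(overall_risk_of_bias(example_study))  # prints: high
```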

3.5. Performance of Machine Learning Models

The predictive models demonstrated varying levels of performance, with area under the receiver operating characteristic curve (AUROC) values ranging from 0.60 to 0.87. The included studies developed and tested several popular ML models, including logistic regression (LR), random forest (RF), and extreme gradient boosting (XGBoost). All studies developed and validated an XGBoost model, which showed an AUROC ranging from 0.79 to 0.87 and an accuracy between 0.77 and 0.83. RF and LR were also commonly used to predict mortality in patients with sepsis-associated AKI, achieving an average accuracy of 0.813 and an average AUROC of 0.802. Table 3 presents the performance of these ML models in predicting mortality.
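For readers less familiar with these metrics, the sketch below computes AUROC and accuracy from scratch on a toy example. AUROC is calculated as the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor (ties count as one half). The data, the function names, and the 0.5 classification threshold are illustrative assumptions and do not come from the included studies.

```python
def auroc(y_true, y_score):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(y_true, y_score, threshold=0.5):
    """Share of correct predictions after thresholding the predicted risk."""
    predictions = [1 if s >= threshold else 0 for s in y_score]
    correct = sum(1 for y, p in zip(y_true, predictions) if y == p)
    return correct / len(y_true)


# Toy data: 1 = died, 0 = survived; scores are predicted mortality risks.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(auroc(y_true, y_score))     # prints: 0.75
print(accuracy(y_true, y_score))  # prints: 0.75
```

Unlike AUROC, accuracy depends on the chosen threshold and on the mortality rate in the cohort, so the two metrics are not directly comparable across studies.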

4. Discussion

This systematic review of ML models for predicting mortality in patients with sepsis-associated acute kidney injury (AKI) reveals several key findings and areas for future research. To our knowledge, this represents the first systematic review of ML studies aimed at predicting mortality in this patient population. The included studies were primarily conducted in ICU settings and used Sepsis-3 criteria for patient selection. Data partitioning and cross-validation methods were employed to develop and evaluate the models, with three studies using traditional partitioning techniques and two opting for cross-validation to assess generalizability.
Feature selection and imputation methods varied across studies. Imputation methods such as XGBoost, MiceForest, and K-nearest neighbor were used to handle missing data, and feature selection was used to identify significant predictors of mortality. Vital signs, comorbidities, and laboratory tests emerged as critical prognostic indicators, aligning with clinical insights and reinforcing the importance of these variables in risk stratification. The included studies utilized widely used feature selection techniques such as SHAP, LASSO regression, and recursive feature elimination to identify key variables for their ML models. The performance of ML models varied across studies, with AUROC values ranging from 0.60 to 0.87. Notably, algorithms such as XGBoost, random forest (RF), and logistic regression (LR) consistently emerged as high-performing models. Despite these advancements, a significant gap was identified in the evaluation of these models across different racial groups. None of the studies included in the systematic review tested the performance of models in diverse racial populations, raising concerns about the fairness and generalizability of the findings.
Recent evidence underscores the importance of ensuring the fairness of ML models when predicting diseases such as AKI [32,33,34]. It is important to note that AKI affects individuals from diverse backgrounds, including various racial and ethnic backgrounds, socioeconomic status, sexual orientation, and geographical regions [35,36,37]. Without fair development, algorithms may disproportionately impact certain demographic groups, leading to inequitable healthcare outcomes. Algorithmic biases may also result in inaccurate predictions or misdiagnoses of high-risk AKI, potentially causing inappropriate treatments or delays in care. However, developing fair ML models ensures all individuals receive accurate assessments and appropriate interventions, regardless of their demographic characteristics [38,39].
Ethical concerns are increasingly raised regarding the differential impact that models may exert on under-represented communities [40,41,42]. Therefore, there is widespread acknowledgment of the urgent need for fairness in AI, especially within healthcare ML models [43]. With AKI prevalence higher among minority groups, it is essential to develop fair ML models using diverse and representative datasets. Continuous monitoring and testing of the ML model performance across various demographic groups are essential for identifying and effectively addressing relevant biases. Indeed, including individual, organizational, and community factors in the model development and validation processes can offer valuable insights into fairness considerations [33,44,45]. To ensure algorithms’ accountability and trustworthiness, transparent documentation of the model’s decision-making process and regular audits are recommended.
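One simple way to monitor performance across demographic groups is to compute the same metric separately for each group and report the largest gap. The sketch below does this for sensitivity (the share of deaths the model correctly flags). The group labels, data, and function names are hypothetical and serve only to illustrate the idea; a real audit would use larger samples and report uncertainty.

```python
from collections import defaultdict


def sensitivity_by_group(y_true, y_pred, groups):
    """Sensitivity (true positive rate) computed separately for each group."""
    true_pos = defaultdict(int)
    positives = defaultdict(int)
    for y, p, g in zip(y_true, y_pred, groups):
        if y == 1:
            positives[g] += 1
            if p == 1:
                true_pos[g] += 1
    return {g: true_pos[g] / positives[g] for g in positives}


def largest_gap(per_group):
    """Difference between the best- and worst-served group."""
    values = list(per_group.values())
    return max(values) - min(values)


# Hypothetical data: 1 = died, 0 = survived; predictions are already thresholded.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

per_group = sensitivity_by_group(y_true, y_pred, groups)
print(per_group)               # prints: {'A': 1.0, 'B': 0.5}
print(largest_gap(per_group))  # prints: 0.5
```

In this toy example the model misses half of the deaths in group B while catching all of them in group A, a disparity that an overall sensitivity of 0.75 would hide.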
  • Strengths and Limitations:
Our study has several strengths. First, it is the first systematic review assessing the fairness of ML models in predicting mortality among patients with sepsis-associated AKI. Second, our findings elucidate commonly used ML models, offering valuable insights for researchers embarking on similar investigations. Nevertheless, our study has several limitations that could affect the robustness and applicability of our findings. First, it is based on a small sample, including only five existing studies, which may reduce the statistical power and comprehensiveness of our analysis. More studies could have broadened the scope and depth of our assessment. Second, the use of a single dataset across all studies limits the generalizability of our conclusions, as results might not be replicable with different data. Lastly, the performance of the ML models, while satisfactory in predicting mortality, is drawn from data with limited racial and ethnic diversity. This lack of representation could potentially perpetuate health disparities, suggesting a need for more diverse datasets to ensure broader applicability and equity in healthcare outcomes.
  • Future Direction:
To enhance the robustness and relevance of future studies, several steps should be taken based on the current limitations.
(a) External Validation: One significant research gap is the absence of external validation. Most studies included in this review used the same dataset for both developing and evaluating the model. Future research should focus on validating machine learning models with external datasets to ensure their robustness and applicability across various clinical settings.
(b) Racial and Ethnic Diversity: None of the included studies evaluated their models in diverse racial and ethnic groups, which raises concerns about fairness and generalizability. Future research should focus on developing and testing models that can be applied across various populations to help reduce health disparities.
(c) Advanced Imputation Techniques: Although various imputation methods were used in the included studies, the development and application of more advanced techniques could significantly improve model accuracy, especially in datasets with high levels of missing data. Each study should report the percentage of missing data to assist researchers in future studies.
(d) Comprehensive Feature Selection: Although the included studies utilized popular feature selection algorithms, future research should explore more comprehensive and innovative techniques to identify novel prognostic indicators. Integrating clinical expertise with advanced statistical methods can lead to more reliable predictors of mortality among sepsis patients with AKI.
(e) Longitudinal Data: Using longitudinal data can enhance the predictive power of machine learning models for predicting mortality among sepsis patients with AKI. Future research should aim to develop models that incorporate and analyze temporal trends in patient data.
By addressing these gaps, future research can enhance the development of ML models that are not only accurate but also fair and generalizable, ultimately improving clinical outcomes for patients with sepsis-associated AKI. There is a need for research on the integration of ML models into clinical workflows; therefore, future studies should assess the feasibility, acceptance, and impact of these models in real-world clinical settings to facilitate their adoption.

5. Conclusions

In this study, we evaluated ML models to predict mortality in patients with sepsis-associated acute kidney injury (AKI), a condition noted for its high morbidity and mortality rates. Our findings show that ML models such as logistic regression, random forest, and extreme gradient boosting (XGBoost) delivered promising results. These outcomes highlight the efficacy of advanced data-driven methods in improving predictive accuracy, essential for timely interventions and management in septic patients. Key feature selection methods like Shapley Additive Explanations (SHAP), LASSO regression, and recursive feature elimination (RFE) were crucial in optimizing these models, underscoring the importance of selecting clinically relevant variables to maximize the ML model’s benefits in healthcare. Despite the robust performance of these ML models, included studies showed a significant risk of bias, potentially limiting the application of ML in medical diagnostics and planning. Particularly, included studies often excluded minority populations, which raises concerns about the models’ effectiveness across diverse socio-economic and racial groups. Future research should focus on enhancing the fairness of algorithms to ensure consistent and reliable predictions across all demographic groups.

Author Contributions

Conceptualization, C.-C.W. and M.M.I.; methodology, T.N.P.; software, T.N.P.; validation, M.-C.L.; formal analysis, Y.-C.W.; investigation, M.-C.L.; resources, M.M.I.; data curation, M.M.I.; writing—original draft preparation, M.M.I., C.-C.W.; writing—review and editing, T.N.P.; visualization, MML.; supervision, M.M.I. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Susantitaphong, P.; Cruz, D.N.; Cerda, J.; Abulfaraj, M.; Alqahtani, F.; Koulouridis, I.; Jaber, B.L. World incidence of AKI: A meta-analysis. Clin. J. Am. Soc. Nephrol. 2013, 8, 1482.
  2. Peerapornratana, S.; Manrique-Caballero, C.L.; Gómez, H.; Kellum, J.A. Acute kidney injury from sepsis: Current concepts, epidemiology, pathophysiology, prevention and treatment. Kidney Int. 2019, 96, 1083–1099.
  3. Uchino, S.; Kellum, J.A.; Bellomo, R.; Doig, G.S.; Morimatsu, H.; Morgera, S.; Schetz, M.; Tan, I.; Bouman, C.; Macedo, E. Acute renal failure in critically ill patients: A multinational, multicenter study. JAMA 2005, 294, 813–818.
  4. Thakar, C.V.; Christianson, A.; Freyberg, R.; Almenoff, P.; Render, M.L. Incidence and outcomes of acute kidney injury in intensive care units: A Veterans Administration study. Crit. Care Med. 2009, 37, 2552–2558.
  5. Borthwick, E.; Ferguson, A. Perioperative acute kidney injury: Risk factors, recognition, management, and outcomes. BMJ 2010, 341, c3365.
  6. Macedo, E.; Bihorac, A.; Siew, E.D.; Palevsky, P.M.; Kellum, J.A.; Ronco, C.; Mehta, R.L.; Rosner, M.H.; Haase, M.; Kashani, K.B. Quality of care after AKI development in the hospital: Consensus from the 22nd Acute Disease Quality Initiative (ADQI) conference. Eur. J. Intern. Med. 2020, 80, 45–53.
  7. Wang, P.; Kricka, L.J. Current and emerging trends in point-of-care technology and strategies for clinical validation and implementation. Clin. Chem. 2018, 64, 1439–1452.
  8. Dwivedi, R.; Mehrotra, D.; Chandra, S. Potential of Internet of Medical Things (IoMT) applications in building a smart healthcare system: A systematic review. J. Oral Biol. Craniofacial Res. 2022, 12, 302–318.
  9. Vashist, S.K.; Luong, J.H. An overview of point-of-care technologies enabling next-generation healthcare monitoring and management. In Point-of-Care Technologies Enabling Next-Generation Healthcare Monitoring and Management; Springer: Cham, Switzerland, 2019.
  10. Ahmed, Z.; Mohamed, K.; Zeeshan, S.; Dong, X. Artificial intelligence with multi-functional machine learning platform development for better healthcare and precision medicine. Database 2020, 2020, baaa010.
  11. Poly, T.N.; Islam, M.M.; Li, Y.C. Early diabetes prediction: A comparative study using machine learning techniques. In Advances in Informatics, Management and Technology in Healthcare; IOS Press: Amsterdam, The Netherlands, 2022; pp. 409–413.
  12. Vatansever, S.; Schlessinger, A.; Wacker, D.; Kaniskan, H.Ü.; Jin, J.; Zhou, M.M.; Zhang, B. Artificial intelligence and machine learning-aided drug discovery in central nervous system diseases: State-of-the-arts and future directions. Med. Res. Rev. 2021, 41, 1427–1473.
  13. Bartolome, A.; Prioleau, T. A computational framework for discovering digital biomarkers of glycemic control. NPJ Digit. Med. 2022, 5, 111.
  14. Penny-Dimri, J.C.; Bergmeir, C.; Reid, C.M.; Williams-Spence, J.; Cochrane, A.D.; Smith, J.A. Machine learning algorithms for predicting and risk profiling of cardiac surgery-associated acute kidney injury. Semin. Thorac. Cardiovasc. Surg. 2021, 33, 735–745.
  15. Mohamadlou, H.; Lynn-Palevsky, A.; Barton, C.; Chettipally, U.; Shieh, L.; Calvert, J.; Saber, N.R.; Das, R. Prediction of acute kidney injury with a machine learning algorithm using electronic health record data. Can. J. Kidney Health Dis. 2018, 5, 2054358118776326.
  16. Argyropoulos, A.; Townley, S.; Upton, P.M.; Dickinson, S.; Pollard, A.S. Identifying on admission patients likely to develop acute kidney injury in hospital. BMC Nephrol. 2019, 20, 56.
  17. Chen, W.; Hu, Y.; Zhang, X.; Wu, L.; Liu, K.; He, J.; Tang, Z.; Song, X.; Waitman, L.R.; Liu, M. Causal risk factor discovery for severe acute kidney injury using electronic health records. BMC Med. Inform. Decis. Mak. 2018, 18, 57–63.
  18. Morid, M.A.; Sheng, O.R.L.; Del Fiol, G.; Facelli, J.C.; Bray, B.E.; Abdelrahman, S. Temporal pattern detection to predict adverse events in critical care: Case study with acute kidney injury. JMIR Med. Inform. 2020, 8, e14272.
  19. Zhu, K.; Song, H.; Zhang, Z.; Ma, B.; Bao, X.; Zhang, Q.; Jin, J. Acute kidney injury in solitary kidney patients after partial nephrectomy: Incidence, risk factors and prediction. Transl. Androl. Urol. 2020, 9, 1232.
  20. Tomašev, N.; Glorot, X.; Rae, J.W.; Zielinski, M.; Askham, H.; Saraiva, A.; Mottram, A.; Meyer, C.; Ravuri, S.; Protsyuk, I. A clinically applicable approach to continuous prediction of future acute kidney injury. Nature 2019, 572, 116–119.
  21. Moons, K.G.; Wolff, R.F.; Riley, R.D.; Whiting, P.F.; Westwood, M.; Collins, G.S.; Reitsma, J.B.; Kleijnen, J.; Mallett, S. PROBAST: A tool to assess risk of bias and applicability of prediction model studies: Explanation and elaboration. Ann. Intern. Med. 2019, 170, W1–W33.
  22. Du, M.; Haag, D.; Song, Y.; Lynch, J.; Mittinty, M. Examining bias and reporting in oral health prediction modeling studies. J. Dent. Res. 2020, 99, 374–387.
  23. Ma, L.-L.; Wang, Y.-Y.; Yang, Z.-H.; Huang, D.; Weng, H.; Zeng, X.-T. Methodological quality (risk of bias) assessment tools for primary and secondary medical studies: What are they and which is better? Mil. Med. Res. 2020, 7, 7.
  24. Moons, K.G.; de Groot, J.A.; Bouwmeester, W.; Vergouwe, Y.; Mallett, S.; Altman, D.G.; Reitsma, J.B.; Collins, G.S. Critical appraisal and data extraction for systematic reviews of prediction modelling studies: The CHARMS checklist. PLoS Med. 2014, 11, e1001744.
  25. Lindroth, H.; Bratzke, L.; Purvis, S.; Brown, R.; Coburn, M.; Mrkobrada, M.; Chan, M.T.; Davis, D.H.; Pandharipande, P.; Carlsson, C.M. Systematic review of prediction models for delirium in the older adult inpatient. BMJ Open 2018, 8, e019223.
  26. Bradley, A.; Van der Meer, R.; McKay, C.J. A systematic review of methodological quality of model development studies predicting prognostic outcome for resectable pancreatic cancer. BMJ Open 2019, 9, e027192.
  27. Luo, X.-Q.; Yan, P.; Duan, S.-B.; Kang, Y.-X.; Deng, Y.-H.; Liu, Q.; Wu, T.; Wu, X. Development and validation of machine learning models for real-time mortality prediction in critically ill patients with sepsis-associated acute kidney injury. Front. Med. 2022, 9, 853102.
  28. Zhou, H.; Liu, L.; Zhao, Q.; Jin, X.; Peng, Z.; Wang, W.; Huang, L.; Xie, Y.; Xu, H.; Tao, L. Machine learning for the prediction of all-cause mortality in patients with sepsis-associated acute kidney injury during hospitalization. Front. Immunol. 2023, 14, 1140755.
  29. Gao, T.; Nong, Z.; Luo, Y.; Mo, M.; Chen, Z.; Yang, Z.; Pan, L. Machine learning-based prediction of in-hospital mortality for critically ill patients with sepsis-associated acute kidney injury. Ren. Fail. 2024, 46, 2316267.
  30. Li, X.; Wu, R.; Zhao, W.; Shi, R.; Zhu, Y.; Wang, Z.; Pan, H.; Wang, D. Machine learning algorithm to predict mortality in critically ill patients with sepsis-associated acute kidney injury. Sci. Rep. 2023, 13, 5223.
  31. Yang, J.; Peng, H.; Luo, Y.; Zhu, T.; Xie, L. Explainable ensemble machine learning model for prediction of 28-day mortality risk in patients with sepsis-associated acute kidney injury. Front. Med. 2023, 10, 1165129.
  32. Yuan, C.; Linn, K.A.; Hubbard, R.A. Algorithmic Fairness of Machine Learning Models for Alzheimer Disease Progression. JAMA Netw. Open 2023, 6, e2342203.
  33. Rajkomar, A.; Hardt, M.; Howell, M.D.; Corrado, G.; Chin, M.H. Ensuring fairness in machine learning to advance health equity. Ann. Intern. Med. 2018, 169, 866–872.
  34. Schuch, H.S.; Furtado, M.; dos Santos Silva, G.F.; Kawachi, I.; Chiavegatto Filho, A.D.; Elani, H.W. Fairness of Machine Learning Algorithms for Predicting Foregone Preventive Dental Care for Adults. JAMA Netw. Open 2023, 6, e2341625.
  35. Patzer, R.E.; McClellan, W.M. Influence of race, ethnicity and socioeconomic status on kidney disease. Nat. Rev. Nephrol. 2012, 8, 533–541.
  36. Bjornstad, E.C.; Marshall, S.W.; Mottl, A.K.; Gibson, K.; Golightly, Y.M.; Charles, A.; Gower, E.W. Racial and health insurance disparities in pediatric acute kidney injury in the USA. Pediatr. Nephrol. 2020, 35, 1085–1096.
  37. Mohottige, D.; Diamantidis, C.J.; Norris, K.C.; Boulware, L.E. Racism and kidney health: Turning equity into a reality. Am. J. Kidney Dis. 2021, 77, 951–962.
  38. Landers, R.N.; Behrend, T.S. Auditing the AI auditors: A framework for evaluating fairness and bias in high stakes AI predictive models. Am. Psychol. 2023, 78, 36.
  39. Chen, P.; Wu, L.; Wang, L. AI Fairness in Data Management and Analytics: A Review on Challenges, Methodologies and Applications. Appl. Sci. 2023, 13, 10258.
  40. Huang, Y.; Guo, J.; Chen, W.-H.; Lin, H.-Y.; Tang, H.; Wang, F.; Xu, H.; Bian, J. A scoping review of fair machine learning techniques when using real-world data. J. Biomed. Inform. 2024, 151, 104622.
  41. Ntoutsi, E.; Fafalios, P.; Gadiraju, U.; Iosifidis, V.; Nejdl, W.; Vidal, M.E.; Ruggieri, S.; Turini, F.; Papadopoulos, S.; Krasanakis, E. Bias in data-driven artificial intelligence systems—An introductory survey. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2020, 10, e1356. [Google Scholar] [CrossRef]
  42. Feng, Q.; Du, M.; Zou, N.; Hu, X. Fair machine learning in healthcare: A review. arXiv 2022, arXiv:2206.14397. [Google Scholar]
  43. Giovanola, B.; Tiribelli, S. Beyond bias and discrimination: Redefining the AI ethics principle of fairness in healthcare machine-learning algorithms. AI Soc. 2023, 38, 549–563. [Google Scholar] [CrossRef]
  44. Fortin, M. Perspectives on organizational justice: Concept clarification, social context integration, time and links with morality. Int. J. Manag. Rev. 2008, 10, 93–126. [Google Scholar] [CrossRef]
  45. Johnson, K.B.; Wei, W.Q.; Weeraratne, D.; Frisse, M.E.; Misulis, K.; Rhee, K.; Zhao, J.; Snowdon, J.L. Precision medicine, AI, and the future of personalized health care. Clin. Transl. Sci. 2021, 14, 86–93. [Google Scholar] [CrossRef]
Figure 1. PRISMA flowchart diagram of study selection.
Figure 2. Overall risk of bias assessment of included studies using PROBAST.
Table 1. Baseline characteristics of included studies.
| Characteristic | Li et al. [30] | Luo et al. [27] | Yang et al. [31] | Zhou et al. [28] | Gao et al. [29] |
| --- | --- | --- | --- | --- | --- |
| Database (N) | MIMIC IV (N = 8129) | MIMIC IV (N = 12,132) | MIMIC IV (N = 9158) | MIMIC IV (N = 16,154) | MIMIC IV (N = 12,196) |
| Demographic features | | | | | |
| Age | 68.7 (57.2, 79.6) | 68 (58–80) | 67.0 (57.0–78.0) | 67.7 (15.2) | 67.0 ± 16.1 |
| Sex, male, n (%) | 4708 (57.9) | 6920 (57.0) | 5242 (57.2) | 9318 (57.7) | 6995 (57.4) |
| Body Mass Index (BMI) | N/A | N/A | 29.2 (25.0–34.3) | 30.9 (8.8) | 29.6 ± 7.8 |
| Race and Ethnicity | | | | | |
| White | N/A | 8417 (69.3) | N/A | 10,920 (67.6) | N/A |
| Black | N/A | 1063 (8.7) | N/A | 1733 (10.7) | N/A |
| Hispanic | N/A | 375 (3.0) | N/A | 538 (3.3) | N/A |
| Asian | N/A | 297 (2.2) | N/A | 377 (2.3) | N/A |
| Others | N/A | 2010 (16.6) | N/A | 2586 (16.0) | N/A |
| Comorbidities, n (%) | | | | | |
| Chronic pulmonary disease | 2358 (29.0) | N/A | 2352 (25.7) | N/A | 3398 (27.9) |
| Peptic ulcer disease | 261 (3.2) | N/A | 268 (2.93) | N/A | N/A |
| Peripheral vascular disease | 1143 (14.1) | N/A | 1055 (11.5) | N/A | N/A |
| Myocardial infarction | 1643 (20.2) | N/A | 1522 (16.6) | N/A | 2308 (18.9) |
| Cerebrovascular disease | 1250 (15.4) | N/A | 1200 (13.1) | N/A | 2101 (17.2) |
| Diabetes | 2566 (31.6) | N/A | 2224 (24.3) | N/A | 3669 (30.1) |
| Renal disease | 1938 (23.8) | N/A | N/A | N/A | 2394 (19.6) |
| Liver disease | 1384 (17.0) | N/A | 1548 (16.9) | 3253 (20.1) | 1611 (13.2) |
| Cancer | 1108 (13.6) | N/A | 1241 (13.6) | N/A | 1611 (13.2) |
| Congestive heart failure | 2831 (34.8) | N/A | 2366 (25.8) | N/A | 4010 (32.9) |
| Vital signs | | | | | |
| Heart rate (beats/minute) | 86.1 (76.2–98.5) | 87 (76–99) | 105 (92.0–120) | 106.1 (21.6) * | N/A |
| Mean arterial pressure (mmHg) | 74.8 (69.6–81.3) | N/A | 57.0 (50.0–63.0) | N/A | 76.7 ± 10.1 |
| Respiratory rate (breaths/minute) | 19.3 (16.9–22.4) | 20 (17–24) | 28.0 (24.0–32.0) | 28.7 (6.7) * | 19.9 ± 4.1 |
| Body temperature (°C) | 36.9 (36.6–37.3) | 36.9 (36.6–37.3) | 37.4 (37.0–38.0) | 37.5 (0.8) | 36.9 ± 0.7 |
| SpO2 (%) | 97.4 (95.9–98.7) | 97 (95–99) | 93.0 (90.0–95.0) | 100 (100–100) | 97.1 (2.2) |
| Systolic blood pressure (mmHg) | N/A | 117 (106–130) | 86.0 (78.0–94.0) | 85.9 (16.4) | 115.4 ± 15.1 |
| Diastolic blood pressure (mmHg) | N/A | 60 (53–68) | 44.0 (38.0–50.0) | | 61.3 ± 10.4 |
| Laboratory results | | | | | |
| Serum creatinine (mg/dL) | 1.3 (0.9–2.1) | 1.1 (0.7–1.9) | 1.10 (0.80–1.60) | 1.4 (0.9–2.5) | 2.0 ± 1.7 |
| Serum glucose (mg/dL) | 155 (124–209) | N/A | 101 (86.0–124) | 143.0 (115.0–194.0) | 179.5 ± 102.9 |
| Serum chloride (mEq/L) | 107 (103–111) | 104 (100–108) | 103 (99.0–106) | N/A | N/A |
| Serum calcium (mg/dL) | 8.5 (8.0–9.0) | N/A | 7.90 (7.40–8.40) | N/A | N/A |
| Hematocrit (%) | 34.8 (30.7–39.7) | N/A | | N/A | N/A |
| Hemoglobin (g/dL) | 11.4 (10.0–13.1) | 9.4 (8.3–10.6) | 9.80 (8.30–11.3) | N/A | 9.9 ± 2.2 |
| Platelets (K/µL) | 209 (151–282) | 178 (112–270) | 155 (105–218) | 189.0 (135.0–257.0) | 178.7 ± 104.7 |
| Anion gap (mEq/L) | 17.0 (14.0–20.0) | N/A | 16.0 (13.0–19.0) | N/A | 16.9 ± 5.1 |
| WBC (K/µL) | 14.6 (10.6–19.8) | 11.3 (8.2–15.4) | 14.8 (10.8–19.8) | 13.5 (9.5–18.6) | 16.2 ± 11.8 |
| International normalized ratio (INR) | 1.4 (1.2–1.7) | 1.3 (1.2–1.6) | N/A | N/A | 1.4 (1.2–1.7) |
| Partial pressure of oxygen (PaO2) (mmHg) | N/A | 104 (84–133) | 75.0 (46.0–103) | 174.0 (104.0–321.0) | 82.8 ± 54.3 |
| Partial pressure of carbon dioxide (PaCO2) (mmHg) | N/A | 40 (35–46) | 47.0 (41.0–54.0) | 100.0 (100.0–100.0) | 49.3 ± 14.7 |
| PaO2/FiO2 ratio | N/A | N/A | 168 (100–248) | N/A | N/A |
| pH | N/A | 7.41 (7.36–7.45) | 7.31 (7.24–7.36) | N/A | N/A |
| NLR ratio | N/A | N/A | 7.88 (3.99–15.9) | N/A | N/A |
| PLR ratio | N/A | N/A | 145 (77.8–281) | N/A | N/A |
| Bicarbonate (mmol/L) | 24.0 (21.0–27.0) | 24 (21–28) | 24.0 (22.0–26.0) | N/A | 20.9 ± 5.0 |
| Serum sodium (mEq/L) | 140 (137–143) | 139 (136–143) | 137 (134–140) | N/A | 144.8 ± 5.8 |
| BUN (mg/dL) | 26.0 (17.0–42.0) | 28 (17–46) | 21.0 (16.0–32.0) | 27.0 (18.0–45.0) | N/A |
| Serum potassium (mEq/L) | 4.6 (4.2–5.1) | 4.0 (3.7–4.4) | 4.50 (4.10–5.00) | N/A | 5.1 ± 0.9 |
| Prothrombin time (s) | 15.2 (13.2, 18.7) | N/A | N/A | N/A | 18.4 ± 12.4 |
| Partial thromboplastin time (PTT) (s) | 34.2 (29.0–48.9) | 32.5 (28.2–43.0) | N/A | 34.4 (29.3–46.4) | 33.9 (28.7–47.5) |
| Urine output (mL) | 1295 (765–2000) | 750 (410–1260) | 1280 (815–1875) | 1040.0 (537.0–1665.0) | 1674.5 ± 1205.7 |
| Treatments, n (%) | | | | | |
| RRT | 592 (7.3) | 8697 (11.2) | 248 (2.71) | 1633 (10.1) | N/A |
| Vasopressors use | 785 (9.7) | 22,573 (29.2) | N/A | 5912 (36.6) | 1873 (15.4) |
| Mechanical ventilation | 7485 (92.1) | 45,758 (59.1) | 5326 (58.2) | 9518 (58.9) | 11,764 (96.5) |
| Severity scores of illnesses | | | | | |
| Sequential Organ Failure (SOFA) score | 7 (5.0–10.0) | N/A | 3.00 (2.00–5.00) | 6.0 (4.0–9.0) | 3.0 (2.0–5.0) |
| Simplified Acute Physiology Score (SAPS II) | 42 (34–52) | N/A | 39.0 (31.0–50.0) | 42.0 (34.0–52.0) | 42.1 ± 14.1 |
| APSIII | N/A | N/A | 52.0 (37.0–75.0) | N/A | N/A |
Note: N = the number of participants in the study; * = standard deviation, N/A = Not Applicable.
Table 2. Summary of included studies on machine learning models for mortality prediction in sepsis-associated acute kidney injury.
| Author | Publication Year | Setting | Selection Criteria | % of Missingness | Imputation Method | Feature Selection Method | Data Partition | External Validation | Model Testing in Different Racial Groups |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Luo et al. [27] | 2022 | ICU | Sepsis-3 | N/A | XGBoost | XGBoost | 50:30:20 | No | No |
| Li et al. [30] | 2023 | ICU | Sepsis-3 | <20 | MiceForest | LASSO | 80:20 | No | No |
| Yang et al. [31] | 2023 | ICU | Sepsis-3 | <20 | Random forest | Boruta algorithm | 5-fold | No | No |
| Zhou et al. [28] | 2023 | ICU | Sepsis-3 | N/A | K-nearest neighbor | Recursive feature elimination | 80:20 | Yes | No |
| Gao et al. [29] | 2024 | ICU | Sepsis-3 | <25 | MiceForest | Random forest | 10-fold | No | No |
N/A = Not Available.
Table 3. The performance of machine learning models for predicting mortality in patients with sepsis-associated acute kidney injury.
| Study | Model | AUROC | Accuracy | Sensitivity | Specificity | Precision | F1-Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Li et al. [30] | Decision Tree (DT) | 0.585 (0.547–0.623) | 0.737 | 0.378 | 0.812 | 0.425 | N/A |
| | K-nearest Neighbor (K-NN) | 0.601 (0.563–0.638) | 0.793 | 0.367 | 0.783 | 0.429 | N/A |
| | Support Vector Machine (SVM) | 0.680 (0.643–0.717) | 0.826 | 0.562 | 0.736 | 0.556 | N/A |
| | Logistic Regression (LR) | 0.730 (0.694–0.765) | 0.822 | 0.608 | 0.754 | 0.572 | N/A |
| | Random Forest (RF) | 0.778 (0.745–0.812) | 0.825 | 0.739 | 0.674 | 0.622 | N/A |
| | Extreme Gradient Boosting (XGBoost) | 0.794 (0.762–0.827) | 0.832 | 0.793 | 0.752 | 0.660 | N/A |
| Luo et al. [27] | Support Vector Machine (SVM) | 0.754 (0.740–0.769) | N/A | N/A | N/A | N/A | N/A |
| | Random Forest (RF) | 0.829 (0.819–0.840) | N/A | N/A | N/A | N/A | N/A |
| | Extreme Gradient Boosting (XGBoost) | 0.848 (0.838–0.858) | 0.789 | 0.743 | 0.791 | N/A | N/A |
| Yang et al. [31] | Random Forest (RF) | 0.849 (0.834–0.863) | N/A | N/A | N/A | N/A | N/A |
| | Logistic Regression (LR) | 0.85 (0.836–0.864) | N/A | N/A | N/A | N/A | N/A |
| | Gradient Boosting (GBM) | 0.865 (0.851–0.878) | N/A | N/A | N/A | N/A | N/A |
| | Extreme Gradient Boosting (XGBoost) | 0.873 (0.860–0.886) | 0.773 | 0.896 | N/A | 0.724 | 0.801 |
| Zhou et al. [28] | Naive Bayes | 0.76 | 0.68 | 0.74 | 0.67 | 0.37 | 0.49 |
| | Support Vector Machine (SVM) | 0.76 | 0.74 | 0.69 | 0.75 | 0.43 | 0.53 |
| | Multi-layer Perceptron (MLP) | 0.79 | 0.73 | 0.70 | 0.73 | 0.41 | 0.52 |
| | Logistic Regression | 0.79 | 0.73 | 0.71 | 0.74 | 0.41 | 0.52 |
| | K-nearest Neighbor (KNN) | 0.80 | 0.72 | 0.73 | 0.72 | 0.41 | 0.52 |
| | Extreme Gradient Boosting (XGBoost) | 0.81 | 0.77 | 0.68 | 0.79 | 0.46 | 0.55 |
| | Gradient Boosting Decision Tree (GBDT) | 0.82 | 0.71 | 0.79 | 0.69 | 0.40 | 0.53 |
| | Light Gradient Boosting (LightGBM) | 0.82 | 0.74 | 0.75 | 0.74 | 0.43 | 0.55 |
| | Adaptive Boosting (AdaBoost) | 0.82 | 0.79 | 0.65 | 0.83 | 0.51 | 0.57 |
| | Random Forest | 0.82 | 0.78 | 0.66 | 0.81 | 0.48 | 0.55 |
| | Categorical Boosting (CatBoost) | 0.83 | 0.75 | 0.75 | 0.75 | 0.44 | 0.56 |
| Gao et al. [29] | Decision Tree | 0.635 (0.611–0.659) | 0.765 | N/A | N/A | 0.395 | 0.409 |
| | K-nearest Neighbor | 0.689 (0.663–0.714) | 0.805 | N/A | N/A | 0.487 | 0.359 |
| | SVM Linear | 0.718 (0.691–0.743) | 0.809 | N/A | N/A | 1.000 | 0.008 |
| | Logistic Regression | 0.756 (0.732–0.779) | 0.816 | N/A | N/A | 0.568 | 0.258 |
| | Naive Bayesian | 0.764 (0.741–0.786) | 0.795 | N/A | N/A | 0.446 | 0.354 |
| | XGBoost | 0.796 (0.774–0.817) | 0.823 | N/A | N/A | 0.569 | 0.414 |
| | Random Forest | 0.798 (0.774–0.821) | 0.832 | N/A | N/A | 0.661 | 0.372 |
N/A = Not Available.
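As a reading aid for Table 3, the sketch below recomputes the F1-score from the reported precision and sensitivity for two rows. It assumes the included studies used the standard definition of F1 as the harmonic mean of precision and recall (sensitivity), which the review does not state explicitly. The function name `f1_from_precision_recall` is hypothetical and chosen only for this illustration.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the F1-score, the harmonic mean of precision and recall (sensitivity)."""
    if precision + recall == 0:
        return 0.0  # common convention when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# (precision, sensitivity, reported F1) copied from Table 3.
# Assumption: the studies report the standard F1 definition.
reported = {
    "Yang et al. [31], XGBoost": (0.724, 0.896, 0.801),
    "Zhou et al. [28], Naive Bayes": (0.37, 0.74, 0.49),
}

for label, (precision, sensitivity, f1_reported) in reported.items():
    f1 = f1_from_precision_recall(precision, sensitivity)
    print(f"{label}: recomputed F1 = {f1:.3f}, reported F1 = {f1_reported}")
```

For these two rows the recomputed values agree with the reported ones after rounding. Because the published precision and sensitivity are themselves rounded, other rows may differ in the last digit, so a small mismatch does not by itself indicate an error in the source study.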
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
