Machine Learning-Based Predictions of Mortality and Readmission in Type 2 Diabetes Patients in the ICU

Hu, Tung-Lai; Chao, Chuang-Min; Wu, Chien-Chih; Chien, Te-Nien; Li, Chengcheng

doi:10.3390/app14188443

by Tung-Lai Hu ¹, Chuang-Min Chao ¹, Chien-Chih Wu ²,*, Te-Nien Chien ² and Chengcheng Li ²

¹ Department of Business Management, National Taipei University of Technology, Taipei 106, Taiwan

² College of Management, National Taipei University of Technology, Taipei 106, Taiwan

* Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(18), 8443; https://doi.org/10.3390/app14188443

Submission received: 22 August 2024 / Revised: 13 September 2024 / Accepted: 17 September 2024 / Published: 19 September 2024

(This article belongs to the Special Issue Machine Learning in Biomedical Applications)

Abstract

Prognostic outcomes for patients with type 2 diabetes in the intensive care unit (ICU), including mortality and readmission rates, are critical for informed clinical decision-making. Although existing research has established a link between type 2 diabetes and adverse outcomes in the ICU, the potential of machine learning techniques for enhancing predictive accuracy has not been fully realized. This study seeks to develop and validate predictive models employing machine learning algorithms to forecast mortality and 30-day post-discharge readmission rates among ICU type 2 diabetes patients, thereby enhancing predictive accuracy and supporting clinical decision-making. Data were extracted and preprocessed from the MIMIC-III database, focusing on 14,222 patients with type 2 diabetes and their corresponding ICU admission records. Comprehensive information, including vital signs, laboratory results, and demographic characteristics, was utilized. Six machine learning algorithms—bagging, AdaBoost, GaussianNB, logistic regression, MLP, and SVC—were developed and evaluated using 10-fold cross-validation to predict mortality at 3 days, 30 days, and 365 days, as well as 30-day post-discharge readmission rates. The machine learning models demonstrated strong predictive performance for both mortality and readmission rates. Notably, the bagging and AdaBoost models showed superior performance in predicting mortality across various time intervals, achieving AUC values up to 0.8112 and an accuracy of 0.8832. In predicting 30-day readmission rates, the MLP and AdaBoost models yielded the highest performance, with AUC values reaching 0.8487 and accuracy rates of 0.9249. The integration of electronic health record data with advanced machine learning techniques significantly enhances the accuracy of mortality and readmission predictions in ICU type 2 diabetes patients. 
These models facilitate the identification of high-risk patients, enabling timely interventions, improving patient outcomes, and demonstrating the significant potential of machine learning in clinical prediction and decision support.

Keywords: type 2 diabetes; electronic health records; machine learning; mortality; readmission

1. Introduction

Diabetes is a complex metabolic disorder characterized by persistent hyperglycemia resulting from insulin secretion defects. Projections indicate that the number of people with diabetes could rise to 643 million by 2030 [1]. Fueled by factors such as aging, urbanization, and rising obesity, diabetes poses an increasing global health challenge with substantial economic and healthcare impacts [2]. Type 2 diabetes, which accounts for 90% of all diabetes cases, is associated with β-cell dysfunction, insulin resistance, and a range of complications, including heightened cardiovascular mortality and increased renal glucose reabsorption. The growing prevalence of type 2 diabetes places a significant burden on individuals and healthcare systems globally [3]. In critically ill patients, type 2 diabetes is a common comorbidity [4], leading to a significant increase in ICU admissions for type 2 diabetes patients. These patients now represent over 45% of ICU resource utilization, exceeding that of other chronic conditions [4]. Despite the critical nature of these trends, research on readmission rates among patients with type 2 diabetes remains limited [5], which could have profound implications for the management and allocation of healthcare resources.

In the United States, 40% of hospitalized patients succumb to their illnesses, with approximately 22% of these patients spending their entire hospitalization in the ICU [6]. This imposes a substantial burden on the healthcare system, potentially influencing ICU admission criteria, intervention strategies, and lengths of stay, thereby affecting patient outcomes. Patients with type 2 diabetes admitted to the ICU face higher risks of severe disease and complications due to impaired immune responses, significantly impacting survival rates [7]. Existing research on mortality among ICU type 2 diabetes patients is limited, primarily focusing on factors associated with increased mortality rather than comprehensive predictive models. Previous prognostic models, such as Cox regression and linear regression, perform optimally when the duration of the type 2 diabetes and cohort characteristics are known [8]; however, few studies have integrated multiple factors to predict mortality comprehensively. Predictive models utilizing simple clinical and demographic information, such as vital signs and age, offer a resource-efficient and practical approach, as vital signs are non-invasive and universally recognized by healthcare professionals as fundamental health indicators. While these variables can partially predict mortality in ICU patients, most studies have predominantly employed quantitative data. Existing statistical prediction models excel at manipulating quantitative variables, whereas standardizing and incorporating qualitative data remain challenging [9]. This underscores the need for advanced methodologies to integrate diverse data types effectively, ultimately improving mortality predictions and patient outcomes in the ICU setting.

Hospital readmission rates and lengths of stay wield substantial influence on healthcare expenditures, particularly emphasizing the need to curtail readmissions within a 30-day window [10]. Recent reports underscore a significant financial burden of $41 billion on the United States healthcare system that is attributed to hospital readmissions among type 2 diabetes patients within this timeframe [11]. Moreover, individuals with type 2 diabetes face an escalated susceptibility to COVID-19 infection, with studies indicating a threefold mortality rate in comparison to the general populace [12]. Given these pressing challenges, precise prediction of readmissions and lengths of stay assumes paramount importance in efficiently managing hospital bed availability and upholding service quality. Accumulating evidence documents type 2 diabetes as an independent risk factor for all-cause mortality [13]. The risk of all-cause mortality in persons with type 2 diabetes is approximately doubled. However, these conclusions are mainly based on the assumption that the risk of diabetes in women is the same as in men [14]. The sex-based difference in the risk of diabetes may not only be due to patient management and treatment but also to diverse biological factors [15]. Consequently, this domain has garnered significant scholarly attention in recent years [5,11].

The exponential growth of healthcare data is significantly accelerating the transition towards precision and personalized medicine. Artificial intelligence (AI) and algorithmic methodologies are becoming crucial to enhancing clinical decision-making, healthcare delivery, and the provision of health services. AI, employing advanced computer science techniques and massive datasets, is increasingly vital in medical research, as evidenced by its applications across a spectrum of studies in biology and medicine. Particularly in the realm of type 2 diabetes, AI tools, such as machine learning and neural networks, not only facilitate therapeutic monitoring but are also instrumental in predicting the onset and future complications of type 2 diabetes [16,17,18]. These technological advancements streamline routine operations, thus empowering healthcare providers to focus on more critical issues and improve the efficiency of medical care [19]. For instance, these tools enable patients to perform preliminary self-assessments of their type 2 diabetes status using personal physiological data, which aids physicians in making faster and more accurate diagnoses. Moreover, the implementation of AI in predicting disease patterns is anticipated to contribute to a reduction in the global prevalence of type 2 diabetes, which currently stands at 8.8% [20]. This demonstrates the profound impact AI is poised to have on global health outcomes and the management of chronic diseases such as type 2 diabetes.

The management of type 2 diabetes is notably complex, affected by a variety of sociodemographic, medical, and process-related factors, which cumulatively contribute to the challenge of mitigating readmission risks for patients with this condition. Beyond the systemic issues within healthcare settings, patient-specific factors play a crucial role in influencing readmission rates [21]. Notably, poor health literacy emerges as a significant barrier, as effective type 2 diabetes management necessitates a higher level of patient involvement than many other chronic illnesses. Patients with limited health literacy are often unable to recall or fully understand their discharge instructions, which significantly heightens their risk of readmission. Although numerous clinical trials have been conducted with the aim of reducing readmission rates, the focus has predominantly been on broader hospital populations, including unselected hospitalized patients, medical service users, the elderly, or those admitted for conditions like heart failure. These studies have reported reductions in 30-day readmission risks ranging from 30% to 75%. However, the specific challenges associated with type 2 diabetes have received less attention, underscoring the need for targeted research [5,22].

To demonstrate the effectiveness of our approach, this study utilized comprehensive clinical data from the MIMIC-III repository, focusing on patients admitted to the ICU at Beth Israel Deaconess Medical Center in Boston, MA. The dataset provides detailed patient information, including laboratory test results, demographic characteristics, microbiology findings, treatment courses, and fluid volumes in and out. The MIMIC-III dataset is extensively used in a range of research areas, including clinical prediction, disease diagnosis, and treatment strategies, and has been featured in numerous studies and publications across various domains. By leveraging machine learning to identify patterns in readmissions among type 2 diabetes patients, our study aims to develop customized interventions to effectively reduce these rates. We propose a novel approach integrating machine learning techniques to forecast readmission and mortality rates among type 2 diabetes patients, thereby enhancing prognostic accuracy and supporting clinical decision-making. Specifically, our study focuses on quantitative data from ICU patients in the 24 h preceding discharge, aiming to predict readmissions within 30 days post-discharge and mortality at 3, 30, and 365 days, utilizing the extensive clinical data available in the MIMIC-III dataset.

2. Materials and Methods

In our study, we collected quantitative data from ICU patients with type 2 diabetes. These data consisted primarily of vital signs observed within the 24 h prior to ICU discharge and basic patient information collected during hospitalization. The collected data were then filtered and preprocessed. To model readmission within 30 days of discharge and mortality at 3, 30, and 365 days, this study employed six commonly used machine learning methods: AdaBoost, bagging, GaussianNB, logistic regression, MLP, and SVC. Finally, model performance was evaluated using five metrics: precision, recall, F1 score, accuracy, and AUROC. Our research framework combined multiple data analysis and machine learning techniques to build predictive models for evaluating mortality and readmission in ICU patients with type 2 diabetes, providing valuable insights for clinical decision-making and treatment strategies. The research framework is shown in Figure 1.

2.1. Data Collection and Preprocessing

Our study leveraged the Medical Information Mart for Intensive Care (MIMIC-III) repository, renowned for its extensive clinical data on patients admitted to Beth Israel Deaconess Medical Center in Boston, Massachusetts. MIMIC-III serves as the primary repository of de-identified public materials for research purposes [23]. This comprehensive dataset encompasses various clinical parameters such as laboratory test results, demographic attributes, microbiology findings, hemodynamic parameters, treatment regimens, fluid balance, and other pertinent data. The wealth of clinical information embedded in MIMIC-III renders it an invaluable resource for medical research. Data were extracted from MIMIC-III using SQL queries. The database contains information regarding ICU admission, medications, vitals, duration of stay, ICD-9-CM codes, and laboratory reports. Patients admitted to the ICU with an ICD-9-CM diagnosis code for diabetes mellitus (diabetes type 1 and 2, secondary and gestational diabetes) were identified. In this study, we utilized MIMIC-III to analyze data from 46,520 patient records and 58,976 admission records, including vital signs, medications, laboratory measurements, and observation records. The average age of adults in this dataset was 66 years, the average ICU length of stay was 4.4 days, and the mortality rate was 8.9%. Ethical approval to access the MIMIC-III database was obtained by completing the National Institutes of Health (NIH) online course, passing the Human Research Participant Protection Examination, and submitting an access request (certificate number: 35628530), thus ensuring that appropriate safeguards were in place for all data used in this study.

To ensure the comparability of our results with previous relevant studies, we adhered to identical patient selection criteria, avoiding analyses based on specific diseases. Instead, we integrated data from all patients, thereby enhancing the generalizability of our findings. This methodology enables us to contribute to the existing literature on predicting ICU mortality and readmission rates while maintaining consistency with prior research [24,25]. The objectives of this study were twofold: first, to predict patient mortality at 3, 30, and 365 days post discharge, and second, to predict ICU readmission within 30 days of discharge. The patient cohort was selected based on the following criteria: only the first ICU stay of each patient was considered, excluding all subsequent ICU stays; adult patients (age ≥ 16); and patients with type 2 diabetes.

The study used measurements obtained during the 24 h prior to patient discharge: heart rate (X1); respiratory rate (X2); diastolic blood pressure (X3); systolic blood pressure (X4); temperature (X5); glycosylated hemoglobin (X6); random blood glucose (X7); creatinine (X8); non-invasive mean blood pressure (X9); glucose (X10); white blood cell count (X11); red blood cell count (RBC) (X12); sodium level (X13); and serum bicarbonate level (X14). Additionally, the following basic demographic and administrative variables were considered: age (X15); ICU length of stay (X16); hospital length of stay (X17); gender (X18); marital status (X19); ethnicity (X20); insurance type (X21); and last ICU type (X22). In total, 22 structured predictors were used in this study. Appendix A shows the selected variables.

Accurately resolving missing values is critical in research, and this study highlights the complexity of handling missing data during preprocessing. This challenge, while reflective of real-world conditions, poses a significant obstacle to the development of mortality and readmission prediction models. To manage missing data, we excluded patients with more than 30% missing values [26]. Our study addresses this challenge using data preprocessing techniques. In our study, we used specific selection criteria to identify ICU mortality and readmissions [4,27,28,29]. The dataset we studied included a total of 14,222 hospitalizations, in which mortality within 365 days of discharge was studied in 6793 ICU patients and readmissions were studied in 11,317 ICU patients. Figure 2 illustrates the data extraction process used in our study.

Mortality prediction: Mortality in this context is defined as the patient’s outcome at hospital discharge, forming a binary classification task. The cohort for this analysis was selected based on the availability of hospital discharge status in patients’ records and a minimum length of stay of at least 48 h. The focus is on predictions made during the first 24 and 48 h of the ICU stay;
Readmission prediction: Readmission is defined as the outcome of whether a patient was readmitted to the hospital following discharge, constituting a binary classification task. This study includes criteria encompassing patients readmitted to the ICU within 30 days of initial discharge and those who died within the same period.

2.1.1. Variable Selection

In this study, we employed the data preprocessing method proposed by [26], implementing a three-stage approach to handle missing values. Initially, we excluded patients with more than 30% missing data. Next, we removed predictors with over 40% missing values. In the third stage, we eliminated variables with a missing data rate exceeding 20%, following which the mean imputation was applied to the remaining missing values. Subsequently, the information gain technique (entropy) was utilized to assess the significance of the 22 variables. Our final variable selection was based on those attributes with a score of 0.01 or higher.
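As a rough illustration of the preprocessing described above, the following Python sketch applies the three missing-data thresholds, mean imputation, and an entropy-based information gain score. It is not the authors' code: the function names, the use of `None` to mark missing values, and the equal-width binning of continuous predictors before computing information gain are all assumptions made for illustration.

```python
import math
from collections import Counter


def missing_fraction(values):
    """Share of entries that are missing (None) in a row or column."""
    return sum(v is None for v in values) / len(values)


def three_stage_clean(rows, row_max=0.30, col_max_first=0.40, col_max_final=0.20):
    """Drop sparse patients, then sparse predictors, then mean-impute.

    rows: list of equal-length lists, one per patient; None marks a missing value.
    Returns (cleaned_rows, kept_column_indices).
    """
    # Stage 1: exclude patients with more than 30% missing values.
    rows = [r for r in rows if missing_fraction(r) <= row_max]
    if not rows:
        return [], []
    # Stages 2 and 3: exclude predictors above the 40% and then the 20% limit.
    keep = list(range(len(rows[0])))
    for limit in (col_max_first, col_max_final):
        keep = [j for j in keep
                if missing_fraction([r[j] for r in rows]) <= limit]
    cleaned = [[r[j] for j in keep] for r in rows]
    # Mean imputation of whatever is still missing.
    for k in range(len(keep)):
        observed = [r[k] for r in cleaned if r[k] is not None]
        mean = sum(observed) / len(observed)
        for r in cleaned:
            if r[k] is None:
                r[k] = mean
    return cleaned, keep


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels, bins=4):
    """Entropy reduction from splitting on an equal-width binned feature.

    The binning scheme is an assumption; the paper does not describe one.
    """
    lo, hi = min(feature), max(feature)
    width = (hi - lo) / bins or 1.0  # constant feature: everything in bin 0
    binned = [min(int((x - lo) / width), bins - 1) for x in feature]
    groups = {}
    for b, y in zip(binned, labels):
        groups.setdefault(b, []).append(y)
    conditional = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_features(columns, labels, threshold=0.01):
    """Indices of predictors whose information gain is at least the threshold."""
    return [j for j, col in enumerate(columns)
            if information_gain(col, labels) >= threshold]
```

With the thresholds as stated, the 20% limit in the third stage is stricter than the 40% limit in the second, so the final set of predictors is determined by the 20% limit.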

2.1.2. Dealing with an Imbalanced Dataset

Because far fewer ICU patients died or were readmitted than survived without readmission, we employed the synthetic minority oversampling technique (SMOTE) to balance the dataset. SMOTE is a widely used oversampling method in machine learning for handling imbalanced data [30]. This technique generates new samples for the minority class by interpolating between the nearest neighbors of existing minority class instances. These synthetic samples are created based on the features of the original dataset, ensuring that they closely resemble the original minority class instances [24].
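The interpolation step at the core of SMOTE can be sketched in a few lines of Python. This is a simplified illustration and not the implementation used in the study: the function name `smote`, the default of five neighbors, and the fixed random seed are assumptions.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by neighbor interpolation.

    minority: list of feature vectors (lists of floats) from the minority class;
    at least two samples are required.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic
```

To balance a dataset in this way, `n_new` would typically be set to the difference between the majority and minority class sizes.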

2.2. Machine Learning

The dataset was organized and partitioned into training and testing subsets, with 80% allocated for training and 20% for testing. To predict mortality and readmission rates among patients with type 2 diabetes, we developed models utilizing six widely used machine learning algorithms, as follows: AdaBoost, bagging, GaussianNB, logistic regression, MLP, and SVC. All data mining tasks were performed using the Python 3.12 programming language. Table 1 provides detailed descriptions of the machine learning classification algorithms employed in this study.
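The partitioning can be illustrated with a short Python sketch. The paper reports both an 80/20 split and 10-fold cross-validation without stating how the two are combined; the sketch below assumes that the folds are drawn from the training portion, and the function names and random seed are hypothetical.

```python
import random


def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle the indices 0..n-1 and split them into (train, test) lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(round(n * test_fraction))
    return idx[n_test:], idx[:n_test]


def k_fold(indices, k=10):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, validation


# Example with the cohort size reported in the paper.
train_idx, test_idx = train_test_split_indices(14222)
for fold_train, fold_val in k_fold(train_idx, k=10):
    pass  # fit a model on fold_train and evaluate it on fold_val
```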

2.3. Performance Evaluation

To comprehensively assess the impact of quantitative information on the prediction of ICU patient mortality and readmission rates, this study employed the following five distinct evaluation metrics: the area under the receiver operating characteristic (AUROC) curve; precision; recall; F1 score; and accuracy.

\mathrm{Precision} = \mathrm{PPV} = \frac{TP}{TP + FP} \qquad (1)

\mathrm{Recall} = \mathrm{TPR} = \frac{TP}{TP + FN} \qquad (2)

\mathrm{F1\text{-}score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \qquad (3)

\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \qquad (4)

In predictive modeling, performance evaluation metrics are crucial for assessing the accuracy and effectiveness of a model. In binary classification tasks, these metrics are built from four counts: true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). Precision measures the proportion of correct predictions among the samples classified as positive, that is, the ratio of true positives to all samples classified as positive. Recall, or sensitivity, reflects the proportion of actual positive samples that are correctly identified, that is, the ratio of true positives to all actual positive samples. These metrics are integral to identification and prediction algorithms. The F1 score, the harmonic mean of precision and recall, provides a single balanced measure of model performance and is often employed to evaluate classification algorithms. Another critical performance metric is accuracy, which quantifies the proportion of correct predictions across all samples, calculated as the ratio of correctly classified samples to the total number of samples in the test dataset. In diagnostic testing, the area under the receiver operating characteristic (AUROC) curve is the predominant metric for evaluating the performance of diagnostic tools. The ROC curve graphically represents the relationship between the true positive rate (TPR) and the false positive rate (FPR) across varying classification thresholds, and the area under it offers a thorough measure of classifier performance. It assigns equal importance to all instances, irrespective of the nature of the positive label. The ROC curve is constructed with FPR on the x-axis and TPR on the y-axis, illustrating the trade-off between sensitivity and specificity across different thresholds.
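The five metrics can be computed directly from their definitions, as the following Python sketch shows. It is illustrative only: the function names are hypothetical, and the AUROC is computed through its pairwise-ranking interpretation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half) rather than by integrating the ROC curve.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) for binary labels coded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """Precision, recall, F1 score, and accuracy as in Equations (1)-(4)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


def auroc(y_true, scores):
    """AUROC as the share of positive/negative pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise form is quadratic in the number of samples, which is acceptable for a small worked example; sorting-based formulations are preferable for large cohorts.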

3. Results

This study used data from the MIMIC-III database, which after preprocessing yielded 14,222 patients with type 2 diabetes and their ICU admission records. Table 2 lists the demographic characteristics of the study cohort. The mean age of the patients was 68.42 years, with a standard deviation of 15.23 years. The gender distribution shows that 56.22% were male. In terms of insurance types, most patients (64.77%) had purchased medical insurance, followed by those who had purchased private insurance (24.55%). The racial distribution shows that the majority of the population was white (66.02%), followed by Black/African (15.05%). Marital status varied, with 48.72% being married and 23.74% being single. Regarding ICU type, most patients were treated in the medical ICU (44.34%), followed by the cardiac surgery recovery unit (18.61%). These demographic details provide an overview of the study population and inform the subsequent analyses conducted in this study. Table 2 also lists the mean values of predictor variables for ICU patients with type 2 diabetes used in this study.

3.1. Prediction of the Mortality

In our study, we employed a 10-fold cross-validation method to construct predictive models for type 2 diabetes patients, forecasting mortality at 3 days, 30 days, and 365 days. The models, including the one for 365-day mortality, used data collected within the 24 h preceding ICU discharge. The dataset was partitioned, with 80% allocated for model training and 20% for testing, followed by comprehensive statistical analyses to evaluate performance.

Table 3 presents the AUROC scores for various classifiers in mortality prediction. Notably, for the 3-day mortality prediction, the bagging classifier achieved the highest AUROC score of 0.8112, surpassing other methods. In contrast, for the 30-day and 365-day mortality predictions, the AdaBoost classifier obtained the highest AUROC scores of 0.7952 and 0.7898, respectively. This highlights AdaBoost’s superior capability in distinguishing between patients who died and those who survived over these longer horizons. Figure 3 depicts the ROC curves of the classifiers, visually illustrating their trade-offs between true positive and false positive rates. These curves effectively demonstrate the performance of each classifier across different thresholds. In the figure, the red dashed line represents the baseline for a random classifier, corresponding to an AUROC of 0.5. Figure 4 compares the AUROC scores for different machine learning methods across 3 days, 30 days, and 365 days, providing a clear visualization of their relative effectiveness.

Table 4 provides a comprehensive summary of the performance of six machine learning models in predicting ICU patient mortality at 3, 30, and 365 days post-discharge, evaluated using precision, recall, F1 score, and accuracy. The results indicate that the GaussianNB model attained the highest accuracy for mortality prediction at all three time intervals considered in this study: 0.8832 at 3 days, 0.8742 at 30 days, and 0.8693 at 365 days. AdaBoost followed, demonstrating the second-highest accuracy with values of 0.8766, 0.8256, and 0.8161 at 3 days, 30 days, and 365 days, respectively.

3.2. Prediction of the Readmission

As with the mortality prediction, a 10-fold cross-validation approach was used to develop a prediction model for 30-day readmission in patients with type 2 diabetes. The model integrated data collected during the patient’s last 24 h before discharge and the corresponding treatment plan provided by the attending physician. The dataset was divided into training (80%) and testing (20%) subsets, and rigorous statistical analysis was performed to evaluate the performance of the model. Table 5 and Figure 5 present the AUROC scores and ROC curves for the various classifiers.

Table 5 presents the AUROC scores of different classifiers, indicating their respective performance in predicting readmission rates. Notably, the MLP classifier achieved the highest AUROC score of 0.8487, outperforming the other methods. This suggests its superior ability to discriminate between readmitted and non-readmitted patients. Figure 5 depicts the ROC curves for the classifiers, visually representing their trade-off between the true positive rate and false positive rate. The curves illustrate the performance of each classifier across different threshold values.

Our model, utilizing data collected within 24 h before hospital discharge, demonstrates promising discrimination, with an AUROC approaching 0.85, in predicting readmission for type 2 diabetes patients within 30 days post-discharge. In particular, the MLP classifier achieved the highest AUROC among the tested methods, reaffirming its effectiveness in this predictive task. These findings underscore the potential of employing machine learning techniques to enhance clinical decision-making and patient management strategies for ICU type 2 diabetes patients.

As shown in Table 6, AdaBoost achieved the highest precision (0.5306) among the evaluated models, meaning that about 53% of the patients it flagged were in fact readmitted. Its recall (0.8298) and F1 score (0.6473) further substantiate its effectiveness in correctly identifying positive cases while maintaining a balance between precision and recall. The high accuracy of AdaBoost (0.9249) underscores its robust performance in predicting patient readmissions after discharge. This indicates its potential to assist clinicians in making well-informed decisions to enhance patient outcomes in the ICU setting. Overall, the analysis of the classification performance metrics highlights the efficacy of AdaBoost in predicting readmission for type 2 diabetes patients post-discharge, thereby offering valuable insights for clinical decision-making and patient management strategies.
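The reported F1 score can be checked against the reported precision and recall using Equation (3). The short Python sketch below does this for the AdaBoost values quoted above; the helper name `f1_from` is hypothetical.

```python
def f1_from(precision, recall):
    """F1 score as the harmonic mean of precision and recall (Equation (3))."""
    return 2 * precision * recall / (precision + recall)


# AdaBoost readmission results as reported in the text.
f1 = f1_from(0.5306, 0.8298)
print(round(f1, 4))  # 0.6473, matching the reported F1 score
```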

4. Discussion

4.1. Mortality Principal Findings

In a study utilizing data from two multicenter databases [37], it was concluded that hospitalization with type 2 diabetes mellitus is associated with an increased risk of mortality following community-acquired pneumonia. Conversely, Donnelly et al. [38], in a large prospective population-based study, found that while type 2 diabetes was significantly associated with an increased risk of hospitalization for infection, it did not correlate with higher 28-day mortality [39]. Notably, neither study employed machine learning techniques to further predict mortality outcomes. This study aims to delve deeper into the importance of variable generation through machine learning methods, to gain insights into key factors affecting patient mortality, and to identify potential preventive measures in medical practice. To this end, we utilized a 10-fold cross-validation approach to develop a predictive model for type 2 diabetes patients, forecasting mortality at intervals of 3, 30, and 365 days. The model incorporates data collected during the first 24 h of ICU admission. A comprehensive evaluation of the model’s performance through various statistical analyses, including the AUROC values presented in Table 3, underscores the efficacy of the six machine learning methods employed. Our findings demonstrate the robustness of the model when based on quantitative data, with bagging and AdaBoost identified as particularly effective models for predicting mortality across different time intervals. This integration offers a holistic approach to mortality prediction, providing clinicians with valuable tools to make informed decisions and improve outcomes for ICU patients.

4.2. Readmission Principal Findings

The aggregation of advanced machine learning techniques and EHR data offers a promising avenue for enhancing healthcare delivery, particularly in critical care settings such as ICUs. Our study highlights the importance of leveraging artificial intelligence and machine learning methods to develop predictive models with which to assess post-discharge readmission risk, particularly for patients with type 2 diabetes. Despite the increasing importance of avoidable readmissions as a key metric for assessing the quality of hospital care, existing machine learning-based models have shown only limited performance, typically producing AUC scores of 0.619–0.686. Shang et al. applied three time series machine learning models, namely, random forest, naive Bayes, and tree ensemble, to predict 30-day readmission, reporting AUC values of 0.640, 0.619, and 0.634, respectively [40]. In another study, Sharma and Shah compared the classification performance of five machine learning algorithms: logistic regression, decision tree classifier, random forest, AdaBoost, and gradient boosting. The accuracy of these models ranged from 0.579 to 0.665, while the F1 scores were between 0.673 and 0.783 [41]. In comparison, our proposed method achieves an AUC score of 0.8487, which exceeds the values reported in these previous studies. These findings highlight the effectiveness of machine learning applied to quantitative data in predicting post-discharge readmission.
Our findings demonstrate that machine learning models, including AdaBoost, bagging, GaussianNB, logistic regression, MLP, and SVC, can successfully predict readmission in patients with type 2 diabetes. Notably, the MLP and AdaBoost models showed excellent performance, with an AUC score of 0.8487 and an accuracy score of 0.9249, respectively. These results highlight the potential of machine learning algorithms to identify high-risk patients who may benefit from additional monitoring or interventions to prevent readmissions.

4.3. Limitations

Our study has several notable limitations. The primary limitation is the use of patient records from the MIMIC-III database, which reflects specific patient demographics and healthcare contexts. Given that the data are predominantly from large urban areas, the generalizability of our findings to ICU patients in smaller healthcare facilities remains uncertain. To enhance the robustness of our conclusions, future studies should include data from a variety of settings, including rural or small healthcare facilities. Additionally, the lack of high-quality care records for a significant proportion of patients is a potential limitation, particularly when extrapolating the performance of this model to other critical care settings. Physician-recorded information from patient consultations is valuable for research on diseases and treatments, and incorporating written summaries from physicians could enhance the reliability of predictions, but such text also presents inherent challenges, such as common spelling errors and inaccuracies. Another significant limitation is the absence of external validation, which limits the reliability and trustworthiness of the proposed model. Future research should prioritize robust external validation procedures to establish the model’s credibility. Furthermore, our study specifically focused on predicting mortality and readmission rates in patients with type 2 diabetes in intensive care units. Future research should broaden this scope to encompass various diseases or conditions, enabling a more comprehensive exploration of predictive models for different patient groups. Investigating different patient groups may reveal distinct patterns and factors that influence outcomes, providing a richer understanding of the complexities involved.
This expanded scope could lead to the development of customized predictive models that adapt to the unique characteristics of specific medical conditions, thereby increasing the applicability and effectiveness of such models in clinical practice.

5. Conclusions

Predicting mortality and readmission rates for patients in the ICU is crucial for optimizing healthcare expenditures and improving patient outcomes. Identifying high-risk individuals enables healthcare providers to implement customized treatment strategies and proactive interventions such as personalized care plans and rigorous health monitoring. This approach alleviates pressure on healthcare infrastructure while ensuring timely and appropriate patient care. Utilizing predictive models for high-risk groups, such as the elderly, immunocompromised individuals, and patients with chronic conditions, enables clinicians to effectively reduce mortality and readmission risks. Consequently, this enhances the efficiency and efficacy of healthcare services, contributing to overall improvements in healthcare delivery. Given the ongoing global rise in the number of patients with type 2 diabetes and their significantly increased risk of complications, precise clinical predictions for this population have become increasingly important. Such predictions can significantly improve treatment outcomes for type 2 diabetes patients while also alleviating the burden on healthcare systems.

In this study, we used the MIMIC-III database to integrate quantitative data collected during patients’ ICU stays. Our objective was to employ machine learning to predict mortality and readmissions among ICU patients with type 2 diabetes. Ensemble techniques such as bagging and AdaBoost demonstrated significant potential in accurately predicting mortality and readmission rates in these patients. This approach can enhance the standard of care by enabling early interventions aimed at reducing both mortality and readmission rates. It may also help address the financial burden associated with these issues through proactive patient management and timely intervention. Using predictive analytics, hospitals can create customized discharge plans tailored to patients’ risk levels, incorporating more intensive follow-up and personalized rehabilitation to further decrease readmission rates. Machine learning models provide valuable decision-support tools, allowing clinicians to make more accurate assessments and develop more effective treatment strategies. By implementing real-time analysis, hospitals can establish early warning systems to identify high-risk patients promptly. This capability facilitates personalized medical interventions, optimizes resource allocation, and improves bed utilization while minimizing unnecessary readmissions. Overall, this approach has the potential to enhance patient care and to improve the efficiency of healthcare resource management. Moving forward, it is imperative to further refine and validate our predictive model in real-world clinical settings. Continued research efforts are essential to explore the broader implications of our findings for the healthcare system and patient outcomes.
By advancing the application of machine learning tools in healthcare, we can ultimately drive improvements in patient care delivery and contribute to the advancement of precision medicine.

Author Contributions

Conceptualization, T.-L.H., C.-M.C. and C.-C.W.; Data curation, C.-C.W., C.L. and T.-N.C.; Formal analysis, C.-C.W., C.L. and T.-N.C.; Methodology, T.-L.H., C.-M.C., C.-C.W., C.L. and T.-N.C.; Supervision, T.-L.H., C.-M.C., C.-C.W. and T.-N.C.; Writing—original draft, C.-C.W. and T.-N.C.; Writing—review and editing, C.-C.W., C.L. and T.-N.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Acknowledgments

The authors would like to sincerely thank the editor and reviewers for their kind comments.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Selected variables.

No.	Feature
1	Heart Rate
2	Respiratory Rate
3	DBP (diastolic blood pressure)
4	SBP (systolic blood pressure)
5	Temperature
6	HbA1c (glycosylated hemoglobin)
7	RBG (random blood glucose)
8	Creatinine
9	MBP (Non-Invasive Blood Pressure)
10	Glucose
11	WBC (White Blood Cell)
12	RBC (Red Blood Cell)
13	Sodium Level
14	Serum Bicarbonate Level
15	Age
16	ICU Length of Stay
17	Hospital Length of Stay
18	Gender
19	Marital Status
20	Ethnicity
21	Insurance Type
22	Last ICU Type

References

1. Zheng, Y.; Ley, S.H.; Hu, F.B. Global aetiology and epidemiology of type 2 diabetes mellitus and its complications. Nat. Rev. Endocrinol. 2018, 14, 88–98.
2. Biswas, T.; Behera, B.K.; Madhu, N.R. Technology in the management of type 1 and type 2 diabetes Mellitus: Recent status and future prospects. Adv. Diabetes Res. Manag. 2023, 111–136.
3. Daly, A.; Hovorka, R. Technology in the management of type 2 diabetes: Present status and future prospects. Diabetes Obes. Metab. 2021, 23, 1722–1732.
4. Ye, J.; Yao, L.; Shen, J.; Janarthanam, R.; Luo, Y. Predicting mortality in critically ill patients with diabetes using machine learning and clinical notes. BMC Med. Inform. Decis. Mak. 2020, 20, 295.
5. Rubin, D.J. Hospital readmission of patients with diabetes. Curr. Diabetes Rep. 2015, 15, 17.
6. Romanò, M. The Role of Palliative Care in the Cardiac Intensive Care Unit. Healthcare 2019, 7, 30.
7. Fuchs, L.; Chronaki, C.E.; Park, S.; Novack, V.; Baumfeld, Y.; Scott, D.; McLennan, S.; Talmor, D.; Celi, L. ICU admission characteristics and mortality rates among elderly and very elderly patients. Intensive Care Med. 2012, 38, 1654–1661.
8. Chew, B.H.; Ghazali, S.S.; Ismail, M.; Haniff, J.; Bujang, M.A. Age ≥ 60 years was an independent risk factor for diabetes-related complications despite good control of cardiovascular risk factors in patients with type 2 diabetes mellitus. Exp. Gerontol. 2013, 48, 485–491.
9. Davidson, S.; Villarroel, M.; Harford, M.; Finnegan, E.; Jorge, J.; Young, D.; Watkinson, P.; Tarassenko, L. Day-to-day progression of vital-sign circadian rhythms in the intensive care unit. Crit. Care 2021, 25, 156.
10. Desai, D.; Mehta, D.; Mathias, P.; Menon, G.; Schubart, U.K. Health Care Utilization and Burden of Diabetic Ketoacidosis in the U.S. Over the Past Decade: A Nationwide Analysis. Diabetes Care 2018, 41, 1631–1638.
11. Tavakolian, A.; Rezaee, A.; Hajati, F.; Uddin, S. Hospital Readmission and Length-of-Stay Prediction Using an Optimized Hybrid Deep Model. Future Internet 2023, 15, 304.
12. Muniyappa, R.; Gubbi, S. COVID-19 pandemic, coronaviruses, and diabetes mellitus. Am. J. Physiol.-Endocrinol. Metab. 2020, 318, E736–E741.
13. Nanayakkara, N.; Curtis, A.J.; Heritier, S.; Gadowski, A.M.; Pavkov, M.E.; Kenealy, T.; Owens, D.R.; Thomas, R.L.; Song, S.; Wong, J.; et al. Impact of age at type 2 diabetes mellitus diagnosis on mortality and vascular complications: Systematic review and meta-analyses. Diabetologia 2021, 64, 275–287.
14. Xu, G.; You, D.; Wong, L.; Duan, D.; Kong, F.; Zhang, X.; Zhao, J.; Xing, W.; Han, L.; Li, L. Risk of all-cause and CHD mortality in women versus men with type 2 diabetes: A systematic review and meta-analysis. Eur. J. Endocrinol. 2019, 180, 243–255.
15. Wang, B.; Fu, Y.; Tan, X.; Wang, N.; Qi, L.; Lu, Y. Assessing the impact of type 2 diabetes on mortality and life expectancy according to the number of risk factor targets achieved: An observational study. BMC Med. 2024, 22, 114.
16. Zhang, J.; Fu, R.; Xie, L.; Li, Q.; Zhou, W.; Wang, R.; Ye, J.; Wang, D.; Xue, N.; Lin, X.; et al. A smart device for label-free and real-time detection of gene point mutations based on the high dark phase contrast of vapor condensation. Lab Chip 2015, 15, 3891–3896.
17. Li, Q.; Fu, R.; Zhang, J.; Wang, R.; Ye, J.; Xue, N.; Lin, X.; Su, Y.; Gan, W.; Lu, Y.; et al. Label-Free Method Using a Weighted-Phase Algorithm To Quantitate Nanoscale Interactions between Molecules on DNA Microarrays. Anal. Chem. 2017, 89, 3501–3507.
18. Ye, J.; Zhang, R.; Bannon, J.E.; Wang, A.A.; Walunas, T.L.; Kho, A.N.; Soulakis, N.D. Identifying Practice Facilitation Delays and Barriers in Primary Care Quality Improvement. J. Am. Board. Fam. Med. 2020, 33, 655–664.
19. Sheng, B.; Chen, X.; Li, T.; Ma, T.; Yang, Y.; Bi, L.; Zhang, X. An overview of artificial intelligence in diabetic retinopathy and other ocular diseases. Front. Public Health 2022, 10, 971943.
20. Ellahham, S. Artificial Intelligence: The Future for Diabetes Care. Am. J. Med. 2020, 133, 895–900.
21. Knecht, L.A.; Gauthier, S.M.; Castro, J.C.; Schmidt, R.E.; Whitaker, M.D.; Zimmerman, R.S.; Mishark, K.J.; Cook, C.B. Diabetes care in the hospital: Is there clinical inertia? J. Hosp. Med. 2006, 1, 151–160.
22. Hansen, L.O.; Young, R.S.; Hinami, K.; Leung, A.; Williams, M.V. Interventions to reduce 30-day rehospitalization: A systematic review. Ann. Intern. Med. 2011, 155, 520–528.
23. Johnson, A.E.W.; Pollard, T.J.; Shen, L.; Lehman, L.W.H.; Feng, M.L.; Ghassemi, M.; Moody, B.; Szolovits, P.; Celi, L.A.; Mark, R.G. MIMIC-III, a freely accessible critical care database. Sci. Data 2016, 3, 160035.
24. Jiang, Z.; Bo, L.; Xu, Z.; Song, Y.; Wang, J.; Wen, P.; Wan, X.; Yang, T.; Deng, X.; Bian, J. An explainable machine learning algorithm for risk factor analysis of in-hospital mortality in sepsis survivors with ICU readmission. Comput. Methods Programs Biomed. 2021, 204, 106040.
25. Wei, X.; Min, Y.; Yu, J.; Wang, Q.; Wang, H.; Li, S.; Su, L. Admission Blood Glucose Is Associated With the 30-Days Mortality in Septic Patients: A Retrospective Cohort Study. Front. Med. 2021, 8, 757061.
26. Guo, C.; Lu, M.; Chen, J. An evaluation of time series summary statistics as features for clinical prediction tasks. BMC Med. Inform. Decis. Mak. 2020, 20, 48.
27. Huang, Y.; Zhong, Z.; Liu, F. The Association of Coagulation Indicators and Coagulant Agents With 30-Day Mortality of Critical Diabetics. Clin. Appl. Thromb. Hemost. 2021, 27, 10760296211026385.
28. Sun, C.; Chen, D.; Jin, X.; Xu, G.; Tang, C.; Guo, X.; Tang, Z.; Bao, Y.; Wang, F.; Shen, R. Association between acute kidney injury and prognoses of cardiac surgery patients: Analysis of the MIMIC-III database. Front. Surg. 2022, 9, 1044937.
29. Yang, W.; Zou, H.; Wang, M.; Zhang, Q.; Li, S.; Liang, H. Mortality prediction among ICU inpatients based on MIMIC-III database results from the conditional medical generative adversarial network. Heliyon 2023, 9, e13200.
30. Loreto, M.; Lisboa, T.; Moreira, V.P. Early prediction of ICU readmissions using classification algorithms. Comput. Biol. Med. 2020, 118, 103636.
31. Kim, D.H.; Choi, J.Y.; Ro, Y.M. Region based stellate features combined with variable selection using AdaBoost learning in mammographic computer-aided detection. Comput. Biol. Med. 2015, 63, 238–250.
32. Breiman, L. Bagging predictors. Mach. Learn. 1996, 24, 123–140.
33. Peng, L.; Peng, C.; Yang, F.; Wang, J.; Zuo, W.; Cheng, C.; Mao, Z.; Jin, Z.; Li, W. Machine learning approach for the prediction of 30-day mortality in patients with sepsis-associated encephalopathy. BMC Med. Res. Methodol. 2022, 22, 183.
34. Cox, D.R. The Regression Analysis of Binary Sequences. J. R. Stat. Soc. Ser. B Stat. Methodol. 1958, 20, 215–242.
35. Windeatt, T. Accuracy/Diversity and Ensemble MLP Classifier Design. IEEE Trans. Neural Netw. 2006, 17, 1194–1211.
36. Suykens, J.A.; Vandewalle, J. Least squares support vector machine classifiers. Neural Process. Lett. 1999, 9, 293–300.
37. Yende, S.; van der Poll, T.; Lee, M.; Huang, D.T.; Newman, A.B.; Kong, L.; Kellum, J.A.; Harris, T.B.; Bauer, D.; Satterfield, S.; et al. The influence of pre-existing diabetes mellitus on the host immune response and outcome of pneumonia: Analysis of two multicentre cohort studies. Thorax 2010, 65, 870–877.
38. Donnelly, J.P.; Nair, S.; Griffin, R.; Baddley, J.W.; Safford, M.M.; Wang, H.E.; Shapiro, N.I. Association of Diabetes and Insulin Therapy With Risk of Hospitalization for Infection and 28-Day Mortality Risk. Clin. Infect. Dis. 2017, 64, 435–442.
39. de Miguel-Yanes, J.M.; Méndez-Bailón, M.; Jiménez-García, R.; Hernández-Barrera, V.; Pérez-Farinós, N.; López-de-Andrés, A. Trends in sepsis incidence and outcomes among people with or without type 2 diabetes mellitus in Spain (2008–2012). Diabetes Res. Clin. Pract. 2015, 110, 266–275.
40. Shang, Y.; Jiang, K.; Wang, L.; Zhang, Z.; Zhou, S.; Liu, Y.; Dong, J.; Wu, H. The 30-days hospital readmission risk in diabetic patients: Predictive modeling with machine learning classifiers. BMC Med. Inform. Decis. Mak. 2021, 21, 57.
41. Sharma, T.; Shah, M. A comprehensive review of machine learning techniques on diabetes detection. Vis. Comput. Ind. Biomed. Art. 2021, 4, 30.

Figure 1. The detailed process of data extraction.

Figure 2. The process of data extraction.

Figure 3. The mortality ROC curves for the different classifiers.

Figure 4. Comparison of AUROC scores for different machine learning methods across 3 days, 30 days, and 365 days.

Figure 5. The readmission ROC curves for the different classifiers.

Table 1. Machine learning classifiers: descriptions and implementation details.

Method	Description	Implementation Details
AdaBoost	An adaptive ensemble method that focuses on misclassified samples from previous classifiers to train subsequent classifiers. Sensitive to noise and outliers [31].	Implemented using Python’s scikit-learn library with a maximum of 50 boosting iterations. Default values were used for the other parameters.
Bagging	A prevalent ensemble learning technique designed to enhance accuracy and stability while mitigating variance and overfitting by integrating multiple predictors [32].	Implemented using Python’s scikit-learn library, constructing an ensemble classifier comprising 500 DecisionTreeClassifiers. Each classifier was trained on a bootstrapped subset of 100 samples.
GaussianNB	A probabilistic classification technique that assumes each feature independently predicts the output variable. The final classification is determined by selecting the class with the highest calculated probability [33].	Implemented using the GaussianNB class from scikit-learn. Assumes that features follow a Gaussian distribution; calculates mean and variance for each class; predicts based on the highest calculated probability.
Logistic Regression	A statistical method for predicting binary outcomes, commonly used for binary classification in machine learning [34].	Implemented using the LogisticRegression class from scikit-learn. Includes L2 regularization by default; uses the logistic function for binary classification; optimizes model parameters with an iterative solver to minimize the regularized logistic loss.
MLP Classifier	A feed-forward artificial neural network that consists of multiple layers of interconnected neurons. Suitable for tasks involving classification and prediction with various feature sets [35].	Implemented using the MLPClassifier class from scikit-learn. Employs backpropagation and gradient descent for training; supports customizable architectures with multiple layers; uses activation functions like ReLU for hidden layers.
SVC	A powerful algorithm used for both classification and regression tasks, capable of handling linear and nonlinear data. Known for its robustness against overfitting and generalization capabilities [36].	Implemented using the SVC class from scikit-learn. Supports various kernel functions (e.g., linear, RBF); constructs a decision boundary that maximizes the margin between classes; uses a regularization parameter to balance margin maximization and error.
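
The configurations in Table 1 could be set up in scikit-learn roughly as follows. This is a minimal sketch, not the authors' code: the helper name `build_classifiers` is hypothetical, and the only values taken from Table 1 are 50 boosting iterations for AdaBoost and 500 decision trees with 100 bootstrapped samples each for bagging. Every other setting (for example `random_state` and `max_iter`) is an assumption or a library default.

```python
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def build_classifiers(seed=0):
    """Return the six classifiers named in Table 1 (hypothetical helper).

    Values from Table 1: n_estimators=50 (AdaBoost), n_estimators=500 and
    max_samples=100 (bagging). All remaining arguments are assumptions.
    """
    return {
        "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=seed),
        "Bagging": BaggingClassifier(
            DecisionTreeClassifier(),  # base estimator, passed positionally
            n_estimators=500,          # 500 trees in the ensemble
            max_samples=100,           # each tree sees 100 bootstrapped samples
            bootstrap=True,
            random_state=seed,
        ),
        "GaussianNB": GaussianNB(),
        # L2 regularization is the library default; max_iter is an assumption.
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "MLP Classifier": MLPClassifier(random_state=seed),
        "SVC": SVC(random_state=seed),
    }


models = build_classifiers()
```

Each entry exposes the usual `fit` and `predict` methods, so the same evaluation loop can be applied to all six models.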

Table 2. Selected patient demographic information.

Terms	Overall
General
Number	14,222 (100%)
Age	68.42 ± 15.23
Gender (male)	7997 (56.22%)
Insurance Type
Self-Pay	74 (0.52%)
Government	271 (1.9%)
Medicare	9211 (64.77%)
Medicaid	1175 (8.26%)
Private	3491 (24.55%)
Ethnicity
Asian	361 (2.54%)
Black/African	2140 (15.05%)
Hispanic/Latino	615 (4.32%)
White	9389 (66.02%)
Other	1717 (12.07%)
Marital Status
Married	6929 (48.72%)
Single	3377 (23.74%)
Divorced	985 (6.93%)
Separated	205 (1.44%)
Widowed	2203 (15.49%)
Unknown	523 (3.68%)
Last ICU type
Coronary Care Unit	2279 (16.02%)
Cardiac Surgery Recovery Unit	2647 (18.61%)
Medical ICU	6306 (44.34%)
Surgical ICU	2031 (14.28%)
Trauma Surgical ICU	959 (6.74%)
Outcomes
Hospital LOS (days) [Q1–Q3]	13.13 [8.13–18.17]
ICU LOS (days) [Q1–Q3]	6.74 [3.82–8.92]

Table 3. The mortality AUROC scores of various classifiers.

Terms	Methods	AUROC
3 Days	Adaboost	0.7997 ± 0.0067
	Bagging	0.8112 ± 0.0047
	GaussianNB	0.7275 ± 0.0340
	Logistic Regression	0.7746 ± 0.0143
	MLP Classifier	0.7993 ± 0.0342
	SVC	0.8051 ± 0.0146
30 Days	Adaboost	0.7952 ± 0.0114
	Bagging	0.7804 ± 0.0068
	GaussianNB	0.7078 ± 0.0124
	Logistic Regression	0.7698 ± 0.0100
	MLP Classifier	0.7619 ± 0.0029
	SVC	0.7704 ± 0.0048
365 Days	Adaboost	0.7898 ± 0.0070
	Bagging	0.7686 ± 0.0122
	GaussianNB	0.7070 ± 0.0082
	Logistic Regression	0.7672 ± 0.0108
	MLP Classifier	0.7524 ± 0.0079
	SVC	0.7633 ± 0.0067
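
As a reading aid for the AUROC values in Table 3, the sketch below shows one way such a figure can be computed and summarized. It assumes that each "mean ± value" entry is the mean and standard deviation of per-fold AUROC scores from the 10-fold cross-validation, which the table itself does not state. The function names and the toy numbers are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def auroc(y_true, scores):
    """AUROC as the probability that a randomly chosen positive case is
    scored above a randomly chosen negative case (ties count one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def summarize(fold_scores):
    """Mean and sample standard deviation of per-fold scores."""
    return mean(fold_scores), stdev(fold_scores)


# Hypothetical toy data, not taken from the study.
toy_auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
toy_mean, toy_sd = summarize([0.80, 0.82, 0.78])
```

The pairwise definition is quadratic in the number of cases, so it is only suitable for illustration; a library routine would be the practical choice on a cohort of this size.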

Table 4. The mortality diagnostic precision, recall, F1 score, and accuracy.

Terms	Method	Precision	Recall	F1 Score	Accuracy
3 Days	Adaboost	0.1206 ± 0.0104	0.6165 ± 0.0087	0.2015 ± 0.0152	0.8766 ± 0.0045
	Bagging	0.0998 ± 0.0041	0.8117 ± 0.0006	0.1778 ± 0.0065	0.8107 ± 0.0088
	GaussianNB	0.1129 ± 0.0162	0.5225 ± 0.0842	0.1849 ± 0.0236	0.8832 ± 0.0195
	Logistic Regression	0.0937 ± 0.0083	0.7336 ± 0.0157	0.1666 ± 0.0135	0.8135 ± 0.0139
	MLP Classifier	0.0845 ± 0.0041	0.5844 ± 0.1078	0.1467 ± 0.0024	0.8284 ± 0.0367
	SVC	0.0938 ± 0.0064	0.7607 ± 0.0232	0.1669 ± 0.0101	0.8082 ± 0.0114
30 Days	Adaboost	0.3093 ± 0.0148	0.8029 ± 0.0233	0.4464 ± 0.0182	0.8256 ± 0.0101
	Bagging	0.2505 ± 0.0031	0.8428 ± 0.0132	0.3862 ± 0.0051	0.7655 ± 0.0094
	GaussianNB	0.3487 ± 0.0114	0.5062 ± 0.0275	0.4129 ± 0.0166	0.8742 ± 0.0031
	Logistic Regression	0.2491 ± 0.0097	0.7691 ± 0.0258	0.3763 ± 0.0141	0.7777 ± 0.0032
	MLP Classifier	0.2563 ± 0.0145	0.7818 ± 0.0171	0.3856 ± 0.0146	0.7819 ± 0.0092
	SVC	0.2577 ± 0.0054	0.7749 ± 0.0184	0.3868 ± 0.0083	0.7854 ± 0.0066
365 Days	Adaboost	0.2922 ± 0.0092	0.7579 ± 0.0224	0.4213 ± 0.0063	0.8161 ± 0.0066
	Bagging	0.2636 ± 0.0097	0.8471 ± 0.0219	0.4018 ± 0.0105	0.7777 ± 0.0128
	GaussianNB	0.3446 ± 0.0417	0.5098 ± 0.0169	0.4093 ± 0.0289	0.8693 ± 0.0136
	Logistic Regression	0.2491 ± 0.0064	0.7547 ± 0.0276	0.3743 ± 0.0034	0.7773 ± 0.0037
	MLP Classifier	0.2333 ± 0.0342	0.7514 ± 0.0357	0.3533 ± 0.0353	0.7534 ± 0.0392
	SVC	0.2463 ± 0.0068	0.7485 ± 0.0155	0.3706 ± 0.0074	0.7755 ± 0.0005
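
The combination of low precision and high accuracy in Table 4, most visible at 3 days, is the pattern that arises when the predicted outcome is rare. The sketch below illustrates this with the standard definitions of the four metrics. The function name and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not the study's data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, F1 score, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts for a rare outcome (100 events in 10,000 cases).
example = classification_metrics(tp=60, fp=440, fn=40, tn=9460)
```

With these counts, precision is 0.12 and recall is 0.60, giving an F1 score of 0.20, while accuracy is 0.952 because the many true negatives dominate the total.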

Table 5. The readmission AUROC scores of various classifiers.

Method	AUROC
Adaboost	0.8416 ± 0.0120
Bagging	0.8429 ± 0.0085
GaussianNB	0.8231 ± 0.0068
Logistic Regression	0.8360 ± 0.0130
MLP Classifier	0.8487 ± 0.0060
SVC	0.8177 ± 0.0110

Table 6. The readmission classification model performance metrics.

Method	Precision	Recall	F1 Score	Accuracy
Adaboost	0.5306 ± 0.0162	0.8298 ± 0.0213	0.6473 ± 0.0087	0.9249 ± 0.0075
Bagging	0.5175 ± 0.0256	0.8404 ± 0.0348	0.6405 ± 0.0103	0.9216 ± 0.0088
GaussianNB	0.4660 ± 0.0218	0.7868 ± 0.0181	0.5848 ± 0.0159	0.9087 ± 0.0065
Logistic Regression	0.4514 ± 0.0096	0.7986 ± 0.0337	0.5763 ± 0.0049	0.9042 ± 0.0048
MLP Classifier	0.4552 ± 0.0385	0.8541 ± 0.0268	0.5921 ± 0.0260	0.9031 ± 0.0141
SVC	0.4801 ± 0.0124	0.7698 ± 0.0273	0.5909 ± 0.0091	0.9130 ± 0.0045

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
