Article

Machine Learning-Based Prediction Models of Acute Respiratory Failure in Patients with Acute Pesticide Poisoning

1 Department of Computer Software Engineering, Soonchunhyang University, Asan 31538, Republic of Korea
2 Department of Medical Informatics, College of Medicine, Korea University, Seoul 02841, Republic of Korea
3 Department of Internal Medicine, Soonchunhyang University Cheonan Hospital, Cheonan 31151, Republic of Korea
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Mathematics 2022, 10(24), 4633; https://doi.org/10.3390/math10244633
Submission received: 23 September 2022 / Revised: 23 November 2022 / Accepted: 4 December 2022 / Published: 7 December 2022
(This article belongs to the Special Issue Applied Statistical Modeling and Data Mining)

Abstract

The prognosis of patients with acute pesticide poisoning depends on their acute respiratory condition. Here, we propose machine learning models to predict acute respiratory failure in patients with acute pesticide poisoning using decision tree, logistic regression, random forest, support vector machine, adaptive boosting, gradient boosting, multi-layer perceptron, recurrent neural network, long short-term memory, and gated recurrent unit models. We collected the medical records of patients with acute pesticide poisoning at the Soonchunhyang University Cheonan Hospital from 1 January 2016 to 31 December 2020. We applied the k-Nearest Neighbor Imputer algorithm, the MissForest Imputer, and the average imputation method to handle missing values and outliers in the electronic medical records. In addition, we used the min–max scaling method for feature scaling. Using recent medical research, p-values, tree-based feature selection, and recursive feature elimination, we selected 17 out of 81 features. We applied a sliding window of 3 h to every patient's medical record within 24 h. As the prevalence of acute respiratory failure in our dataset was 8%, we employed oversampling. We assessed the performance of our models in predicting acute respiratory failure. The proposed long short-term memory model demonstrated a positive predictive value of 98.42%, a sensitivity of 97.91%, and an F1 score of 0.9816.

1. Introduction

Pesticide toxicosis is caused by the ingestion of or exposure to pesticides [1]. In the Republic of Korea, the death toll from toxicosis was 2702 people, and 1675 of these 2702 deaths (61.99%) were caused by the ingestion of pesticides [2]. In addition, 71% of patients with pesticide poisoning have been reported to die within 6–24 h [2]. The most common reason for the ingestion of pesticides is suicide [3]. Each year, 110,000 individuals die from pesticide poisoning, which accounts for 13.7% of all suicides [3]. In the Republic of Korea, some regions (Chungcheong-do, Gangwon-do, and Jeolla-do) have a higher death rate from pesticide poisoning than the capital area (Seoul, Incheon, and Gyeonggi-do) [2], as pesticides are particularly easy to obtain in these regions [2]. Neurological, respiratory, and cardiovascular symptoms have been reported in cases of pesticide toxicosis [1]. The prognosis of pesticide toxicosis depends on the extent of respiratory failure [1]. Respiratory failure is associated with a high in-hospital death rate, and current treatment options for respiratory failure can be ineffective [4]. Preventing the failure of multiple organs is crucial to reducing mortality from respiratory failure [5]. Therefore, predicting respiratory failure is important for patient prognosis.
Recent predictions of respiratory failure include predicting respiratory failure based on semi-supervised learning [4]; predicting respiratory failure with clinical data [5]; predicting respiratory failure in patients with coronavirus disease-2019 (COVID-19) [6]; predicting respiratory failure in the intensive care unit (ICU) [7,8]; predicting respiratory failure in pesticide intoxication [9]; and predicting respiratory failure with simple patient trajectories [10].
Machine learning algorithms such as semi-supervised recurrent neural networks (semi-RNNs), extreme gradient boosting, logistic regression (LR), random forest (RF), and long short-term memory (LSTM) have been used to predict respiratory failure in previous studies. For respiratory failure prediction based on semi-RNNs, the positive predictive value (PPV) was 3.3% and the sensitivity was 78.0% [4]. For respiratory failure prediction with clinical data, the sensitivity was 71% [5]. For respiratory failure prediction in patients with COVID-19, the PPV was 74% and the sensitivity was 78% [6]. For respiratory failure prediction in the ICU, the PPV was 42% and the sensitivity was 80% [7]. For respiratory failure prediction in pesticide intoxication, the PPV was 83.3% and the sensitivity was 60.6% [9]. For respiratory failure prediction with simple patient trajectories, the PPV was 22.6% and the sensitivity was 88.1% [10]. Overall, the performance of existing algorithms for predicting acute respiratory failure remains low.
Our goal is to predict the prognosis of patients with acute pesticide poisoning. However, predicting the prognosis is difficult because of the various causes of acute pesticide poisoning. We therefore predict acute respiratory failure, an important prognostic factor for patients with acute pesticide poisoning. We predict acute respiratory failure within 24 h using machine learning and 3 h of electronic medical records (EMRs) for each patient. We preprocess the EMRs as follows: (1) correcting human errors; (2) imputing missing values; (3) applying a sliding window; (4) selecting features; (5) scaling the data; and (6) handling class imbalance. Data preprocessing is important for improving performance [11,12,13,14]. Using each patient's existing data to fill in the gaps, we imputed missing values using the k-Nearest Neighbor (KNN) imputer algorithm from scikit-learn [15], the MissForest imputer, or average imputation. We built a sliding-window dataset based on 3 h of data. We performed feature selection based on current medical knowledge and p-values, and we applied oversampling. We performed data scaling using the MinMaxScaler provided by scikit-learn [15]. To predict acute respiratory failure in acute pesticide poisoning, we use shallow learning models such as the decision tree (DT), random forest (RF), logistic regression (LR), support vector machine (SVM), adaptive boosting (AB), and gradient boosting (GB), and deep learning models such as the multi-layer perceptron (MLP), recurrent neural network (RNN), long short-term memory (LSTM), and gated recurrent unit (GRU).

2. Materials

2.1. Data

This retrospective cohort study consisted of patients admitted to Soonchunhyang University Cheonan Hospital in the Republic of Korea between January 2016 and December 2020. All patients were over 19 years of age. Patients with pesticide poisoning who developed respiratory failure within 1 h of admission were excluded. A total of 707 patients were included. The pesticide categories included glyphosate, glufosinate, paraquat, organophosphate, pyrethroid, and carbamate. After replacing missing data, we performed sliding-window preprocessing, feature selection, and oversampling on the medical records.
When data preprocessing was completed, the dataset consisted of 11,526 samples with 17 features over 3 h. We split the data into a training dataset and a test dataset at a 7:3 ratio. The training dataset was then split again at a 7:3 ratio into a training set and a holdout fold, and that training set was split once more at a 7:3 ratio into a training set and a validation set. The machine learning models were constructed using the training set and evaluated using the holdout fold. The number of respiratory failure samples was only 909. Oversampling or undersampling can be used to address this imbalance. Given the limited number of respiratory failure cases, we used oversampling algorithms such as the synthetic minority oversampling technique (SMOTE), borderline-SMOTE, and adaptive synthetic sampling (ADASYN). Figure 1 shows the patient selection process. The training set contained 7291 samples, the validation set 1695, the holdout fold 2421, and the test dataset 3458.
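A minimal sketch of the nested 7:3 splits described above, using scikit-learn's train_test_split on synthetic placeholder arrays; the array shapes, stratification, and random seed are illustrative assumptions, not the study's actual data.

```python
# Illustrative nested 7:3 splits (placeholder data, not the study's records).
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(11526, 51))    # placeholder feature windows
y = rng.integers(0, 2, size=11526)  # placeholder respiratory-failure labels

# 7:3 train/test, then 7:3 train/holdout, then 7:3 train/validation (stratification assumed)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify=y, random_state=42)
X_train, X_hold, y_train, y_hold = train_test_split(X_train, y_train, test_size=0.3, stratify=y_train, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.3, stratify=y_train, random_state=42)
```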

2.2. Replacement of Missing Values

Medical records are not free from missing values: a measurement may be skipped, or there may be noise or human error. To improve machine learning performance, missing values need to be handled [11,12]. We handled missing values in the following three steps: (1) replace missing values with the recent data imputer (RDI); (2) apply the KNN imputation algorithm of scikit-learn to highly correlated features; and (3) replace the remaining features through average imputation.

2.2.1. Recent Data Imputer

Time-series data have values that evolve continuously over time. The RDI replaces each patient's missing values with that patient's most recent measurement. Figure 2 shows the RDI algorithm. The average, maximum, and minimum imputers do not preserve time-series characteristics, whereas the RDI does.
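A minimal sketch of the RDI idea using pandas: each patient's missing measurement is filled with that patient's most recent earlier value. The column names and toy records are hypothetical.

```python
# RDI-style imputation sketch: carry each patient's most recent value forward (toy data).
import pandas as pd

records = pd.DataFrame({
    "patient_id": [1, 1, 1, 2, 2, 2],
    "hour":       [1, 2, 3, 1, 2, 3],
    "lactate":    [2.1, None, None, 3.4, None, 3.0],
})
records = records.sort_values(["patient_id", "hour"])
records["lactate"] = records.groupby("patient_id")["lactate"].ffill()
print(records)  # patient 1's hour-2/3 gaps become 2.1; patient 2's hour-2 gap becomes 3.4
```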

2.2.2. k-Nearest Neighbor (KNN) Imputer

Missing values may remain after applying the RDI. The KNN imputer replaces missing values using distance functions over highly correlated features [13]. The KNN imputer can achieve better performance than other imputers, such as the average, maximum, and minimum imputers [13]. We performed KNN imputation twice, on two groups of highly related features: (1) total CO2, pH, HCO3 standard, base excess, and lactate; and (2) pCO2 and pO2.
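A minimal KNNImputer sketch over the two correlated feature groups named above; the toy values and the n_neighbors setting are assumptions.

```python
# KNN imputation applied separately to the two correlated feature groups (toy values).
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

df = pd.DataFrame({
    "total_CO2": [24.1, np.nan, 22.8, 23.5], "pH": [7.43, 7.38, np.nan, 7.41],
    "HCO3_standard": [23.0, 21.5, 22.0, np.nan], "base_excess": [-1.0, np.nan, -2.5, -0.5],
    "lactate": [2.1, 3.4, np.nan, 2.9],
    "pCO2": [38.0, np.nan, 41.0, 39.5], "pO2": [93.0, 88.0, np.nan, 95.0],
})
group1 = ["total_CO2", "pH", "HCO3_standard", "base_excess", "lactate"]
group2 = ["pCO2", "pO2"]
df[group1] = KNNImputer(n_neighbors=2).fit_transform(df[group1])
df[group2] = KNNImputer(n_neighbors=2).fit_transform(df[group2])
```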

2.2.3. MissForest Imputer

MissForest is an imputer based on RF [16]. Figure 3 shows the MissForest process. MissForest first replaces missing values with the median, then trains on the dataset and replaces the missing values with the model's predictions.
We imputed total CO2, pH, HCO3 standard, base excess, lactate, pCO2, and pO2 after applying the RDI algorithm.
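A MissForest-style sketch built from scikit-learn pieces (IterativeImputer with a random-forest estimator and a median start); this approximates the procedure described above and is not necessarily the exact package the authors used.

```python
# MissForest-style imputation: start from the median, then iteratively predict missing
# values with a random forest (scikit-learn approximation of MissForest).
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor

X = np.array([[24.1, 7.43, 2.1],
              [np.nan, 7.38, 3.4],
              [22.8, np.nan, 2.9],
              [23.5, 7.41, np.nan]])  # toy columns: total CO2, pH, lactate
imputer = IterativeImputer(estimator=RandomForestRegressor(n_estimators=100, random_state=0),
                           initial_strategy="median", max_iter=10, random_state=0)
X_imputed = imputer.fit_transform(X)
```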

2.3. Feature Selection

Feature selection is important for improving machine learning performance [14]. We performed feature selection as follows. (1) We calculated p-values to identify unrelated features. (2) We selected features based on current medical knowledge. (3) We analyzed feature importance using RF and GB. (4) Using recursive feature elimination, we identified high- and low-ranking features. (5) For the low-ranking features, we compared the performance obtained when each feature was excluded in turn.
First, we calculated the p-value between each feature and respiratory failure using ordinary least squares (OLS). To calculate the p-values, we used the OLS method provided by statsmodels [17]. We discarded features with a p-value above 0.05 as uncorrelated factors. Table 1 shows the p-value for each feature.
The features with p-values above 0.05 were as follows: smoking; alcohol; cardiovascular disease; SBP max; DBP max; RR max; Hb; glucose; BUN; creatinine; pCO2; HCO3 standard; BE; and troponin.
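A minimal statsmodels OLS sketch for the per-feature p-value screening described above; the synthetic arrays are placeholders.

```python
# p-value of a single candidate feature against the respiratory-failure label (toy data).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)    # respiratory failure (0/1), placeholder
feature = rng.normal(size=200)      # one candidate feature, placeholder
result = sm.OLS(y, sm.add_constant(feature)).fit()
print(result.pvalues[1])            # keep the feature only if the p-value is below 0.05
```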
Second, we perform feature selection based on current medical knowledge. The following features are determined based on current medical knowledge: pesticide category; pesticide dose; sex; age; GCS; SBP max; HR max; BT max; WBC; PLT; albumin; total CO2; CRP1; pH; pO2; O2 saturation; and lactate.
Third, we analyze tree-based feature selection methods. Table 2 shows the performance results of RF and GB. We confirm that sex is an unimportant feature.
Fourth, using recursive feature elimination, we analyzed high- and low-ranking features. We performed recursive feature elimination based on an SVM with a linear kernel, LR, RF, DT, and GB, using the implementation provided by scikit-learn. Table 3 shows the results of recursive feature elimination for each of these estimators. We confirmed that sex, albumin, and PLT are unimportant factors.
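A minimal recursive feature elimination sketch with scikit-learn's RFE and a linear-kernel SVM, one of the estimators listed above; the placeholder data and the n_features_to_select setting are assumptions.

```python
# Recursive feature elimination ranking with a linear-kernel SVM (placeholder data).
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 17))
y = rng.integers(0, 2, size=300)
rfe = RFE(estimator=SVC(kernel="linear"), n_features_to_select=1).fit(X, y)
print(rfe.ranking_)  # rank 1 = kept longest (e.g., pH); the largest ranks are eliminated first
```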
Fifth, we compared the performance obtained with each candidate feature set based on Table 2 and Table 3. Table 4 shows the performance of each feature set using RF, GB, and MLP; the feature set from the second step achieves the highest performance. We separated our dataset into training and test datasets. For RF and GB, we trained the algorithms with stratified k-fold cross-validation on the training folds after splitting the training dataset into training folds and a holdout fold. We trained the MLP on the training folds, with early stopping on a validation fold, after splitting the training dataset into training, validation, and holdout folds. We then compared the performance of each feature set for each algorithm using the holdout fold.
Evaluating each feature set through steps one to five, the feature sets from the first and second steps exhibited the highest performance. Therefore, we used the following features: pesticide category; pesticide dose; sex; age; GCS; SBP max; HR max; BT max; WBC; PLT; albumin; total CO2; CRP1; pH; pO2; O2 saturation; and lactate.

2.4. 3 h Sliding Window in 24 h

In this study, our objective was to predict respiratory failure within 24 h using 3 h of data. Figure 4 shows the 3 h sliding window applied to the time-series data within 24 h.
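A minimal sketch of the 3 h sliding window over one patient's 24 hourly records; the placeholder array and loop bounds are illustrative.

```python
# 3 h sliding window over 24 hourly records of a single patient (placeholder values).
import numpy as np

hourly = np.zeros((24, 17))   # 24 h x 17 features, placeholder
window = 3
samples = np.stack([hourly[t:t + window] for t in range(24 - window + 1)])
print(samples.shape)          # (22, 3, 17): each sample covers 3 consecutive hours
# Each sample is labeled with whether respiratory failure occurred within the 24 h period.
```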

2.5. MinMaxScaler

In machine learning, each feature has a different unit, so the results can be biased toward features with large ranges. To address this, it is necessary to map all features to the same range. In this study, we used the MinMaxScaler provided by scikit-learn [15], which maps each value to between 0 and 1 through Equation (1).
X_{scaled} = \frac{X - \min}{\max - \min}   (1)
The X variable represents a feature in the dataset. The min variable represents the minimum value of each feature. The max variable represents the maximum value of each feature.
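The corresponding scikit-learn call, shown with toy values; fitting the scaler on the training data only and reusing it on the test data is the standard usage and is assumed here.

```python
# Min-max scaling to [0, 1], Equation (1), with scikit-learn (toy values).
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X_train = np.array([[90.0, 7.25], [120.0, 7.40], [150.0, 7.50]])
X_test = np.array([[110.0, 7.35]])

scaler = MinMaxScaler().fit(X_train)      # learn per-feature min and max from the training data
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)  # reuse the training min/max on the test data
```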

2.6. Oversampling

In this study, the prevalence of respiratory failure was 8%, indicating an imbalance. There are two methods of handling imbalance: (1) oversampling and (2) undersampling. In this study, the number of respiratory failure cases was limited, and oversampling tends to be better than undersampling for improving performance [18]. Therefore, we performed oversampling. We applied SMOTE, borderline-SMOTE, and ADASYN to address the data imbalance [19].

2.6.1. Synthetic Minority Oversampling Technique (SMOTE)

The SMOTE algorithm was proposed by Chawla et al. [20]. The SMOTE algorithm has three steps: (1) apply the KNN algorithm within the minority class after randomly selecting a minority sample [20]; (2) randomly choose one of its nearest minority neighbors [20]; and (3) generate a synthetic sample between the samples selected in steps one and two. Figure 5 shows the processing of the SMOTE algorithm.

2.6.2. Borderline-SMOTE

The borderline-SMOTE algorithm is based on the original SMOTE [21]. Notably, some minority samples may lie far from the boundary between the minority and majority data [21], so borderline-SMOTE is applied only at that boundary. After the KNN algorithm is applied to each minority sample, the borderline-SMOTE algorithm considers samples whose neighbors are more than half majority data to be borderline. Borderline-SMOTE then applies the SMOTE algorithm to the minority samples at the borderline. Figure 6 shows data processing by borderline-SMOTE, which generates minority samples at the borderline and improves classifier efficiency [19,21].

2.6.3. Adaptive Synthetic (ADASYN)

The ADASYN algorithm applies the KNN algorithm to the minority samples and generates more minority samples where the surrounding majority data are abundant [19]. The ADASYN algorithm consists of four steps [19]: (1) the KNN algorithm calculates, for each minority sample, the ratio of majority data among its neighbors [19]; (2) each ratio is normalized by the sum of the ratios [19]; (3) the number of samples to generate ("repeat") is calculated through Equation (2) [19]; and (4) that many minority samples are generated around each minority sample [19]. Figure 7 shows data processing by the ADASYN algorithm.
\text{repeat} = \text{second-step result} \times (\text{number of majority} - \text{number of minority})   (2)
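A minimal imbalanced-learn sketch of the three oversamplers; applying them to the training fold only is assumed, and the synthetic dataset mimics the roughly 8% positive rate.

```python
# SMOTE, Borderline-SMOTE, and ADASYN oversampling with imbalanced-learn (synthetic data).
import numpy as np
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE, BorderlineSMOTE, ADASYN

X, y = make_classification(n_samples=1000, n_features=20, weights=[0.92, 0.08], random_state=0)
for sampler in (SMOTE(random_state=0), BorderlineSMOTE(random_state=0), ADASYN(random_state=0)):
    X_res, y_res = sampler.fit_resample(X, y)   # applied to the training fold only
    print(type(sampler).__name__, np.bincount(y_res))
```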

3. Methods

3.1. Shallow Learning

A time-series dataset has three dimensions: the number of samples, the number of time steps, and the number of features, e.g., (11,148, 3, 17). However, the machine learning algorithms provided by scikit-learn [15] accept only two dimensions. We used both time-series and non-time-series features. The time-series features were the maximum systolic blood pressure (SBP) in 1 h, maximum heart rate (HR) in 1 h, maximum body temperature (BT) in 1 h, white blood cell count (WBC), platelet count (PLT), albumin, total CO2, C-reactive protein 1 (CRP1), pH, pO2, O2 saturation, and lactate. The non-time-series features were pesticide category, pesticide dose, sex, age, and Glasgow Coma Scale (GCS). For these algorithms, we expressed the dataset as (number of samples, number of features), e.g., (11,148, 41).
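A sketch of the flattening step: the 12 time-series features over 3 h are unrolled and the 5 non-time-series features appended, giving 12 × 3 + 5 = 41 columns. The placeholder arrays and the concatenation order are assumptions.

```python
# Flatten (samples, 3, 12) time-series features and append the 5 static features -> (samples, 41).
import numpy as np

rng = np.random.default_rng(0)
ts = rng.normal(size=(11148, 3, 12))   # placeholder: SBP max, HR max, ..., lactate over 3 h
static = rng.normal(size=(11148, 5))   # placeholder: pesticide category, dose, sex, age, GCS
X_2d = np.concatenate([ts.reshape(len(ts), -1), static], axis=1)
print(X_2d.shape)                      # (11148, 41)
```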

3.1.1. Logistic Regression (LR)

LR classification is based on the sigmoid function. LR calculates the weight and bias in the training dataset [22]. In this study, LR was used to calculate z with 41 features through Equation (3) [22].
z = \sum_{i=1}^{41} w_i a_i + b   (3)
LR was used to calculate the sigmoid function with z through Equation (4) [22].
f(z) = \frac{1}{1 + e^{-z}}   (4)
LR classification was performed by f(z) through Equation (5) [22]. If f(z) is greater than 0.5, it is classified as respiratory failure; otherwise, it indicates non-respiratory failure. We utilized the LR algorithm provided by scikit-learn [15].
\text{predict} = \begin{cases} 0 & \text{if } f(z) \le 0.5 \\ 1 & \text{if } f(z) > 0.5 \end{cases}   (5)
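A minimal scikit-learn logistic-regression sketch corresponding to Equations (3)–(5); the placeholder arrays are not the study data.

```python
# Logistic regression on the flattened 41-feature windows (placeholder data).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 41))
y = rng.integers(0, 2, size=400)

clf = LogisticRegression(max_iter=1000).fit(X, y)   # learns the weights w_i and bias b
prob = clf.predict_proba(X)[:, 1]                   # the sigmoid output f(z), Equation (4)
pred = (prob > 0.5).astype(int)                     # the 0.5 threshold of Equation (5)
```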

3.1.2. Decision Tree (DT)

DT utilizes a binary tree structure, which can classify highly related features of respiratory failure [3]. DT calculates impurity and branches until the leaf node impurity is 0. There are two methods of calculating impurity: (1) the Gini coefficient and (2) the entropy coefficient. In this study, we calculated impurity using the Gini coefficient through Equation (6) [15].
\text{Gini} = \sum_{i=1}^{n} R_i \left( 1 - \sum_{k=1}^{m} P_k^2 \right)   (6)
The n variable represents the number of nodes, and the m variable represents the number of outcomes; in this study, m is 2. The Ri variable represents the sample ratio of each branch, and the Pk variable represents the class ratio. DT uses pruning to prevent overfitting. Figure 8 shows the DT model. The 1 h, 2 h, and 3 h prefixes indicate the medical record 1, 2, and 3 h after the start of measurement, respectively. We used the DT model provided by scikit-learn [15].

3.1.3. Random Forest (RF)

RF is an ensemble model that uses many DTs and bootstrap aggregation [23]. Figure 9 shows the processing of RF. After each DT in the RF predicts respiratory failure, the RF classifier votes on the predictions. In this study, we used a max_depth hyperparameter of 6 and an n_estimators hyperparameter of 100 in the RandomForestClassifier provided by scikit-learn [15].
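A random-forest sketch with the hyperparameters stated above (100 trees, maximum depth 6); the placeholder data are illustrative.

```python
# Random forest with n_estimators=100 and max_depth=6, as stated above (placeholder data).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 41))
y = rng.integers(0, 2, size=400)

rf = RandomForestClassifier(n_estimators=100, max_depth=6, random_state=0).fit(X, y)
y_pred = rf.predict(X)   # majority vote over the bootstrap-aggregated trees
```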

3.1.4. Support Vector Machine (SVM)

SVM classifies samples based on support vectors rather than a simple decision boundary [24]; overfitting can occur when samples of the other class lie near the decision boundary. The support vectors define the boundary that minimizes the classification error. Figure 10 shows classification based on the support vectors of the SVM.

3.1.5. Adaptive Boost (AB)

AB uses weak learners consisting of a single split with two leaf nodes, called stumps [25]. AB reduces the error by passing the weights of the incorrectly predicted samples from each stump to the next estimator [25]. Figure 11 shows the AB weight calculation process.

3.1.6. Gradient Boost (GB)

GB corrects incorrect predictions by calculating the residual error, the difference between the measured value and the predicted value. GB fits trees to the residual errors between the actual respiratory failure labels and the predictions.
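A short boosting sketch with the tuned settings reported in Section 4.4 (AdaBoost: 150 estimators, learning rate 1.0; gradient boosting: 120 estimators, maximum depth 4); the placeholder data are illustrative.

```python
# AdaBoost and gradient boosting with the tuned settings from Section 4.4 (placeholder data).
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 41))
y = rng.integers(0, 2, size=400)

ab = AdaBoostClassifier(n_estimators=150, learning_rate=1.0, random_state=0).fit(X, y)
gb = GradientBoostingClassifier(n_estimators=120, max_depth=4, random_state=0).fit(X, y)
```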

3.2. Deep Learning Algorithms

We organize datasets in three dimensions because RNN-based models support time-series data. We utilize TensorFlow to perform deep learning [26].

3.2.1. Multi-Layer Perceptron (MLP)

The perceptron calculates weights over many features, such as EMRs, for prediction. However, a single perceptron cannot handle non-linear datasets. To solve the non-linear problem, perceptrons are organized into multiple layers, which is referred to as an MLP. The MLP does not accept three-dimensional input, so we organized the dataset in two dimensions. We implemented the MLP using the dense layers provided by TensorFlow [26]. Figure 12 shows the structure of the MLP.
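A minimal Keras sketch of an MLP with the tuned 256-unit hidden layer and 20% dropout from Section 4.4; the exact layer arrangement of the authors' model is not given, so this structure is an assumption.

```python
# MLP sketch in TensorFlow/Keras (256 units, 20% dropout; layer arrangement assumed).
import tensorflow as tf

mlp = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(41,)),              # flattened 3 h x feature input
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # probability of respiratory failure
])
mlp.compile(optimizer="adam", loss="binary_crossentropy",
            metrics=[tf.keras.metrics.Precision(), tf.keras.metrics.Recall()])
```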

3.2.2. Recurrent Neural Network (RNN)

Without considering the order of a time series, a machine learning model treats the time-series values simply as additional independent features [27]. The RNN model was proposed to take the order of the time series into account. The vanilla RNN calculates the current hidden state using the weights from the previous EMRs and the current EMRs. We implemented the RNN using the SimpleRNN layer provided by TensorFlow. Figure 13 shows the structure of the RNN.

3.2.3. Long Short-Term Memory (LSTM)

The vanilla RNN effectively considers only the previous and current EMRs [28]. The LSTM introduces a forget gate, an input gate, and an output gate to solve this short-term dependency problem [28]. The input gate calculates the information to add to long-term memory using the previous short-term memory and the current EMRs. The forget gate calculates the information to remove from long-term memory using the previous short-term memory and the current EMRs. The long-term memory is updated using the results of the forget and input gates. The output gate calculates the new short-term memory using the previous short-term memory, the current EMRs, and the current long-term memory. We used the LSTM layer provided by TensorFlow. Figure 14 shows the structure of the LSTM.
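A minimal Keras sketch of the LSTM model with the tuned 128 units and 40% dropout from Section 4.4; the dropout placement and output layer are assumptions, and a SimpleRNN or GRU layer can be swapped in for the other recurrent models.

```python
# LSTM sketch in TensorFlow/Keras (128 units, 40% dropout; structure assumed).
import tensorflow as tf

lstm = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(3, 17)),            # 3 hourly records x 17 features
    tf.keras.layers.LSTM(128),                       # replace with GRU(128) or SimpleRNN(128)
    tf.keras.layers.Dropout(0.4),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
lstm.compile(optimizer="adam", loss="binary_crossentropy",
             metrics=[tf.keras.metrics.Precision(), tf.keras.metrics.Recall()])
```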

3.2.4. Gated Recurrent Unit (GRU)

To solve long-term dependency in vanilla RNN, the GRU organizes update gates and reset gates [29]. The weight for the prediction of acute respiratory failure is calculated through previous memory and current EMRs. The update gates determine whether to use the previous memory or the current weight. The reset gates calculate the removal information for memory. We utilize the GRU layer provided by TensorFlow. Figure 15 shows the structure of GRU.

3.3. Stratified-k-Fold

Cross-validation involves the splitting of the training dataset into a training fold and a test fold [30]. The training fold is used for learning in the training step [30]. The test fold is used for validation after the machine learning training step [30]. This technique can be used to solve overfitting [31]. The stratified k-fold is a cross-validation method that separates the data into k-training folds and test folds based on the class ratio (Figure 16).
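A minimal stratified k-fold sketch (k = 4, as in Figure 16) showing that the class ratio is preserved in every fold; the placeholder data are illustrative.

```python
# Stratified k-fold (k = 4) preserving the ~8% positive rate in every fold (placeholder data).
import numpy as np
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 41))
y = (rng.random(400) < 0.08).astype(int)

skf = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)
for train_idx, test_idx in skf.split(X, y):
    print(f"train ratio {y[train_idx].mean():.3f}  test ratio {y[test_idx].mean():.3f}")
```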

4. Results

4.1. Evaluation Methods

In machine learning, binary classification is mainly evaluated by a confusion matrix. Table 5 shows the confusion matrix. In this study, the patient dataset was imbalanced. Therefore, the PPV and sensitivity are more important than accuracy. The PPV indicates the rate of actual respiratory failure patients among patients with predicted respiratory failure, which can be calculated through Equation (7).
\text{PPV} = \frac{TP}{TP + FP}   (7)
The sensitivity indicates the rate of patients with predicted respiratory failure among actual patients with respiratory failure, which can be calculated through Equation (8).
\text{Sensitivity} = \frac{TP}{TP + FN}   (8)
The F1 score considers both the PPV and sensitivity, which can be calculated through Equation (9).
\text{F1 score} = \frac{2 \times \text{PPV} \times \text{Sensitivity}}{\text{PPV} + \text{Sensitivity}}   (9)
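A short sketch computing PPV, sensitivity, and the F1 score from a confusion matrix, matching Equations (7)–(9); the toy label vectors are illustrative.

```python
# PPV, sensitivity, and F1 score from the confusion matrix, Equations (7)-(9) (toy labels).
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

ppv = tp / (tp + fp)
sensitivity = tp / (tp + fn)
f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
print(ppv, sensitivity, f1)
```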

4.2. Characteristics of the Study

We used 17 features to predict respiratory failure. Table 6 shows the mean and standard deviation of each feature in this study. The pesticide category is omitted from Table 6 because it is a categorical feature with several categories. We constructed time-series samples comprising 3 h of measurements. To avoid overfitting, we applied oversampling only to the training dataset.

4.3. Evaluation of Imputation

We compared the performance of the KNN and MissForest imputers after interpolating with the RDI. We separated our dataset into training and test datasets. For RF and GB, we trained the algorithms with stratified k-fold cross-validation on the training folds after splitting the training dataset into training folds and a holdout fold. We trained the MLP on the training folds, with early stopping on a validation fold, after splitting the training dataset into training, validation, and holdout folds. We then compared the performance of each imputer for each algorithm using the holdout fold. The results of KNN and MissForest imputation based on RF, GB, and MLP are shown in Table 7.
We confirm that the KNN imputer outperforms the MissForest imputer. We therefore applied the KNN imputer after interpolating with the RDI.

4.4. Evaluation of Hyperparameter Tuning

To improve the performance of each algorithm, we performed hyperparameter tuning. We separated our dataset into training and test datasets. For RF and GB, we trained the algorithms with stratified k-fold cross-validation on the training folds after splitting the training dataset into training folds and a holdout fold. We trained the deep learning models on the training folds, with early stopping on a validation fold, after splitting the training dataset into training, validation, and holdout folds. We then compared the performance of each hyperparameter setting for each algorithm using the holdout fold. For DT and RF, we tuned the information gain method and the maximum depth. For SVM, we tuned the squared L2 penalty parameter and the kernel (radial basis function (RBF), linear, and polynomial). For AB, we tuned the number of estimators and the learning rate. For GB, we tuned the number of estimators and the maximum depth. For MLP, RNN, LSTM, and GRU, we tuned the unit size and the dropout rate. The performance of each algorithm under each hyperparameter setting is shown in Table 8.
DT had the highest F1 score when the information gain method and max depth were set to entropy and 9, respectively. RF had the highest F1 score when the information gain method and max depth were set to entropy and 9, respectively. SVM had the highest F1 score when the squared L2 penalty parameter and kernel were set to 6.5 and RBF, respectively. AB had the highest F1 score when the number of estimators and learning rate were set to 150 and 1.0, respectively. GB had the highest F1 score when the number of estimators and max depth were set to 120 and 4, respectively. MLP had the highest F1 score when the unit size and dropout rate were set to 256 and 20%, respectively. RNN had the highest F1 score when the unit size and dropout rate were set to 128 and 30%, respectively. LSTM had the highest F1 score when the unit size and dropout rate were set to 128 and 40%, respectively. GRU had the highest F1 score when the unit size and dropout rate were set to 128 and 40%, respectively.

4.5. Evaluation of Oversampling Algorithms

We separated our dataset into training and test datasets. For the shallow machine learning models, we trained the algorithms with stratified k-fold cross-validation on the training folds after splitting the training dataset into training folds and a holdout fold. For the deep learning models, we trained on the training folds with early stopping on a validation fold after splitting the training dataset into training, validation, and holdout folds. We then compared the performance of each oversampling algorithm for each model using the holdout fold. Table 9 shows the performance of each machine learning algorithm with each oversampling algorithm; the best performance was obtained by GRU with ADASYN.

4.6. Evaluation of Machine Learning Algorithms

We separate our dataset into train and test datasets. In the case of shallow machine learning, we train algorithms using stratified k-folds by train folds. We train deep learning using train folds and early stop using validation folds after separating train datasets into train and validation folds. In addition, we compare the performance of each algorithm using test datasets. Table 10 shows the highest performance achieved for each machine learning algorithm. In the case of PPV, GRU shows the highest performance. In the case of sensitivity, GB shows the highest performance. In the case of the F1 score, LSTM shows the highest performance.

4.7. Comparison of the Importance of Disease Prediction

We consider the most important techniques for this prediction task to be imputation, oversampling, and feature selection, in that order. We evaluated the RDI-based imputer against a mean-based imputer: with the RDI-based imputer, the PPV and sensitivity improved by more than 20% and 10%, respectively. Datasets in the medical field are imbalanced; diseases such as respiratory failure are labeled according to their occurrence, so positive cases are rare. When a model is trained on data dominated by the majority class, its predictions become biased toward that class. Oversampling or undersampling must therefore be performed on imbalanced datasets. Table 11 shows the performance when missing values are replaced with the mean and when oversampling is not performed. Without oversampling, the sensitivity is lower than when oversampling is performed.
Table 4 shows how performance varies according to the features; we selected the feature set with the highest performance in Table 4. The performance of each machine learning algorithm depends on the dataset. The sensitivity of the machine learning algorithms in Table 10 is more than 80%; however, the PPV of logistic regression is less than 50%. The deep learning models, RNN, LSTM, and GRU, all performed at comparable levels.

5. Discussion

We proposed the prediction model for acute respiratory failure, an important prognostic factor in acute pesticide poisoning patients. The effects of respiratory failure include loss of consciousness, arrhythmias, and death. Table 12 shows the performance of each algorithm for the prediction of respiratory failure. In recent years, respiratory failure prediction models have been developed to predict respiratory failure in COVID-19 patients based on deep learning with semi-supervised learning [4], respiratory failure based on XGBoost using clinical data [5], respiratory failure in COVID-19 patients based on LR [6], respiratory failure in ICU patients based on LightGBM [7], respiratory failure in patients with pesticide poisoning due to intentional pesticide ingestion based on LR [9], cardiac arrest and respiratory failure in ICU patients based on LSTM [10], and respiratory failure in ICU patients based on gradient boosting [8]. In the case of prediction based on semi-supervised learning [4], the PPV was 0.033 and the sensitivity was 0.78. In the case of prediction based on XGBoost using clinical data [5], the sensitivity was 0.71. In the case of prediction of respiratory failure with COVID-19 based on LR [6], the PPV was 0.74 and the sensitivity was 0.72. In the case of prediction based on LightGBM [7], the PPV was 0.42 and the sensitivity was 0.80. In the case of prediction of respiratory failure in patients with pesticide poisoning based on LR [9], the PPV was 0.833 and the sensitivity was 0.606. In the case of prediction based on LSTM [10], the PPV was 0.226 and the sensitivity was 0.881. However, these respiratory failure prediction algorithms are characterized by a large measurement interval, low performance, or large number of features [4,5,6,7,8,9,10]. Our proposed algorithm demonstrated improved respiratory failure prediction within 24 h with higher PPV and sensitivity compared with those of other models.
We infer that the pesticide category, white blood cell count (WBC), pH, heart rate (HR), and C-reactive protein 1 are important predictors of respiratory failure in acute pesticide poisoning; the feature importance results based on RF and GB confirmed that these are the highest-scoring features. This study has several limitations. First, it was a single-center cohort study conducted at the Soonchunhyang University Cheonan Hospital. Second, it was a retrospective study, and we have not yet confirmed its validity on an external dataset. Third, our algorithm requires 3 h of EMRs, so it is not applicable to high-risk patients hospitalized for less than three hours. Fourth, our algorithm is difficult to use directly in hospitals: it predicts whether respiratory failure will occur within 24 h, but it does not estimate when a patient will experience respiratory failure or provide a risk score. Follow-up research is required to narrow the prediction window, score each patient's risk, or provide information such as the estimated time of respiratory failure.

6. Conclusions

We predicted respiratory failure in patients with pesticide poisoning at Soonchunhyang University Cheonan Hospital. We analyzed 3 h of medical records from individuals with pesticide poisoning to predict respiratory failure within 24 h. Taking the time-series properties into account, we replaced missing values and applied sliding windows, feature selection, and oversampling. The best performance was achieved with the LSTM. Our machine learning algorithm could help improve the prognosis of patients with pesticide poisoning. In future work, we will enhance the algorithm so that it can predict respiratory failure within 4 or 8 h instead of 24 h, and we plan to conduct studies to predict respiratory failure in patients admitted to the general ward and the ICU.

Author Contributions

Conceptualization, H.L. and H.G.; methodology, Y.K. and M.C.; software, Y.K. and M.C.; validation, N.C., H.G. and H.L.; formal analysis, Y.K.; resources, H.G. and N.C.; data curation, N.C.; writing—original draft preparation, Y.K. and M.C.; writing—review and editing, H.L. and H.G.; visualization, Y.K. and M.C.; supervision, H.L.; project administration, H.L. and H.G.; funding acquisition, H.L. and H.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ICAN (ICT Challenge and Advanced Network of HRD) program (IITP-2022-RS-2022-00156439) supervised by the IITP (Institute of Information and Communications Technology Planning and Evaluation), the Bio and Medical Technology Development Program (No. NRF-2019M3E5D1A02069073) and a Korea University Grant.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board of Soonchunhyang University Cheonan Hospital (IRB number: 2020-02-016).

Informed Consent Statement

Patient consent was waived because of the retrospective design of the study.

Data Availability Statement

Data sharing not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Cho, N.-J.; Park, S.; Lee, E.Y.; Gil, H.-W. Risk factors to predict acute respiratory failure in patients with acute pesticide poisoning. J. Korean Soc. Clin. Toxicol. 2020, 18, 116–122. [Google Scholar]
  2. Lee, H.; Choa, M.; Han, E.; Ko, D.R.; Ko, J.; Kong, T.; Cho, J.; Chung, S.P. Causative Substance and Time of Mortality Presented to Emergency Department Following Acute Poisoning: 2014-2018 National Emergency Department Information System (NEDIS). J. Korean Soc. Clin. Toxicol. 2021, 19, 65–71. [Google Scholar]
  3. Mew, E.J.; Padmanathan, P.; Konradsen, F.; Eddleston, M.; Chang, S.-S.; Phillips, M.R.; Gunnell, D. The global burden of fatal self-poisoning with pesticides 2006-15: Systematic review. J. Affect. Disord. 2017, 219, 93–104. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  4. Lam, C.; Tso, C.F.; Green-Saxena, A.; Pellegrini, E.; Iqbal, Z.; Evans, D.; Hoffman, J.; Calvert, J.; Mao, Q.; Das, R. Semisupervised Deep Learning Techniques for Predicting Acute Respiratory Distress Syndrome from Time-Series Clinical Data: Model Development and Validation Study. JMIR Form. Res. 2021, 5, e28028. [Google Scholar] [CrossRef]
  5. Sinha, P.; Churpek, M.M.; Calfee, C.S. Machine learning classifier models can identify acute respiratory distress syndrome phenotypes using readily available clinical data. Am. J. Respir. Crit. Care Med. 2020, 202, 996–1004. [Google Scholar] [CrossRef]
  6. Bartoletti, M.; Giannella, M.; Scudeller, L.; Tedeschi, S.; Rinaldi, M.; Bussini, L.; Fornaro, G.; Pascale, R.; Pancaldi, L.; Pasquini, Z. Development and validation of a prediction model for severe respiratory failure in hospitalized patients with SARS-CoV-2 infection: A multicentre cohort study (PREDI-CO study). Clin. Microbiol. Infect. 2020, 26, 1545–1553. [Google Scholar] [CrossRef]
  7. Hüser, M.; Faltys, M.; Lyu, X.; Barber, C.; Hyland, S.L.; Merz, T.M.; Rätsch, G. Early prediction of respiratory failure in the intensive care unit. arXiv 2021, arXiv:2105.05728. [Google Scholar]
  8. Schwager, E.; Jansson, K.; Rahman, A.; Schiffer, S.; Chang, Y.; Boverman, G.; Gross, B.; Xu-Wilson, M.; Boehme, P.; Truebel, H. Utilizing machine learning to improve clinical trial design for acute respiratory distress syndrome. NPJ Digit. Med. 2021, 4, 133. [Google Scholar] [CrossRef]
  9. Cho, N.-J.; Park, S.; Lyu, J.; Lee, H.; Hong, M.; Lee, E.-Y.; Gil, H.-W. Prediction Model of Acute Respiratory Failure in Patients with Acute Pesticide Poisoning by Intentional Ingestion: Prediction of Respiratory Failure in Pesticide Intoxication (PREP) Scores in Cohort Study. J. Clin. Med. 2022, 11, 1048. [Google Scholar] [CrossRef] [PubMed]
  10. Kim, J.; Chae, M.; Chang, H.-J.; Kim, Y.-A.; Park, E. Predicting cardiac arrest and respiratory failure using feasible artificial intelligence with simple trajectories of patient data. J. Clin. Med. 2019, 8, 1336. [Google Scholar] [CrossRef] [Green Version]
  11. Idri, A.; Benhar, H.; Fernández-Alemán, J.; Kadi, I. A systematic map of medical data preprocessing in knowledge discovery. Comput. Methods Programs Biomed. 2018, 162, 69–85. [Google Scholar] [CrossRef]
  12. Benhar, H.; Idri, A.; Fernández-Alemán, J. Data preprocessing for heart disease classification: A systematic literature review. Comput. Methods Programs Biomed. 2020, 195, 105635. [Google Scholar] [CrossRef] [PubMed]
  13. Jadhav, A.; Pramod, D.; Ramanathan, K. Comparison of performance of data imputation methods for numeric dataset. Appl. Artif. Intell. 2019, 33, 913–933. [Google Scholar] [CrossRef]
  14. Li, J.; Cheng, K.; Wang, S.; Morstatter, F.; Trevino, R.P.; Tang, J.; Liu, H. Feature selection: A data perspective. ACM Comput. Surv. 2017, 50, 1–45. [Google Scholar] [CrossRef]
  15. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Du-bourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. JMLR 2011, 12, 2825–2830. [Google Scholar]
  16. Stekhoven, D.J.; Bühlmann, P. MissForest—Non-parametric missing value imputation for mixed-type data. Bioinformatics 2012, 28, 112–118. [Google Scholar] [CrossRef] [Green Version]
  17. Seabold, S.; Perktold, J. Statsmodels: Econometric and statistical modeling with python. In Proceedings of the 9th Python in Science Conference, Austin, TX, USA, 28 June–3 July 2010; p. 10-25080. [Google Scholar]
  18. Mohammed, R.; Rawashdeh, J.; Abdullah, M. Machine learning with oversampling and undersampling techniques: Over-view study and experimental results. In Proceedings of the 2020 11th International Conference on Information and Communication Systems (ICICS), Irbid, Jordan, 7–9 April 2020; pp. 243–248. [Google Scholar]
  19. He, H.; Bai, Y.; Garcia, E.A.; Li, S. ADASYN: Adaptive synthetic sampling approach for imbalanced learning. In Proceedings of the 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), Hong Kong, China, 1–8 June 2008; pp. 1322–1328. [Google Scholar]
  20. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res. 2002, 16, 321–357. [Google Scholar] [CrossRef]
  21. Han, H.; Wang, W.-Y.; Mao, B.-H. Borderline-SMOTE: A new over-sampling method in imbalanced data sets learning. In Proceedings of the International Conference on Intelligent Computing, Hefei, China, 23–26 August 2005; pp. 878–887. [Google Scholar]
  22. Kleinbaum, D.G.; Dietz, K.; Gail, M.; Klein, M.; Klein, M. Logistic regression; Springer: Berlin/Heidelberg, Germany, 2002. [Google Scholar]
  23. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef] [Green Version]
  24. Platt, J. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Adv. Large Margin Classif. 1999, 10, 61–74. [Google Scholar]
  25. Freund, Y.; Schapire, R.E. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 1997, 55, 119–139. [Google Scholar] [CrossRef] [Green Version]
  26. Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; et al. TensorFlow: A System for Large-Scale Machine Learning. In Proceedings of the 12th USENIX symposium on operating systems design and implementation (OSDI 16), Savannah, GA, USA, 2–4 November 2016; pp. 265–283. [Google Scholar]
  27. Hüsken, M.; Stagge, P. Recurrent neural networks for time series classification. Neurocomputing 2003, 50, 223–235. [Google Scholar] [CrossRef]
  28. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
  29. Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv 2014, arXiv:1406.1078. [Google Scholar]
  30. Refaeilzadeh, P.; Tang, L.; Liu, H. Cross-validation. Encycl. Database Syst. 2009, 5, 532–538. [Google Scholar]
  31. Berrar, D. Cross-Validation. Encycl. Bioinform. Comput. Biol. 2019, 1, 542–545. Available online: https://www.sciencedirect.com/science/article/pii/B978012809633820349X?via%3Dihub (accessed on 23 September 2022).
Figure 1. Patient selection, data preprocessing, and dataset. The training set contained 7291 samples, the validation set 1695, the holdout fold 2421, and the test dataset 3458.
Figure 2. Processing by the recent data imputer (RDI). The black nodes indicate measured values and the white nodes indicate missing values.
Figure 3. Processing by the MissForest. (1) Interpolate missing values into the median. (2) Train via dataset. (3) Interpolate missing values into the prediction results.
Figure 4. Processing by sliding window.
Figure 5. SMOTE processing. Blue indicates the majority dataset (non-respiratory failure). Orange indicates the minority dataset (respiratory failure). Green indicates the minority dataset (respiratory failure), which is generated by SMOTE.
Figure 6. Borderline-SMOTE processing. Blue indicates the majority dataset (non-respiratory failure). Orange indicates the minority dataset (respiratory failure). Green indicates the minority dataset (respiratory failure) generated by Borderline-SMOTE.
Figure 7. ADASYN processing. Blue indicates the majority dataset (non-respiratory failure). Orange indicates the minority dataset (respiratory failure). Green indicates the minority dataset (respiratory failure) generated by ADASYN. When there are more majority data surrounding minority data, more minority data are produced.
Figure 8. Decision tree. Orange indicates the prediction of respiratory failure in each node. Blue indicates non-respiratory failure. Darker nodes contain more data.
Figure 9. Processing of random forests. The red box in the decision tree indicates the predicted outcome. The random forest classifier is based on decision tree outcomes. Random Forest performs bootstrap aggregation on many decision trees and uses voting to determine the prediction results of many decision trees.
Figure 10. Classification by the SVM based on support vectors.
Figure 11. Processing of adaptive boost. Each estimator transfers the weight of an incorrect prediction after training via the dataset.
Figure 12. Structure of the MLP in this paper. The number of units in the hidden layer is 256.
Figure 13. Structure of the RNN model in this paper. The number of units in the RNN layer is 128.
Figure 14. Structure of the LSTM model in this paper. The number of units in the LSTM layer is 128.
Figure 15. Structure of the GRU model in this paper. The number of units in the GRU layer is 128.
Figure 16. Stratified k-fold processing (in this case, k is 4).
Table 1. p-values of the association between each feature and respiratory failure.

Feature | p-Value
Pesticide dose | 0.000
Sex | 0.000
Age | 0.000
BMI | 0.023
Smoking | 0.326
Alcohol | 0.313
Diabetes disease | 0.000
Respiratory disease | 0.000
Cardiovascular disease | 0.995
GCS | 0.000
SBP max | 0.088
DBP max | 0.897
HR max | 0.000
RR max | 0.174
BT max | 0.000
Hb | 0.221
WBC | 0.000
PLT | 0.000
Albumin | 0.000
Glucose | 0.070
BUN | 0.622
Creatinine | 0.100
Total CO2 | 0.000
C-reactive protein 1 | 0.000
pH | 0.000
pCO2 | 0.199
pO2 | 0.000
O2 saturation | 0.000
HCO3 standard | 0.356
BE | 0.120
Troponin | 0.555
Lactate | 0.000
BMI: body mass index; GCS: Glasgow Coma Scale; SBP: systolic blood pressure; DBP: diastolic blood pressure; HR: heart rate; RR: respiratory rate; BT: body temperature; max: maximum; Hb: hemoglobin; WBC: white blood cell; PLT: platelet; BUN: blood urea nitrogen; BE: base excess.
Table 2. Performance result of tree-based feature selection method.

Feature | RF | GB
Pesticide category | 0.056 | 0.073
Pesticide dose | 0.032 | 0.017
Sex | 0.001 | 0.0003
Age | 0.039 | 0.019
GCS | 0.053 | 0.068
SBP_max | 0.010 | 0.003
HR_max | 0.087 | 0.097
BT_max | 0.033 | 0.029
WBC | 0.224 | 0.316
PLT | 0.032 | 0.013
Albumin | 0.034 | 0.008
total_CO2 | 0.072 | 0.025
C-reactive_protein_1 | 0.053 | 0.057
pH | 0.175 | 0.247
pO2 | 0.018 | 0.003
O2_saturation | 0.062 | 0.017
Lactate | 0.017 | 0.006
RF: Random Forest; GB: Gradient Boost; GCS: Glasgow Coma Scale; SBP: systolic blood pressure; max: maximum; HR: heart rate; BT: body temperature; WBC: white blood cell; PLT: platelet.
Table 3. Performance result of recursive feature elimination of each algorithm. Low-rank features are albumin, platelet, and sex. High-rank feature is pH.

Machine Learning Algorithm | Low-Rank Feature | High-Rank Feature
SVM with linear kernel | Albumin | pH
LR | PLT | pH
RF | Sex | pH
DT | Albumin | pH
GB | Sex | pH
SVM: support vector machine; LR: logistic regression; RF: random forest; DT: decision tree; GB: gradient boost; PLT: platelet.
Table 4. Performance result of feature selection based on RF, GB, and MLP.

Feature Set | Algorithm | PPV | Sensitivity | F1 Score | AUC
Reference | RF | 98.92% | 96.34% | 0.9761 | 0.9812
Reference | GB | 96.81% | 95.29% | 0.9604 | 0.9751
Reference | MLP | 96.81% | 95.29% | 0.9604 | 0.9751
Reference excluding sex | RF | 98.92% | 95.81% | 0.9734 | 0.9786
Reference excluding sex | GB | 92.78% | 94.24% | 0.9351 | 0.9681
Reference excluding sex | MLP | 92.78% | 94.24% | 0.9351 | 0.9681
Reference excluding PLT | RF | 99.46% | 95.81% | 0.9760 | 0.9788
Reference excluding PLT | GB | 98.89% | 93.19% | 0.9596 | 0.9655
Reference excluding PLT | MLP | 92.78% | 94.24% | 0.9351 | 0.9681
Reference excluding albumin | RF | 98.39% | 95.81% | 0.9708 | 0.9784
Reference excluding albumin | GB | 96.70% | 92.15% | 0.9437 | 0.9594
Reference excluding albumin | MLP | 96.70% | 92.15% | 0.9437 | 0.9594
PPV: positive predictive value; RF: random forest; GB: gradient boosting; MLP: multi-layer perceptron; PLT: platelet.
Table 5. Confusion matrix.

Prediction \ Actual | Respiratory Failure | Non-Respiratory Failure
Respiratory failure | True positive (TP) | False positive (FP)
Non-respiratory failure | False negative (FN) | True negative (TN)
Table 6. The characteristics of our dataset.

Feature | Training Data (n = 8068) | Test Data (n = 3458)
Pesticide dose | 171.87 ± 138.07 | 177.18 ± 144.12
Sex, male | 4994, 61.90% | 2210, 63.91%
Age | 61.07 ± 16.37 | 61.40 ± 16.48
GCS | 13.90 ± 2.10 | 13.85 ± 2.19
1h_SBP_max | 124.00 ± 18.03 | 123.89 ± 18.16
1h_HR_max | 76.40 ± 14.37 | 76.30 ± 14.36
1h_BT_max | 36.55 ± 0.36 | 36.55 ± 0.36
1h_WBC | 7.53 ± 3.17 | 7.54 ± 3.15
1h_PLT | 144.52 ± 64.19 | 142.90 ± 65.09
1h_albumin | 3.59 ± 0.48 | 3.57 ± 0.49
1h_total_CO2 | 24.14 ± 3.15 | 24.23 ± 3.15
1h_C-reactive_protein_1 | 26.97 ± 50.49 | 28.73 ± 51.61
1h_pH | 7.43 ± 0.06 | 7.43 ± 0.06
1h_pO2 | 93.29 ± 26.62 | 93.02 ± 24.74
1h_O2_saturation | 96.02 ± 3.67 | 96.03 ± 3.62
1h_lactate | 2.43 ± 2.02 | 2.48 ± 2.12
2h_SBP_max | 123.92 ± 17.90 | 123.93 ± 18.32
2h_HR_max | 76.40 ± 14.38 | 76.38 ± 14.43
2h_BT_max | 36.55 ± 0.35 | 36.56 ± 0.37
2h_WBC | 7.53 ± 3.17 | 7.54 ± 3.15
2h_PLT | 144.12 ± 63.80 | 142.47 ± 64.86
2h_albumin | 3.58 ± 0.48 | 3.57 ± 0.49
2h_total_CO2 | 24.14 ± 3.15 | 24.23 ± 3.15
2h_C-reactive_protein_1 | 26.98 ± 50.49 | 28.74 ± 51.61
2h_pH | 7.43 ± 0.06 | 7.43 ± 0.07
2h_pO2 | 93.41 ± 26.20 | 92.98 ± 24.72
2h_O2_saturation | 95.99 ± 3.79 | 96.00 ± 3.72
2h_lactate | 2.44 ± 2.03 | 2.48 ± 2.13
3h_SBP_max | 123.84 ± 18.01 | 123.79 ± 17.99
3h_HR_max | 76.48 ± 14.52 | 76.34 ± 14.36
3h_BT_max | 36.56 ± 0.35 | 36.56 ± 0.36
3h_WBC | 7.53 ± 3.17 | 7.53 ± 3.15
3h_PLT | 143.94 ± 63.64 | 142.42 ± 64.84
3h_albumin | 3.58 ± 0.48 | 3.57 ± 0.49
3h_total_CO2 | 24.14 ± 3.16 | 24.24 ± 3.15
3h_C-reactive_protein_1 | 26.99 ± 50.48 | 28.74 ± 51.61
3h_pH | 7.43 ± 0.07 | 7.43 ± 0.07
3h_pO2 | 93.44 ± 25.88 | 92.98 ± 24.74
3h_O2_saturation | 95.99 ± 3.81 | 95.97 ± 3.88
3h_lactate | 2.44 ± 2.04 | 2.49 ± 2.15
GCS: Glasgow Coma Scale; SBP: systolic blood pressure; HR: heart rate; BT: body temperature; WBC: white blood cell; PLT: platelet.
Table 7. Performance comparison of KNN and MissForest based on RF, GB, and MLP.

Imputation | Algorithm | PPV | Sensitivity | F1 Score | AUC
KNN imputer | RF | 98.92% | 96.34% | 0.9761 | 0.9812
KNN imputer | GB | 97.80% | 93.19% | 0.9544 | 0.9651
KNN imputer | MLP | 96.81% | 95.29% | 0.9604 | 0.9751
MissForest imputer | RF | 98.92% | 96.34% | 0.9761 | 0.9812
MissForest imputer | GB | 97.74% | 90.58% | 0.9402 | 0.9520
MissForest imputer | MLP | 96.17% | 92.15% | 0.9412 | 0.9592
PPV: positive predictive value; RF: random forest; GB: gradient boosting; MLP: multi-layer perceptron.
Table 8. Performance comparison of hyperparameter tuning.

Algorithm | Hyperparameters | PPV | Sensitivity | F1 Score | AUC
DT | Gini, max depth 8 | 95.76% | 82.72% | 0.8876 | 0.9120
DT | Gini, max depth 9 | 96.49% | 86.39% | 0.9116 | 0.9306
DT | Gini, max depth 10 | 98.17% | 84.29% | 0.9070 | 0.9208
DT | Entropy, max depth 8 | 97.27% | 93.19% | 0.9519 | 0.9648
DT | Entropy, max depth 9 | 98.35% | 93.72% | 0.9598 | 0.9679
DT | Entropy, max depth 10 | 98.31% | 91.62% | 0.9485 | 0.9574
RF | Gini, max depth 8 | 98.66% | 76.96% | 0.8647 | 0.8844
RF | Gini, max depth 9 | 98.76% | 83.25% | 0.9034 | 0.9158
RF | Gini, max depth 10 | 98.74% | 82.20% | 0.8971 | 0.9105
RF | Entropy, max depth 8 | 98.86% | 91.10% | 0.9482 | 0.9550
RF | Entropy, max depth 9 | 98.92% | 96.34% | 0.9761 | 0.9812
RF | Entropy, max depth 10 | 98.91% | 95.29% | 0.9707 | 0.9760
SVM | Regularization parameter 5.5, RBF kernel | 99.32% | 76.96% | 0.8673 | 0.8846
SVM | Regularization parameter 5.5, linear kernel | 77.36% | 42.92% | 0.5522 | 0.7093
SVM | Regularization parameter 5.5, polynomial kernel | 97.43% | 79.58% | 0.8761 | 0.8970
SVM | Regularization parameter 6.0, RBF kernel | 99.33% | 77.49% | 0.8706 | 0.8872
SVM | Regularization parameter 6.0, linear kernel | 76.58% | 44.50% | 0.5629 | 0.7167
SVM | Regularization parameter 6.0, polynomial kernel | 96.84% | 80.10% | 0.8768 | 0.8994
SVM | Regularization parameter 6.5, RBF kernel | 99.34% | 79.06% | 0.8805 | 0.8950
SVM | Regularization parameter 6.5, linear kernel | 75.68% | 43.98% | 0.5563 | 0.7138
SVM | Regularization parameter 6.5, polynomial kernel | 86.84% | 80.10% | 0.8768 | 0.8994
AB | 140 estimators, learning rate 0.5 | 94.08% | 74.87% | 0.8338 | 0.8723
AB | 140 estimators, learning rate 1.0 | 95.43% | 87.43% | 0.9126 | 0.9354
AB | 140 estimators, learning rate 2.0 | 6.91% | 84.82% | 0.1277 | 0.4344
AB | 150 estimators, learning rate 0.5 | 94.34% | 78.53% | 0.8571 | 0.8907
AB | 150 estimators, learning rate 1.0 | 96.07% | 89.53% | 0.9268 | 0.9461
AB | 150 estimators, learning rate 2.0 | 6.91% | 84.82% | 0.1277 | 0.4344
AB | 160 estimators, learning rate 0.5 | 94.97% | 79.06% | 0.8629 | 0.8935
AB | 160 estimators, learning rate 1.0 | 94.97% | 89.01% | 0.9189 | 0.9430
AB | 160 estimators, learning rate 2.0 | 6.91% | 84.82% | 0.1277 | 0.4344
GB | 110 estimators, max depth 3 | 95.73% | 82.99% | 0.8845 | 0.9094
GB | 110 estimators, max depth 4 | 97.77% | 91.62% | 0.9459 | 0.9572
GB | 110 estimators, max depth 5 | 97.80% | 93.19% | 0.9544 | 0.9651
GB | 120 estimators, max depth 3 | 96.45% | 85.34% | 0.9056 | 0.9254
GB | 120 estimators, max depth 4 | 97.80% | 93.19% | 0.9544 | 0.9651
GB | 120 estimators, max depth 5 | 97.80% | 93.19% | 0.9544 | 0.9651
GB | 130 estimators, max depth 3 | 96.45% | 85.34% | 0.9056 | 0.9254
GB | 130 estimators, max depth 4 | 97.78% | 92.15% | 0.9488 | 0.9598
GB | 130 estimators, max depth 5 | 96.55% | 87.96% | 0.9205 | 0.9384
MLP | 64 units, dropout 20% | 97.13% | 88.48% | 0.9260 | 0.9413
MLP | 64 units, dropout 30% | 96.63% | 90.05% | 0.9322 | 0.9489
MLP | 64 units, dropout 40% | 95.71% | 81.68% | 0.8814 | 0.9068
MLP | 128 units, dropout 20% | 97.71% | 89.53% | 0.9344 | 0.9467
MLP | 128 units, dropout 30% | 98.24% | 87.43% | 0.9252 | 0.9365
MLP | 128 units, dropout 40% | 94.97% | 79.06% | 0.8629 | 0.8935
MLP | 256 units, dropout 20% | 96.81% | 95.29% | 0.9604 | 0.9751
MLP | 256 units, dropout 30% | 97.80% | 93.19% | 0.9544 | 0.9651
MLP | 256 units, dropout 40% | 92.73% | 80.10% | 0.8596 | 0.8798
RNN | 64 units, dropout 20% | 76.28% | 62.30% | 0.6859 | 0.8032
RNN | 64 units, dropout 30% | 86.26% | 82.20% | 0.8418 | 0.9054
RNN | 64 units, dropout 40% | 71.34% | 66.49% | 0.6883 | 0.8210
RNN | 128 units, dropout 20% | 98.32% | 92.15% | 0.9514 | 0.9601
RNN | 128 units, dropout 30% | 97.85% | 95.29% | 0.9655 | 0.9755
RNN | 128 units, dropout 40% | 78.08% | 59.69% | 0.6766 | 0.7913
RNN | 256 units, dropout 20% | 96.65% | 90.58% | 0.9351 | 0.9515
RNN | 256 units, dropout 30% | 64.88% | 69.63% | 0.6717 | 0.8320
RNN | 256 units, dropout 40% | 98.22% | 86.91% | 0.9222 | 0.9339
LSTM | 64 units, dropout 20% | 95.72% | 93.72% | 0.9471 | 0.9668
LSTM | 64 units, dropout 30% | 98.29% | 90.05% | 0.9399 | 0.9496
LSTM | 64 units, dropout 40% | 72.15% | 59.69% | 0.6533 | 0.7886
LSTM | 128 units, dropout 20% | 96.15% | 91.62% | 0.9383 | 0.9565
LSTM | 128 units, dropout 30% | 98.29% | 90.05% | 0.9399 | 0.9459
LSTM | 128 units, dropout 40% | 98.88% | 92.67% | 0.9568 | 0.9629
LSTM | 256 units, dropout 20% | 98.87% | 91.62% | 0.9511 | 0.9577
LSTM | 256 units, dropout 30% | 97.77% | 91.62% | 0.9459 | 0.9572
LSTM | 256 units, dropout 40% | 98.86% | 91.10% | 0.9482 | 0.9550
GRU | 64 units, dropout 20% | 97.75% | 91.10% | 0.9431 | 0.9546
GRU | 64 units, dropout 30% | 98.82% | 87.43% | 0.9278 | 0.9367
GRU | 64 units, dropout 40% | 98.32% | 92.15% | 0.9514 | 0.9601
GRU | 128 units, dropout 20% | 98.88% | 92.15% | 0.9539 | 0.9603
GRU | 128 units, dropout 30% | 97.16% | 89.53% | 0.9319 | 0.9465
GRU | 128 units, dropout 40% | 97.30% | 94.24% | 0.9574 | 0.9701
GRU | 256 units, dropout 20% | 97.78% | 92.15% | 0.9488 | 0.9598
GRU | 256 units, dropout 30% | 98.31% | 91.10% | 0.9457 | 0.9548
GRU | 256 units, dropout 40% | 98.34% | 93.19% | 0.9570 | 0.9653
PPV: positive predictive value; DT: decision tree; RF: random forest; SVM: support vector machine; RBF: radial basis function; poly: polynomial; AB: adaptive boost; GB: gradient boost; MLP: multi-layer perceptron; RNN: recurrent neural network; LSTM: long short-term memory; GRU: gated recurrent unit.
Table 9. Prediction performance. GRU with ADASYN demonstrated the highest performance.

Machine Learning Algorithm | Oversampling | PPV | Sensitivity | F1 Score | AUC
LR | SMOTE | 42.42% | 86.39% | 0.5690 | 0.8817
LR | Borderline SMOTE | 43.24% | 85.34% | 0.5739 | 0.8787
LR | ADASYN | 44.24% | 86.39% | 0.5851 | 0.8853
DT | SMOTE | 86.27% | 92.15% | 0.8911 | 0.9545
DT | Borderline SMOTE | 92.22% | 93.19% | 0.9271 | 0.9626
DT | ADASYN | 86.12% | 94.24% | 0.9000 | 0.9647
RF | SMOTE | 96.43% | 98.95% | 0.9767 | 0.9932
RF | Borderline SMOTE | 95.43% | 98.43% | 0.9691 | 0.9901
RF | ADASYN | 96.43% | 98.90% | 0.9765 | 0.9929
SVM | SMOTE | 91.09% | 96.34% | 0.9364 | 0.9776
SVM | Borderline SMOTE | 90.53% | 90.05% | 0.9029 | 0.9462
SVM | ADASYN | 90.40% | 93.72% | 0.9203 | 0.9643
AB | SMOTE | 83.11% | 95.29% | 0.8878 | 0.9681
AB | Borderline SMOTE | 85.57% | 90.05% | 0.8776 | 0.9438
AB | ADASYN | 83.49% | 92.67% | 0.8784 | 0.9555
GB | SMOTE | 95.96% | 99.48% | 0.9769 | 0.9956
GB | Borderline SMOTE | 95.90% | 97.91% | 0.9689 | 0.9877
GB | ADASYN | 98.45% | 99.48% | 0.9896 | 0.9967
MLP | SMOTE | 98.94% | 97.91% | 0.9842 | 0.9891
MLP | Borderline SMOTE | 96.84% | 96.34% | 0.9659 | 0.9803
MLP | ADASYN | 98.38% | 95.29% | 0.9681 | 0.9758
RNN | SMOTE | 97.33% | 95.29% | 0.9630 | 0.9753
RNN | Borderline SMOTE | 98.35% | 93.72% | 0.9598 | 0.9679
RNN | ADASYN | 96.35% | 96.86% | 0.9661 | 0.9827
LSTM | SMOTE | 97.85% | 95.29% | 0.9655 | 0.9755
LSTM | Borderline SMOTE | 98.38% | 95.29% | 0.9681 | 0.9758
LSTM | ADASYN | 98.93% | 96.86% | 0.9788 | 0.9838
GRU | SMOTE | 98.38% | 95.29% | 0.9681 | 0.9758
GRU | Borderline SMOTE | 98.38% | 95.29% | 0.9681 | 0.9758
GRU | ADASYN | 98.42% | 97.91% | 0.9816 | 0.9889
PPV: positive predictive value; LR: logistic regression; DT: decision tree; RF: random forest; SVM: support vector machine; AB: adaptive boost; GB: gradient boost; MLP: multi-layer perceptron; RNN: recurrent neural network; LSTM: long short-term memory; GRU: gated recurrent unit.
Table 10. Highest prediction performance of each machine learning algorithm.

Machine Learning Algorithm | Oversampling | PPV | Sensitivity | F1 Score | AUC
LR | ADASYN | 44.47% | 83.88% | 0.5812 | 0.8745
DT | Borderline SMOTE | 92.11% | 94.14% | 0.9312 | 0.9672
RF | SMOTE | 96.09% | 98.90% | 0.9747 | 0.9928
SVM | SMOTE | 89.49% | 96.70% | 0.9296 | 0.9786
AB | SMOTE | 86.33% | 94.87% | 0.9040 | 0.9679
GB | ADASYN | 96.10% | 99.27% | 0.9766 | 0.9946
MLP | SMOTE | 97.48% | 99.27% | 0.9837 | 0.9952
RNN | ADASYN | 95.67% | 97.07% | 0.9636 | 0.9835
LSTM | ADASYN | 98.18% | 98.90% | 0.9854 | 0.9937
GRU | ADASYN | 98.53% | 98.17% | 0.9835 | 0.9902
PPV: positive predictive value; LR: logistic regression; DT: decision tree; RF: random forest; SVM: support vector machine; AB: adaptive boost; GB: gradient boost; MLP: multi-layer perceptron; RNN: recurrent neural network; LSTM: long short-term memory; GRU: gated recurrent unit.
Table 11. Performance result of each scenario.

Scenario | Algorithm | PPV | Sensitivity | F1 Score | AUC
Reference | LR | 44.47% | 83.88% | 0.5812 | 0.8745
Reference | DT | 92.11% | 94.14% | 0.9312 | 0.9672
Reference | RF | 96.09% | 98.90% | 0.9747 | 0.9928
Reference | SVM | 89.49% | 96.70% | 0.9296 | 0.9786
Reference | AB | 86.33% | 94.87% | 0.9040 | 0.9679
Reference | GB | 96.10% | 99.27% | 0.9766 | 0.9946
Reference | MLP | 97.48% | 99.27% | 0.9837 | 0.9952
Reference | RNN | 95.67% | 97.07% | 0.9636 | 0.9835
Reference | LSTM | 98.18% | 98.90% | 0.9854 | 0.9937
Reference | GRU | 98.53% | 98.17% | 0.9835 | 0.9902
Replace missing values via average | LR | 22.26% | 70.70% | 0.3386 | 0.7477
Replace missing values via average | DT | 63.06% | 83.15% | 0.7172 | 0.8949
Replace missing values via average | RF | 53.24% | 87.18% | 0.6611 | 0.9031
Replace missing values via average | SVM | 41.05% | 83.15% | 0.5496 | 0.8646
Replace missing values via average | AB | 58.21% | 73.99% | 0.6516 | 0.8472
Replace missing values via average | GB | 75.40% | 86.45% | 0.8055 | 0.9201
Replace missing values via average | MLP | 65.49% | 81.32% | 0.7255 | 0.8882
Replace missing values via average | RNN | 62.10% | 78.02% | 0.6916 | 0.8697
Replace missing values via average | LSTM | 41.63% | 80.22% | 0.5482 | 0.8529
Replace missing values via average | GRU | 65.73% | 85.71% | 0.7440 | 0.9094
No oversampling | LR | 81.46% | 45.05% | 0.5802 | 0.7209
No oversampling | DT | 98.05% | 92.31% | 0.9509 | 0.9608
No oversampling | RF | 98.85% | 94.14% | 0.9644 | 0.9702
No oversampling | SVM | 99.06% | 77.29% | 0.8683 | 0.8861
No oversampling | AB | 94.80% | 86.81% | 0.9063 | 0.9320
No oversampling | GB | 98.38% | 89.01% | 0.9346 | 0.9444
No oversampling | MLP | 95.67% | 80.95% | 0.8770 | 0.9032
No oversampling | RNN | 90.87% | 83.88% | 0.8724 | 0.9158
No oversampling | LSTM | 77.73% | 62.64% | 0.6937 | 0.8055
No oversampling | GRU | 91.09% | 86.08% | 0.8851 | 0.9268
PPV: positive predictive value; LR: logistic regression; DT: decision tree; RF: random forest; SVM: support vector machine; AB: adaptive boost; GB: gradient boost; MLP: multi-layer perceptron; RNN: recurrent neural network; LSTM: long short-term memory; GRU: gated recurrent unit.
Table 12. Comparison of the performance between the proposed algorithm and algorithms in other studies.

Algorithm | Features | Patient Data Range | Sensitivity | PPV | AUC
Semi-supervised learning [4] | 25 | 32 h | 0.78 | 0.023 | 0.78
XGBoost [5] | 24 | - | 0.71 | - | -
LR [6] | 26 | - | 0.72 | 0.74 | 0.89
LightGBM [7] | 25 | - | 0.80 | - | 0.746
GradientBoosting [8] | 106 | 6 h | 0.534 | 0.643 | 0.769
LR [9] | 7 | - | 0.606 | 0.833 | 0.912
LSTM [10] | 8 | 2 h | 0.881 | 0.226 | 0.886
Our algorithm (LSTM) | 17 | 3 h | 0.9817 | 0.9890 | 0.9937
PPV: positive predictive value; LR: logistic regression; LSTM: long short-term memory; RF: random forest.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
