1. Introduction
Severe Acute Respiratory Syndrome CoronaVirus 2 (SARS-CoV-2) has caused the present pandemic of coronavirus disease 2019 (COVID-19) [1]. The first cases of SARS-CoV-2 appeared as an outbreak in the Chinese province of Hubei in December 2019 [2].
In the first week of March 2020, over 400,000 cases were confirmed globally, in 130 countries, and by 29 January 2021, the confirmed cases had risen to 100,819,363 in 250 countries/regions, with 2,176,159 deaths worldwide [3].
At the beginning of 2021, the number of countries and regions struggling with the COVID-19 pandemic rose to over 250, and the number of cases is increasing rapidly in many of them. One of the most pressing needs in this period of increasing epidemic severity is the number of beds and ventilators (respirators) in intensive care units (ICUs). ICUs are critical to improving the survival of patients with severe COVID-19 because they supply continuous oxygen support, assisted ventilation when needed [4,5], and attention around the clock. ICUs are a valuable asset in areas with a high number of patients with COVID-19 [6].
However, many countries are worried about the lack of health infrastructure in the face of the rapidly increasing number of cases [7]. While governments have applied various protection measures, health units are working to withstand the tsunami of infected individuals who need treatment [8]. For instance, Spain and Italy have been hit very hard, with a tremendous number of documented cases and deaths [9]. In Italy especially, critical resources such as protective equipment, ventilators, and even medical staff are becoming scarce. Doctors are being forced to choose which patients should be prioritized for care [10].
According to Emanuel et al. [11], the regular approach of treating people on a “first-come, first-served” basis should not apply during these times. They suggested that prioritizing patients according to indicators related to age and to the respiratory and cardiac systems would be a better approach. While the COVID-19 pandemic, which has affected the entire world, caused a noticeable slowdown or even an almost complete halt in all businesses and industry, it also revealed the necessity of avoiding overload of the health system and of using health-related resources and health personnel effectively.
Since the beginning of the pandemic, a large number of academics have produced significant papers and contributions to the struggle with COVID-19. Although the proposed study focuses on the decisions at the operational level, most of the studies relevant to COVID-19 have concentrated on strategic-level decisions such as spreading models or governments’ policies.
For instance, Giordano et al. [12] proposed a new model that predicts the course of the epidemic to help plan an effective control strategy for Italy. Their findings provide policymakers with a tool to assess the consequences of possible strategies, including lockdown and social distancing, as well as testing and contact tracing.
To mitigate the COVID-19 outbreak, Carli et al. [13] proposed an optimal control approach that supports governments in defining the most effective strategies to be adopted during post-lockdown mitigation phases in a multi-region scenario. Pare et al. [14] presented a variety of mathematical models that have been proposed to capture the dynamic behavior of epidemic processes and to estimate the spreading parameters of the virus. For an excellent review of COVID-19 forecasting and SIR models, the reader is referred to Rahimi et al. [15]. It is therefore necessary to act immediately and develop systematic methodologies to overcome the aforementioned issues, maintain the healthcare system, and fight the current pandemic while protecting valuable and limited resources and healthcare personnel.
This research proposes a multi-criteria decision-making procedure (AHP) combined with XGBoost to aid healthcare professionals in prioritizing patients infected with COVID-19 based on the results of biological laboratory examinations, so that indoor healthcare providers can provide the required intensive care facilities and manage patients’ health conditions.
The methodology applied in this paper includes three main phases. In the first phase, the XGBoost classifier separated the patients in a dataset into those with COVID-19 who need ICU admission and those who do not. Then, the criteria that are considered for ICU admission were determined. Not all criteria are expected to have the same priority. For instance, for ICU admission, a patient’s need for vasopressors may be more urgent than an arrhythmia problem. For this reason, the criteria weights were determined using AHP in the second phase. Finally, the remaining question is which COVID-19-positive patient should be admitted to the ICU first in an emergency or limited-resource situation. To answer that question, the criteria weights were applied in the last phase to rank the patients who need ICU treatment.
The Analytic Hierarchy Process (AHP) is a multiple-criteria decision-making approach that provides a structured and simple framework for decision-making [16,17]. The medical, information management systems, engineering, financial, geography, business, industry, education, and healthcare sectors have all used AHP to tackle difficult decision problems [18,19,20]. XGBoost is based on ensembles of decision trees (DTs) and uses a set of classification or regression trees [21]. It predicts a target variable using training data (with multiple features) [22,23].
According to the studies investigated (see Ref. [24]), AHP approaches are commonly used in various subfields of healthcare management. In addition, Angelis et al. [25] mentioned that AHP approaches may give a more comprehensive and straightforward way in healthcare to efficiently capture decision-makers’ concerns, compare value trade-offs, and elicit their value preferences. Moreover, AHP methods might inform the development of a decision support system in healthcare, contributing toward more efficient, rational, and legitimate resource allocation decisions.
At the time of writing, there is no research on an integrated “XGBoost and AHP” system to determine and prioritize the status of patients with COVID-19 for referral to health services, although other studies have prioritized patient status without a classification step. This research determined the necessary criteria based on expert human judgment, and we applied machine learning methods. Other studies have addressed different topics, such as the economic impact of the pandemic on China and the world [26], the use of behavioral and social science to support the pandemic response [27], and the food supply chain during the COVID-19 pandemic [28]. Readers can easily find COVID-19 papers on many topics and from many angles.
The authors were motivated to write this paper by the need to determine the best strategy for accurately separating and prioritizing many patients infected with COVID-19 based on multiple laboratory examination features. If the proposed method is adopted by indoor healthcare providers (such as clinics and hospitals), medical staff should be able to manage infected patients and distinguish between health conditions during large-scale admissions, as well as ensure treatment equity between treatment structures across affected areas. The paper is organized as follows: a brief introduction and the potential of the proposed solution to the problem are presented in Section 1. The dataset preprocessing and the phases of the proposed methodology for prioritizing COVID-19 patients are described in Section 2. The results are discussed in Section 3, and the conclusion is presented in Section 4.
3. Experiments and Results
The experimental procedure comprised three phases. In the first phase, a classification of the datasets was performed by using the XGBoost classifier based on various clinical variables for patients that needed to be admitted to ICUs and those who did not. The performance of the XGBoost classifier was then compared with other classifier algorithms.
The second phase identified the important features according to their effect on the XGBoost classifier’s decisions. In the third phase, five patients were randomly selected from the dataset, and the AHP model was employed to determine their priority for ICU admission based on the important features recommended by the previous phase. In the last step, the AHP model’s decision was compared with that of the decision-making team. In the classification phase, 80% of the 550 records of clinical variables from patients with confirmed COVID-19 were used for training and the rest for testing.
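The 80/20 split described above can be sketched as follows. This is a minimal illustration only: the function name `train_test_split_indices`, the fixed seed, and the use of a simple random shuffle are assumptions, since the text does not state how the records were assigned to the two sets.

```python
import random


def train_test_split_indices(n_records, train_fraction=0.8, seed=0):
    """Shuffle record indices and split them into training and testing sets."""
    indices = list(range(n_records))
    # A seeded generator keeps the (hypothetical) split reproducible.
    random.Random(seed).shuffle(indices)
    n_train = round(n_records * train_fraction)
    return indices[:n_train], indices[n_train:]


# 550 records, as stated in the text; the split itself is illustrative.
train_idx, test_idx = train_test_split_indices(550)
print(len(train_idx), len(test_idx))
```

With 550 records, this leaves 110 records for testing, which matches the size of the testing dataset used for the confusion matrices.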
A confusion matrix for the testing dataset of 110 cases was produced for the eXtreme Gradient Boosting (XGBoost) classifier and for each of the counterpart classifiers, as shown in Figure 3. A confusion matrix summarizes a classification algorithm’s performance. When the number of observations in each class is unbalanced, or when the dataset has more than two classes, classification accuracy alone can be misleading. From the confusion matrices, the accuracy, sensitivity, and specificity rates were computed and are shown in Table 3.
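For reference, the three rates can be derived from the four cells of a binary confusion matrix as sketched below. The function name `confusion_rates` and the counts in the usage example are illustrative assumptions and are not the values of Figure 3 or Table 3.

```python
def confusion_rates(tp, tn, fp, fn):
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,   # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),   # share of positive cases that were detected
        "specificity": tn / (tn + fp),   # share of negative cases that were detected
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
rates = confusion_rates(tp=8, tn=9, fp=1, fn=2)
print(rates)
```

A large gap between sensitivity and specificity indicates that a classifier favors one class, even when its accuracy looks acceptable.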
The findings revealed that the XGBoost model could distinguish patients who need to be admitted to ICUs from those who do not with an accuracy of 97%. The XGBoost classifier also attained a considerably higher accuracy than the counterpart classifiers. The confusion matrix in Figure 3 shows that the tested XGBoost classifier correctly identified 54 cases with severe symptoms who require ICU admission (TP) and 52 cases who do not require ICU admission (TN). The XGBoost classifier therefore provided a useful and efficient identification of the COVID-19 cases that need ICUs using common clinical test variables. In addition, the XGBoost classifier achieved the highest sensitivity, 96%, because only two positive cases were wrongly identified as negative (FN), while three negative cases were wrongly identified as positive (FP).
On the other hand, the difference between the specificity and sensitivity values for the ANN, KNN, and SVM classifiers was very large, so these classifiers were biased toward one class. Their specificity rates were higher than their sensitivity rates, which means these classifiers were biased toward identifying the cases that do not need ICU admission.
DT and RF performed almost identically, but their results remained unsatisfactory compared with the XGBoost classifier. The SVM classifier showed the lowest performance and therefore the weakest ability to discriminate between cases, even when the classifier settings were altered.
To interpret the XGBoost classifier model and to show the relative importance of each feature and its effect on the prediction, a SHAP summary plot was generated and is shown in Figure 4. Each point in the SHAP summary plot represents a row of the dataset, and the plot shows whether each variable has a positive or negative relationship with the target.
Features are sorted in descending order of importance. The horizontal position of a point shows whether the effect of that feature value is associated with a higher or lower prediction; that is, the x-axis shows the effect of the feature on the prediction for a specific patient. Color indicates whether the feature value is high (red) or low (blue). A positive SHAP value pushes the prediction toward a patient with confirmed COVID-19 who needs an ICU, while a negative SHAP value pushes it toward a patient who does not. SHAP values farther from zero indicate a larger impact of that feature.
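To make the meaning of signed SHAP values concrete, the sketch below computes exact Shapley values by brute force for a tiny two-feature scoring function. Everything in it is an assumption for illustration: `toy_model`, its coefficients, the patient values, and the single reference point `baseline` are hypothetical, and practical SHAP explainers for tree ensembles use a background dataset and faster algorithms rather than this enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    """Hypothetical risk score; NOT the XGBoost model of this study."""
    respiratory_rate, lymphocytes = z
    return 0.1 * respiratory_rate - 0.5 * lymphocytes


x = [30.0, 0.8]          # hypothetical patient: high respiratory rate, low lymphocytes
baseline = [18.0, 2.0]   # hypothetical reference patient
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

In this toy case both contributions are positive, so both features push the score above the baseline, and the contributions add up to the difference between the patient's score and the baseline score.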
Figure 4 shows that the most important clinical variables, those with a significant effect on the XGBoost model’s prediction, were lymphocytes, PCR, diastolic blood pressure, respiratory rate, urea concentration, creatinine, neutrophils, PO2 venous blood gas, age above 65, sodium, TGO, GGT, glucose, and lactate.
Figure 4 also shows that patients whom the model predicted to need urgent ICU admission had high values of some features, such as the respiratory rate, PCR, urea concentration, creatinine, age, blood pressure, and lymphocytes, and low values of other features, such as oxygen saturation, lymphocytes, sodium, hematocrit, and lactate.
We then employed the AHP model to weight each clinical variable recommended by the SHAP summary plot. The results of the AHP method are presented below, following the steps described in Section 2.2.3.
In the first stage of the AHP method, a four-level analytic hierarchical tree was constructed and is shown in Figure 5. The first level is the goal of this study, which is to determine the prioritization of patients with COVID-19 for ICU admission. The second level represents the five key criteria: blood test, liver function test, kidney function test, blood gas analyzer, and vital signs. The third level shows the decomposition of the five key criteria into 14 subcriteria. Blood test is divided into lymphocytes, neutrophils, PCR, sodium, glucose, lactate, and TGO. Kidney function test is divided into urea and creatinine. Vital signs is divided into age above 65, respiratory rate, and diastolic blood pressure. Liver function test and blood gas analyzer each have a single subcriterion, GGT and PO2 venous, respectively. The last level of the decision hierarchy comprises the five patients (alternatives) to be ranked in order to determine the prioritization of patients with COVID-19 for ICU admission based on the selected criteria. The five patients were selected randomly from the dataset.
After constructing the decision hierarchy, a set of pair-wise comparison matrices was created for levels 2 and 3 of the analytic hierarchical tree. The pair-wise comparison judgments in this study were obtained through discussion with the decision-making team. In each matrix, every pair of elements was compared using the relative importance scale proposed by Saaty [35].
After constructing each pair-wise comparison matrix, the next step was normalization, which puts the matrix elements on a common scale. The criteria weights (priority vectors) were then computed from the normalized matrix using matrix algebra. After calculating the weights of each criterion in level 2 and each subcriterion in level 3, the results were arranged in descending order of priority.
Table 4, Table 5, Table 6 and Table 7 show the weights of the judgment matrices and all priority values (eigenvectors) for the hierarchy elements in levels 2 and 3.
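The normalization and weighting step can be sketched as follows. This is a minimal sketch that assumes the common column-normalization and row-averaging approximation of the priority vector; the function name `ahp_weights` and the 3×3 comparison matrix are hypothetical and do not reproduce the judgment matrices of this study.

```python
def ahp_weights(matrix):
    """Approximate AHP priority weights from a pair-wise comparison matrix.

    Each column is divided by its sum (normalization), and the weight of an
    element is the mean of its row in the normalized matrix.
    """
    n = len(matrix)
    column_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    normalized = [[matrix[i][j] / column_sums[j] for j in range(n)] for i in range(n)]
    return [sum(row) / n for row in normalized]


# Hypothetical judgments on Saaty's scale: criterion 1 is 2 times as important
# as criterion 2 and 6 times as important as criterion 3.
pairwise = [
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]
weights = ahp_weights(pairwise)
print(weights)
```

For a perfectly consistent matrix such as this one, the row averages coincide with the principal eigenvector; for less consistent judgments the two can differ slightly.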
Table 4 shows that blood tests ranked first among the key criteria with a weight of 46%, followed by the liver function test (26%), the kidney function test and blood gas analyzer (11% each), and vital signs (6%).
Table 5, Table 6 and Table 7 show that the subcriteria with the highest weights in their respective lists were the lymphocytes test, urea, and age above 65, while TGO, creatinine, and diastolic blood pressure had the lowest weights. The subcriteria GGT and PO2 venous, each the only subcriterion of its level-2 criterion (liver function test and blood gas analyzer, respectively), had the same weights as those criteria.
The CR values shown in Table 4, Table 5, Table 6 and Table 7 were all less than 0.1, which is acceptable and indicates that the experts’ inputs were consistent. After evaluating the weights for each criterion, the overall score for the five patients was computed and is shown in Table 8.
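The consistency check can be sketched as below, assuming the standard formulation in which the consistency index is CI = (λmax − n)/(n − 1) and the consistency ratio is CR = CI/RI, with RI taken from Saaty's commonly tabulated random index. The function name `consistency_ratio` and both example matrices are hypothetical.

```python
# Saaty's random consistency index (RI) for matrix sizes 3 to 10, as commonly tabulated.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def consistency_ratio(matrix, weights):
    """Consistency ratio (CR) of a pair-wise comparison matrix for given weights."""
    n = len(matrix)
    if n <= 2:
        return 0.0  # matrices of size 1 or 2 are always consistent
    weighted_sums = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    # Estimate the principal eigenvalue as the mean of (A w)_i / w_i.
    lambda_max = sum(weighted_sums[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Hypothetical, perfectly consistent judgments with weights 0.6, 0.3, 0.1.
consistent = [[1.0, 2.0, 6.0], [1 / 2, 1.0, 3.0], [1 / 6, 1 / 3, 1.0]]
cr_consistent = consistency_ratio(consistent, [0.6, 0.3, 0.1])

# Hypothetical circular judgments (1 > 2 > 3 > 1), which are strongly inconsistent.
circular = [[1.0, 9.0, 1 / 9], [1 / 9, 1.0, 9.0], [9.0, 1 / 9, 1.0]]
cr_circular = consistency_ratio(circular, [1 / 3, 1 / 3, 1 / 3])

print(cr_consistent < 0.1, cr_circular < 0.1)
```

A matrix whose CR is at or above 0.1 would normally be returned to the experts for revision before its weights are used.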
As illustrated in Table 8, the proposed AHP model ranked patient C first (highest priority for admission to an ICU) with the highest overall score of 2.116 (28%), patient E second with an overall score of 1.731 (23%), patient B third with an overall score of 1.508 (20%), patient A fourth with an overall score of 1.206 (16%), and patient D fifth (lowest priority for admission to an ICU) with the lowest overall score of 0.869 (12%).
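The ranking step can be reproduced from the overall scores quoted above. The sketch assumes that the percentages are each patient's share of the sum of the five scores, which is consistent with the reported values; the variable names are illustrative.

```python
# Overall AHP scores of the five patients, as reported in the text.
scores = {"A": 1.206, "B": 1.508, "C": 2.116, "D": 0.869, "E": 1.731}

total = sum(scores.values())
# Highest overall score first, i.e., highest priority for ICU admission.
ranking = sorted(scores, key=scores.get, reverse=True)

for rank, patient in enumerate(ranking, start=1):
    share = scores[patient] / total
    print(f"{rank}. patient {patient}: score {scores[patient]:.3f} ({share:.0%})")
```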
To validate the output of the AHP model, its results were compared with the decision-making team’s evaluation of the same five patients. Figure 6 presents the differences between the AHP prioritization results (solid line, labeled with the overall score of each patient) and the experts’ ranking (dashed line). The experts assigned patients A, B, C, and E the same risk level as the AHP model did, but their judgment differed for patient D: the experts evaluated patients A and D as having the same priority level, 4. The AHP overall scores show only a tiny difference between patient A (0.12) and patient D (0.11), which indicates that the AHP system and the experts effectively agreed on patient D.
The experiment was repeated three times on other randomly selected patients to investigate the variance between the AHP and the experts’ decision rankings, and the results are shown in Figure 7. The curves show that the AHP and the experts’ decisions were the same for all patients except patient A in Figure 7b and patients D and C in Figure 7c. From these results, it was concluded that slight differences between patients in the overall AHP score do not indicate actual differences in their level of risk. The results showed that when the difference between the total AHP scores of two patients is less than 0.01, both patients are at the same risk level.