Article

Role of Artificial Intelligence in Identifying Vital Biomarkers with Greater Precision in Emergency Departments During Emerging Pandemics

by Nicolás J. Garrido 1,2,3, Félix González-Martínez 2,3,4, Ana M. Torres 2,3, Pilar Blasco-Segura 5, Susana Losada 4, Adrián Plaza 4 and Jorge Mateo 2,3,*

1 Internal Medicine, Virgen de la Luz Hospital, 16002 Cuenca, Spain
2 Expert Medical Analysis Group, Institute of Technology, University of Castilla-La Mancha, 16071 Cuenca, Spain
3 Expert Medical Analysis Group, Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 45071 Toledo, Spain
4 Department of Emergency Medicine, Virgen de la Luz Hospital, 16002 Cuenca, Spain
5 Department of Pharmacy, General University Hospital, 46014 Valencia, Spain
* Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2025, 26(2), 722; https://doi.org/10.3390/ijms26020722
Submission received: 17 December 2024 / Revised: 12 January 2025 / Accepted: 14 January 2025 / Published: 16 January 2025
(This article belongs to the Special Issue COVID-19: Advances in Pathophysiology and Therapeutics)

Abstract

The COVID-19 pandemic has accelerated advances in molecular biology and virology, enabling the identification of key biomarkers to differentiate between severe and mild cases. Furthermore, the use of artificial intelligence (AI) and machine learning (ML) to analyze large datasets has been crucial for rapidly identifying relevant biomarkers for disease prognosis, including COVID-19. This approach enhances diagnostics in emergency settings, allowing for more accurate and efficient patient management. This study demonstrates how machine learning algorithms in emergency departments can rapidly identify key biomarkers for the vital prognosis in an emerging pandemic using COVID-19 as an example by analyzing clinical, epidemiological, analytical, and radiological data. All consecutively admitted patients were included, and more than 89 variables were processed using the Random Forest (RF) algorithm. The RF model achieved the highest balanced accuracy at 92.61%. The biomarkers most predictive of mortality included procalcitonin (PCT), lactate dehydrogenase (LDH), and C-reactive protein (CRP). Additionally, the system highlighted the significance of interstitial infiltrates in chest X-rays and D-dimer levels. Our results demonstrate that RF is crucial in identifying critical biomarkers in emerging diseases, accelerating data analysis, and optimizing prognosis and personalized treatment, emphasizing the importance of PCT and LDH in high-risk patients.

1. Introduction

The COVID-19 pandemic has significantly transformed the clinical and immunological profiles of patients. Insights into the molecular composition of the virus have been crucial in identifying emerging variants, such as Omicron (BA.2, BA.4, BA.5), predicting their behavior, and implementing large-scale vaccination programs [1,2,3]. These vaccines have demonstrated high efficacy in reducing disease severity and mortality; however, challenges such as superinfections and allergic reactions persist. Although hypersensitivity reactions to these vaccines are rare, guidelines from organizations like the European Academy of Allergy and Clinical Immunology (EAACI) have provided valuable recommendations for their management [4,5].
During this period, established prediction tools for other diseases, including the Wells scale and the YEARS algorithm, have been applied to COVID-19 patients to evaluate their utility. Findings indicate that these models, originally developed for the general population, exhibit limitations when applied to COVID-19 cases, highlighting the necessity for disease-specific predictive tools [6,7,8].
Additionally, geospatial models have been utilized to estimate COVID-19 infection and mortality rates. These models have been effective in identifying high-risk conditions [9], though their accuracy could be enhanced by integrating additional factors, such as mortality in long-term care facilities and adherence to social distancing protocols.
Advancements in molecular biology and virology have enabled the identification of critical conditions associated with COVID-19, such as multisystem inflammatory syndrome in children (MIS-C), a serious complication with low mortality rates [9]. Alterations in immune cells, including T cells, B cells, and mast cells, have been linked to long-term symptoms in some patients, commonly referred to as long COVID. The potential association between atopic diseases, such as asthma and rhinitis, and susceptibility to COVID-19 remains a topic of ongoing investigation.
These molecular biology advancements have been applied to establish correlations between specific molecules and severe COVID-19 outcomes. Studies have revealed a reduced diversity and increased clonal expansion of T-cell receptors in severe cases, particularly in patients under 55 years old, suggesting an impaired immune repertoire in this group [10,11]. Additionally, molecular biomarkers such as ferritin, D-dimer, C-reactive protein (CRP), and lactate dehydrogenase (LDH) have been identified as effective indicators for distinguishing mild from severe cases. Higher expression of certain genes, such as MX1 and AR, has been associated with lower risk in women [12,13,14].
The understanding of these biomarkers has also deepened insights into the effects of both the infection and the vaccines developed. For instance, studies have examined the relationship between the Sputnik vaccine and autoimmune diseases, reporting neurological and hematological manifestations such as thrombosis and Guillain–Barré syndrome in some patients, despite the vaccine’s overall safety [15]. Other research has linked the pandemic to an increase in Graves’ disease (GD), particularly among female smokers, suggesting immune system hyperactivation. The persistence of this trend remains uncertain [16].
In recent years, artificial intelligence (AI) and machine learning (ML) have demonstrated significant potential to revolutionize medical diagnostics. While the literature on their application to identify relevant biomarkers for specific pathologies remains limited, AI is anticipated to play a pivotal role in pathology workflows. Algorithms based on imaging and computational pathology enhance diagnostic accuracy, particularly in complex cases such as cancer, and automate tasks like immunohistochemical biomarker evaluation [17,18,19]. AI is also transforming pathology education through interactive training environments. However, its integration raises ethical concerns, including patient privacy and consent. Effective collaboration between pathologists and AI technologies is essential to ensure that these tools augment, rather than replace, professional expertise.
This study aims to illustrate how the integration of ML algorithms in emergency services can rapidly identify critical biomarkers for vital prognosis, with a particular focus on COVID-19 patients. By analyzing extensive epidemiological, clinical, analytical, and radiological data, the proposed approach seeks to enhance the identification of severe disease patterns.

2. Results

This section presents the results obtained from the patient records used for training and validation, focusing on the prediction of mortality in COVID-19 patients and on the identification of the factors with the greatest influence on mortality. The performance of the proposed system was compared with several widely used ML classification methods, using the standard metrics detailed in Table 1 and Table 2, and the most significant variables in the prediction were also analyzed. The results in Table 1 indicate that the Random Forest (RF) model developed in this study outperformed the other methods, achieving an accuracy of nearly 93%. Specifically, its accuracy was 5.44% higher than that of the k-Nearest Neighbors (KNN) algorithm and 15.56% higher than that of Gaussian Naive Bayes (GNB). In contrast, the Bayesian linear discriminant analysis (BLDA), Decision Trees (DT), and Support Vector Machines (SVM) methods recorded the lowest accuracy rates among those analyzed.
Additional metrics were assessed, including the Youden’s Index (DYI), Matthews Correlation Coefficient (MCC), Kappa Index, and Area Under the Curve (AUC). The MCC stands out as a robust statistical measure, as its score reflects the effectiveness of predictions considering all four categories of the confusion matrix (true negatives, false negatives, true positives, and false positives), as well as the balanced distribution of positive and negative instances in the dataset. As indicated in Table 2, the RF model presented in this study achieved an MCC close to 1, demonstrating superior accuracy in mortality prediction compared to other methods. Additionally, the Kappa Index, analyzed in Table 2, confirmed that the RF system significantly outperformed both KNN and GNB.
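As an illustration of how these metrics relate to the four cells of the confusion matrix, the following minimal Python sketch computes them from raw counts. The counts are hypothetical and are not taken from Table 1 or Table 2; the Youden index shown is the standard definition (sensitivity + specificity - 1), which may differ from the DYI variant reported by the authors.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Summary metrics from a binary confusion matrix.

    Assumes both classes are present (tp + fn > 0 and tn + fp > 0).
    """
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)   # recall on the positive class (e.g. death)
    specificity = tn / (tn + fp)   # recall on the negative class
    accuracy = (tp + tn) / total
    balanced_accuracy = (sensitivity + specificity) / 2
    youden_j = sensitivity + specificity - 1

    # Matthews correlation coefficient: uses all four cells.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "youden_j": youden_j,
        "mcc": mcc,
        "kappa": kappa,
    }

# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=140, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With imbalanced outcomes such as mortality, accuracy alone can look high while sensitivity is poor, which is why balanced accuracy, MCC, and kappa are reported alongside it.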
To enhance clarity in the representation of the results, the measurements of the training and test datasets (Figure 1) were grouped using radar charts. In this format, a complete circle on the grid indicates ideal performance across all metrics. It is crucial that the test set results are consistent with those of the training set, as significant differences could indicate overfitting.
In the analysis performed, the training data consistently achieved high scores across all metrics, while the test data also obtained solid results, though slightly lower, demonstrating the absence of overfitting. As shown in Figure 1, the proposed RF model excelled in both phases (training and testing) with balanced performance. On the other hand, while KNN and GNB showed similar results, the SVM algorithm demonstrated more limited predictive capacity.
The Receiver Operating Characteristic (ROC) curve was also generated to compare the performance of the proposed system with other ML methods. The results of different systems in predicting mortality variables are presented in Figure 2. In this figure, the proposed RF model achieved an AUC of 0.93, outperforming the KNN method, which obtained an AUC of 0.87, the closest in performance.
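The AUC can be read as the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor. The sketch below computes it directly from that definition; the labels and scores are made-up values, not study data, and the function name is only illustrative.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy example: 1 = died, 0 = survived; scores are predicted risks.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
print(roc_auc(labels, scores))
```

This pairwise form is quadratic in the number of patients, which is acceptable for cohorts of a few hundred records; sorting-based implementations give the same value faster.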
Despite optimization of the KNN model, the results consistently demonstrated that the RF model provides superior performance in key metrics such as accuracy, recall, F1 score, and AUC-ROC. This is attributed to RF's inherent ability to handle complex interactions between features and to its bagging approach, which aggregates many decorrelated decision trees to reduce variance, something that KNN, an instance-based method that classifies by the votes of the nearest training samples, does not do.
Furthermore, the proposed RF method offers several key advantages in ML. It is a robust and flexible model that combines multiple decision trees to improve accuracy and reduce the risk of overfitting. Through randomization in both sample selection and feature splitting for nodes, RF performs well with noisy data and irrelevant variables. Additionally, it is effective for classification problems and can handle large datasets with many features without requiring normalization. It also provides variable importance metrics, enhancing model interpretability.
The RF model also assigned weights to the most relevant variables for predicting mortality. The most significant variables included elevated levels of procalcitonin (PCT), patient age, and initial oxygen saturation measured in the emergency department (Figure 3). Other important factors identified were LDH, CRP, chest X-ray infiltrates, and D-dimer levels. Elevated PCT levels are associated with severe inflammatory responses and secondary bacterial infections, often signaling complications such as sepsis or organ dysfunction. In our study, conducted during the first wave, age behaved as a risk factor for mortality, although some authors have described differences in severity with the Omicron variant. Reduced oxygen saturation reflects hypoxemia, a critical marker of respiratory failure. Elevated LDH levels indicate a degree of tissue damage and are associated with increased lactate production, which often occurs in hypoxemic conditions, although LDH levels may also rise in other pathophysiological processes. Patients with severe asthma face an increased risk of respiratory complications due to pre-existing airway inflammation. Elevated CRP levels indicate a state of hyperinflammation, which may result from inflammatory, infectious, autoimmune, or other processes, while the need for oxygen therapy signifies advanced disease and significant pulmonary impairment. Together, these variables not only highlight key pathophysiological processes in COVID-19 but also enable effective risk stratification and timely clinical interventions [20,21,22].
The system also highlighted the importance of oxygen therapy and the presence of comorbidities. Biochemical values such as ALT and albumin levels had a moderate impact on the prediction.
The system has been validated with an external database from the General University Hospital of Valencia to verify the proposed method. During the same time period, consecutive patients who tested positive for SARS-CoV-2 via PCR, aged over 18, and presenting with symptoms associated with COVID-19 were included. This new dataset comprised 200 patients. As can be seen in Table 3 and Table 4, the RF algorithm maintains a similar behavior to that produced with the Hospital Virgen de la Luz data. Specifically, RF demonstrated consistent performance metrics, including accuracy, precision, specificity, and sensitivity.
On the other hand, other ML methods, such as DT, SVM, and KNN, exhibited a decline in accuracy and failed to match the performance of the RF algorithm in this external validation. These results confirm the robustness of the RF model and its capacity to generalize effectively across different datasets.
With these findings, the proposed system proves to be highly accurate in identifying critical biomarkers and predicting mortality in COVID-19 patients. This reinforces its utility as a practical tool that can assist physicians in emergency departments to optimize patient management and resource allocation during emerging pandemics.

3. Discussion

Several recent studies highlight the impact of AI in biomarker discovery, particularly in the fields of oncology and personalized medicine. Advanced AI tools, such as deep learning and spatial biology, are being employed to identify key biomarkers that predict tumor responses to treatments, including immunotherapies. For instance, these models analyze complex tissue structures and cellular interactions to enhance diagnosis and personalize treatments, excelling in recognizing biomarkers such as PD-L1 expression, which can predict the effectiveness of immunological therapies in patients [23,24].
In this context, AI also facilitates analyses that combine genomic and epigenomic data to identify gene expression patterns and deoxyribonucleic acid (DNA) methylation changes directly linked to cancer [25,26], advancing toward highly personalized treatment strategies based on the molecular profile of each patient.
Another application of AI is in the identification of imaging biomarkers in the field of pathology, where deep neural networks are capable of analyzing radiological and histological data to improve colorectal cancer detection rates [27,28] or provide more precise diagnoses of precancerous cervical lesions [29,30]. These tools help identify factors that predict metastasis or recurrence, enabling the design of individualized treatment plans. AI is accelerating these processes by automating and improving biomarker validation studies, streamlining the drug development pipeline.
These advances demonstrate how AI could revolutionize biomarker discovery, speeding up the development of more precise and personalized medical interventions.
In our study, we used the COVID-19 pandemic as a case example. The most influential variable identified was PCT, followed by LDH, CRP, and D-dimer. During a pandemic, it is crucial to quickly and accurately identify the biomarkers that have the most impact on mortality in order to combat the disease effectively. AI could provide this response right from the hospital emergency services.
To date, studies have investigated the relationship between PCT and bacterial co-infection in COVID-19 patients with viral pneumonia [21,31], but none has established this biomarker as a standalone prognostic marker. It was not until late 2023 that PCT began to be interpreted as an inflammation marker in severe disease [32,33].
PCT is a serum polypeptide found in minimal amounts (0.5 ng/mL) in plasma and can rise within hours in severe bacterial infections. Its synthesis is triggered by bacterial endotoxins and inflammatory cytokines, primarily interleukin (IL)-1beta, IL-6, and tumor necrosis factor-alpha [34]. Under normal conditions, it is primarily synthesized by thyroid C-cells. We believe that AI could have led to these results much earlier.
Our study also highlighted the importance of the LDH variable. LDH is an enzyme involved in lactate production, a byproduct generated when an organ experiences oxygen deprivation, as discovered later [35]. We believe that with the use of AI, we could have linked it to acute respiratory distress syndrome (ARDS) much earlier, closely monitoring patients at risk of poor outcomes, optimizing detection, and conducting early studies to achieve curative treatment. In an emerging pandemic, this early identification could change the natural history of a disease.
Admittedly, these markers may also reflect bacterial infections of other origin or tissue hypoxia arising from other pathologies. The most valuable feature we aim to leverage from AI is its ability to rapidly analyze large volumes of data, providing an initial approximation of the most significant biomarkers. These biomarkers might already be known or associated with a specific pathology; however, given that medicine has very few disease-specific markers, identifying their potential relevance to another disease could prove highly significant. This becomes even more critical in the context of an emerging disease, or of a known pathology exhibiting atypical behavior, that is approached directly from the hospital emergency department. Subsequent studies will be needed to achieve a more precise understanding of the disease; however, AI will assist in the initial approach.
Early identification of predictive variables in any emerging disease is crucial to saving lives, as it allows for the identification of patients at higher risk of complications before they manifest. Early identification facilitates the implementation of timely interventions, such as specific treatments, intensive monitoring, or preventive measures, which can make the difference between recovery and fatal outcomes. Additionally, understanding these variables in the emergency setting helps optimize medical resources, prioritizing care for those who need it most, and contributes to the development of evidence-based public health strategies. In emerging diseases, where time is critical and initial knowledge is limited, this predictive capability can significantly reduce mortality and mitigate the overall impact of the disease [36,37,38].
Our findings enhance the understanding of AI’s role in elucidating pathogenesis and assisting in the prediction and management of severe cases of a new disease.
On the other hand, the RF algorithm is a robust and widely used technique in supervised ML, owing to its high accuracy, generalization capability, and resistance to overfitting. RF combines multiple independent decision trees through a bagging process, which helps reduce model variance and improve stability. This strategy not only reduces the risk of overfitting but also provides greater generalization ability compared to individual decision trees [39,40].
One of the main advantages of RF is its ability to handle high-dimensional data and work effectively with missing or imbalanced data, making it well-suited for classification problems. Additionally, the model can capture nonlinear and complex relationships between variables without the need for parametric adjustments [40,41,42].
Another relevant benefit is model interpretability through feature importance estimation, an integrated function in RF that allows the identification of variables that contribute the most to predictions, which is valuable in applications where interpretability is crucial. Finally, the inherently parallel design of RF enables computationally efficient implementation, making it feasible to apply to large data volumes and tasks where precision and scalability are essential [40].
The results presented support the robustness and generalization capability of the proposed system, demonstrating its effectiveness for application in diverse clinical contexts without compromising accuracy or reliability. Its implementation in an additional hospital not only validates the model in an external setting but also underscores its applicability in real-world scenarios, enhancing the identification of critical biomarkers for the vital prognosis of COVID-19 patients.

4. Materials and Methods

4.1. Patients

The study took place at the Virgen de la Luz Hospital, the primary healthcare facility for the metropolitan area of Cuenca in Castilla-La Mancha, Spain. Between 2 March and 30 April 2020, more than 13,000 individuals visited the emergency department with symptoms indicative of COVID-19. Among these, 708 cases were selected from patients who tested positive for SARS-CoV-2 through molecular detection of the virus using transcription-mediated amplification (TMA) technology with Panther equipment on nasopharyngeal swabs and were aged over 18 years with symptoms associated with COVID-19. Patients were excluded if they presented outside this period, were minors, or could not undergo PCR testing for various reasons, including patient refusal, anatomical nasopharyngeal abnormalities that prevented proper sample collection, voluntary discharge requests, or discharge prior to testing.
This observational cross-sectional study was based on data gathered from a review of medical records from the emergency department. In total, 89 variables were collected: death or discharge, sore throat, dyspnea, diarrhea, fever, chest pain, asthenia, cough, headache, anosmia, myalgia, ageusia, Glasgow Coma Scale score, quick Sequential Organ Failure Assessment (qSOFA) score, PO2/FiO2 ratio, pH, partial pressure of oxygen (PO2), partial pressure of carbon dioxide (PCO2), pharmacological immunosuppression, institutionalization, smoking, vascular disease, active cancer, hypertension, obesity, chronic kidney disease, dyslipidemia, diabetes mellitus, chronic obstructive pulmonary disease (COPD), severe asthma, liver disease, thromboembolic disease, nationality, age, medical record number, sex, respiratory rate, oxygen saturation, systolic and diastolic blood pressure, fraction of inspired oxygen (FiO2), heart rate, body temperature, thoracic computed tomography (CT) findings, emergency department chest X-ray showing bilateral or unilateral infiltrates, total proteins, creatinine, albumin, troponin, leukocyte count, D-dimer, hemoglobin, lymphocyte count, platelet count, ferritin, prothrombin time (PT), CRP, LDH, alanine transaminase (ALT), PCT, antibiotics, immunomodulators (tocilizumab, cyclosporine, anakinra, baricitinib), antivirals (lopinavir/ritonavir, emtricitabine/tenofovir disoproxil, darunavir/cobicistat), corticosteroids (dexamethasone or methylprednisolone), hydroxychloroquine, low molecular weight heparin (LMWH), bronchodilators, oxygen therapy, dexamethasone (including dose), LMWH, methylprednisolone (including dose), duration of illness, hospitalization dates, and duration.
The study protocol was approved by the Clinical Research Ethics Committee of the Hospital Virgen de la Luz. All principles of the Declaration of Helsinki and the Spanish Data Protection Act 15/1999 were strictly observed, ensuring the anonymity of the patients. The physicians responsible for data collection were distinct from those who performed the subsequent analysis.

4.2. Artificial Intelligence Method

The integration of AI is transforming medicine by offering advanced solutions for clinical data analysis, early diagnosis, and the development of personalized treatments. With the capacity to process large volumes of information—such as medical imaging, genomic data, and health records—it enhances precision and accelerates clinical decision-making [43]. AI-based tools, including deep neural networks and ML models, have demonstrated their effectiveness in detecting complex diseases, identifying digital biomarkers, and stratifying patient risk [43,44,45,46]. AI not only enhances diagnostic accuracy but also enables the prediction of disease progression and the adaptation of therapeutic interventions to the individual characteristics of each patient, paving the way for more personalized, preventive, and precise medicine [47,48,49,50].
In this analysis, the RF algorithm was applied, an ensemble method that leverages the bagging aggregation approach to construct multiple decision trees independently, thereby reducing variance and improving model robustness. RF employs bootstrapping, or sampling with replacement, from the training data to generate a collection of independent trees, where each tree is trained on a random subset of the original dataset. Additionally, at each tree node, a random subset of features is selected instead of considering all variables. This introduces an additional source of randomness, reduces inter-tree correlation, and enhances the ensemble’s predictive accuracy [43,51,52].
During the training process, each tree independently makes decisions, and the results are combined using a majority-voting scheme for classification. Feature importance is assessed by analyzing the decrease in accuracy or the Gini index when the values of a feature are randomly permuted. This enables the identification of the most relevant variables for the model. Finally, model performance was evaluated using specific metrics such as accuracy, sensitivity, specificity, and the AUC, validating its predictive capacity and generalization to unseen data.
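The permutation idea can be sketched in a model-agnostic way: shuffle one feature at a time and record how much accuracy drops. The snippet below is a simplified illustration on a fixed toy dataset; the `predict` rule, the data, and the function names are hypothetical, and the study's MATLAB implementation may instead compute importance on out-of-bag samples.

```python
import random

def accuracy(predict, X, y):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn.

    X is a list of feature lists; predict maps one row to a class label.
    """
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Toy data: feature 0 determines the label, feature 1 is noise.
X = [[0.1, 5.0], [0.9, 1.0], [0.2, 3.0], [0.8, 2.0], [0.3, 4.0], [0.7, 6.0]]
y = [0, 1, 0, 1, 0, 1]
predict = lambda row: int(row[0] > 0.5)  # stand-in for a trained classifier

print(permutation_importance(predict, X, y))
```

A feature the model ignores gets an importance of zero, while a feature the model depends on shows a positive drop; this is the sense in which PCT, age, and oxygen saturation rank highest in Figure 3.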
In RF, an ensemble of decision trees $\{T_1, T_2, \ldots, T_m\}$ is constructed using the bagging approach. To build each tree $T_i$, the following steps are performed:
  • Bootstrapping (Sampling with Replacement): From a training dataset containing $n$ observations and $p$ features, a subset $D_i$ is generated by randomly selecting $n$ samples with replacement from the original dataset. This technique allows certain data points to appear multiple times in $D_i$, while others may not appear at all.
  • Random Feature Selection: At each node of each tree, instead of evaluating all $p$ features, a random subset of $k$ features is selected, where $k < p$. This reduces correlation between individual trees, enhancing the model's generalization capability.
  • Node Splitting Criterion: Each node is split based on an impurity reduction criterion, such as entropy or the Gini index, in classification tasks. In this study, the Gini index was used. The impurity $G$ of a node with class proportions $p_k$ over $K$ classes is defined as:
    $$G = 1 - \sum_{k=1}^{K} p_k^2$$
  • Tree Aggregation: Once the trees are trained, RF predictions are obtained through aggregation. For a set of trees $\{T_1, T_2, \ldots, T_m\}$, the final prediction $\hat{y}$ for an input $x$ is determined by majority voting:
    $$\hat{y} = \operatorname{mode}\{T_1(x), T_2(x), \ldots, T_m(x)\}$$
  • Feature Importance: The importance of each feature is measured by evaluating the change in the splitting criterion when the feature is randomly permuted in the dataset. For Gini index-based importance, a feature is considered important if permuting it increases node impurity across the trees.
  • Model Evaluation: The model’s performance was assessed using metrics such as accuracy, sensitivity, specificity, and AUC in classification problems or mean squared error (MSE) in regression tasks.
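The building blocks listed above can be sketched in a few lines of Python. This is not the authors' MATLAB implementation; the function names are illustrative, and the choice of 9 features per split (roughly the square root of 89) is a common default assumed here, not a value reported in the study.

```python
import random
from collections import Counter

def gini_impurity(labels):
    """G = 1 - sum_k p_k^2, where p_k is the share of class k in the node."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def bootstrap_sample(n, rng):
    """Indices of n draws with replacement from range(n)."""
    return [rng.randrange(n) for _ in range(n)]

def random_feature_subset(p, k, rng):
    """k distinct feature indices chosen out of p candidates for one split."""
    return rng.sample(range(p), k)

def majority_vote(predictions):
    """Most frequent class among the trees' predictions.

    Ties are resolved in favor of the class that was seen first.
    """
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)
print(gini_impurity([0, 0, 1, 1]))       # evenly mixed node
print(gini_impurity([1, 1, 1]))          # pure node
indices = bootstrap_sample(10, rng)      # some indices repeat, some are missing
print(sorted(indices))
print(random_feature_subset(89, 9, rng))
print(majority_vote([1, 0, 1]))          # three hypothetical tree outputs
```

A pure node has zero impurity and an evenly split two-class node has impurity 0.5, so a split is useful when the weighted impurity of the child nodes is lower than that of the parent.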
In this study, the proposed method underwent comprehensive evaluation by comparison with various ML techniques for classifying COVID-19 patients. Algorithms included in the comparative analysis were GNB [43], KNN [43], BLDA [43], SVM [43], and DT [43], along with the novel RF system developed in this study. The implementation and evaluation of the models were performed using MATLAB’s Statistics and Machine Learning Toolbox (version 2024a). To mitigate overfitting, a five-fold cross-validation strategy was employed. Data were split into two subsets, with 70% allocated for training and 30% for testing, ensuring independence between patient groups in each subset. Figure 4 schematically illustrates the study workflow, starting with patient selection and database creation, followed by the training and validation phases of the ML models.
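A minimal sketch of this partitioning scheme is shown below. It makes two assumptions the text does not confirm: that the 70/30 split is stratified by outcome, and that the five folds are drawn from the training portion only. The labels are synthetic and the function names are hypothetical.

```python
import random

def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split row indices into train/test, keeping class shares similar."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)

def k_fold_indices(indices, k=5, seed=0):
    """Yield (train_part, validation_part) index lists for k-fold CV."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation

# Synthetic outcome labels: 1 = died, 0 = discharged (not study data).
labels = [1] * 20 + [0] * 80
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
for fold_train, fold_val in k_fold_indices(train_idx, k=5):
    print(len(fold_train), len(fold_val))
```

Because each patient index appears in exactly one of the two subsets, no patient contributes to both training and testing, which is the independence property the paragraph describes.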
The advantages of the various ML methods used in the article are outlined below. SVM is a classification-focused algorithm designed to find an optimal hyperplane in a higher-dimensional space, maximizing the margin between different classes. Additionally, SVM effectively handles non-linear data through the use of the “kernel trick”, which transforms the data into a more manageable space [43,53]. BLDA extends the approach of Linear Discriminant Analysis (LDA) by incorporating additional probabilistic assumptions. This method assumes a multivariate normal distribution within each class and applies Bayesian methodologies, making it particularly valuable in scenarios where classes exhibit distinct distributions or variances [43,54]. DT are predictive models structured as trees, comprising a root node, internal nodes, and leaf nodes. The depth of the tree directly impacts its generalization capabilities, and pruning techniques are employed to mitigate overfitting. The construction process involves iteratively selecting features to partition the data, aiming to maximize homogeneity at each split [43,55]. GNB is a variant of the Naive Bayes model that assumes a Gaussian distribution for input features. Commonly used for classification tasks, GNB requires a labeled training dataset. Parameters of the Gaussian distribution are calculated for each class, and classification is performed using Bayes’ rule, providing a probabilistic estimation [43,56]. Finally, KNN is a supervised learning algorithm that relies on the majority vote of the k nearest neighbors for classification. It depends on a labeled training dataset, employing a selected distance metric and a specified value of k. The classification process determines the label for a new point by voting among its k nearest neighbors [43,57].
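To make the KNN description concrete, the following sketch classifies a new point by majority vote among its k nearest training points under Euclidean distance. The two-feature toy data and the function name are hypothetical; in practice features on different scales would usually be standardized first, since KNN is distance-based.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Label of `query` by majority vote among its k nearest neighbors."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(row, query), label) for row, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Two well-separated toy clusters, labeled 0 and 1.
train_X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_X, train_y, [0.5, 0.5]))
print(knn_predict(train_X, train_y, [5.5, 5.5]))
```

Unlike RF, this method stores the training set and defers all computation to prediction time, so its behavior depends heavily on the chosen distance metric and on k.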
The parameters of the models were optimized with a Bayesian approach. It generates a short sequence of simulated experiments with different combinations of the hyperparameters, keeping the values that present the best AUC and balanced accuracy. The configurations employed in this study are shown in Table 5.

5. Conclusions

The proposed algorithm has proven to be a pivotal tool in the discovery of biomarkers for emerging diseases in emergency departments, accelerating the analysis of large datasets and enabling the identification of critical factors for prognosis and treatment to guide personalized medicine. In the context of pandemics, implementing AI-based systems could facilitate the earlier identification of key biomarkers. In our study, the importance of PCT and LDH was demonstrated for prioritizing interventions in patients at vital risk. The proposed RF algorithm excels in analyzing complex patterns, integrating genomic and epigenomic data, and optimizing medical resources, contributing to the design of personalized strategies that enhance diagnostic precision and the management of severe cases. This technology not only accelerates biomarker validation but also fosters the development of innovative and tailored treatments, underscoring its significance in transforming medical responses to emerging pathologies and substantially reducing mortality.

Author Contributions

Conceptualization, N.J.G., F.G.-M. and J.M.; methodology, N.J.G., P.B.-S., F.G.-M., A.M.T. and J.M.; formal analysis, N.J.G., S.L., A.P., F.G.-M. and J.M.; investigation, N.J.G., P.B.-S., F.G.-M., S.L., A.P., A.M.T. and J.M.; writing—original draft preparation, N.J.G., F.G.-M., A.M.T., A.P., P.B.-S. and J.M.; writing—review and editing, N.J.G., A.M.T., P.B.-S., F.G.-M., S.L., A.P. and J.M.; supervision, F.G.-M. and J.M.; project administration, N.J.G. and J.M.; funding acquisition, N.J.G., F.G.-M. and J.M. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported financially by the University of Castilla-La Mancha, the Diputación Provincial de Cuenca, and Fundación Investigación Hospital General Universitario de Valencia (Spain).

Institutional Review Board Statement

This research was approved by the Ethics Committee of the Hospital Virgen de la Luz.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The datasets used and/or analyzed during the present study are available from the corresponding author upon reasonable request.

Acknowledgments

This study was supported by the Institute of Technology (University of Castilla-La Mancha) and the Virgen de la Luz Hospital in Cuenca (Spain).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Yuan, Y.; Jiao, B.; Qu, L.; Yang, D.; Liu, R. The development of COVID-19 treatment. Front. Immunol. 2023, 14, 1125246.
  2. Ao, D.; He, X.; Liu, J.; Xu, L. Strategies for the development and approval of COVID-19 vaccines and therapeutics in the post-pandemic period. Signal Transduct. Target. Ther. 2023, 8, 466.
  3. Zhu, Y.; Sharma, L.; Chang, D. Pathophysiology and clinical management of coronavirus disease (COVID-19): A mini-review. Front. Immunol. 2023, 14, 1116131.
  4. Anand, U.; Jakhmola, S.; Indari, O.; Jha, H.C.; Chen, Z.-S.; Tripathi, V.; de la Lastra, J.M.P. Potential therapeutic targets and vaccine development for SARS-CoV-2/COVID-19 pandemic management: A review on the recent update. Front. Immunol. 2021, 12, 658519.
  5. Zhang, H.-P.; Sun, Y.-L.; Wang, Y.-F.; Yazici, D.; Azkur, D.; Ogulur, I.; Azkur, A.K.; Yang, Z.; Chen, X.; Zhang, A.; et al. Recent developments in the immunopathology of COVID-19. Allergy 2022, 78, 369–388.
  6. Franco-Moreno, A.I.; Bustamante-Fermosel, A.; Ruiz-Giardin, J.M.; Muñoz-Rivas, N.; Torres-Macho, J.; Brown-Lavalle, D. Utility of probability scores for the diagnosis of pulmonary embolism in patients with SARS-CoV-2 infection: A systematic review. Rev. Clin. Esp. 2022, 223, 40–49.
  7. Franco-Moreno, A.; Palma-Huerta, E.; Fernández-Vidal, E.; Madroñal-Cerezo, E.; Marco-Martínez, J.; Romero-Pareja, R.; Izquierdo-Martínez, A.; Carpintero-García, L.; Ruiz-Giardín, J.M.; Torres-Macho, J.; et al. External validation of the CHEDDAR score for suspected pulmonary embolism in patients with SARS-CoV-2 infection in an independent cohort. J. Thromb. Thrombolysis 2024, 57, 352–357.
  8. Vielhauer, J.; Benesch, C.; Pernpruner, A.; Johlke, A.L.; Hellmuth, J.C.; Muenchhoff, M.; Scherer, C.; Fink, N.; Sabel, B.; Schulz, C.; et al. How to exclude pulmonary embolism in patients hospitalized with COVID-19: A comparison of predictive scores. Thromb. J. 2023, 21, 51.
  9. Mousavi Aghdam, M.; Crowley, Q. Application of GIS and spatiotemporal analyses in viral infection modelling using multiple datasets—A case study on the SARS-CoV-2 epidemic. Semergen 2024, 50, 102159.
  10. Marín-Benesiu, F.; Chica-Redecillas, L.; Arenas-Rodríguez, V.; de Santiago, E.; Martínez-Diz, S.; López-Torres, G.; Cortés-Valverde, A.I.; Romero-Cachinero, C.; Entrala-Bernal, C.; Fernandez-Rosado, F.J.; et al. The T-cell repertoire of Spanish patients with COVID-19 as a strategy to link T-cell characteristics to the severity of the disease. Hum. Genom. 2024, 18, 94.
  11. Xu, J.; Li, X.X.; Yuan, N.; Li, C.; Yang, J.G.; Cheng, L.M.; Lu, Z.-X.; Hou, H.-Y.; Zhang, B.; Hu, H.; et al. T cell receptor β repertoires in patients with COVID-19 reveal disease severity signatures. Front. Immunol. 2023, 14, 1190844.
  12. Martinez-Diz, S.; Morales-Álvarez, C.M.; Garcia-Iglesias, Y.; Guerrero-González, J.M.; Romero-Cachinero, C.; González-Cabezuelo, J.M.; Fernandez-Rosado, F.J.; Arenas-Rodríguez, V.; Lopez-Cintas, R.; Alvarez-Cubero, M.J.; et al. Analyzing the role of ACE2, AR, MX1 and TMPRSS2 genetic markers for COVID-19 severity. Hum. Genom. 2023, 17, 50.
  13. Meseldžić, N.; Prnjavorac, B.; Dujić, T.; Malenica, M.; Glamočlija, U.; Prnjavorac, L.; Bedak, O.; Kadrić, S.I.; Marjanović, D.; Bego, T. Association of ACE2 and TMPRSS2 genes variants with disease severity and most important biomarkers in COVID-19 patients in Bosnia and Herzegovina. Croat. Med. J. 2024, 65, 220.
  14. Alam, N.; Lodhi, G.M.; Khan, U.A.; Zia, A.; Azam, M.; Khan, J.; Shah, T.A.; Okla, M.K.; Ali younous, Y.; Bourhia, M. Association of ACE2 and TMPRSS2 towards COVID-19 susceptibility. Discov. Life 2024, 54, 6.
  15. Vera-Lastra, O.; Mora, G.; Lucas-Hernández, A.; Ordinola-Navarro, A.; Rodríguez-Chávez, E.; Peralta-Amaro, A.L.; Medina, G.; Cruz-Dominguez, M.P.; Jara, L.J.; Shoenfeld, Y. New onset autoimmune diseases after the Sputnik vaccine. Biomedicines 2023, 11, 1898.
  16. Barajas Galindo, D.E.; Ramos Bachiller, B.; González Roza, L.; García Ruiz de Morales, J.M.; Sánchez Lasheras, F.; González Arnáiz, E.; Ariadel Cobo, D.; Ballesteros Pomar, M.D.; Rodríguez, I.C. Increased incidence of Graves’ disease during the SARS-CoV2 pandemic. Clin. Endocrinol. 2022, 98, 730–737.
  17. Sandeep, F.; Kiran, N.; Rahaman, Z.; Devi, P.; Bendari, A. Pathology in the age of artificial intelligence (AI): Redefining roles and responsibilities for tomorrow’s practitioners. Cureus 2024, 16, e56040.
  18. Lu, H.; Li, L.; Ong, K.; Wang, Y.; Jiao, Y.; Wang, X.; Cai, C.; Zhang, J.; Hou, J.; Zhao, H.; et al. AI-Based Computational Pathology and Its Contribution to Precision Medicine. In Frontiers in Bioimage Informatics Methodology; World Scientific: Singapore, 2024; pp. 167–193.
  19. Casado, P.; Cutillas, P.R. Proteomic characterization of acute myeloid leukemia for precision medicine. Mol. Cell. Proteom. MCP 2023, 22, 100517.
  20. Balzanelli, M.G.; Distratis, P.; Dipalma, G.; Vimercati, L.; Catucci, O.; Amatulli, F.; Cefalo, A.; Lazzaro, R.; Palazzo, D.; Aityan, S.K.; et al. Immunity profiling of COVID-19 infection, dynamic variations of lymphocyte subsets, a comparative analysis on four different groups. Microorganisms 2021, 9, 2036.
  21. Galli, F.; Bindo, F.; Motos, A.; Fernández-Barat, L.; Barbeta, E.; Gabarrús, A.; Ceccato, A.; Bermejo-Martin, J.F.; Ferrer, R.; Riera, J.; et al. Procalcitonin and C-reactive protein to rule out early bacterial coinfection in COVID-19 critically ill patients. Intensive Care Med. 2023, 49, 934–945.
  22. Balzanelli, M.G.; Distratis, P.; Lazzaro, R.; Cefalo, A.; Catucci, O.; Aityan, S.K.; Dipalma, G.; Vimercati, L.; Inchingolo, A.D.; Maggiore, M.E.; et al. The Vitamin D, IL-6 and the eGFR markers a possible way to elucidate the lung–heart–kidney cross-talk in COVID-19 disease: A foregone conclusion. Microorganisms 2021, 9, 1903.
  23. Vranic, S.; Gatalica, Z. PD-L1 testing by immunohistochemistry in immuno-oncology. Biomol. Biomed. 2023, 23, 15–25.
  24. Hao, L.; Li, S.; Deng, J.; Li, N.; Yu, F.; Jiang, Z.; Zhang, J.; Shi, X.; Hu, X. The current status and future of PD-L1 in liver cancer. Front. Immunol. 2023, 14, 1323581.
  25. Thomas, J.; Klebanov, A.; John, S.; Miller, L.S.; Vegesna, A.; Amdur, R.L.; Bhowmick, K.; Mishra, L. CEACAMS 1, 5, and 6 in disease and cancer: Interactions with pathogens. Genes Cancer 2023, 14, 12–29.
  26. Ma, R.X.; Wei, J.R.; Hu, Y.W. Characteristics of Carcinoembryonic Antigen-Related Cell Adhesion Molecules and Their Relationship to Cancer. Mol. Cancer Ther. 2024, 23, 939–948.
  27. Gan, P.; Li, P.; Xia, H.; Zhou, X.; Tang, X. The application of artificial intelligence in improving colonoscopic adenoma detection rate: Where are we and where are we going. Gastroenterol. Hepatol. 2023, 46, 203–213. (In English & Spanish)
  28. Chow, K.W.; Bell, M.T.; Cumpian, N.; Amour, M.; Hsu, R.H.; Eysselein, V.E.; Srivastava, N.; Fleischman, M.W.; Reicher, S. Long-term impact of artificial intelligence on colorectal adenoma detection in high-risk colonoscopy. World J. Gastrointest. Endosc. 2024, 16, 335.
  29. Ruiz, L.M.; Chahla, R.E.; Vega, I.M.; Ortega, E.S.; Barrenechea, G.G.; Contreras, M.F. Artificial Intelligence: Accuracy for the diagnosis of precancerous lesions of the cervix. Medicina 2024, 84, 459–467. (In Spanish)
  30. Kim, S.; An, H.; Cho, H.W.; Min, K.J.; Hong, J.H.; Lee, S.; Song, J.Y.; Lee, J.K.; Lee, N.W. Pivotal Clinical Study to Evaluate the Efficacy and Safety of Assistive Artificial Intelligence-Based Software for Cervical Cancer Diagnosis. J. Clin. Med. 2023, 12, 4024.
  31. Shi, J.; Zhuo, Y.; Wang, T.Q.; Lv, C.E.; Yao, L.H.; Zhang, S.Y. Procalcitonin and C-reactive protein as diagnostic biomarkers in COVID-19 and Non-COVID-19 sepsis patients: A comparative study. BMC Infect. Dis. 2024, 24, 45.
  32. Al-Janabi, G.; Al-Fahham, A.; Alsaedi, A.N.N.; Al-Amery, A.Y.K. Correlation between hepcidin and procalcitonin and their diagnostic role in patients with COVID-19. Wiad. Lek. 2023, 76, 65–70.
  33. Gugo, K.; Tandara, L.; Juricic, G.; Pavicic Ivelja, M.; Rumora, L. Effects of Hypoxia and Inflammation on Hepcidin Concentration in Non-Anaemic COVID-19 Patients. J. Clin. Med. 2024, 13, 3201.
  34. Julián-Jiménez, A.; García, D.E.; Merinos-Sánchez, G.; de Guadiana-Romualdo, L.G.; del Castillo, J.G. Diagnostic accuracy of procalcitonin for bacteremia in the emergency department: A systematic review. Rev. Esp. Quimioter. 2023, 37, 29–42.
  35. Wang, S.; Liu, J.; Hu, S.; Mao, Y. LDH and NLR, as inflammatory markers, the independent risk factors for COVID-19 complicated with respiratory failure in elderly patients. Pak. J. Med. Sci. 2024, 40, 2112–2117.
  36. Murdoch, B. Privacy and artificial intelligence: Challenges for protecting health information in a new era. BMC Med. Ethics 2021, 22, 122.
  37. Gerke, S.; Minssen, T.; Cohen, G. Ethical and legal challenges of artificial intelligence-driven healthcare. In Artificial Intelligence in Healthcare; Academic Press: Cambridge, MA, USA, 2020; pp. 295–336.
  38. Linkeviciute, A.; Curigliano, G.; Peccatori, F.A.; Pakutinskas, P. The regulatory impact of a harmonized artificial intelligence regulation proposal on the clinical research landscape in the European Union. BioLaw J.-Riv. BioDiritto 2022, 509–524.
  39. Gaïffas, S.; Merad, I.; Yu, Y. WildWood: A new random forest algorithm. IEEE Trans. Inf. Theory 2023, 69, 6586–6604.
  40. Chen, X.; Yu, D.; Zhang, X. Optimal weighted random forests. arXiv 2023, arXiv:2305.10042.
  41. Rothacher, Y.; Strobl, C. Identifying informative predictor variables with random forests. J. Educ. Behav. Stat. 2024, 49, 595–629.
  42. Fife, D.A.; D’Onofrio, J. Common, uncommon, and novel applications of random forest in psychological research. Behav. Res. Methods 2023, 55, 2447–2466.
  43. Han, J.; Pei, J.; Tong, H. Data Mining: Concepts and Techniques; Morgan Kaufmann: San Francisco, CA, USA, 2022.
  44. Gil-Rojas, S.; Suárez, M.; Martínez-Blanco, P.; Torres, A.M.; Martínez-García, N.; Blasco, P.; Torralba, M.; Mateo, J. Application of Machine Learning Techniques to Assess Alpha-Fetoprotein at Diagnosis of Hepatocellular Carcinoma. Int. J. Mol. Sci. 2024, 25, 1996.
  45. Escobar-Ipuz, F.A.; Torres, A.M.; García-Jiménez, M.A.; Basar, C.; Cascón, J.; Mateo, J. Prediction of patients with idiopathic generalized epilepsy from healthy controls using machine learning from scalp EEG recordings. Brain Res. 2023, 1798, 148131.
  46. Soria, C.; Arroyo, Y.; Torres, A.M.; Redondo, M.Á.; Basar, C.; Mateo, J. Method for classifying schizophrenia patients based on machine learning. J. Clin. Med. 2023, 12, 4375.
  47. Mora, D.; Nieto, J.A.; Mateo, J.; Bikdeli, B.; Barco, S.; Trujillo-Santos, J.; Soler, S.; Font, L.; Bosevski, M.; Monreal, M.; et al. Machine learning to predict outcomes in patients with acute pulmonary embolism who prematurely discontinued anticoagulant therapy. Thromb. Haemost. 2022, 122, 570–577.
  48. Suárez, M.; Gil-Rojas, S.; Martínez-Blanco, P.; Torres, A.M.; Ramón, A.; Blasco-Segura, P.; Torralba, M.; Mateo, J. Machine Learning-Based Assessment of Survival and Risk Factors in Non-Alcoholic Fatty Liver Disease-Related Hepatocellular Carcinoma for Optimized Patient Management. Cancers 2024, 16, 1114.
  49. Johnson, K.B.; Wei, W.Q.; Weeraratne, D.; Frisse, M.E.; Misulis, K.; Rhee, K.; Zhao, J.; Snowdon, J.L. Precision medicine, AI, and the future of personalized health care. Clin. Transl. Sci. 2021, 14, 86–93.
  50. Suárez, M.; Martínez, R.; Torres, A.M.; Ramón, A.; Blasco, P.; Mateo, J. Personalized risk assessment of hepatic fibrosis after cholecystectomy in metabolic-associated steatotic liver disease: A machine learning approach. J. Clin. Med. 2023, 12, 6489.
  51. Probst, P.; Wright, M.N.; Boulesteix, A.L. Hyperparameters and tuning strategies for random forest. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2019, 9, e1301.
  52. Zhu, M.; Xia, J.; Jin, X.; Yan, M.; Cai, G.; Yan, J.; Ning, G. Class weights random forest algorithm for processing class imbalanced medical data. IEEE Access 2018, 6, 4641–4652.
  53. Valkenborg, D.; Rousseau, A.J.; Geubbelmans, M.; Burzykowski, T. Support vector machines. Am. J. Orthod. Dentofac. Orthop. 2023, 164, 754–757.
  54. Rajaguru, H.; Kumar Prabhakar, S. Bayesian Linear Discriminant Analysis for Breast Cancer Classification. In Proceedings of the 2017 2nd International Conference on Communication and Electronics Systems (ICCES), Coimbatore, India, 19–20 October 2017; pp. 266–269. Available online: https://ieeexplore.ieee.org/abstract/document/8321279 (accessed on 23 January 2024).
  55. Kourou, K.; Exarchos, T.P.; Exarchos, K.P.; Karamouzis, M.V.; Fotiadis, D.I. Machine learning applications in cancer prognosis and prediction. Comput. Struct. Biotechnol. J. 2014, 13, 8–17.
  56. Chaplot, N.; Pandey, D.; Kumar, Y.; Sisodia, P.S. A Comprehensive Analysis of Artificial Intelligence Techniques for the Prediction and Prognosis of Genetic Disorders Using Various Gene Disorders. Arch. Comput. Methods Eng. 2023, 30, 3301–3323.
  57. Uddin, S.; Haque, I.; Lu, H.; Moni, M.A.; Gide, E. Comparative performance analysis of K-nearest neighbour (KNN) algorithm and its different variants for disease prediction. Sci. Rep. 2022, 12, 6256.
Figure 1. (a) illustrates the results of the training phase using a radar plot to display all the parameters analyzed in the study, where a larger area indicates higher predictive accuracy. (b) presents the results of the testing phase, also in radar plot format, allowing for a direct comparison between the algorithms evaluated.
Figure 2. This figure presents the ROC curves for the six ML algorithms that were evaluated.
Figure 3. The figure presents a histogram showcasing the key parameters that contribute to the prediction of mortality in emergency COVID-19 patients.
Figure 4. The figure illustrates the learning and validation stages employed in this study.
Table 1. The table displays the average values and standard deviations of specificity, recall, MCC, AUC, and F1 score.

| Method | Recall | Specificity | MCC | AUC | F1 Score |
|---|---|---|---|---|---|
| SVM | 84.85 ± 0.87 | 84.65 ± 0.85 | 75.20 ± 0.82 | 0.85 ± 0.02 | 84.49 ± 0.88 |
| BLDA | 82.03 ± 0.96 | 81.83 ± 1.03 | 72.70 ± 0.85 | 0.82 ± 0.02 | 81.68 ± 1.01 |
| DT | 83.84 ± 0.91 | 83.64 ± 0.93 | 74.31 ± 0.92 | 0.84 ± 0.02 | 83.49 ± 0.93 |
| GNB | 77.16 ± 1.08 | 76.98 ± 1.10 | 68.39 ± 1.05 | 0.77 ± 0.02 | 76.84 ± 1.04 |
| KNN | 87.28 ± 0.74 | 87.07 ± 0.75 | 77.35 ± 0.78 | 0.87 ± 0.01 | 86.91 ± 0.73 |
| RF | 92.72 ± 0.51 | 92.50 ± 0.48 | 82.18 ± 0.45 | 0.93 ± 0.01 | 92.34 ± 0.49 |
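For readers who want to relate the columns of Table 1 to raw predictions, the sketch below computes recall, specificity, MCC, and F1 from a binary confusion matrix, and AUC from predicted scores, using their standard textbook definitions. It is an illustrative assumption, not the evaluation code of the study; the function names, the toy labels and scores, and the 0.5 decision threshold are hypothetical. The values are returned on a 0 to 1 scale, whereas the table appears to report all columns except AUC as percentages.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, with 1 as the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, fp, tn, fn

def table1_metrics(tp, fp, tn, fn):
    """Recall, specificity, MCC, and F1 from confusion-matrix counts."""
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {"recall": recall, "specificity": specificity, "mcc": mcc, "f1": f1}

def auc_from_scores(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (hypothetical): predicted mortality scores for eight patients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = table1_metrics(*confusion_counts(y_true, y_pred))
auc = auc_from_scores(y_true, scores)
print(metrics)  # recall, specificity and f1 are 0.75; mcc is 0.5
print(auc)      # 0.9375
```

The pairwise AUC used here is equivalent to the area under the ROC curve, but it is quadratic in the number of patients, so a rank-based formula would be preferable on large cohorts.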
Table 2. The table presents the average values and standard deviations of accuracy, precision, Kappa, and DYI.

| Method | Accuracy | Precision | Kappa | DYI |
|---|---|---|---|---|
| SVM | 84.75 ± 0.83 | 84.14 ± 0.84 | 75.45 ± 0.81 | 84.75 ± 0.82 |
| BLDA | 81.93 ± 0.99 | 81.35 ± 0.98 | 72.94 ± 0.95 | 81.93 ± 1.02 |
| DT | 83.74 ± 0.92 | 83.15 ± 0.90 | 74.55 ± 0.89 | 83.74 ± 0.91 |
| GNB | 77.07 ± 1.05 | 76.52 ± 1.03 | 68.61 ± 1.02 | 77.07 ± 1.06 |
| KNN | 87.17 ± 0.76 | 86.55 ± 0.74 | 77.61 ± 0.76 | 87.17 ± 0.75 |
| RF | 92.61 ± 0.49 | 91.95 ± 0.48 | 82.45 ± 0.46 | 92.61 ± 0.48 |
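The following sketch shows how accuracy, balanced accuracy, precision, and Cohen's kappa are conventionally derived from a binary confusion matrix. It is an assumption-laden illustration, not the study's code: the function name and the counts are hypothetical, and DYI is left out because its formula is not given in this section. Both plain and balanced accuracy are computed because the abstract describes the RF figure of 92.61% as balanced accuracy.

```python
def table2_metrics(tp, fp, tn, fn):
    """Accuracy, balanced accuracy, precision, and Cohen's kappa from counts."""
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    balanced_accuracy = 0.5 * (tp / (tp + fn) + tn / (tn + fp))
    precision = tp / (tp + fp)
    # Cohen's kappa corrects the observed agreement for agreement expected by chance.
    p_observed = accuracy
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (p_observed - p_chance) / (1 - p_chance)
    return {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "precision": precision,
        "kappa": kappa,
    }

# Hypothetical counts for 100 patients: 45 deaths (positives) and 55 survivors.
result = table2_metrics(tp=40, fp=10, tn=45, fn=5)
print(result)  # accuracy 0.85, precision 0.8, kappa 0.7
```

With these counts the balanced accuracy (about 0.854) is close to the plain accuracy because the two classes are of similar size; the gap widens as the classes become more imbalanced.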
Table 3. The table displays the average values and standard deviations of specificity, recall, MCC, AUC, and F1 score for external validation.

| Method | Recall | Specificity | MCC | AUC | F1 Score |
|---|---|---|---|---|---|
| SVM | 82.95 ± 0.82 | 82.67 ± 0.80 | 73.51 ± 0.83 | 0.82 ± 0.02 | 82.76 ± 0.83 |
| BLDA | 79.83 ± 1.02 | 79.89 ± 1.05 | 71.02 ± 0.91 | 0.79 ± 0.02 | 79.76 ± 1.03 |
| DT | 81.56 ± 0.96 | 81.48 ± 0.97 | 69.86 ± 0.95 | 0.81 ± 0.02 | 81.37 ± 0.94 |
| GNB | 74.99 ± 1.09 | 74.87 ± 1.12 | 66.59 ± 1.07 | 0.74 ± 0.02 | 74.78 ± 1.06 |
| KNN | 85.23 ± 0.76 | 85.19 ± 0.76 | 76.52 ± 0.72 | 0.85 ± 0.01 | 85.31 ± 0.78 |
| RF | 91.84 ± 0.52 | 91.35 ± 0.51 | 81.09 ± 0.49 | 0.91 ± 0.01 | 91.57 ± 0.51 |
Table 4. The table presents the average values and standard deviations of accuracy, precision, Kappa, and DYI for external validation.

| Method | Accuracy | Precision | Kappa | DYI |
|---|---|---|---|---|
| SVM | 82.89 ± 0.89 | 82.67 ± 0.91 | 73.58 ± 0.88 | 82.54 ± 0.89 |
| BLDA | 79.90 ± 1.02 | 79.96 ± 1.03 | 71.01 ± 0.97 | 79.87 ± 1.04 |
| DT | 81.39 ± 0.96 | 81.42 ± 0.94 | 70.13 ± 0.93 | 81.37 ± 0.95 |
| GNB | 75.06 ± 1.08 | 75.48 ± 1.06 | 66.53 ± 1.04 | 75.02 ± 1.07 |
| KNN | 85.34 ± 0.78 | 85.27 ± 0.79 | 75.93 ± 0.77 | 85.22 ± 0.78 |
| RF | 91.75 ± 0.52 | 91.47 ± 0.51 | 81.34 ± 0.51 | 91.57 ± 0.52 |
Table 5. Main hyperparameters of the machine learning algorithms evaluated in the study.

| Method | Parameters |
|---|---|
| SVM | Kernel function: Gaussian; Sigma = 0.5; C = 1.0; Numerical tolerance = 0.001; Iteration limit = 100 |
| DT | Minimum number of instances in leaves = 4; Minimum number of instances in internal nodes = 6; Maximum depth = 100 |
| BLDA | Kernel: Bayesian |
| GNB | Usekernel: False; fL = 0; Adjust = 0 |
| KNN | Number of neighbours = 20; Distance metric: Euclidean; Weight: Uniform |
| RF | Number of estimators: 120; Maximum_depth: 20; Minimum_samples_split: 10; Minimum_samples_leaf: 4; Maximum_features: ‘sqrt’ |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Garrido, N.J.; González-Martínez, F.; Torres, A.M.; Blasco-Segura, P.; Losada, S.; Plaza, A.; Mateo, J. Role of Artificial Intelligence in Identifying Vital Biomarkers with Greater Precision in Emergency Departments During Emerging Pandemics. Int. J. Mol. Sci. 2025, 26, 722. https://doi.org/10.3390/ijms26020722

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
