Article

Impact and Effectiveness of COVID-19 Vaccines Based on Machine Learning Analysis of a Time Series: A Population-Based Study

by Rafael Garcia-Carretero 1,*,†, Maria Ordoñez-Garcia 2,*,†, Oscar Vazquez-Gomez 1, Belen Rodriguez-Maya 1, Ruth Gil-Prieto 3 and Angel Gil-de-Miguel 3
1 Internal Medicine Department, Mostoles University Hospital, Rey Juan Carlos University, 29835 Mostoles, Spain
2 Hematology Department, Mostoles University Hospital, Rey Juan Carlos University, 29835 Mostoles, Spain
3 Department of Preventive Medicine and Public Health, Rey Juan Carlos University, 28933 Madrid, Spain
* Authors to whom correspondence should be addressed.
† These authors contributed equally to this work.
J. Clin. Med. 2024, 13(19), 5890; https://doi.org/10.3390/jcm13195890
Submission received: 13 September 2024 / Revised: 28 September 2024 / Accepted: 30 September 2024 / Published: 2 October 2024
(This article belongs to the Section Epidemiology & Public Health)

Abstract

Background: Although confirmed cases of infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) have been declining since late 2020 due to general vaccination, little research has been performed regarding the impact of vaccines against SARS-CoV-2 in Spain in terms of hospitalizations and deaths. Objective: Our aim was to identify the reduction in severity and mortality of coronavirus disease 2019 (COVID-19) at a nationwide level due to vaccination. Methods: We designed a retrospective, population-based study to define waves of infection and to describe the characteristics of the hospitalized population. We also studied the rollout of vaccination and its relationship with the decline in hospitalizations and deaths. Finally, we developed two mathematical models to estimate non-vaccination scenarios using machine learning modeling (with the ElasticNet and RandomForest algorithms). The vaccination and non-vaccination scenarios were eventually compared to estimate the number of averted hospitalizations and deaths. Results: In total, 498,789 patients were included, with a global mortality of 14.3%. We identified six waves or epidemic outbreaks during the observed period. We established a strong relationship between the beginning of vaccination and the decline in both hospitalizations and deaths due to COVID-19 in all age groups. We also estimated that vaccination prevented 170,959 hospitalizations (CI 95% 77,844–264,075) and 24,546 deaths (CI 95% 2548–46,543) in Spain between March 2021 and December 2021. We estimated a global reduction of 9.19% in total deaths during the first year of COVID-19 vaccination. Conclusions: Demographic and clinical profiles changed over the first months of the pandemic. In Spain, patients over 80 years old and other age groups obtained clinical benefit from early vaccination. The severity of COVID-19, in terms of hospitalizations and deaths, decreased due to vaccination. 
Our use of machine learning models provided a detailed estimation of the averted burden of the pandemic, demonstrating the effectiveness of vaccination at a population-wide level.

1. Introduction

The coronavirus disease 2019 (COVID-19) pandemic has had a significant impact on the health of the population, as well as significant implications in all sectors of society and the daily lives of citizens [1,2,3,4]. It is claimed that high levels of vaccination coverage, the characteristics of the omicron variant, and increased diagnostic testing likely contributed to the observed impact of the pandemic in the last months of 2021. In addition, there was a very high incidence of confirmed cases, but a majority of these had mild symptoms or were asymptomatic. This placed a significant strain on primary health care rather than hospitals. Therefore, the occupancy percentage of hospital and intensive care unit (ICU) beds was much lower than expected relative to what occurred over the remainder of the pandemic [5,6,7,8].
By February 2022, more than 92% of the Spanish population over the age of 12 was fully vaccinated [9]. Current evidence indicates that the various COVID-19 vaccines have achieved high levels of effectiveness in restricting moderate and severe forms of the disease and reducing lethality. Vaccines, despite reducing the probability of infection, are less effective at completely preventing virus replication in the upper respiratory mucosa of a vaccinated individual, which means that transmission is possible from vaccinated individuals who have been infected, even if the disease is mild or asymptomatic [10,11,12,13,14,15]. This makes it infeasible to aim for the virus’s eradication at present. Therefore, researchers should focus their efforts toward reducing the severity of infections while maintaining a level of transmission that is manageable and does not generate an excessive burden on the healthcare system.
As noted, due to the increase in vaccination coverage and the immunity generated from natural infections, the majority of the population is protected against severe COVID-19 [16]. Data show that protection has been maintained, even against a variant antigenically different enough from the previous ones to produce very high incidence rates in the population that previously had immunity.
Observational studies, such as case–control or cohort studies, are not always feasible, so several studies using alternative approaches have been conducted to demonstrate the effectiveness of vaccination [8,17]. Likewise, several meta-analyses have studied the effectiveness of vaccination from the following three perspectives: efficacy against infection, efficacy against severe disease (and, hence, reduction in risk of hospital admission), and ability to reduce the transmissibility of vaccinated individuals who become infected [13,15,16,18]. However, the impact of vaccination in terms of decreasing hospitalizations and deaths has not yet been investigated in a nationwide, population-based, epidemiological study in Spain.
Given the unique characteristics of the Spanish healthcare system and the country’s age-stratified vaccination strategy, studying Spain offers an opportunity to understand the differential impact of vaccination across diverse demographic groups, contributing insights that are not directly generalizable to other populations.

Hypothesis and Objectives of Our Research

We designed a population-based study to assess vaccination as a major public health intervention. By this means, we investigated whether vaccines have been beneficial to the Spanish population. Our research objective was to determine whether vaccination reduced the number of hospitalizations and deaths. We conducted our study in three stages. First, we described the differences between two periods, namely the first months of the pandemic, during which no vaccination was present, and the last months of 2021, when a high proportion of the Spanish population was vaccinated. Secondly, we compared trends in hospitalizations and deaths with the vaccination rate. Finally, we assessed the effectiveness of vaccines against severe disease in terms of the reduction in hospitalizations and mortality due to COVID-19. We estimated the number of averted hospitalizations and deaths. We also compared the evolution of the pandemic across the following two scenarios: vaccination (the observed scenario) and non-vaccination (an estimated scenario). The estimated scenario was fitted using time series and machine learning analyses.

2. Materials and Methods

2.1. Data Collection and Study Design

We designed a retrospective, population-based study using data collected from electronic health records. We collected data from the Spanish Minimum Basic Data Set at Hospitalization (MBDS-H), provided by the Spanish Ministry of Health [19]. We also collected data related to COVID-19 vaccination in the European Union/European Economic Area (EU/EEA) from the European Centre for Disease Prevention and Control [20]. Figure A1 shows a flow chart of the study.
MBDS-H is a mandatory administrative registry of hospital discharges that covers more than 95% of Spanish hospitals, including public centers in the National Spanish Health System and private hospitals. Nearly 97% of total hospital discharges are covered in the database. The MBDS-H is exclusively built from discharge reports. Microdata from patients include information on sex, age, dates of admission and discharge, type of discharge, primary and secondary diagnoses at discharge, length of stay, and surgical or obstetric procedures, among other data. Other administrative data are recorded by default, including the province where the hospitalization occurred, place of residence, and cost of hospitalization. By default, the Ministry of Health provides de-identified data to ensure patient privacy; thus, no names or personal information were recorded. The purpose of the MBDS-H is to facilitate the development of retrospective studies for the calculation of the burden of hospitalization and assessment of risk factors from thousands of patients, i.e., enabling population-based studies. From 2016 onward, MBDS-H has used the coding system of the International Classification of Diseases, 10th Revision (ICD-10). MBDS-H is considered a valuable system for the epidemiological analysis of any coded disease.
Vaccination data were downloaded from the European Centre for Disease Prevention and Control [20]. These data were collected through the European Surveillance System. All EU/EEA Member States are requested to report basic indicators on vaccination (vaccines categorized by manufacturer, number of doses administered, vaccinated population, etc.). Data are categorized by target and age group at a national level.

2.2. Inclusion and Exclusion Criteria

In this retrospective study, cases were collected from the MBDS-H from the Spanish Ministry of Health. We included all patients with the code for COVID-19 (U07.1) in any diagnostic position (either primary or secondary diagnosis) from 1 January 2020 to 31 December 2021.
All age groups were studied, with special emphasis on those older than 60 years of age. We analyzed the healthcare impacts in terms of mortality and ICU admission by dividing the population into age groups. Patients with incomplete or unknown data regarding ICU admission, mortality, length of stay, or COVID-19 diagnosis were excluded to ensure the accuracy and completeness of the dataset. Because length of stay is a key outcome variable in the analysis of disease severity and healthcare utilization, including patients with missing values could bias the results and reduce the robustness of our conclusions.

2.3. Definition of Waves

We categorized the pandemic following a previous epidemiological study [21]. Using only data from Spain, we split the entire pandemic period into outbreaks or epidemiological waves based on the 14-day cumulative incidence, which marked the turning point for each wave. Every turning point indicated the end of one wave and the beginning of the next one, similar to the methodology used in previous epidemiological studies [22].
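The wave-splitting idea can be illustrated with a minimal sketch. It assumes that a turning point is a strict local minimum of the 14-day cumulative incidence, which is one plausible reading of the description above and not necessarily the exact rule used in the cited studies. The function names, the `half_width` neighbourhood parameter, and the toy series are hypothetical.

```python
from typing import List


def rolling_14day_incidence(daily_cases: List[int], population: int,
                            window: int = 14) -> List[float]:
    """Trailing-window cumulative incidence per 100,000 inhabitants."""
    incidence = []
    running = 0
    for i, cases in enumerate(daily_cases):
        running += cases
        if i >= window:
            # Drop the day that has just left the trailing window.
            running -= daily_cases[i - window]
        incidence.append(running * 100_000 / population)
    return incidence


def turning_points(incidence: List[float], half_width: int = 3) -> List[int]:
    """Indices that are strict local minima within +/- half_width days.

    Assumption for illustration: each such minimum closes one wave and
    opens the next.
    """
    points = []
    for i in range(half_width, len(incidence) - half_width):
        neighbours = (incidence[i - half_width:i]
                      + incidence[i + 1:i + half_width + 1])
        if all(incidence[i] < value for value in neighbours):
            points.append(i)
    return points


if __name__ == "__main__":
    # Toy incidence curve with two troughs (not real data).
    toy = [5, 4, 3, 2, 1, 2, 3, 4, 5, 4, 3, 2, 3, 4, 5]
    print(turning_points(toy, half_width=2))
```

In practice, a wider `half_width` would likely be needed on real surveillance data so that short-lived dips, such as weekend reporting effects, are not mistaken for wave boundaries.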
As mentioned in the introduction, the first and second stages of our study were descriptive. We analyzed the evolution of the pandemic and its outbreaks, comparing the first waves, during which time vaccination was absent, with the last waves of 2021, when vaccination was present. Herein, we describe the demographic and epidemiological differences between the two periods and their relationships with vaccination.

2.4. Vaccination Rollout

The Vaccination Strategy Against COVID-19 in Spain was developed by the Spanish Ministry of Health [23]. The working group prioritized certain age groups to receive the vaccine based on the supply of doses and the availability of current evidence, taking into consideration the demographic characteristics of the Spanish population [24]. Assessments of ethical concerns and risk factors were also considered to prioritize certain age groups over others. The elderly and healthcare workers were the first groups to receive the vaccine, and the rollout moved forward through the rest of the age groups over the course of 2021. We assessed the trends of vaccination over time using data from the European Centre for Disease Prevention and Control.
We split the population into age groups to assess the evolution of the pandemic in terms of hospitalizations and deaths. Then, we compared the vaccination rates to those trends by age group.

2.5. Estimated Scenarios of the Unvaccinated Population

We developed a population-based, epidemiological study to compare the following two scenarios: observed hospitalizations and deaths before and after vaccination and an estimated non-vaccination scenario using time series and machine learning models.

2.6. Mathematical Modeling of Hospitalizations and Mortality

We used ElasticNet and random forest models to forecast the impact of vaccination. We fitted the models to a training dataset covering July 2020 to February 2021 and validated them using a testing set. Each model was evaluated using cross-validation metrics (see Appendix A).

2.7. Machine Learning Algorithms

The models included ElasticNet for linear predictions and random forest to capture nonlinear patterns. The former assumes linearity, whereas the latter makes no assumptions on linearity. Each algorithm was tuned to achieve optimal performance. Researchers and data scientists apply machine learning algorithms in various fields, including health care, finance, and natural language processing [25,26,27].
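For readers less familiar with ElasticNet, its linearity assumption and its tunable penalty can be seen in the objective function below. This is the standard formulation used by the glmnet package, given here for orientation only; it does not appear in the manuscript. The symbols N (number of observations), x_i (predictors), y_i (outcome), and λ (overall penalty strength) are notation introduced for this sketch, and α is assumed to be the mixing hyperparameter tuned for this model.

```latex
(\hat{\beta}_0, \hat{\beta}) = \arg\min_{\beta_0,\,\beta}\;
\frac{1}{2N}\sum_{i=1}^{N}\left(y_i - \beta_0 - x_i^{\top}\beta\right)^2
+ \lambda\left[\alpha\,\lVert\beta\rVert_1
+ \frac{1-\alpha}{2}\,\lVert\beta\rVert_2^2\right]
```

With α = 1 the penalty reduces to the lasso, and with α = 0 it reduces to ridge regression; intermediate values blend the two.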

2.8. Fitting the Models

To fit the models, we first split our time-series dataset into a training and a testing set. The training set covers the period between 1 July 2020 and 28 February 2021. We excluded the first wave (March to June 2020) because we considered it an outlier that could add noise to the final model. The testing set was not used to develop the models but for comparison purposes only. We made no assumptions on the likelihood, normality, or linearity of the training set. We fit the models to the training set, tuning the hyperparameters of each algorithm to achieve the best accuracy. For ElasticNet (EN), we set alpha, and for random forest (RF), we set mtry and the number of trees. A key mathematical condition when tuning the models was that they should fit the observed data, i.e., the training dataset, accurately.
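The chronological split described above can be sketched as follows. This is a minimal illustration, assuming a daily series keyed by date and a handful of calendar-derived predictors (a simple "time-series signature"). The feature set, function names, and constants are hypothetical and are not taken from the authors' code. The training window ends on 28 February 2021, the last day of that month, because 2021 is not a leap year.

```python
from datetime import date, timedelta
from typing import Dict, Tuple

# Window boundaries (assumed inclusive on both ends).
TRAIN_START = date(2020, 7, 1)
TRAIN_END = date(2021, 2, 28)
FORECAST_END = date(2021, 12, 31)


def daily_range(start: date, end: date):
    """Yield every date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def calendar_features(d: date, origin: date = TRAIN_START) -> Dict[str, int]:
    """Calendar-derived predictors for one day (illustrative feature set)."""
    _, iso_week, iso_weekday = d.isocalendar()
    return {
        "index": (d - origin).days,  # linear trend term
        "month": d.month,
        "iso_week": iso_week,
        "weekday": iso_weekday,      # 1 = Monday ... 7 = Sunday
    }


def split_by_date(series: Dict[date, int]) -> Tuple[Dict[date, int], Dict[date, int]]:
    """Split chronologically; days before TRAIN_START (first wave) are dropped."""
    train = {d: v for d, v in series.items() if TRAIN_START <= d <= TRAIN_END}
    later = {d: v for d, v in series.items() if TRAIN_END < d <= FORECAST_END}
    return train, later


if __name__ == "__main__":
    # Placeholder series of ones, only to show the window sizes.
    toy = {d: 1 for d in daily_range(date(2020, 3, 1), FORECAST_END)}
    train, later = split_by_date(toy)
    print(len(train), len(later))
    print(calendar_features(date(2021, 3, 1)))
```

Splitting by date rather than at random keeps future observations out of the training set, which matters for any forecasting model.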
We used the R package randomForest for the RF model and the R package glmnet for the EN model. We fitted the two algorithms to the time series of both hospitalizations and deaths, yielding four models. Once they were developed, we forecasted data for the following months, namely 1 March 2021 to 31 December 2021. Finally, by comparing the estimated hospitalizations and deaths (had vaccination never been implemented) with the observed data, we could estimate the number of events that were averted in the last 10 months of 2021 due to vaccination.

2.9. Statistical Analyses

All statistical and machine learning-based analyses were conducted using R version 4.3.2 (R Foundation for Statistical Computing, Vienna, Austria) [28]. The threshold for statistical significance was set at p < 0.05.

3. Results

3.1. Nationwide Overview of the Pandemic

We included data from 498,789 hospital admissions (Table 1) and excluded 113 patients due to inconsistent or incomplete data. We split the observation period into waves, as described above (Figure 1). The first waves included more than 115,000 hospitalizations, and this number dropped to approximately 50,000 in the fourth and fifth waves. Men were admitted at higher rates than women (56.1%, p = 0.001), with no changes in the distribution during the pandemic. The median age was 66 years, but this tended to decrease in the fourth and fifth waves (59 and 57 years, respectively). Length of stay in both the standard hospitalization ward and the ICU was more heterogeneous, and no clear trend could be established. Although the nationwide mortality rate was 14.3% in Spain, we observed a decreasing trend from the first wave (18.2%) to the fourth and fifth waves (7.4% and 10.1%, respectively). Comorbidities such as type 2 diabetes, hypertension, coronary disease, dementia, kidney disease, malignancy (either solid tumor or hematological malignancy), and chronic respiratory disease showed a decreasing trend starting with the fourth wave. Other comorbidities, such as liver and cerebrovascular diseases, showed no changes. Obesity and heart failure showed a more heterogeneous trend.
Table 2 shows disaggregated data of hospital admissions, ICU admissions, and mortality by age group. These data are also represented in Figure A2.
Among hospitalizations, patients >60 years old were the predominant age group in the second and third waves. Hospitalizations in the >80-year-old age group dropped dramatically in the fourth and fifth waves, and the group of 18- to 49-year-olds was predominant in the fifth wave. Regarding mortality, deaths dropped fairly evenly across all ages, although patients >60 years old were the most affected (Figure 2). Figure 2 displays data beginning with the second wave because the comparison of interest is between the second and third waves on one side and the fourth and fifth waves on the other.

3.2. Vaccination Rollout

Figure 3 plots the vaccination rollout in Spain, both globally and by age group. Vaccination began in December 2020 with the elderly and healthcare workers. By 31 December 2021, 80.3% of the whole Spanish population was fully vaccinated, i.e., having received the complete regimen, and 97.2% had received at least one dose. By April 2021, 75% of >80-year-olds and 48% of >60-year-olds were fully vaccinated. Figure A3 provides more detail on age groups regarding vaccination coverage over time.

3.3. Hospitalizations and Deaths in an Estimated Scenario

As noted, our approach involved estimating both hospital admissions and mortality by parsing the time series using machine learning algorithms. We developed four models: one for hospitalizations and one for deaths with each of the two algorithms. Figure 4 shows the observed and estimated scenarios. Figure A4 shows the models with confidence intervals. Table 3 shows the estimates of hospitalizations and deaths in the absence of vaccination. Using the RF model, we estimated that 251,830 hospitalizations and 37,673 deaths would have occurred in a non-vaccination scenario during the period between March and December 2021. According to the EN model, the estimated numbers of hospitalizations and deaths were 307,617 and 37,141, respectively. Compared to the observed data, we estimated that vaccination prevented 115,172 hospitalizations and 25,078 deaths with the RF model and 170,959 hospitalizations and 24,546 deaths with the EN model. Finally, we plotted Figure 5, showing the cumulative hospitalizations and deaths, with both the RF and EN models.
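The relationship between the estimated and averted figures reported above is a simple subtraction, and the short sketch below checks that the two models imply the same observed totals. The observed counts are not stated in this section; they are derived here as estimated minus averted. The relative reduction is likewise a derived quantity added for illustration, and the variable and function names are hypothetical.

```python
# Totals for March to December 2021, copied from the text above.
ESTIMATED = {
    "RF": {"hospitalizations": 251_830, "deaths": 37_673},
    "EN": {"hospitalizations": 307_617, "deaths": 37_141},
}
AVERTED = {
    "RF": {"hospitalizations": 115_172, "deaths": 25_078},
    "EN": {"hospitalizations": 170_959, "deaths": 24_546},
}


def implied_observed(model: str, outcome: str) -> int:
    """Observed events implied by: observed = estimated - averted."""
    return ESTIMATED[model][outcome] - AVERTED[model][outcome]


def relative_reduction(model: str, outcome: str) -> float:
    """Share of the estimated no-vaccination events that were averted."""
    return AVERTED[model][outcome] / ESTIMATED[model][outcome]


if __name__ == "__main__":
    for outcome in ("hospitalizations", "deaths"):
        rf = implied_observed("RF", outcome)
        en = implied_observed("EN", outcome)
        # Both models are compared against the same observed series,
        # so the implied observed totals should coincide.
        print(outcome, rf, en, rf == en)
        for model in ("RF", "EN"):
            print(f"  {model}: {relative_reduction(model, outcome):.1%} averted")
```

Both models imply 136,658 observed hospitalizations and 12,595 observed deaths, so the reported figures are internally consistent.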

4. Discussion

4.1. Descriptive Analyses

We have described the high number of hospitalizations and deaths during the first waves of the pandemic in Spain. We have demonstrated how this trend began to decrease in March to April 2021 as a result of vaccination, which was the major public health intervention during the COVID-19 pandemic.
Overall, the first wave showed the highest number of hospitalizations, the highest mortality rate, the longest hospital and ICU lengths of stay, and the oldest patients. The fourth and the fifth waves showed a decreasing trend in terms of hospitalizations and mortality. In addition, these last waves showed an overall younger, healthier population.
While the sixth wave was included in our analyses, the results might not be reliable, as this wave ended mid-February 2022, and its results are not fully represented in tables and figures. However, the fourth and fifth waves showed that the demographic profile of hospitalized individuals changed with respect to the previous waves, showing a turning point in the evolution of the pandemic.
With respect to admissions by age group, the group of patients under 17 contributed only marginally during the observed period of the pandemic. However, patients over 60 years old were the largest group of those admitted to the hospital due to COVID-19. Patients between 18 and 59 years old were hospitalized at a lower rate. Additionally, most of the deaths occurred in patients >60 years old, particularly in patients over 80 years old, whereas mortality in the rest of the age groups was marginal, as seen in Figure 2.

4.2. Vaccination Rollout

Vaccination in Spain began in late December 2020, as soon as vaccines were proven to be safe and to offer significant protection against severe forms of COVID-19, as part of a European initiative [29]. Within only a few weeks of the beginning of vaccination (2.2% of total population by March and 10% by May 2021), we observed a rapid decline in both hospitalizations and deaths beginning in March and April 2021. We also observed a strong temporal correlation between decreasing hospitalizations and deaths on one side and the evolving vaccination rollout on the other (Figure 2 and Figure 3). The decline in hospitalizations and deaths was first observed in patients over 80 years old, showing a relationship between vaccination and protection against both outcomes. This relationship can also be seen in the remaining age groups after the beginning of vaccination. This steady pace of vaccination consolidated the decline in the severity of the pandemic. Our data are also in line with other studies that have investigated the benefit of vaccination and its protective effects [30,31,32]. We can state that in Spain, vaccination led to a significant reduction in the severity of COVID-19 across all age groups, with particularly marked benefits observed in the elderly population. While the overall reduction in hospitalizations and deaths due to vaccination is consistent with global findings, the timing and magnitude of these changes in Spain were influenced by the country’s specific vaccination rollout strategy and healthcare infrastructure.

4.3. Modeling and Estimating Data in a Non-Vaccination Scenario

It can be challenging to quantify the impact of vaccination if an incomplete picture of the pandemic is obtained. Infections and confirmed cases are either often under-reported or underestimated [21,33]. For this reason, we relied on reported hospitalizations and deaths to determine this impact instead of trends of non-hospitalized, confirmed cases.
Our reference publication was that of Barandalla et al. [34], who developed simulated curves of hospitalization in the absence of vaccines and compared those curves with the observed incidence. That study investigated hospitalizations in Spain between February 2020 and June 2021. The authors estimated the expected hospitalizations during 2021 in the absence of vaccination, extrapolating data from the second wave. The scenario of an unvaccinated population was estimated to create a statistical model as follows. The authors disaggregated the entire population curve across age groups and took the proportion of hospitalization of age groups of unvaccinated or less vaccinated population as a reference. These proportions of hospitalizations were extrapolated to the remaining groups, yielding curves of the real incidence of hospitalization and curves of expected hospitalization in the absence of vaccines for each age group. Finally, these curves were compared. Showing a decrease in incidence, they demonstrated the beneficial impact of vaccination on hospitalizations. Likewise, vaccine effectiveness against hospitalization in ≥65-year-old age groups was estimated from October 2021 to March 2022 in a European study [35]. The reference group was the unvaccinated population. The authors performed a survival analysis using the Cox proportional hazards regression model to estimate the hazard ratios of hospitalization.
It is beyond the scope of this manuscript to discuss all studies that have used mathematical models to estimate mortality in the absence of vaccination, but it is worth mentioning some of them. A mathematical model reported by Watson et al. [17] estimated that 14.4 million deaths were prevented in 185 countries in 2021. The authors used a framework based on a “susceptible–exposed–infectious–recovered–susceptible” model to estimate a non-vaccination scenario. This model was fitted using Markov chain Monte Carlo (MCMC), and the authors calculated the time-varying reproductive number to determine the estimated number of contagions. Havers et al. [8] conducted a cross-sectional study that included adults hospitalized with COVID-19, comparing vaccinated versus unvaccinated individuals. Both studies demonstrated the effectiveness of vaccination and its impact on the evolution of the COVID-19 pandemic using different mathematical approaches. Auto-regressive time-series modeling was assessed in other studies [36,37].
In summary, previous studies conducted in Spain [34] have reported a significant reduction in hospitalizations following vaccination rollout using different modeling approaches. Our study adds to these findings by incorporating machine learning methods and providing a more granular age-stratified analysis, which is lacking in other reports. International studies, such as that by Watson et al. [17], have demonstrated similar impacts of vaccination on a nationwide scale, supporting the observed trends in our Spanish cohort.
Machine learning has also been used to estimate the evolution of the COVID-19 pandemic in terms of confirmed cases. Kırbaş et al. [38] conducted a comparative study using different approaches, including ARIMA, neural networks, and long short-term memory (LSTM), to forecast the evolution of the pandemic. LSTM provided predictions with the best accuracy. Neural networks were used by Nabi et al. to study the dynamics of confirmed cases of COVID-19 [39]. Although deep learning (i.e., neural networks) seems to have better prediction accuracy than standard statistical methods (ARIMA) or machine learning, it entails high costs in terms of computational resources and time [40].
Having discussed some of the more relevant publications in this field, we consider this study to stand out due to the use of advanced machine learning algorithms such as ElasticNet and random forest, which allowed us to create accurate non-vaccination scenarios. This approach enabled us to quantify the impact of vaccination with high precision. Furthermore, by analyzing hospitalization and mortality trends across multiple age groups, we have provided a more granular understanding of how vaccination influenced disease severity in different demographic subpopulations within Spain. Such detailed insights have not been previously reported in similar population-based studies.
Vaccination conferred sufficient protection against severe disease and altered the course of the COVID-19 pandemic. Given the conditions of the pandemic, measuring the impact of vaccination directly by comparing a vaccination scenario with a non-vaccination scenario was not possible at a nationwide level. This is why mathematical models are useful for estimating non-vaccination scenarios to achieve such comparisons. Thanks to our estimated scenarios, we could assess the impact of vaccination in Spain. Our approaches generated estimations of hospitalizations and deaths averted as a result of vaccination against SARS-CoV-2.

4.4. Limitations

We estimated how the pandemic would have evolved if no vaccines had been available by estimating a new scenario, but we did not include non-pharmaceutical interventions, viral variants, or limitations on mobility that could have altered the viral evolution in the absence of vaccination. It is of interest to mention that the last waves of 2021 in Spain, which were primarily caused by the omicron variant (B.1.1.529) and its descendants, presented different characteristics than the previous waves [21,33], but their impact was not included in our model. In addition, forecasting using the time-series signature can be very accurate, particularly when time-based patterns are present in the underlying data. As with most uses of machine learning, the prediction is only as good as the patterns in the data. Forecasting using this approach may not be suitable when patterns are not present or when the future is highly uncertain (i.e., past results are not a suitable predictor of future performance). We could not use ARIMA or MCMC to create the estimated scenario, so we did not compare different approaches. Although it has been found that mortality due to COVID-19 may have been under-reported [41], in Spain, almost all deaths occurred in hospital, so our data can be considered reliable. This is key when modeling and fitting a machine learning algorithm because the final model can only be as good as the provided data. The wide confidence intervals, particularly for the ElasticNet model, reflect the inherent uncertainty in modeling complex phenomena such as pandemic outcomes. This uncertainty arises from potential changes in transmission dynamics, population behavior, and viral variants. While the confidence intervals indicate variability, the consistency in trend direction across models (ElasticNet and RandomForest) suggests that the main conclusions remain robust despite this uncertainty.

5. Conclusions

We fit mathematical models to estimate both hospitalizations and deaths due to COVID-19 in a non-vaccination scenario. We determined the impact of vaccination by estimating the hospitalizations and deaths that, otherwise, could have occurred if vaccines had not been administered. In Spain, demographic and clinical profiles shifted significantly during the first months of the pandemic, reflecting the differential impact of early vaccination efforts. Vaccination altered the evolution of the COVID-19 pandemic and prevented an estimated 24,546 to 25,078 deaths in Spain in 2021, depending on the model. Vaccination reduced not only mortality but also the number of hospitalizations and the burden of the pandemic. Its protective effect was observable shortly after the beginning of vaccination for each age group. Machine learning approaches can be useful in uncertain contexts because a time-series signature can provide accurate forecasts. By integrating machine learning models and age-stratified analyses, our study provides a comprehensive view of the pandemic’s evolution in Spain, demonstrating how targeted vaccination strategies can alter disease trajectories at a national level.

Author Contributions

Conceptualization, R.G.-C. and M.O.-G.; methodology, R.G.-C. and O.V.-G.; software, R.G.-C.; validation, R.G.-C. and M.O.-G.; formal analysis, R.G.-C. and B.R.-M.; investigation, R.G.-C. and M.O.-G.; data curation, R.G.-C. and B.R.-M.; writing (original draft preparation), R.G.-C. and O.V.-G.; writing (review and editing), R.G.-P. and A.G.-d.-M.; visualization, R.G.-C.; supervision, R.G.-P. and A.G.-d.-M.; project administration, R.G.-P. and A.G.-d.-M.; funding acquisition, R.G.-C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

This study was approved by the Ethical Board of Universidad Rey Juan Carlos (ID number 2610202334423). No identifying information was included in the manuscript. Because the authors used historical data, informed consent was not necessary. All procedures involving human participants were conducted in accordance with the ethical standards of the responsible institutional and/or national research committee and the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards.

Informed Consent Statement

Not applicable.

Data Availability Statement

A contract signed with the Spanish Health Ministry, which provided the dataset, prohibits the authors from providing their data to any other researcher. Furthermore, the authors must destroy the database upon the conclusion of their investigation. The database cannot be uploaded to any public repository.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ARIMA: Auto-Regressive Integrated Moving Average
COVID-19: Coronavirus Disease 2019
EN: ElasticNet
EU/EEA: European Union/European Economic Area
LSTM: Long Short-Term Memory
MBDS-H: Minimum Basic Data Set at Hospitalization
MCMC: Markov Chain Monte Carlo
RF: Random Forest
SARS-CoV-2: Severe Acute Respiratory Syndrome Coronavirus 2

Appendix A. Descriptive Analyses

Appendix A.1. Estimated Scenarios of the Unvaccinated Population

Designing research to investigate the effectiveness of vaccination in terms of reducing hospitalizations is a valuable endeavor but can be challenging. As noted, our objective was to determine whether vaccination reduced the number of hospitalizations and deaths. There are two common options for the design of clinical research on vaccines. The first approach is a retrospective cohort study, in which patients are divided into two groups (e.g., vaccinated and unvaccinated). The other design is a matched case–control study, in which a subset of vaccinated patients are matched with an equal number of unvaccinated patients based on relevant characteristics. In either design, the aim is to compare the hospitalization and death rates and clinical characteristics between the two groups. However, due to a lack of data, traditional research designs such as cohort and case–control studies would not be feasible in our context.
Thus, we designed a population-based, epidemiological, nationwide study to compare the following two scenarios: the observed scenario describing actual hospitalizations and deaths before and after vaccination and an estimated scenario simulating the trends of the pandemic had vaccination not occurred in 2021. The first scenario indicates that the outcomes in the first months of the pandemic were widely different from those of the last months of 2021 (with vaccination). Finally, we compared the two scenarios to extract the estimated effect of vaccination.
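The comparison of the two scenarios reduces to a subtraction, which can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `averted_events` is hypothetical, the figures are the ElasticNet values reported in Table 3, and the observed counts are assumed to carry no uncertainty (Table 3 lists their interval as NA), so the interval bounds are shifted by the same observed count.

```python
# Observed and estimated (non-vaccination) totals for March-December 2021,
# copied from Table 3 (ElasticNet model).
OBSERVED = {"hospitalizations": 136_658, "deaths": 12_595}
ESTIMATED = {
    # (point estimate, lower 95% bound, upper 95% bound)
    "hospitalizations": (307_617, 214_502, 400_733),
    "deaths": (37_141, 15_143, 59_138),
}


def averted_events(observed, estimated):
    """Subtract the observed count from the point estimate and from both bounds.

    Hypothetical helper: it assumes the observed count is known exactly.
    """
    point, lower, upper = estimated
    return point - observed, lower - observed, upper - observed


averted = {name: averted_events(OBSERVED[name], ESTIMATED[name]) for name in OBSERVED}

for outcome, (point, lower, upper) in averted.items():
    print(f"{outcome}: {point:,} averted (95% CI {lower:,} to {upper:,})")
```

The results reproduce the averted hospitalizations and deaths reported for ElasticNet in Table 3.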
Figure A1. Flow chart and study design.
Figure A2. Evolution of the COVID-19 pandemic in terms of hospitalizations and deaths. All waves from the observation period are included.
Figure A3. Vaccination rollout in Spain for the entire population (fully vaccinated individuals), disaggregated by age group. Elderly patients were prioritized for the first dose of the vaccine.

Appendix A.2. Mathematical Modeling of Hospitalizations and Mortality

First, we transformed our dataset into a time series. Among the methods used to analyze time series, traditional statistical models such as auto-regressive (AR) models can be specified as linear regressions on the lags of the time series; that is, an AR model considers only the relationship between past values (lags) of a series and its future values. Trend and seasonality are key components of a given time series. While the trend represents a gradual change in the data, depicting long-term growth or decline, seasonality describes short-term patterns that occur within a fixed period and repeat regularly. Another useful technique is Markov chain Monte Carlo (MCMC) simulation, which describes dynamic changes based on the current state of a given value and the probability of its transition. The MCMC algorithm explores the parameter space, favoring parameter values under which the observed data are more likely.
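For reference, the standard form of an AR model of order p can be written as below. The symbols are not used elsewhere in this article and are introduced here only for illustration: y_t is the value of the series at time t, c is a constant, the phi_i are the lag coefficients, and epsilon_t is a noise term.

```latex
y_t = c + \sum_{i=1}^{p} \phi_i \, y_{t-i} + \varepsilon_t
```

Each future value is therefore a linear regression on the p previous values of the same series.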
Once historical data on the number of deaths from COVID-19 were collected, we prepared the data for analysis by checking for missing values, outliers, and inconsistencies. Then, we aggregated data on daily hospitalizations and daily deaths into appropriate time intervals to obtain a time series. Exploratory analyses were used to understand the trends and patterns in the historical data. This involved creating visualizations and summary statistics to identify any seasonality or trends in the occurrence of COVID-19.
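The aggregation and smoothing steps can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the text does not specify the aggregation interval, so weekly (ISO week) totals are assumed; the 7-day moving average mirrors the one described in the caption of Figure 1; the function names and the daily counts are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def aggregate_weekly(daily_counts):
    """Sum daily counts into ISO weeks, keyed by (ISO year, ISO week)."""
    weekly = defaultdict(int)
    for day, count in daily_counts.items():
        iso_year, iso_week, _ = day.isocalendar()
        weekly[(iso_year, iso_week)] += count
    return dict(sorted(weekly.items()))


def moving_average(values, window=7):
    """Trailing moving average; the first window-1 points produce no value."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Made-up daily counts for two full ISO weeks starting Monday, 1 March 2021.
start = date(2021, 3, 1)
daily = {start + timedelta(days=i): 100 + 10 * i for i in range(14)}

weekly = aggregate_weekly(daily)
smoothed = moving_average(list(daily.values()), window=7)
```

A trailing window is used here for simplicity; a centered window would be an equally reasonable choice for visualization.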
Figure A4. Models developed with random forest (A,B) and with ElasticNet (C,D) to estimate non-vaccination scenarios. Blue dots represent the observed values, while the smooth curves represent the estimated values and prediction intervals to account for variance between the model predictions and the observed data.
With a clean dataset, the next stage involved selecting a forecasting method. Estimating population data, specifically the number of events due to COVID-19, is a common but challenging task in epidemiology and clinical research. To make such predictions, statistical methods and models can be used to forecast future trends, but choosing an appropriate forecasting method depends on the characteristics of the data. More specifically, the choice of method depends on the complexity of the data and the availability of relevant predictor variables. As noted, common methods include time-series analysis, regression analysis, and machine learning techniques. We discarded time-series analysis methods such as Auto-Regressive Integrated Moving Average (ARIMA) because we were unable to capture and project time-dependent patterns in the data, given their nature. Specifically, ARIMA estimations did not converge in our dataset, likely because the developed model was not a good fit for the data. Likewise, MCMC required some assumptions that could not be fulfilled. The characteristics of our data and the underlying dynamics of COVID-19 hospitalizations did not justify the choice of the normal distribution and the assumption of independence in time steps. In addition, adjustments and the fine tuning of some parameters based on the likelihood of our data and the prior distributions were either too complex or unavailable.

Appendix A.3. Machine Learning Algorithms

We employed two algorithms, namely ElasticNet (EN) and random forest (RF). The former assumes linearity, and the latter makes no assumptions on linearity. We used two machine learning algorithms because these can capture non-obvious (both linear and nonlinear) patterns in data. We evaluated each model’s performance using appropriate metrics, such as mean absolute percentage error, through cross-validation to ensure reliability for the training period (July 2020 to February 2021). Then, we used each model to forecast future hospitalizations and deaths in the population for the desired time period (March to December 2021). We created point forecasts (single estimates) and prediction intervals (confidence intervals) to quantify the uncertainty in our predictions.
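The evaluation step can be illustrated with a short sketch of the mean absolute percentage error and of a cross-validation scheme suited to time series. The text does not state which cross-validation scheme was used, so the expanding-window splits below are an assumption; the function names and the example values are hypothetical.

```python
def mape(observed, predicted):
    """Mean absolute percentage error, in percent. Observed values must be non-zero."""
    if len(observed) != len(predicted):
        raise ValueError("series must have the same length")
    errors = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100 * sum(errors) / len(errors)


def expanding_window_splits(n_points, n_folds, horizon):
    """Yield (train_indices, test_indices) pairs that never train on the future.

    Each fold tests on the next `horizon` points and trains on everything before them.
    """
    for fold in range(n_folds):
        test_end = n_points - (n_folds - 1 - fold) * horizon
        test_start = test_end - horizon
        yield list(range(test_start)), list(range(test_start, test_end))


# Made-up observed and predicted values for four time steps.
observed = [200, 250, 400, 500]
predicted = [180, 275, 400, 450]
score = mape(observed, predicted)

splits = list(expanding_window_splits(n_points=10, n_folds=3, horizon=2))
```

Keeping every training index earlier than every test index avoids leaking future information into the model, which ordinary shuffled cross-validation would not guarantee for a time series.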
EN is a machine learning technique used for linear regression and feature selection. It combines two regularization methods, namely L1 and L2 regularization [42]. L1 encourages some feature coefficients to be exactly zero, effectively performing feature selection by eliminating less important features. L2, on the other hand, penalizes large coefficients and prevents overfitting. EN strikes a balance between these two regularization techniques by introducing a hyperparameter that controls the mix of L1 and L2 penalties. This hyperparameter, often denoted as alpha, allows one to adjust the level of feature selection and regularization. A value of alpha equal to 0 corresponds to L2 regularization, while a value of 1 corresponds to L1 regularization. Any value in between blends the characteristics of both methods. EN is valuable when dealing with datasets containing many features, as it helps prevent overfitting and can automatically select the most relevant features. It is commonly used in predictive modeling, where the goal is to create accurate models that generalize well to new data while optimizing feature usage. Researchers and data scientists apply EN in various fields, including health care, finance, and natural language processing [25,26,27].
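The blend of the two penalties can be made explicit with the elastic net objective. The form below follows the parametrization of the R package glmnet, which is an assumption made for illustration (the text does not state which implementation was used). The symbol lambda, the overall penalty strength, does not appear in the text; alpha is the mixing hyperparameter described above.

```latex
\hat{\beta} = \arg\min_{\beta_0,\,\beta}\;
\frac{1}{2n}\sum_{i=1}^{n}\left(y_i - \beta_0 - x_i^{\top}\beta\right)^2
+ \lambda\left[\alpha\,\lVert\beta\rVert_1 + \frac{1-\alpha}{2}\,\lVert\beta\rVert_2^2\right]
```

Setting alpha to 0 leaves only the squared (L2) penalty, and setting alpha to 1 leaves only the absolute-value (L1) penalty, in agreement with the description above.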
RF is a powerful machine learning technique used in various fields, including clinical research and engineering. It is essentially a collection of decision trees, where each tree is a simple predictive model [43,44]. It uses different subsets of the available data and features to create each decision tree. What sets RF apart is its random nature. This randomness injects diversity into models. By combining predictions from multiple trees, RF models become robust and less prone to overfitting, which makes them excellent at making generalizations from data. For researchers, RF can be used to make predictions based on complex, multidimensional data. It is well-suited to handle both numerical and categorical data, which is key in fields such as health care, where patient information can include a mix of variables. Clinicians and engineers find RF useful for various applications [45,46], such as disease prediction, image analysis, and quality control in manufacturing. RF models are known for their versatility, reliability, and ability to produce accurate and interpretable results, which makes them a valuable tool for decision support and pattern recognition.
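The idea of combining many trees fitted on different subsets of the data can be sketched as follows. This is a deliberately simplified illustration, not a full random forest and not the authors' model: each tree is a one-split regression tree (a stump), there is a single feature so the per-split feature subsampling of RF is omitted, and all function names and data are hypothetical. What remains is bootstrap aggregation (bagging), the resampling component of RF.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree by minimizing the squared error."""
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = sum((y - left_mean) ** 2 for y in left)
        sse += sum((y - right_mean) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all x values identical: predict the overall mean
        overall = mean(ys)
        return (float("inf"), overall, overall)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x < threshold else right_mean


def fit_bagged_stumps(xs, ys, n_trees=50, seed=0):
    """Fit each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_forest(forest, x):
    """Average the predictions of all stumps."""
    return mean(predict_stump(stump, x) for stump in forest)


# Made-up step-shaped data: low values for x < 5, high values afterwards.
xs = list(range(10))
ys = [10] * 5 + [50] * 5
forest = fit_bagged_stumps(xs, ys)
low, high = predict_forest(forest, 0), predict_forest(forest, 9)
```

Because each stump sees a different bootstrap sample, the split points vary from tree to tree, and averaging them smooths the individual errors.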

References

  1. Hosseinzadeh, P.; Zareipour, M.; Baljani, E.; Moradali, M. Social Consequences of the COVID-19 Pandemic. A Systematic Review. Investig. Educ. Enferm. 2022, 40, e10.
  2. Mofijur, M.; Rizwanul Fattah, I.M.; Alam, M.A.; Islam, A.B.M.S.; Ong, H.C.; Rahman, S.M.A.; Najafi, G.; Ahmed, S.F.; Uddin, M.A.; Mahlia, T.M.I. Impact of COVID-19 on the social, economic, environmental and energy domains: Lessons learnt from a global pandemic. Sustain. Prod. Consum. 2021, 26, 343–359.
  3. Osofsky, J.; Osofsky, H.; Mamon, L. Psychological and social impact of COVID-19. Psychol. Trauma Theory Res. Pract. Policy 2020, 12, 468–469.
  4. Saladino, V.; Algeri, D.; Auriemma, V. The Psychological and Social Impact of COVID-19: New Perspectives of Well-Being. Front. Psychol. 2020, 11, 577684.
  5. Patel, B.; Murphy, R.E.; Karanth, S.; Shiffaraw, S.; Peters, R.M., Jr.; Hohmann, S.F.; Greenberg, R.S. Surge in Incidence and Coronavirus Disease 2019 Hospital Risk of Death, United States, September 2020 to March 2021. Open Forum Infect. Dis. 2022, 9, ofac424.
  6. Delahoy, M.; Ujamaa, D.; Whitaker, M.; O’Halloran, A.; Anglin, O.; Burns, E.; Cummings, C.; Holstein, R.; Kambhampati, A.K.; Milucky, J.; et al. Hospitalizations Associated with COVID-19 Among Children and Adolescents—COVID-NET, 14 States, March 1, 2020–August 14, 2021. MMWR Morb. Mortal. Wkly. Rep. 2021, 70, 1255–1260.
  7. Taylor, C. COVID-19–Associated Hospitalizations Among Adults During SARS-CoV-2 Delta and Omicron Variant Predominance, by Race/Ethnicity and Vaccination Status—COVID-NET, 14 States, July 2021–January 2022. MMWR Morb. Mortal. Wkly. Rep. 2022, 71, 466–473.
  8. Havers, F.P.; Pham, H.; Taylor, C.A.; Whitaker, M.; Patel, K.; Anglin, O.; Kambhampati, A.K.; Milucky, J.; Zell, E.; Moline, H.L.; et al. COVID-19-Associated Hospitalizations Among Vaccinated and Unvaccinated Adults 18 Years or Older in 13 US States, January 2021 to April 2022. JAMA Intern. Med. 2022, 182, 1071–1081.
  9. Ministerio de Sanidad. Situación actual Coronavirus. Available online: https://www.sanidad.gob.es/profesionales/saludPublica/ccayes/alertasActual/nCov/situacionActual.htm (accessed on 24 January 2024).
  10. Stefanelli, P.; Trentini, F.; Petrone, D.; Mammone, A.; Ambrosio, L.; Manica, M.; Guzzetta, G.; d’Andrea, V.; Marziano, V.; Zardini, A.; et al. Tracking the progressive spread of the SARS-CoV-2 Omicron variant in Italy, December 2021 to January 2022. Euro Surveill. 2022, 27, 2200125.
  11. Assessment of the Further Spread and Potential Impact of the SARS-CoV-2 Omicron Variant of Concern in the EU/EEA, 19th Update. 2022. Available online: https://www.ecdc.europa.eu/en/publications-data/covid-19-omicron-risk-assessment-further-emergence-and-potential-impact (accessed on 15 October 2023).
  12. Markov, P.; Ghafari, M.; Beer, M.; Lythgoe, K.; Simmonds, P.; Stilianakis, N.; Katzourakis, A. The evolution of SARS-CoV-2. Nat. Rev. Microbiol. 2023, 21, 361–379.
  13. Soheili, M.; Khateri, S.; Moradpour, F.; Mohammadzedeh, P.; Zareie, M.; Mortazavi, S.M.M.; Manifar, S.; Kohan, H.G.; Moradi, Y. The efficacy and effectiveness of COVID-19 vaccines around the world: A mini-review and meta-analysis. Ann. Clin. Microbiol. Antimicrob. 2023, 22, 42.
  14. Harder, T.; Koch, J.; Vygen-Bonnet, S.; Külper-Schiek, W.; Pilic, A.; Reda, S.; Scholz, S.; Wichmann, O. Efficacy and effectiveness of COVID-19 vaccines against SARS-CoV-2 infection: Interim results of a living systematic review, 1 January to 14 May 2021. Eurosurveillance 2021, 26, 2100563.
  15. Graña, C.; Ghosn, L.; Evrenoglou, T.; Jarde, A.; Minozzi, S.; Bergman, H.; Buckley, B.S.; Probyn, K.; Villanueva, G.; Henschke, N.; et al. Efficacy and safety of COVID-19 vaccines. Cochrane Database Syst. Rev. 2022, 12, CD015477.
  16. Yang, Z.R.; Jiang, Y.W.; Li, F.X.; Liu, D.; Lin, T.F.; Zhao, Z.Y.; Wei, C.; Jin, Q.Y.; Li, X.M.; Jia, Y.X.; et al. Efficacy of SARS-CoV-2 vaccines and the dose–response relationship with three major antibodies: A systematic review and meta-analysis of randomised controlled trials. Lancet Microbe 2023, 4, e236–e246.
  17. Watson, O.; Barnsley, G.; Toor, J.; Hogan, A.; Winskill, P.; Ghani, A. Global impact of the first year of COVID-19 vaccination: A mathematical modelling study. Lancet Infect. Dis. 2022, 22, 1293–1302.
  18. Zeng, B.; Gao, L.; Zhou, Q.; Yu, K.; Sun, F. Effectiveness of COVID-19 vaccines against SARS-CoV-2 variants of concern: A systematic review and meta-analysis. BMC Med. 2022, 20, 200.
  19. Ministerio de Sanidad, Consumo y Bienestar Social. Portal Estadistico. Area de Inteligencia de Gestion. Available online: https://pestadistico.inteligenciadegestion.mscbs.es/publicoSNS/comun/ArbolNodos.aspx?idNodo=23525 (accessed on 6 July 2019).
  20. Data on COVID-19 vaccination in the EU/EEA. 2023. Available online: https://www.ecdc.europa.eu/en/publications-data/data-covid-19-vaccination-eu-eea (accessed on 15 October 2023).
  21. Garcia-Carretero, R.; Vazquez-Gomez, O.; Gil-Prieto, R.; Gil-de Miguel, A. Hospitalization burden and epidemiology of the COVID-19 pandemic in Spain (2020–2021). BMC Infect. Dis. 2023, 23, 476.
  22. Instituto de Salud Carlos III. Informe no 128 Situación de COVID-19 en España a 10 de mayo de 2022.pdf. 2022. Available online: https://www.isciii.es/QueHacemos/Servicios/VigilanciaSaludPublicaRENAVE/EnfermedadesTransmisibles/Documents/INFORMES/Informes%20COVID-19/INFORMES%20COVID-19%202022/Informe%20n%C2%BA%20128%20Situaci%C3%B3n%20de%20COVID-19%20en%20Espa%C3%B1a%20a%2010%20de%20mayo%20de%202022.pdf (accessed on 9 June 2022).
  23. Gobierno de España. Estrategia de vacunación COVID-19. Available online: https://www.vacunacovid.gob.es/ (accessed on 12 September 2024).
  24. Rodriguez-Maroto, G.; Atienza-Diez, I.; Ares, S.; Manrubia, S. Vaccination strategies in structured populations under partial immunity and reinfection. J. Phys. A Math. Theor. 2023, 56, 204003.
  25. Garcia-Carretero, R.; Vigil-Medina, L.; Barquero-Perez, O.; Ramos-Lopez, J. Relevant Features in Nonalcoholic Steatohepatitis Determined Using Machine Learning for Feature Selection. Metab. Syndr. Relat. Disord. 2019, 17, 444–451.
  26. Garcia-Carretero, R.; Barquero-Perez, O.; Mora-Jimenez, I.; Soguero-Ruiz, C.; Goya-Esteban, R.; Ramos-Lopez, J. Identification of clinically relevant features in hypertensive patients using penalized regression: A case study of cardiovascular events. Med. Biol. Eng. Comput. 2019, 57, 2011–2026.
  27. Garcia-Carretero, R.; Vigil-Medina, L.; Barquero-Perez, O.; Mora-Jimenez, I.; Soguero-Ruiz, C.; Goya-Esteban, R.; Ramos-Lopez, J. Logistic LASSO and Elastic Net to Characterize Vitamin D Deficiency in a Hypertensive Obese Population. Metab. Syndr. Relat. Disord. 2020, 18, 79–85.
  28. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2020; Available online: https://www.r-project.org/ (accessed on 12 September 2024).
  29. Farsalinos, K.; Poulas, K.; Vantarakis, A.; Leotsinidis, M.; Kouvelas, D.; Docea, A.O.; Kostoff, R.; Gerotziafas, G.T.; Antoniou, M.N.; Polosa, R.; et al. Improved strategies to counter the COVID-19 pandemic: Lockdowns vs. primary and community healthcare. Toxicol. Rep. 2021, 8, 1–9.
  30. Connors, M.; Graham, B.; Lane, H.; Fauci, A. SARS-CoV-2 Vaccines: Much Accomplished, Much to Learn. Ann. Intern. Med. 2021, 174, 687–690.
  31. Kustin, T.; Harel, N.; Finkel, U.; Perchik, S.; Harari, S.; Tahor, M.; Caspi, I.; Levy, R.; Leshchinsky, M.; Ken Dror, S.; et al. Evidence for increased breakthrough rates of SARS-CoV-2 variants of concern in BNT162b2-mRNA-vaccinated individuals. Nat. Med. 2021, 27, 1379–1384.
  32. Wang, Z.; Muecksch, F.; Schaefer-Babajew, D.; Finkin, S.; Viant, C.; Gaebler, C.; Hoffmann, H.H.; Barnes, C.O.; Cipolla, M.; Ramos, V.; et al. Naturally enhanced neutralizing breadth against SARS-CoV-2 one year after infection. Nature 2021, 595, 426–431.
  33. Garcia-Carretero, R.; Vazquez-Gomez, O.; Ordoñez-Garcia, M.; Garrido-Peño, N.; Gil-Prieto, R.; Gil-de Miguel, A. Differences in Trends in Admissions and Outcomes among Patients from a Secondary Hospital in Madrid during the COVID-19 Pandemic: A Hospital-Based Epidemiological Analysis (2020–2022). Viruses 2023, 15, 1616.
  34. Barandalla, I.; Alvarez, C.; Barreiro, P.; de Mendoza, C.; González-Crespo, R.; Soriano, V. Impact of scaling up SARS-CoV-2 vaccination on COVID-19 hospitalizations in Spain. Int. J. Infect. Dis. 2021, 112, 81–88.
  35. Sentís, A.; Kislaya, I.; Nicolay, N.; Meijerink, H.; Starrfelt, J.; Martínez-Baz, I.; Castilla, J.; Nielsen, K.F.; Hansen, C.H.; Emborg, H.-D.; et al. Estimation of COVID-19 vaccine effectiveness against hospitalisation in individuals aged ≥65 years using electronic health registries; a pilot study in four EU/EEA countries, October 2021 to March 2022. Eurosurveillance 2022, 27, 2200551.
  36. Chyon, F.; Suman, M.; Fahim, M.; Ahmmed, M. Time series analysis and predicting COVID-19 affected patients by ARIMA model using machine learning. J. Virol. Methods 2022, 301, 114433.
  37. Maleki, M.; Mahmoudi, M.; Wraith, D.; Pho, K. Time series modelling to forecast the confirmed and recovered cases of COVID-19. Travel Med. Infect. Dis. 2020, 37, 101742.
  38. Kırbaş, İ.; Sözen, A.; Tuncer, A.; Kazancıoğlu, F. Comparative analysis and forecasting of COVID-19 cases in various European countries with ARIMA, NARNN and LSTM approaches. Chaos Solitons Fractals 2020, 138, 110015.
  39. Nabi, K.; Tahmid, M.; Rafi, A.; Kader, M.; Haider, M. Forecasting COVID-19 cases: A comparative analysis between recurrent and convolutional neural networks. Results Phys. 2021, 24, 104137.
  40. Makridakis, S.; Spiliotis, E.; Assimakopoulos, V.; Semenoglou, A.; Mulder, G.; Nikolopoulos, K. Statistical, machine learning and deep learning forecasting methods: Comparisons and ways forward. J. Oper. Res. Soc. 2023, 74, 840–859.
  41. Whittaker, C.; Walker, P.G.; Alhaffar, M.; Hamlet, A.; Djaafara, B.A.; Ghani, A.; Ferguson, N.; Dahab, M.; Checchi, F.; Watson, O.J. Under-reporting of deaths limits our understanding of true burden of COVID-19. BMJ 2021, 375, n2239.
  42. Tibshirani, R. Regression Shrinkage and Selection via the Lasso. J. R. Stat. Soc. Ser. B (Methodol.) 1996, 58, 267–288.
  43. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
  44. Cutler, D.; Edwards, T., Jr.; Beard, K.; Cutler, A.; Hess, K.; Gibson, J.; Lawler, J.J. Random forests for classification in ecology. Ecology 2007, 88, 2783–2792.
  45. Garcia-Carretero, R.; Vazquez-Gomez, O.; Lopez-Lomba, M.; Gil-Prieto, R.; Gil-de Miguel, A. Insulin Resistance and Metabolic Syndrome as Risk Factors for Hospitalization in Patients with COVID-19: Pilot Study on the Use of Machine Learning. Metab. Syndr. Relat. Disord. 2023, 21, 443–452.
  46. Garcia-Carretero, R.; Vigil-Medina, L.; Mora-Jimenez, I.; Soguero-Ruiz, C.; Barquero-Perez, O.; Ramos-Lopez, J. Use of a K-nearest neighbors model to predict the development of type 2 diabetes within 2 years in an obese, hypertensive population. Med. Biol. Eng. Comput. 2020, 58, 991–1002.
Figure 1. Evolution of hospitalizations during the COVID-19 pandemic in Spain from March 2020 to December 2021, split into waves. Blue dots represent raw data, whereas the red line represents a 7-day moving average of the time series. The vertical dashed line at the end of the figure denotes that no further data were available for the sixth epidemic wave, which would otherwise continue.
Figure 2. Evolution of the COVID-19 pandemic in terms of hospitalizations (A) and in-hospital deaths (B) between September 2020 and December 2021 (the first wave is not included in the visualization), disaggregated by age group. The vertical dashed line denotes the new year.
Figure 3. Vaccination rollout in Spain for the entire population (i.e., fully vaccinated individuals (A)) disaggregated by age group (B). Elderly patients were prioritized to receive the first dose of vaccine.
Figure 4. Models developed with random forest (A,B) and with ElasticNet (C,D) to estimate non-vaccination scenarios. Turquoise dots represent the observed values, while red dots represent estimated values. Note the good fit of the model in the train region before it estimates the values in the forecast region.
Figure 5. Cumulative sum estimated with random forest (A,B), showing the observed values and the estimates with 95% confidence intervals, and with ElasticNet (C,D). We forecasted values from 1 March to 31 December 2021.
Table 1. Epidemiological and demographic characteristics of patients hospitalized in Spain between 2020 and 2021.
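|  | All Waves | First | Second | Third | Fourth | Fifth | Sixth | p Value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Hospital admissions | 498,789 | 115,356 | 127,114 | 126,623 | 51,006 | 54,570 | 24,120 | 0.001 |
| Sex (men) | 56.1% | 56.2% | 55.5% | 56.2% | 57.5% | 55.3% | 56.2% | 0.001 |
| Age, median (IQR) | 66 (28) | 69 (25) | 68 (28) | 69 (25) | 59 (24) | 57 (38) | 65 (28) | 0.001 |
| Hospital stay, median (IQR) | 8 (9) | 11.9 (9) | 8 (9) | 12.1 (9) | 7 (7) | 10.4 (8) | 6 (7) | 0.001 |
| ICU |  |  |  |  |  |  |  |  |
| Admissions | 54,354 | 10,218 | 13,302 | 14,441 | 7194 | 6941 | 2258 | 0.001 |
| ICU (%) | 10.9 | 8.9 | 10.5 | 11.4 | 14.1 | 12.7 | 9.4 | 0.001 |
| ICU stay, median (IQR) | 10 (21) | 11 (22.3) | 10 (21.4) | 11 (21.5) | 11 (21.4) | 10 (18.7) | 6 (9.1) | 0.001 |
| Mortality |  |  |  |  |  |  |  |  |
| Deaths | 71,437 | 21,037 | 18,229 | 20,400 | 3793 | 5487 | 2491 | 0.001 |
| Mortality rate (%) | 14.3 | 18.2 | 14.3 | 16.1 | 7.4 | 10.1 | 10.3 | 0.001 |
| Comorbidities |  |  |  |  |  |  |  |  |
| Type 2 diabetes | 21.7 | 20.6 | 23.2 | 24.3 | 18.3 | 18.3 | 21.4 | 0.001 |
| Hypertension | 33.8 | 35.2 | 34.7 | 37.2 | 31.1 | 25.0 | 31.6 | 0.001 |
| AMI | 7.1 | 7.3 | 7.4 | 7.9 | 5.3 | 5.7 | 7.5 | 0.001 |
| CHF | 7.2 | 6.5 | 7.8 | 8.4 | 4.2 | 6.8 | 7.9 | 0.001 |
| Dementia | 4.8 | 5.5 | 5.4 | 5.2 | 2.1 | 3.9 | 3.6 | 0.001 |
| Kidney disease | 10.5 | 10.3 | 11.2 | 11.7 | 6.5 | 9.7 | 11.1 | 0.001 |
| Liver disease | 0.4 | 0.4 | 0.4 | 0.5 | 0.3 | 0.4 | 0.5 | 0.001 |
| Malignancy | 5.6 | 5.4 | 5.8 | 6.0 | 4.2 | 5.3 | 7.4 | 0.001 |
| Obesity | 12.9 | 9.2 | 12.6 | 14.0 | 17.0 | 14.4 | 13.8 | 0.001 |
| COPD | 7.4 | 7.6 | 7.4 | 7.9 | 5.9 | 6.6 | 8.8 | 0.001 |
| CEVD | 0.7 | 0.7 | 0.8 | 0.9 | 0.5 | 0.6 | 0.8 | 0.001 |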
ICU: intensive care unit; AMI: acute myocardial infarction; CHF: congestive heart failure; CEVD: cerebrovascular disease; COPD: chronic obstructive pulmonary disease; IQR: interquartile range. Age is expressed in years. Hospital/ICU stay is expressed in days.
Table 2. Outcomes in terms of total admissions, ICU admissions, and mortality of hospitalized patients disaggregated by age group.
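|  | Total | First Wave | Second Wave | Third Wave | Fourth Wave | Fifth Wave | Sixth Wave | p-Value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Admissions |  |  |  |  |  |  |  |  |
| ≤17 | 6568 | 583 (8.9%) | 1622 (24.7%) | 1021 (15.5%) | 657 (10%) | 1764 (26.9%) | 921 (14%) | 0.001 |
| 18–49 | 99,570 | 18,223 (18.3%) | 23,525 (23.6%) | 18,542 (18.6%) | 13,669 (13.7%) | 20,660 (20.7%) | 4951 (5%) | 0.001 |
| 50–59 | 78,535 | 18,524 (23.6%) | 19,534 (24.9%) | 19,432 (24.7%) | 11,560 (14.7%) | 6262 (8%) | 3223 (4.1%) | 0.001 |
| 60–79 | 178,667 | 45,100 (25.2%) | 44,052 (24.7%) | 48,808 (27.3%) | 18,198 (10.2%) | 13,267 (7.4%) | 9242 (5.2%) | 0.001 |
| ≥80 | 125,834 | 30,642 (24.4%) | 35,998 (28.6%) | 36,365 (28.9%) | 5806 (4.6%) | 11,707 (9.3%) | 5316 (4.2%) | 0.001 |
| ICU |  |  |  |  |  |  |  |  |
| ≤17 | 490 | 95 (19.4%) | 109 (22.2%) | 74 (15.1%) | 62 (12.7%) | 111 (22.7%) | 39 (8%) | 0.001 |
| 18–49 | 10,227 | 1493 (14.6%) | 2186 (21.4%) | 1957 (19.1%) | 1513 (14.8%) | 2633 (25.7%) | 445 (4.4%) | 0.001 |
| 50–59 | 11,187 | 2117 (18.9%) | 2733 (24.4%) | 2994 (26.8%) | 1705 (15.2%) | 1240 (11.1%) | 398 (3.6%) | 0.001 |
| 60–79 | 28,548 | 5862 (20.5%) | 7159 (25.1%) | 8300 (29.1%) | 3537 (12.4%) | 2526 (8.8%) | 1164 (4.1%) | 0.001 |
| ≥80 | 2372 | 342 (14.4%) | 754 (31.8%) | 687 (29%) | 185 (7.8%) | 248 (10.5%) | 156 (6.6%) | 0.001 |
| Deaths |  |  |  |  |  |  |  |  |
| ≤17 | 27 | 4 (14.8%) | 7 (25.9%) | 6 (22.2%) | 2 (7.4%) | 6 (22.2%) | 2 (7.4%) | 0.001 |
| 18–49 | 1282 | 345 (26.9%) | 300 (23.4%) | 253 (19.7%) | 92 (7.2%) | 222 (17.3%) | 70 (5.5%) | 0.001 |
| 50–59 | 3274 | 905 (27.6%) | 760 (23.2%) | 861 (26.3%) | 277 (8.5%) | 343 (10.5%) | 128 (3.9%) | 0.001 |
| 60–79 | 25,427 | 7937 (31.2%) | 5914 (23.3%) | 7074 (27.8%) | 1871 (7.4%) | 1733 (6.8%) | 898 (3.5%) | 0.001 |
| ≥80 | 40,855 | 11,707 (28.7%) | 11,111 (27.2%) | 12,041 (29.5%) | 1499 (3.7%) | 3127 (7.7%) | 1370 (3.4%) | 0.001 |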
Percentages were calculated by dividing the events in a given age group in a wave by the total number of events in that age group across all waves.
Table 3. Observed, estimated, and averted events in the first year of vaccination.
|  | Hospitalizations: events | Hospitalizations: 95% CI | Deaths: events | Deaths: 95% CI |
| --- | --- | --- | --- | --- |
| RandomForest |  |  |  |  |
| Observed | 136,658 | NA | 12,595 | NA |
| Estimated | 251,830 | 216,997–286,663 | 37,673 | 31,713–43,633 |
| Averted | 115,172 | 80,339–150,005 | 25,078 | 19,118–31,038 |
| ElasticNet |  |  |  |  |
| Observed | 136,658 | NA | 12,595 | NA |
| Estimated | 307,617 | 214,502–400,733 | 37,141 | 15,143–59,138 |
| Averted | 170,959 | 77,844–264,075 | 24,546 | 2548–46,543 |
Estimates were computed for the period between March 2021 and December 2021 using the following two models: ElasticNet and random forest. NA: not applicable; CI: confidence interval.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Garcia-Carretero, R.; Ordoñez-Garcia, M.; Vazquez-Gomez, O.; Rodriguez-Maya, B.; Gil-Prieto, R.; Gil-de-Miguel, A. Impact and Effectiveness of COVID-19 Vaccines Based on Machine Learning Analysis of a Time Series: A Population-Based Study. J. Clin. Med. 2024, 13, 5890. https://doi.org/10.3390/jcm13195890

