Article

A Global Network Analysis of COVID-19 Vaccine Distribution to Predict Breakthrough Cases among the Vaccinated Population

Aspiring Scholars Directed Research Program (ASDRP), Department of Biological, Human, and Life Sciences Fremont, Fremont, CA 94539, USA
*
Author to whom correspondence should be addressed.
COVID 2024, 4(10), 1546-1560; https://doi.org/10.3390/covid4100107
Submission received: 31 July 2024 / Revised: 12 September 2024 / Accepted: 17 September 2024 / Published: 25 September 2024

Abstract

As the COVID-19 pandemic spread worldwide in late 2019 and early 2020, many vaccine candidates were developed to combat the disease. However, new COVID-19 variants such as Omicron and Delta have continued to emerge globally despite advancements in vaccine technology, leaving certain countries more vulnerable than others to future outbreaks. This research aims to analyze the susceptibility of different countries to a COVID-19 outbreak, present the first visualization of the spread of COVID-19, and predict which countries are at greater risk for future outbreaks of new variants based on various factors. We created interactive maps to understand the pandemic’s spread and identify high-risk countries based on their vaccination percentages. We then employed binary classification, K-nearest neighbors (KNN), and neural network machine learning models to predict each country’s risk factor. The risk factor indicates whether a country is safe from a new COVID-19 variant based on vaccine percentage and government stringency. The neural network achieved the highest accuracy, classifying countries as high risk or low risk with 94% accuracy. Inspired by Albert-László Barabási’s work on network science, we graphed connections between countries based on vaccination percentages. These graphs illustrate the correlation between pairs of countries and better demonstrate how their vaccination rates relate to the probability of a new COVID-19 outbreak.

1. Introduction

A virus that originated in Wuhan, China [1], spread across the entire world in months, and the disease it causes became known as COVID-19. COVID-19 is caused by SARS-CoV-2, a member of the coronavirus family [2], which often targets the proteins in the lungs of humans. Due to their spike protein structure [3], coronaviruses can easily latch onto surfaces such as human skin, making them easier to spread and even harder to remove. Unlike SARS-CoV-1 [4], another virus in the coronavirus family, SARS-CoV-2 spreads much more quickly, allowing it to reach far more people in less time. SARS-CoV-2 can remain in its active state considerably longer than SARS-CoV-1, causing it to infect more people at a faster rate with just one spike protein [5].
COVID-19, specifically, is a viral disease caused by a positive-sense RNA virus [6]. Because of its rapid spread, preventative measures were put into place to protect people within and outside each country. This impacted every country’s economy [5], decreasing imports and exports, while also increasing psychological issues and causing an immense number of casualties [5].
As people began to learn more about COVID-19, new variants began to appear in countries all around the world. A variant is a mutated version of the original virus. Variants arise after a virus enters a host cell: as it copies its RNA genome inside the host, errors may occur during replication, producing a new variation of the virus [7]. For SARS-CoV-2, specifically, there have been multiple variants, including Omicron, Delta, Beta, and Alpha.
With each variant that arose, many complications occurred, as they may not have been covered by the current vaccines in production. This led to the continuous spread of COVID-19 as variants continued to emerge. To allow countries and governments to better prepare for future outbreaks, this project uses machine learning as well as network science to model and predict the risk of COVID-19 spreading in different countries.
Despite vaccines being introduced, many countries were unable to combat the disease due to a lack of resources, leading to massive outbreaks [8]. Countries' responses fell into two groups: high-performing and low-performing [8]. High-performing countries could combat the disease, as they had their whole government involved in the process, had the support of the people, and were able to purchase resources. Low-performing countries, on the other hand, struggled due to a lack of proper infrastructure, a lack of political support, and a lack of resources to sustain any outbreak response. Some believe that a more global response would help prevent major outbreaks such as those seen with COVID-19. We focused primarily on the economic issues and tried to create a model that could help allocate resources to these low-performing countries so that they would have a better chance of fighting the disease and the outbreaks could be contained.
COVID-19 vaccine distribution is a natural candidate to be presented as a social network. Inspired by Albert-László Barabási’s concept of network science, we developed a visualization of the COVID-19 vaccine distribution network to better understand the correlation between different countries in their COVID-19 vaccine type percentages [9]. We used a machine learning model to predict the risk factor of a country, allowing nations to better understand their susceptibility to new COVID-19 variants. Vaccination and stringency rates can both serve as indicators of a country’s likelihood of effectively dealing with a large COVID-19 outbreak; they were therefore used as features for our machine learning models [10]. The techniques used in our study include K-nearest neighbors, binary classification, and neural networks.
K-nearest neighbors (KNN) is a machine learning algorithm centered around classification [11]. It uses training data whose points carry features (stringency and vaccine percentage) and a label (risk). To classify a new point, the model finds its k nearest neighbors (the closest training points) and assigns the label that is most common among them. KNN can be used for binary classification, a broader group of methods that take inputs (vaccine percentage and stringency) and predict one of two classes (here, the risk level) [12]. Finally, neural networks are algorithms modeled after the neurons in the brain that learn from training data and are evaluated on testing data [13]. Similar to the brain, they learn by changing the connections between neurons, allowing them to perform many information-processing tasks.
Network science was another tool we used, as it allowed us to map the network between the countries. Network science is the study of patterns of social phenomena, often through the analysis of connections between distinct elements. A network is a set of nodes that are connected with edges to visualize interactions between groups. Network scientists use computational tools [14] to analyze these networks to uncover patterns and principles that can help us understand how they function over time and can be used to address real-world problems such as disease spread.
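The KNN idea described above can be sketched in a few lines of Python. This is only an illustration, not the study's implementation (the study's models were built in R): the function name `knn_predict`, the value `k=3`, and the data points are hypothetical, and the features are assumed to be already scaled to the range 0 to 1.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of ((vaccine_pct, stringency), risk_label) pairs.
    """
    # Sort the training points by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Take a majority vote over the labels of the k closest points.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Made-up (vaccine_pct, stringency) pairs; 1 = high risk, 0 = low risk.
train = [
    ((0.90, 0.20), 1),
    ((0.80, 0.30), 1),
    ((0.85, 0.10), 1),
    ((0.20, 0.80), 0),
    ((0.30, 0.70), 0),
    ((0.10, 0.90), 0),
]

print(knn_predict(train, (0.75, 0.25)))  # 1
print(knn_predict(train, (0.25, 0.75)))  # 0
```

Because KNN is distance based, features on different scales would need to be normalized first; otherwise the feature with the larger numeric range dominates the vote.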
Accessibility to the global vaccination network is vital to individuals, the government, medical infection centers, and the CDC as they can visualize the spread of COVID-19, the different vaccine distributors, and the risk factors for each country depending on stringency and vaccine percentages. Because of its visual representations, many people (not just the scientific community) will be able to understand and benefit from this study by learning more about COVID-19 and finding better ways to prepare themselves. These resources will allow governments to better prepare their country in case another COVID-19 variant occurs, as the networks will be able to show the countries that are in greater danger. Along with COVID-19, these networks can serve as a template for any other dangerous outbreaks that may happen in the future.

2. Related Studies

Previous studies have focused significantly on predicting the spread of the COVID-19 virus [15]. Hirschprung et al. used multiple regression models and machine learning to predict the number of cases in countries around the world [16]. Wieczorek et al. used a neural-network-based deep-learning architecture to achieve a forecasting accuracy of around 99% [17]. Studies such as Tomar et al. focused on specific countries such as India, using a long short-term memory network to forecast the spread of the virus up to 90 days in advance [18]. Finally, Yadav et al. also used statistical modeling methods to forecast the spread of COVID-19.
Other research [19], a little less common, claims that, along with mass lockdowns, certain apps that employ tracing techniques can help identify and contain the spread of COVID. While this might be controversial in the US, it has been studied in Latin American countries, where researchers also used applied network models such as the Susceptible Exposed Infectious Recovered (SEIR) model to help them understand the spread of the disease for use in tracking devices. Our project, however, looks more at data from countries than from individuals, as we are trying to predict the danger for a country.
Fewer studies have focused on individual countries’ risk predictions based on their vaccination data. Huang et al. [19] used geographic information systems (GIS) to map the risk and spread of diseases by using spatial analysis to identify clusters of infections. Barda et al. focused on predicting the risk of death in certain countries in low-data situations [20]. Similarly, Pal et al. used a neural-network-based method to evaluate each individual country’s risk of the spread of COVID-19 [21]. Bird et al. also explored the effects of using various ensemble models on predicting the spread of the virus [22]. Finally, Chakraborty et al. developed a real-time forecast of COVID-19 cases to determine the risk of the virus. However, these studies have not included a fully comparative analysis of different machine learning models using the most up-to-date data available.

3. Methodology

To model and understand the probability of outbreaks at a given geographical location, we implemented a neural network [23] and used network science [24] to model vaccine-resistant variants across countries. An interactive digital map was created using R version 4.3.3 in RStudio and JavaScript to visualize the connection between countries with similar vaccines. The process of building the model and network consisted of the following steps:
  • Step 1: Data Collection
(a) We collected data from multiple reliable sources, including the CDC, the New York Times (NYT) [25], and the Johns Hopkins Coronavirus Resource Center (JHCRC) [26]. The datasets included the following key metrics for 45 countries and 14 vaccine candidates:
    (i) Country population
    (ii) Number of vaccinated individuals
    (iii) Total vaccine doses distributed
    (iv) Average number of doses per person
    (v) Stringency indices for COVID-19 measures [11,27,28]
(b) Due to limited data on some vaccine candidates used in only a few countries, we excluded those candidates from the model. We relied on the CDC, NYT, and JHCRC for global data, assuming they accurately represent worldwide trends. However, some potential bias may exist due to data inconsistencies.
  • Step 2: Data Translation
(a) We transferred the collected data from Table 1 into Google Sheets to facilitate the development of an interactive map.
  • Step 3: Data Preprocessing
(a) We preprocessed the data in Table 2 to compute the percentage of people vaccinated per country per manufacturer. The steps involved and results from preprocessing are as follows:
  • Given: Total doses of vaccine (per manufacturer), the population of each country, the number of people vaccinated from each country, and the total number of doses for all vaccines for each country
  • We calculated the average number of doses per person by dividing the total number of vaccine doses by the number of vaccinated individuals.
    • Example: If country “A” had 1000 doses of vaccine given out and had 500 people vaccinated, then 1000/500 = 2. This means country “A” has an average of 2 doses of vaccine per person
  • Using the average number of doses per person and the number of doses per manufacturer we found the number of people vaccinated per manufacturer.
  • We then divided the result by the population of the country which gave us the percentage of people vaccinated categorized by vaccine manufacturer per country
    • Example: If country “A” had 500 doses of Pfizer and 500 doses of Moderna, then dividing by the average number of doses per person would give us 500/2 = 250. This means 250 people were vaccinated with the Pfizer vaccine and 250 people were vaccinated with Moderna. If country “A” had a population of 2000, then 250/2000 = 0.125. This means that 12.5% of the people were vaccinated with the Pfizer vaccine and another 12.5% with Moderna
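The preprocessing arithmetic in Step 3 can be reproduced with a short Python sketch. The function name `vaccinated_share_by_manufacturer` and its argument names are hypothetical; the numbers are those of the country “A” example above.

```python
def vaccinated_share_by_manufacturer(population, people_vaccinated, doses_by_manufacturer):
    """Return (average doses per person, share of population per manufacturer)."""
    total_doses = sum(doses_by_manufacturer.values())
    # Average doses per person = total doses / number of vaccinated individuals.
    doses_per_person = total_doses / people_vaccinated
    shares = {}
    for manufacturer, doses in doses_by_manufacturer.items():
        # People vaccinated with this manufacturer's vaccine.
        people = doses / doses_per_person
        # Share of the whole population vaccinated with this manufacturer's vaccine.
        shares[manufacturer] = people / population
    return doses_per_person, shares

doses_per_person, shares = vaccinated_share_by_manufacturer(
    population=2000,
    people_vaccinated=500,
    doses_by_manufacturer={"Pfizer": 500, "Moderna": 500},
)
print(doses_per_person)  # 2.0
print(shares)            # {'Pfizer': 0.125, 'Moderna': 0.125}
```

Note that this calculation assumes every vaccinated person received the same average number of doses regardless of manufacturer, which is the simplification the steps above make.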
  • Step 4: Interactive Map Development
(a) We used two R packages, Highcharter and Tidyverse, with R version 4.3.3 to create an interactive map. We imported the processed data and plotted them on the interactive global map. Using the Highcharter package, we were able to combine the R code with JavaScript, HTML, and CSS code to build the interactive world map. For the HTML part, we pulled in the Highcharter JavaScript code, loaded a few modules, and created a “website” for the map to run on; the map was generated on this “website” as we ran the code. For the JavaScript portion, we used the “worldgeojson” data package from Highcharter and integrated a series map. This package uses ISO codes and a map image to illustrate the map. This also added map navigation with the scroll wheel, allowing users to actively interact with the map.
  • Step 5: Machine Learning Model Implementation
(a) We applied machine learning algorithms to assess the risk of infection for each vaccine candidate:
  • The binary classification model uses a supervised machine learning algorithm to categorize the countries as high risk or low risk in terms of vulnerability to a COVID-19 pandemic [29]. The algorithm used for the binary classification model was logistic regression, a method that fits a regression curve to the data in order to predict a label for a set of data [30].
  • The K-nearest neighbors is another supervised machine-learning model that uses the proximity of data points to make classifications. A K-nearest neighbors algorithm is a lazy-learning algorithm, meaning that a model is not created until a query is performed.
  • The final machine learning method used was a neural network built with NeuralNet (NN). NN uses backpropagation to estimate the neural network’s weights [31]. NeuralNet is an R package that provides many of the functions (Figure 1) necessary for neural network training and testing, including customizing the number of layers in the model and predicting categories with a test dataset. The model features were the vaccination percentage for the specific candidate and the stringency rate. These data were divided into a 70–30 train–test split and underwent min–max normalization. The NN consisted of five hidden layers and used resilient backpropagation with weight backtracking. The NN demonstrated the highest accuracy, ranging from 94.3% to 99.02% depending on the vaccine candidate. We generated a NeuralNet function (Figure 2) for each vaccine candidate; the one shown is for Pfizer.
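The two data preparation steps named in Step 5, min–max normalization and a 70–30 train–test split, can be sketched as follows. This is a Python illustration rather than the R code used in the study; the helper names, the random seed, and the sample values are hypothetical.

```python
import random

def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no range to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows and cut it into train and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Made-up vaccination percentages and stringency indices for ten countries.
vaccine_pct = [12.0, 47.0, 55.0, 80.0, 30.0, 66.0, 5.0, 72.0, 41.0, 58.0]
stringency = [70.0, 45.0, 30.0, 20.0, 60.0, 35.0, 85.0, 25.0, 50.0, 40.0]

rows = list(zip(min_max_normalize(vaccine_pct), min_max_normalize(stringency)))
train, test = train_test_split(rows)
print(len(train), len(test))  # 7 3
```

A fixed seed keeps the split reproducible. With 43 rows, the same function leaves 30 rows for training and 13 for testing.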
  • Step 6: Neural network model training
(a) The training data were categorized as 1 (high risk) and 0 (low risk) based on the threshold we set. We set the threshold to be 50% for all vaccine candidates.
(b) Along with the neural network shown in Figure 3, we made a confusion matrix (Table 3) for each vaccination candidate to visualize how well our model predicts the risk.
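The labeling and evaluation in Step 6 can be illustrated with a small Python sketch. Two things here are assumptions rather than details from the study: that a vaccination percentage at or above the 50% threshold maps to label 1, and the `predicted` list, which is an invented model output used only to populate the confusion matrix.

```python
def label_risk(vaccine_pct, threshold=50.0):
    """Return 1 (high risk) at or above the threshold, else 0 (low risk).

    The direction of the comparison is an assumption made for illustration.
    """
    return 1 if vaccine_pct >= threshold else 0

def confusion_matrix(actual, predicted):
    """Count true/false positives and negatives for binary labels."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["TP"] += 1
        elif a == 0 and p == 0:
            counts["TN"] += 1
        elif a == 0 and p == 1:
            counts["FP"] += 1
        else:
            counts["FN"] += 1
    return counts

def accuracy(counts):
    """Fraction of predictions that match the actual labels."""
    return (counts["TP"] + counts["TN"]) / sum(counts.values())

percentages = [80.0, 47.0, 62.0, 20.0, 55.0]   # made-up vaccination percentages
actual = [label_risk(p) for p in percentages]  # [1, 0, 1, 0, 1]
predicted = [1, 1, 1, 0, 1]                    # hypothetical model output

counts = confusion_matrix(actual, predicted)
print(counts)            # {'TP': 3, 'TN': 1, 'FP': 1, 'FN': 0}
print(accuracy(counts))  # 0.8
```

In this made-up example, the single error is the country at 47%, which sits just below the threshold.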
  • Step 7: Network Science for Global Vaccine Connectome
(a) We used network science to visualize global vaccine distribution using Gephi and Google Sheets. We kept a sheet recording the countries, an assigned number for each, and their vaccine percentages. The nodes represent countries and their vaccine percentages, and the edges represent connections between countries that distribute the same vaccine candidate. A thicker edge between countries indicates more social connectedness between those two countries. The network was visualized using Gephi, and the connectome was ranked according to vaccine percentage. A set of specific color codes was used to represent the risk of infection: red edges and nodes meant a low risk of infection, light yellow edges and nodes meant a medium risk, and blue edges and nodes meant a high risk. Countries were assigned a higher risk if the stringency was low and the vaccine percentage was high for a specific vaccine candidate. The threshold for high or low risk was selected by comparing the countries within each vaccine candidate to one another. This was necessary because each vaccine candidate had a different threshold for high or low risk (J&J, for example, had its highest at 25%, while Pfizer had its highest at 80%). Additionally, we ran the Fruchterman–Reingold layout to arrange the network into a circular shape and make it easier to visualize.
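One way to turn a sheet of per-country vaccine percentages into an edge list that Gephi can import is sketched below in Python. The study does not state how edge weights were computed, so the weight used here (the smaller of the two countries' percentages, which is large only when both are high and similar) is an assumption, as are the function names and the sample data. The `Source`, `Target`, and `Weight` column names follow Gephi's CSV edge table convention.

```python
import csv
import io
from itertools import combinations

def build_edges(vaccine_pct_by_country):
    """Connect every pair of countries that both distribute the vaccine."""
    edges = []
    for a, b in combinations(sorted(vaccine_pct_by_country), 2):
        pct_a = vaccine_pct_by_country[a]
        pct_b = vaccine_pct_by_country[b]
        if pct_a > 0 and pct_b > 0:
            # Assumed weight: large only when both percentages are high.
            edges.append((a, b, min(pct_a, pct_b)))
    return edges

def edges_to_csv(edges):
    """Serialize the edges as CSV text with Gephi-style column headers."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    writer.writerows(edges)
    return buffer.getvalue()

# Made-up percentages of each population vaccinated with one candidate.
pfizer = {"A": 60.0, "B": 55.0, "C": 10.0, "D": 0.0}
edges = build_edges(pfizer)
print(edges_to_csv(edges))
```

Country "D" does not use the vaccine in this example, so it receives no edges and would appear as an isolated node.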

4. Results and Discussion

A neural network was used to predict the risk of COVID-19 infection in countries based on vaccination rates and vaccine distribution. Each country was assigned a risk factor of either 1 (high risk) or 0 (low risk). We used a neural network to predict risk based on data for various vaccine candidates, such as Pfizer, Moderna, J&J, Sinopharm, and Sinovac, while also testing the accuracy of each model (Table 4). We will use Pfizer as an example of how to interpret the networks. Pfizer was used in 43 countries, and, according to the 70–30 split, 30% of them is approximately 13. This means that the model was tested on 13 countries. Using the accuracy rate (which we will discuss later), which was around 97% for Pfizer, we can determine the number of countries the model predicted correctly and incorrectly (for Pfizer, twelve were correct and one was incorrect). Incorrect predictions may have arisen because these countries were close to the cutoff for the risk. The cutoff for Pfizer, for example, was 50%; countries near 50% may therefore have caused confusion in the model (one country was around 47%). Another issue for other vaccines could be the amount of available data, as some vaccine candidates were only used in five countries; thus, there were not enough data to train the model.
Along with understanding how the neural network works, it is important to understand the risk, as that will help researchers draw more accurate conclusions. Risk labels are relative to each vaccine candidate: a country labeled high risk for a candidate administered to a larger population will likely be at higher risk than a country labeled high risk for a less “popular” candidate, even though both sit at the higher end of their own candidate's spectrum. An example of this would be comparing countries that have Pfizer vs. J&J. Countries at high risk in our Pfizer NN are at a much higher risk than the countries at high risk in J&J, as J&J was not administered in numbers as large as Pfizer was. The summary of all our countries’ risks shown in Table 5 explains why some countries with certain vaccine candidates are at a higher risk compared to other vaccine candidates. We also used a neural network that was trained and tested on all vaccine candidate data.
To help with visualization we created an interactive map (Figure 4) allowing users to hover over countries to explore the vaccination rates for each country. Using R, we performed exploratory data analysis on the data we collected and created a database to store our vaccination data and to perform calculations on the vaccination percentages. We created a foundational UX using HTML and CSS and plotted all of the countries on our program. Maps for each vaccine candidate were developed, as well as a map showing overall vaccination rates.
We then used network science to model the vaccination candidates using Gephi, where the nodes represent countries, and the edges represent connections between countries that distribute the same vaccine candidate. The Fruchterman–Reingold model is a force-directed layout model that calculates the optimal positions of each node in our network, minimizing the total energy in the system. This allowed our network (Figure 5) to include and arrange countries outside of those that shared the same vaccine distributor. This helped provide a visual understanding of which countries distributed the vaccine and which countries did not; it also showed how these countries’ data correlated with those of other countries.
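To make the force-directed idea concrete, the following is a simplified Python sketch of a Fruchterman–Reingold style layout. It is not Gephi's implementation: the frame size, starting temperature, cooling factor, iteration count, and sample graph are all assumptions chosen for illustration. Every pair of nodes repels with magnitude k²/d, every edge attracts with magnitude d²/k, and a decreasing "temperature" caps how far a node may move in each iteration.

```python
import math
import random

def fruchterman_reingold(nodes, edges, iterations=100, size=1.0, seed=0):
    """Return {node: [x, y]} positions inside a size-by-size frame."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(0, size), rng.uniform(0, size)] for n in nodes}
    k = size / math.sqrt(len(nodes))  # ideal spacing: sqrt(area / node count)
    temperature = size / 10           # cap on movement per iteration

    def direction(u, v):
        """Unit vector pointing from v to u, plus the distance between them."""
        dx = pos[u][0] - pos[v][0]
        dy = pos[u][1] - pos[v][1]
        dist = math.hypot(dx, dy) or 1e-9  # avoid division by zero
        return dx / dist, dy / dist, dist

    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair of nodes: magnitude k^2 / distance.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                ux, uy, dist = direction(u, v)
                force = k * k / dist
                disp[u][0] += ux * force
                disp[u][1] += uy * force
                disp[v][0] -= ux * force
                disp[v][1] -= uy * force
        # Attraction along each edge: magnitude distance^2 / k.
        for u, v in edges:
            ux, uy, dist = direction(u, v)
            force = dist * dist / k
            disp[u][0] -= ux * force
            disp[u][1] -= uy * force
            disp[v][0] += ux * force
            disp[v][1] += uy * force
        # Move each node by at most `temperature`, clamped to the frame.
        for n in nodes:
            dx, dy = disp[n]
            length = math.hypot(dx, dy) or 1e-9
            step = min(length, temperature)
            pos[n][0] = min(size, max(0.0, pos[n][0] + dx / length * step))
            pos[n][1] = min(size, max(0.0, pos[n][1] + dy / length * step))
        temperature *= 0.95  # cool down so the layout settles
    return pos

layout = fruchterman_reingold(["A", "B", "C", "D"], [("A", "B"), ("B", "C")])
for country, (x, y) in layout.items():
    print(country, round(x, 3), round(y, 3))
```

Node "D" has no edges in this example, so only repulsion acts on it; this mirrors how countries that do not share a vaccine candidate still receive a position in the layout.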
A thicker edge and darker color between countries indicate more social connectedness between those two countries and thus a higher vaccine correlation. Higher risk is shown by the cooler colors, while the warmer colors represent lower risk.
You can find the rest of our interactive networks, as well as the interactive maps at https://www.jneurolab.org/covid-vacnet-website (accessed on 1 January 2020).
With all this information, we were able to conclude that if the next variant begins to spread and is shown to produce disease despite a specific vaccine candidate, we will be able to predict the countries that will be strongly affected. For example, if a new COVID variant arises that severely affects people who have taken the Pfizer vaccine, but not any other vaccine, we can analyze the Pfizer network to understand how the vaccinated subpopulations of countries that distribute Pfizer will be affected by the new COVID variant. Countries that are not connected by the Pfizer network either will not be as affected or do not have any data associated with them, and the countries that are connected by the network will be affected to some degree. The countries in blue will be at a low risk of being severely affected by an outbreak of the new COVID variant, while countries in red will be at high risk of being severely affected. The countries in white have a risk value that lies between the blue and red countries.
Our results have shown that countries that do not rely on just one vaccine, and whose populations take multiple vaccines, will be better able to withstand a variant that targets one vaccine. This is the case because, even if part of their population is susceptible to the variant, another part of the population has taken other vaccines; as a result, they may be able to fight the variant, thus decreasing the probability of an outbreak. However, it is important to note that there may still be some bias in our findings as a result of the lack of data on some vaccines, which may have allowed our deep-learning model to overfit the data.
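The diversification argument above can be illustrated with a short Python calculation. The function name, the two example countries, and their vaccine shares are hypothetical; the sketch simply computes what fraction of a country's vaccinated population received a vaccine that a new variant is assumed to evade.

```python
def evaded_fraction(vaccine_shares, evaded):
    """Fraction of the vaccinated population whose vaccine the variant evades.

    `vaccine_shares` maps vaccine name to the percentage of the population
    vaccinated with it; `evaded` is the set of vaccines assumed to fail.
    """
    total = sum(vaccine_shares.values())
    if total == 0:
        return 0.0
    exposed = sum(pct for name, pct in vaccine_shares.items() if name in evaded)
    return exposed / total

# Made-up countries with the same overall coverage (60% of the population).
single_vaccine = {"Pfizer": 60}
mixed_vaccines = {"Pfizer": 20, "Moderna": 20, "J&J": 20}

print(evaded_fraction(single_vaccine, {"Pfizer"}))            # 1.0
print(round(evaded_fraction(mixed_vaccines, {"Pfizer"}), 2))  # 0.33
```

With identical overall coverage, the country relying on a single vaccine has its entire vaccinated population exposed, while the mixed country has only about a third exposed.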

5. Conclusions and Limitations

In summary, our findings suggest that the risk of COVID-19 infection in countries can be accurately predicted using a neural network developed with R that takes into account vaccination percentages and stringency indices. From the results of our neural network model, we were able to calculate risk and distinguish countries at high risk of a COVID-19 outbreak from those at low risk based on each country’s vaccination percentage and stringency. This would then allow us to determine which countries would be the most resilient to a new variant of the COVID-19 virus that evades a specific vaccine candidate such as Pfizer or Moderna. A country that relies solely on one vaccine may face a larger problem than a country whose population uses multiple vaccines, and these models can help track data and inform the corresponding government in case another outbreak occurs. Knowing this information would make it possible to allocate resources to countries that we find may struggle.
The map made with JavaScript, HTML, and R makes it easier to interpret the results of our machine-learning algorithm. Instead of searching through a table full of numbers, we could just hover over the country of interest and see each of their vaccine percentages.
We discovered that we could apply network science to our results from machine learning algorithms to connect the countries from our global map. Thicker edges mean that vaccine percentages are both relatively high and similar between the corresponding countries. If one specific variant of COVID-19 evades one or a few of the vaccines, we would be able to tell which countries would be affected the most by observing the edges and by identifying the countries that rely more on that vaccine.
Our results have important implications for understanding the spread of COVID-19 and developing strategies to control the pandemic while identifying countries that may require immediate medical support. Targeting these low-performing countries could aid more globally coordinated responses, which were lacking during COVID, either by allowing high-performing countries to better identify other countries that need assistance or by giving more support to these low-performing countries, thus making them better equipped to be a part of these coordinated responses. Even though a global response is an important factor for promoting an effective response, it is equally important to use network science to determine where to focus that coordinated response in order to minimize casualties. These results help us understand the spread of COVID-19, because if one country is at a higher risk due to reliance solely on one vaccine candidate, other countries that share the same vaccine candidate will also likely be at risk. Of course, drawing this conclusion would mean having to account for many other factors, such as the interaction between people within those countries; however, our networks can help identify which countries may be at a higher risk. Our networks would help healthcare workers identify which countries are more correlated with each other in terms of the probability of an outbreak. This can be seen based on the weight of the edges that have that country as one of the nodes.

6. Limitations

Finally, it is important to realize that there were numerous factors involved in the progression of COVID-19, and our data were limited. Our project could be used to help identify countries that may be at risk; however, there are many other factors, such as political issues, that our project does not address. For example, some countries were not able to counter COVID due to poor government structures or lack of support from the populace. Some vaccination candidates, such as Sinopharm, only had 4–5 countries with data. This could lead to bias in our models for the vaccination candidates with less data available. Our networks assume one vaccine is effective against a variant. Our project can be used as a preliminary way to identify countries that may be in need; however, administering aid is not especially easy due to resistance from certain countries or the complexities surrounding foreign affairs throughout the globe. Another limitation on the use of this model is the need for data on how vaccines interact with the variant. We would need to know which vaccines are and are not effective against the variants to determine which countries would be at risk. Once we have these data, conclusions from our model can be made.

7. Future Works

One option for improving this study could be to build the network on top of the world map, which would allow for cleaner visualization; updating networks in real time could be an option as well. Our networks and neural networks can also potentially be used to predict the countries that may struggle the most with a future disease based on stringency and vaccination percentages. These parameters could even be expanded, and research could be carried out to determine other factors that may affect the disease. This could even be explored for COVID to make this model more accurate. Examples of other factors could be the economic situation of certain countries or cultural biases towards vaccines.

Author Contributions

Conceptualization, methodology, formal analysis, investigation, writing – review and editing, supervision, project administration, funding acquisition: S.J.; methodology, software, investigation, resources, formal analysis, data curation: P.B., E.Z., S.P., M.B., and A.D.; validation, visualization, writing – review and editing: P.B., E.Z., and A.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

We used data from Our World in Data to determine the stringency: https://ourworldindata.org/covid-stringency-index#:~:text=The%20nine%20metrics%20used%20to,movements%3B%20and%20international%20travel%20controls (accessed on 30 July 2022). We used BBC News to obtain some data for India: https://www.bbc.com/news/world-asia-india-56345591 (accessed on 23 February 2023). We used the NYT for our data on other countries (the data may now differ, as we started collecting them in 2021–2022): https://www.nytimes.com/interactive/2021/world/covid-cases.html (accessed on 12 July 2022).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Chaudhary, J.K.; Yadav, R.; Chaudhary, P.K.; Maurya, A.; Kant, N.; Rugaie, O.A.; Haokip, H.R.; Yadav, D.; Roshan, R.; Prasad, R.; et al. Insights into COVID-19 vaccine development based on immunogenic structural proteins of SARS-COV-2, host immune responses, and herd immunity. Cells 2023, 10, 2949.
  2. Seladi-Schulman, J. Coronavirus vs. SARS. Healthline, 15 September 2021.
  3. How viral mutations occur in SARS-COV-2. Yale Medicine, 19 February 2021.
  4. Johnson, S. What you need to know about SARS (severe acute respiratory syndrome). Healthline, 31 March 2017.
  5. Rabaan, A.A.; Al-Ahmed, S.H.; Haque, S.; Sah, R.; Tiwari, R.; Malik, Y.S.; Dhama, K.; Yatoo, M.I.; Bonilla-Aldana, D.K.; Rodriguez-Morales, A.J. SARS-CoV-2, SARS-CoV, and MERS-COV: A comparative overview. Infez. Med. 2020, 28, 174–184.
  6. De Groot, R.J.; Baker, S.C.; Baric, R.S.; Brown, C.S.; Drosten, C.; Enjuanes, L.; Fouchier, R.A.M.; Galiano, M.; Gorbalenya, A.E.; Memish, Z.A.; et al. Middle East Respiratory Syndrome Coronavirus (MERS-CoV): Announcement of the Coronavirus Study Group. J. Virol. 2013, 87, 7790–7792.
  7. Brant, A.C.; Tian, W.; Majerciak, V.; Yang, W.; Zheng, Z.M. SARS-COV-2: From its discovery to genome structure, transcription, and replication. Cell Biosci. 2023, 11, 136.
  8. Uddin, S.; Haque, I.; Lu, H.; Moni, M.A.; Gide, E. Comparative performance analysis of K-nearest neighbour (KNN) algorithm and its different variants for disease prediction. Sci. Rep. 2022, 12, 6256.
  9. Barabási, A.-L. Scale-Free Networks: A Decade and Beyond. Science 2009, 325, 412–413.
  10. Park, M.-B.; Ranabhat, C.L. COVID-19 trends, public restrictions policies and vaccination status by economic ranking of countries: A longitudinal study from 110 countries. Arch. Public Health 2022, 80, 197.
  11. Kumari, R.; Srivastava, S.K. Machine Learning: A Review on Binary Classification. Int. J. Comput. Appl. 2017, 160, 11–15.
  12. Mehlig, B. Machine Learning with Neural Networks; University of Gothenburg: Gothenburg, Sweden, 2021.
  13. Sputnik, V. COVID-19 vaccine candidate appears safe and effective. Lancet 2021, 397, 642–643.
  14. Shah, S.; Mulahuwaish, A.; Ghafoor, K.Z.; Maghdid, H.S. Prediction of global spread of COVID-19 pandemic: A review and research challenges. Artif. Intell. Rev. 2021, 55, 1607–1628.
  15. Hirschprung, R.S.; Hajaj, C. Prediction model for the spread of the COVID-19 outbreak in the global environment. Heliyon 2021, 7, e07416.
  16. Wieczorek, M.; Siłka, J.; Woźniak, M. Neural network powered COVID-19 spread forecasting model. Chaos Solitons Fractals 2020, 140, 110203.
  17. Tomar, A.; Gupta, N. Prediction for the spread of COVID-19 in India and effectiveness of preventive measures. Sci. Total. Environ. 2020, 728, 138762.
  18. Barda, N.; Riesel, D.; Akriv, A.; Levy, J.; Finkel, U.; Yona, G.; Greenfeld, D.; Sheiba, S.; Somer, J.; Bachmat, E.; et al. Developing a COVID-19 mortality risk prediction model when individual-level data are not available. Nat. Commun. 2020, 11, 4439.
  19. Huang, X.; Zhang, R.; Li, X.; Dadashova, B.; Zhu, L.; Zhang, K.; Li, Y.; Shen, B. Health-Based Geographic Information Systems for Mapping and Risk Modeling of Infectious Diseases and COVID-19 to Support Spatial Decision-Making. In Translational Informatics. Advances in Experimental Medicine and Biology; Shen, B., Ed.; Springer: Singapore, 2022; Volume 1368.
  20. Pal, R.; Sekh, A.A.; Kar, S.; Prasad, D.K. Neural Network Based Country Wise Risk Prediction of COVID-19. Appl. Sci. 2020, 10, 6448.
  21. Bird, J.J.; Barnes, C.M.; Premebida, C.; Ekárt, A.; Faria, D.R. Country-level pandemic risk and preparedness classification based on COVID-19 data: A machine learning approach. PLoS ONE 2020, 15, e0241332.
  22. COVID-19 World Map: Cases, Deaths and Global Trends. The New York Times, 11 March 2023.
  23. Beck, M.W. NeuralNetTools: Visualization and Analysis Tools for Neural Networks. J. Stat. Softw. 2018, 85, 1–20. [Google Scholar] [CrossRef] [PubMed]
  24. Albert, R.; Barabási, A.-L. Statistical mechanics of complex networks. Rev. Mod. Phys. 2002, 74, 47–97. [Google Scholar] [CrossRef]
  25. COVID-19 Map—Johns Hopkins Coronavirus Resource Center; Johns Hopkins Coronavirus Resource Center: Baltimore, MD, USA, 2020.
  26. Serafino, M.; Monteiro, H.S.; Luo, S.; Reis, S.D.S.; Igual, C.; Neto, A.S.L.; Travizano, M.; Andrade, J.S.; Makse, H.A. Digital contact tracing and network theory to stop the spread of COVID-19 using big-data on human mobility geolocalization. PLoS Comput. Biol. 2022, 18, e1009865. [Google Scholar] [CrossRef]
  27. Centers for Disease Control and Prevention. CDC COVID Data Tracker. Centers for Disease Control and Prevention; Centers for Disease Control and Prevention: Atlanta, GA, USA, 2021. [Google Scholar]
  28. Dong, E.; Du, H.; Gardner, L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect. Dis. 2020, 20, 533–534. [Google Scholar] [CrossRef]
  29. Cramer, J.S. The origins of logistic regression. SSRN Electron. J. 2003, 19, 4. [Google Scholar] [CrossRef]
  30. Günther, F.; Fritsch, S. neuralnet: Training of Neural Networks. R J. 2010, 2, 30–38. [Google Scholar] [CrossRef]
  31. Stevens, N.T.; Wilson, J.D. The past, present, and future of network monitoring: A panel discussion. Qual. Eng. 2021, 33, 715–718. [Google Scholar] [CrossRef]
Figure 1. Methodology. A visual overview of the steps in our methodology; each step is explained in more detail in the text.
Figure 2. NN function. An example of the function we used to create our neural network, shown here for Pfizer.
Figure 3. Pfizer network output. An example of our Pfizer neural network (the blue circles represent the biases).
Figure 4. Vaccinated percentage per location. This map shows each country's vaccination percentage; countries in warmer colors such as red or orange have a higher percentage than countries in cooler colors such as blue and green.
Figure 5. Pfizer (top) and Moderna (bottom) networks. The networks, developed using the Albert Barabasi model, show how similar the countries' vaccination percentages are. Compared with Moderna, Pfizer has far more countries with high vaccination percentages; that is, in more countries a large share of the population (above 50%) has received this vaccine. Many countries administer the Moderna vaccine, but the percentages are much lower. Blue countries have a high vaccine percentage, red countries have a low one, and yellow countries are in the middle. Countries connected by blue lines share similarly high vaccine percentages, while red lines mean that they share a low vaccine percentage (yellow countries share a mid-range percentage: not high, but not low). The mid-range value lies around the midpoint between the highest and lowest vaccine percentages. The networks are zoomed in to show the countries that have the vaccine more clearly. The countries surrounding the networks either had no data or did not use that vaccine (this key applies to all the maps that follow). For Pfizer, most countries have blue nodes and are connected by blue lines. This means that most countries that use Pfizer rely on this vaccine candidate, as the majority of their population is vaccinated with it. If there is another COVID outbreak and Pfizer proves less effective, these countries will struggle more because of their heavy reliance on Pfizer. Countries shown in green and yellow may be safer, especially if their population uses many other vaccine candidates. Red countries have a very small percentage of their population vaccinated with Pfizer; one would therefore need to look at the other vaccine candidates to determine whether that country has used others or whether Pfizer is its only one. If it is the only one, only a small percentage of the population is vaccinated; hence, these countries will also struggle if there is another outbreak.
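The idea of linking countries with similar vaccination values can be illustrated with a small sketch. This is not the authors' Albert Barabasi based construction; it is a plain similarity-threshold graph, swapped in for illustration. The five Pfizer/BioNTech values are taken from Table 2, while `MAX_GAP`, the `tier` helper, and the three equal-width tiers are hypothetical choices made only for this example.

```python
from itertools import combinations

# Pfizer/BioNTech doses per person for five countries (values from Table 2).
pfizer = {
    "Argentina": 0.430215955,
    "Austria": 1.849155103,
    "Belgium": 1.88113656,
    "Bulgaria": 0.450055154,
    "Denmark": 2.092152869,
}

# Hypothetical threshold: link two countries if their values differ by at most this.
MAX_GAP = 0.1


def tier(value, low, high):
    """Label a value low / mid / high using three equal-width bands (an assumption)."""
    third = (high - low) / 3
    if value < low + third:
        return "low"
    if value < low + 2 * third:
        return "mid"
    return "high"


low, high = min(pfizer.values()), max(pfizer.values())

# Node labels play the role of the red / yellow / blue node colors in the figure.
nodes = {country: tier(value, low, high) for country, value in pfizer.items()}

# Edges connect countries whose values are similar.
edges = [
    (a, b)
    for a, b in combinations(sorted(pfizer), 2)
    if abs(pfizer[a] - pfizer[b]) <= MAX_GAP
]

print(nodes)
print(edges)
```

With these inputs, Argentina and Bulgaria fall in the low band and are linked to each other, while Austria and Belgium fall in the high band and are linked; Denmark is high but has no neighbor within the chosen gap.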
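Table 1. Original data. This is a snippet of our data showing the country, the day the data were updated, and the total number of vaccine doses in each country for each manufacturer (this snippet does not show all manufacturers and countries).
Entity | Code | Day | Pfizer/BioNTech | Moderna | Oxford/AstraZeneca | Johnson&Johnson | Sputnik V | Sinovac | Sinopharm/Beijing | CanSino | Novavax | Covaxin | Medicago | Sanofi/GSK | SKYCovione | Valneva | Total Population
Argentina | ARG | 2023-02-11 | 19,724,295 | 17,371,749 | 26,774,900 | 0 | 20,751,112 | 0 | 28,941,172 | 985,243 | 0 | 0 | 0 | 0 | 0 | 0 | 45,847,428
Austria | AUT | 2023-02-03 | 16,763,486 | 1,663,838 | 1,593,159 | 368,380 | 0 | 0 | 0 | 0 | 14,893 | 0 | 0 | 140 | 0 | 2162 | 9,065,484
Belgium | BEL | 2023-02-03 | 21,931,050 | 4,389,147 | 2,849,228 | 428,552 | 0 | 0 | 0 | 0 | 406 | 0 | 0 | 0 | 0 | 0 | 11,658,404
Bulgaria | BGR | 2023-02-10 | 3,090,202 | 511,919 | 478,547 | 530,219 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6,866,274
Canada | CAN | 2023-12-04 | 63,482,165 | 28,804,585 | 2,815,498 | 23,591 | 0 | 0 | 0 | 0 | 28,433 | 0 | 863 | 0 | 0 | 0 | 37,742,154
Chile | CHL | 2023-08-30 | 8,067,724 | 0 | 549,673 | 0 | 0 | 25,943,395 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 19,116,201
Croatia | HRV | 2023-10-21 | 4,045,117 | 520,646 | 568,527 | 204,677 | 0 | 0 | 0 | 0 | 1373 | 0 | 0 | 0 | 0 | 0 | 4,067,642
Cyprus | CYP | 2023-02-03 | 1,309,054 | 199,684 | 254,531 | 31,016 | 0 | 0 | 0 | 0 | 898 | 0 | 0 | 0 | 0 | 0 | 1,220,541
Czechia | CZE | 2023-02-21 | 15,662,393 | 1,641,250 | 886,784 | 413,735 | 0 | 348 | 77 | 0 | 11,055 | 5 | 0 | 0 | 0 | 0 | 10,761,297
Denmark | DNK | 2023-02-10 | 12,188,460 | 1,781,705 | 155,952 | 46,005 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5,825,798
Ecuador | ECU | 2023-01-28 | 8,552,679 | 0 | 5,009,163 | 0 | 0 | 15,812,935 | 0 | 536,882 | 0 | 0 | 0 | 0 | 0 | 0 | 18,033,738
Estonia | EST | 2023-02-03 | 1,632,730 | 244,530 | 238,919 | 79,141 | 0 | 0 | 0 | 0 | 2160 | 0 | 0 | 0 | 0 | 0 | 1,324,323
European Union |  | 2023-02-21 | 664,200,258 | 154,146,954 | 67,192,141 | 18,693,913 | 1,845,376 | 8780 | 2,319,802 | 0 | 303,799 | 126 | 0 | 3905 | 0 | 9234 | 748,835,153
Finland | FIN | 2023-02-03 | 10,286,849 | 2,068,166 | 553,905 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5,553,102
France | FRA | 2023-02-19 | 121,155,541 | 24,121,747 | 7,862,962 | 1,090,803 | 0 | 0 | 0 | 0 | 38,662 | 0 | 0 | 3645 | 0 | 0 | 65,520,147
Germany | DEU | 2023-02-20 | 138,189,814 | 31,585,369 | 12,803,049 | 3,761,229 | 0 | 0 | 0 | 0 | 159,859 | 0 | 0 | 0 | 0 | 7072 | 83,975,691
Hong Kong | HKG | 2023-02-19 | 11,848,626 | 0 | 0 | 0 | 0 | 8,832,512 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 7,585,785
Hungary | HUN | 2023-02-03 | 9,778,198 | 1,079,204 | 1,252,978 | 345,438 | 1,807,392 | 0 | 2,315,130 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 9,618,215
Iceland | ISL | 2023-03-28 | 76,558 | 6493 | 1 | 85 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 344,646
India | IND
Ireland | IRL | 2023-02-17 | 9,603,020 | 1,668,816 | 1,217,678 | 241,601 | 0 | 0 | 0 | 0 | 850 | 0 | 0 | 0 | 0 | 0 | 5,008,554
Italy | ITA | 2023-02-21 | 95,718,713 | 34,324,511 | 12,169,281 | 1,508,434 | 0 | 0 | 0 | 0 | 42,983 | 0 | 0 | 88 | 0 | 0 | 60,320,493
Japan | JPN | 2023-02-20 | 297,815,527 | 83,268,972 | 117,831 | 0 | 0 | 0 | 0 | 0 | 300,250 | 0 | 0 | 0 | 0 | 0 | 125,802,521
Latvia | LVA | 2023-07-11 | 1,624,384 | 715,409 | 262,043 | 293,843 | 0 | 9 | 27 | 0 | 438 | 0 | 0 | 0 | 0 | 0 | 1,855,735
Liechtenstein | LIE | 2023-01-27 | 22,479 | 48,599 | 0 | 264 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 38,416
Lithuania | LTU | 2023-07-08 | 3,315,576 | 328,959 | 536,495 | 295,566 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2,670,680
Luxembourg | LUX | 2023-02-17 | 745,265 | 342,155 | 105,059 | 41,522 | 0 | 0 | 0 | 0 | 496 | 0 | 0 | 0 | 0 | 0 | 640,202
Malta | MLT | 2023-02-03 | 747,037 | 278,881 | 227,870 | 32,421 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 443,646
Nepal | NPL | 2023-07-04 | 1,797,582 | 7,336,243 | 14,088,349 | 3,693,423 | 0 | 0 | 19,972,478 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 29,960,704
Netherlands | NLD | 2023-09-02 | 22,697,220 | 8,448,384 | 2,473,242 | 755,296 | 0 | 0 | 0 | 0 | 2707 | 0 | 0 | 0 | 0 | 0 | 17,195,298
Norway | NOR | 2023-02-03 | 9,766,924 | 2,386,984 | 148,123 | 7394 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5,492,570
Peru | PER | 2023-02-19 | 51,520,283 | 6,301,449 | 8,186,523 | 0 | 0 | 0 | 21,266,405 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 33,587,011
Poland | POL | 2023-02-10 | 41,943,054 | 3,843,870 | 5,292,339 | 2,728,519 | 0 | 0 | 0 | 0 | 15,746 | 0 | 0 | 0 | 0 | 0 | 33,587,011
Portugal | PRT | 2023-02-10 | 17,796,120 | 3,928,446 | 2,276,083 | 1,138,788 | 0 | 8423 | 4568 | 0 | 201 | 121 | 0 | 32 | 0 | 0 | 10,150,252
Romania | ROU | 2023-06-11 | 12,914,439 | 1,008,835 | 849,559 | 2,054,653 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 18,900,064
Slovenia | SVN | 2023-02-03 | 2,325,179 | 236,211 | 322,679 | 135,725 | 0 | 0 | 0 | 0 | 218 | 0 | 0 | 0 | 0 | 0 | 2,079,690
South Africa | ZAF | 2023-12-26 | 28,684,331 | 0 | 0 | 9,366,280 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 69,797,779
South Korea | KOR | 2023-02-21 | 52,280,181 | 13,434,676 | 20,084,922 | 1,516,295 | 0 | 0 | 0 | 0 | 258,451 | 0 | 0 | 0 | 536 | 0 | 51,385,361
Uruguay | URY | 2023-02-15 | 2,565,508 | 0 | 91,200 | 0 | 0 | 3,248,201 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3,505,403
Sweden | SWE | 2023-02-03 | 17,262,652 | 4,174,751 | 1,322,063 | 0 | 0 | 0 | 0 | 0 | 7111 | 0 | 0 | 0 | 0 | 0 | 10,265,156
Switzerland | CHE | 2023-02-20 | 6,210,231 | 10,633,270 | 0 | 63,454 | 0 | 0 | 0 | 0 | 3427 | 0 | 0 | 0 | 0 | 0 | 8,821,366
Ukraine | UKR | 2022-02-23 | 14,774,013 | 3,044,899 | 4,041,487 | 20,680 | 0 | 9,802,231 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 43,042,464
United States | USA | 2023-02-15 | 400,059,167 | 251,049,942 | 0 | 18,976,061 | 0 | 0 | 0 | 0 | 78,683 | 0 | 0 | 0 | 0 | 0 | 336,115,637
Uruguay | URY | 2023-02-15 | 2,565,508 | 0 | 91,200 | 0 | 0 | 3,248,201 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3,505,403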
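Table 2. Preprocessed data. These preprocessed data show, per country and per manufacturer, the number of doses divided by the total population, expressed as a fraction rather than a percentage. Numbers greater than 1 mean that multiple doses per person were administered in the country. Rows appear in the same country order as Table 1.
Pfizer/BioNTech% | Moderna% | Oxford/AstraZeneca% | Johnson&Johnson% | Covishield% | Sputnik V% | Sinovac% | Sinopharm/Beijing% | CanSino% | Novavax% | Covaxin% | Medicago% | Sanofi/GSK% | SKYCovione% | Valneva%
0.430215955 | 0.378903458 | 0.584000045 | 0 | 0 | 0.452612347 | 0 | 0.631249631 | 0.021489602 | 0 | 0 | 0 | 0 | 0 | 0
1.849155103 | 0.183535485 | 0.17573899 | 0.040635448 | 0 | 0 | 0 | 0 | 0 | 0.001642825 | 0 | 0 | 1.5 × 10⁻⁵ | 0 | 0.000238487
1.88113656 | 0.376479233 | 0.244392629 | 0.036759062 | 0 | 0 | 0 | 0 | 0 | 3.5 × 10⁻⁵ | 0 | 0 | 0 | 0 | 0
0.450055154 | 0.074555574 | 0.069695296 | 0.077220775 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
1.681996343 | 0.763193987 | 0.074598233 | 0.000625057 | 0 | 0 | 0 | 0 | 0 | 0.000753349 | 0 | 2.3 × 10⁻⁵ | 0 | 0 | 0
0.422035947 | 0 | 0.028754301 | 0 | 0 | 0 | 1.357141777 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
0.994462394 | 0.127997007 | 0.1397682 | 0.050318342 | 0 | 0 | 0 | 0 | 0 | 0.000337542 | 0 | 0 | 0 | 0 | 0
1.072519481 | 0.163602861 | 0.208539492 | 0.025411682 | 0 | 0 | 0 | 0 | 0 | 0.000735739 | 0 | 0 | 0 | 0 | 0
1.455437295 | 0.152514144 | 0.082404937 | 0.038446574 | 0 | 0 | 3.2 × 10⁻⁵ | 7.15527 × 10⁻⁶ | 0 | 0.001027293 | 4.6 × 10⁻⁷ | 0 | 0 | 0 | 0
2.092152869 | 0.305830206 | 0.026769208 | 0.007896772 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
0.474259912 | 0 | 0.277766207 | 0 | 0 | 0 | 0.876852874 | 0 | 0.029770977 | 0 | 0 | 0 | 0 | 0 | 0
1.232878988 | 0.184645287 | 0.180408405 | 0.05975959 | 0 | 0 | 0 | 0 | 0 | 0.001631022 | 0 | 0 | 0 | 0 | 0
0.886977935 | 0.205848982 | 0.089728882 | 0.02496399 | 0 | 0.002464329 | 1.2 × 10⁻⁵ | 0.003097881 | 0 | 0.000405695 | 1.7 × 10⁻⁷ | 0 | 5.2 × 10⁻⁶ | 0 | 1.2 × 10⁻⁵
1.852450936 | 0.372434362 | 0.099746952 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
1.849134145 | 0.368157706 | 0.120008308 | 0.01664836 | 0 | 0 | 0 | 0 | 0 | 0.000590078 | 0 | 0 | 5.6 × 10⁻⁵ | 0 | 0
1.645593056 | 0.376125146 | 0.152461371 | 0.044789497 | 0 | 0 | 0 | 0 | 0 | 0.001903634 | 0 | 0 | 0 | 0 | 8.4 × 10⁻⁵
1.561951202 | 0 | 0 | 0 | 0 | 0 | 1.164350426 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
1.016633336 | 0.112204188 | 0.130271365 | 0.03591498 | 0 | 0.187913454 | 0 | 0.240702667 | 0 | 0 | 0 | 0 | 0 | 0 | 0
0.222135176 | 0.018839621 | 2.9 × 10⁻⁶ | 0.00024663 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
0 | 0 | 0 | 0 | 1.2393363 | 0.062483 | 0 | 0 | 0 | 0 | 0.256657215 | 0 | 0 | 0 |
1.917323842 | 0.333193173 | 0.243119671 | 0.048237675 | 0 | 0 | 0 | 0 | 0 | 0.00016971 | 0 | 0 | 0 | 0 | 0
1.586835721 | 0.569035651 | 0.201743726 | 0.025006991 | 0 | 0 | 0 | 0 | 0 | 0.000712577 | 0 | 0 | 1.5 × 10⁻⁶ | 0 | 0
2.367325588 | 0.661902252 | 0.000936635 | 0 | 0 | 0 | 0 | 0 | 0 | 0.002386677 | 0 | 0 | 0 | 0 | 0
0.875331877 | 0.385512479 | 0.141207123 | 0.15834319 | 0 | 0 | 4.8 × 10⁻⁶ | 1.5 × 10⁻⁵ | 0 | 0.000236025 | 0 | 0 | 0 | 0 | 0
0.585146814 | 1.265071845 | 0 | 0.006872137 | 0 | 0 | 0 | 0 | 0 | 7.8 × 10⁻⁵ | 0 | 0 | 0 | 0 | 0
1.241472584 | 0.123174248 | 0.200883296 | 0.110670691 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
1.164109141 | 0.534448502 | 0.164102893 | 0.064857654 | 0 | 0 | 0 | 0 | 0 | 0.000774755 | 0 | 0 | 0 | 0 | 0
1.683858301 | 0.628611551 | 0.513630237 | 0.073078536 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
0.059997989 | 0.24486217 | 0.470227569 | 0.123275575 | 0 | 0 | 0 | 0.666622453 | 0 | 0 | 0 | 0 | 0 | 0 | 0
1.319966656 | 0.491319429 | 0.143832459 | 0.043924566 | 0 | 0 | 0 | 0 | 0 | 0.000157427 | 0 | 0 | 0 | 0 | 0
1.778206559 | 0.434584175 | 0.026967886 | 0.001346182 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
1.53393474 | 0.187615653 | 0.243740743 | 0 | 0 | 0 | 0 | 0.633173491 | 0 | 0 | 0 | 0 | 0 | 0 | 0
1.248787932 | 0.114445135 | 0.157571003 | 0.081237327 | 0 | 0 | 0 | 0 | 0 | 0.000468812 | 0 | 0 | 0 | 0 | 0
1.753268786 | 0.387029406 | 0.224239063 | 0.112193077 | 0 | 0 | 0.000829832 | 0.000450038 | 0 | 1.98025 × 10⁻⁵ | 1.19209 × 10⁻⁵ | 0 | 3.2 × 10⁻⁶ | 0 | 0
0.683301337 | 0.053377332 | 0.044950059 | 0.108711431 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
1.11804115 | 0.113579909 | 0.155157259 | 0.06526213 | 0 | 0 | 0 | 0 | 0 | 0.000104823 | 0 | 0 | 0 | 0 | 0
0.410963378 | 0 | 0 | 0.134191663 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
1.017413909 | 0.261449482 | 0.390868559 | 0.029508307 | 0 | 0 | 0 | 0 | 0 | 0.005029662 | 0 | 0 | 0 | 1.0 × 10⁻⁵ | 0
0.731872484 | 0 | 0.02601698 | 0 | 0 | 0 | 0.926626981 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
1.681674589 | 0.406691433 | 0.128791321 | 0 | 0 | 0 | 0 | 0 | 0 | 0.000692732 | 0 | 0 | 0 | 0 | 0
0.703998791 | 1.205399481 | 0 | 0.007193217 | 0 | 0 | 0 | 0 | 0 | 0.000388489 | 0 | 0 | 0 | 0 | 0
0.343242734 | 0.070741745 | 0.093895345 | 0.000480456 | 0 | 0 | 0.227733965 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
1.190242652 | 0.74691539 | 0 | 0.056456942 | 0 | 0 | 0 | 0 | 0 | 0.000234095 | 0 | 0 | 0 | 0 | 0
0.731872484 | 0 | 0.02601698 | 0 | 0 | 0 | 0.926626981 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0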
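The step from Table 1 to Table 2 can be sketched as a division of each manufacturer's dose count by the country's total population. The snippet below is a minimal illustration under that assumption, not the authors' preprocessing script; the dictionary layout and the `doses_per_person` helper are hypothetical, and only two rows and two manufacturer columns from Table 1 are included.

```python
# Two rows of Table 1, restricted to two manufacturers plus total population.
table1 = {
    "Argentina": {"Pfizer/BioNTech": 19_724_295, "Sinovac": 0, "population": 45_847_428},
    "Chile": {"Pfizer/BioNTech": 8_067_724, "Sinovac": 25_943_395, "population": 19_116_201},
}


def doses_per_person(row):
    """Divide each manufacturer's dose count by the country's total population."""
    population = row["population"]
    return {
        name: doses / population
        for name, doses in row.items()
        if name != "population"
    }


table2 = {country: doses_per_person(row) for country, row in table1.items()}

for country, fractions in table2.items():
    print(country, {name: round(value, 9) for name, value in fractions.items()})
```

Chile's Sinovac value comes out above 1 because more Sinovac doses were recorded than there are people in the country, which is the multiple-dose case described in the Table 2 caption.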
Table 3. Confusion matrix for Pfizer. The model correctly predicts 8 low-risk countries (0) and 25 high-risk countries (1), while misclassifying 1 low-risk country as high risk.
Predicted \ Actual | 0 (Low Risk) | 1 (High Risk)
0 (Low Risk) | 8 | 0
1 (High Risk) | 1 | 25
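As a worked illustration of how a confusion matrix is read, the sketch below derives accuracy and error from the four counts in Table 3. The `(predicted, actual)` keying is an assumption about the table layout, and the result is computed from these four counts only; it is not meant to reproduce the accuracy figures reported elsewhere in the article.

```python
# Counts from Table 3, keyed as (predicted, actual); 0 = low risk, 1 = high risk.
# The (predicted, actual) orientation is an assumption about the table layout.
confusion = {
    (0, 0): 8,   # predicted low, actually low
    (0, 1): 0,   # predicted low, actually high
    (1, 0): 1,   # predicted high, actually low
    (1, 1): 25,  # predicted high, actually high
}

total = sum(confusion.values())
correct = confusion[(0, 0)] + confusion[(1, 1)]

accuracy = correct / total  # share of countries classified correctly
error = 1 - accuracy        # share of countries misclassified

print(f"total={total} correct={correct} accuracy={accuracy:.4f} error={error:.4f}")
```

Accuracy here is simply the diagonal of the matrix divided by the total number of classified countries, and error is its complement.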
Table 4. Accuracy for each vaccine candidate using a neural network. Our accuracy rate was the highest for the Sinovac vaccine candidate (it has the least amount of available data, however), and the combined vaccine data had the lowest accuracy rate.
Vaccine Candidate | Accuracy Rate | Error
Pfizer | 0.96551724 | 0.03448276
Moderna | 0.988308 | 0.011692
J&J | 0.965699 | 0.034301
All Vaccines | 0.9431 | 0.056785
Table 5. Summary of risks for each vaccine candidate. This table summarizes the risk for each vaccine candidate and the countries that are most affected by each candidate.
Vaccine | Countries at Risk
Pfizer (countries with this vaccine are at a higher risk because most of their population, around 75%, has the vaccine) | Japan, Austria, Finland, South Africa, Bulgaria, Ireland, Estonia, Norway, Czechia, Denmark
Moderna (countries with this vaccine and at "risk" are not at a very high level of risk, because the highest vaccination rate is around 30%; in other words, not very many people are vaccinated with this vaccine. As long as other vaccine candidates are used, these countries should not be at high risk) | Latvia, Netherlands, Spain, Canada, US, Switzerland, Italy, Luxembourg, Liechtenstein
Johnson & Johnson (countries with this vaccine and at "risk" are not at a very high level of risk, because the highest vaccination rate is around 10%; in other words, not very many people are vaccinated with this vaccine) | Romania, Bulgaria, Latvia, South Africa
Sinovac (countries with this vaccine are at a somewhat high risk because almost half of their population, around 45%, is vaccinated) | Chile, Ecuador
Sinopharm (countries with this vaccine are at a higher risk because most of their population, around 50%, has the vaccine) | Argentina
Sputnik (countries with this vaccine are at a somewhat high risk because almost half of their population, around 45%, is vaccinated) | Argentina
Oxford (countries with this vaccine and at "risk" are not at a very high level of risk, because the highest vaccination rate is around 30%; in other words, not many people are vaccinated with this vaccine) | Argentina, Nepal, South Korea, Malta
Covaxin (countries with this vaccine and at "risk" are not at a very high level of risk, because the highest vaccination rate is around 26%; in other words, not many people are vaccinated with this vaccine) | India
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Bodapati, P.; Zhang, E.; Padmanabhan, S.; Das, A.; Bhattacharya, M.; Jahanikia, S. A Global Network Analysis of COVID-19 Vaccine Distribution to Predict Breakthrough Cases among the Vaccinated Population. COVID 2024, 4, 1546-1560. https://doi.org/10.3390/covid4100107

AMA Style

Bodapati P, Zhang E, Padmanabhan S, Das A, Bhattacharya M, Jahanikia S. A Global Network Analysis of COVID-19 Vaccine Distribution to Predict Breakthrough Cases among the Vaccinated Population. COVID. 2024; 4(10):1546-1560. https://doi.org/10.3390/covid4100107

Chicago/Turabian Style

Bodapati, Pragyaa, Eddie Zhang, Sathya Padmanabhan, Anisha Das, Medha Bhattacharya, and Sahar Jahanikia. 2024. "A Global Network Analysis of COVID-19 Vaccine Distribution to Predict Breakthrough Cases among the Vaccinated Population" COVID 4, no. 10: 1546-1560. https://doi.org/10.3390/covid4100107