*3.3. Questionnaire*

The questionnaire collected a total of 188 responses. Six responses were excluded because the participants did not meet the requirements for this study or did not agree to their data being shared, leaving 182 complete responses for the analysis (Table A1).

The largest age group of participants (31.9%) was 18 to 29 years, and 90.1% stated that they had previously searched for information regarding COVID-19 online (e.g., Google). The most commonly reported frequency of social media use was 'daily' (64.3%). Most of the participants (85.7%) had previously accepted all vaccines they had been offered; 73.8% were not concerned about receiving a COVID-19 vaccination, 17.1% were slightly concerned, 4.3% were very concerned and 4.3% stated that they were impartial.

We asked whether participants had accepted, or would accept, a COVID-19 vaccine. Of the 182 participants, 8.2% had not or would not accept the vaccine, 1.6% did not know, and the majority (90.1%) stated that they had already accepted or would accept a vaccine. The most common reason (40.2%) for accepting a COVID-19 vaccine was 'I want the world to go back to how it used to be before the COVID-19 pandemic', whereas the most common reason for not accepting the COVID-19 vaccine was 'I have done my own research and do not believe them to be safe' (52.9%).

When asked whether they would allow a child under the age of 18 to have a COVID-19 vaccination if offered one in the future, 26.8% of participants would not vaccinate their children against COVID-19 and 5.4% probably would not; 17.9% were unsure, 8.9% probably would and 41.1% said they would vaccinate their children. Participants with adult children (18 or older) or without children automatically skipped this question. We also compared level of concern with vaccine acceptance or rejection (Figure 7): of the 52 participants showing some level of concern, 15 rejected the vaccine.

We asked participants to rate their current depth of knowledge regarding vaccinations in general on a scale from 0 (no knowledge) to 5 (deep/thorough knowledge). Overall, 2.2% stated that they had no understanding, 74.2% felt they had some understanding, and 23.6% felt they had a deep understanding.

Several chi-square tests (significance level, alpha, of 0.05) were performed to determine whether there was an association between certain vaccine refusal prediction factors and acceptance of a COVID-19 vaccine (Table 6). The results show that uptake of COVID-19 vaccines was dependent on previous vaccine history (*p* < 0.001) and an individual's level of concern (*p* < 0.001). However, vaccination understanding (*p* = 0.949491), age (*p* = 0.057899) and time spent on social media (*p* = 0.925771) did not influence acceptance of COVID-19 vaccinations. A chi-square analysis was also performed between responses to the statement 'Vaccine safety and effectiveness data are often false' and intensity of concern, and a significant relationship was found (*p* < 0.001) (Table 6). The majority of respondents who were not concerned about receiving a COVID-19 vaccine 'strongly disagreed' with the statement (52.89%), whereas those who were most concerned most often stated that they 'don't know' (42.86%).

**Table 6.** Chi-square statistical analysis to determine a dependent association between accepting a COVID-19 vaccine and the variables in the table. Vaccine safety (far right column) was analysed against how concerned the participant was.
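To illustrate the statistical procedure, the sketch below shows how such a chi-square test of independence can be run in Python with pandas and scipy on a cross-tabulation of questionnaire responses; the data file and column names are hypothetical and only indicate the structure of the analysis.

```python
# A minimal sketch of the chi-square tests of independence reported above,
# assuming the questionnaire responses sit in a pandas DataFrame; the column
# names ("accepted_vaccine", "previous_vaccine_history", ...) are hypothetical.
import pandas as pd
from scipy.stats import chi2_contingency

ALPHA = 0.05  # significance level used in the analysis


def test_association(df: pd.DataFrame, factor: str,
                     outcome: str = "accepted_vaccine") -> float:
    """Cross-tabulate a candidate predictor against vaccine acceptance and
    run a chi-square test of independence; return the p-value."""
    contingency = pd.crosstab(df[factor], df[outcome])
    chi2, p, dof, _expected = chi2_contingency(contingency)
    verdict = "dependent" if p < ALPHA else "independent"
    print(f"{factor}: chi2 = {chi2:.3f}, dof = {dof}, p = {p:.6f} -> {verdict}")
    return p


# Illustrative usage with the factors examined in Table 6:
# responses = pd.read_csv("questionnaire_responses.csv")
# for factor in ["previous_vaccine_history", "level_of_concern",
#                "vaccination_understanding", "age_group", "social_media_use"]:
#     test_association(responses, factor)
```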


#### **4. Discussion**

*4.1. Machine Learning vs. Lexicon-Based Approaches*

Sentiment analysis research has become increasingly popular over the past two decades [40,61,62], as more efficient sentiment classification models have been devised [63] and studies have compared automated analysis of conversations on social media with manual approaches [64].

Prior studies have compared machine learning methods of text analysis (i.e., SVM) with lexicon-based approaches [28,65,66] and often conclude that the machine learning methods are more effective. For example, Sattar et al. (2021) concluded that VADER was less accurate than machine learning applications and used TextBlob in their study [28]. However, Dhaoui et al. (2015) determined that both approaches performed similarly when analysing Facebook reviews for both positive and negative classification [67]. Much of the literature is contradictory, which highlights the need for continued research comparing the accuracy and precision of machine learning and lexicon-based methods. For example, Nguyen et al. (2018) reported that SVM achieved 89% accuracy and 90% precision compared with VADER (83% and 90%, respectively) [68], whereas a different study reported lower values for SVM (71.8% accuracy and 66.8% precision) and for lexicon-based approaches (71.1% and 65.1%, respectively) [69]. Despite much of the literature claiming the inferiority of lexicon-based approaches, our research required a measure of how positive or negative online sentiment was, which is one advantage of the VADER model [41].
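As an illustration of how such accuracy and precision figures are derived, the following sketch compares two sets of predicted sentiment labels against a manually annotated gold standard using scikit-learn; all labels are invented and do not reflect any dataset used in this study.

```python
# A minimal sketch of comparing a machine learning classifier (e.g., an SVM)
# with a lexicon-based one (e.g., VADER) on accuracy and precision, assuming
# a manually annotated gold standard; the labels below are illustrative only.
from sklearn.metrics import accuracy_score, precision_score

gold = ["pos", "neg", "pos", "neu", "neg", "pos"]        # manual annotations
svm_pred = ["pos", "neg", "neu", "neu", "neg", "pos"]    # hypothetical SVM output
vader_pred = ["pos", "pos", "pos", "neu", "neg", "pos"]  # hypothetical VADER output

for name, pred in [("SVM", svm_pred), ("VADER", vader_pred)]:
    acc = accuracy_score(gold, pred)
    prec = precision_score(gold, pred, average="macro", zero_division=0)
    print(f"{name}: accuracy = {acc:.2%}, macro precision = {prec:.2%}")
```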

In other studies, Microsoft Azure has been found to yield better results when compared to other analyser tools such as Stanford NLP [64], IBM Watson Natural Language Understanding, OpinionFinder 2.0 and Sentistrength [70]. However, as Azure only identifies polarity, it is a less accurate method of measuring an individual's opinion towards a topic than approaches such as VADER [71]; therefore, part of this study compared the sentiment analysis approaches of Microsoft Azure and VADER.

Previous studies have explored sentiment surrounding COVID-19 vaccinations on Twitter [72,73]. Xue et al. (2020) used Latent Dirichlet Allocation (LDA), a machine learning approach, and collected four million tweets on COVID-19 using 25 search words, with the aim of identifying popular themes, sentiment, bigrams and unigrams. The NRC Emotion Lexicon classified sentiments into several emotions, including anger, fear, surprise, sadness, disgust, joy, trust and anticipation, and revealed that Twitter users display 'fear' rather than 'trust' when discussing new cases of COVID-19 [74]. Bhagat et al. (2020) used TextBlob to perform sentiment analysis on 154 articles scraped from blogging and news websites; over 90% of the articles were positive, and blogs were found to be more positive than newspaper articles [75]. Sattar et al. (2021) adopted an approach similar to the present study, analysing COVID-19 vaccine sentiment in a large number of tweets (*n* = ~1.2 million) with lexicon-based classifiers, namely VADER and TextBlob. They also defined neutral sentiment as scores between −0.05 and 0.05 and determined that public sentiment was more positive than negative.
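For reference, a minimal sketch of this lexicon-based classification step is shown below, assuming the vaderSentiment Python package and applying the same neutral band (compound score between −0.05 and 0.05) to VADER's output; the example texts are invented.

```python
# A minimal sketch of lexicon-based classification with VADER, using a neutral
# band of -0.05 to 0.05 on the compound score; the example texts are invented
# and the vaderSentiment package is assumed to be installed.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()


def classify(text: str) -> str:
    """Label a text as positive, negative or neutral from VADER's compound score."""
    compound = analyzer.polarity_scores(text)["compound"]
    if compound >= 0.05:
        return "positive"
    if compound <= -0.05:
        return "negative"
    return "neutral"


for tweet in ["Got my vaccine today, feeling relieved!",
              "Second dose booked for next week.",
              "The rollout has been a complete mess."]:
    print(classify(tweet), "-", tweet)
```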
