*2.1. Data Collection*

To share information as widely as possible, Twitter provides broad access to public data through its Application Programming Interface (API). In this study, Twitter's official API was used to collect tweets in real time between 1 July 2021 and 21 July 2021. The filter arguments "EN" and "RT" were applied to select only English-language tweets and to exclude retweets. Tweets were collected using 43 search terms relating to COVID-19 vaccinations (Table 1), with requests authenticated through Twitter's OAuth 2.0 authorization process, and were saved into an SQLite database. Following a small pilot study to establish which keywords would be most useful to investigate, keywords were selected based on the COVID-19 vaccines available in the UK at the time of data collection, and to avoid collecting a large number of tweets that discussed vaccines in general rather than COVID-19 specifically.
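
As an illustration of how such a filter might be expressed, the following minimal sketch combines a list of search terms with a language restriction and a retweet exclusion. It assumes the query operators `lang:en` and `-is:retweet` from version 2 of the Twitter API, which may differ from the "EN" and "RT" arguments used in this study. The function name `build_rule` and the example terms are hypothetical; the actual 43 terms are those listed in Table 1.

```python
def build_rule(terms, lang="en", exclude_retweets=True):
    """Combine search terms into a single filter rule string.

    Assumption: v2-style operators (lang:, -is:retweet); this is an
    illustrative helper, not the code used in the study.
    """
    # Quote multi-word terms so that they are matched as exact phrases.
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    rule = "(" + " OR ".join(quoted) + ")"
    if lang:
        rule += f" lang:{lang}"        # keep only tweets in this language
    if exclude_retweets:
        rule += " -is:retweet"         # drop retweets
    return rule


# Hypothetical terms for illustration only (see Table 1 for the real list).
example_terms = ["covid vaccine", "AstraZeneca", "Pfizer"]
print(build_rule(example_terms))
```

In practice, a long list of terms would probably need to be split across several rules, since streaming APIs typically limit the length of a single rule.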

A total of 137,781 tweets were collected and stored in a database. Data collected included the user's display name, Twitter handle, tweet text, and the date and time the tweet was published.
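
The storage step can be sketched with Python's built-in `sqlite3` module. The table name, column names, and helper functions below are assumptions made for illustration, chosen to mirror the four fields described above; the schema actually used in the study is not specified.

```python
import sqlite3

# Hypothetical schema covering the four collected fields.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tweets (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    display_name TEXT NOT NULL,
    handle       TEXT NOT NULL,
    text         TEXT NOT NULL,
    created_at   TEXT NOT NULL  -- ISO 8601 timestamp
)
"""


def open_db(path=":memory:"):
    """Open (or create) the database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_tweet(conn, display_name, handle, text, created_at):
    """Insert one tweet using a parameterised query."""
    conn.execute(
        "INSERT INTO tweets (display_name, handle, text, created_at) "
        "VALUES (?, ?, ?, ?)",
        (display_name, handle, text, created_at),
    )
    conn.commit()


def count_tweets(conn):
    """Return the number of stored tweets."""
    return conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]


# Usage with a made-up record (not taken from the study's data).
conn = open_db()
save_tweet(conn, "Example User", "@example", "Had my second jab today.",
           "2021-07-01T09:30:00Z")
print(count_tweets(conn))
```

Parameterised queries (`?` placeholders) are used so that tweet text containing quotation marks or other special characters is stored safely.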


**Table 1.** Text mining parameter details.
