**2. Materials and Methods**

We performed a cross-sectional analysis of aggregated COVID-19 vaccination coverage data in Guatemala from 13 February 2020 to 30 November 2022. Because the municipality was the finest level at which data were available, we conducted an ecological analysis of factors associated with coverage of the primary COVID-19 vaccine series by municipality.

Sociodemographic variables at the municipal level were obtained from the 2018 Guatemala Population and Housing Census (Table 1) [22]. Municipalities are organized into 22 departments, and some variables available only at the departmental level were obtained from the 2014–2015 Demographic and Health Survey [29]. We chose, *a priori*, sociodemographic variables that could serve as proxies for healthcare access, poverty, and related factors that we hypothesized to be associated with vaccination uptake [30–32]. We elected to use the general poverty indicator (percentage of each municipality population experiencing poverty) developed by Figueroa Chávez and colleagues and shared with our team [33]. Figueroa Chávez and colleagues developed the general poverty indicator by using the associations between sociodemographic variables in the 2018 Census and the poverty measure in the 2014 National Survey of Living Conditions [34] to estimate poverty at the municipality level [35]. According to their findings, the general poverty indicator ranged from 10.56% in the Jocotenango municipality in Sacatepéquez to 94.59% in the Senahú municipality in Alta Verapaz [35]. COVID-19 vaccination data differentiated by municipality, department, and age were available from the MSPAS of Guatemala surveillance websites from 25 February 2021 to 30 November 2022 [11,12]. SARS-CoV-2 testing data (either antigen or polymerase chain reaction tests) and death data were also available through MSPAS from 13 February 2020 to 30 November 2022 [11]. A completed primary COVID-19 vaccination course was defined as two doses of any of the four nationally available vaccines among people aged six years and older, consistent with current national guidelines [11]. The data used in this study were all de-identified, aggregated, and, with the exception of the general poverty indicator, publicly available.

**Table 1.** Data elements used in the analysis.


Municipal-level independent variables included in the model were the percentage of each municipality population reported to be of Mayan ethnicity, living in a rural residence, of the female sex, having attained primary school or higher educational level, experiencing poverty, aged 0–17 years or ≥60 years, and having died due to COVID-19 (Table 1). The reported number of SARS-CoV-2 tests by municipality was an additional independent variable. Independent variables at the departmental level included the under-five childhood mortality rate, the percentage of women aged 15–49 years who reported problems accessing health services when ill due to distance to a health establishment, the percentage of children aged 12–23 months who had received a third dose of Pentavalent vaccine (a combination vaccine against diphtheria, tetanus, pertussis, hepatitis B, and *Haemophilus influenzae* type b), and the Gini coefficient indicating income inequality. The dependent variable was the percentage of each municipality population with a complete primary COVID-19 vaccination course. Municipal percentages of completed COVID-19 vaccination and of SARS-CoV-2 tests that exceeded 100.0% (possible because total population estimates were from 2018) were capped at 99.0%. Given that the proportion of the population that died from COVID-19 by municipality was relatively small, the variable was scaled by 100 in the model to achieve a similar order of magnitude to the other variables.
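The capping and rescaling rules above can be sketched in a few lines of Python. This is a minimal illustration rather than the study's code (the analysis itself was run in R). The function names are hypothetical, and the sketch assumes that "scaled by 100" means multiplication by 100.

```python
def cap_percentage(value, limit=100.0, cap=99.0):
    """Return `cap` when a percentage exceeds `limit`; otherwise return it unchanged.

    Mirrors the rule in the text: values above 100.0% are set to 99.0%.
    """
    return cap if value > limit else value


def scale_death_percentage(value, factor=100.0):
    """Rescale the small COVID-19 death percentage.

    Assumption: "scaled by 100" is read here as multiplying by 100.
    """
    return value * factor


# Illustrative values only, not study data.
coverage = [cap_percentage(v) for v in (62.5, 104.2, 99.5)]
```

Under this reading, only values strictly above 100.0% are altered, so a municipality at 99.5% keeps its reported value.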

Two subanalyses were also performed to assess whether the demographic associations with vaccination in the overall model were consistent among subgroups. In the first, the dependent variable was limited to the population aged 60 years or older who had completed COVID-19 vaccination. We chose this subgroup given the initial national focus on vaccinating older adults. In the second subanalysis, the SARS-CoV-2 cases and COVID-19 vaccination data were confined to the period of the highest national COVID-19-related death rate, from 13 February 2020 to 1 October 2021 [6]. All count variables (derived from the census and the MSPAS) were converted to percentages to account for differences in municipal total populations. Data on deaths due to COVID-19 were missing for four municipalities (San Juan Tecuaco, Santa Rosa; Concepción, Sololá; Santa Catarina Palopó, Sololá; Río Blanco, San Marcos); these municipalities were excluded from the multivariable models.
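The two data-preparation steps described above (converting counts to percentages and dropping municipalities with missing death data) might look like the following Python sketch. The record layout, field names, and numbers are hypothetical and purely illustrative; they are not taken from the census or MSPAS files.

```python
def to_percentage(count, population):
    """Express a count as a percentage of the municipal population."""
    return 100.0 * count / population


def complete_cases(records, required_key):
    """Keep only records in which `required_key` is present and not None."""
    return [r for r in records if r.get(required_key) is not None]


# Hypothetical municipalities with made-up numbers (not study data).
records = [
    {"name": "Municipality A", "population": 20000, "vaccinated": 9000, "deaths": 12},
    {"name": "Municipality B", "population": 8000, "vaccinated": 2400, "deaths": None},
]

modeled = complete_cases(records, "deaths")
for r in modeled:
    r["vaccinated_pct"] = to_percentage(r["vaccinated"], r["population"])
```

Here "Municipality B" is dropped because its death count is missing, which corresponds to excluding the four municipalities from the multivariable models.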

We calculated descriptive statistics for sociodemographic characteristics among municipalities and departments. We used Pearson correlation coefficients and variance inflation factors to assess potential collinearity within our model. The poverty indicator used in our analysis was developed using some of the variables included in our model; however, these common variables were not heavily weighted in the poverty index [33,35]. Because this was the most robust measure of poverty by municipality despite potential collinearity, we performed a sensitivity analysis of the model without the poverty indicator, found similar results, and chose to retain the variable (Supplementary Table S1). We identified municipalities with high Indigenous populations, rurality, and poverty that achieved a COVID-19 vaccination coverage of at least 70%, in line with World Health Organization (WHO) guidelines [36]. We assessed relationships between municipal- and departmental-level factors and COVID-19 vaccination using multilevel modeling, allowing for random department-level intercepts to account for differences between departments. The model results were robust to different specifications of the underlying error distribution. A multilevel linear regression model was selected to maximize both model fit and interpretability. All variables were included in the full multivariable model, and those variables with associations significant at *p* < 0.05 were included in the simplified multivariable model. We used a normal approximation of a 1000-replicate parametric bootstrap to generate our 95% confidence intervals and present both the marginal R² (representing the proportion of the variance explained by the model's fixed effects) and conditional R² (representing the proportion of the variance explained by both the fixed and random effects) for each model. All analyses were performed using R Statistical Software (v.4.2.2; Vienna, Austria) [37].
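
For orientation, the model structure described above can be written out as a standard random-intercept linear model. The notation is illustrative and assumed, not taken from the study's code: $y_{ij}$ is the vaccination coverage of municipality $i$ in department $j$, $x_{pij}$ are the municipal-level covariates, $z_{qj}$ are the departmental-level covariates, and $u_j$ is the department-level random intercept.

$$
y_{ij} = \beta_0 + \sum_{p} \beta_p x_{pij} + \sum_{q} \gamma_q z_{qj} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2), \quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
$$

Under one widely used definition for linear mixed models (assumed here, since the text does not state which was applied), with $\sigma_f^2$ denoting the variance of the fixed-effect predictions, the two reported quantities are:

$$
R^2_{\text{marginal}} = \frac{\sigma_f^2}{\sigma_f^2 + \sigma_u^2 + \sigma_\varepsilon^2},
\qquad
R^2_{\text{conditional}} = \frac{\sigma_f^2 + \sigma_u^2}{\sigma_f^2 + \sigma_u^2 + \sigma_\varepsilon^2}
$$
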
We hypothesized, based on our literature review, that COVID-19 vaccination would be negatively associated with higher poverty, rurality, and Indigenous population.
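
The normal approximation to a bootstrap confidence interval mentioned above can be illustrated with a short Python sketch. It assumes the simplest form of the method: the point estimate plus or minus a normal quantile times the standard deviation of the bootstrap replicates. Some implementations also subtract an estimated bootstrap bias, which is omitted here. The function name and the example values are hypothetical; the study's own intervals were computed in R.

```python
from statistics import NormalDist, stdev


def normal_bootstrap_ci(estimate, replicates, level=0.95):
    """Normal-approximation confidence interval from bootstrap replicates.

    The bootstrap standard error is the sample standard deviation of the
    replicate estimates; the interval is estimate +/- z * standard error.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for a 95% interval
    se = stdev(replicates)
    return estimate - z * se, estimate + z * se


# Made-up replicate coefficients for illustration (a real run would use 1000).
lower, upper = normal_bootstrap_ci(2.0, [1.0, 2.0, 3.0])
```

In a parametric bootstrap, each replicate would come from refitting the model to data simulated from the fitted model; that refitting step is outside the scope of this sketch.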
