#### *2.4. Variables*

The outcome of interest was not having received any dose of the pentavalent vaccine. The explanatory variables used were: household urban/rural location; wealth quintiles calculated from household proxies for socio-economic status using Principal Component Analysis (PCA); presence of a telephone or a radio set as a possible means of communication in the household; maternal/caregiver characteristics including relationship with the child, age, marital status, educational level, occupation, religion, and the number of children in the household; gender of the child; birth registration of the child with the civil authority; potential financial barriers from the household such as having to pay for a vaccination card or for another immunization-related service; and caregiver knowledge of vaccine-preventable diseases.

#### *2.5. Data Analysis*

We performed weighted descriptive analyses of household characteristics in the study sample. Categorical variables were reported as frequencies and percentages, and continuous variables were summarized using means and standard deviations or medians and interquartile ranges, depending on the normality of their distribution.
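As an illustration of what a weighted frequency table involves, the following minimal Python sketch sums sampling weights per category and converts them to percentages. The function name `weighted_percentages`, the toy residence data, and the weights are hypothetical and are not taken from the study, whose analyses were run in Stata.

```python
from collections import defaultdict


def weighted_percentages(categories, weights):
    """Return {category: (weighted_count, weighted_percent)}.

    Each observation contributes its sampling weight, not 1,
    to the count of its category.
    """
    totals = defaultdict(float)
    for category, weight in zip(categories, weights):
        totals[category] += weight
    grand_total = sum(totals.values())
    return {
        category: (total, 100.0 * total / grand_total)
        for category, total in totals.items()
    }


# Hypothetical toy data: residence of five households and their weights.
residence = ["urban", "rural", "rural", "urban", "rural"]
weights = [1.0, 2.0, 2.0, 1.0, 4.0]

summary = weighted_percentages(residence, weights)
for category, (count, percent) in sorted(summary.items()):
    print(f"{category}: weighted n = {count:.1f}, {percent:.1f}%")
```

In this toy sample, urban households make up 40% of the unweighted observations but only 20% of the weighted total, which is why weighted and unweighted percentages can differ noticeably.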

The number of ZD children in the general population was extrapolated from the target number of surviving infants in the DRC, estimated at 4,037,161 for 2021.
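The arithmetic behind such an extrapolation can be sketched as follows, assuming it amounts to multiplying the weighted ZD proportion (and its confidence limits) by the target population. The text does not spell out the exact procedure, so this is an assumption, and the prevalence values below are purely hypothetical placeholders rather than study results.

```python
SURVIVING_INFANTS_2021 = 4_037_161  # target population stated in the text


def extrapolate_count(prevalence, ci_low, ci_high,
                      population=SURVIVING_INFANTS_2021):
    """Scale a proportion and its confidence limits to a population count."""
    return tuple(round(p * population) for p in (prevalence, ci_low, ci_high))


# Hypothetical prevalence of 10% (95% CI 8% to 12%), for illustration only.
estimate, low, high = extrapolate_count(0.10, 0.08, 0.12)
print(f"Estimated ZD children: {estimate:,} ({low:,} to {high:,})")
```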

We conducted a bivariate analysis of the association between ZD status and each candidate factor using the Rao-Scott chi-square test, which is appropriate for multistage sampling. The test compared proportions across socio-demographic, economic, communication-related, and system-related characteristics when the minimum expected cell count was ≥5. A multivariable logistic regression model was then fitted, with variables selected automatically by forward selection using an entry probability of 0.05. We considered the model acceptable after verifying the area under the receiver operating characteristic (ROC) curve (area = 0.6640). We used the Archer–Lemeshow test to assess the goodness-of-fit of the logistic regression model, because the data were collected using a complex survey design that involved clustering. Measures of association between each variable and ZD were reported as adjusted odds ratios (AOR) with their 95% confidence intervals (95% CI). Before fitting the final model, we checked for collinearity among the variables. The final model included only the variables whose effects remained significant after adjustment.
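For readers less familiar with how an adjusted odds ratio and its confidence interval are derived from a logistic regression, the minimal sketch below exponentiates a coefficient and its Wald limits. The coefficient and standard error are hypothetical, and the critical value of 1.96 is a normal approximation; survey procedures may instead use a t critical value based on the design degrees of freedom, so this should not be read as the exact computation performed in the study.

```python
import math


def odds_ratio_with_ci(beta, se, critical_value=1.96):
    """Convert a logistic coefficient and its standard error
    into an odds ratio with a Wald confidence interval."""
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - critical_value * se)
    upper = math.exp(beta + critical_value * se)
    return odds_ratio, lower, upper


# Hypothetical adjusted coefficient and standard error, for illustration only.
aor, ci_lower, ci_upper = odds_ratio_with_ci(beta=0.5, se=0.2)

# An interval that excludes 1 corresponds to significance at the 5% level.
significant = ci_lower > 1.0 or ci_upper < 1.0
print(f"AOR = {aor:.2f} (95% CI {ci_lower:.2f} to {ci_upper:.2f}), "
      f"significant: {significant}")
```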

All analyses were conducted using Stata version 17 (StataCorp, College Station, TX, USA). To account for the complex sampling design, we used the svy commands with sampling weights that reflect the multistage design.
