### *2.6. Covariates*

Data such as age (years), DM type (1 or 2), education level (elementary, middle, or high school), marital status (married/single), employment (yes/no), per capita income (total family income divided by the number of persons living in the same house, in USD), housing conditions (adequate when the household had regular garbage collection, tap water, and a sewerage system), pre-existing chronic diseases (hypothyroidism or chronic hypertension), and parity (number of previous childbirths) were obtained from medical records and complemented by a personal interview with the researchers using structured questionnaires. Physical activity was assessed at baseline using the short form of the International Physical Activity Questionnaire and categorized as active, irregularly active, or sedentary [33].

Skin color (white/black/brown/yellow) and years living with DM were self-reported. Energy intake (kcal) at baseline was obtained from a 24-h dietary recall; the reported food portions were converted into grams, and the energy content was quantified using food composition tables [34,35]. Gestational age (weeks) was calculated from the first ultrasonography performed during prenatal care, the result of which was obtained at the time of inclusion in the study (all <28 weeks of pregnancy).

### *2.7. Statistical Analyses*

Data are presented as medians and interquartile ranges (IQR) for numeric variables, and as absolute (*n*) and relative frequencies (%) for categorical variables. The normality of the continuous variables was assessed using histograms, kurtosis, and skewness measures. Mann–Whitney U and Kruskal–Wallis tests were used to compare continuous variables, and chi-square or Fisher's exact tests were used for categorical variables. Genotype and allele frequencies of each variant were determined by direct counting, and deviations from Hardy–Weinberg equilibrium (HWE) were evaluated using chi-square tests.
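
The HWE check described above can be illustrated with a minimal Python sketch for a single biallelic variant. The function name `hwe_chi_square` and the genotype counts in the usage line are hypothetical and are not taken from the study; the sketch assumes a 1-degree-of-freedom chi-square test without continuity correction, which may differ from the exact procedure implemented in the R "genetics" package.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test for Hardy-Weinberg equilibrium at a biallelic locus.

    n_aa, n_ab, n_bb are observed genotype counts (hypothetical inputs).
    Returns (frequency of allele A, chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    # Allele frequency by direct counting: each homozygote carries two copies.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return p, chi2, p_value

# Hypothetical counts that match HWE expectations exactly.
freq_a, chi2, p_value = hwe_chi_square(25, 50, 25)
print(freq_a, chi2, p_value)
```

With small expected counts, an exact test is usually preferred over this asymptotic version.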

Pairwise linkage disequilibrium (LD) patterns were determined for each gene using the r² statistic (cutoff r² ≥ 0.8). Haplotype frequencies and allelic phase were estimated by the expectation-maximization (EM) algorithm, and the estimation uncertainty was incorporated as weights in the statistical models used for the association analyses. The lower-frequency alleles in our population (by minor allele frequency, MAF), and the homozygous or heterozygous genotypes containing them, were compared with the higher-frequency alleles or the genotypes containing them (reference). Haplotype analyses used the most common haplotype in our population as the reference.
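
To make the EM step concrete, the following Python sketch estimates two-locus haplotype frequencies from unphased genotype counts and derives r² from them. It is an illustration only: the function names, the 3 × 3 count layout, and the example counts are assumptions, and the study's analyses relied on R packages rather than this code. Only double heterozygotes are phase-ambiguous, so the E-step splits them between the two possible phase configurations in proportion to the current haplotype frequency estimates.

```python
def em_haplotypes(g, n_iter=200, tol=1e-10):
    """EM estimate of two-locus haplotype frequencies.

    g[i][j] = number of individuals carrying i minor alleles at locus 1
    and j minor alleles at locus 2 (i, j in 0..2). Haplotype keys give the
    allele at locus 1 then locus 2 (0 = major, 1 = minor).
    """
    n_dh = g[1][1]  # double heterozygotes: phase is ambiguous
    base = {  # haplotype counts from phase-unambiguous genotypes
        "00": 2 * g[0][0] + g[0][1] + g[1][0],
        "01": 2 * g[0][2] + g[0][1] + g[1][2],
        "10": 2 * g[2][0] + g[1][0] + g[2][1],
        "11": 2 * g[2][2] + g[1][2] + g[2][1],
    }
    total = 2 * sum(sum(row) for row in g)
    f = {h: 0.25 for h in base}
    for _ in range(n_iter):
        # E-step: posterior probability that a double heterozygote is 00/11.
        cis, trans = f["00"] * f["11"], f["01"] * f["10"]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: update frequencies with the expected haplotype counts.
        new = {
            "00": (base["00"] + w * n_dh) / total,
            "11": (base["11"] + w * n_dh) / total,
            "01": (base["01"] + (1 - w) * n_dh) / total,
            "10": (base["10"] + (1 - w) * n_dh) / total,
        }
        delta = max(abs(new[h] - f[h]) for h in f)
        f = new
        if delta < tol:
            break
    return f

def r_squared(f):
    """LD r^2 computed from the four haplotype frequencies."""
    p1 = f["10"] + f["11"]  # minor allele frequency at locus 1
    p2 = f["01"] + f["11"]  # minor allele frequency at locus 2
    d = f["11"] - p1 * p2
    return d * d / (p1 * (1 - p1) * p2 * (1 - p2))

# Hypothetical genotype counts in which the two variants are fully correlated.
counts = [[50, 0, 0], [0, 40, 0], [0, 0, 10]]
freqs = em_haplotypes(counts)
print(freqs, r_squared(freqs))
```

The posterior weight `w` is the kind of quantity that can be carried into downstream models as a weight for uncertain phase assignments.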

The incidence of excessive GWG was analyzed from the number of events and the person-years at risk, with follow-up time counted from the most likely date of conception to the most probable date of the outcome. Incidences and 95% confidence intervals (95% CIs) were estimated using asymptotic standard errors calculated from a gamma distribution. Pregnant women who did not present the outcome were followed from the most likely date of conception and censored on the day of delivery.
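
The person-time calculation can be sketched in Python as follows. The record layout, function names, and dates are hypothetical. The interval shown is a simple log-scale normal approximation, used here as a stand-in; it is not the gamma-based interval described above and will generally give slightly different limits.

```python
import math
from datetime import date
from statistics import NormalDist

def person_years(conception, end):
    """Follow-up time in years from conception to event or censoring."""
    return (end - conception).days / 365.25

def incidence_rate(records, level=0.95):
    """records: iterable of (conception_date, end_date, had_event).

    end_date is the date of the outcome for events, or the day of
    delivery for censored women. Returns (rate, lower, upper) per
    person-year, using a log-scale normal approximation (an assumption).
    """
    events = sum(1 for _, _, had_event in records if had_event)
    if events == 0:
        raise ValueError("at least one event is needed for this interval")
    time_at_risk = sum(person_years(c, e) for c, e, _ in records)
    rate = events / time_at_risk
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se_log = 1 / math.sqrt(events)  # approximate SE of log(rate)
    return rate, rate * math.exp(-z * se_log), rate * math.exp(z * se_log)

# Hypothetical cohort: one event and one woman censored at delivery.
cohort = [
    (date(2020, 1, 1), date(2020, 7, 1), True),
    (date(2020, 1, 1), date(2020, 10, 1), False),
]
print(incidence_rate(cohort))
```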

The results of the time-to-event analyses are presented as hazard ratios (HRs) with 95% CIs, and the risks of progression to the events described above were estimated using Cox proportional hazards models. The proportional hazards assumption was tested using correlation analyses and χ² tests based on scaled Schoenfeld residuals and transformed survival times. The effects of the genetic characteristics of interest were adjusted for phenotypic characteristics with at least a suggestive association (*p*-value ≤ 0.1) with the outcome of interest, and the marginal effects are presented as adjusted hazard ratios (aHRs). Pregestational BMI was not included in the adjusted model because of its potential mediating effect between genotype and outcome. Each polymorphism was evaluated using the additive, dominant, and recessive models.
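
The three genetic models differ only in how a genotype is coded before it enters the Cox model. A minimal Python sketch of the conventional coding follows; the function name and the use of minor-allele counts as input are illustrative assumptions rather than details reported in the text.

```python
def encode_genotype(minor_allele_count, model):
    """Code a genotype (0, 1, or 2 minor alleles) for a regression model.

    additive:  0, 1, 2 (effect per copy of the minor allele)
    dominant:  1 if at least one copy of the minor allele, else 0
    recessive: 1 only if two copies of the minor allele, else 0
    """
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1, or 2")
    if model == "additive":
        return minor_allele_count
    if model == "dominant":
        return int(minor_allele_count >= 1)
    if model == "recessive":
        return int(minor_allele_count == 2)
    raise ValueError(f"unknown model: {model}")

# Coding of a heterozygote under each model.
for model in ("additive", "dominant", "recessive"):
    print(model, encode_genotype(1, model))
```

Under this coding, heterozygotes are grouped with minor-allele homozygotes in the dominant model and with the reference homozygotes in the recessive model.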

Statistical analyses were performed using R software (Version 4.1.1) and its "genetics" and "survival" packages. Power analysis and sample size estimates were performed using the R code available on the Power and Sample Size platform (http://powerandsamplesize.com/Calculators/Test-Time-To-Event-Data/Cox-PH-2-Sided-Equality, accessed on 18 January 2022). Considering an overall prevalence of the event of 50%, a frequency of minor allele carriers of 35%, a mean hazard ratio of 2, and alpha = 0.05, the minimum sample size for Cox proportional hazards models estimated for a power (1 − β) of 0.8 was 144. Nonetheless, our sample size was limited (*n* = 70), which provided 56.35% statistical power for this analysis.
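
The minimum sample size can be approximated with a short Python sketch, assuming the standard Schoenfeld-type formula for a two-sided equality test of a Cox hazard ratio with a binary exposure, n = (z₁₋α/₂ + z₁₋β)² / (p_A · p_B · p_E · (ln HR)²). The function and parameter names are hypothetical, and this is not the platform's own R code.

```python
import math
from statistics import NormalDist

def cox_sample_size(hr, p_event, p_exposed, alpha=0.05, power=0.8):
    """Total sample size for a two-sided Cox PH test of HR = 1.

    hr        : hazard ratio to detect
    p_event   : overall probability of the event
    p_exposed : proportion in the exposed group (here, minor allele carriers)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_unexposed = 1 - p_exposed
    n = (z_alpha + z_beta) ** 2 / (
        p_exposed * p_unexposed * p_event * math.log(hr) ** 2
    )
    return math.ceil(n)

# Inputs stated in the text: HR = 2, event prevalence 50%, carriers 35%.
print(cox_sample_size(hr=2, p_event=0.5, p_exposed=0.35))  # 144
```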
