*2.4. Statistical Analyses*

Statistical analyses were carried out with Stata 16 for Windows (StataCorp LLC, College Station, TX, USA) [55]. Comparisons between the four study groups (OB-OB, OB-NW, NW-NW, and NW-OB) were based on chi-square (χ2) tests for categorical variables and analysis of variance (ANOVA) for quantitative variables.
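Although the analyses were run in Stata, the same group comparisons can be sketched in Python with SciPy; the contingency table and group measurements below are purely illustrative placeholders, not the study's data.

```python
# Illustrative sketch of the two omnibus tests described above:
# chi-square for a categorical variable and one-way ANOVA for a
# quantitative variable across four groups (hypothetical data).
import numpy as np
from scipy import stats

# Hypothetical 4x2 contingency table: counts of a binary trait in the
# OB-OB, OB-NW, NW-NW, and NW-OB groups.
table = np.array([[30, 20],
                  [25, 25],
                  [15, 35],
                  [28, 22]])
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)

# Hypothetical quantitative measurements for the same four groups.
rng = np.random.default_rng(0)
groups = [rng.normal(loc=m, scale=1.0, size=40) for m in (0.0, 0.2, 0.5, 0.1)]
f_stat, p_anova = stats.f_oneway(*groups)
```

A 4 × 2 table yields (4 − 1) × (2 − 1) = 3 degrees of freedom for the chi-square test.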

The effect size for differences between means was estimated with the standardized Cohen's *d* coefficient, interpreted as null for *|d|* < 0.20, low-poor for *|d|* ≥ 0.20, moderate-medium for *|d|* ≥ 0.50, and large-high for *|d|* ≥ 0.80 [56]. The effect size for differences between proportions was estimated with the standardized Cohen's *h* coefficient, calculated through the arcsine transformation of the rates registered in each group and interpreted with the same cut-offs as Cohen's *d* (null for *|h|* < 0.20, low-poor for *|h|* ≥ 0.20, moderate-medium for *|h|* ≥ 0.50, and large-high for *|h|* ≥ 0.80) [57].
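The two effect-size measures can be sketched as follows; the helper names (`cohens_d`, `cohens_h`, `label`) are our own, and the pooled-standard-deviation form of *d* is a common convention the cited sources may or may not match exactly.

```python
import math

def cohens_d(mean1, mean2, sd1, sd2, n1, n2):
    # Standardized mean difference using the pooled standard deviation
    # (illustrative helper; a common formulation of Cohen's d).
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2)
                          / (n1 + n2 - 2))
    return (mean1 - mean2) / pooled_sd

def cohens_h(p1, p2):
    # Difference between arcsine-transformed proportions, as described
    # in the text: h = 2*arcsin(sqrt(p1)) - 2*arcsin(sqrt(p2)).
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))

def label(effect):
    # Interpretation cut-offs from the text (same for |d| and |h|).
    e = abs(effect)
    if e < 0.20:
        return "null"
    if e < 0.50:
        return "low-poor"
    if e < 0.80:
        return "moderate-medium"
    return "large-high"
```

For example, proportions of 0.50 versus 0.10 give *h* ≈ 0.93, a large-high effect under these cut-offs.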

The increase in Type-I error due to multiple significance tests was controlled with the Finner method [58], a family-wise error rate (FWER) stepwise procedure that has proved more powerful than the classical Bonferroni correction. When controlling the *k*-FWER, a fixed number *k* − 1 of erroneous rejections is tolerated, and under the assumption that all the null hypotheses are equal, controlling the FWER at level α is equivalent to combining the original unadjusted *p*-values into a single level-α test of the null hypothesis (H0). In other words, any procedure R that controls the FWER at level α yields a single level-α test by rejecting H0 whenever R(*p*) is not empty (that is, whenever R rejects at least one hypothesis). In practice, the Finner method adjusts the rejection criterion for each individual hypothesis so that the FWER does not exceed a prespecified significance level. The procedure sorts the unadjusted *p*-values (p1, ..., pk), obtained from *k* independent null-hypothesis tests, from lowest to highest and then applies, to the *p*-value in position *i* of the ordered list, the formula p(adjusted) = 1 − (1 − p(unadjusted))^(k/i).
