*2.8. Bioinformatics and Statistical Analyses*

The sequencing reads were processed as described earlier [43]. Briefly, the reads were denoised, merged and checked for the presence of chimaeras in dada2 [44]; they were then classified using SILVA v.132 reference [45]. Following this, OTUs were constructed, and a shared OTU table was calculated with Mothur [46]. Ecological analyses were performed in R using functions implemented in the vegan [47] and GuniFrac packages [48,49]. An unweighted UniFrac distance matrix was generated using the GuniFrac function from the rarefied OTU table and tree using the Relaxed Neighbor Joining algorithm implemented in clearcut [50]. Unconstrained ordination was performed using non-metric multidimentional scaling (NMDS), implemented in the metaMDS function of vegan. The grouping significance was tested with PERMANOVA (adonis function) using 999 permutations; dbRDA was performed on the unweighted UniFrac distance matrix using dbrda. The significance of the dbRDA model was tested with anova.rda using 999 permutations. *p*-value < 0.05 was considered significant.

Differences in taxa abundance were assessed with ANOVA (aov function in R); the normality of data was checked with the Shapiro–Wilk test (shapiro.test) and homogeneity of variance by Levene's test (levene.test of the lawstat packege). Where the assumptions were not fulfilled, the Kruskal–Wallis test was used (kruskal.test). *p*-values were corrected for multiple comparisons using Benjamini-Hochberg FDR (*p*.adjust); a significance level of 0.05 was used. Clinical categorical data were analyzed using two-tailed Fisher's exact test in R (fisher.test); again, a significance level of 0.05 was used.
