*4.2. Amplification, Library Preparation, and Sequencing*

Bacterial identification was performed by sequencing the 16S rRNA gene's hypervariable regions. The 16S rRNA gene was amplified by multiplex PCR using Ion Torrent 16S Metagenomics kit (Ion Torrent, Thermo Fisher Scientific Inc., Alcobendas, Spain), with two sets of primers, which targets regions V2, V4, and V8, and V3, V6–7, and V9, respectively. Amplification was carried out in a SimpliAmp thermal cycler (Thermo Fisher Scientific Inc., Alcobendas, Spain) running the following program: denaturation at 95 ◦C for 10 min, followed by a 3-step cyclic stage consisting of 25 cycles of denaturation at 95 ◦C for 30 s, annealing at 58 ◦C for 30 s, and extension at 72 ◦C for 20 s; at the end of this stage, the program concludes with an additional extension period at 72 ◦C for 7 min and the reaction is stopped by cooling at 4 ◦C. The resulting amplicons were tested by electrophoresis through 2% agarose gels in tris-acetate-EDTA (TAE) buffer, purified with AMPure® XP Beads (Beckman Coulter, Inc, Atlanta, GA, USA), and quantified using QubitTM dsDNA HS Assay Kit in a Qubit 3 fluorometer (Thermo Fisher Scientific Inc., Alcobendas, Spain).

A library was generated from each sample using the Ion Plus Fragment Library Kit (Ion Torrent), whereby each library is indexed by ligating Ion Xpress™ Barcode Adapters (Ion Torrent) to the amplicons. Libraries were purified with AMPure® XP Beads and quantified using the Ion Universal Library Quantitation Kit (Ion Torrent, Thermo Fisher Scientific Inc., Alcobendas, Spain) in a QuantStudio 5 Real-Time PCR Instrument (Thermo Fisher Scientific Inc., Alcobendas, Spain). The libraries were then pooled and clonally amplified onto Ion Sphere Particles (ISPs) by emulsion PCR in an Ion OneTouch™ 2 System (Ion Torrent) according to the manufacturer´s instructions. Sequencing of the amplicon libraries was carried on an Ion 530™ Kit (Ion Torrent) on an Ion S5™ System (Ion Torrent). After sequencing, the individual sequence reads were filtered by the Torrent Suite™ Software v5.12.1 to remove low quality and polyclonal sequences.

### *4.3. Bioinformatics and Statistical Analysis*

The obtained sequences were analyzed and annotated with the Ion Reporter 5.18.2.0 software (Thermo Fisher Scientific Inc., Alcobendas, Spain) using the 16S rRNA Profiling workflow 5.18. Clustering into OTUs and taxonomic assignment were performed based on the Basic Local Alignment Search Tool (BLAST) using two reference libraries, MicroSEQ® 16S Reference Library v2013.1 and the Greengenes v13.5 database. For an OTU to be accepted as valid, at least ten reads with an alignment coverage of ≥90% between hit and query were required. Identifications were accepted at the genus and species level with sequence identity of >97% and >99%, respectively. Annotated OTUs were then exported for analysis with R (v.4.1.2) (https://www.R-project.org/), where data were converted to phyloseq object [48] and abundance bar plots were generated. Data were converted to DESeq2 object [49] that uses a generalized linear model based on a negative binomial distribution to calculate differential abundance between groups. Thus, the differential abundance analysis was conducted according to the phyloseq package vignette with bioconductor DESeq2 (https://bioconductor.org/packages/devel/bioc/vignettes/phyloseq/inst/doc/ phyloseq-mixture-models.html#import-data-with-phyloseq-convert-to-deseq2, accessed on 10 January 2022). The raw abundance matrix was imported into phyloseq object (as specified in the documentation of phyloseq with DESeq2) and subsequently converted to DESeq2 object. Then, estimated size factors were used with the DESeq2 function to obtain

the differential abundance. DESeq automatically searches for outliers and, if possible, replaces the outlier values estimating mean-dispersion relationship. If it is not possible to replace, *p*-values are replaced by NA. R (v.4.1.2) was also used to perform a non-metric multidimensional scaling (NMDS) analysis on Bray–Curtis dissimilarity measures among samples based on relative OTU abundances (i.e., percentages). The relative abundances of OTUs were also used to test for statistically significant differences among age and sex groups. Group OTU compositions were compared through the non-parametric statistical tool ANOSIM. The 90% confidence data ellipses for each of the age groups were plotted. Alpha diversity was estimated based on Chao1, Shannon, and Inverse-Simpson indices by using the phyloseq package. To test for statistically significant differences between pairwise groups in alpha diversity, the non-parametric Wilcoxon test was used. Frequency of appearance was obtained by calculating the percentage of individuals in each age group in which that taxon occurs. The bar plots aggregated by groups (age and/or gender) show the aggregated relative abundance (sum of relative abundances). Krona charts that aid in the estimation of relative abundances even within complex metagenomic classifications were generated as previously described [50]. All the other graphs were generated with the R package ggplot2 version 3.3.3., including the confidence data ellipses which were plotted using the 'stat\_ellipse' function, also from this package [51].
