**1. Introduction**

Microbiome research is evolving quickly. Causal links between distinct microbial consortia, their collective functions, and host pathophysiology during the various stages of life are becoming increasingly clear. Studies of microbiome plasticity, composition, and function that distinguish between host phenotypes may lay the foundation for both therapeutic and preventive interventions [1]. Indeed, new practical aspects of microbiome studies will focus on the personalization of interventions as well as on understanding the inherent individual variability of microbiomes at different ages, stages of development, conditions, and internal or external influences. These studies will help identify physiological features that explain, or predict, human health and disease states. Therefore, clinical studies need to be well designed and the subject/patient phenotype properly selected. Age and many other factors can strongly influence the results; thus, clinical studies on microbiota in children should take into account the differences that naturally occur during growth. Other technical challenges concern properly establishing, harmonizing, and standardizing clinical protocols for sample collection, processing, sequencing, and analysis that also take into account the "microbiome's age". The roles of diet, environment, host immune system, and genetics as key determinants of microbiome and microbiota profiles have not been fully resolved yet. All of these influences can affect the microbiota composition at any age and may sometimes be difficult to harmonize and standardize during clinical investigation.

Clinical and microbiological translation urgently needs to incorporate the key knowledge on microbiota. This review aims to give a rapid overview of child microbiota in order to guide pediatricians to a better understanding of the field, while trying to limit biases and intrinsic pitfalls before designing a study and starting any clinical trial. Although most of the reported literature and data specifically refer to the best-studied community, namely the one inhabiting the gut, the knowledge discussed in the text, together with more practical aspects and recommendations, can also be adapted to the study of other medically relevant communities (e.g., in nasal-oral cavities).

#### **2. Basic Knowledge on Gut Microbes**

The human body harbors trillions of microbial cells, mainly represented by bacteria but also including archaea, viruses, fungi, and parasites. These communities establish extensive networks of cross-feeding (trophic) interactions, consuming, producing, and exchanging hundreds of metabolites with each other and with their human host, with whom they constitute a unique ecological entity called the "holobiont" [2,3]. Their highest density is reached in the intestinal compartment, particularly in the lower segments. Here, bacteria are estimated to reach a number of 10<sup>14</sup> cells, and their density in stool has been calculated to be in the order of 10<sup>11</sup> per gram of dry material [4]. Although less well studied, many other body habitats within healthy individuals are occupied by microbial communities, such as the mouth and oral tract, nostrils, skin, and vagina. The term 'microbiota' literally means all living organisms within a body-site habitat. More specifically, the term "gut microbiota" indicates the resident intestinal bacterial communities, and from a practical point of view, it is generally investigated, with obvious biases, through the analysis of fecal samples, which are easy and non-invasive to collect. The term 'microbiome' is used instead to refer to the genetic content of these microorganisms. Conventionally, research in the field is mainly focused on the bacterial microbiome, but further fascinating results have come from the study of the "virome", or the viruses inhabiting the gut, of the "mycome", which reveals another intriguing world of gut fungi, and of the "parasitome".

New genetic and sequencing technologies have opened the way to the 'metagenomic' approach, which directly analyzes the total microbial genomes contained in a sample and, in turn, provides information on the genomic links between function and phylogenetic evolution. Other approaches used in the field include 'metatranscriptomics', the study of the whole RNA repertoire from a microbial community; 'metaproteomics', the study of the entire protein content from the community; and 'meta-metabolomics', the study of small-molecule metabolites produced through the interaction of diet and microbiome [5–7].

The analysis of the gene coding for the 16S ribosomal RNA (rRNA) is very useful for studying gut bacteria. 16S rRNA is a component of the prokaryotic ribosome and is coded by a gene spanning about 1500 bp. The 16S rRNA gene is highly conserved between different species of bacteria, but presents nine variable ("V") regions that allow identification at the genus or species level. After amplification of, typically, 2–3 V regions, the obtained sequences are clustered into nearly identical tags called 'phylotypes' or 'operational taxonomic units' (OTUs). These terms refer to a group of microbes that is generally defined by a threshold of sequence similarity between their 16S rRNA genes (e.g., ≥98% for a 'species'-level phylotype) [8].
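The clustering step can be illustrated with a toy sketch. It is not the procedure of any specific pipeline: it assumes reads that are already aligned and of equal length, defines identity as the fraction of matching positions, and uses a simple greedy rule in which the first read of each cluster acts as its seed. The function names `identity` and `greedy_otus` are hypothetical, and the 0.98 default merely mirrors the example threshold quoted above.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length, pre-aligned reads."""
    if len(a) != len(b):
        raise ValueError("reads must be pre-aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def greedy_otus(reads: list[str], threshold: float = 0.98) -> list[list[str]]:
    """Assign each read to the first OTU whose seed it matches at >= threshold."""
    otus: list[list[str]] = []  # each OTU is a list; its first element is the seed
    for read in reads:
        for otu in otus:
            if identity(read, otu[0]) >= threshold:
                otu.append(read)
                break
        else:
            # No existing seed is similar enough: this read seeds a new OTU.
            otus.append([read])
    return otus
```

With 100-bp reads, a read differing from a seed at one position (99% identity) joins that seed's OTU, whereas a read differing at five positions (95%) starts a new one. Greedy clustering depends on input order, which is one reason why real analyses rely on dedicated, validated tools.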

Eukaryotic components of the microbiota (e.g., fungi and protozoans) can be analyzed through homologous ribosomal gene sequences (small-subunit rRNA, SSU rRNA), while viral communities, which lack ribosomal genes, are investigated through shotgun DNA sequencing or via primers targeting conserved sequences in viral families. The above approaches are referred to as culture-independent, while culturomics is a culturing approach that uses multiple culture conditions, combined with MALDI-TOF mass spectrometry and/or 16S rRNA sequencing, for the isolation and identification of the largest possible number of bacterial species [9].

The gut hosts taxonomically diverse archaea, bacteria, fungi, and viruses. Studies report at least 22 bacterial phyla in the body, mainly represented (>90%) by *Actinobacteria*, *Firmicutes*, *Proteobacteria*, and *Bacteroidetes*. In the gut, *Bacteroidetes* and *Firmicutes* represent the predominant phyla [10–12]. In addition to taxonomic composition, taxonomic diversity also needs to be considered in evaluating the homeostasis of microbiota. In particular, two parameters are routinely employed for this purpose: alpha diversity (within-sample diversity, how many taxa or lineages are present in a sample), and beta diversity (between-sample diversity, to what extent the guts of different subjects or patients share taxa or lineages). Parameters that need to be evaluated when computing these ecological indices are richness (i.e., how many bacterial taxa are present) and evenness, which also takes into account the relative abundance of taxa, in addition to presence/absence, and compares it between subjects or patients [13].

In this context, measures of species richness (for example, the number of observed species or the Chao1 index, which is an abundance-based estimator of species richness) and phylogenetic measures (Faith's phylogenetic diversity) are sensitive to the number of sequences per sample, whereas this is true to a much lesser extent for metrics that combine richness and evenness (Shannon index).
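As a minimal sketch of the indices named above, the following code computes observed richness, Chao1, the Shannon index, and Pielou's evenness from a vector of per-taxon read counts, and subsamples a profile to a fixed depth so that the effect of sequencing depth can be inspected. The function names are hypothetical, the natural logarithm is an arbitrary choice for the Shannon index, and Pielou's evenness is used here as one common way of expressing evenness; none of these choices is prescribed by the text.

```python
import math
import random


def observed_richness(counts):
    """Number of taxa with at least one read."""
    return sum(1 for c in counts if c > 0)


def chao1(counts):
    """Chao1 richness estimate from singleton (F1) and doubleton (F2) counts."""
    s_obs = observed_richness(counts)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    return s_obs + f1 * (f1 - 1) / 2  # bias-corrected form when F2 == 0


def shannon(counts):
    """Shannon index (natural log); expects at least one non-zero count."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def pielou_evenness(counts):
    """Shannon index divided by its maximum, log(richness)."""
    s = observed_richness(counts)
    return shannon(counts) / math.log(s) if s > 1 else 0.0


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement (depth <= total reads)."""
    pool = [taxon for taxon, c in enumerate(counts) for _ in range(c)]
    picked = random.Random(seed).sample(pool, depth)
    out = [0] * len(counts)
    for taxon in picked:
        out[taxon] += 1
    return out
```

Comparing `observed_richness` and `shannon` on a profile before and after `rarefy` at decreasing depths is a simple way to see how each metric responds when rare taxa drop out of shallow samples.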

Statistical and computational analyses remain the main challenge in microbiome research. Some methods currently used for power and effect size analysis are based on PERMANOVA, Dirichlet multinomial, or random forest analysis [14]. Parametric statistical tests (for example, the Student's t-test and ANOVA) as well as measures of correlation, including Spearman's rank correlation, can be used depending on the phenotypes under study and the type of information the researcher wants to capture.
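To make the PERMANOVA idea concrete, the sketch below computes a pseudo-F statistic from a matrix of between-sample dissimilarities and obtains a p-value by permuting the group labels. Bray-Curtis dissimilarity is used as an assumption, since the text does not name a distance; the function names are hypothetical, and a real analysis would normally rely on an established statistical package rather than this illustration.

```python
import random


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two count profiles (0 = identical)."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(u) + sum(v)
    return num / den if den else 0.0


def pseudo_f(dist, labels):
    """PERMANOVA pseudo-F from a square distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares, accumulated group by group.
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        pairs = sum(dist[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:])
        ss_within += pairs / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permanova(samples, labels, permutations=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    dist = [[bray_curtis(u, v) for v in samples] for u in samples]
    observed = pseudo_f(dist, labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # break any link between profile and group
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)
```

Calling `permanova(samples, labels)` with one count profile per subject and a matching list of group labels (for example, "case" and "control") returns the statistic and its p-value. With very small groups, the number of distinct label permutations limits how small the p-value can become.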

#### **3. The Intestinal Microbiota from Birth Throughout Childhood**

Addressing neonatal and early-life microbiota is pivotal, as many of the events capable of shaping microbial communities even in adults take place during this phase of life: gestational age at birth, type of delivery, breast vs. formula feeding, weaning, use of antibiotics, etc. [15,16]. When the neonatal microbiota begins to form is still a subject of great debate. The "sterile womb paradigm", in other words, the notion that, under physiological conditions, the human fetal environment is sterile and microbial colonization begins with birth, has been accepted for decades. Recently, with the burst of metagenomic studies, a group of papers has found traces of low-abundance bacterial colonization in the placenta, endometrium, amniotic fluid, and meconium in healthy, full-term pregnancies (see Nature Editorial by C. Willyard, 2018, [17] and references therein). This has led some researchers to date the seeding of the microbiota back to before birth ("in utero colonization hypothesis"). The field is still the subject of much debate, and the results appear in general to be controversial. Recently, several scientists have underlined that, even if it is possible that not all healthy babies are born sterile as previously thought, particular caution is necessary when working on samples bearing a low microbial biomass, because of the heavy contamination issues that are notoriously inherent to such samples when using molecular approaches based on next-generation sequencing (NGS) [17]. Other important points that have been raised are the difficulty of maintaining strict sterility when collecting samples related to the in utero environment within a clinical setting, and the impossibility of using NGS-based techniques to discriminate DNA from viable cells from DNA belonging to dead organisms or derived from translocation from the blood stream [15,17].

The human intestine at birth is an aerobic environment. As such, while the adult gut microbiota is dominated by obligate anaerobes belonging to the *Firmicutes* and *Bacteroidetes* phyla, the neonatal pioneer flora is composed of aerotolerant taxa, mainly belonging to the *Enterobacteriaceae* family (phylum: *Proteobacteria*). In a matter of days, however, these microorganisms reduce oxygen levels, and the intestinal lumen becomes anaerobic. This allows colonization by strict anaerobes, dominated by *Bifidobacterium* (phylum: *Actinobacteria*), *Clostridium* (phylum: *Firmicutes*), and *Bacteroides* (phylum: *Bacteroidetes*) [18,19]. During the first months, the diet of the infant is almost exclusively milk, favoring milk oligosaccharide fermenters such as the already cited *Bifidobacterium*, represented, at this stage, by many species. Other predominant bacterial taxa are *Enterococcaceae*, *Streptococcaceae*, and *Lactobacillaceae* [15].

A very recent paper [20] addressed the development of gut microbiota in a large cohort of children, comprising cases who seroconverted to islet cell autoantibody positivity, children who developed type 1 diabetes (T1D), and matched healthy controls. This analysis followed the longitudinal maturation of the microbiome from 3 to 46 months of age and determined the covariates that significantly affected its development. Overall, this study harmonized data by collecting 12,500 stool samples from 903 children in three different European countries and three US states. Breastfeeding and birth mode emerged as the main factors driving the gut microbiome during the developmental phase by changing some relevant bacterial clusters. The authors proposed three distinct phases of microbiome progression: a developmental phase (months 3–14), a transitional phase (months 15–30), and a stable phase (≥31 months). The Shannon diversity index changed significantly during the first two phases and remained unchanged only during the stable phase. This study represents a good model of how to harmonize the age of the children with other covariate factors. Figure 1 presents a proposal for pediatricians to use a personalized staging of the enrolled individuals to differentiate relevant microbial clusters and dominating phyla.
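The three phases proposed in [20] can be encoded as a small helper for annotating samples with an age stage at enrollment. This is only a sketch: the function name `microbiome_phase` is hypothetical, age is assumed to be recorded in completed months, and ages below 3 months return `None` because the cited cohort was followed from 3 months onward.

```python
from typing import Optional


def microbiome_phase(age_months: int) -> Optional[str]:
    """Map age in completed months to the phase boundaries reported in [20]."""
    if age_months < 3:
        return None  # outside the age range covered by the cited cohort
    if age_months <= 14:
        return "developmental"
    if age_months <= 30:
        return "transitional"
    return "stable"
```

For example, a child sampled at 14 months falls in the developmental phase and one sampled at 15 months in the transitional phase, so two children close in age can still belong to different strata.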


**Figure 1.** The figure represents the seven golden steps that the pediatrician should follow before the enrollment of individuals/patients in the microbiota study.

#### **4. Issues to be Considered for Studying Microbiome in Clinical Studies**

#### *Study Design and Patient Selections*

Pediatricians should select cohorts of children by trying to limit the confounding factors that can dilute the statistical estimates of the effect sizes of the microbiome. Thus, as an example, when defining disease-specific signatures, the diseased population should be recruited with particular care, choosing patients who display a relatively homogeneous clinical phenotype. The choice of controls is also challenging: a good control population includes patients with a clinical phenotype that clearly contrasts with the one under study, while matching other relevant criteria. To reduce the heterogeneity of the cohort, it is indeed mandatory to clearly define inclusion and exclusion criteria by considering the factors affecting microbiota analysis (see below) and matching cases and controls accordingly. In this regard, it is crucially important to collect information about potential confounding factors, including age group, to account for moderating influences that can artifactually alter the results and the outcomes of interest. This decreases co-variability and heterogeneity during enrollment, thereby increasing the power of the analysis. The collected information will form part of the "metadata" (covariates) surrounding the sample and will later be used in analyzing the data. To ensure consistency, recording the maximum information about the subjects, sample, and experimental procedures is recommended. Finally, before starting the study protocol, a sample size should be estimated on the basis of the expected effect size, evaluated by means of a pilot study or based on similar previous studies. Other recent approaches rely on computing the estimated sample sizes by calculating the independent effect sizes on microbiota variation of other factors (covariates) relevant to the phenomenon under study [21].
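As a rough illustration of how an expected effect size translates into a sample size, the sketch below applies the standard normal-approximation formula for comparing the mean of a single continuous summary (for instance, an alpha diversity index) between two equally sized groups. This is an assumption made for illustration and not the covariate-based method of [21]; the function name `n_per_group` and the defaults (two-sided alpha of 0.05, power of 0.80) are likewise illustrative choices.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Subjects per group to detect a standardized mean difference (Cohen's d)."""
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)
```

Under these assumptions, a moderate standardized difference of 0.5 requires 63 subjects per group and a larger difference of 0.8 requires 25. Because the required number grows with the inverse square of the effect size, heterogeneous cohorts that dilute the effect quickly inflate the enrollment target.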

Table 1 summarizes the key aspects to consider when designing and conducting a microbiome study, lists the possible confounders and pitfalls, and presents practical solutions for risk mitigation.


**Table 1.** Practical aspects to follow when designing and conducting a microbiome study.



#### **5. Major Pre-, Peri-, and Post-Natal Factors Affecting the Child Gut Microbiota**

A schematic representation of the factors that are able to affect the dynamics and composition of the intestinal microbiota is given in Figure 2.

**Figure 2.** Infant microbiota composition (**a**) and the main "major" and "minor" factors affecting analysis and results in microbiota studies (**b**).

#### *5.1. Maternal Factors Influencing Infant Microbiota*

#### 5.1.1. Changes Related to Vertical Transmission of Maternal Metabolites

During gestation, bacteria in the mother's intestine have been shown to drive the future immune maturation of the neonatal gut through the passage of soluble molecules across the placenta, in the absence of direct colonization and of the vertical transmission of viable bacterial cells [22,23]. These bacteria are able to induce specific changes in the gut of newborns, creating new microbiota profiles.

#### 5.1.2. Changes Related to Dietary Patterns and Lifestyle

The intestinal microbiota is strongly personalized and influenced by a plethora of environmental and inter-individual variables including body mass index (BMI), exercise frequency, and dietary patterns and habits (which, in turn, are strongly related to cultural factors and lifestyle). It has been reported that the infant's fecal microbiota composition is influenced by the BMI and weight gain of the mother during pregnancy [24,25]. In general, the maternal microbial reservoir plays a crucial role in the acquisition and development of early infant microbiota, which in turn is the key to establishing a healthy host–microbiome symbiosis with long-lasting health effects. This is why maternal diet and lifestyle should be monitored and categorized as relevant metadata in infant microbiota studies. In an early phase, after the huge microbial "inoculum" at birth, the infant continues to directly acquire maternal gut strains from different sources (e.g., from skin, mouth, milk), and these are likely to become stable colonizers of the infant gut. Later in life, increasingly important roles are also played by other factors such as shared diet and lifestyle.

#### *5.2. Genetic Factors*

There is growing evidence that geographical origin and host genetic makeup influence the acquisition and development of the gut microbiota, with clear associations reported between the host genotype and the relative abundances of different bacterial taxa. For example, Bonder et al. [26] described a single nucleotide polymorphism (SNP) in the LCT locus (coding for human lactase) that is related to varying abundances of *Bifidobacterium*. Goodrich et al. [27], by comparing microbiota across samples belonging to monozygotic and dizygotic twin pairs, reported a number of microbial taxa whose abundances were strongly influenced by host genetics. Among such taxa, the *Christensenellaceae*, which are considered a microbiome-based marker of obesity and are significantly enriched in individuals with low BMI, proved to be the most highly heritable taxon. Any data related to the genetic background of the child should therefore be recorded.

#### *5.3. Mode of Delivery*

At birth, the infant gut communities tend to resemble the maternal vaginal or skin microbiota in cases of vaginal or cesarean section (C-section) delivery, respectively [19,28]. Even later, when these "pioneer" foundation populations have been replaced, the birth mode seems to exert significant long-term effects on the structure of the gut microbiota. At 24 months of age, the gut microbial communities of cesarean-delivered infants still appear to be less diverse [15]. Even in children as old as seven years, some authors have reported the enduring influence of the mode of delivery, but data are somewhat conflicting on this point [19]. Vaginally delivered infants tend to be colonized by *Lactobacillus* and *Prevotella*, while C-section neonates are preferentially colonized by microorganisms from maternal skin and from the hospital staff or environment.

#### *5.4. Mode of Infant Feeding*

Breastfed infants receive, from their mothers' milk, a complex mix that will affect the milieu within which their own microbiota will develop. This mix is made up of nutrients, antimicrobial proteins, short chain fatty acids (SCFA), secretory IgA, non-digestible oligosaccharides (HMOs, human milk oligosaccharides, which promote the proliferation of specific gut bacterial taxa in the neonate), and live bacteria, even though milk was previously considered germ-free [15]. The source of the "milk microbiota", which has a transient nature and declines rapidly at weaning, has recently been another subject of debate. At least some of the bacteria are thought to reach the mammary gland through an endogenous route called the enteromammary pathway, which has not been fully elucidated yet. It has also been suggested that mammary skin microbiota can travel via the lymphatic and vascular circulations to the breast ([15,16] and references therein). Gut microbiota differences between breastfed and formula-fed infants are indeed well documented. The former exhibit lower diversity indexes, indicative of a more uniform population where *Bifidobacterium* and *Lactobacillus* dominate. The latter are characterized by more diverse communities, with higher proportions of *Bacteroides*, *Clostridium*, *Streptococcus*, *Veillonella*, *Atopobium*, and *Enterobacteriaceae* [29]. Finally, compositional differences in microbial communities in human milk sampled from different geographical locations have been studied and reported to create strong variability between newborn microbiota [30].

#### *5.5. Gestational Age*

While in full-term infants, delivery and feeding mode are reported to represent the major drivers of microbiota development, in preterm (PT) infants (<37 weeks of gestation), the gestational age seems to have the biggest impact on the assembly of gut communities [19,31,32]. PT neonates experience a number of unique challenges in the establishment of their microbiota. Their colonization patterns are characterized by the involvement of peculiar microbial sources, mainly bacteria deriving from the neonatal intensive care unit (NICU) environment [33]. Not infrequently, these are strains implicated in nosocomial infections such as *Enterococcus* spp., *Staphylococcus aureus*, *Klebsiella pneumoniae*, *Acinetobacter* spp., *Pseudomonas aeruginosa*, and other *Enterobacteriaceae* [34], with their burden of antibiotic resistance genes. Other relevant features of this peculiar colonization trajectory are its extreme inter-individual variability and the fact that, across studies, it does not appear to be unequivocally linked to health outcomes such as necrotizing enterocolitis and late-onset sepsis. Instead, the colonization process seems to reflect the co-occurrence of a variety of nosocomial "variables" [35], among which are parenteral nutrition and antibiotic usage (see below). Antibiotics, normally administered to these patients, in turn perturb the colonization process by killing bacteria acquired during birth and promoting the growth of taxa significantly different from those found in more physiological situations [31]. In conclusion, the PT microbiota appears to be more unstable than that of full-term equivalents and is believed to be associated with a delay in the establishment of an adult-type signature microbiota [16]. All these individuals should be carefully selected and clearly categorized by the clinician before enrollment into the microbiota study.
