Stool

Samples were collected at home by patients within 24 h before the endoscopy and kept at −20 °C until delivery to the hospital, where they were frozen at −80 °C.

#### *2.3. Extraction and Quantification of DNA*

DNA was extracted from each biological sample using commercial kits (all from Qiagen; Hilden, Germany), following the manufacturer's instructions according to the suggested procedure for bacteria (DNeasy® Blood & Tissue Handbook). Specifically, the QIAamp DNA Stool Mini Kit was used for feces, the QIAamp DNA Blood Mini Kit was used for saliva, and the DNeasy Blood and Tissue Kit was used for duodenal biopsies. In more detail, stool samples were first solubilized in a buffer provided in the kit to remove the polymerase chain reaction inhibitors contained in feces; DNA extraction was then performed with the same procedure used for the other sample types. The DNA concentration of each sample was assessed fluorometrically.

#### *2.4. Production of 16S rRNA Amplicons (V3–V4 Regions) and Sequencing*

For amplicon production, the V3–V4 hypervariable regions of the prokaryotic 16S rRNA gene were targeted [11]. Polymerase chain reaction was performed in a 50-μL volume containing template DNA, 1× HiFi HotStart Ready Mix (Kapa Biosystems; Wilmington, MA, USA), and 0.5 μM of each primer. The cycling program, performed on an MJ Mini thermal cycler (Promega Corp.; Madison, WI, USA), included an initial denaturation step (95 °C for 3 min), followed by a variable number of cycles (25 for saliva; 30 for feces and mucosa) of 94 °C for 30 s, 55 °C for 30 s, and 72 °C for 30 s, and a final extension (72 °C for 5 min) [12,13]. Cleanup of amplicons was performed using Agencourt AMPure XP SPRI magnetic beads (ThermoFisher Scientific; Waltham, MA, USA). Illumina sequencing libraries were then constructed by attaching indices (Nextera XT Index Kit, Illumina; San Diego, CA, USA), quantified using a Qubit 2.0 Fluorometer (ThermoFisher Scientific), normalized, and pooled. Libraries underwent paired-end sequencing on an Illumina MiSeq platform at BMR Genomics (Padua, Italy).
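As an illustration only, the cycling program described above can be written down as a small data structure, which makes the per-sample cycle counts and the total programmed hold time explicit. The step labels "annealing" and "extension" for the 55 °C and 72 °C steps, and all names in the sketch (`Step`, `total_hold_seconds`, `primer_pmol`), are assumptions introduced here and are not part of the original protocol. Ramp times between temperatures are not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str        # descriptive label (assumed, not stated in the protocol)
    temp_c: float    # hold temperature in degrees Celsius
    seconds: int     # hold time in seconds


# Values taken from the cycling program described in the text.
INITIAL_DENATURATION = Step("initial denaturation", 95, 3 * 60)
CYCLE = (
    Step("denaturation", 94, 30),
    Step("annealing", 55, 30),
    Step("extension", 72, 30),
)
FINAL_EXTENSION = Step("final extension", 72, 5 * 60)

# Number of amplification cycles per sample type.
CYCLES_BY_SAMPLE = {"saliva": 25, "feces": 30, "mucosa": 30}


def total_hold_seconds(sample_type: str) -> int:
    """Sum of programmed hold times, ignoring temperature ramping."""
    n_cycles = CYCLES_BY_SAMPLE[sample_type]
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL_DENATURATION.seconds + n_cycles * per_cycle + FINAL_EXTENSION.seconds


def primer_pmol(conc_um: float = 0.5, volume_ul: float = 50.0) -> float:
    """Amount of each primer per reaction: micromolar x microliters = picomoles."""
    return conc_um * volume_ul


if __name__ == "__main__":
    for sample in CYCLES_BY_SAMPLE:
        minutes = total_hold_seconds(sample) / 60
        print(f"{sample}: {CYCLES_BY_SAMPLE[sample]} cycles, {minutes:.1f} min hold time")
    print(f"each primer: {primer_pmol():.0f} pmol per 50-uL reaction")
```

Under these values, the programmed hold time is 45.5 min for saliva and 53 min for feces and mucosa, and 0.5 μM in 50 μL corresponds to 25 pmol of each primer per reaction.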

#### *2.5. Power Calculation*

In this cross-sectional observational study, a convenience sample of 80 cases will be enrolled, possibly distributed as follows: 10 potential CD patients, 10 complicated CD patients, 20 active CD patients, 20 treated CD patients, and 20 non-CD patients. The effect size (mean difference/standard deviation) that can be detected, given the sample size and 80% power, was computed based on the primary endpoint, i.e., the comparison of the microbiota composition between CD groups (potential, active, treated, and refractory) and non-CD patients (controls). A conservative alpha of 0.001 was used, given the multiple endpoints and comparisons planned. Under these assumptions, the detectable effect size will be 1.8 when comparing potential and treated patients to controls and 1.4 when comparing active and treated patients to controls.
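The relationship between group size, alpha, power, and detectable effect size can be sketched as follows. This is a minimal illustration, not the calculation used in the study: it assumes a two-sided, two-sample comparison of means and uses the normal approximation, and the function name `detectable_effect_size` is hypothetical. The test and software actually used are not stated in the text.

```python
from math import sqrt
from statistics import NormalDist


def detectable_effect_size(n1: int, n2: int, alpha: float = 0.001, power: float = 0.80) -> float:
    """Smallest standardized mean difference detectable with the given power.

    Normal approximation for a two-sided, two-sample comparison of means:
        d = (z_{1 - alpha/2} + z_{power}) * sqrt(1/n1 + 1/n2)
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value of the two-sided test
    z_power = std_normal.inv_cdf(power)          # quantile corresponding to the target power
    return (z_alpha + z_power) * sqrt(1 / n1 + 1 / n2)


if __name__ == "__main__":
    # Group sizes from the planned enrollment: 10 or 20 cases versus 20 controls.
    for n_cases in (10, 20):
        d = detectable_effect_size(n_cases, 20)
        print(f"{n_cases} cases vs 20 controls: d = {d:.2f}")
```

With alpha = 0.001 and 80% power, this approximation gives about 1.6 for a group of 10 and about 1.3 for a group of 20, each compared with 20 controls. A calculation based on the t distribution, which accounts for the small group sizes, would be expected to give somewhat larger values, in line with the 1.8 and 1.4 reported above.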
