*2.6. Data Analysis and Taxonomical Identification*

Sanger sequencing results were manually reviewed and processed using CLC Genomics Workbench 8.5 software (Qiagen, Hilden, Germany), and all obtained sequences were submitted to the GenBank database.

Intra- and interspecific distances between the barcode sequences were calculated in MEGA v11.0.9 [38]. For each barcode alignment, the best-fitting substitution model was determined with the best DNA/Protein models (ML) search function and then used to calculate the distances. For the trnL-F barcode, the Tamura 3-parameter model [39] with a gamma distribution (shape parameter = 0.51) was used; for the ITS1, ITS2, and 5′-ETS barcodes, the Kimura 2-parameter model [40] with a gamma distribution (shape parameters = 0.77, 0.76, and 1.11, respectively) was used. In all cases, positions containing gaps and missing data were eliminated (complete deletion option).
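
To illustrate what a gamma-corrected distance of this kind involves, the following minimal sketch computes the Kimura 2-parameter distance with gamma-distributed rates for a single pair of aligned sequences, using the standard formula d = (a/2)[(1 − 2P − Q)^(−1/a) + (1/2)(1 − 2Q)^(−1/a) − 3/2], where P and Q are the proportions of transitions and transversions and a is the shape parameter. This is not the MEGA implementation; the function name `k2p_gamma_distance` and the example sequences are hypothetical. For a pair of sequences, skipping columns with gaps or ambiguous bases matches complete deletion; for a larger alignment, such columns would first have to be removed across all sequences.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_gamma_distance(seq1, seq2, shape):
    """Kimura 2-parameter distance with gamma-distributed rates.

    seq1, seq2: aligned sequences of equal length (hypothetical input).
    shape: gamma shape parameter a (e.g. 0.77 for one of the barcodes).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Skip any column with a gap or an ambiguous/missing base.
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        # A transition stays within purines or within pyrimidines.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if sites == 0:
        raise ValueError("no comparable sites after removing gaps")

    p = transitions / sites      # proportion of transitions (P)
    q = transversions / sites    # proportion of transversions (Q)
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for this model")

    return (shape / 2) * (
        term1 ** (-1 / shape) + 0.5 * term2 ** (-1 / shape) - 1.5
    )


# Hypothetical example: one transition (A<->G) in ten comparable sites.
d = k2p_gamma_distance("ACGTACGTAC", "GCGTACGTAC", shape=0.77)
print(round(d, 4))
```

The corrected distance is larger than the raw proportion of differing sites because the model accounts for multiple substitutions at the same position; smaller shape parameters (stronger rate heterogeneity) give a larger correction.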

Raw sequencing reads were trimmed using Trimmomatic v0.38 [41] with the parameters "LEADING:3 TRAILING:3 SLIDINGWINDOW:4:10 MINLEN:40". Taxonomic classification of the reads was carried out using the BLAST-based bioinformatic pipeline described elsewhere [37]. Taxa with an abundance below 1% for all barcodes in a given sample were discarded from the analysis. Spearman's rank-order correlation was calculated between the per-species abundance of mapped reads and the actual pollen abundance in the artificial pollen mixes.
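
The abundance filter and the correlation step can be sketched as follows. This is a minimal illustration using only the Python standard library, not the pipeline code itself. The nested-dictionary layout, the function names (`filter_taxa`, `spearman`), and all example values are assumptions made for the illustration. The filter assumes that a taxon is discarded from a sample only when it stays below 1% for every barcode in that sample, and Spearman's coefficient is computed as the Pearson correlation of average ranks.

```python
import math


def filter_taxa(abundance, threshold=1.0):
    """Drop taxa below `threshold` percent for all barcodes of a sample.

    abundance: {sample: {barcode: {taxon: percent}}} (assumed layout).
    """
    filtered = {}
    for sample, barcodes in abundance.items():
        # Taxa reaching the threshold in at least one barcode are kept.
        keep = {
            taxon
            for taxa in barcodes.values()
            for taxon, pct in taxa.items()
            if pct >= threshold
        }
        filtered[sample] = {
            barcode: {t: p for t, p in taxa.items() if t in keep}
            for barcode, taxa in barcodes.items()
        }
    return filtered


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rank-order correlation: Pearson on average ranks."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical example values, for illustration only.
reads = {"mix1": {"ITS1": {"sp_A": 50.0, "sp_B": 0.5, "sp_C": 0.2},
                  "ITS2": {"sp_A": 40.0, "sp_B": 2.0, "sp_C": 0.9}}}
print(filter_taxa(reads)["mix1"]["ITS1"])  # sp_C is removed
print(spearman([50.0, 30.0, 15.0, 5.0], [45.0, 35.0, 12.0, 8.0]))
```

Because the coefficient depends only on ranks, it measures whether read abundance increases monotonically with pollen abundance without assuming a linear relationship between the two.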

Analysis results were aggregated and plotted using Python with the Pandas [42], Matplotlib [43], and Seaborn [44] packages.
