**1. Introduction**

**Citation:** Omelchenko, D.O.; Krinitsina, A.A.; Kasianov, A.S.; Speranskaya, A.S.; Chesnokova, O.V.; Polevova, S.V.; Severova, E.E. Assessment of ITS1, ITS2, 5′-ETS, and *trnL-F* DNA Barcodes for Metabarcoding of Poaceae Pollen. *Diversity* **2022**, *14*, 191. https://doi.org/10.3390/d14030191

Academic Editors: W. John Kress, Morgan Gostel and Michael Wink

Received: 30 January 2022; Accepted: 2 March 2022; Published: 5 March 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Grass pollen is one of the major causes of allergy, affecting 10–30% of the population worldwide [1,2]. There are over 400 species of grass in Europe, and their pollen is recognized as the leading cause of pollinosis [3]. About 100 grass species occur in the European part of Russia [4]; their flowering periods often overlap, and the allergenicity of their pollen is estimated to range from moderate to very high [5]. Aerobiological monitoring is a necessary part of anti-allergic measures: it makes it possible to track and predict the concentration dynamics of major airborne allergens and to adjust the therapy and lifestyle of patients with pollinosis. The standard method of pollen identification in air samples is light microscopy. However, one of its major disadvantages is that the similar pollen morphology of Poaceae species makes it challenging to discriminate species in airborne pollen mixes, which impairs the quality of aerobiological monitoring [6,7].

DNA metabarcoding is an alternative approach that has been actively developed in recent years. It allows qualitative (to the level of species or genus for some taxa) and, to some extent, quantitative analysis of the composition of complex biological mixes. It employs high-throughput sequencing (HTS) and comparative analysis of specific DNA sequences called "DNA barcodes" to discriminate species present in the mix. DNA barcoding has been widely used in various areas of botanical research, for example, in the phylogeny of wild cherry [8], the archaeobotany of grapevine [9], the authentication and identification of medicinal [10] and poisonous [11] plants, and the analysis of the plant species composition of honey [12,13].
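The principle of metabarcoding described above, comparing sequenced reads with reference barcodes and counting the matches per species, can be illustrated with a minimal sketch. It is not the pipeline used in this study: the function names, the toy sequences, the k-mer Jaccard similarity, and the 0.5 threshold are all assumptions made for illustration, and practical pipelines typically rely on alignment-based tools instead.

```python
from collections import Counter


def kmers(seq, k=4):
    """Return the set of all k-mers in a DNA sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def assign_read(read, references, k=4, min_similarity=0.5):
    """Assign a read to the reference barcode with the highest k-mer
    Jaccard similarity, or return None if no reference is similar enough.
    The similarity measure and threshold are illustrative assumptions."""
    read_kmers = kmers(read, k)
    best_name, best_score = None, 0.0
    for name, ref_seq in references.items():
        ref_kmers = kmers(ref_seq, k)
        union = read_kmers | ref_kmers
        score = len(read_kmers & ref_kmers) / len(union) if union else 0.0
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= min_similarity else None


def composition(reads, references, k=4, min_similarity=0.5):
    """Estimate the relative share of each reference among assigned reads."""
    counts = Counter(assign_read(r, references, k, min_similarity) for r in reads)
    counts.pop(None, None)  # drop reads that matched no reference
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}


# Hypothetical toy barcodes and reads (not real Poaceae sequences).
references = {
    "species_A": "ACGTACGTTGCAAGCTTAGC",
    "species_B": "TTGGCCAATCGATCGGATCC",
}
reads = [
    "ACGTACGTTGCAAGCTTAGC",
    "ACGTACGTTGCAAGCTTAGC",
    "TTGGCCAATCGATCGGATCC",
    "GGGGGGGGGGGGGGGGGGGG",  # matches neither reference and is discarded
]
shares = composition(reads, references)
```

The read shares obtained this way are only a proxy for pollen abundance, which is why the quantitative side of metabarcoding is described as reliable only to some extent.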

Choosing the correct DNA barcode for the target taxa is one of the main problems of plant barcoding [14,15]. The resolution capacity of each of the primary markers (first a combination of the chloroplast *matK* and *rbcL* genes recommended by the CBOL group [16], and later the nuclear ribosomal internal transcribed spacer (ITS) regions and several intergenic spacers) varies significantly between taxa (for a review, see [17]). Many studies of plant DNA barcoding note that identification at high taxonomic ranks (order, family) succeeds in more than 90% of cases, whereas insufficient reference barcode sequence data prevents determination to the level of genus or species [13,18]. Therefore, the right choice of a DNA barcode and of the primers to amplify it is the key to successful species identification.

The chloroplast genome regions *rbcL*, *matK*, *trnL*, and *trnH-psbA* and the nuclear ITS2 region are most often used as plant DNA barcodes. Some of these barcodes have been used with varying success for metabarcoding pollen (airborne or from food products such as honey). However, only the *rbcL*, *matK*, ITS2, and *trnL* barcodes have been compared against palynological analysis to assess qualitative and quantitative consistency [19]. In particular, a comprehensive study of ITS2 and *rbcL* has shown that they are usable in pollen metabarcoding for the construction of pollinator networks and the qualitative analysis of pollen mixes, although the quantitative agreement between the metabarcoding results and the real pollen abundance of mixture components was low [20,21]. Another study has assessed *trnL* and ITS1 for quantitative pollen analysis using metabarcoding and concluded that *trnL* demonstrates the best sequence-to-pollen prediction [22]. Furthermore, comparative studies have shown that the *trnL* intron, the *trnL-trnF* (*trnL-F*) intergenic spacer, the ITS region, and their combinations resolve grass species well [23–26]. Indel and SNP patterns of the *trnL-F* intergenic spacer and the ITS region have been employed for the infrageneric classification and phylogenetic study of the *Chascolytrum* and *Festuca* genera [27,28].

The external transcribed spacer (ETS) is another nuclear DNA barcode closely related to the ITS region in rDNA, but it is used less frequently than ITS. Nevertheless, ETS is regarded as a promising DNA barcode because, in several studies, the taxon-specific informativeness of the ETS sequence has proved to be the highest among nuclear and plastid barcodes [29–31].

Many published studies report species identification of different grasses using only some of these barcodes and focusing on a particular plant taxon (e.g., [32,33]). In this study, we compared the plastid *trnL-F* and nuclear ITS1 and ITS2 barcodes with the 5′-ETS barcode and assessed their capability to identify the pollen of a diverse set of 14 grass species from different genera of the Poaceae family. New Poaceae-specific primers were designed to amplify a 5′-ETS fragment suitable for HTS, as its length is less than 600 bp (the current maximum for Illumina paired-end sequencing) for all species in the study. Additionally, we optimized the protocol for DNA extraction from pollen grains to obtain high-quality DNA for amplification and sequencing. To identify the pollen composition, we created a local barcode sequence database for the reference Poaceae species using Sanger-sequenced *trnL-F*, ITS1, ITS2, and 5′-ETS sequences from live field samples, herbarium specimens, and available GenBank records. All four barcodes were tested for their capacity to resolve the composition of grass pollen mixes using artificial pollen mixes of varying complexity.
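The 600 bp limit mentioned above follows from paired-end read geometry: two reads sequenced from opposite ends of a fragment can cover it completely only if their combined length reaches the fragment length. The sketch below makes this arithmetic explicit. The 2 × 300 bp read length and the 20 bp minimum overlap are illustrative assumptions, not parameters reported in this study, and the function names are hypothetical.

```python
def max_mergeable_length(read_length=300, min_overlap=20):
    """Longest amplicon whose two paired-end reads still overlap by at
    least min_overlap bases (both defaults are illustrative assumptions)."""
    return 2 * read_length - min_overlap


def fits_paired_end(amplicon_length, read_length=300, min_overlap=20):
    """Check whether an amplicon can be fully covered by one read pair."""
    return amplicon_length <= max_mergeable_length(read_length, min_overlap)


# With 2 x 300 bp reads and no required overlap the limit is 600 bp;
# requiring a 20 bp overlap for read merging lowers it to 580 bp.
limit_without_overlap = max_mergeable_length(300, 0)
limit_with_overlap = max_mergeable_length(300, 20)
```

Under these assumptions, an amplicon slightly shorter than 600 bp can be covered end to end, but merging the two reads into one sequence becomes less reliable as the overlap shrinks.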
