*2.4. Sequence Analysis*

Bidirectional sequencing was performed for the rbcL, matK, and ITS2 barcode markers. The obtained sequences were assembled and aligned in Geneious Prime v2021 (geneious.com, accessed on 27 December 2021) and MEGA X [70] using the MUSCLE algorithm. The sequences were then submitted to NCBI GenBank through the web-based submission tool BankIt, and accession numbers were obtained for all the studied barcode markers (Table S1).

Further, the sequences were subjected to taxonomic evaluation using NCBI GenBank BLASTn to assess homology between the fragments [71]. In addition, unsupervised OTU picking methods were employed: the phylogenetic analysis was performed in MEGA X, and the OTUs were assessed using ABGD and ASAP.

Along with the unsupervised OTU picking methods, supervised machine learning (SML) methods were implemented to recognize divergent taxa. The aligned datasets were converted to WEKA's required file format using the FASTA to WEKA converter [54]. In WEKA, the random forest and sequential minimal optimization classifiers were then applied with 10-fold cross-validation [72].
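To illustrate the conversion step, the following minimal Python sketch turns an aligned FASTA file into WEKA's ARFF format, with one nominal attribute per alignment column and the taxon as the class attribute. It is not the converter cited in [54]. The header convention (`>Taxon_name|specimen_id`), the function names, and the treatment of gaps and ambiguous bases as missing values (`?`) are assumptions made for illustration only.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks).upper()))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks).upper()))
    return records


def fasta_to_arff(text, relation="barcodes"):
    """Convert aligned FASTA text to an ARFF string (illustrative sketch)."""
    records = parse_fasta(text)
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to a single common length")
    n_sites = lengths.pop()

    # Assumed header convention: the class label precedes the first '|'.
    labels = [header.split("|")[0] for header, _ in records]
    classes = sorted(set(labels))

    lines = [f"@relation {relation}", ""]
    for i in range(1, n_sites + 1):
        lines.append(f"@attribute pos{i} {{A,C,G,T}}")
    lines.append("@attribute class {" + ",".join(classes) + "}")
    lines += ["", "@data"]

    for label, (_, seq) in zip(labels, records):
        # Assumption: gaps and ambiguity codes become ARFF missing values.
        sites = [base if base in "ACGT" else "?" for base in seq]
        lines.append(",".join(sites + [label]))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = ">Taxon_a|v1\nACGT\n>Taxon_b|v2\nAC-T\n"
    print(fasta_to_arff(example))
```

A file produced this way can be opened in WEKA, where a classifier and the number of cross-validation folds are selected. Taxon labels containing spaces or commas would need quoting, which this sketch does not handle.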

Phylogenetic trees were constructed in MEGA X. Initially, a model test was performed to determine the best-fit nucleotide substitution model, i.e., the model with the lowest Bayesian Information Criterion (BIC) score. Accordingly, the Maximum Likelihood (ML) phylogeny for the rbcL dataset was inferred using the Kimura 2-parameter (K2P) model with a discrete gamma distribution (G). For the matK dataset, the ML tree was constructed using the General Time Reversible (GTR) model with a discrete gamma distribution (G). For the ITS2 dataset, the ML phylogeny was inferred using the K2P model with a discrete gamma distribution and invariant sites (G + I). Branch support for all trees was assessed with 1000 bootstrap replicates. The trees were annotated using the Interactive Tree of Life webserver. In addition, the ASAP webserver was used to build the partitions for species delimitation.
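The BIC-based model choice can be sketched as follows, using the standard definition BIC = -2 ln L + k ln n, where ln L is the maximized log-likelihood, k the number of free parameters, and n the sample size (taken here as the number of alignment sites, which is an assumption; software packages differ in how they count n and k). The log-likelihoods and parameter counts in the example are placeholders chosen for illustration and are not values from this study.

```python
import math


def bic(log_likelihood, n_params, n_sites):
    """Bayesian Information Criterion: -2 ln L + k ln n."""
    return -2.0 * log_likelihood + n_params * math.log(n_sites)


def best_model(candidates, n_sites):
    """Return (name, scores) for the candidate with the lowest BIC.

    candidates maps a model name to (log_likelihood, n_params).
    """
    scores = {
        name: bic(lnl, k, n_sites) for name, (lnl, k) in candidates.items()
    }
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    # Placeholder values only; not results from the present datasets.
    hypothetical = {
        "K2P+G": (-1000.0, 2),
        "GTR+G": (-998.0, 9),
    }
    name, scores = best_model(hypothetical, n_sites=500)
    for model, score in sorted(scores.items(), key=lambda item: item[1]):
        print(f"{model}: BIC = {score:.2f}")
    print("Lowest BIC:", name)
```

Because the penalty term k ln n grows with the number of parameters, a richer model such as GTR is preferred only when its gain in likelihood outweighs that penalty.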
