*4.2. Clustering of Repeatome Elements*

To process the Illumina NGS data and to compare the repetitive DNA fraction of the studied species, a public web server running RE version 1 (http://www.repeatexplorer.org) (České Budějovice, Czech Republic) was used [53]. Repetitive elements in the genomes were discovered and characterized using the server's "clustering" tools. First, an all-to-all comparison of sequencing reads was performed with the mgblast tool. All hits with similarity above 90% over at least 55% of the sequence length were recorded, thus identifying sets of related DNA fragments. These similarity hits were then used to construct a graph in which nodes represent sequence reads and edges connect reads that share a similarity hit (Figure 2A). This algorithm was first applied to each species separately and subsequently to all seven species together for the comparative analysis of repeatome quantitative values. For the comparative analysis, reads were sampled in proportion to the genome size of each species (Table 2) [26].
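The filtering, graph construction, and proportional sampling steps described above can be illustrated with a minimal Python sketch. It is not the RE server's implementation: the hit tuple layout, the function names, and the choice to measure the 55% overlap against the longer of the two reads are assumptions made for illustration only. The way RE partitions the graph into clusters is not described in this section, so connected components are used here purely as the simplest stand-in.

```python
from collections import defaultdict

MIN_IDENTITY = 90.0  # percent similarity threshold quoted in the text
MIN_OVERLAP = 0.55   # minimum fraction of the sequence length covered by a hit


def build_read_graph(hits, read_lengths):
    """Build an undirected read graph from pairwise similarity hits.

    hits: iterable of (query, subject, pct_identity, aln_len) tuples
          (hypothetical layout, not the actual mgblast output format).
    read_lengths: dict mapping read id -> read length in bases.
    """
    graph = defaultdict(set)
    for query, subject, identity, aln_len in hits:
        if query == subject:
            continue  # ignore self-hits
        # Assumption: overlap is measured against the longer read.
        longer = max(read_lengths[query], read_lengths[subject])
        # Assumption: hits exactly at a threshold are kept.
        if identity >= MIN_IDENTITY and aln_len / longer >= MIN_OVERLAP:
            graph[query].add(subject)
            graph[subject].add(query)
    return graph


def connected_components(graph):
    """Group reads into clusters by connected components (a simplification)."""
    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        clusters.append(component)
    return clusters


def reads_per_species(genome_sizes, total_reads):
    """Split a read budget across species in proportion to genome size."""
    total_size = sum(genome_sizes.values())
    return {
        species: round(total_reads * size / total_size)
        for species, size in genome_sizes.items()
    }
```

With toy reads of 100 bases each, a hit at 95% identity over 80 bases would add an edge, whereas a hit at 85% identity or one covering only 30 bases would be discarded. Likewise, two hypothetical genomes with a 1:3 size ratio would receive read samples in a 1:3 ratio.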
