*2.1. Genome Assembly: Standard and Custom Tools*

The initial assembly was performed using Unicycler [13], an assembly tool optimized for circular genomes. Unicycler produced 10 circular contigs with a total length of 404 Kb and similar read coverage, ranging from 733.16 to 1835.25. BLAST analysis indicated that these 10 contigs are mitochondrial, based on the presence of typical plant mitochondrial genes and on high similarity to the *Fallopia multiflora* mitogenome. These results were unexpected. Although there is plenty of evidence that plant mitogenomes can exist as multiple circles and even as non-circular forms owing to intramolecular recombination mediated by repeats [14], genome assemblies usually recover a single circular molecule that includes all the subcircles (the so-called master circle). This is especially true when the data are not limited to shotgun short-read libraries but include mate-pair libraries and/or long-read data that allow long repeats to be resolved. There are several reports of bipartite mitogenomes [15,16], including in Polygonaceae [7]. Higher numbers of circles are much rarer: more than 10 circles were found in only 4 of the 307 assembled plant mitochondrial genomes deposited in NCBI GenBank (as of 21 February 2020). All of them represent very special cases: extremely enlarged mitogenomes in the genus *Silene* [17] and the parasitic plants *Lophophytum mirabile* [1] and *Cynomorium* species [18], in which a large part of the mitogenome was acquired by HGT from their hosts. Therefore, we initially suspected that this result could be a misassembly.

To check this, we developed a new assembly tool called Elloreas (ELongating LOng REad ASsembler). It is based on principles similar to those of NOVOPlasty [19], a seed-and-extend algorithm optimized for assembling plastid and mitochondrial genomes from whole-genome sequencing data. While NOVOPlasty was created for short-read assembly, Elloreas performs best with long reads, though it can also work with short reads. An important feature of Elloreas is that it reports alternative extension paths if they exist.
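The idea of reporting alternative extension paths can be illustrated with a minimal sketch. This is not the Elloreas code: the function `find_forks` and its parameters `min_reads` and `similarity_ratio` are hypothetical names introduced here for illustration. For simplicity, the sketch treats each candidate extension as an exact string, whereas noisy long reads would in practice require alignment-based grouping.

```python
from collections import Counter


def find_forks(extensions, min_reads=2, similarity_ratio=0.5):
    """Pick the best-supported extension and list competing alternatives.

    extensions: list of strings, one per read, each being the sequence by
        which that read extends past the current contig end.
    min_reads: hypothetical threshold, minimum number of reads that must
        support an extension for it to be considered at all.
    similarity_ratio: hypothetical threshold, an alternative counts as a
        "fork" if its support is at least this fraction of the best one.

    Returns (best_extension, alternatives), where alternatives is a list of
    (sequence, read_count) pairs. best_extension is None if nothing passes
    min_reads.
    """
    counts = Counter(extensions)
    # most_common() sorts by decreasing read support.
    supported = [(seq, n) for seq, n in counts.most_common() if n >= min_reads]
    if not supported:
        return None, []
    best_seq, best_n = supported[0]
    alternatives = [
        (seq, n) for seq, n in supported[1:] if n >= similarity_ratio * best_n
    ]
    return best_seq, alternatives


if __name__ == "__main__":
    reads = ["ACGT"] * 10 + ["ACTT"] * 8 + ["GGGG"]
    best, forks = find_forks(reads)
    print("best extension:", best)
    print("alternative paths:", forks)
```

An empty list of alternatives corresponds to an unambiguous extension; a non-empty list signals that the contig end may lie within a repeat or at a recombination point.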

Basically, Elloreas works in the following way:


6. It repeats all steps from "2." to "6." for this extended contig.

The behavior of Elloreas is controlled by multiple parameters that the user can optionally change, for example, the minimum percent identity required to map a read to a contig and the minimum number of reads supporting an extension required to extend the contig. As starting sequences, we used contigs (identified as mitochondrial by the BLAST search) assembled by Canu and Falcon, two widely used long-read assemblers. Extension of these contigs by Elloreas showed that they correspond to circular chromosomes: after several iterations of extension, the sequences at the 5′ and 3′ ends were found to be identical. Additionally, Elloreas indicated that there were no "forks" during the extension, i.e., no cases of several alternative extensions supported by similar numbers of reads. This confirmed the existence of the distinct circular chromosomes inferred by Unicycler. The sequences of the Elloreas and Unicycler contigs were identical with one exception, an 857-bp deletion in Unicycler contig mito2. Mapping the raw reads to the Unicycler contigs showed that the variant assembled by Elloreas is the correct one. We expect that Elloreas will be useful for assembling other small genomes and for testing their assemblies, whether circular or not.
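The circularity criterion used above (identical sequence at both ends of an extended contig) can be sketched as follows. This is an illustrative example under simplifying assumptions, not the procedure implemented in Elloreas: the functions `terminal_overlap` and `circularize` and the parameter `min_overlap` are hypothetical, and the comparison is an exact string match, whereas real data would typically call for an alignment-based comparison that tolerates sequencing errors.

```python
def terminal_overlap(contig, min_overlap=20):
    """Return the length of the longest proper suffix of `contig` that is
    identical to its prefix, or 0 if none is at least `min_overlap` long.

    Simple quadratic scan; adequate for a sketch, not for large inputs.
    """
    for k in range(len(contig) - 1, min_overlap - 1, -1):
        if contig[:k] == contig[-k:]:
            return k
    return 0


def circularize(contig, min_overlap=20):
    """If the contig ends repeat its start, trim the redundant 3' copy and
    return one full turn of the circle; otherwise return None."""
    k = terminal_overlap(contig, min_overlap)
    return contig[:-k] if k else None


if __name__ == "__main__":
    # Toy circle of 30 bp, "extended" 25 bp past its own start.
    circle = "T" + "ACGACCGAGCAAGGCACAGGACCAGAGCA"
    extended = circle + circle[:25]
    print(terminal_overlap(extended))
    print(circularize(extended) == circle)
```

Taking the longest matching overlap recovers the shortest sequence consistent with a circle; a short `min_overlap` risks false positives from internal repeats, so the threshold would need to be chosen with the repeat content of the genome in mind.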
