**1. Introduction**

Mitochondria (mt) are semi-autonomous organelles found in almost all eukaryotic cells (cells with clearly defined nuclei). Their primary function is to produce a steady supply of adenosine triphosphate (ATP), and they are therefore termed the 'powerhouses' or 'energy factories' of cells. Chloroplasts (cp) and mitochondria most likely originated from formerly free-living bacteria through endosymbiotic acquisition, which can explain the presence of their own genomes [1,2]. With rapid developments in sequencing and genome assembly methods, an increasing number of complete organelle genomes have been assembled in the last decade. Thus far, over 4900 complete chloroplast and plastid genomes have been assembled, but only 321 plant mitogenomes have been assembled and deposited in GenBank Organelle Genome Resources (as of 14 May 2020; https://www.ncbi.nlm.nih.gov/genome/browse/), suggesting that the assembly of plant mitogenomes is complex and difficult.

Plant mitochondrial genomes are specific to each species and have complex genome structures [3–5], variable genome sizes [6,7], numerous repetitive sequences [8,9], multiple RNA editing modifications [10,11], and frequent gene gains or losses during evolution [9,12,13]. In seed plants, mitogenome sizes are highly variable, ranging from an exceptionally small genome of 66 kb in the parasitic plant *Viscum scurruloideum* [14] to the largest multi-chromosomal genome of 11.3 Mb in *Silene conica* [15]. Even if two species are evolutionarily close, their genome sizes may vary considerably. The mitogenome sizes of plants in the subfamily Papilionoideae range from 271 kb in *Medicago truncatula* [16] to 588 kb in *Vicia faba* [17], while the mitogenomes of most papilionoid legumes are approximately 400 kb in length [18]. This wide variation in mitogenome size can be attributed to the proliferation of repetitive sequences and the acquisition of foreign DNA from other organisms during evolution [19,20].

Previous studies have documented that the mitogenomes of seed plants are enriched with repetitive sequences, including simple sequence repeats (SSRs), tandem repeats, and dispersed repeats. SSRs in plant mitogenomes are commonly used as molecular markers for studying genetic diversity and identifying species [21]. Tandem repeats occur in a broad range of plant mitogenomes and can also serve as molecular markers for unravelling population processes in plants [22]. Large dispersed repeats are the main cause of genome rearrangements, which may generate multipartite structures [13,23–25].
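To make the notion of an SSR concrete, the following is a minimal sketch of how perfect SSRs (a short motif of 1 to 6 bp repeated in tandem) could be located in a sequence with a regular expression. The function name `find_ssrs` and the minimum repeat thresholds are assumptions chosen purely for illustration; they are not the tool or the parameters used in this study.

```python
import re

# Hypothetical thresholds (motif length -> minimum number of repeats).
# These values are illustrative assumptions, not those used in the paper.
DEFAULT_MIN_REPEATS = {1: 8, 2: 4, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq, min_repeats=None):
    """Return a list of (start, motif, repeat_count) for perfect SSRs.

    `start` is the 0-based position of the first base of the repeat tract.
    """
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n-1 further exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA"),
            # since those tracts belong to the shorter motif class.
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence containing an (AT)5 dinucleotide and an (A)9 mononucleotide SSR.
    print(find_ssrs("CGATATATATATGCAAAAAAAAAC"))
    # -> [(2, 'AT', 5), (14, 'A', 9)]
```

This sketch only reports perfect repeats; imperfect or compound SSRs, which dedicated SSR-detection software typically also handles, are outside its scope.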

Although the mitogenome sizes of seed plants are variable, the functional genes encoding NADH dehydrogenase, ubiquinol cytochrome c reductase, ATP synthase, and cytochrome c biogenesis proteins are highly conserved, in contrast to the succinate dehydrogenase and ribosomal protein genes. Many primordial mt genes have been lost during evolution, and these losses have been found to be closely related to the genes' specific functions. For example, *sdh3* and *sdh4* were lost in all gramineous mitogenomes, the *rps11* gene was lost during the differentiation of gymnosperms and angiosperms [26], and the *cox2* gene was lost during the differentiation of the Phaseoleae and Glycininae [18]. Strikingly, nearly all of the otherwise universally present NADH dehydrogenase genes were lost from the mitogenome of *Viscum scurruloideum*, a loss closely associated with its parasitic lifestyle [14].

The Fabaceae, commonly known as legumes, is an economically and ecologically important family of flowering plants that ranges from small annual herbs to giant trees, although most members are herbaceous perennials. It is the third-largest angiosperm family after the Asteraceae and Orchidaceae [27,28], consisting of about 770 genera and more than 20,000 species. A recent study by the Legume Phylogeny Working Group (LPWG) reclassified the three widely accepted Fabaceae subfamilies (Caesalpinioideae, Mimosoideae, and Papilionoideae) into six new subfamilies (Cercidoideae, Detarioideae, Duparquetioideae, Dialioideae, Caesalpinioideae, and Papilionoideae) based on a taxonomically comprehensive phylogeny [28]. However, due to the complexity of plant mitogenomes, only 27 mitogenomes of Fabaceae species have been assembled and deposited in the NCBI Nucleotide database (as of 14 May 2020), including 19 species in the Papilionoideae, six species in the Caesalpinioideae, one species (*Cercis canadensis*) in the Cercidoideae, and one species (*Tamarindus indica*) in the Detarioideae.

In this study, we assembled the complete mitogenome of the common bean *Phaseolus vulgaris*, a herbaceous annual plant grown worldwide for its edible dry seeds or unripe fruit. The common bean is one of the most important grain legumes for human consumption and plays an important role in sustainable agriculture due to its ability to fix atmospheric nitrogen [29]. We analyzed its gene content, repetitive sequences, RNA editing sites, selective pressure, and phylogenetic position, and then compared these features with those of other plant mitogenomes. The complete mitogenome of *P. vulgaris* will provide important information for investigating mitogenomic evolution within the Fabaceae and will aid the functional study of fabaceous mitogenomes. Mitochondrial biogenesis is very important in plant breeding, and knowledge of the complete mitogenome provides an opportunity to conduct further genomic breeding studies in the common bean.
