### *1.3. Characteristics of the Peanut Genome*

Cultivated peanut is an allotetraploid (2n = 4× = 40, AABB) with a genome size of 2800 Mb/1C. Its genome was shown to derive from a recent hybridization of *A. duranensis* (A subgenome) and *A. ipaensis* (B subgenome) [23–26]. Because the polyploidization event occurred recently, the genetic diversity of cultivated peanut is extremely low [27]. The peanut subgenomes are very closely related [28,29] and have an estimated repetitive sequence content of 64% [1], which makes the assembly of peanut genome sequences extremely difficult [1,26,30]. The genome sequences of the diploid ancestors of cultivated peanut (*A. duranensis* and *A. ipaensis*) were reported in 2016 and became the basis for understanding the genome of cultivated peanut [26]. The sequences of *A. duranensis* (A genome progenitor) and *A. ipaensis* (B genome progenitor) provided new insights into the biology, evolution, and genome changes of cultivated peanut and accelerated the molecular breeding of peanut varieties [31].

The genome of the cultivated allotetraploid peanut *A. hypogaea* was sequenced in 2019 and compared with the genomes of the related diploids *A. duranensis* and *A. ipaensis*. A total of 39,888 A subgenome genes and 41,526 B subgenome genes were annotated in the allotetraploid genome [32].

### *1.4. Development of Molecular Markers Using Next Generation Sequencing (NGS) Technology*

In 2005, pyrosequencing technology was implemented for large-scale parallel (deep) sequencing, ushering in next-generation sequencing (NGS) technology and revolutionizing biological genomic research [33]. Over the past decade, NGS technology has made significant progress, and the cost of sequencing has dropped sharply [27]. The throughput and accuracy of sequencing data have also improved markedly. In particular, genome-wide studies using de novo assembly, resequencing, and a variety of bioinformatic methods have enabled the discovery of large numbers of single-nucleotide polymorphisms (SNPs) and simple sequence repeats (SSRs) in complex genomes [26,34–36]. In recent work, high-throughput genotyping by double-digest restriction-site-associated DNA sequencing (ddRADseq) yielded a total of 14,663 SNPs, and an SNP-based genetic linkage map was constructed using 1765 SNP markers in an F9 RIL population of 166 lines derived from a cross between Zhonghua 5 and ICGV86699 [37]. Numerous SNP and cleaved amplified polymorphic sequence (CAPS) markers were also developed from the resequencing of two Korean peanut germplasms, K-Ol and Pungan, indicating that such molecular marker information can provide valuable guidance for peanut breeding programs [27].

Because cultivated peanut has a relatively large genome and low genetic diversity, SNP array chips are needed for high-throughput genotyping [38]. Through DNA resequencing and RNA sequencing of 41 peanut accessions and wild diploid ancestors, a total of 163,782 SNPs were obtained. Of these, 58,233 unique, highly informative SNPs were selected to construct the high-density 58K SNP array Axiom\_Arachis [39]. This array could be used to accelerate high-resolution mapping and molecular breeding in peanut.

### *1.5. Applications of High-Density SNP Arrays in Crops*

As the most abundant type of DNA sequence variation in the genome, SNPs can be used to associate genotypic variation with target phenotypes. High-density SNP arrays have been developed for high-resolution mapping in crops and are widely used in applications that require a large number of molecular markers, such as high-density genetic profiling, genome-wide association studies (GWAS), and genomic selection [38,40,41]. In peanut, 107 accessions of the U.S. peanut mini core collection were genotyped using a 58K Affymetrix SNP array, and a total of 13,527 highly polymorphic SNP markers were selected for marker-trait association analysis of arachidic and behenic fatty acid composition [42]. A total of 2882 polymorphic SNPs retained from the second edition of the Axiom\_Arachis array (Axiom\_Arachis2) were used to identify loci controlling the pod constriction trait in 195 F7 recombinant inbred lines (RILs) [43]. The 48K Axiom\_Arachis2 SNP array was also applied to identify SNPs among two sets of RILs and the two original Nod+ parental lines in order to explore the genetic factors and genomic regions controlling nodulation in peanut [44].
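As an illustration of how polymorphic markers might be selected from array genotype calls, the following minimal Python sketch filters SNPs by call rate and minor allele frequency (MAF). It is not the procedure used in the cited studies. The function names, the thresholds (`min_maf=0.05`, `min_call_rate=0.9`), and the diploid-like 0/1/2 dosage coding are assumptions made for illustration only; genotype calling in an allotetraploid is more involved in practice.

```python
from typing import Dict, List, Optional

# Assumed coding: 0, 1, or 2 copies of the alternate allele; None = missing call.
Genotype = Optional[int]


def call_rate(calls: List[Genotype]) -> float:
    """Fraction of samples with a non-missing genotype call."""
    if not calls:
        return 0.0
    return sum(g is not None for g in calls) / len(calls)


def minor_allele_frequency(calls: List[Genotype]) -> float:
    """Minor allele frequency computed over the non-missing calls."""
    observed = [g for g in calls if g is not None]
    if not observed:
        return 0.0
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1.0 - alt_freq)


def select_polymorphic(
    snps: Dict[str, List[Genotype]],
    min_maf: float = 0.05,         # hypothetical threshold
    min_call_rate: float = 0.9,    # hypothetical threshold
) -> List[str]:
    """Return the names of SNPs that pass both filters, in input order."""
    return [
        name
        for name, calls in snps.items()
        if call_rate(calls) >= min_call_rate
        and minor_allele_frequency(calls) >= min_maf
    ]
```

Monomorphic markers have a MAF of zero and are removed by the second filter, while markers with many failed calls are removed by the first.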

Genomics-assisted breeding (GAB), which uses large amounts of genomic data related to important agronomic traits, could be used to develop new varieties faster than traditional breeding methods. Detailed genetic maps consisting of thousands of array-based SNPs have been used to identify genes controlling target traits [41,45]. A GWAS, also known as a whole-genome association study, is an observational study of a genome-wide set of genetic variants in different individuals that investigates whether any variant is associated with the target traits [46]. Any phenotypic differences can then be connected back to the underlying causative loci via various mapping approaches, including quantitative trait loci (QTL) mapping. Many research groups have used GWAS to identify associations between genotypes and phenotypes and to discover novel biological mechanisms [47]. Currently, most GWAS are performed using high-throughput SNP data obtained from SNP arrays, which offer a greater density of variants and a wide range of allele frequencies [48–51]. GWAS data are easy to generate and share, and GWAS can be conducted using a variety of applications and software [46].
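The core idea of a GWAS, testing each variant in turn for association with a trait, can be sketched in a few lines of Python. The example below is a deliberately simplified single-marker scan and not the method of any cited study. It assumes genotypes coded as 0/1/2 allele dosages, uses the large-sample approximation that n·r² follows a chi-square distribution with one degree of freedom, and applies a Bonferroni correction. All function names are hypothetical. Real analyses typically also correct for population structure and kinship, which this sketch omits.

```python
import math
from typing import Dict, List


def pearson_r(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 if either variable is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def single_marker_scan(
    genotypes: Dict[str, List[int]], phenotype: List[float]
) -> Dict[str, float]:
    """Approximate p-value per SNP from the statistic n * r^2 ~ chi-square(1)."""
    n = len(phenotype)
    p_values = {}
    for name, dosages in genotypes.items():
        r = pearson_r(dosages, phenotype)
        chi2 = n * r * r
        # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
        p_values[name] = math.erfc(math.sqrt(chi2 / 2.0))
    return p_values


def bonferroni_hits(p_values: Dict[str, float], alpha: float = 0.05) -> List[str]:
    """SNPs whose p-value falls below alpha divided by the number of tests."""
    if not p_values:
        return []
    threshold = alpha / len(p_values)
    return [name for name, p in p_values.items() if p < threshold]
```

Because thousands of markers are tested at once, the multiple-testing correction in the last step matters: without it, many markers would appear associated by chance alone.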
