**3. Results**

#### *3.1. Genome Resequencing and Sequence Polymorphism Identification*

The successful mapping of QTLs relies on genetic maps with high-density molecular markers that are polymorphic between accessions. Previously, the genome of *G. paniculata* was assembled onto 17 chromosomes using a wild-type accession with pink flowers (WT-P) [25]. To develop sufficient molecular markers for *G. paniculata* genetic research, we detected sequence polymorphisms between WT-P and another wild-type accession with white flowers (WT-W) by resequencing WT-W on the high-throughput MGISEQ-2000 platform (PE150). After filtering, a total of 8.05 Gb of clean reads was generated, 82.23% of which mapped to the reference genome, giving an average sequencing depth of 5.70× (Figure 2B). Several kinds of natural genetic variation were detected between the reference and resequenced genomes, including 2,377,499 SNPs, 1,366,056 InDels, 1403 SVs, and 28 CNVs, whose densities are shown on the Circos map (Figure 2A). Interestingly, the InDels were distributed preferentially towards the chromosome ends rather than in the centromeric regions. Most InDels (>80%) were shorter than 5 bp, and InDels of 5 to 10 bp accounted for 10% of the total. About 5% (68,302/1,366,056) of the InDels were longer than 10 bp and are therefore suitable for genome-wide marker construction (Figure 2C).
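The length classes described above can be illustrated with a minimal sketch. The function name `indel_length_bins` and the toy input are hypothetical and are not part of the original analysis pipeline; the bin boundaries (5 and 10 bp treated as inclusive for the middle class) are an assumption based on the wording of the text.

```python
from collections import Counter


def indel_length_bins(lengths):
    """Count InDels in the three length classes used in the text.

    Assumption: the middle class includes both 5 bp and 10 bp.
    """
    bins = Counter({"<5 bp": 0, "5-10 bp": 0, ">10 bp": 0})
    for n in lengths:
        if n < 5:
            bins["<5 bp"] += 1
        elif n <= 10:
            bins["5-10 bp"] += 1
        else:
            bins[">10 bp"] += 1
    return bins


# Proportion of InDels longer than 10 bp, using the counts reported in the text.
fraction_over_10 = 68_302 / 1_366_056

if __name__ == "__main__":
    # Toy lengths (bp), for illustration only.
    print(indel_length_bins([1, 2, 3, 4, 7, 12, 35]))
    print(f"InDels > 10 bp: {fraction_over_10:.1%}")
```

Only the last class is carried forward as marker candidates, since short length differences are difficult to resolve by gel electrophoresis.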

#### *3.2. Construction of InDel Markers for Polymorphism Analysis*

To develop InDel markers that discriminate alleles between WT-P and WT-W, insertions or deletions longer than 10 bp were chosen as candidates, with neighbouring markers spaced ~2 Mb apart. Sequence fragments of about 400 bp containing the insertion or deletion were used as templates for primer design. In total, 407 primer pairs were designed across the 17 chromosomes (Figure 3). To validate the newly designed markers, PCR was conducted and the products were analysed by gel electrophoresis. Of the 407 markers, 289 clearly distinguished the alleles of WT-P and WT-W. Another 34 markers produced closely spaced bands on a 3.5% gel but could still discriminate the alleles of WT-P and WT-W. These markers can be used where a chromosome region has few markers, and their bands could probably be separated better on a gel of higher concentration. The success rate of the designed primers varied across chromosomes from 40% to 92.9%, with an average of 71.0% (Table 1). Our data thus establish a genome-wide set of InDel markers for genetic mapping in *G. paniculata*. Nevertheless, some chromosomes, such as Chr.4, Chr.7, Chr.12 and Chr.14, showed obvious gaps between adjacent available markers, probably owing to the low density of InDels in the centromeric regions of these chromosomes. It may therefore be necessary to develop other molecular markers, such as SNPs, to fill these gaps in the future.
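The candidate selection described above (InDels longer than 10 bp, spaced ~2 Mb apart) could be sketched as a simple greedy filter. This is an illustrative assumption, not the procedure actually used: the function `select_candidates`, the tuple layout `(chromosome, position, length)` and the toy coordinates are all hypothetical.

```python
def select_candidates(indels, min_len=11, spacing=2_000_000):
    """Greedy selection of marker candidates.

    `indels` is an iterable of (chromosome, position_bp, length_bp) tuples.
    On each chromosome, an InDel longer than 10 bp is kept when it lies at
    least `spacing` bp downstream of the previously kept candidate.
    """
    selected = []
    last_pos = {}  # chromosome -> position of the last candidate kept
    for chrom, pos, length in sorted(indels):
        if length < min_len:
            continue  # too short to resolve reliably on a gel
        if chrom not in last_pos or pos - last_pos[chrom] >= spacing:
            selected.append((chrom, pos, length))
            last_pos[chrom] = pos
    return selected


if __name__ == "__main__":
    # Toy coordinates, for illustration only.
    toy = [
        ("Chr1", 100, 12),
        ("Chr1", 500_000, 15),
        ("Chr1", 2_100_000, 4),
        ("Chr1", 2_200_000, 20),
        ("Chr2", 50, 11),
    ]
    for candidate in select_candidates(toy):
        print(candidate)
```

A greedy pass of this kind leaves gaps wherever no InDel longer than 10 bp occurs within the next interval, which is consistent with the sparse marker coverage noted for some centromeric regions.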

**Figure 2.** Resequencing of WT-W based on the WT-P genome sequence. (**A**) Distribution of genomic structural variation between the two *G. paniculata* wild-type accessions. a: reference sequence. b: SNP density distribution. c: InDel density distribution. d: CNV duplication. e: CNV deletion. f: SV insertion. g: SV deletion. h: SV inversion. i: SV translocation. Abbreviations: SNP, Single Nucleotide Polymorphism; InDel, Insertion/Deletion; CNV, Copy Number Variation; SV, Structural Variation. (**B**) Distribution of sequencing coverage depth on each chromosome of *G. paniculata*. The mean read depth was calculated in 10,000 bp windows and log2-transformed. (**C**) Distribution of InDel lengths between WT-P and WT-W.

**Figure 3.** The physical map of the 407 InDel markers distributed across all 17 chromosomes of the *G. paniculata* genome. Each InDel marker is named by its chromosome number followed by its physical position. Green markers discriminate alleles between WT-P and WT-W. Red markers amplified closely spaced bands on the gel, and black markers were not usable.


**Table 1.** Success rates of the InDel markers on all 17 chromosomes.

Note: Green markers discriminate alleles between WT-P and WT-W. Red markers amplified close bands on gel.

#### *3.3. InDel Marker Polymorphisms among Commercial Cultivars*

Wild accessions and commercial cultivars each possess valuable agronomic traits; for example, wild types are generally more resistant, whereas commercial varieties display larger flowers and more petals. However, little research has focused on the genetic regulators of these traits, so the underlying mechanisms remain unknown. To explore whether the InDel markers designed here can distinguish alleles between wild-type and commercial varieties, PCR amplification was conducted using genomic DNA of WT-P and four commercial varieties (YX1 to YX4) as templates. Of the 407 primer pairs, 191 were able to discriminate alleles between WT-P and the commercial cultivars. The polymorphism of the InDel markers among WT-P and the commercial cultivars was then analysed by pairwise comparisons (Table 2). The number of available markers for each pair of accessions ranged from 31 (YX1 vs. YX4) to 173 (YX1 vs. WT-P), with an average of 92. The InDel markers were well suited to discriminating alleles between WT-P and the commercial cultivars (171 markers available on average), reflecting a high degree of polymorphism (Figure 4), whereas no more than 50 markers were available between any two commercial cultivars. This implies that the commercial cultivars are closely related, which is consistent with the observation that all four bear white flowers and differ only in flower size.
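The pairwise comparison summarised in Table 2 can be illustrated with a minimal sketch that counts, for each pair of accessions, the markers whose band patterns differ. The data layout (a dictionary mapping each marker to per-accession band codes, with `None` for no band), the marker names and the rule that markers without a band in either accession are excluded are assumptions made for illustration only.

```python
from itertools import combinations


def pairwise_polymorphic(genotypes, accessions):
    """Count polymorphic markers for every pair of accessions.

    `genotypes` maps marker name -> {accession: band code or None}.
    Assumption: a marker is counted for a pair only when both accessions
    produced a band and the two band codes differ.
    """
    counts = {}
    for a, b in combinations(accessions, 2):
        n = 0
        for calls in genotypes.values():
            x, y = calls.get(a), calls.get(b)
            if x is not None and y is not None and x != y:
                n += 1
        counts[(a, b)] = n
    return counts


if __name__ == "__main__":
    # Toy band codes for three hypothetical markers.
    toy = {
        "marker_1": {"WT-P": "A", "YX1": "B", "YX2": "B"},
        "marker_2": {"WT-P": "A", "YX1": "A", "YX2": "B"},
        "marker_3": {"WT-P": "A", "YX1": None, "YX2": "B"},
    }
    for pair, n in pairwise_polymorphic(toy, ["WT-P", "YX1", "YX2"]).items():
        print(pair, n)
```

Treating a missing band as uninformative rather than as a distinct allele is a conservative choice; counting it as polymorphic would raise the pairwise totals.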

**Figure 4.** Matrix of the polymorphisms detected with the InDel markers among the five accessions of *G. paniculata*. Blue squares represent WT-P bands; green, yellow and orange squares represent bands that differ from WT-P; grey squares indicate that no band was detected.


**Table 2.** Number of InDel markers that were polymorphic in pairwise comparisons of the five *G. paniculata* accessions.
