## *3.1. The Frequency and Distribution of Different SSR Types in Cucurbita Genomes*

A total of 103,056 microsatellite sequences were identified in the three *Cucurbita* genomes, including 34,375 SSR loci in the 269.9 Mb draft genome sequence of *C. moschata* cv. Rifu, 30,577 SSR loci in the 271.4 Mb draft genome sequence of *C. maxima* cv. Rimu, and 38,104 SSR loci in the 263 Mb draft genome sequence of *C. pepo* MU-CU-16 (Table S2). *Cucurbita pepo* had the largest number of SSR loci and the smallest reference genome, and therefore the highest average marker density (145 SSR/Mb). Because of this higher marker density, we used *C. pepo* as the reference for the following comparative genomic analysis.
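The density figure follows directly from the reported counts and assembly sizes. A minimal Python check, using only the numbers quoted above (the function and variable names are illustrative, not taken from the authors' pipeline):

```python
# SSR counts and assembly sizes (Mb) as reported in the text above.
genomes = {
    "C. moschata": (34_375, 269.9),
    "C. maxima": (30_577, 271.4),
    "C. pepo": (38_104, 263.0),
}


def ssr_density(count, size_mb):
    """Average number of SSR loci per megabase of assembly."""
    return count / size_mb


densities = {name: ssr_density(n, mb) for name, (n, mb) in genomes.items()}
total_loci = sum(n for n, _ in genomes.values())

for name, density in densities.items():
    print(f"{name}: {density:.1f} SSR/Mb")
print("total loci:", total_loci)
```

The three counts sum to the reported total, and *C. pepo* comes out with the highest density, which rounds to the 145 SSR/Mb given in the text.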

Here, we analyzed repeat types ranging from di-nucleotide to octa-nucleotide. Among all of these nucleotide motifs, di-nucleotide motifs were the most common type (41.0% overall), accounting for 41.78%, 39.90%, and 41.01% of the total SSR loci discovered in *C. moschata*, *C. maxima*, and *C. pepo*, respectively, followed by tri-nucleotide motifs (16.97%, 19.19%, and 17.88%, respectively), whereas octa-nucleotide motifs (3.78%, 3.76%, and 3.38%, respectively) were the least represented repeat type in the three *Cucurbita* genomes (Table S2). In general, SSR frequency decreased as motif length increased, except for hepta-nucleotide SSRs.

We further examined the distribution of SSR motifs with regard to their repeat numbers (Figure 1). For all repeat types, the SSR frequency decreased sharply as the repeat number increased, and this decrease was more pronounced in the longer SSR motifs (Figure 1). Consequently, the mean repeat number of the di-nucleotides was the highest of all the repeat types. The analysis of individual SSR types revealed that some specific motifs were more prevalent than others in each class (Figure S1). For example, the AT motif was the most frequent di-nucleotide type in all three genomes, accounting for 31.61% (in *C. moschata*), 28.81% (in *C. maxima*), and 30.45% (in *C. pepo*) of the total di-nucleotide loci. Similarly, the AAT, AAAT, AAAAT, AAAAAT, AAAAAAT, and AAAAAAAT motifs (AATAATAT motif in *C. maxima*) were the most frequent types in their respective classes. These results indicated that AnT-type, A/T-rich motifs were the most abundant in every motif class in the *C. moschata*, *C. maxima*, and *C. pepo* genomes.
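To make the notions of motif class and repeat number concrete, the following is a minimal sketch of how perfect SSRs with di- to octa-nucleotide motifs could be located in a sequence. It is not the tool used in this study. The function names and the minimum-repeat threshold of 3 are assumptions chosen for illustration, and only perfect (uninterrupted) repeats are reported.

```python
import re


def is_lower_order(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. ATAT = AT x 2)."""
    n = len(motif)
    return any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))


def find_ssrs(seq, min_len=2, max_len=8, min_repeats=3):
    """Return sorted (start, motif, repeat_number) tuples for perfect SSRs.

    min_repeats is a hypothetical threshold applied to every motif length.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_len, max_len + 1):
        # A k-mer followed by at least (min_repeats - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_lower_order(motif):
                continue  # already counted under its shorter motif, or mono-nucleotide
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


print(find_ssrs("GG" + "AT" * 4 + "CC"))
print(find_ssrs("CC" + "AAT" * 5 + "GG"))
```

Counting the returned tuples by `len(motif)` gives per-class frequencies, and grouping by the third element gives the repeat-number distribution of the kind plotted in Figure 1.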

**Figure 1.** Distribution of SSR motif repeat numbers and relative frequency in the *Cucurbita* genomes. The vertical axis shows the abundance of microsatellites with different motif repeat numbers (from 3 to >20), which are distinguished by different colors.

We also investigated the SSR density on each chromosome of the three *Cucurbita* species and found that the density of microsatellite loci was not correlated with chromosome size (Table S3). For example, in the *C. moschata* genome, the longest chromosome (Chr04) had a medium SSR density, while Chr02, which is much shorter than Chr04, had the highest SSR density. A similar trend was observed in the other two genomes, indicating that SSRs were unevenly distributed across the *Cucurbita* chromosomes. To better understand the distributions of different SSR motifs, we further checked their frequencies on each chromosome (Figure 2). The distribution of the different SSR types on individual chromosomes was consistent with their frequencies and SSR density in the whole *Cucurbita* genomes.

The genomic sequences containing these microsatellites were screened for PCR primer design, and 94,272 SSR loci were found to have flanking sequences suitable for primer design. *C. moschata* had the lowest proportion of SSRs suitable for primer design (84.75%), whereas the proportions in *C. maxima* and *C. pepo* reached 94.53% and 95.09%, respectively (Table S2). Although the di-nucleotide repeat types were the most frequent in all three genomes, they performed less well in primer design. Interestingly, the hexa-nucleotide repeat types had the highest proportion of SSRs suitable for primer design in all three genomes, followed by the penta-nucleotide repeat types, indicating that longer motifs were more suitable for primer design in *Cucurbita* species. Finally, a total of 91,248 SSR primers (28,194 in *C. moschata*, 28,061 in *C. maxima*, and 34,993 in *C. pepo*) were designed, with some primers spanning more than one SSR locus as a compound SSR (Tables S4–S6).
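As a quick consistency check on these figures, the overall share of loci with suitable flanking sequence and the primer total can be recomputed from the numbers in the text. The variable names below are illustrative only.

```python
# Values quoted in the text above.
total_ssr_loci = 103_056
loci_with_suitable_flanks = 94_272
primers_designed = {
    "C. moschata": 28_194,
    "C. maxima": 28_061,
    "C. pepo": 34_993,
}

# Overall percentage of SSR loci that passed the flanking-sequence screen.
suitable_pct = 100 * loci_with_suitable_flanks / total_ssr_loci
# Sum of the per-species primer counts.
total_primers = sum(primers_designed.values())

print(f"suitable for primer design: {suitable_pct:.2f}%")
print("primers designed:", total_primers)
```

The overall proportion (about 91.5%) lies between the per-species values of 84.75% and 95.09%, as expected, and the per-species primer counts add up to the reported 91,248.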

**Figure 2.** Distribution of SSR repeat types on each chromosome in (**A**) *C. moschata*, (**B**) *C. maxima*, and (**C**) *C. pepo*. The vertical axis shows the number of microsatellites from di-nucleotide to octa-nucleotide, which are distinguished by different colors. The horizontal axis shows the chromosomes of each *Cucurbita* species; LG00 denotes all scaffolds not anchored to a chromosome.

## *3.2. Chromosome Synteny Relationships of C. pepo with Other Cucurbitaceae Species*

To understand the universality and correlation of SSR markers among Cucurbitaceae crops, we compared the cross-species SSR markers between *C. pepo* and other Cucurbitaceae species by in silico PCR. We identified 391 cross-species SSR markers between *C. pepo* and *Cucumis sativus* (cucumber; hereafter *C. sativus*), 425 between *C. pepo* and *Cucumis melo* (melon; hereafter *C. melo*), 717 between *C. pepo* and *Citrullus lanatus* (watermelon; hereafter *C. lanatus*), 11,732 between *C. pepo* and *C. maxima*, and 15,274 between *C. pepo* and *C. moschata* (Tables S7–S11). The ratio of collinear blocks to inversion blocks was 26:26 between the *C. pepo* and *C. sativus* genomes, 25:36 between the *C. pepo* and *C. melo* genomes, 51:38 between the *C. pepo* and *C. lanatus* genomes, 154:158 between the *C. pepo* and *C. maxima* genomes, and 153:152 between the *C. pepo* and *C. moschata* genomes. Interestingly, the ratio of collinear blocks to inversion blocks was nearly 1:1 among the three *Cucurbita* species. Each *C. pepo* chromosome shared 3–36 SSR markers with *C. sativus*, *C. lanatus*, or *C. melo*. However, most of the *C. pepo* chromosomes shared a larger number of SSR markers (3–1,436) with *C. maxima* or *C. moschata*. The syntenic block CpeCma7, between *C. pepo* chromosome Cpe1 and *C. maxima* chromosome Cma4, had the largest number of shared SSR markers (296).
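The idea behind in silico PCR can be sketched as follows: a marker counts as transferable to another genome if both of its primers find binding sites there, in the right orientation and within an amplifiable distance. The code below is a deliberately simplified illustration, not the procedure used in this study. It requires exact primer matches, searches only one strand of the template, and uses a hypothetical `max_product` limit, whereas real tools typically tolerate mismatches and search both strands.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def in_silico_pcr(template, fwd, rev, max_product=1000):
    """Return (start, product_size) for each exact-match amplicon on the given strand."""
    template = template.upper()
    fwd = fwd.upper()
    rev_site = revcomp(rev)  # the reverse primer binds the opposite strand
    products = []
    start = template.find(fwd)
    while start != -1:
        end = template.find(rev_site, start + len(fwd))
        while end != -1:
            size = end + len(rev_site) - start
            if size > max_product:
                break  # later sites only give longer products
            products.append((start, size))
            end = template.find(rev_site, end + 1)
        start = template.find(fwd, start + 1)
    return products


# Toy template: 3 nt flank, forward site, 5 nt insert, reverse site, 3 nt flank.
toy = "TTT" + "ACGTAC" + "GGGGG" + "TTCAGA" + "AAA"
print(in_silico_pcr(toy, "ACGTAC", "TCTGAA"))
```

In this simplified model, a primer pair that yields exactly one product in the target genome would be treated as a candidate cross-species marker, and the product's start coordinate supplies the physical position used for synteny comparison.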

The physical positions of those common shared markers were compared. The main syntenic relationships between *C. pepo* and other Cucurbitaceae species are listed in Table 1, and the syntenic relationships visualized for *C. pepo* with *C. lanatus*, *C. melo*, and *C. sativus* are shown in Figure 3. The main syntenic relationships among the chromosomes revealed complex mosaic patterns. In Figure 3, each *C. pepo* chromosome was syntenic to more than two chromosomes in other Cucurbitaceae species. The *C. pepo* chromosomes Cpe9 and Cpe16 had the simplest syntenic pattern with watermelon, and each of them was mainly syntenic to one watermelon chromosome (Table 1). Cpe9 was syntenic to watermelon chromosome W5, and 14 commonly shared SSR markers were found between Cpe9 and W5. From the markers CpeSSR15544 to CpeSSR16107, there were three blocks belonging to watermelon chromosome W5, and each block contained at least four SSR markers. According to the continuous physical positions of these markers on both of the reference genomes, the syntenic blocks CpeWM37 and CpeWM38 showed an inversion pattern, and the syntenic block CpeWM39 showed a collinear pattern between *C. pepo* and *C. lanatus*. Similar comparisons were carried out between *C. pepo* and *C. sativus* or *C. pepo* and *C. melo* using the cross-species SSR markers. The *C. pepo* chromosomes Cpe7, Cpe8, Cpe11, and Cpe20 had the simplest syntenic pattern with *C. sativus*, and each of them was only syntenic to one cucumber chromosome. Meanwhile, the simplest syntenic patterns between *C. pepo* and *C. melo* were mainly found on chromosomes Cpe15, Cpe18, Cpe19, and Cpe20. The most complicated syntenic pattern was found on *C. pepo* chromosome Cpe1, which corresponded to five chromosomes of *C. moschata*, four chromosomes of *C. maxima*, seven chromosomes of *C. lanatus*, three chromosomes of *C. sativus*, and five chromosomes of *C. melo*.
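The distinction between collinear and inverted blocks rests on whether marker order is preserved between the two genomes. A minimal sketch of such a classification is given below, assuming each shared marker in a block is described by its physical position in both genomes. The function name, the majority-vote rule, and the coordinates are hypothetical, and the exact criteria used to define blocks in this study may differ.

```python
def block_orientation(pairs):
    """Classify a syntenic block from (position_in_A, position_in_B) marker pairs.

    Markers are sorted by their position in genome A; the block is called
    'collinear' if positions in genome B mostly increase along that order,
    'inverted' if they mostly decrease, and 'ambiguous' otherwise.
    """
    ordered = sorted(pairs)
    steps = [b2 - b1 for (_, b1), (_, b2) in zip(ordered, ordered[1:])]
    increasing = sum(1 for s in steps if s > 0)
    decreasing = sum(1 for s in steps if s < 0)
    if increasing > decreasing:
        return "collinear"
    if decreasing > increasing:
        return "inverted"
    return "ambiguous"


# Hypothetical coordinates for two four-marker blocks (not data from Table 1).
print(block_orientation([(100, 900), (200, 700), (300, 500), (400, 300)]))
print(block_orientation([(100, 150), (200, 260), (300, 340), (400, 480)]))
```

Tallying the labels over all blocks between two genomes would give a collinear-to-inversion ratio of the kind reported above.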

**Table 1.** The main syntenic relationships of *C. pepo* with other Cucurbitaceae species.


The syntenic relationships among different *Cucurbita* species were simple and clear. For instance, each of the 20 chromosomes in *C. pepo* was mainly syntenic with one chromosome in *C. moschata* or *C. maxima* (Figure 4), implying that the chromosomes in the *Cucurbita* genomes were highly conserved during evolution. Our results also showed three main relationship patterns among the *C. pepo*, *C. maxima*, and *C. moschata* genomes: (1) Eleven chromosomes showed a linear relationship between *C. pepo* and *C. maxima* or *C. moschata*, such as Cpe2–Cmo1–Cma1, and most of the cross-species markers on the corresponding chromosomes showed collinear patterns. (2) Eight chromosomes showed an inverted relationship between *C. pepo* and *C. maxima* or *C. moschata*; for example, chromosome Cpe1 of *C. pepo* was inverted relative to chromosome Cmo4 of *C. moschata* and Cma4 of *C. maxima*. (3) A mosaic pattern was found between *C. pepo* and *C. maxima* or *C. moschata*, for example, Cpe4–Cmo11–Cma11.

**Figure 3.** Syntenic relationships of *C. pepo* with (**A**) *C. lanatus*, (**B**) *C. melo*, and (**C**) *C. sativus*. Chromosome synteny between *C. pepo* and *C. sativus* was based on 391 cross-species markers; synteny between *C. pepo* and *C. melo* was based on 425 cross-species markers; synteny between *C. pepo* and *C. lanatus* was based on 717 cross-species markers. W1–W11 represent the eleven chromosomes of *C. lanatus*, M01–M12 represent the twelve chromosomes of *C. melo*, C01–C07 represent the seven chromosomes of *C. sativus*, and LG01–LG20 represent the twenty chromosomes of *C. pepo*. Syntenic blocks are connected by lines of the same color originating from the *C. pepo* chromosomes.

**Figure 4.** Chromosome synteny of *C. pepo* (blue) with *C. moschata* (green) and *C. maxima* (yellow). Physical positions along the chromosomes of each species run clockwise in the figure. Chromosome synteny between *C. pepo* and *C. moschata* was based on 14,276 cross-species markers; synteny between *C. pepo* and *C. maxima* was based on 10,655 cross-species markers. Cpe1–Cpe20 represent the twenty chromosomes of *C. pepo*, Cmo1–Cmo20 represent the chromosomes of *C. moschata*, and Cma1–Cma20 represent the chromosomes of *C. maxima*. Syntenic relationships between *C. pepo* and *C. moschata* are shown as green lines, and those between *C. pepo* and *C. maxima* are shown as yellow lines.
