### *3.2. Evaluation of Japonica-Type SNP Value*

To identify the SNPs that originated from the *indica* and *japonica* genomes, the SNP value was evaluated. SNPs showing a variety value of 4 and a reference value of 1 were regarded as *japonica*-originated SNPs. The total number of *japonica*-originated SNPs in each 100 kb block of each chromosome was counted in order to identify regions introgressed from *japonica*. Previously, we divided the Tongil genome into segments that originated from *indica* and *japonica* using the sliding window method [3]. In this study, we allowed for two exceptions among the HYVs in order to include Takanari, Cheongcheong, Nampung and Minghui 63. Accordingly, a total of 14 *japonica*-originated genomic regions, which were shared by at least six HYVs on the *japonica*-originated segments of Tongil, were detected. The common *japonica*-originated genomic regions were distributed on nine chromosomes, excluding chromosomes 8, 10, and 12. There were three *japonica*-type regions on chromosome 2 and two regions each on chromosomes 1, 4, and 7. Furthermore, the regions were clustered or closely located on each chromosome. The size of the regions varied from 0.1 Mb for Chr7-1 and Chr11-1 to 2 Mb for Chr1-2. Out of the 14 regions, seven were common to all eight HYVs (Figure 1, Table 3).
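
The block-wise counting and the "shared by at least six HYVs" criterion described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the input layout (tuples of chromosome, 1-based position, variety value, reference value), the function names, and the toy data are all assumptions, and the rule for calling a block *japonica*-type within a single variety is not modelled here.

```python
from collections import Counter

BLOCK_SIZE = 100_000  # 100 kb blocks, as stated in the text


def count_japonica_snps(snps, block_size=BLOCK_SIZE):
    """Count japonica-originated SNPs per (chromosome, block index).

    Each SNP is a hypothetical tuple:
    (chromosome, 1-based position, variety value, reference value).
    A SNP counts as japonica-originated when the variety value is 4
    and the reference value is 1.
    """
    counts = Counter()
    for chrom, pos, variety_value, reference_value in snps:
        if variety_value == 4 and reference_value == 1:
            counts[(chrom, (pos - 1) // block_size)] += 1
    return counts


def shared_blocks(blocks_by_variety, min_varieties=6):
    """Return blocks present in at least `min_varieties` varieties."""
    tally = Counter()
    for blocks in blocks_by_variety.values():
        tally.update(set(blocks))
    return {block for block, n in tally.items() if n >= min_varieties}


# Toy SNPs (made-up positions, for illustration only).
toy_snps = [
    ("chr1", 50_000, 4, 1),
    ("chr1", 99_999, 4, 1),
    ("chr1", 100_001, 4, 1),
    ("chr1", 150_000, 1, 1),  # not japonica-originated, skipped
    ("chr2", 10, 4, 1),
]
block_counts = count_japonica_snps(toy_snps)

# Toy per-variety block sets: one block in all eight, one in only five.
toy_blocks = {f"HYV{i}": {("chr1", 0)} for i in range(1, 9)}
for i in range(1, 6):
    toy_blocks[f"HYV{i}"].add(("chr2", 5))
common = shared_blocks(toy_blocks)
```

With `min_varieties=6` and eight varieties, a block may be missing from up to two of them, which corresponds to the two exceptions allowed in the text.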


Abbreviation is as follows: ex.—except.

**Table 3.** *Japonica*-type SNP frequency (%) of Tongil and the other HYVs at common *japonica*-type regions among the eight HYVs.

### *3.3. QTL Comparison and Representative Gene Selection in Common japonica-Originated Genomic Regions*

To elucidate the function of the common *japonica*-type regions in HYVs, we first investigated the reported QTLs in the Q-TARO database [35]. A total of 101 selected QTLs in seven categories were co-located with the 14 common *japonica*-type regions on nine chromosomes. Category classification was carried out by manually checking the character and trait names of each QTL. For instance, the yield-related trait category contained various characters that could affect yield potential, such as source activity- and sink-related morphological traits, and sterility. Only the three regions on chromosome 2 were co-located with QTLs for all seven trait categories. The eating quality, abiotic stress, and yield-related categories together accounted for 80 QTLs. The largest number (10) of co-located QTLs was detected in the Chr6-1 region for eating quality. All the regions were co-located with QTLs for abiotic stress tolerance (Table 4). The co-location of these QTLs with the common *japonica*-type regions suggests that the common genomic regions in HYVs might be mainly associated with quality, yield, and abiotic stress tolerance.
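
Co-location of a QTL with a region can be read as an overlap of two physical intervals on the same chromosome. The sketch below tallies overlapping QTLs per region and trait category under that assumption. The tuple layouts, the region name, and all coordinates are hypothetical and are not taken from Q-TARO or from Table 4.

```python
from collections import defaultdict


def overlaps(start_a, end_a, start_b, end_b):
    """True when two closed intervals share at least one position."""
    return start_a <= end_b and start_b <= end_a


def colocated_qtl_counts(regions, qtls):
    """Count QTLs overlapping each region, split by trait category.

    regions: {name: (chromosome, start, end)}            (assumed layout)
    qtls:    [(chromosome, start, end, category), ...]   (assumed layout)
    """
    counts = defaultdict(lambda: defaultdict(int))
    for name, (r_chrom, r_start, r_end) in regions.items():
        for q_chrom, q_start, q_end, category in qtls:
            if q_chrom == r_chrom and overlaps(r_start, r_end, q_start, q_end):
                counts[name][category] += 1
    return counts


# Toy data with made-up coordinates, for illustration only.
toy_regions = {"RegionA": ("chr6", 1_000_000, 1_500_000)}
toy_qtls = [
    ("chr6", 1_400_000, 2_000_000, "eating quality"),  # overlaps RegionA
    ("chr6", 1_600_000, 1_700_000, "abiotic stress"),  # same chromosome, no overlap
    ("chr2", 1_000_000, 1_500_000, "yield-related"),   # different chromosome
]
toy_counts = colocated_qtl_counts(toy_regions, toy_qtls)
```

The category labels would still come from the manual classification described above; the sketch only covers the positional comparison.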

**Table 4.** Classification of the reported QTLs co-located with common *japonica*-type regions in eight HYVs.


Furthermore, from 12 of the common *japonica* chromosomal introgressions, we selected 39 genes that contain non-synonymous SNPs, which could affect the molecular function of the genes, and that are clearly annotated in the databases. No target gene satisfied these conditions in Chr4-2 and Chr7-1. The largest number (13) of selected genes was located on Chr1-2, which is the largest region, spanning 2 Mb. Only one gene each was selected from Chr2-2, Chr7-2, and Chr11-1. The size of these three blocks was 0.1–0.2 Mb. The genes annotated for the major traits of interest were *Os01g0348900, Os06g0130000, Os06g0130100* (stress tolerance), *Os06g0130400* (eating quality), and *Os01g0367100* (yield potential) (Table 5).

### *3.4. SNP Marker Development and Genotyping Using Fluidigm Platform*

A total of 39 SNP markers were designed in the 39 selected genes from the common *japonica*-originated genomic regions, one marker per gene. The SNPs for the markers were selected from among the non-synonymous SNPs. Five of the 39 SNP markers were designed in the 3′ or 5′ UTR. In addition, 14 agronomic trait-related SNP markers in *indica*-*japonica* SNP set 2 [29] and four previously developed yield-related SNP markers [36] were also used for the genotyping of 94 diverse germplasms. A total of 57 SNP markers were genotyped for the 94 germplasms using the Fluidigm system. Consequently, 54 SNPs showed polymorphism and a clear genotype; the exceptions were one monomorphic SNP marker designed in *Os01g0348900* on the Chr1-2 block and two SNP markers in *indica*-*japonica* SNP set 2, SaF-CT and SLG7-GC, which showed low base call quality. Therefore, we conducted further analysis using a total of 54 polymorphic SNP markers (Figure 2, Table S2).
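
The marker counts in this paragraph can be checked with a few lines of arithmetic, which also shows where the 38-marker and 16-marker sets used in Figure 3 come from. The variable names are illustrative only; the numbers are those given in the text.

```python
# Marker sets submitted for genotyping (counts from the text).
new_region_markers = 39  # one per selected gene in the common japonica regions
set2_markers = 14        # agronomic trait-related markers, indica-japonica SNP set 2
yield_markers = 4        # previously developed yield-related markers

genotyped = new_region_markers + set2_markers + yield_markers

# Markers dropped after genotyping.
monomorphic_new = 1      # marker in Os01g0348900 on the Chr1-2 block
low_quality_set2 = 2     # SaF-CT and SLG7-GC

polymorphic = genotyped - monomorphic_new - low_quality_set2
region_set = new_region_markers - monomorphic_new
trait_set = set2_markers + yield_markers - low_quality_set2

print(genotyped, polymorphic, region_set, trait_set)
```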


**Table 5.** Selected 39 genes in the common *japonica*-type regions.

**Figure 2.** Genomic location of 54 polymorphic SNP markers used in this study. Markers in black were newly developed on the common *japonica* regions, and markers in red were previously developed for agronomic trait-related genes.

The genotyping results divided the 94 varieties into distinct groups. A phylogenetic analysis of the 94 varieties was carried out using the 54 polymorphic SNP markers. The phylogenetic tree contained four groups: IND1, IND2, HYV, and JAP (Figure 3). All the sequenced HYVs except Takanari, which possesses the *indica* allele at Chr1-1 and Chr1-2, were clustered in the HYV group with seven Tongil-type and five *indica* varieties. These varieties were developed by inter-subspecific crosses or by inter-varietal crosses including the parents of Tongil-type varieties. For example, Keunseombyeo was derived from a cross between Dasanbyeo and Namyeongbyeo [37]. Taebaekbyeo was used for the development of Hanareumbyeo and Dasanbyeo (Figure S1, F–G). On the other hand, four of the five *indica* varieties (IR24, IRBB23, IRBB61, and IRBB66) were developed by the International Rice Research Institute (IRRI). IRBB23, IRBB61, and IRBB66 are a near-isogenic line series of IR24 developed to improve resistance against bacterial leaf blight [38]. IR24 is a high-yielding variety with good eating quality, developed from inter-subspecific crosses using one tropical *japonica* (CP-SLO) and two *indica* (SIGADIS and IR8) varieties [39]. Furthermore, IR24 was included in the breeding programs of six HYVs that were resequenced in this study (Figure S1, C–H). This implies that specific genomic regions were conserved in the HYV group by selection during the conventional breeding programs for HYV development.

**Figure 3.** Phylogenetic tree of 94 germplasms based on 54 SNP markers and a genotype heat map. The 38 SNP marker set (left) and the 16 SNP marker set (right) were developed on common *japonica* regions and agronomic trait-related genes, respectively. The color of the marker ID is the same as in Figure 2. The varieties highlighted in yellow are the eight resequenced HYVs. Homozygous alleles identical to Nipponbare are shown in red, homozygous alleles different from Nipponbare in green, and heterozygous alleles in blue. Grey indicates a missing genotype. The percentage values in parentheses under each subgroup represent the percentage of homozygous *indica* alleles in the 38 SNP marker set and the 16 SNP marker set.

A total of 38 SNP markers developed in the common Tongil-like *japonica*-type regions discriminated IND1, IND2, and HYV by the frequency of *japonica* alleles. The HYV group showed a *japonica* allele frequency of 93.8%, which is similar to that of JAP. The 16 SNP markers related to yield and other agronomic traits could distinguish JAP from the other three groups. For the 16 SNP set, the JAP group showed a *japonica* allele frequency of 86.1%, while the other three groups showed a lower *japonica* allele frequency of less than 30%. The HYV group contained *indica* alleles at the markers linked to genes associated with plant architecture (SD1-GA, NAL1, and TAC-CT), yield potential (GIF1, Hd6-AT, Ghd7, and GW8-AG), and subspecies differentiation (Rd-GA, qSH1-TG, and S5-TC). In contrast, the HYV group contained more than 50% *japonica* alleles at the markers linked to genes for grain shape and quality (GRF4, GS3-CA, qSW5-AG, GS6-GT, and WAXY-TG). In practice, the markers designed in the *japonica*-originated genomic regions, together with the yield-related markers from *indica* varieties, can differentiate HYVs from *indica* and *japonica* varieties. In fact, a high proportion of *japonica* alleles on Chr1-1, Chr1-2, and Chr3-1 was found in IND2, which consists of three *aus* varieties, one wild rice relative accession, and one weedy rice variety.
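
A group-level *japonica* allele frequency of the kind reported above could be computed along the following lines. This is a sketch under stated assumptions: the text does not give the exact formula, so the frequency is taken here as the share of homozygous *japonica* calls among all non-missing calls in a group. The genotype codes (`"J"`, `"I"`, `"H"`, `None`) and the toy matrix are hypothetical.

```python
def japonica_allele_frequency(genotypes):
    """Percentage of homozygous japonica calls among non-missing calls.

    genotypes: list of rows (one per variety), each a list of calls
    (one per marker). Assumed codes: "J" homozygous japonica,
    "I" homozygous indica, "H" heterozygous, None missing.
    """
    called = [g for row in genotypes for g in row if g is not None]
    if not called:
        return float("nan")  # no usable calls in this group
    return 100.0 * sum(g == "J" for g in called) / len(called)


# Toy group: two varieties genotyped at four markers (made-up calls).
toy_group = [
    ["J", "J", "I", None],
    ["J", "H", "J", "J"],
]
toy_frequency = japonica_allele_frequency(toy_group)  # 5 of 7 called genotypes
```

Counting heterozygous calls as half a *japonica* allele would be an equally plausible definition and would give slightly different percentages.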
