### *3.2. LOC\_Os09g26970 on qGL9.1 Links to Grain Length in Rice*

*LOC\_Os09g26970* encodes a protein with a cytochrome P450 domain (PF00067), and the cytochrome P450 family is one of the largest gene superfamilies in plants [31]. The rice genome contains 356 P450 genes, and P450s play important roles in various biochemical pathways that produce primary and secondary metabolites [32], some of which are essential for controlling plant cell proliferation and expansion. Several genes encoding cytochrome P450 family proteins, such as *D11* [21], *GW10* [22], *BSR2* [33], *GL3.2* [34], and *GE* [35], play important roles in regulating rice grain shape. In particular, the P450 proteins encoded by *D11* and *GW10* control grain size through the brassinosteroid (BR) biosynthetic pathway. BR is an essential plant steroid hormone that regulates many developmental processes, such as stem elongation and vascular differentiation [36], and it is especially important in the regulation of grain size. Based on the re-sequencing data, KEGG enrichment analysis in this study placed *LOC\_Os09g26970* in a significantly enriched pathway, brassinosteroid biosynthesis (ko00905). We therefore speculate that *qGL9.1* regulates grain shape through a pathway similar to that of *D11* and *GW10*. We further sequenced the CDS region of *LOC\_Os09g26970* and found 13 SNPs within it. Some of the haplotypes showed significant differences in grain length and grain width. We speculate that this locus affects both grain length and grain width but showed differences only in grain length in the genetic population of this study, possibly because of limited genetic variation. 
Next, we will construct transgenic materials, including knockout, overexpression, and complementation lines of *LOC\_Os09g26970*, to verify its biological function in regulating rice grain length, and we will apply exogenous BR to test whether the effect of *LOC\_Os09g26970* on grain length depends on the BR pathway.

### *3.3. Breeding Value and Potential of qGL9.1*

In general, the cooking and eating quality of *japonica* rice is better than that of *indica* rice. Although *indica* rice has longer grains and better appearance quality than *japonica* rice, the quality of long-grain *indica* rice is often inferior to that of *japonica* rice [2,37]. In recent years, molecular breeding with grain shape genes in *indica* and *japonica* rice has produced several important results. New *indica* hybrid rice varieties, Taifengyou 55 and Taifengyou 208, with improved grain yield and quality were developed by pyramiding the semi-dominant *GS3* and *GW7*<sup>TFA</sup> alleles from tropical *japonica* rice varieties [38]. The *GW8* and *GS3* alleles were pyramided into HJX74 to produce short and wide grains, resulting in the breeding of Huabiao 1 [12]. Based on a deletion in the functional region of *TGW6* and its alleles, the functional marker CAPs6-1 was developed to quickly screen rice varieties carrying *TGW6* [39]. In this study, allelic variation A from *japonica* Pin20 was present in a small number of *indica* rice samples but was not identified in other *japonica* rice samples, indicating that *LOC\_Os09g26970* may be a rare grain shape regulator in *japonica* rice germplasm. In addition, 10 SNPs in the coding region of *LOC\_Os09g26970* formed nine haplotypes across 3010 rice germplasms. The grain length of the germplasm carrying the Pin20 genotype (Hap9) was 9.36 mm, whereas the average grain length of the other haplotypes was 8.60 mm. Hap9 is therefore the optimal haplotype for grain length; it also has the largest GL/W, making it an ideal allelic variation for grain length. Moreover, the germplasm corresponding to all nine haplotypes showed long-grain characteristics (the minimum GL/W was 2.52). The nine haplotypes of *LOC\_Os09g26970* are therefore helpful for determining the grain length of rice germplasm. 
Moreover, we selected one SNP from Hap9 as a molecular marker to analyze the individuals with significant differences in grain length in the BC<sub>3</sub>F<sub>3</sub> population and found that SNP10 could be used as a marker for grain length. In future *indica*–*japonica* hybrid breeding, selecting Hap9 can not only retain the excellent quality traits of *japonica* rice but also help to increase grain length. The KASP marker used in this study is closely linked to GL and can be used directly for marker-assisted selection. In addition, this locus can be passed to offspring through inter-*japonica* hybridization, so long-grain varieties can also be selected directly by conventional breeding methods.

#### **4. Materials and Methods**

#### *4.1. Plant Materials*

In this study, two *japonica* varieties, the short-grain female parent SJ15 and the long-grain male parent Pin20, were used as parental lines to develop 667 F<sub>2</sub> individuals and the corresponding F<sub>2:3</sub> population. The F<sub>2</sub> population was planted during the normal growing season (from April to October), and the mature seeds were then harvested. All 667 F<sub>2:3</sub> lines were used for grain type evaluation after maturity. To fine map the target gene, one F<sub>2</sub> individual with long grains was backcrossed with SJ15 to obtain BC<sub>3</sub>F<sub>1</sub> seeds, and a BC<sub>3</sub>F<sub>1</sub> individual with a long-grain phenotype was self-crossed to generate the BC<sub>3</sub>F<sub>2</sub> (725 individuals) and BC<sub>3</sub>F<sub>3</sub> (1570 individuals) populations. The 667 F<sub>2:3</sub> lines were used for QTL-seq and QTL mapping, and the BC<sub>3</sub>F<sub>2</sub> and BC<sub>3</sub>F<sub>3</sub> populations were used to fine map the *qGL9.1* candidate gene. All of the lines and their parents were planted at the Northeast Agricultural University experimental station (Heilongjiang Province, China; 47°98′ N, 128°08′ E; 128 m above sea level).

#### *4.2. Evaluation of Grain Type for Rice*

The grain size of the F<sub>2:3</sub> and BC<sub>3</sub>F<sub>3</sub> populations was investigated when the rice was fully mature. We collected all of the panicles of each line in envelopes, dried them in natural light, and then placed them in an oven at 37 °C for one week. Three main panicles of each line, with approximately the same appearance, were randomly selected and used to measure the panicle length and the number of grains per panicle. The grain length (GL) and grain width (GW) of 10 seeds of each line were measured with vernier calipers, and the ratio of GL to GW was calculated. The phenotypic data for each line were measured in three replicates, and their average was used for statistical analysis.
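The averaging described above can be sketched as follows. This is a minimal illustration rather than the script used in the study: the function name `summarize_line` and the input layout are hypothetical, and computing GL/GW as the ratio of the line means (rather than the mean of per-seed ratios) is an assumption.

```python
from statistics import mean

def summarize_line(replicates):
    """Summarize one line from replicate seed measurements.

    replicates: list of replicates, each a list of (GL_mm, GW_mm) tuples,
    one tuple per seed (hypothetical input layout).
    """
    # Mean grain length and width within each replicate
    rep_gl = [mean(gl for gl, _ in rep) for rep in replicates]
    rep_gw = [mean(gw for _, gw in rep) for rep in replicates]
    # Average across replicates to get the value used for the line
    gl = mean(rep_gl)
    gw = mean(rep_gw)
    # Assumption: GL/GW is taken as the ratio of the two line means
    return {"GL": gl, "GW": gw, "GL/GW": gl / gw}
```

In practice each replicate would hold the caliper readings for its seeds, and the returned dictionary would supply the per-line values used for mapping.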

#### *4.3. Construction of Segregating Pools and Whole-Genome Re-Sequencing*

Young leaves from 667 individuals of the F<sub>2</sub> population were collected separately for total genomic DNA extraction using a modified cetyltrimethylammonium bromide (CTAB) method [40]. Then, 30 individuals of extreme GL type and 30 individuals of extreme GW type were selected to form two bulked pools. To simplify the following description, we abbreviate the GL-type DNA pool as GL-pool and the GW-type DNA pool as GW-pool. For the GL-pool, the GW-pool, and the two parents, isolated DNA was quantified using a Nanodrop 2000 spectrophotometer (Thermo Scientific, Fremont, CA, USA). The DNA from the GL-pool and GW-pool plants was then quantified precisely with a Qubit® 2.0 Fluorometer (Life Technologies, Carlsbad, CA, USA), and equal amounts of DNA from the plants of each pool were mixed. The four DNA libraries were sequenced on the Illumina MiSeq platform using the MiSeq Reagent Kit v2 (500 cycles) (Illumina Inc., San Diego, CA, USA).

#### *4.4. QTL-Seq Analysis*

The raw sequencing data were filtered using an in-house Perl script provided by Biomarker Technology Co., Ltd. (Beijing, China). The resulting high-quality reads were then mapped to the Nipponbare reference genome IRGSP-1.0 [41] using the Burrows–Wheeler Aligner [42]. Duplicate reads were removed from the clean reads mapped to the reference genome using the Picard tool (https://sourceforge.net/projects/picard/) (accessed on 8 June 2021). SNPs and InDels (1–5 bp) were called with GATK [43] using the default settings, and a series of filters was applied to obtain highly accurate SNP and InDel sets [44]. The association analysis was performed on the SNPs using the Euclidean distance (ED) [45], the G statistic [46,47], and the two-tailed Fisher's exact test [48]. Finally, the interval shared by the three methods was taken as the QTL interval.
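For a single SNP, the three association statistics can be sketched as below. This is an illustration under stated assumptions, not the pipeline used in the study: the function names and input layouts are hypothetical, the ED is computed from base frequencies in the two pools, the G statistic uses the standard form G = 2 Σ O ln(O/E) on a 2 × 2 table of allele counts, and no window smoothing or significance threshold is included.

```python
import math

def euclidean_distance(depth_a, depth_b):
    """ED between two pools at one SNP.

    depth_a, depth_b: dicts mapping base -> read depth (hypothetical layout).
    """
    total_a = sum(depth_a.values())
    total_b = sum(depth_b.values())
    # Sum of squared differences in base frequency between the pools
    return math.sqrt(sum(
        (depth_a.get(base, 0) / total_a - depth_b.get(base, 0) / total_b) ** 2
        for base in "ACGT"
    ))

def g_statistic(table):
    """G = 2 * sum(obs * ln(obs / exp)) for [[ref_a, alt_a], [ref_b, alt_b]]."""
    n = sum(map(sum, table))
    rows = [sum(row) for row in table]
    cols = [table[0][j] + table[1][j] for j in range(2)]
    g = 0.0
    for i in range(2):
        for j in range(2):
            obs = table[i][j]
            if obs > 0:  # a zero cell contributes nothing to the sum
                g += obs * math.log(obs * n / (rows[i] * cols[j]))
    return 2 * g

def fisher_exact_two_tailed(table):
    """Two-tailed Fisher's exact test p-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell
        return math.comb(r1, x) * math.comb(r2, c1 - x) / math.comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Sum the probabilities of all tables at least as extreme as the observed one
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
```

Each function takes the allele depths of the GL-pool and GW-pool at one SNP; a genome-wide scan would apply them per SNP and then intersect the regions flagged by the three statistics.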

#### *4.5. Further Mapping of the qGL9.1*

To further delimit the position of *qGL9.1*, we developed kompetitive allele-specific PCR (KASP) markers within the *qGL9.1* interval; the KASP primers were designed with Primer Premier 5 software (Premier Biosoft International, Corina Way, Palo Alto, CA, USA) based on the re-sequencing data of the two parents. The 5′ end of each KASP forward primer was ligated with the FAM (5′-GAAGGTGACCAAGTTCATGCT-3′) or HEX (5′-GAAGGTCGGAGTCAACGGATT-3′) linker sequence. All markers that were polymorphic between the parents were selected and used to genotype the 667 F<sub>2</sub> individuals. Linkage maps were then constructed and the candidate region was narrowed down using the inclusive composite interval mapping (ICIM) module of QTL IciMapping 4.2 (http://www.isbreeding.net) (accessed on 19 June 2023). The threshold of the LOD score for declaring the presence of a significant QTL was determined by a permutation test with 1000 repetitions at *p* < 0.001. Then, 725 BC<sub>3</sub>F<sub>2</sub> individuals and 1570 BC<sub>3</sub>F<sub>3</sub> individuals were screened for recombinants using the KASP markers within the target region. Each KASP marker contained two allele-specific forward primers and one common reverse primer. The reaction mixture was prepared according to the protocol described by KBiosciences (http://www.ksre.ksu.edu/igenomics) (accessed on 19 June 2023). All of the KASP primers are listed in Supplementary Table S2.

#### *4.6. Fine Mapping and Candidate Gene Screening of qGL9.1*

To fine map *qGL9.1*, plants heterozygous across the *qGL9.1* interval in the BC<sub>3</sub>F<sub>2</sub> population were identified with the KASP markers, and the BC<sub>3</sub>F<sub>3</sub> secondary population was obtained by selfing. The BC<sub>3</sub>F<sub>3</sub> population was genotyped, and the recombinant plants were screened to fine map *qGL9.1*. Candidate genes were mined as follows: (1) Ensembl (http://ensemblgenomes.org/) (accessed on 19 June 2023) was used to annotate the candidate genes, and the possible domains of the candidate genes were detected with the Pfam database (http://pfam.xfam.org/) (accessed on 19 June 2023). (2) Genes carrying mutations were screened according to the sequencing information. (3) The genes with sequence variation were analyzed by qRT-PCR. After the young panicles began to differentiate, panicles of Pin20 and SJ15 were sampled at lengths of 2 cm, 5 cm, 7 cm, and 13 cm, and the expression of the candidate genes was compared between the parents.

The total RNA of the rice was extracted according to the steps of the GeneCopoeia-BlazeTaq™ SYBR® Green qPCR Mix 2.0 extraction kit, and RNA purification and reverse transcription were carried out according to the steps of SIMGEN of Hangzhou Xinjing Biological Reagent Co., Ltd. (No. 8, Xiyuan 1st Road, Xihu District, Hangzhou, Zhejiang, China). Amplification was performed with a Roche LightCycler96 fluorescence quantitative PCR instrument at Northeast Agricultural University. Specific primers for the candidate gene were designed from its transcript sequence with Primer Premier 5.0 software, and the sequences are shown in Supplementary Table S10. Rice *Actin1* was used as the internal reference [49], and primer specificity was checked with the melting curve. Three replicates were set for each sample, and the relative expression of genes in tissues was calculated using the 2<sup>−ΔΔCt</sup> method. qRT-PCR analysis was performed as previously described [50].
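The 2<sup>−ΔΔCt</sup> calculation can be sketched as follows. This is a minimal illustration: the function name `relative_expression` and its arguments are hypothetical, the choice of calibrator sample (for example, one parent at one panicle length) is an assumption, and the formula presumes roughly equal, near-100% amplification efficiency for the target gene and *Actin1*.

```python
from statistics import mean

def relative_expression(target_ct, ref_ct, calib_target_ct, calib_ref_ct):
    """Relative expression by the 2^(-ddCt) method.

    target_ct, ref_ct: replicate Ct values of the candidate gene and the
    internal reference in the sample of interest.
    calib_target_ct, calib_ref_ct: the same for the calibrator sample
    (which sample serves as calibrator is an assumption of this sketch).
    """
    # dCt: target normalized to the internal reference, per sample
    delta_sample = mean(target_ct) - mean(ref_ct)
    delta_calib = mean(calib_target_ct) - mean(calib_ref_ct)
    # ddCt: sample relative to calibrator, then convert to fold change
    return 2 ** -(delta_sample - delta_calib)
```

A value above 1 indicates higher expression in the sample than in the calibrator, and a value below 1 indicates lower expression.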

#### *4.7. Haplotype Analysis of Candidate Genes*

According to the RFGB database (Haplotype analysis module of https://www.rmbreeding.cn/index.php (accessed on 19 June 2023)), the differential bases in the coding region of candidate genes between parents were searched, and the haplotypes of these differential bases in 3010 rice varieties and the variation of each haplotype in the different rice germplasms were analyzed. The phenotypic data of the grain length, grain width, and aspect ratio in the RFGB database and their genomic information were used to analyze the differences between the different haplotypes of the candidate genes.
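The grouping step of such a haplotype analysis can be sketched as below. The sketch does not reproduce the RFGB haplotype module: the function name `haplotype_means`, the record layout, and the SNP position labels are hypothetical, and significance testing between haplotypes is not included.

```python
from collections import defaultdict
from statistics import mean

def haplotype_means(accessions, snp_order):
    """Group accessions by haplotype and average a trait per haplotype.

    accessions: list of dicts with 'genotype' (position -> allele) and
    'GL' (grain length, mm); this layout is hypothetical.
    snp_order: SNP positions in the order used to build the haplotype string.
    """
    groups = defaultdict(list)
    for acc in accessions:
        # A haplotype is the concatenation of alleles at the chosen SNPs
        hap = "".join(acc["genotype"][pos] for pos in snp_order)
        groups[hap].append(acc["GL"])
    # Report the number of accessions and mean grain length per haplotype
    return {hap: (len(values), mean(values)) for hap, values in groups.items()}

# Toy records with invented values, for illustration only
toy = [
    {"genotype": {"snp1": "A", "snp2": "T"}, "GL": 9.0},
    {"genotype": {"snp1": "A", "snp2": "T"}, "GL": 9.5},
    {"genotype": {"snp1": "G", "snp2": "C"}, "GL": 8.5},
]
summary = haplotype_means(toy, ["snp1", "snp2"])
```

The same grouping can be repeated for grain width and aspect ratio by substituting the trait key.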

#### *4.8. Development of KASP Markers and Validation of GL*

To verify the association of *LOC\_Os09g26970* with GL, two non-synonymous SNPs (nSNPs) were screened from the exons of *LOC\_Os09g26970*, and the corresponding KASP markers were developed. The 100-bp sequences upstream and downstream of the target nSNPs were extracted from the Nipponbare genome sequence. Each KASP marker contained two allele-specific forward primers and a common reverse primer. The reaction mixture was prepared according to the instructions of KBiosciences (http://www.ksre.ksu.edu/igenomics (accessed on 19 June 2023)), and the KASP primers are shown in Supplementary Table S2.
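The flank extraction and the tailing of the allele-specific primers can be sketched as follows. The FAM and HEX tail sequences are those given in Section 4.5; everything else is an assumption made for illustration: the function names, the 20-nt allele-specific core ending on the SNP base, and the omission of melting-temperature checks and common reverse primer design, which dedicated primer design software would handle.

```python
FAM_TAIL = "GAAGGTGACCAAGTTCATGCT"  # FAM linker sequence from Section 4.5
HEX_TAIL = "GAAGGTCGGAGTCAACGGATT"  # HEX linker sequence from Section 4.5

def snp_flanks(sequence, snp_index, flank=100):
    """Return the sequences upstream and downstream of a SNP (0-based index)."""
    start = max(0, snp_index - flank)
    upstream = sequence[start:snp_index]
    downstream = sequence[snp_index + 1:snp_index + 1 + flank]
    return upstream, downstream

def kasp_forward_primers(upstream, allele_1, allele_2, core_length=20):
    """Build two tailed allele-specific forward primers.

    Assumption: each primer core is the last (core_length - 1) upstream bases
    followed by the allele base at the 3' end; no Tm or dimer checks are made.
    """
    core = upstream[-(core_length - 1):]
    return FAM_TAIL + core + allele_1, HEX_TAIL + core + allele_2

# Toy sequence, for illustration only
toy_sequence = "ACGT" * 60
up, down = snp_flanks(toy_sequence, 120)
fam_primer, hex_primer = kasp_forward_primers(up, "A", "G")
```

The two primers differ only in their tail and in the 3′ terminal base, which is what allows the two alleles to be distinguished by fluorescence.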

#### **5. Conclusions**

In this study, we used F<sub>2</sub> and BC<sub>3</sub>F<sub>3</sub> populations to identify a major QTL, *qGL9.1*, controlling rice grain length from the long-grain variety Pin20 by re-sequencing and fine mapping. Furthermore, by combining functional annotation, variation detection, and qRT-PCR analysis, the gene *LOC\_Os09g26970*, which encodes a P450 protein, was identified as a candidate gene for *qGL9.1*. *LOC\_Os09g26970* was divided into nine haplotypes in 3010 rice germplasms, and Hap9, which was consistent with the genotype of Pin20, contributed the most to grain length among all of the haplotypes. In summary, we found a new grain length gene in early-maturing *japonica* rice in the northernmost part of China, and the molecular breeding application of this gene will hopefully help to address the difficulty of improving the yield of early-maturing *japonica* rice.

**Supplementary Materials:** The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijms241411447/s1.

**Author Contributions:** L.Y., P.L. and D.Z. conceived and designed the research. L.Y. and P.L. participated in data analysis. L.Y., P.L., J.W., H.L., H.Z. and W.X. performed the material development, sample preparation, and data analysis. L.Y. and P.L. wrote the manuscript. D.Z. corrected the manuscript. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was financially supported by the Heilongjiang Province Key R&D Program (2022ZX02B03), the Natural Science Foundation of Heilongjiang Province, China (LH2022C021), and the Postdoctoral Fund to Pursue Scientific Research of Heilongjiang Province, China (LBH-Q21097). All grants were awarded to L.Y.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author. The data are not publicly available because the resequencing data contain a large amount of unpublished genomic information.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


4. Choi, B.S.; Kim, Y.J.; Markkandan, K.; Koo, Y.J.; Song, J.T.; Seo, H.S. *GW2* Functions as an E3 Ubiquitin Ligase for Rice Expansin-Like 1. *Int. J. Mol. Sci.* **2018**, *19*, 1904. [CrossRef]


