1. Introduction
In India, rice is the major food crop for more than 1.25 billion people and is grown on 43.40 million hectares (mha), with a paddy production of 157.20 million tonnes (mt) [
1]. Twenty percent of the world’s rice production is contributed by India [
2], which forms a major segment of Asia’s global share, amounting to about 670 million tonnes [
3,
4]. Although first in rice acreage, India ranks second after China in production. China produces 211.1 mt from 30.44 mha with a productivity of 69.32 q/ha compared with India’s productivity of 23.9 q/ha [
5,
6].
The two subspecies of cultivated rice (Oryza sativa), indica and japonica, have become genetically isolated during evolution because of a strong cross-hybridization barrier. This has confined genetic diversity within each subspecies. Although predominant in the rice gene pool, indica cultivars are relatively less productive than japonica cultivars, while japonica genotypes are relatively more diverse than indica genotypes. Indian rice, which is composed mainly of indica types, is adapted to the tropical environment, while japonica rice includes two subtypes: temperate and tropical (upland). Currently common in China and other southeast Asian nations, japonica rice prefers a cooler climate. The tropical japonica (TRJ), also known as bulu rice, is an intermediate type between indica and japonica. Previously considered a third subspecies, javanica, this group of genotypes can hybridize with both indica and japonica by overcoming the crossability barrier. Therefore, tropical japonicas are considered the bridge for indica–japonica hybridization and a source for increasing genetic diversity in rice. Tropical japonica lines have fewer tillers, sturdy culms, and vigorous root architecture.
Being genetically less diverse, rice has a low average level of heterosis [
7]. Merging the genetic diversity of the two subspecies could therefore increase overall diversity and, in turn, heterosis, a purpose for which tropical
japonica would be useful. The genetic diversity in the tropical
japonica rice germplasm can be of great advantage for two reasons: (i) as one of the parents in the hybrid development of tropical
japonicas hybridizing both
indica and
japonica lines and (ii) as the bridge between true
indica and
japonica parents augmenting additional diversity during parental population improvement. Nevertheless, the use of tropical
japonica in hybrid rice breeding has been limited due to their lower productivity and other undesirable traits [
8]. Furthermore, information on the natural diversity within this group of genotypes is limited, and the group is poorly characterized at the molecular level [
9]. Genetic diversity can be assessed using morphological, biochemical, and molecular markers, which aids in selecting diverse parental lines for hybrid development. Morphological selection is reasonable for the exploitation of quantitative traits but is often marred by the strong influence of the environment. In this context, molecular-marker-based diversity is more reliable, as it is free from environmental influence. Thus, markers help in the precise characterization of genome diversity, identifying novel alleles and facilitating the development of new breeding lines with desirable traits. A wide range of marker systems is now available, among which simple sequence repeats (SSRs) are commonly used because of their abundance, genome-wide distribution, and high polymorphism [
10,
11,
12].
Genotypes accumulate several viable mutations during the evolutionary process, which form the basis of genetic diversity. Furthermore, forces such as recombination, random drift, and natural selection shape the genetic structure of populations. In the recent past, understanding population structure has attracted great interest, as it can help in the selection of diverse parents and in mapping marker–trait associations. As a tool, an analysis of population structure can estimate the similarity levels among individuals and subpopulations, as well as admixtures. When the samples are drawn from diverse geographical origins, an analysis of population structure indicates the pattern of geographical distribution among the populations. In practice, the most common approaches to stratifying population structure are model-based predictions and principal component analysis (PCA). The model-based predictions help in estimating the co-ancestry coefficients for defining population structure, whereas PCA reduces the multi-dimensional data into principal components that are orthogonal and uncorrelated, thereby providing opportunities to visualize the distribution pattern of the individuals [
13].
In this study, we characterized a set of tropical japonica lines collected worldwide, along with 23 indica, 4 aus, 3 Basmati, and 6 wild rice genotypes, using SSR markers and agro-morphological traits, and estimated their phenotypic and genotypic diversity and population structure. The quantification of genetic variability for agronomic traits focused particularly on yield and its components. We aim to use this information to identify potential parental sources for future hybrid rice development, particularly indica–japonica-based high-yielding hybrids.
4. Discussion
The information on the gene diversity in plant genetic resources provides an important basis for crop improvement [
25], essential to sustaining a high level of productivity [
26], and offers opportunities for breeders to produce superior varieties/hybrids with the best combination of economic traits. Indian rice germplasm possesses rich diversity among the
indica subgroup, which has provided a breakthrough in production and productivity that is confined to the subspecies. Similarly, limitations are also observed in the other major sub-group,
japonica. It has been demonstrated that assembling the genetic diversity of both sub-species through breeding could result in harnessing unexploited heterosis, which is naturally hindered due to the evolutionary cross incompatibility between
indica and
japonica [
27]. Although rice hybrids are a reality today, the heterosis levels realized are not convincing enough for outright adoption by farmers. One of the major reasons for low heterosis is the poor heterotic diversity among the parental lines within each of the subspecies.
Hybridization between
indica and
japonica has been the subject of rice research for a long time. Several breakthroughs have been reported, such as the identification of widely compatible genes, the development of new plant type (NPT)-based breeding lines [
28], and the use of tropical
japonica as the bridge between them [
29]. NPT lines, which are characterized by low tillering capacity and large panicles inherited from the tropical
japonica gene pool and which carry widely compatible genes, are capable of hybridization with
indica lines and are being used for the diversification of the parental genetic base in
indica types. Additionally, this opens avenues for the exploitation of heterosis through inter-varietal and inter-subspecific hybrids [
30]. Moreover, the utilization of NPT lines provides an opportunity to improve the poor grain filling and the low biomass of inter-subspecific hybrid rice. Notably, several improved rice varieties were developed at the IRRI using NPT lines derived from tropical
japonica germplasm [
31]. Therefore, tropical
japonica holds the key to improving rice lines, particularly those belonging to the
indica subspecies, via the development of NPT-based hybrids using information about genetic diversity. Furthermore, the diversification of the parental base is essential in hybrid rice development, as most of the existing cultivar diversity has been tested for combining ability. Combining ability coupled with genetic diversity could lead to the development of heterotic pools in rice that could lay a foundation for the next generation of hybrid rice breeding. Genomic tools, particularly molecular markers, have become some of the most robust means of evaluating genetic variability, genetic structure, and phylogenetic similarities in rice germplasm [
32].
In this study, we comprehensively characterized a global collection of tropical
japonica genotypes for genetic diversity using phenotypic and genotypic features. The geographic locations included 41 countries/territories spread across six continents. Together, 36 rice lines from India, including six wild rice and two
aus cultivars, were also included in the evaluation. Agronomically, the tropical
japonica genotypes were highly variable for traits such as tiller number, panicle length, and yield, three major characters used in NPT development. The apparent morphological variation present in grain yield indicates that they harbour variable gene combinations that can give differential yield expressions [
33]. Most importantly, the largest contributor to the total variation among the tropical
japonica lines was panicle length. This implies that NPT development with panicle length as a major yield contributor could be realized from the current population. This was followed by another important yield component: total grain number. A preponderance of high variability for these two traits further supports the direct use of the tropical
japonica lines in introgression breeding with
indica lines. Third, plant height showed a significant contribution to total variation, indicating the opportunity to select a suitable plant height while breeding for genotypes adaptable to various geographical locations. For instance, taller genotypes may be preferred if adaptation is sought for problem areas, such as those with water stagnation as a characteristic feature, while semi-dwarf types are suitable for irrigated ecosystems [
34]. Tiller number is one of the major traits considered in NPT development. Optimal tillering, with neither too few nor too many tillers, is desirable for ensuring adequate partitioning of metabolites into grains during the grain-formation stage. Excess or late tillers may divert key metabolites to vegetative growth instead of grain formation, which may leave panicles underdeveloped or reduce the number of spikelets [
35]. Therefore, NPT is an optimized combination of tiller number, panicle length, and grain number leading to high grain yield [
36]. Other traits such as spikelet fertility could also play a significant role in yield; hence, high spikelet fertility among NPT lines is the most desirable trait for recruitment into breeding. We could realize all of these features in the current panel, both through variability investigations on individual traits as well as through multivariate analyses using PCA. When grouped, the tropical
japonica lines were stratified into six clusters based on the total phenotypic variability, wherein 75.8% of the total variation was explained by PC1, PC2, and PC3. Furthermore, we observed that all of the desirable traits of NPT were available in the panel as major contributors of variability. Messmer et al. [
37] mentioned that, if the variation is higher than or equal to 25%, clustering can be performed to show the similarities among the genotypes. In our findings, traits such as spikelet fertility, yield, tiller number, and panicle length had adequate variability to contribute to the major principal components that could adequately scatter the genotypes in a two-dimensional plot.
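The proportion of total variation attributed to the first few principal components, as reported above, can be illustrated with a minimal sketch. It assumes a genotypes-by-traits matrix with traits standardized before decomposition; the trait values, the function name `pca_explained_variance`, and the use of NumPy are illustrative assumptions and do not reproduce the software or data used in this study.

```python
import numpy as np

def pca_explained_variance(trait_matrix):
    """Return (scores, explained_ratio) for a genotypes x traits matrix."""
    x = np.asarray(trait_matrix, dtype=float)
    # Standardize each trait so that differing units do not dominate.
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    # Singular value decomposition of the standardized matrix; squared
    # singular values are proportional to the variance of each component.
    u, s, _vt = np.linalg.svd(z, full_matrices=False)
    explained_ratio = s**2 / np.sum(s**2)
    scores = u * s  # genotype coordinates on the principal components
    return scores, explained_ratio

# Hypothetical values for 6 genotypes: plant height (cm), tiller number,
# panicle length (cm), grains per panicle. Not data from this study.
traits = [
    [110.0, 8.0, 26.5, 180.0],
    [95.0, 12.0, 22.0, 140.0],
    [130.0, 6.0, 30.1, 210.0],
    [102.0, 10.0, 24.3, 155.0],
    [121.0, 7.0, 28.7, 195.0],
    [88.0, 14.0, 21.2, 120.0],
]

scores, ratio = pca_explained_variance(traits)
cumulative = np.cumsum(ratio)  # cumulative share of variation per component
print("Explained variance ratio:", np.round(ratio, 3))
print("Cumulative through PC3:", round(float(cumulative[2]), 3))
```

The first two columns of `scores` give the coordinates that would be used to scatter the genotypes in a two-dimensional plot.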
One of the striking observations in the genotype grouping based on yield and related traits is the incongruity between geographical origin and cluster membership. Every cluster contained genotypes originating from countries on different continents. Although this random pattern could be coincidental, scientific evidence strongly supports the movement of rice from its centre of origin to different parts of the world [
38]. Evidence has confirmed that rice originated in China and spread throughout the world through the Indian route, with India subsequently becoming a secondary centre of origin, particularly accumulating
indica subspecies [
39]. Therefore, it is reasonable to infer that the tropical
japonica lines sourced from different parts of the world in this study moved out from southeast Asia. We have seen more evidence for this hypothesis from marker-based genetic diversity. Since the inception of IRRI in the 1960s, a large amount of rice germplasm has been transferred across continents, accelerating the genetic migration of rice. The Philippines, as the host nation of IRRI, has become pivotal to the rice genetic network across the world. Some good phenotypic characters observed among the tropical
japonica lines sourced from the Philippines and the United States underpin the role of genetic networks in the modern era, because the United States neither uses rice as its staple cereal nor has natural genetic diversity in rice.
The development of heterotic pools for hybrid rice breeding programmes requires precise evaluation of genome-wide genetic diversity. Earlier, IRRI published a set of the 50 most informative rice microsatellite markers that are robust enough to reveal rice genetic diversity; these are known as GCP markers, after the Generation Challenge Programme of the Consultative Group for International Agricultural Research (CGIAR) [
15]. Of the 50 markers, 46 were polymorphic (92%) in the germplasm assembly and produced as many as four alleles per locus, which indicated a high resolving power of the markers for genetic diversity, even within a section of tropical
japonica lines. A similar average number of alleles/markers was reported earlier by Anandan et al. [
40], who examined 629 rice genotypes, including tropical
japonica and
indica accessions using 39 genetic markers. However, an average of 2.42 alleles/locus was reported from a set of 100 iso-cytoplasmic restorer lines derived from 25 commercial hybrids using GCP markers (reference). This further suggests that a significant portion of the genetic variability within the germplasm assembly used in this study could be exposed by the GCP markers, indicating their suitability as the smallest and quickest set of explorative markers for genetic diversity investigations in rice. According to DeWoody et al. [
41], a marker with a polymorphism information content (PIC) value above 0.5 is considered highly informative. Six markers in this panel met this threshold, while the average PIC across markers was 0.36. Several researchers observed similar results using SSR markers in rice [
42,
43,
44], whereas much higher PIC values were also reported in some other studies [
45,
46,
47].
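For readers unfamiliar with PIC, the following minimal sketch shows how the value can be computed for a single locus from allele frequencies. It assumes the commonly used definition PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j²; the text above does not state which formula or software was used in this study, and some SSR studies report the simpler 1 - Σ p_i² instead. The allele sizes are hypothetical, and each inbred line is assumed to contribute one allele per locus.

```python
from itertools import combinations

def allele_frequencies(alleles):
    """Relative frequency of each distinct allele call at one locus."""
    counts = {}
    for allele in alleles:
        counts[allele] = counts.get(allele, 0) + 1
    total = len(alleles)
    return [count / total for count in counts.values()]

def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term

# Hypothetical SSR fragment sizes (bp) scored in 10 lines; not study data.
calls = [150, 150, 150, 154, 154, 158, 158, 158, 158, 162]
freqs = allele_frequencies(calls)
value = pic(freqs)
print(f"PIC = {value:.4f}; highly informative (> 0.5): {value > 0.5}")
```

Averaging `pic` over all loci gives a panel-level figure comparable in kind to the average PIC reported above.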
Different grouping patterns have emerged from phenotypic and genotypic data, with altered orientations and distributions of genotypes. This pattern is not unexpected because the genotypic data were derived from random SSR markers and the phenotypic data were based on a few yield and related traits. Unless there is a close association between marker and trait, it is unlikely that both data types can produce a recognizable resemblance. Moreover, quantitative traits under the influence of a changing environment can also introduce ambiguity in the distribution pattern within and between clusters, whereas such uncertainty does not exist with data obtained from genetic markers, as they are free from environmental influence and maintain high accuracy and repeatability in the prediction pattern. The genetic architecture of modern-day rice genotypes is shaped by processes related to domestication, geography, and breeding.
The new plant type (NPT) is a concept developed in rice breeding to improve rice yield by 20–25%. The NPT integrates phenotypic traits such as short, thick, and sturdy stems; dark green leaves; few tillers with long panicles; and high grain numbers. A subsequent survey of rice germplasm for donors of these traits identified several tropical
japonica (
bulu) lines from Indonesia. Although the first-generation crosses with the NPT donors resulted in high sterility, the subsequent development of second-generation hybrids could improve fertility to a great extent. Ever since Oka [
48] described the affinity of genotypes belonging to
indica and
japonica towards articulating hybrid fertility,
indica–japonica hybridization has been a major breeding interest in rice. Although earlier attempts to make hybrids had met limited success, the NPT lines in the tropical
japonica background renewed the interest in the development of new hybrids [
49]. In addition, the discovery of a wide compatibility (WC) system [
50] and the consequent identification of WC genes have provided ways to improve fertility in hybrids [
51], particularly in super rice hybrids in China. Therefore, tropical
japonica germplasm offers great promise in parental line development, diversification, and hybrid breeding in rice. One of the major requirements for hybrid development is the documentation of genetic variability and population structure. The advent of molecular marker systems in the late 1980s triggered research on deciphering the diversity pattern in rice. Glaszmann [
52] used isozyme variability to pattern the genetic relationship among a large collection of 1688 traditional Asian rice lines and divided them into six varietal groups, two major, two minor, and two satellite groups. Subsequently, several studies on genetic diversity and population structure in rice have been published using DNA-based molecular markers [
53,
54,
55,
56,
57] that used germplasm from different regions. Garris et al. [
12] grouped a large number of accessions into five major clusters, namely
aus, temperate
japonica,
indica, aromatic, and tropical
japonica. Likewise, this study assessed diversity at the molecular level among tropical
japonica lines collected worldwide, together with a few
indica,
aus, and wild accessions from India. We could classify the germplasm panel used into four distinct subpopulations, which included two major groups, separating
indica and tropical
japonica initially, with further sub-structuring of the tropical
japonica set into three subgroups. Taken together, sub-populations derived from tropical
japonica are designated as POP1, POP2, and POP3, while the population that included
indica accessions is POP4. Furthermore, the population structure identified in this study showed concordance with the distance-based clustering from the principal coordinate analysis (PCoA). PCoA is a method extensively used to evaluate genetic diversity based on quantitative and qualitative traits; it projects the distance data onto multidimensional axes to characterize diversity. However, the grouping based on population structure seems to be more accurate, as it could precisely differentiate the
indica and
japonica types. The only exception was Rahmann Bhatti P, a landrace from Kashmir, which seems to be a perfect admixture between
indica and
japonica. Another notable feature was the placement of wild rice among POP4, indicating that cross-breeding occurred in the wild accessions naturalized under Indian conditions. Citing similar reasons, Sun et al. [
58] have reported that most wild rice populations in South Asia have closer similarity to the
indica subspecies.
We also made an interesting observation on the relationship between the country of sourcing of the genotypes and their distribution among the subpopulations. Pairwise FST values showed a lower level of differentiation between genotypes from Asia and those from the rest of the world, indicating that all of the genotypes sourced outside the centre of origin could be migrants that moved out at different times. Nevertheless, significant differentiation could also be noticed among the tropical japonica accessions sourced from some countries of Africa, America, Europe, and the Pacific region, suggesting independent evolution of populations within geographical confines after migration from the centre of origin. Moreover, the distribution of source countries among the subpopulations indicated no particular pattern within the tropical japonica. POP1, the largest subpopulation, accumulated genotypes drawn exclusively from 16 countries, further suggesting a common origin of the genotypes. In addition, the inclusive distribution of source countries between subpopulations was common within tropical japonica, while the indica types sourced from India were confined to only one subpopulation, POP4. This pattern could be due to the major representation of tropical japonica genotypes in the germplasm assembly, wherein the indica types constituted merely 16% of the population.
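As an illustration of what a pairwise FST value measures, the sketch below computes a simple heterozygosity-based estimate, FST = (HT - HS) / HT, summed over loci for two populations weighted equally. This is an assumed, simplified estimator chosen for clarity; the text does not state which estimator or software produced the FST values in this study, and the allele frequencies shown are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity at one locus: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs.values())

def pairwise_fst(pop_a, pop_b):
    """Heterozygosity-based FST between two populations.

    pop_a and pop_b are lists with one dict per locus, mapping each
    allele to its frequency in that population (same locus order).
    """
    hs_sum = 0.0  # mean within-population heterozygosity, summed over loci
    ht_sum = 0.0  # heterozygosity of the pooled population, summed over loci
    for freq_a, freq_b in zip(pop_a, pop_b):
        alleles = set(freq_a) | set(freq_b)
        pooled = {
            al: (freq_a.get(al, 0.0) + freq_b.get(al, 0.0)) / 2
            for al in alleles
        }
        hs_sum += (
            expected_heterozygosity(freq_a) + expected_heterozygosity(freq_b)
        ) / 2
        ht_sum += expected_heterozygosity(pooled)
    if ht_sum == 0.0:
        return 0.0  # no variation at any locus, so no differentiation
    return (ht_sum - hs_sum) / ht_sum

# Hypothetical allele frequencies at two SSR loci; not data from this study.
asia = [{"150": 0.8, "154": 0.2}, {"200": 0.5, "204": 0.5}]
rest = [{"150": 0.7, "154": 0.3}, {"200": 0.6, "204": 0.4}]
print(f"Pairwise FST = {pairwise_fst(asia, rest):.4f}")
```

Values near 0 indicate that the two groups share similar allele frequencies, which is the pattern described above for Asian versus non-Asian accessions, while values approaching 1 indicate nearly fixed differences.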
To conclude, the current study revealed the population structure of a set of tropical
japonica accessions sourced worldwide, which indicated dispersion from a common region in Asia. Most of these tropical
japonica lines possessed NPT characteristics in various combinations, indicating their potential usefulness in pre-breeding to develop superior parental lines for hybrid development as well as in recombination breeding with
indica types. In particular, traits such as long panicles and high grain numbers found among several lines could be useful in breeding for higher yield. The crossability of some of these lines with an
indica genotype, Pusa 44, has already been assessed and found to be high [
7,
9] and the presence of fertility-restoring genes
Rf3 and
Rf4 for the wild-abortive (WA) cytoplasm has been reported. It would be of interest to assess the level of WC properties in these lines. The consolidation of genetic characteristics hitherto available on these tropical
japonica lines sheds light on their potential use in rice improvement, especially in parental line development for hybrid breeding as well as for recombination breeding.