1. Introduction
Sorghum [Sorghum bicolor (L.) Moench] is a major cereal grain, ranking fifth globally in production and cultivated area [1]. It uses less water and withstands climate stress better than other cereals, and its heat and drought tolerance make it a feasible option for farmers facing climate change [2]. Due to its high nutritional content, drought tolerance, minimal input requirements, and remarkable environmental adaptability, sorghum is a crucial crop for food security [3,4,5]. It is grown in over 100 countries, particularly in hot, arid regions [6]. The largest sorghum producers are the U.S., Nigeria, Sudan, Mexico, Ethiopia, and India [7]. In the U.S., the ‘Sorghum Belt’, which includes Kansas, Texas, Colorado, Oklahoma, and South Dakota, is a major sorghum-producing region. These states provide both rainfed and dry conditions on ultisol and mollisol soil types [8,9].
Sorghum contains important nutrients and phytochemicals, including protein, fiber, essential minerals, fatty acids (linoleic, oleic, palmitic, linolenic, and stearic), B vitamins, and fat-soluble vitamins (A, D, E, and K) [10]. It also contains valuable secondary metabolites (phenolic acids, flavonoids, sterols, policosanols) and antioxidants [11,12]. The grain is rich in resistant and slowly digestible starches, which help manage blood sugar levels by producing smaller post-meal spikes than other major cereal grains [13]. Sorghum’s diverse bioactive polyphenols can lower the risk of nutrition-linked chronic diseases. Additionally, its high-molecular-weight tannins are known to alter the functionality of proteins and starch, offering the potential for developing novel bioactive ingredients and enhancing food quality [14]. Together, these properties make sorghum a rare crop that is both resilient to climate change and able to play a crucial role in ensuring nutritional security.
Sorghum is a multipurpose crop used for biofuel and ethanol production, forage, and fodder preservation. In particular, sweet sorghum is gaining attention as a biofuel crop due to its high sugar content, ease of extractability, and low input requirements as a C4 crop [15]. Sorghum that is not consumed by humans is mainly used for animal feed [16]. Its ideal mineral and fatty acid balance and its suitability as a protein source have recently increased its popularity in aquafeed production [17].
The International Crops Research Institute for the Semi-Arid Tropics (ICRISAT) GeneBank holds almost 37,000 sorghum accessions, 2247 of which were selected to form a smaller group of germplasm known as the core collection. However, even this core collection was too large to evaluate routinely. The core collection was therefore evaluated for 11 qualitative and 10 quantitative traits, and hierarchical clustering yielded 21 clusters. From each cluster, about 10% of the accessions, or at least one accession, were selected to create a mini-core of 242 accessions [18]. The sorghum mini-core contains about 10% of the core’s accessions, or about 1% of the entire collection, and is comparable to the core collection in geographical origin, biological races, qualitative features, means, variances, phenotypic diversity indices, and phenotypic correlations. As a result, it is widely used in current genomic studies to evaluate various agronomic traits and resistance to biotic and abiotic stresses [18,19,20].
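The sampling rule described above (about 10% of each cluster, with at least one accession per cluster) can be sketched in a few lines of Python. This is a minimal illustration and not the procedure used in [18]: the function name `sample_mini_core`, the random choice within each cluster, and the toy cluster labels are all assumptions made for the example.

```python
import random
from collections import defaultdict


def sample_mini_core(cluster_of, fraction=0.10, seed=0):
    """Pick about `fraction` of each cluster, but never fewer than one accession.

    `cluster_of` maps an accession name to its cluster label (hypothetical input).
    Random selection within a cluster is an assumption of this sketch.
    """
    rng = random.Random(seed)

    # Group accession names by cluster label.
    by_cluster = defaultdict(list)
    for accession, cluster in cluster_of.items():
        by_cluster[cluster].append(accession)

    selected = []
    for cluster in sorted(by_cluster):
        members = sorted(by_cluster[cluster])
        # About 10% of the cluster, with a floor of one accession.
        k = max(1, round(fraction * len(members)))
        selected.extend(rng.sample(members, k))
    return selected


# Toy data: three clusters with 30, 5, and 1 accessions (invented names).
toy = {f"A{i}": "c1" for i in range(30)}
toy.update({f"B{i}": "c2" for i in range(5)})
toy["C0"] = "c3"

picked = sample_mini_core(toy)
print(len(picked), picked)
```

With the toy clusters of 30, 5, and 1 accessions, the rule selects 3, 1, and 1 accessions respectively, so small clusters stay represented even though 10% of their size rounds to zero.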
Senegalese sorghum germplasm lines are particularly well known for resistance to biotic stresses such as fungal diseases [21]. Extensive genome-wide association studies (GWAS) have dissected resistance to various fungal pathogens in this germplasm [21,22,23]. However, other agronomically important traits, such as seed morphology, have received limited attention.
Morphological variation in seeds includes variation in seed size and shape. Seed morphology is a crucial agricultural characteristic because it reflects a combination of genetic, physiological, and environmental factors, all of which significantly impact crop yield, quality, and market value [24]. Apart from market value, seed morphology has proved useful for determining taxonomic relationships in plant families. As a result, both seed shape and size are relevant parameters for assessing plant biodiversity [24]. In addition, investigating the biodiversity of seeds can help characterize intra- and inter-species variation, genotypic discrimination, and correlation, all of which are important for breeding to achieve the target levels of seed yield and quality [24,25].
Wang et al. [26] evaluated the sorghum mini-core panel in multiple locations with 6,094,317 single nucleotide polymorphism (SNP) markers and identified one locus for recurving peduncles and eight loci for panicle length, width, and compactness. Sakamoto et al. [27] used multi-trait GWAS to analyze 329 sorghum germplasm accessions of different origins and found SNPs that may be related to seed morphology, such as SNP loci S01_50413644, S04_59021202, and S05_9112888. A GWAS of the 300 diverse accessions of the sorghum association panel (SAP) with 265,487 SNPs identified 30 SNPs strongly associated with traits measured at the seedling stage under cold stress and 12 SNPs significantly associated with seedling traits under heat stress [28].
Our previous work evaluated 162 Senegalese germplasm accessions for eight seed morphology traits (seed area size, length, width, length-to-width ratio (LWR), perimeter, circularity, the distance between the intersection of length and width (IS) and the center of gravity (CG), and seed darkness and brightness) and identified candidate genes potentially associated with these traits using 193,727 publicly available SNPs [29]. Building upon that work, this study investigated seed morphology in genetically diverse sorghum accessions, comprising a subset of the mini-core collection (115 lines, including IS19975, which originated from Senegal) and germplasm from Senegal (130 lines, excluding IS19975). Eight key quantitative traits related to seed size, shape, and color were evaluated in over 24,000 seeds. Accessions were selected primarily for the public availability of their SNP data, which allowed GWAS to map genetic determinants of the observed phenotypic variation. To explore potential associations between seed morphology and resistance to major sorghum diseases, statistical analyses were applied to anthracnose, head smut, and downy mildew data for the mini-core lines. Lastly, GWAS was conducted with the Genome Association and Prediction Integrated Tool (GAPIT) R package, using the seed phenotypic data and over 290,000 publicly available SNPs. This analysis identified SNPs linked to various seed morphology traits in the reference sorghum genome.
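Two of the shape traits listed above are derived from the primary measurements. The text does not give their formulas, so the sketch below assumes the conventions common in image analysis: the length-to-width ratio is length divided by width, and circularity is 4πA/P², which equals 1 for a perfect circle and decreases for elongated or irregular outlines. The function name `shape_descriptors` and the example measurements are hypothetical.

```python
import math


def shape_descriptors(area, perimeter, length, width):
    """Derive two shape traits from primary seed measurements.

    Assumed conventions (not stated in the text):
      LWR         = length / width
      circularity = 4 * pi * area / perimeter**2
    All inputs must be positive and in consistent units.
    """
    if min(area, perimeter, length, width) <= 0:
        raise ValueError("all measurements must be positive")
    return {
        "lwr": length / width,
        "circularity": 4 * math.pi * area / perimeter ** 2,
    }


# Reference shape: a circle of radius 1 has circularity 1 and LWR 1.
circle = shape_descriptors(area=math.pi, perimeter=2 * math.pi, length=2, width=2)

# A 2 x 2 square is less circular: 4*pi*4 / 8**2 = pi/4.
square = shape_descriptors(area=4, perimeter=8, length=2, width=2)

print(circle, square)
```

Because circularity combines area and perimeter, it is expected to correlate with both, which is one reason the traits are analyzed jointly rather than in isolation.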
4. Discussion
Seed morphology significantly impacts various biological and ecological processes, such as seed dormancy, germination, dispersal, persistence, evolution, and adaptation [37]. Despite sorghum’s versatility, high stress tolerance, and diverse applications as grain, forage, and biomass [38], its seed morphology remains relatively unexplored. Correlation analysis of the mini-core and Senegalese accessions identified significant correlations among the traits, matching the patterns observed in our previous study of Senegalese germplasm [29]. The PCA plots and partial contribution analyses were also highly similar between the two studies, reinforcing these findings [29]. The consistency in correlation patterns could be attributed to the overlap of some Senegalese accessions between the two studies. However, analyzing only the mini-core accessions in this study yielded nearly identical results, suggesting that these findings generalize more broadly (data available in Supplementary Data S1).
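The pairwise trait comparison discussed above can be illustrated with a plain Pearson correlation. The text does not state which coefficient was used, so Pearson's r is an assumption here, and the trait values below are invented purely to show the calculation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-products around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Invented per-accession trait means (not data from the study).
seed_length = [3.9, 4.2, 4.4, 4.8, 5.1]
seed_area = [9.8, 11.0, 11.9, 13.6, 15.2]
circularity = [0.91, 0.89, 0.88, 0.85, 0.83]

print(pearson_r(seed_length, seed_area))    # positive in this toy data
print(pearson_r(seed_length, circularity))  # negative in this toy data
```

Running the same calculation on the full panel and again on the mini-core subset alone is the kind of check that separates a pattern driven by overlapping accessions from one that holds more generally.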
Furthermore, recent studies have identified potential links between sorghum seed morphology traits and host resistance to fungal pathogens. One study reported significant negative correlations between grain mold severity and seed weight in sorghum [39]. Similarly, Ahn et al. [29] identified correlations between seed morphology traits (circularity and the distance between IS and CG) and the formation of spots on seedling leaves. These spots appeared when seedlings were inoculated with Sporisorium reilianum, the causal agent of head smut, and submerged under water [40]. Though spotted plants are considered susceptible, the cause of the spots is unclear: they might be a direct result of fungal infection or, alternatively, a defense response triggered by the seedlings. Regardless of their origin, the association between spot appearance rate and seed morphology traits is notable. In this study, no statistically significant links between seed morphology and anthracnose or downy mildew susceptibility were found except for the distance between IS and CG, whereas five of the eight tested traits were associated with head smut susceptibility. The head smut data used in this study came from syringe needle inoculation (hypodermic injection), with resistance or susceptibility confirmed by the presence or absence of infected heads in mature plants [19]. The observed correlations between seed morphology and head smut resistance might be rooted in the distinct infection process of S. reilianum. Unlike anthracnose caused by Colletotrichum sublineola, which involves direct contact infection by conidia, head smut relies on systemic fungal growth originating from soilborne spores that infect plants during seed germination and seedling emergence [41]. This suggests that certain seed morphological traits may influence plant structures or defenses that affect internal fungal spread, but the precise mechanism remains unknown.
The GWAS revealed over 100 candidate genes linked to seed morphology traits (Table S2). Intriguingly, several genes with similar functions appeared as top candidates for multiple traits, suggesting shared genetic influences, consistent with the correlation analysis. For example, UDP-glycosyltransferases ranked among the top hits for area size, circularity, and distance between IS and CG, indicating their potential impact on seed size and shape. In rice, a UDP-glucosyltransferase regulates grain size and abiotic stress tolerance, and this regulation is associated with metabolic flux redirection [42]. Genes associated with zinc finger motifs emerged as candidates for length and LWR, indicating their potential influence on grain size and shape. This is further supported by the C2H2 zinc-finger protein LACKING RUDIMENTARY GLUME 1 (LRG1) in rice, which directly regulates spikelet formation and consequently impacts grain size and yield [43]. Likewise, the association of F-box genes with LWR and brightness agrees with findings in rice, where the F-box protein FBX206 and OVATE family proteins form a regulatory network in the brassinosteroid signaling pathway to control plant architecture, grain size, and grain yield [44]. Furthermore, the leucine-rich repeat protein genes linked to length and brightness and the cytochrome P450 superfamily genes associated with area size and circularity are consistent with their known roles in plant development, stress responses, and metabolism [29,45,46,47,48]. Notably, GW10, a P450 subfamily member, regulates grain size and number in rice [49]. NDR1/HIN1-like proteins were associated with seed shape. NDR1/HIN1-like genes are known to be involved in pathogen-induced plant responses to biotic stress and may also play roles in plant development [50]. This dual function in plant defense and development among candidate genes could explain why seed morphology is associated with fungal defense. A major function of the plant cell wall is to act as a defense against both biotic and abiotic stressors [51]. A GWAS combined with transcriptome data in maize revealed that the cell wall protein IFF6-like was an important candidate gene for kernel size and development [52]. The gene encoding this protein was a candidate for seed brightness in this study, suggesting that a single gene can be associated with seed size, shape, color, and even defense response. These examples, alongside the entire candidate gene list in Table S2, offer valuable resources for future research and potential candidates for breeding programs aiming to improve sorghum seed morphology and grain yield. Multiple genes identified as top candidates in our earlier work [29] resurfaced as key genes in this study: homeobox-leucine zipper, glycosyltransferase, zinc finger, and cytochrome P450 genes were consistently identified across both studies. This repeated association strongly suggests their genuine involvement in shaping seed morphology traits. These genes warrant particular attention in functional validation studies to explore their roles in determining seed morphology.