Article

Identification of Multiple Genetic Loci and Candidate Genes Determining Seed Size and Weight in Soybean

1
Key Laboratory of Molecular Epigenetics of the Ministry of Education (MOE), Northeast Normal University, Changchun 130024, China
2
Jilin Academy of Agricultural Sciences, Changchun 130033, China
3
Department of Agronomy, Jilin Agricultural University, Changchun 130118, China
*
Author to whom correspondence should be addressed.
These authors contributed equally to this study.
Agronomy 2024, 14(9), 1957; https://doi.org/10.3390/agronomy14091957
Submission received: 15 July 2024 / Revised: 26 August 2024 / Accepted: 27 August 2024 / Published: 29 August 2024

Abstract

Soybean is a primary source of plant-based oil and protein for human diets. Seed size and weight are important agronomic traits that significantly influence soybean yield. Despite their importance, the genetic mechanisms underlying soybean seed size and weight remain to be fully elucidated. To identify additional major quantitative trait loci (QTLs) associated with seed size and weight, we developed segregating populations by crossing a large-seeded soybean variety “Kebaliang” with a small-seeded soybean variety “SUZUMARU”. We evaluated seed length, width, thickness, and hundred-seed weight across two generations, F4 and F5, in 2022 and 2023. Employing bulked segregant analysis with whole-genome resequencing (BSA-seq), we detected 18 QTLs in the F4 population and 12 QTLs in the F5 population. Notably, six QTLs showed high stability between the two generations, with five derived from two pleiotropic loci (qSS4-1 and qSS20-1) and one specific to seed width (qSW14-1). Further validation and refinement of these loci were carried out through linkage mapping using molecular markers in the F5 population. Additionally, we identified 18 candidate genes within these stable loci and analyzed their sequence variations and expression profiles. Together, our findings offer a foundational reference for further soybean seed size research and unveil novel genetic loci and candidate genes that could be harnessed for the genetic enhancement of soybean production.

1. Introduction

Soybean (Glycine max (L.) Merr.), which originated from China [1], was domesticated from annual wild soybean (Glycine soja (L.) Sieb. & Zucc.) ca. 6000 to 9000 years ago [2]. Soybean serves as a primary source of plant protein and oil, providing approximately a quarter of plant protein and more than half of oilseeds for both global human and livestock consumption [3]. Given the unrelenting expansion of the global population coupled with enhancements in living standards, the demand for soybeans is experiencing a consistent upwards trajectory [4]. This growing demand underlines the imperative need to enhance soybean yields, thereby drawing attention to traits associated with soybean productivity.
In soybean, seed size components, including seed length (SL), seed width (SW), and seed thickness (ST), collectively influence seed weight/hundred-seed weight (HSW), shape, quality, and yield [5,6]. Seed size and seed weight are complex traits that are crucial for yield. Recent studies have significantly advanced our understanding of the genetic underpinnings of soybean seed size and weight [7]. To date, 396 quantitative trait loci (QTLs) associated with seed size and weight have been documented in the SoyBase database (http://www.soybase.org, accessed on 25 June 2024), including 28 QTLs for SL, 29 QTLs for SW, 23 for ST, and 316 for seed weight [7,8]. The majority of these QTLs were identified through linkage mapping using various bi-parental populations, such as F2, recombinant inbred lines (RILs), chromosome segment substitution lines (CSSLs), and near-isogenic lines (NILs). However, due to limitations in population size and available markers, many of these QTLs were mapped to large chromosomal regions [7]. In addition to the QTLs documented in SoyBase, numerous others have been reported in recent studies [8,9,10]. Notably, several QTLs were identified using mapping populations derived from crosses between wild accessions and cultivated soybean cultivars [11,12]. With the advent of next-generation sequencing (NGS) technology, genome-wide association studies (GWASs) have been employed to explore the genetic architecture of seed size and weight. Over the past decade, more than 100 GWAS-identified QTLs for seed size and seed weight have been reported, with the majority pertaining to seed weight [13,14,15,16,17,18,19,20,21]. These studies have significantly expanded our understanding of the genetic basis of these traits.
Although early QTLs were less effective in pinpointing functional genes due to their low resolution [7], recent efforts have successfully cloned several genes involved in regulating seed size or weight through linkage analysis and/or GWASs [22,23,24]. For example, the wild soybean allele of the Phosphatase 2C (PP2C-1) gene, which contributes to increased seed size and weight, was identified using high-density linkage analysis [25]. The ST1 (seed thickness 1) gene, encoding a UDP-D-glucuronate 4-epimerase that affects seed thickness and oil content, was identified through GWASs and positional cloning [22]. Additionally, the CCT domain gene POWR1, which has pleiotropic effects on seed oil content, weight, and yield, was also identified via GWASs [24]. Another example, GmST05 (seed thickness 05), predominantly controlling seed thickness and size, was identified through a genome-wide association study of over 1800 soybean accessions [23]. Several genes associated with the regulation of soybean seed size and weight have been identified through co-localization of RNA sequencing with quantitative trait loci (QTLs), as well as through RNA sequencing analysis alone. Notable examples include GA20OX, GmCYP78A5, and WRKY15a [26,27,28]. To date, approximately two dozen genes have been reported to affect seed size or weight by regulating lipid accumulation, cell expansion, and cell proliferation [7,8,29]. However, additional genetic loci related to these traits in soybean likely exist and await discovery.
Bulked segregant analysis (BSA) was initially developed for the rapid identification of genetic markers linked to traits of interest [30,31]. The integration of BSA with whole-genome resequencing (BSA-seq) facilitates the estimation of genome-wide allele frequencies in bulks without the need for prior marker development [32,33,34,35]. BSA-seq has been shown to detect QTLs with both major and minor effects reliably and efficiently [32,36] and has gained popularity as a method for mapping QTLs in crop species [34,37,38,39,40,41]. Despite its advantages, BSA-seq has several limitations compared to traditional QTL mapping methods, such as the inability to estimate allelic effect sizes or interactions. To overcome these shortcomings, recent studies have combined BSA-seq with linkage mapping to enhance QTL identification [38,41].
In this study, two soybean varieties with contrasting seed sizes, Kebaliang (KBL), a local landrace with large seeds, and Suzumaru (SUZU), a Japanese cultivar with smaller seeds, were chosen to develop a mapping population. We conducted QTL mapping of soybean seed size and weight using BSA-seq on F4 and F5 populations. Furthermore, we substantially refined two stable pleiotropic QTLs on chromosomes 4 and 20, and an SW QTL on chromosome 14. By integrating gene expression, sequence variation, and functional annotation of genes within the stable QTL regions, we predicted several candidate genes for seed size and weight.

2. Materials and Methods

2.1. Plant Materials and Phenotyping

The large-seeded soybean variety Kebaliang (KBL) was selected as the maternal progenitor for hybridization with the small-seeded soybean variety Suzumaru (SUZU) for the development of the F4 and F5 mapping populations. KBL is a local landrace, while SUZU is a cultivar from Japan. The distinct backgrounds of these two parental varieties facilitate the identification of novel genetic loci associated with seed traits. The parental varieties were obtained from the germplasm database of the Jilin Academy of Agricultural Sciences (JAAS). In 2022, the parental varieties and 373 F4 individuals were planted in the experimental field of Jilin Agricultural University, Changchun, China. Due to the restrictions of a local epidemic, leaf samples were collected from only 187 F4 individuals, although seeds were harvested from all F4 individuals. In 2023, the parental varieties and 287 single-seed descent F5 individuals were cultivated under field conditions at the Fanjiatun Experimental Station of the JAAS. Leaf samples were collected from 218 F5 individuals. For each plant, 100 seeds were randomly selected for weighing, and the process was repeated three times. The HSW was determined as the mean of the three replications. For individuals with fewer than 100 seeds, we measured the weight of 50 seeds and doubled it to calculate the HSW. The SL, SW, and ST were measured by averaging 10 seeds for each individual.
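The HSW calculation described above can be sketched in a few lines. This is an illustrative sketch only, not the authors' script; the function name `hundred_seed_weight` and its arguments are hypothetical.

```python
from statistics import mean


def hundred_seed_weight(sample_weights_g, seeds_per_sample=100):
    """Return the hundred-seed weight (g) from replicate sample weights.

    sample_weights_g: weights of the replicate samples, in grams.
    seeds_per_sample: 100 for the standard protocol, or 50 for plants with
    fewer than 100 seeds (the 50-seed weight is then doubled).
    """
    if seeds_per_sample not in (50, 100):
        raise ValueError("only 50- or 100-seed samples are described in the text")
    scale = 100 / seeds_per_sample  # 1.0 for 100 seeds, 2.0 for 50 seeds
    return mean(sample_weights_g) * scale


# Hypothetical weights, for illustration only.
print(hundred_seed_weight([20.0, 21.0, 22.0]))
print(hundred_seed_weight([9.5], seeds_per_sample=50))
```

The example weights are invented for illustration and are not measurements from this study.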

2.2. DNA Extraction and Sequencing

Genomic DNA from the two parental varieties and F4/F5 plants was isolated from young leaves using the PlantZol Kit (EE141-01) of TransGen Biotech Co. (Beijing, China) according to the manufacturer’s instructions (https://www.transgen.com/, accessed on 25 June 2024). The genomic DNA of the two parents was sequenced on a NovaSeq 6000 platform with a 150 bp paired-end strategy for ~10× coverage (10 Gb of clean data). For each trait, 20 individual progenies with the highest and lowest extreme phenotypes were selected (Table S1), and two DNA pools were constructed by mixing equal amounts of DNA from each individual progeny. The two DNA pools were sequenced for ~30× coverage (30 Gb of clean data).
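As an illustration of how the extreme-phenotype bulks could be assembled, the sketch below ranks individuals by a trait value and returns the lowest and highest tails. It assumes that 20 individuals enter each bulk, which is one reading of the description above; the function name `select_bulks` and the data layout are hypothetical.

```python
def select_bulks(phenotypes, n=20):
    """Split individuals into low and high extreme bulks for one trait.

    phenotypes: dict mapping plant ID -> trait value.
    n: number of individuals per bulk (assumed to be 20 per tail here).
    Returns (low_bulk_ids, high_bulk_ids).
    """
    if len(phenotypes) < 2 * n:
        raise ValueError("not enough individuals for two non-overlapping bulks")
    ranked = sorted(phenotypes, key=phenotypes.get)  # ascending trait value
    return ranked[:n], ranked[-n:]


# Hypothetical trait values, for illustration only.
demo = {"p1": 12.1, "p2": 18.4, "p3": 15.0, "p4": 21.3, "p5": 13.7, "p6": 19.9}
low, high = select_bulks(demo, n=2)
print(low, high)
```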

2.3. Analysis of Genomic Variants between the Two Parents

The raw DNA sequencing data were filtered and trimmed using Trimmomatic (version 0.39) with the parameter “LEADING:5 TRAILING:5 HEADCROP:10 MINLEN:75” [42]. Clean reads of the two parental varieties were mapped to the Williams 82 reference genome (Phytozome V13 Glycine max Wm82.a4.v1) using BWA mem with default settings [43]. Variants were jointly called and genotyped using GATK (version 4.1.3.0) and BCFtools (version 1.15.1). The SNPs detected by both pipelines were retained and further filtered using VCFtools (version 0.1.16) with the settings “--minGQ 20 --minDP 4”. Then, only biallelic homozygous SNPs between the two parents were extracted using BCFtools (version 1.15.1) for further BSA-seq analysis. The biallelic SNPs and InDels were annotated using snpEff (version 5.0c) [44]. Large InDels were called using Manta (version 1.6.0) with default parameters [45]. The 50–200 bp InDels between the two parents were extracted for InDel marker design.
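The final filtering step, retaining only biallelic SNPs at which both parents are homozygous for different alleles, can be expressed as a simple predicate. The sketch below is a simplified stand-in for the BCFtools-based extraction, not a reproduction of it; the function name and the unphased genotype strings ("0/0", "1/1") are assumptions made for illustration.

```python
def is_informative_snp(ref, alts, gt_kbl, gt_suzu):
    """Return True for a biallelic SNP that is homozygous and different
    between the two parents.

    ref: reference allele string.
    alts: list of alternative allele strings at the site.
    gt_kbl, gt_suzu: unphased genotype strings such as "0/0", "0/1", "1/1".
    """
    # Biallelic SNP: exactly one ALT allele, and both alleles are single bases.
    if len(alts) != 1 or len(ref) != 1 or len(alts[0]) != 1:
        return False
    homozygous = {"0/0", "1/1"}
    return gt_kbl in homozygous and gt_suzu in homozygous and gt_kbl != gt_suzu


print(is_informative_snp("A", ["G"], "0/0", "1/1"))       # parents differ
print(is_informative_snp("A", ["G"], "0/1", "1/1"))       # heterozygous parent
print(is_informative_snp("A", ["G", "T"], "0/0", "1/1"))  # multiallelic site
```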

2.4. BSA-Seq Data Analysis

The raw data of each pool were mapped with the same settings as for the parental data. The allelic read counts were calculated using CollectAllelicCounts from GATK (version 4.1.3.0) for each parental SNP site. The delta–SNP index and G’ were calculated using QTLseqr (version 0.7.5.2) [46] with a window of 4 Mb. For clear results, we designed high/low-bulk and reference/alternative settings for each trait (Table S2). Two methods, QTL-seq [34] and G’, were employed for the preliminary identification of QTLs. G’ q-value < 0.05, SNP index confidence interval > 0.95, and delta–SNP index > 0 were used as the threshold for identifying significant SNPs and regions. The overlapping regions were merged as one locus.
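To make the two statistics concrete, the sketch below computes the SNP index, the delta SNP index, and the per-SNP G statistic from allelic read counts of the two bulks. It is a minimal illustration written from the standard definitions rather than a call into QTLseqr; the function names are hypothetical, and the window-based smoothing that turns G into G’ is deliberately omitted.

```python
import math


def snp_index(ref_count, alt_count):
    """Fraction of reads carrying the alternative allele."""
    depth = ref_count + alt_count
    if depth == 0:
        raise ValueError("no reads at this site")
    return alt_count / depth


def delta_snp_index(high, low):
    """SNP index of the high bulk minus that of the low bulk.

    high, low: (ref_count, alt_count) tuples for each bulk.
    """
    return snp_index(*high) - snp_index(*low)


def g_statistic(high, low):
    """Per-SNP G statistic for the 2 x 2 table of bulk by allele counts."""
    table = [list(high), list(low)]
    total = sum(sum(row) for row in table)
    row_sums = [sum(row) for row in table]
    col_sums = [table[0][j] + table[1][j] for j in range(2)]
    g = 0.0
    for i in range(2):
        for j in range(2):
            observed = table[i][j]
            if observed > 0:  # a zero cell contributes nothing to the sum
                expected = row_sums[i] * col_sums[j] / total
                g += observed * math.log(observed / expected)
    return 2 * g


# Hypothetical read counts, for illustration only.
print(delta_snp_index((10, 30), (30, 10)))
print(g_statistic((10, 30), (30, 10)))
```

Identical allele frequencies in the two bulks give G = 0, and G grows as the bulks diverge, which is why peaks of the smoothed statistic mark candidate regions.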

2.5. Naming Principle for QTLs

The naming convention for QTLs follows the format of the lowercase letter “q” + population identifier + uppercase initial of the trait + chromosome number + sequential number. For example, the first QTL determining seed thickness on chromosome 10 in the F4 population was qF4:ST10-1. When organizing pleiotropic QTLs based on their physical positions, the population name is disregarded, and the trait name is replaced with “SS” (seed size).
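The naming rule can be written as a small helper. This is only an illustrative sketch; the function name `qtl_name` is hypothetical, and the colon separator after the population identifier is taken from the example given in the text.

```python
def qtl_name(trait, chromosome, index, population=None):
    """Build a QTL name such as 'qF4:ST10-1' or 'qSS4-1'.

    trait: trait abbreviation (SL, SW, ST, HSW, or SS for pleiotropic loci).
    chromosome: chromosome number.
    index: sequential number of the QTL on that chromosome.
    population: population identifier (e.g. 'F4'); omit for pleiotropic loci.
    """
    prefix = f"{population}:" if population else ""
    return f"q{prefix}{trait}{chromosome}-{index}"


print(qtl_name("ST", 10, 1, population="F4"))  # qF4:ST10-1
print(qtl_name("SS", 4, 1))                    # qSS4-1
```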

2.6. RNA Extraction and Sequencing

R3 pods (pod is approximately 5 mm in length), R5 seeds (seed is approximately 3 mm in length), and R5 pod-valves (pod-valves of R5 seeds) [47] were collected in the field in 2023. Samples from 5 individuals were pooled as a biological replicate, and three biological replicates were prepared for each tissue of each genotype. The fresh samples were frozen in liquid nitrogen, ground using a grinder, and then transferred to 2 mL centrifuge tubes. Total RNA was extracted using Tripure Isolation Reagent (Roche Diagnostics, Mannheim, Germany). RNA-seq was performed on the Illumina NovaSeq 6000 platform (Illumina, San Diego, CA, USA) with a 150 bp paired-end strategy at Novogene Co., Ltd., Beijing, China. For each sample, 6 Gb of clean data were obtained.

2.7. Transcriptome Data Analysis

The raw transcriptome data were subjected to quality control using Trimmomatic (version 0.39) with the parameter “LEADING:5 TRAILING:5 HEADCROP:15 MINLEN:75”. Subsequently, all quality-controlled transcriptome data were aligned to the cultivated soybean reference genome (Phytozome V13 Glycine max Wm82.a4.v1) using STAR (version 2.7.3a), and raw read counts were calculated for each gene. Differentially expressed genes (DEGs) were identified using the DESeq2 package (version 1.34.0), with an FDR-adjusted p-value < 0.05 as the criterion. Gene Ontology (GO) enrichment analysis was performed using a one-tailed hypergeometric test for the DEGs identified in each tissue. The raw p-values were adjusted using the FDR method, and only terms with adjusted p-values less than 0.05 were classified as significantly over-represented.
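The enrichment test can be illustrated with a short sketch: an upper-tail hypergeometric probability for one GO term, followed by a Benjamini–Hochberg adjustment across terms. The text specifies only "the FDR method", so the choice of Benjamini–Hochberg here is an assumption, and the function names are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, population, annotated, drawn):
    """P(X >= k) for a hypergeometric draw without replacement.

    population: number of background genes.
    annotated: background genes carrying the GO term.
    drawn: number of DEGs.
    k: DEGs carrying the GO term.
    """
    upper = min(annotated, drawn)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, drawn)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts, for illustration only.
print(hypergeom_upper_tail(3, population=10, annotated=3, drawn=4))
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```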

2.8. RT-qPCR Assay

RNA was reverse-transcribed into cDNA using the TransScript One-Step gDNA Removal and cDNA Synthesis SuperMix (TransGen Biotech Co., Ltd., Beijing, China). The gene expression levels were evaluated using the SYBR Green (Roche Diagnostics) approach on the QuantStudio™ 5 platform (Life Technologies Holdings Pte Ltd., Woodlands, Singapore). For each tissue type, three biological replicates were analyzed, and the relative expression levels of the genes were calculated by the 2^(−∆∆Ct) method with SUZU samples serving as the control.
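For readers unfamiliar with the relative quantification step, the sketch below works through the calculation with SUZU as the control sample. The internal reference gene is not named in this section, so its Ct values appear here as generic arguments; the function name and the example Ct values are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method.

    ct_target_sample / ct_ref_sample: Ct of the target gene and of the
    internal reference gene in the test sample (e.g. KBL).
    ct_target_control / ct_ref_control: the same two Ct values in the
    control sample (SUZU in this study).
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2 ** (-delta_delta)


# Hypothetical Ct values: the target amplifies two cycles earlier in the
# test sample after normalization, giving a four-fold higher level.
print(relative_expression(22.0, 18.0, 24.0, 18.0))  # 4.0
```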

2.9. InDel/CAPS Marker Design and Linkage Mapping

Large parental InDels within the candidate regions were selected for InDel marker development. SNPs and InDels located in restriction endonuclease recognition sites were selected for Cleaved Amplified Polymorphic Sequences (CAPS) marker development. The 150 bp flanking region around each InDel/SNP was retrieved for primer design using Primer 5. Each InDel/CAPS marker was validated in the parental varieties via 1.5% agarose gel electrophoresis. The genotypes of each F4/F5 individual were assayed using these markers. Linkage map construction and QTL mapping were carried out using IciMapping (version 4.2.53) with the ICIM-ADD mapping method and a 1 cM step, and the LOD threshold for each QTL was manually set to 3.0 [48].
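The idea behind CAPS marker selection, namely that one parental allele creates or destroys a restriction site, can be sketched as follows. This is a simplified illustration and not the authors' pipeline: it handles SNPs only, checks a single strand (adequate for palindromic sites such as EcoRI's GAATTC), and uses hypothetical function names and an invented flanking sequence.

```python
def site_covers_position(seq, pos, site):
    """True if `site` occurs in `seq` at an offset that includes index `pos`."""
    first = max(0, pos - len(site) + 1)
    for start in range(first, pos + 1):
        if seq[start:start + len(site)] == site:
            return True
    return False


def is_caps_candidate(flank, pos, allele_a, allele_b, site):
    """True if exactly one of the two SNP alleles yields the recognition site.

    flank: sequence surrounding the SNP.
    pos: 0-based index of the SNP within `flank`.
    allele_a, allele_b: the two single-base parental alleles.
    site: recognition sequence of the restriction enzyme.
    """
    seq_a = flank[:pos] + allele_a + flank[pos + 1:]
    seq_b = flank[:pos] + allele_b + flank[pos + 1:]
    cut_a = site_covers_position(seq_a, pos, site)
    cut_b = site_covers_position(seq_b, pos, site)
    return cut_a != cut_b


# Hypothetical flanking sequence containing an EcoRI site (GAATTC).
print(is_caps_candidate("AAGAATTCAA", 4, "A", "G", "GAATTC"))  # True
print(is_caps_candidate("AAGAATTCAA", 9, "A", "G", "GAATTC"))  # False
```

After digestion, only the allele carrying the site is cleaved, so the two parental genotypes give different band patterns on the gel.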

3. Results

3.1. Phenotype Distribution and Correlations among Different Seed Traits in the Segregating Populations

The SL, SW, ST, and HSW were assessed for 187 F4 plants and 218 F5 plants in 2022 and 2023, respectively, along with both parental varieties. The parental variety KBL exhibited significantly larger seed size and greater seed weight compared to SUZU (t-test p-value < 0.001) (Figure 1A–F and Figure S1). The selfed progeny populations from both F4 and F5 showed continuous segregation of all measured seed traits (Figure 2A–H; Table S3). ST from the F4 population and two traits (SW and HSW) from the F5 population displayed normal distributions (Shapiro–Wilk test p-value > 0.05, Table S3). The remaining traits in the F4 and F5 populations exhibited absolute skewness close to or less than one, and the kurtosis values were approximately equal to or less than three, except for SW (kurtosis = 3.51) and ST (kurtosis = 4.08) in F5, suggesting near-normal distributions consistent with the observed frequency distributions (Tables S3 and S4). To explore the relationships among these seed traits, correlation analyses were conducted. Significant positive correlations were detected among all traits (Pearson correlation coefficients 0.40–0.88, p-value < 0.001) (Figure S2), indicating that these traits may be influenced by pleiotropic QTLs or closely linked QTLs.
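The skewness and kurtosis criteria used above can be illustrated with moment-based estimates. Which estimator the study used is not stated, so the population-moment form below is an assumption; it reports non-excess kurtosis, for which a normal distribution gives a value of three. The function name and example values are hypothetical.

```python
from statistics import fmean


def skewness_kurtosis(values):
    """Return (skewness, kurtosis) from population central moments.

    Kurtosis is the non-excess form (normal distribution = 3).
    """
    n = len(values)
    mu = fmean(values)
    m2 = sum((x - mu) ** 2 for x in values) / n
    m3 = sum((x - mu) ** 3 for x in values) / n
    m4 = sum((x - mu) ** 4 for x in values) / n
    if m2 == 0:
        raise ValueError("all values are identical")
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical symmetric sample, for illustration only.
skew, kurt = skewness_kurtosis([1.0, 2.0, 3.0, 4.0, 5.0])
print(skew, kurt)
```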

3.2. Identification of Genetic Loci for Seed Traits through BSA-Seq

In the F4 population, BSA-seq analysis identified a total of 18 genetic loci associated with at least one of the four soybean seed traits: SL, SW, ST, and HSW. Specifically, six loci were associated with HSW, while four loci were identified for each of SL, SW, and ST. The physical intervals of these loci ranged from 190.8 kb to 38.9 Mb (Figure 3; Table S5). In the following F5 population, five, four, and three loci were identified to be associated with SL, SW, and HSW, respectively, with intervals spanning from 399.8 kb to 21.5 Mb (Figure 3; Table S5). In total, 30 loci were detected in the two generations. Comparison with previously reported QTLs curated in SoyBase revealed that 17 loci overlapped with known QTLs in terms of physical positions (Table S5), while 13 were newly discovered in this study (Table S5). Additionally, five pleiotropic loci were identified by analyzing the overlaps in the physical positions of these QTLs (Table S6).
Notably, six QTLs from the F4 and seven QTLs from the F5 populations were found to overlap, with two specific QTLs from F5 (qF5:SL20-1 and qF5:SL20-2) coinciding with a QTL from F4 (qF4:SL20-1). This overlap resulted in six stable QTLs detected in both generations, indicating their environmental stability: three for SW, two for SL, and one for HSW (Table S7). Among these stable QTLs, SL QTLs (qF4:SL4-1 and qF5:SL4-1) and SW QTLs (qF4:SW4-1 and qF5:SW4-1) physically overlapped on chromosome 4, spanning the region from 56,874 to 8,470,755 bp. Therefore, they were merged and designated as qSS4-1. Similarly, on chromosome 20, the HSW QTLs (qF4:HSW20-3 and qF5:HSW20-1), SL QTLs (qF4:SL20-1, qF5:SL20-1 and qF5:SL20-2), and SW QTLs (qF4:SW20-1 and qF5:SW20-1) also overlapped, covering the region from 2,183,087 to 45,850,829 bp, which was designated as qSS20-1 (Table S7).
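Merging physically overlapping QTL intervals into one locus, as done for qSS4-1 and qSS20-1, amounts to a standard interval-merging pass over sorted coordinates. The sketch below shows that logic for intervals on a single chromosome; the function name is hypothetical, and the coordinates in the example are invented rather than taken from Table S7.

```python
def merge_intervals(intervals):
    """Merge overlapping (start, end) intervals on one chromosome.

    Returns a list of non-overlapping (start, end) tuples sorted by start.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous locus: extend its right boundary.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(locus) for locus in merged]


# Hypothetical coordinates (bp), for illustration only.
print(merge_intervals([(100, 500), (400, 1000), (2000, 3000)]))
# [(100, 1000), (2000, 3000)]
```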

3.3. Linkage Mapping of Stable QTLs

To further validate and narrow down the identified regions of the stable QTLs, genetic maps were constructed using InDel/CAPS markers for qSS4-1, qSS20-1, and qSW14-1 (Table S8). In the candidate region for qSS4-1, QTLs associated with SL, SW, and HSW were pinpointed to an interval between markers Caps4-1 and Caps4-2 (Gm04: 3,294,059~4,511,350). The LOD scores for these traits were 7.37, 6.14, and 5.24, respectively, explaining 15.04%, 12.51%, and 10.75% of the phenotypic variance (Figure 4A; Table 1). The overlapping confidence intervals for these QTLs suggest a pleiotropic locus influencing multiple seed traits. This locus coincided with a previously reported QTL for seed size (SS), called Seed volume 1-9 (Gm04: 564,119~7,891,669) [49]. Furthermore, in the candidate region for qSS20-1, QTLs for SL, SW, ST, and HSW were localized within a 1.22 Mb interval between Marker20-4 and Marker20-5 (Gm20: 37,158,491~38,379,617). The LOD scores for these traits were 10.37, 4.04, 3.33, and 10.39, respectively, accounting for 19.87%, 13.73%, 6.86%, and 19.84% of the phenotypic variance (Figure 4C; Table 1). Notably, the left side of qSS20-1 overlaps with a previously identified QTL for HSW, named Seed weight 37-11 (Gm20: 36,842,373~37,789,703) [50], while the right side covers a GWAS QTL for HSW, named Seed weight 15-g9 (Gm20: 38,375,777) [17]. However, its impact on seed size traits has not been previously reported, indicating a novel aspect of its genetic effects revealed by this study. The two pleiotropic QTLs, qSS4-1 and qSS20-1, were confirmed through linkage mapping in the F4 population (Figure S3). In the analysis of the candidate region for qSW14-1 on chromosome 14, the peak covers a broader interval, positioned between Marker14-4 and Marker14-6 (Gm14: 14,344,083~28,212,214), with an LOD score of 20.31 and a PVE of 8.64% (Figure 4B; Table 1), overlapping with the QTL Seed width 4-7, as shown in SoyBase [51].
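The physical sizes of the refined intervals follow directly from the marker coordinates reported above. The short sketch below performs that arithmetic; the dictionary layout and function name are illustrative choices, while the coordinates are those given in the text.

```python
# Flanking-marker coordinates (bp) as reported in the text.
REFINED_INTERVALS = {
    "qSS4-1": (3_294_059, 4_511_350),
    "qSS20-1": (37_158_491, 38_379_617),
    "qSW14-1": (14_344_083, 28_212_214),
}


def interval_size_mb(start, end):
    """Physical length of an interval in megabases, rounded to two decimals."""
    return round((end - start) / 1_000_000, 2)


for locus, (start, end) in REFINED_INTERVALS.items():
    print(locus, interval_size_mb(start, end), "Mb")
# qSS4-1 1.22 Mb
# qSS20-1 1.22 Mb
# qSW14-1 13.87 Mb
```

Both pleiotropic loci resolve to about 1.22 Mb, whereas the qSW14-1 interval remains roughly ten times larger.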

3.4. Gene Expression Difference and Genetic Variation in Candidate Genes within qSS4-1, qSS20-1, and qSW14-1

Within the three stable QTLs (qSS4-1, qSS20-1, and qSW14-1), a total of 145, 143, and 198 genes were identified, respectively. Among these, 18 genes were found to have Arabidopsis homologs involved in the regulation of seed development and were therefore selected for further analysis (Table 2). Specifically, qSS4-1 harbors five genes with Arabidopsis homologs implicated in seed development regulation (Table 2). Similarly, in the qSS20-1 region, seven seed development-related genes were identified, five of which were homologs of Arabidopsis genes AGL62 and AGL61 (Table 2). For qSW14-1, six genes involved in seed development, hormone signaling, and synthesis were selected. Among these six genes, three were homologs of Arabidopsis AAP8, the functional gene underlying a major QTL for seed size and weight (SSW1) in Arabidopsis [52].
We analyzed the possible sequence variations in these candidate genes between the two parental varieties, KBL and SUZU, based on whole-genome sequencing (WGS) data. Most sequence variations were detected in the UTR, intron, and up- or downstream regions of the candidate genes. Six genes in qSW14-1 (Glyma.14G128400, Glyma.14G132500, Glyma.14G138700, Glyma.14G144200, Glyma.14G144400, Glyma.14G144700) and four genes in qSS20-1 (Glyma.20G135400, Glyma.20G136600, Glyma.20G136700, Glyma.20G136800) had sequence variants in their UTRs, introns, and up-/downstream regions. Notably, Glyma.20G136800 also contained a missense variation that resulted in an amino acid difference between the two parental varieties, which is asparagine in KBL and serine in SUZU (Table 2).
Furthermore, we analyzed the expression of the candidate genes within these QTLs by using RNA-seq data from R3 pods, R5 pod-valves, and R5 seeds. In total, 5971, 6596, and 6295 genes were found to show significant differential expression between KBL and SUZU in R3 pods, R5 pod-valves, and R5 seeds, respectively (Table S9). We randomly selected 10 DEGs and validated their expression levels using RT-qPCR, and the results confirmed the reliability of the RNA-seq data (Figure S4). Among the candidate genes, seven showed very low expression levels in all three tissues and were classified as unexpressed. Two genes (Glyma.04G045500 and Glyma.04G047900) in qSS4-1, one gene (Glyma.14G144200) in qSW14-1, and one gene (Glyma.20G135700) in qSS20-1 showed a significantly higher expression level in KBL than in SUZU in R3 pods or R5 seeds (Table 2, Figure 5). Of note, the differentially expressed candidate genes were validated using RT-qPCR (Figure S5). The RT-qPCR results confirmed that the expression levels of these four genes were significantly higher in KBL compared to SUZU (t-test p-value < 0.05, Figure S5).

4. Discussion

4.1. Identification of Novel QTLs for Seed Size and Weight in Soybean

Seed size and weight are among the major determinants of soybean yield [53,54,55]. Seed size is a complex composite trait primarily determined by component traits such as SL, SW, and ST [56,57]. Significant progress has been made in understanding the genetic basis of soybean seed size and weight. Nearly 400 QTLs identified through linkage mapping and >100 QTLs identified by GWASs have been curated in SoyBase [7,8,29]. Recent studies have reported several seed size or weight QTLs that exhibit pleiotropic effects, environmental interactions, and epistasis [58,59,60,61]. However, most QTLs reported by early linkage mapping studies span large intervals due to technical limitations at that time. With the application of improved and more rapid methods for QTL detection and the construction of mapping populations, it is expected that more QTLs will be identified for soybean seed size and weight.
In this study, we quantified the phenotype data of seed traits from the F4 and F5 populations that were grown in 2022 and 2023, respectively. In both years, KBL and SUZU showed significant differences in all seed traits, indicating that the phenotypic variations between the two varieties are primarily governed by genetic differences. We identified a total of 30 genetic loci associated with seed size/weight through BSA-seq from the F4 and F5 mapping populations of a pair of soybean varieties with contrasting phenotypes in these traits. Among these identified QTLs, nine affected SL, eight affected SW, four influenced ST, and the remaining nine affected HSW (Table S5). Of note, 17 (56.67%) of these soybean seed size and seed weight QTLs identified in this study overlapped with those documented in SoyBase (Table S5), while the remaining 13 (43.33%) were new loci (Table S5). Three SW QTLs, two SL QTLs, and one HSW QTL were recurrently detected in both F4 and F5 populations, indicating that they are stable QTLs (Table S7).
All HSW QTLs overlapped with those documented in SoyBase (Table S5) [17,62,63,64,65,66]. Specifically, two QTLs (qF4:HSW20-1 and qF4:HSW20-2) overlapped with the previously reported QTL cqSeed weight-003 [62], and another two QTLs (qF4:HSW20-3 and qF5:HSW20-1) from stable loci overlapped with the GWAS locus Seed weight 15-g9 (Table S5) [17]. For SL, QTLs from two stable loci (qF4:SL4-1, qF5:SL4-1, qF4:SL20-1, qF5:SL20-2) overlapped with two previously reported QTLs (Seed length 1-12 and Seed length 1-g8) (Table S5) [49,56]. Additionally, three SW QTLs, including two from a stable locus, coincided with previously reported QTLs [51]. Only one ST QTL overlapped with a previously reported QTL (Table S5) [49]. In total, nine unstable QTLs overlapped with previously reported QTLs (Table S5). The G’ values of most unstable QTLs were lower than those of the stable QTLs (Table S5), indicating that their effects are comparatively minor. These QTLs were detected in only one population in our study, likely due to their relatively minor effects, interactions with environmental factors, or the limited detection power of the BSA-seq method for such QTLs. These newly discovered genetic loci and stable QTLs will be valuable in soybean breeding.

4.2. The Pleiotropic Loci for Soybean Seed Size and Weight

There is a highly significant positive correlation among the four traits determining seed size and weight (Figure S2), suggesting the existence of genetic connections among them. In this study, we identified two stable pleiotropic QTLs, qSS4-1 and qSS20-1, using linkage mapping. The genetic locus qSS4-1 on chromosome 4 simultaneously controls SL, SW, and HSW, explaining more than 10% of phenotypic variation in each trait, highlighting its significance as a large-effect QTL for seed size and seed weight (Table 1). A previous study has reported a pleiotropic QTL for seed volume in the same region with a physical length of 7.33 Mb [49]. Meanwhile, a large QTL, Seed weight 47-3, spanning nearly the whole of chromosome 4, was found to be associated with HSW [67]. In this study, we further narrowed down the interval of the qSS4-1 locus to 1.22 Mb, thus significantly increasing the precision of its localization. On the other hand, qSS20-1 was detected in both the F4 and F5 populations by BSA-seq, controlling SL, SW, ST, and HSW, with PVE being more than 10% for all traits except ST (6.86%), and it was located in a 1.22 Mb interval through linkage mapping (Table 1). This is therefore a highly significant pleiotropic QTL with a substantial effect on soybean seed size and weight. Although a small portion of the qSS20-1 locus overlaps with the previously identified Seed weight 37-11 [50], this does not necessarily imply an identical genetic underpinning. Furthermore, the impact of qSS20-1 on SL, SW, and ST has not been previously reported, underscoring a novel aspect of its genetic effect.

4.3. Prediction of Candidate Genes in qSS4-1, qSS20-1, and qSW14-1

Although hundreds of QTLs associated with seed size and weight have been documented, the functional genes underlying most of these QTLs remain to be identified. To date, only about two dozen genes have been reported to influence soybean seed size and weight regulation [22,23,24,25,26,27,28,68,69,70,71,72,73,74,75,76,77,78,79,80]. The molecular regulatory network controlling seed size in soybeans is intricate, conceivably involving multi-level interactions such as hormone metabolism, nutrient management, and signal transduction [7,8,29].
In this study, we analyzed genome-wide gene expression differences between the KBL and SUZU soybean varieties in R3 pods, R5 pod-valves, and R5 seeds using RNA sequencing. We identified 5971 to 6596 DEGs across the three tissues (Table S9). Several Gene Ontology (GO) terms were significantly over-represented in the DEGs. Notably, the carbohydrate metabolic process and signal transduction were consistently over-represented in DEGs across all three tissues (Figure S6). Specifically, in the R3 pods, DEGs were particularly enriched in biological processes such as the lipid biosynthetic process and fatty acid biosynthetic process. A few GO terms were over-represented in DEGs in R5 pod-valves and R5 seeds, such as xyloglucosyl transferase activity, defense response, and iron ion binding (Figure S6). These findings suggest that these biological processes or functions are likely involved in the observed differences in seed size and weight between the two parental varieties.
Within the qSS4-1 locus, we pinpointed five genes as prime candidates that potentially influence seed size and seed weight (Table 2). Glyma.04G042000 and Glyma.04G045500 are homologs of Arabidopsis CYCD3;1 and CYCD5;1, respectively, which have been shown to play roles in seed development regulation [81]. Notably, the expression of Glyma.04G045500 in R3 pod tissue is significantly higher in KBL than in SUZU, suggesting its pivotal role underlying the QTL. The homolog of Glyma.04G046600 in Arabidopsis is AT4G33800, which encodes a hypothetical protein and is annotated to participate in cytokinin metabolic processes and the signaling pathway. Glyma.04G047900, corresponding to Arabidopsis ANT, an AP2-like family transcription factor which is implicated in the maternal control of seed size [82], showed a higher expression level in KBL than in SUZU in R5 seed tissue. Lastly, Glyma.04G055600, analogous to Arabidopsis CKX7, is known to regulate fruit growth through cytokinin degradation [83].
Within the qSS20-1 locus, we found 143 genes, highlighting 7 candidates likely involved in seed size regulation (Table 2). Glyma.20G135400, homologous to Arabidopsis CYCD1;1, is anticipated to regulate seed development [81]. Glyma.20G135700, a homolog of PSK4, has been linked to increased seed size and weight in Arabidopsis when expressing the soybean gene GmPSKγ1 [78]. The remaining five genes, tandemly clustered in position, are homologs of Arabidopsis AGL62 and AGL61, known to significantly influence the development of floral organs and seeds, thereby affecting seed morphology and size [84]. Among the seven genes, Glyma.20G135700 showed a higher expression level in KBL than SUZU in R3 pod tissue, and Glyma.20G136800 has a missense variant between parental varieties (Table 2).
Within the qSW14-1 locus, Glyma.14G144200, Glyma.14G144400, and Glyma.14G144700 are homologs of Arabidopsis AAP8, the functional gene underlying a major QTL for seed size and weight on chromosome 1 (SSW1) in Arabidopsis [52]. Glyma.14G128400 (GA3OX1), Glyma.14G132500 (SUP32), and Glyma.14G138700 (SAUR37), whose Arabidopsis homologs are involved in seed development and in hormone signaling and synthesis, were also identified in the candidate region. Among these genes, only Glyma.14G144200 was differentially expressed between the two parents, showing a significantly higher expression level in KBL in R5 seeds (Table 2).

5. Conclusions

In this study, we mapped 30 QTLs associated with soybean seed size and weight. Of these, we identified six stable QTLs, with five originating from two pleiotropic loci and one specific to SW. In addition, we pinpointed several candidate genes within the stable QTLs and analyzed their sequence variation and expression. Further efforts with secondary mapping populations will facilitate the identification of the functional genes underlying these QTLs. The genetic loci and candidate genes identified in this study lay a solid foundation for further elucidating the genetic basis and molecular mechanisms underpinning soybean seed size and seed weight. Additionally, this research offers genetic resources for soybean improvement through marker-assisted breeding.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/agronomy14091957/s1. Figure S1: Comparison of seed size and weight traits between parental varieties; Figure S2: The correlation among seed length, seed width, seed thickness, and hundred-seed weight in the F4 and F5 populations; Figure S3: Linkage mapping of the pleiotropic qSS4-1 and qSS20-1 loci in the F4 population; Figure S4: RT-qPCR validation of ten DEGs; Figure S5: RT-qPCR validation of differentially expressed candidate genes in qSS4-1, qSW14-1, and qSS20-1; Figure S6: GO enrichment analysis of DEGs between parental varieties in three tissues; Table S1: Descriptive statistics of seed size and weight traits in the high- and low-bulk pools of BSA-seq; Table S2: The design of high and low bulks for each trait; Table S3: Shapiro–Wilk normality test p-values, skewness, and kurtosis of phenotypic data in the F4 and F5 populations; Table S4: Descriptive statistics of seed size and weight traits in the F4 and F5 populations and the parents; Table S5: Total QTLs detected by BSA-seq in the F4 and F5 populations; Table S6: Pleiotropic QTLs in the F4 and F5 populations; Table S7: Stable QTLs in the F4 and F5 populations; Table S8: The primer sequences of the InDel and CAPS markers for linkage mapping in the qSS4-1, qSW14-1, and qSS20-1 loci and the sequences of the RT-qPCR primers; Table S9: Summary of DEGs in different tissues.

Author Contributions

Conceptualization, B.L. and C.X.; methodology, C.X. and Q.D.; formal analysis, M.W.; investigation, M.W., X.D., J.Y., M.J., and L.L.; resources, X.D.; data curation, M.W., Y.Z., N.Z., and P.L.; writing—original draft preparation, M.W., X.D., Y.Z., G.X., J.Y., M.J., L.L., and P.L.; writing—review and editing, N.Z., Q.D., and B.L.; supervision, C.X.; project administration, C.X.; funding acquisition, B.L. and C.X.; All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the National Natural Science Foundation of China (grant no. U21A20215), the Natural Science Foundation of Jilin Province (grant no. 20210101007JC), the Jilin Province Science and Technology Development Plan Project (grant no. YDZJ202202CXJD014), and the Fundamental Research Funds for the Central Universities (grant no. 2412023YQ005).

Data Availability Statement

The WGS data and RNA-seq data for this study have been submitted to the NCBI SRA database and can be found under the following accession numbers: PRJNA1102699 and PRJNA1102715.

Acknowledgments

We thank Jingbo Zhang for help in taking care of the plant materials.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Li, Y.; Guan, R.; Liu, Z.; Ma, Y.; Wang, L.; Li, L.; Lin, F.; Luan, W.; Chen, P.; Yan, Z.; et al. Genetic structure and diversity of cultivated soybean (Glycine max (L.) Merr.) landraces in China. Theor. Appl. Genet. 2008, 117, 857–871.
  2. Zhang, D.; Sun, L.; Li, S.; Wang, W.; Ding, Y.; Swarm, S.A.; Li, L.; Wang, X.; Tang, X.; Zhang, Z.; et al. Elevation of soybean seed oil content through selection for seed coat shininess. Nat. Plants 2018, 4, 30–35.
  3. Graham, P.H.; Vance, C.P. Legumes: Importance and constraints to greater use. Plant Physiol. 2003, 131, 872–877.
  4. Liang, Q.; Chen, L.; Yang, X.; Yang, H.; Liu, S.; Kou, K.; Fan, L.; Zhang, Z.; Duan, Z.; Yuan, Y.; et al. Natural variation of Dt2 determines branching in soybean. Nat. Commun. 2022, 13, 6429.
  5. Li, J.; Zhao, J.; Li, Y.; Gao, Y.; Hua, S.; Nadeem, M.; Sun, G.; Zhang, W.; Hou, J.; Wang, X.; et al. Identification of a novel seed size associated locus SW9-1 in soybean. Crop J. 2019, 7, 548–559.
  6. Niu, Y.; Xu, Y.; Liu, X.-F.; Yang, S.-X.; Wei, S.-P.; Xie, F.-T.; Zhang, Y.-M. Association mapping for seed size and shape traits in soybean cultivars. Mol. Breed. 2013, 31, 785–794.
  7. Duan, Z.; Li, Q.; Wang, H.; He, X.; Zhang, M. Genetic regulatory networks of soybean seed size, oil and protein contents. Front. Plant Sci. 2023, 14, 1160418.
  8. Tayade, R.; Imran, M.; Ghimire, A.; Khan, W.; Nabi, R.B.S.; Kim, Y. Molecular, genetic, and genomic basis of seed size and yield characteristics in soybean. Front. Plant Sci. 2023, 14, 1195210.
  9. Kumawat, G.; Xu, D. A Major and Stable Quantitative Trait Locus qSS2 for Seed Size and Shape Traits in a Soybean RIL Population. Front. Genet. 2021, 12, 646102.
  10. Luo, S.; Jia, J.; Liu, R.; Wei, R.; Guo, Z.; Cai, Z.; Chen, B.; Liang, F.; Xia, Q.; Nian, H.; et al. Identification of major QTLs for soybean seed size and seed weight traits using a RIL population in different environments. Front. Plant Sci. 2023, 13, 1094112.
  11. Liu, D.; Yan, Y.; Fujita, Y.; Xu, D. Identification and validation of QTLs for 100-seed weight using chromosome segment substitution lines in soybean. Breed. Sci. 2018, 68, 442–448.
  12. Yuan, B.; Qi, G.; Yuan, C.; Wang, Y.; Zhao, H.; Li, Y.; Wang, Y.; Dong, L.; Dong, Y.; Liu, X. Major genetic locus with pleiotropism determined seed-related traits in cultivated and wild soybeans. Theor. Appl. Genet. 2023, 136, 125.
  13. Copley, T.R.; Duceppe, M.-O.; O’Donoughue, L.S. Identification of novel loci associated with maturity and yield traits in early maturity soybean plant introduction lines. BMC Genom. 2018, 19, 167.
  14. Yan, L.; Hofmann, N.; Li, S.; Ferreira, M.E.; Song, B.; Jiang, G.; Ren, S.; Quigley, C.; Fickus, E.; Cregan, P.; et al. Identification of QTL with large effect on seed weight in a selective population of soybean with genome-wide association and fixation index analyses. BMC Genom. 2017, 18, 529.
  15. Whiting, R.M.; Torabi, S.; Lukens, L.; Eskandari, M. Genomic regions associated with important seed quality traits in food-grade soybeans. BMC Plant Biol. 2020, 20, 485.
  16. Fang, C.; Ma, Y.; Wu, S.; Liu, Z.; Wang, Z.; Yang, R.; Hu, G.; Zhou, Z.; Yu, H.; Zhang, M.; et al. Genome-wide association studies dissect the genetic networks underlying agronomical traits in soybean. Genome Biol. 2017, 18, 161.
  17. Wu, D.; Li, C.; Jing, Y.; Wang, J.; Zhao, X.; Han, Y. Identification of quantitative trait loci underlying soybean (Glycine max) 100-seed weight under different levels of phosphorus fertilizer application. Plant Breed. 2020, 139, 959–968.
  18. Zhang, H.; Hao, D.; Sitoe, H.M.; Yin, Z.; Hu, Z.; Zhang, G.; Yu, D. Genetic dissection of the relationship between plant architecture and yield component traits in soybean (Glycine max) by association analysis across multiple environments. Plant Breed. 2015, 134, 564–572.
  19. Wang, J.; Chu, S.; Zhang, H.; Zhu, Y.; Cheng, H.; Yu, D. Development and application of a novel genome-wide SNP array reveals domestication history in soybean. Sci. Rep. 2016, 6, 20728.
  20. Hao, D.; Cheng, H.; Yin, Z.; Cui, S.; Zhang, D.; Wang, H.; Yu, D. Identification of single nucleotide polymorphisms and haplotypes associated with yield and yield components in soybean (Glycine max) landraces across multiple environments. Theor. Appl. Genet. 2012, 124, 447–458.
  21. Zhang, J.; Song, Q.; Cregan, P.B.; Jiang, G.-L. Genome-wide association study, genomic prediction and marker-assisted selection for seed weight in soybean (Glycine max). Theor. Appl. Genet. 2016, 129, 117–130.
  22. Li, J.; Zhang, Y.; Ma, R.; Huang, W.; Hou, J.; Fang, C.; Wang, L.; Yuan, Z.; Sun, Q.; Dong, X.; et al. Identification of ST1 reveals a selection involving hitchhiking of seed morphology and oil content during soybean domestication. Plant Biotechnol. J. 2022, 20, 1110–1121.
  23. Duan, Z.; Zhang, M.; Zhang, Z.; Liang, S.; Fan, L.; Yang, X.; Yuan, Y.; Pan, Y.; Zhou, G.; Liu, S.; et al. Natural allelic variation of GmST05 controlling seed size and quality in soybean. Plant Biotechnol. J. 2022, 20, 1807–1818.
  24. Goettel, W.; Zhang, H.; Li, Y.; Qiao, Z.; Jiang, H.; Hou, D.; Song, Q.; Pantalone, V.R.; Song, B.-H.; Yu, D.; et al. POWR1 is a domestication gene pleiotropically regulating seed quality and yield in soybean. Nat. Commun. 2022, 13, 3051.
  25. Lu, X.; Xiong, Q.; Cheng, T.; Li, Q.T.; Liu, X.L.; Bi, Y.D.; Li, W.; Zhang, W.K.; Ma, B.; Lai, Y.C.; et al. A PP2C-1 Allele Underlying a Quantitative Trait Locus Enhances Soybean 100-Seed Weight. Mol. Plant 2017, 10, 670–684.
  26. Du, J.; Wang, S.; He, C.; Zhou, B.; Ruan, Y.-L.; Shou, H. Identification of regulatory networks and hub genes controlling soybean seed set and size using RNA sequencing analysis. J. Exp. Bot. 2017, 68, 1955–1972.
  27. Gu, Y.; Li, W.; Jiang, H.; Wang, Y.; Gao, H.; Liu, M.; Chen, Q.; Lai, Y.; He, C. Differential expression of a WRKY gene between wild and cultivated soybeans correlates to seed size. J. Exp. Bot. 2017, 68, 2717–2729.
  28. Lu, X.; Li, Q.-T.; Xiong, Q.; Li, W.; Bi, Y.-D.; Lai, Y.-C.; Liu, X.-L.; Man, W.-Q.; Zhang, W.-K.; Ma, B.; et al. The transcriptomic signature of developing soybean seeds reveals the genetic basis of seed trait adaptation during domestication. Plant J. 2016, 86, 530–544.
  29. Zhang, Y.; Bhat, J.A.; Zhang, Y.; Yang, S. Understanding the Molecular Regulatory Networks of Seed Size in Soybean. Int. J. Mol. Sci. 2024, 25, 1441.
  30. Giovannoni, J.J.; Wing, R.A.; Ganal, M.W.; Tanksley, S.D. Isolation of molecular markers from specific chromosomal intervals using DNA pools from existing mapping populations. Nucleic Acids Res. 1991, 19, 6553–6568.
  31. Michelmore, R.W.; Paran, I.; Kesseli, R.V. Identification of markers linked to disease-resistance genes by bulked segregant analysis: A rapid method to detect markers in specific genomic regions by using segregating populations. Proc. Natl. Acad. Sci. USA 1991, 88, 9828–9832.
  32. Ehrenreich, I.M.; Torabi, N.; Jia, Y.; Kent, J.; Martis, S.; Shapiro, J.A.; Gresham, D.; Caudy, A.A.; Kruglyak, L. Dissection of genetically complex traits with extremely large pools of yeast segregants. Nature 2010, 464, 1039–1042.
  33. Li, Z.; Xu, Y. Bulk segregation analysis in the NGS era: A review of its teenage years. Plant J. 2022, 109, 1355–1374.
  34. Takagi, H.; Abe, A.; Yoshida, K.; Kosugi, S.; Natsume, S.; Mitsuoka, C.; Uemura, A.; Utsushi, H.; Tamiru, M.; Takuno, S.; et al. QTL-seq: Rapid mapping of quantitative trait loci in rice by whole genome resequencing of DNA from two bulked populations. Plant J. 2013, 74, 174–183.
  35. Magwene, P.M.; Willis, J.H.; Kelly, J.K. The Statistics of Bulk Segregant Analysis Using Next Generation Sequencing. PLoS Comput. Biol. 2011, 7, e1002255.
  36. Wenger, J.W.; Schwartz, K.; Sherlock, G. Bulk Segregant Analysis by High-Throughput Sequencing Reveals a Novel Xylose Utilization Gene from Saccharomyces cerevisiae. PLoS Genet. 2010, 6, e1000942.
  37. Zhang, S.; Abdelghany, A.M.; Azam, M.; Qi, J.; Li, J.; Feng, Y.; Liu, Y.; Feng, H.; Ma, C.; Gebregziabher, B.S.; et al. Mining candidate genes underlying seed oil content using BSA-seq in soybean. Ind. Crops Prod. 2023, 194, 116308.
  38. Li, R.; Jiang, H.; Zhang, Z.; Zhao, Y.; Xie, J.; Wang, Q.; Zheng, H.; Hou, L.; Xiong, X.; Xin, D.; et al. Combined Linkage Mapping and BSA to Identify QTL and Candidate Genes for Plant Height and the Number of Nodes on the Main Stem in Soybean. Int. J. Mol. Sci. 2020, 21, 42.
  39. Vogel, G.; LaPlant, K.E.; Mazourek, M.; Gore, M.A.; Smart, C.D. A combined BSA-Seq and linkage mapping approach identifies genomic regions associated with Phytophthora root and crown rot resistance in squash. Theor. Appl. Genet. 2021, 134, 1015–1031.
  40. Win, K.T.; Vegas, J.; Zhang, C.; Song, K.; Lee, S. QTL mapping for downy mildew resistance in cucumber via bulked segregant analysis using next-generation sequencing and conventional methods. Theor. Appl. Genet. 2017, 130, 199–211.
  41. Zhang, K.; Yuan, M.; Xia, H.; He, L.; Ma, J.; Wang, M.; Zhao, H.; Hou, L.; Zhao, S.; Li, P.; et al. BSA-seq and genetic mapping reveals AhRt2 as a candidate gene responsible for red testa of peanut. Theor. Appl. Genet. 2022, 135, 1529–1540.
  42. Bolger, A.M.; Lohse, M.; Usadel, B. Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics 2014, 30, 2114–2120.
  43. Li, H.; Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 2009, 25, 1754–1760.
  44. Cingolani, P.; Platts, A.; Wang, L.L.; Coon, M.; Nguyen, T.; Wang, L.; Land, S.J.; Lu, X.; Ruden, D.M. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff. Fly 2012, 6, 80–92.
  45. Chen, X.; Schulz-Trieglaff, O.; Shaw, R.; Barnes, B.; Schlesinger, F.; Källberg, M.; Cox, A.J.; Kruglyak, S.; Saunders, C.T. Manta: Rapid detection of structural variants and indels for germline and cancer sequencing applications. Bioinformatics 2015, 32, 1220–1222.
  46. Mansfeld, B.N.; Grumet, R. QTLseqr: An R Package for Bulk Segregant Analysis with Next-Generation Sequencing. Plant Genome 2018, 11, 180006.
  47. Fehr, W.R.; Caviness, C.E.; Burmood, D.T.; Pennington, J.S. Stage of Development Descriptions for Soybeans, Glycine Max (L.) Merrill. Crop Sci. 1971, 11, 929–931.
  48. Pavan Kumar, N.; Biradar, B.D.; Bagewadi, B.; Hanamaratti, N.G.; Bhat, S.; Shekharappa; Nethra, P.; Kariyannanavar, P.; Kavyashree, N.M. Identification of SSR markers linked to new fertility restoration trait in sorghum (Sorghum bicolor (L.) Moench) for A4 (maldandi) male sterile cytoplasm. Plant Breed. 2023, 143, 195–203.
  49. Salas, P.; Oyarzo-Llaipen, J.C.; Wang, D.; Chase, K.; Mansur, L. Genetic mapping of seed shape in three populations of recombinant inbred lines of soybean (Glycine max L. Merr.). Theor. Appl. Genet. 2006, 113, 1459–1466.
  50. Sun, Y.-N.; Pan, J.-B.; Shi, X.-L.; Du, X.-Y.; Wu, Q.; Qi, Z.-M.; Jiang, H.-W.; Xin, D.-W.; Liu, C.-Y.; Hu, G.-H.; et al. Multi-environment mapping and meta-analysis of 100-seed weight in soybean. Mol. Biol. Rep. 2012, 39, 9435–9443.
  51. Jun, T.-H.; Freewalt, K.; Michel, A.P.; Mian, R. Identification of novel QTL for leaf traits in soybean. Plant Breed. 2014, 133, 61–66.
  52. Jiang, S.; Jin, X.; Liu, Z.; Xu, R.; Hou, C.; Zhang, F.; Fan, C.; Wu, H.; Chen, T.; Shi, J.; et al. Natural variation in SSW1 coordinates seed growth and nitrogen use efficiency in Arabidopsis. Cell Rep. 2024, 43, 114150.
  53. Fontes, L.A.N.; Ohlrogge, A.J. Influence of Seed Size and Population on Yield and Other Characteristics of Soybean [Glycine max (L.) Merr.]. Agron. J. 1972, 64, 833–836.
  54. Smith, T.J.; Camper, H.M., Jr. Effects of Seed Size on Soybean Performance. Agron. J. 1975, 67, 681–684.
  55. Poeta, F.; Borrás, L.; Rotundo, J.L. Variation in Seed Protein Concentration and Seed Size Affects Soybean Crop Growth and Development. Crop Sci. 2016, 56, 3196–3208.
  56. Hina, A.; Cao, Y.; Song, S.; Li, S.; Sharmin, R.A.; Elattar, M.A.; Bhat, J.A.; Zhao, T. High-Resolution Mapping in Two RIL Populations Refines Major “QTL Hotspot” Regions for Seed Size and Shape in Soybean (Glycine max L.). Int. J. Mol. Sci. 2020, 21, 1040.
  57. Xu, Y.; Li, H.-N.; Li, G.-J.; Wang, X.; Cheng, L.-G.; Zhang, Y.-M. Mapping quantitative trait loci for seed size traits in soybean (Glycine max L. Merr.). Theor. Appl. Genet. 2011, 122, 581–594.
  58. Wang, L.; Karikari, B.; Zhang, H.; Zhang, C.; Wang, Z.; Zhao, T.; Feng, J. Comprehensive Identification of Main, Environment Interaction and Epistasis Quantitative Trait Nucleotides for 100-Seed Weight in Soybean (Glycine max (L.) Merr.). Agronomy 2024, 14, 483.
  59. Chen, Y.; Xiong, Y.; Hong, H.; Li, G.; Gao, J.; Guo, Q.; Sun, R.; Ren, H.; Zhang, F.; Wang, J.; et al. Genetic dissection of and genomic selection for seed weight, pod length, and pod width in soybean. Crop J. 2023, 11, 832–841.
  60. Elattar, M.A.; Karikari, B.; Li, S.; Song, S.; Cao, Y.; Aslam, M.; Hina, A.; Abou-Elwafa, S.F.; Zhao, T. Identification and Validation of Major QTLs, Epistatic Interactions, and Candidate Genes for Soybean Seed Shape and Weight Using Two Related RIL Populations. Front. Genet. 2021, 12, 666440.
  61. Li, M.; Chen, L.; Zeng, J.; Razzaq, M.K.; Xu, X.; Xu, Y.; Wang, W.; He, J.; Xing, G.; Gai, J. Identification of Additive–Epistatic QTLs Conferring Seed Traits in Soybean Using Recombinant Inbred Lines. Front. Plant Sci. 2020, 11, 566056.
  62. Nichols, D.M.; Glover, K.D.; Carlson, S.R.; Specht, J.E.; Diers, B.W. Fine Mapping of a Seed Protein QTL on Soybean Linkage Group I and Its Correlated Effects on Agronomic Traits. Crop Sci. 2006, 46, 834–839.
  63. Orf, J.H.; Chase, K.; Jarvik, T.; Mansur, L.M.; Cregan, P.B.; Adler, F.R.; Lark, K.G. Genetics of Soybean Agronomic Traits: I. Comparison of Three Related Recombinant Inbred Populations. Crop Sci. 1999, 39, 1642–1651.
  64. Specht, J.E.; Chase, K.; Macrander, M.; Graef, G.L.; Chung, J.; Markwell, J.P.; Germann, M.; Orf, J.H.; Lark, K.G. Soybean Response to Water: A QTL Analysis of Drought Tolerance. Crop Sci. 2001, 41, 493–509.
  65. Yan, L.; Li, Y.-H.; Yang, C.-Y.; Ren, S.-X.; Chang, R.-Z.; Zhang, M.-C.; Qiu, L.-J. Identification and validation of an over-dominant QTL controlling soybean seed weight using populations derived from Glycine max × Glycine soja. Plant Breed. 2014, 133, 632–637.
  66. Mian, M.A.R.; Bailey, M.A.; Tamulonis, J.P.; Shipe, E.R.; Carter, T.E.; Parrott, W.A.; Ashley, D.A.; Hussey, R.S.; Boerma, H.R. Molecular markers associated with seed weight in two soybean populations. Theor. Appl. Genet. 1996, 93, 1011–1016.
  67. Li, D.; Sun, M.; Han, Y.; Teng, W.; Li, W. Identification of QTL underlying soluble pigment content in soybean stems related to resistance to soybean white mold (Sclerotinia sclerotiorum). Euphytica 2010, 172, 49–57.
  68. Jiang, W.; Zhang, X.; Song, X.; Yang, J.; Pang, Y. Genome-Wide Identification and Characterization of APETALA2/Ethylene-Responsive Element Binding Factor Superfamily Genes in Soybean Seed Development. Front. Plant Sci. 2020, 11, 566647.
  69. Zhang, M.; Dong, R.; Huang, P.; Lu, M.; Feng, X.; Fu, Y.; Zhang, X. Novel Seed Size: A Novel Seed-Developing Gene in Glycine max. Int. J. Mol. Sci. 2023, 24, 4189.
  70. Tang, X.; Su, T.; Han, M.; Wei, L.; Wang, W.; Yu, Z.; Xue, Y.; Wei, H.; Du, Y.; Greiner, S.; et al. Suppression of extracellular invertase inhibitor gene expression improves seed weight in soybean (Glycine max). J. Exp. Bot. 2016, 68, 469–482.
  71. Hu, Y.; Liu, Y.; Tao, J.-J.; Lu, L.; Jiang, Z.-H.; Wei, J.-J.; Wu, C.-M.; Yin, C.-C.; Li, W.; Bi, Y.-D.; et al. GmJAZ3 interacts with GmRR18a and GmMYC2a to regulate seed traits in soybean. J. Integr. Plant Biol. 2023, 65, 1983–2000.
  72. Wang, X.; Li, Y.; Zhang, H.; Sun, G.; Zhang, W.; Qiu, L. Evolution and association analysis of GmCYP78A10 gene with seed size/weight and pod number in soybean. Mol. Biol. Rep. 2015, 42, 489–496.
  73. Singh, A.K.; Fu, D.-Q.; El-Habbak, M.; Navarre, D.; Ghabrial, S.; Kachroo, A. Silencing Genes Encoding Omega-3 Fatty Acid Desaturase Alters Seed Size and Accumulation of Bean pod mottle virus in Soybean. Mol. Plant Microbe Interact. 2011, 24, 506–515.
  74. Wang, S.; Liu, S.; Wang, J.; Yokosho, K.; Zhou, B.; Yu, Y.-C.; Liu, Z.; Frommer, W.B.; Ma, J.F.; Chen, L.-Q.; et al. Simultaneous changes in seed size, oil content and protein content driven by selection of SWEET homologues during soybean domestication. Natl. Sci. Rev. 2020, 7, 1776–1786.
  75. Hu, Y.; Liu, Y.; Lu, L.; Tao, J.-J.; Cheng, T.; Jin, M.; Wang, Z.-Y.; Wei, J.-J.; Jiang, Z.-H.; Sun, W.-C.; et al. Global analysis of seed transcriptomes reveals a novel PLATZ regulator for seed size and weight control in soybean. New Phytol. 2023, 240, 2436–2454.
  76. Zhu, W.; Yang, C.; Yong, B.; Wang, Y.; Li, B.; Gu, Y.; Wei, S.; An, Z.; Sun, W.; Qiu, L.; et al. An enhancing effect attributed to a nonsynonymous mutation in SOYBEAN SEED SIZE 1, a SPINDLY-like gene, is exploited in soybean domestication and improvement. New Phytol. 2022, 236, 1375–1392.
  77. Zhao, B.; Dai, A.; Wei, H.; Yang, S.; Wang, B.; Jiang, N.; Feng, X. Arabidopsis KLU homologue GmCYP78A72 regulates seed size in soybean. Plant Mol. Biol. 2016, 90, 33–47.
  78. Yu, L.; Liu, Y.; Zeng, S.; Yan, J.; Wang, E.; Luo, L. Expression of a novel PSK-encoding gene from soybean improves seed growth and yield in transgenic plants. Planta 2019, 249, 1239–1250.
  79. Ge, L.; Yu, J.; Wang, H.; Luth, D.; Bai, G.; Wang, K.; Chen, R. Increasing seed size and quality by manipulating BIG SEEDS1 in legume species. Proc. Natl. Acad. Sci. USA 2016, 113, 12414–12419.
  80. Zhang, Y.; Zhang, Y.-J.; Yang, B.-J.; Yu, X.-X.; Wang, D.; Zu, S.-H.; Xue, H.-W.; Lin, W.-H. Functional characterization of GmBZL2 (AtBZR1 like gene) reveals the conserved BR signaling regulation in Glycine max. Sci. Rep. 2016, 6, 31134.
  81. Collins, C.; Dewitte, W.; Murray, J.A.H. D-type cyclins control cell division and developmental rate during Arabidopsis seed development. J. Exp. Bot. 2012, 63, 3571–3586.
  82. Li, N.; Li, Y. Maternal control of seed size in plants. J. Exp. Bot. 2015, 66, 1087–1097.
  83. Di Marzo, M.; Herrera-Ubaldo, H.; Caporali, E.; Novak, O.; Strnad, M.; Balanza, V.; Ezquer, I.; Mendes, M.A.; de Folter, S.; Colombo, L. SEEDSTICK Controls Arabidopsis Fruit Size by Regulating Cytokinin Levels and FRUITFULL. Cell Rep. 2020, 30, 2846–2857.e2843.
  84. Zhang, C.; Wei, L.; Wang, W.; Qi, W.; Cao, Z.; Li, H.; Bao, M.; He, Y. Identification, characterization and functional analysis of AGAMOUS subfamily genes associated with floral organs and seed development in Marigold (Tagetes erecta). BMC Plant Biol. 2020, 20, 439.
Figure 1. The seed trait phenotypes of KBL, SUZU, F4, and F5 generation progenies. (A,D) Seed length. (B,E) Seed width. (C,F) Seed thickness.
Figure 2. Frequency distributions of four seed traits in the F4 and F5 populations. (A–D) F4 population: (A) seed length (SL), (B) seed width (SW), (C) seed thickness (ST), and (D) hundred-seed weight (HSW). (E–H) F5 population: (E) SL, (F) SW, (G) ST, and (H) HSW. Brown and blue vertical dashed lines represent the phenotypic values of KBL and SUZU, respectively.
Figure 3. The distribution of G′ values and delta-SNP index detected by BSA-seq analysis in the F4 and F5 populations. (A–D) BSA-seq analysis results for seed length, seed width, seed thickness, and hundred-seed weight, respectively. The blue and red curves represent F4 and F5, respectively, and the cyan frames denote the regions of the three stable QTLs qSS4-1, qSW14-1, and qSS20-1.
Figure 4. Linkage mapping of the qSS4-1, qSW14-1, and qSS20-1 loci. (A) QTL mapping of qSS4-1 in the F5 population. The red, bottle-green, and blue curves represent seed length, seed width, and hundred-seed weight, respectively. (B) QTL mapping of qSW14-1 in the F5 population. The bottle-green curve represents seed width. (C) QTL mapping of qSS20-1 in the F5 population. The red, bottle-green, blue, and cyan curves represent seed length, seed width, hundred-seed weight, and seed thickness, respectively. The black dashed line denotes the LOD threshold (LOD = 3).
Figure 5. Heatmap of candidate gene expression in different tissues. Read counts were log2-transformed and scaled across tissues. * denotes DEGs.
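The transformation described in the Figure 5 caption can be sketched as follows. This is a minimal illustration, assuming a pseudocount of 1 before the log2 transform and per-gene z-score scaling with the sample standard deviation; neither choice is stated in the caption, and the function name `scale_row` and the example counts are hypothetical.

```python
from math import log2
from statistics import mean, stdev


def scale_row(counts, pseudocount=1.0):
    """Log2-transform one gene's read counts, then z-score them across tissues.

    The pseudocount (an assumption here) avoids log2(0) for unexpressed genes.
    """
    logged = [log2(c + pseudocount) for c in counts]
    mu = mean(logged)
    sd = stdev(logged)  # sample standard deviation (assumed)
    if sd == 0:
        # A gene with identical values everywhere carries no contrast.
        return [0.0 for _ in logged]
    return [(x - mu) / sd for x in logged]


# Hypothetical read counts for one gene across three samples.
print(scale_row([1, 3, 7]))  # → [-1.0, 0.0, 1.0]
```

Scaling each gene separately makes the heatmap colors comparable across genes with very different absolute expression levels, at the cost of hiding those absolute levels.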
Table 1. Summary of linkage mapping results for qSS4-1, qSW14-1, and qSS20-1.
| QTL | Chr | Traits | LOD | PVE (%) a | Left Marker | Right Marker | Position (bp) b |
|---|---|---|---|---|---|---|---|
| qSS4-1 | 4 | SL | 7.37 | 15.04 | Caps4-1 | Caps4-2 | 3,294,059–4,511,350 |
| | | SW | 6.14 | 12.51 | Caps4-1 | Caps4-2 | 3,294,059–4,511,350 |
| | | HSW | 5.24 | 10.75 | Caps4-1 | Caps4-2 | 3,294,059–4,511,350 |
| qSS20-1 | 20 | SL | 10.37 | 19.87 | Marker20-4 | Marker20-5 | 37,158,491–38,379,617 |
| | | SW | 4.04 | 13.73 | Marker20-4 | Marker20-5 | 37,158,491–38,379,617 |
| | | ST | 3.33 | 6.86 | Marker20-4 | Marker20-5 | 37,158,491–38,379,617 |
| | | HSW | 10.39 | 19.84 | Marker20-4 | Marker20-5 | 37,158,491–38,379,617 |
| qSW14-1 | 14 | SW | 20.318.64 | | Marker14-4 | Marker14-6 | 14,344,083–28,212,214 |
a PVE, phenotypic variation explained. b Position defined by the flanking markers of the 95% confidence intervals.
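As a small worked example, the physical size of each interval in Table 1 follows directly from the flanking-marker positions, and a LOD score can be converted to the equivalent likelihood-ratio statistic via LRT = 2 ln(10) × LOD. The sketch below uses only the positions listed in Table 1; the helper names `width_mb` and `lod_to_lrt` are illustrative.

```python
from math import log

# Flanking-marker positions (bp) as listed in Table 1.
intervals = {
    "qSS4-1": (3_294_059, 4_511_350),
    "qSS20-1": (37_158_491, 38_379_617),
    "qSW14-1": (14_344_083, 28_212_214),
}


def width_mb(start, end):
    """Physical interval size in megabases."""
    return (end - start) / 1_000_000


def lod_to_lrt(lod):
    """Convert a LOD score (log10 likelihood ratio) to 2 * ln(likelihood ratio)."""
    return 2 * log(10) * lod


for name, (start, end) in intervals.items():
    print(f"{name}: {width_mb(start, end):.2f} Mb")
# qSS4-1: 1.22 Mb
# qSS20-1: 1.22 Mb
# qSW14-1: 13.87 Mb

# The LOD = 3 threshold corresponds to a likelihood ratio of 10**3 = 1000.
print(f"LRT at LOD 3: {lod_to_lrt(3):.2f}")  # LRT at LOD 3: 13.82
```

The qSW14-1 interval is roughly ten times larger than the other two, so narrowing it would likely benefit most from additional markers or recombinants.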
Table 2. Candidate genes within the intervals of qSS4-1, qSW14-1, and qSS20-1.
| QTL | ID | At Locus | Name a | DEG b | Variation |
|---|---|---|---|---|---|
| qSS4-1 | Glyma.04G042000 | AT4G34160 | CYCD3;1 | | |
| | Glyma.04G045500 | AT4G37630 | CYCD5;1 | R3 pods | |
| | Glyma.04G046600 | AT4G33800 | T16L1.290 | | |
| | Glyma.04G047900 | AT4G37750 | ANT | R5 seeds | |
| | Glyma.04G055600 | AT5G21482 | CKX7 | | |
| qSW14-1 | Glyma.14G128400 | AT1G15550 | GA3OX1 | NA | upstream/downstream |
| | Glyma.14G132500 | AT3G49600 | SUP32 | | intron/upstream/downstream |
| | Glyma.14G138700 | AT2G24400 | SAUR37 | NA | upstream/downstream |
| | Glyma.14G144200 | AT1G10010 | AAP8 | R5 seeds | intron |
| | Glyma.14G144400 | AT1G10010 | AAP8 | NA | upstream |
| | Glyma.14G144700 | AT1G10010 | AAP8 | | 3′-UTR/upstream/downstream |
| qSS20-1 | Glyma.20G135400 | AT1G70210 | CYCD1;1 | | intron/upstream/downstream |
| | Glyma.20G135700 | AT3G49780 | PSK4 | R3 pods | synonymous |
| | Glyma.20G136400 | AT5G60440 | AGL62 | NA | |
| | Glyma.20G136500 | AT5G60440 | AGL62 | NA | |
| | Glyma.20G136600 | AT5G60440 | AGL62 | | upstream |
| | Glyma.20G136700 | AT5G60440 | AGL62 | NA | synonymous/upstream/downstream |
| | Glyma.20G136800 | AT2G24840 | AGL61 | NA | missense (Asn-Ser)/intron/upstream/downstream |
a Gene name in Arabidopsis thaliana. b The tissues in which the candidate genes were differentially expressed between KBL and SUZU. NA stands for genes without detected expression in the three tissues.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, M.; Ding, X.; Zeng, Y.; Xie, G.; Yu, J.; Jin, M.; Liu, L.; Li, P.; Zhao, N.; Dong, Q.; et al. Identification of Multiple Genetic Loci and Candidate Genes Determining Seed Size and Weight in Soybean. Agronomy 2024, 14, 1957. https://doi.org/10.3390/agronomy14091957