Article

Integrated Assays of Genome-Wide Association Study, Multi-Omics Co-Localization, and Machine Learning Associated Calcium Signaling Genes with Oilseed Rape Resistance to Sclerotinia sclerotiorum

1
Key Laboratory of Biology and Ecological Control of Crop Pathogens and Insects of Zhejiang Province, Institute of Biotechnology, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou 310058, China
2
Centre of Analysis and Measurement, Zhejiang University, 866 Yu Hang Tang Road, Hangzhou 310058, China
3
Hainan Institute, Zhejiang University, Sanya 572025, China
*
Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2024, 25(13), 6932; https://doi.org/10.3390/ijms25136932
Submission received: 5 May 2024 / Revised: 20 June 2024 / Accepted: 20 June 2024 / Published: 25 June 2024
(This article belongs to the Special Issue New Advances in Plant-Fungal Interactions)

Abstract

Sclerotinia sclerotiorum (Ss) is one of the most devastating fungal pathogens, causing severe yield losses in multiple economically important crops including oilseed rape. Plant resistance to Ss is a quantitative disease resistance (QDR) controlled by multiple minor genes. Genome-wide identification of genes involved in QDR to Ss is yet to be conducted. In this study, we integrated several assays including genome-wide association study (GWAS), multi-omics co-localization, and machine learning prediction to identify, on a genome-wide scale, genes involved in the oilseed rape QDR to Ss. Employing GWAS and multi-omics co-localization, we identified seven resistance-associated loci (RALs) associated with oilseed rape resistance to Ss. Furthermore, we developed a machine learning algorithm and named it Integrative Multi-Omics Analysis and Machine Learning for Target Gene Prediction (iMAP), which integrates multi-omics data to rapidly predict disease resistance-related genes within a broad chromosomal region. Through iMAP based on the identified RALs, we revealed multiple calcium signaling genes related to the QDR to Ss. Population-level analysis of selective sweeps and haplotypes of variants confirmed the positive selection of the predicted calcium signaling genes during evolution. Overall, this study has developed an algorithm that integrates multi-omics data and machine learning methods, providing a powerful tool for predicting target genes associated with specific traits. Furthermore, it provides a basis for further understanding the role and mechanisms of calcium signaling genes in the QDR to Ss.

1. Introduction

Sclerotinia sclerotiorum (Ss) is a notorious plant pathogen capable of infecting over 700 species of monocotyledonous and dicotyledonous plants. It causes a wide range of crop diseases, including white mold, watery soft rot, and Sclerotinia stem rot (SSR) [1,2]. These diseases pose a substantial and widespread threat to crop production, leading to significant economic losses on a global scale. Among the affected crops is oilseed rape (Brassica napus), which holds a prominent position as one of the world’s most vital oilseed crops. For instance, in temperate climates, Ss infections can lead to crop yield reductions of 80–100% [3,4]. Ss presents a major threat due to its ability to infect diverse plant species and the absence of effective host resistance mechanisms, thereby lacking stable resistant cultivars. Consequently, understanding the molecular mechanisms of plant resistance to Ss and developing resilient cultivars have become critical objectives in agricultural research. Quantitative disease resistance (QDR) has emerged as a key component in the defense against Ss in oilseed rape and other crops. QDR involves the cumulative effects of multiple quantitative trait loci (QTLs) that collectively contribute to resistance [5,6]. The complexity and diversity of QDR mechanisms necessitate further investigation into its genetic foundations and evolutionary aspects. Unraveling the molecular architecture of QDR reveals an intricate network that integrates multiple response pathways, incorporating various pathogen molecular determinants and environmental cues [7,8]. Multiple studies in the past decade have identified Ss resistance-related QTLs across various chromosomes. 
These include QTL SRC6 on chromosome C06 containing the candidate gene BnaC.IGMT5.a belonging to the monolignol biosynthetic gene family [9]; QTL DSRC4 on chromosome C04 with two tau class glutathione S-transferase (GSTU) genes GSTU3 and GSTU4 [10]; and another QTL carrying a GSTU gene cluster on chromosome C06 [11].
In addition to QTL mapping, genome-wide association studies (GWAS) have been employed to identify genes involved in QDR to Ss. The QTLs predicted by parental linkage analysis are typically restricted to the differences between specific parental lines, with limited capacity to explore diversity in larger and more diverse populations [12]. In contrast, GWAS based on population structure can capture a broader range of genetic variation, aiding the understanding of the polygenic background underlying complex quantitative traits. GWAS, employing high-density single nucleotide polymorphism (SNP) markers, allows for more precise gene localization, thereby facilitating the identification of individual loci associated with quantitative resistance [13]. Using GWAS, BnaA08g25340D (BnMLO2_2) and BnaC07g35650D (BnGLIP1) were identified to be associated with SSR resistance in B. napus, which was validated by Arabidopsis mutant inoculation assays [6,14]. In summary, only a very limited number of genes associated with QDR to Ss have been identified using QTL and GWAS assays. Further efforts are required to identify more important loci and genes associated with QDR to Ss.
GWAS and post-GWAS approaches have been developed to predict genome-wide loci and genes associated with a trait of interest. For GWAS, in addition to single-SNP-based GWAS (single-SNP GWAS), haplotype-based GWAS (HAP-GWAS) was developed to better capture long-range linkages [15,16]. Moreover, post-GWAS technologies such as omics-wide association studies (OWAS) have been continuously developed to obtain more accurate and reliable genes of interest [17]. These include epigenome-wide association studies (EWAS) for epigenomics [18], transcriptome-wide association studies (TWAS) for transcriptomics [19], and metabolome-wide association studies (mGWAS) for metabolomics data [20]. Weighted gene co-expression network analysis (WGCNA) and expression quantitative trait nucleotide (eQTN) co-localization can also provide more accurate predictions [21]. Composite resequencing-based GWAS combines conventional GWAS with rare allele testing, functional prediction, and prior knowledge [22]. Integrated assays of various GWAS and post-GWAS approaches should result in more accurate and reliable predictions, especially for complex traits such as QDR.
Machine learning (ML) methods have emerged as powerful tools for handling and analyzing high-dimensional datasets to capture nonlinear relationships within genotypes [23,24]. In recent years, an increasing number of studies have utilized ML in the identification of phenotypes-associated target genes [25,26,27]. The algorithms QTG-Finder and QTG-Finder2 have been developed based on Random Forest, trained using known causal genes from different species, and utilize features such as polymorphism, functional annotation, and co-functional networks to prioritize genes within QTLs [28,29]. QTG-Finder does not test GWAS results but predicts genes within known QTLs. The important features for identifying causal genes through QTL mapping may differ from those identified through GWAS. QTL mapping tends to identify large-effect alleles in protein-coding regions, while GWAS tends to identify common alleles with larger effect sizes in both protein-coding and non-coding regions [30]. In oilseed rape, the POCKET algorithm [31] has been developed to predict target genes associated with seed oil content by integrating multi-omics features, including TWAS results. To date, no algorithm is available that efficiently integrates multiple post-GWAS results to rapidly predict disease-resistant genes within resistance-associated loci (RALs), which are specific regions on the chromosome containing a large number of SNPs associated with disease resistance. Current research relies partially on specific features that may not fully explain the variations identified by GWAS, thereby limiting their applicability across different traits. Novel algorithms capable of effectively integrating a broader range of omics information to rapidly predict target genes associated with QDR within larger regions remain to be developed.
Cellular calcium ion concentration ([Ca2+]) serves as a ubiquitous second messenger, widely present from prokaryotes to eukaryotes, and plays a crucial role in plant growth and development, and in biotic and abiotic stress responses [32,33,34]. The regulation of calcium flux is accomplished by calcium channels and pumps, such as glutamate receptor-like channels (GLR), cyclic nucleotide-gated channels (CNGC), and Ca2+/H+ exchangers (CAX) [33,35]. Calcium sensors, including calmodulin (CaM), calmodulin-like proteins (CML), calcium-dependent protein kinases (CDPK), and calcineurin B-like proteins (CBL), perceive changes in intracellular calcium concentration and activate downstream kinases. These kinases phosphorylate regulatory proteins, such as transcription factors or transporters/channels, thereby directly modulating gene expression or transporter/channel activity, leading to stress tolerance, plant adaptation, and other phenotypic responses [36,37]. Which calcium signaling genes are involved in the QDR to Ss remains unclear.
This study aims to identify on a genome-wide scale the genes associated with the QDR to Ss in oilseed rape, providing a molecular basis for breeding resistant varieties. We identified 48 RALs associated with resistance to Ss through single-SNP GWAS, and co-localized seven highly correlated RALs associated with this resistance employing HAP-GWAS in conjunction with WGCNA and RNA-Seq. Furthermore, we developed Integrated Multi-Omics Analysis and Machine Learning for Target Gene Prediction (iMAP), a machine learning algorithm based on Random Forest (RF), incorporating multi-omics features, to predict optimal target genes associated with Ss resistance. Consequently, we successfully identified a set of calcium signaling genes exhibiting evolutionary selection and breeding potential for resistance to Ss.

2. Results

2.1. Optimization for Improved Single-SNP GWAS for SSR Resistance in Oilseed Rape

To evaluate the field resistance of oilseed rape resources to Ss, we conducted two-year field inoculation analyses for 300 oilseed rape accessions in Changxing, China in 2021 and 2022. The length of stem lesions (LL) and the corresponding stem circumference (SC) were measured two weeks after stem inoculation with Ss mycelial plugs. Although there were slight differences in the maximum, minimum, and median values of lesion lengths between the two years, the overall trend was consistent. The broad-sense heritability (h2) of lesion length was 90.74%, indicating that the stem resistance of the accessions to Ss is genetically stable and independent of the environment (Figure 1A). Consequently, two-year consistently resistant (e.g., R4762 and R4572) and susceptible (e.g., R4385 and R4665) rapeseed germplasm collections were identified (Figure 1B). Furthermore, the lesion length data for both years followed a normal distribution. Correlation analysis between stem circumference and lesion length revealed a weak negative relationship (Figure 1C,D). GWAS was performed based on the reported single-SNP data for the collected 300 accessions [38]; the distribution of SNPs across all chromosomes of oilseed rape is illustrated in Figure S1. Linkage disequilibrium (LD) decay analysis was performed, and a distance of 18.674 kb, where r2 decreased by half, was selected (Figure 1E). Three GWAS models (MLM, GLM, and FarmCPU) were compared using the R package rMVP v1.0.0 [39]. The results for the two-year data showed that the GLM model demonstrated better control over false positives and false negatives, making it the most suitable model for this experiment (Figure 1F,G). Fifty principal components (PCs) were calculated using Plink v1.9, and significance tests were performed using EIGENSOFT v6.0.1. After conducting the significance tests, the first 16 highly significant PCs were selected as covariates (Table S1). 
The inclusion of kinship (K), Principal Components Analysis (PCA), SC, and flowering time (FT) significantly reduced false positives (Figure 1H,I). Finally, the GLM model, incorporating K, PCA, SC, and FT, was considered the optimal approach for conducting single-SNP-based GWAS (single-SNP GWAS) for candidate resistance gene identification.
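The LD half-decay distance mentioned above can be illustrated with a minimal sketch. It assumes the decay curve is available as binned distances with mean r2 values (for example, exported from an LD decay tool) and defines the half-decay point as the first distance at which mean r2 falls to half of its maximum, found by linear interpolation. The function name and the toy values are hypothetical and are not taken from the study.

```python
def half_decay_distance(distances_kb, mean_r2):
    """Return the distance (kb) at which mean r2 first falls to half of its
    maximum, using linear interpolation between adjacent bins.
    Returns None if the curve never drops that far."""
    if len(distances_kb) != len(mean_r2) or not distances_kb:
        raise ValueError("inputs must be non-empty and of equal length")
    target = max(mean_r2) / 2.0
    for i in range(1, len(mean_r2)):
        d0, d1 = distances_kb[i - 1], distances_kb[i]
        r0, r1 = mean_r2[i - 1], mean_r2[i]
        # The target lies between two consecutive bins of a decreasing curve.
        if r0 >= target >= r1 and r0 != r1:
            return d0 + (r0 - target) * (d1 - d0) / (r0 - r1)
    return None


# Toy decay curve (made-up values, not the study's data).
toy_distances = [0, 10, 20, 30, 40]
toy_r2 = [0.8, 0.6, 0.4, 0.3, 0.25]
toy_half = half_decay_distance(toy_distances, toy_r2)
```

In practice the resulting distance is often used as a flanking window when assigning genes to associated SNPs; whether the study used it in exactly that way is not stated here.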

2.2. Identification of Resistance-Associated Loci and Candidate Genes by Optimized Single-SNP GWAS

The single-SNP GWAS for the SSR in oilseed rape was performed using our optimized parameters described above. We filtered significant SNPs from the single-SNP GWAS results (−log10(p_value) > 5) and consequently identified 48 resistance-associated loci (RALs) associated with resistance to Ss (with at least three significant SNPs per 18 Mb) by referring to the range of previously identified Ss resistance-associated quantitative trait loci (QTLs) in existing studies. Among these 48 RALs, 15 overlapped with previously reported QTLs associated with Ss resistance, while the remaining 33 were newly discovered (Table 1). The Ss-associated RALs were found to be distributed across multiple chromosomes, including A03, A06, C05, C07, Ann_random, and Cnn_random. Each of these chromosomes contained at least three RALs associated with resistance to Ss (Figure 2A and Figure S2). The gene ontology (GO) database was used to annotate and enrich the functional characteristics of the genes within the 48 Ss-associated RALs (Figure 2B; Table S2). The analysis revealed that many genes were enriched in various biological processes, including obsolete oxidation-reduction process, protein phosphorylation, transmembrane transport, and the regulation of DNA-templated transcription. In terms of cellular components, the majority of genes were located in the membrane, while others were distributed in intracellular anatomical structures, ribosomes, and the nucleus. Regarding molecular functions, several genes were significantly enriched in various functional categories, such as protein binding, ATP binding, DNA binding, protein kinase activity, catalytic activity, calcium ion binding, metal ion binding, and nucleic acid binding. 
Further analysis of gene structure and enrichment using the IPR database (Figure 2C and Figure S3A; Table S3) and ProSitePatterns database (Figure 2D and Figure S3B; Table S4) revealed that many genes contained structural domains associated with calcium ion binding, such as the EF-hand domain, EF-hand domain pair, and EF-Hand 1, calcium-binding sites. Additionally, some genes contained structural domains related to protein kinase activity, such as Serine/Threonine protein kinases active-site signature and Protein kinases ATP-binding region signature. In summary, the gene functional annotation and enrichment analysis of the 48 SSR resistance-associated RALs revealed the potentially important roles of these genes in the QDR to Ss.
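The locus-calling step described in this section (keep SNPs with −log10(p) > 5, then require at least three significant SNPs within a window) could be sketched as follows. This is an illustrative greedy clustering, not the authors' procedure: the function name, the input layout, and the default window size are assumptions, and the window should be set to whatever region size the analysis actually uses.

```python
import math
from collections import defaultdict


def call_rals(snps, min_logp=5.0, window_bp=1_000_000, min_snps=3):
    """Group significant SNPs into candidate loci.

    snps: iterable of (chrom, pos, p_value).
    Returns a list of (chrom, start, end, n_significant_snps).
    window_bp is a hypothetical default, not a value from the study.
    """
    by_chrom = defaultdict(list)
    for chrom, pos, p in snps:
        if p > 0 and -math.log10(p) > min_logp:
            by_chrom[chrom].append(pos)

    rals = []
    for chrom in sorted(by_chrom):
        positions = sorted(by_chrom[chrom])
        start = prev = positions[0]
        count = 1
        for pos in positions[1:]:
            if pos - start <= window_bp:
                count += 1  # still inside the current window
            else:
                if count >= min_snps:
                    rals.append((chrom, start, prev, count))
                start, count = pos, 1  # open a new window at this SNP
            prev = pos
        if count >= min_snps:
            rals.append((chrom, start, prev, count))
    return rals


# Toy input (made-up positions and p-values).
toy_snps = [
    ("A06", 100, 1e-6), ("A06", 500, 1e-7), ("A06", 900, 1e-6),
    ("A06", 5_000_000, 1e-8), ("C05", 10, 1e-3),
]
toy_rals = call_rals(toy_snps, window_bp=1000)
```

Anchoring each window at its first significant SNP keeps the sketch short; a sliding or merged-window definition would give slightly different locus boundaries.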

2.3. Identification of RALs and Genes by Integrated Assays of Single-SNP GWAS, Hap-GWAS, WGCNA, and DEGs

Haplotype-based GWAS (HAP-GWAS) has been considered a better predictor of reliable RALs [15,16]. Therefore, Hap-GWAS was further used to analyze the two-year stem inoculation results in oilseed rape (Figure 3A). It identified 11 RALs that overlapped with the results from the single-SNP GWAS. Moreover, weighted gene co-expression network analysis (WGCNA) was performed on RNA-Seq data obtained from susceptible and resistant rapeseed germplasm accessions following stem inoculation with Ss (NCBI Sequence Read Archive, accession no. SRP053361) (Figure S4). The analysis revealed 13 significantly upregulated modules (Figure 3B), and within these modules, 30 RALs were co-located with the results from single-SNP GWAS. Similarly, the same batch of RNA-Seq data [10] was used to analyze differentially expressed genes (DEGs) at three time points: 24 hpi, 48 hpi, and 96 hpi. A total of 4470 genes were co-located (Figure 3C), indicating their potential significance as candidate genes associated with resistance to Ss. By intersecting the significant disease-resistant genes from single-SNP GWAS, Hap-GWAS, WGCNA, and DEGs, a total of 7 RALs and 110 potential target genes were identified (Figure 3D,E). The specific locations of these RALs on the chromosomes are illustrated in Figure 3F. Interestingly, three RALs were found on chromosome A06, appearing in all of the GWAS, WGCNA, and DEG analyses. This suggests that chromosome A06 may play a crucial regulatory role in resistance to Ss in oilseed rape. Consequently, A06 became the focal point for further in-depth investigation.
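The co-localization logic in this section amounts to intersecting the candidate gene sets produced by each analysis. A minimal sketch is given below, assuming each analysis has already been reduced to a set of gene identifiers; the function name, dictionary keys, and gene IDs are placeholders rather than data from the study.

```python
from collections import Counter


def colocalize(evidence):
    """evidence: dict mapping analysis name -> iterable of gene IDs.

    Returns (shared, support), where shared is the set of genes reported by
    every analysis and support counts how many analyses report each gene.
    """
    if not evidence:
        return set(), Counter()
    gene_sets = [set(genes) for genes in evidence.values()]
    support = Counter(gene for genes in gene_sets for gene in genes)
    shared = set.intersection(*gene_sets)
    return shared, support


# Placeholder gene IDs for illustration only.
toy_evidence = {
    "single_snp_gwas": {"g1", "g2", "g3"},
    "hap_gwas": {"g2", "g3"},
    "wgcna": {"g2", "g3", "g4"},
    "degs": {"g3", "g5"},
}
toy_shared, toy_support = colocalize(toy_evidence)
```

Keeping the per-gene support count alongside the strict intersection makes it easy to inspect genes that are supported by most, but not all, lines of evidence.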

2.4. iMAP Predicts the Involvement of Calcium Signaling Genes in Resistance to Ss

To better predict candidate genes related to resistance against Ss within RALs, we collected diverse features and constructed a forward training set to develop a machine learning algorithm, which is named here Integrated Multi-Omics Analysis and Machine Learning for Target Gene Prediction (iMAP). This algorithm combines Principal Component Analysis (PCA) and Random Forest (RF) to achieve accurate predictions. Specifically, we use the dimensionality reduction technique PCA (Figure 4A), which transforms high-dimensional data into a lower-dimensional space while retaining the most important information [44]. RF is a powerful ensemble learning method that combines multiple decision trees (Figure 4B) to improve the accuracy and robustness of predictions [45]. To validate the effectiveness of the RF algorithm, we compared it with Logistic Regression (LR), Support Vector Machine (SVM), eXtreme Gradient Boosting (XGBoost), and Neural Network (NN) algorithms. When using only single-SNP GWAS as features, based on the confusion matrix analysis (Figure 4C,D), RF exhibited the highest accuracy (0.78), while SVM, LR, NN, and XGBoost achieved accuracies of 0.58, 0.58, 0.58, and 0.60, respectively. In terms of precision, RF had the highest value (0.82), indicating that 82% of the predicted positive samples were true positives, while that of the remaining algorithms was lower than 0.71. The recall rate, which measures the model’s ability to correctly identify positive samples, was relatively high for RF and SVM at 0.59. The F1 score, a comprehensive performance metric that combines precision and recall, was highest for RF (0.69), while LR, NN, XGBoost, and SVM had F1 scores 0.01, 0.04, 0.11, and 0.54, respectively. RF, LR, and XGBoost can all perform fast computations with a prediction time below 0.1 s, whereas SVM had the longest prediction time (2.61 s) (Figure 4D). 
In summary, RF exhibited superior performance in terms of accuracy, precision, recall, and F1 score, along with faster prediction. Therefore, RF appears to be the best-performing algorithm on the given dataset. Furthermore, we conducted tests by expanding the feature set beyond single-SNP GWAS (Figure 4E–J and Figure S5A,B). The addition of HAP-GWAS, gene function (GF), and WGCNA features improved the performance of most classification models. Based on F1 score, precision, recall, and accuracy, the performance of the single-SNP GWAS + HAP-GWAS + GF + WGCNA data combination significantly outperformed that of single-SNP GWAS and single-SNP GWAS + HAP-GWAS + GF combinations. Among all the data combinations, RF consistently performed the best, achieving the highest F1 scores, precision, recall, and accuracy. This indicates that RF possesses strong classification capabilities for handling multiple data combinations.
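For readers less familiar with the metrics compared above, the sketch below shows how accuracy, precision, recall, and F1 follow from a binary confusion matrix, and how F1 can be recomputed from a precision and recall pair. The function names and the toy counts are illustrative assumptions; only the RF precision (0.82) and recall (0.59) are taken from the text.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard metrics from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = f1_from(precision, recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy counts (made up) and a consistency check on the reported RF values.
toy_metrics = classification_metrics(tp=8, fp=2, fn=4, tn=6)
rf_f1 = round(f1_from(0.82, 0.59), 2)
```

With the reported RF precision and recall, the harmonic mean rounds to 0.69, which agrees with the F1 score stated for RF.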
Finally, using the RF model and the single-SNP GWAS + HAP-GWAS + GF + WGCNA dataset, we predicted seven calcium signaling genes within three key RALs on chromosome A06, namely CIPK17 (BnaA06g03950D), SLP2 (BnaA06g12600D), CPK4 (BnaA06g15970D), CML15 (BnaA06g12600D), CML44 (BnaA06g15280D), IQD30 (BnaA06g13020D), IQD32 (BnaA06g14070D) (Table 2). These results suggest the important roles of calcium signaling pathways in the QDR to Ss and demonstrate the powerful performance of iMAP in precise prediction of the target genes.

2.5. Positive Selection of Multiple Calcium Signaling Genes in the Population Evolution for Ss Resistance in Oilseed Rape

To further investigate whether the seven calcium signaling genes predicted by iMAP on chromosome A06 have undergone positive selection during the evolution of the oilseed rape population, we conducted nucleotide diversity ratio (π ratio) and Tajima’s D statistical analysis on the A06 chromosome in resistant and susceptible subpopulations (Figure 5A). The π ratio values of the genes ranged from 1.20 to 2.56, indicating a higher level of genetic variation between the resistant and susceptible subpopulations. Tajima’s D values ranged from 1.88 to 5.36, potentially indicating signs of non-neutral evolution, possibly due to positive selection. BnaA06g15970D and BnaA06g13020D showed higher fixation index (Fst) values of 0.20 and 0.15, respectively, while BnaA06g12600D, BnaA06g12660D, BnaA06g14070D, and BnaA06g15280D exhibited moderate Fst values. These findings imply that these genes have experienced positive selection between the resistant and susceptible subpopulations (Figure 5B). Furthermore, we performed an r2 analysis of SNP loci within the gene regions (Figure 5C) and compared lesion lengths among three different genotypes (no mutation, single and double nucleotide mutations) of selected SNPs to identify the optimal genotypes for disease resistance. Apart from BnaA06g03950D, which showed comparable lesion lengths across the three haplotypes, the other six genes (BnaA06g12600D, BnaA06g12660D, BnaA06g13020D, BnaA06g14070D, BnaA06g15280D, and BnaA06g15970D) exhibited significant disparities in lesion length among the genotypes. Additionally, there were significant differences between single and double nucleotide mutations in the SNP loci of these six genes, indicating that these mutations may gradually contribute to changes in disease resistance among germplasm accessions (Figure 5D,E).
In summary, the predicted calcium signaling genes have undergone positive selection during the evolution of the oilseed rape population and may be associated with the evolution of disease resistance in oilseed rape.
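As a rough illustration of the π ratio used above, the sketch below computes nucleotide diversity as the average number of pairwise differences per site within a group of aligned haplotypes, then takes the ratio between two groups. This is a simplified assumption: population-genetics tools typically compute π in sliding windows from variant calls, and the direction of the ratio (which subpopulation is the numerator) is not specified in the text. The function names and sequences are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes):
    """Average pairwise differences per site among equal-length sequences."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two haplotypes are required")
    length = len(haplotypes[0])
    if length == 0 or any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must be non-empty and of equal length")
    total_diffs = sum(
        sum(a != b for a, b in zip(h1, h2))
        for h1, h2 in combinations(haplotypes, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)


def pi_ratio(group_a, group_b):
    """Ratio pi(group_a) / pi(group_b); infinite if group_b has no diversity."""
    pi_b = nucleotide_diversity(group_b)
    if pi_b == 0:
        return float("inf")
    return nucleotide_diversity(group_a) / pi_b


# Toy aligned haplotypes (made up).
toy_group_a = ["AAAA", "AATA", "TATA"]
toy_group_b = ["AAAA", "AAAA", "AATA"]
toy_ratio = pi_ratio(toy_group_a, toy_group_b)
```

A ratio well above 1 means the first group retains more diversity than the second in that region, which is the kind of contrast a selective-sweep scan looks for.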

3. Discussion

SSR caused by the necrotrophic fungal pathogen Sclerotinia sclerotiorum is an economically important disease in oilseed rape [1,46]. However, resistance to SSR is a complex quantitative disease resistance (QDR), characterized by subtle cumulative and partially dominant effects [11,47]. In contrast to typical resistance mediated by single R genes, QDR is controlled by the complex interaction of multiple genes, involving multiple loci and genetic factors, potentially associated with different immune response pathways. QDR exhibits a continuous spectrum of disease resistance phenotypes, indicating that different individuals may display varying levels of resistance, rather than a binary classification of resistant or susceptible [48,49]. GWAS based on linkage disequilibrium (LD) can provide more precise localization of RALs. Several important RALs associated with resistance to Ss have been identified on multiple chromosomes in oilseed rape. These RALs harbor genes involved in oxidative burst, lignin biosynthesis, and jasmonic acid (JA) pathways [11,50,51]. In this study, we identified a total of 48 RALs associated with resistance to Ss, of which 15 RALs were consistent with previous studies. In addition to the well-established A02 and C09 chromosomes, we observed the presence of overlapping RALs on eight other chromosomes, further confirming the repeatability and reliability of the RALs identified in this GWAS. Gene ontology (GO) annotation revealed a significant enrichment of genes associated with calcium ion binding and protein kinase activity, highlighting the potentially important role of the calcium signaling pathway in resistance to Ss.
GWAS often implicates large genomic regions when predicting QTLs. To address this limitation, several co-localization strategies that integrate multi-omics data were combined in this study. Machine learning techniques can integrate diverse data sources and perform feature selection, enabling the construction of predictive models for target gene prediction and the unraveling of complex associations between genotypes and phenotypes [52,53]. Random Forest is a powerful ensemble learning algorithm that combines multiple decision trees to create an accurate and robust model. It reduces overfitting, handles high-dimensional data, and improves prediction accuracy through the majority voting of individual decision trees [45,54]. Numerous studies have used machine learning and multi-omics data to identify genes associated with yield in economic crops [55]. However, algorithms that can extensively analyze multi-omics data while specifically targeting plant disease resistance, particularly QDR, are still lacking. In this study, we propose a novel approach that integrates multi-omics and machine learning techniques, iMAP, to gain deeper insights into the molecular mechanisms underlying plant disease resistance (Figure 6). The development of the iMAP algorithm has provided researchers with a powerful tool to rapidly rank and list potential candidate genes associated with specific traits within a large number of RAL regions. This lays the foundation for a deeper understanding of gene function and enables advancements in precision breeding and other research areas. Moreover, the algorithm is not limited to specific species or traits and can flexibly incorporate, integrate, and analyze various features based on different research objectives and data characteristics. 
It demonstrates good performance in terms of F1 score even with limited feature data, highlighting its wide range of potential applications.
In crop improvement, achieving high yields requires finding an appropriate balance between growth and defense, as immune activation often comes with high costs and compromises in growth and development, known as “growth-defense tradeoffs” [56]. Calcium ions (Ca2+) play a pivotal role as second messengers in various developmental and physiological processes in plants and have long been considered crucial in plant immune responses. While pattern recognition receptors (PRRs) and nucleotide-binding domain leucine-rich repeat proteins (NLRs) are activated by different receptors, their signaling cascades enhance a range of defense responses [31,57]. Recent studies have revealed the molecular functionality of at least some coiled-coil (CC) NLRs (CNLs) and RPW8-like NLRs (RNLs) as calcium-permeable cation channels, further highlighting the importance of calcium in defense mechanisms [58,59]. During the pattern-triggered immunity (PTI) process, BIK1 phosphorylates and activates the CNGC2-CNGC4 channels [60]. Simultaneously, these channels play a crucial role in maintaining intracellular calcium ion balance, preventing the excessive accumulation of cytoplasmic calcium ions, thereby affecting growth and development [61]. In summary, calcium ions play a pivotal role in plant immune responses by regulating multiple signaling pathways and gene expression, thereby influencing plant resistance against pathogens. Components such as calcium channels and calcium-dependent protein kinases are key players in these processes. The activation of plant immunity incurs energy costs and modifies hormone signaling, leading to a defense-growth trade-off [62]. Breeding high-quality economic crops necessitates achieving a delicate equilibrium between yield and pathogen resistance. Recent studies have emphasized the regulatory role of CAXs in intracellular calcium signaling and the attainment of growth-immunity balance [63]. 
Further investigation is needed to understand the molecular mechanisms underlying the involvement of calcium signaling in plant growth and disease response, which is crucial for improving crop disease resistance while maintaining optimal yield in modern agriculture. Through integrated assays of single-SNP GWAS, Hap-GWAS, WGCNA, and DEGs, we have identified three significant RALs associated with resistance to Ss on chromosome A06 in oilseed rape. Furthermore, using the iMAP algorithm, we predicted seven calcium signaling genes with high relevance to disease resistance: CIPK17 (BnaA06g03950D), SLP2 (BnaA06g12600D), CPK4 (BnaA06g15970D), CML15 (BnaA06g12600D), CML44 (BnaA06g15280D), IQD30 (BnaA06g13020D), and IQD32 (BnaA06g14070D). Some members of these gene families have already been identified in other crops for their crucial roles in resistance against different pathogens and the regulation of plant growth. CML8 in Arabidopsis positively regulates immune responses against Pseudomonas syringae associated with the salicylic acid (SA) signaling pathway [64]. In wheat, the overexpression of CIPK14 enhances broad-spectrum resistance against wheat stripe rust [65], and TaCIPK15-4A plays a positive role in wheat resistance against powdery mildew [66]. CBL-CIPK complexes play a role in seed germination and protect seeds and germinating seedlings from salt stress through the CBL5-CIPK8/CIPK24-SOS1 pathway [67]. Furthermore, we have previously found that some components of calcium signaling pathways are involved in plant resistance to Ss. These include the calcium signal generators guanylate cyclase (GC) [68] and CNGCs [69,70]; the Ca2+ sensors CaM2 and CaM6 [71] and CDPK, as well as CRK and Ca2+/CaM-dependent protein kinase (CCaMK) [72,73]; and the calcium signaling relay, the transcription factor CAMTA3 [74,75]. 
These results not only verify the prediction results in this study and thus demonstrate the power of the approaches developed in this study to identify the target genes, but also highlight the potentially pivotal roles of calcium signaling pathways in the QDR to Ss.
Nevertheless, further experiments in oilseed rape are required to confirm the functions and elucidate the mechanisms of these calcium signaling genes in resistance to Ss. These calcium signaling genes, which may regulate calcium ion concentrations and signaling pathways, modulate plant growth rhythms, nutrient allocation, and energy utilization to achieve an effective balance between growth and defense in response to diverse growth environments and biotic pressures. An in-depth investigation of the molecular mechanisms and regulatory networks involved in calcium signaling homeostasis can enhance crop adaptability, disease resistance, and yield stability, contributing to sustainable agriculture and food security.

4. Materials and Methods

4.1. Plant Cultivation and Field Inoculation

The oilseed rape (Brassica napus) germplasm accessions used in this study were sourced from the core germplasm as described [38]. A total of 300 oilseed rape accessions from 39 countries were cultivated in Changxing, China, during the years 2021 and 2022. The experiment consisted of three replicates, with each replicate containing more than 16 plants of each variety. Within each replicate, three randomly selected plants of each variety were inoculated with stem inoculation.
The Sclerotinia sclerotiorum strain UF-1 was cultured on potato dextrose agar (PDA) medium for 3 days at 23 °C. Plugs of 5 mm in diameter were taken from the outer edge of the mycelium and placed, mycelial side down, on the main stem. The plugs were secured with breathable 3M medical tape and cling film to maintain moisture. Lesion length on the main stem was measured at 7 days post inoculation (dpi) using a measuring scale. The circumference of the main stem at the site of lesion formation was also recorded at 7 dpi.

4.2. Data Quality Control and Single-SNP GWAS Analysis

The SNP database of 300 oilseed rape germplasm accessions was obtained from the BnaSNPDB website (https://bnapus-zju.com/bnasnpdb/, accessed on 1 June 2022) [76]. SNP calling was performed by mapping clean reads of each accession to the “Darmor-bzh” reference genome (B. napus v4.1 genome, http://www.genoscope.cns.fr/brassicanapus/data/, accessed on 5 June 2022). Quality control of SNPs was conducted using PLINK v1.9 [77]; SNPs were filtered based on genotype minor allele frequency (0.15), Hardy–Weinberg equilibrium (HWE) with a threshold of 1 × 10−5, and minor allele frequency (MAF) of 0.03. A total of 2,574,368 high-quality SNPs were obtained based on the “Darmor-bzh” reference genome. LD decay calculation and plotting were performed using the PopLDdecay software v3.42 [78]. The LD heatmap was generated using the BnaSNPDB website. Broad-sense heritability (h2) was calculated using the inti package in R v4.3.0. Principal component analysis (PCA) was conducted using the EIGENSOFT package v6.0.1 [79].
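As an illustration of the MAF criterion described above, the following minimal Python sketch computes the minor allele frequency of biallelic SNPs and keeps those at or above a threshold. It is not the PLINK implementation used in this study; the function names, the toy genotypes, and the genotype coding (alternate-allele counts 0, 1, or 2, with None for a missing call) are assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Sequence


def minor_allele_frequency(genotypes: Sequence[Optional[int]]) -> float:
    """Return the MAF of one biallelic SNP.

    Genotypes are alternate-allele counts (0, 1 or 2); None marks a missing call.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))  # two alleles per diploid call
    return min(alt_freq, 1.0 - alt_freq)


def filter_by_maf(snps: Dict[str, List[Optional[int]]],
                  threshold: float = 0.03) -> Dict[str, List[Optional[int]]]:
    """Keep SNPs whose MAF is at least `threshold` (0.03 as stated in the text)."""
    return {name: g for name, g in snps.items()
            if minor_allele_frequency(g) >= threshold}


# Hypothetical toy data: SNP name -> genotypes across accessions
toy_snps = {
    "snp_common": [0, 1, 2, 1, 0, None, 1, 0],
    "snp_rare": [0] * 99 + [1],    # alternate allele frequency 1/200
    "snp_flipped": [2, 2, 2, 1],   # alternate allele is the major allele
}
kept = filter_by_maf(toy_snps)
print(sorted(kept))
```

In this toy set, the rare SNP falls below the 0.03 threshold and is dropped, while the SNP whose alternate allele is the major allele is retained because the frequency of the less common allele is used.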
Single-SNP Genome-Wide Association Study (GWAS) was performed using the rMVP package v1.0.0 in R v4.3.0 [39]. Kinship was calculated using rMVP, and three models (GLM, MLM, FarmCPU) were compared. Manhattan plots and Q-Q plots were generated using rMVP to assess SNP associations and significance. Gene phenotype distribution plots and normality analyses were also conducted using rMVP. For gene matching analysis, a genomic region of approximately 20 kb surrounding the SNPs was selected.
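The gene matching step can be sketched as an interval-overlap query. The example below is a minimal illustration, not the procedure used in this study: the text does not state whether the 20 kb region is the total window or the flank on each side, so the `flank` parameter (here 10 kb on each side, 20 kb in total) is an assumption, as are the `Gene` type, the function name, and the gene coordinates.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    gene_id: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def genes_near_snp(genes: List[Gene], chrom: str, pos: int,
                   flank: int = 10_000) -> List[str]:
    """Return IDs of genes overlapping [pos - flank, pos + flank] on `chrom`."""
    lo, hi = max(1, pos - flank), pos + flank
    return [g.gene_id for g in genes
            if g.chrom == chrom and g.start <= hi and g.end >= lo]


# Hypothetical gene models, not taken from the Darmor-bzh annotation
genes = [
    Gene("geneA", "A06", 95_000, 97_500),
    Gene("geneB", "A06", 109_000, 112_000),  # partially overlaps the window
    Gene("geneC", "A06", 130_000, 131_000),  # outside the window
    Gene("geneD", "C03", 100_000, 101_000),  # different chromosome
]
print(genes_near_snp(genes, "A06", 100_000))
```

A gene is reported when any part of it overlaps the window, so a gene that only partially extends into the region is still matched.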

4.3. RAL Identification and Enrichment Analysis

Significant SNPs were selected from the single-SNP GWAS results using the threshold −log10(p_value) > 5. A region was defined as an RAL associated with resistance to Ss when it contained at least 3 significant SNPs within 18 Mb. The distribution of these RALs on the chromosomes was visualized using MG2C [80]. B. napus genes were searched against the Gene Ontology terms (https://geneontology.org/, accessed on 15 September 2022), IPR database (http://pir.georgetown.edu/iproclass/, accessed on 15 September 2022), and ProSitePatterns database (http://www.ebi.ac.uk/interpro/, accessed on 15 September 2022). Enrichment analysis was performed using the OmicStudio tools (https://www.omicstudio.cn/, accessed on 20 September 2022) [81].
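The RAL criterion above (at least 3 significant SNPs within 18 Mb) can be sketched as a simple grouping of significant SNPs along each chromosome. The code below is one possible interpretation, assuming a greedy left-to-right grouping anchored at the first SNP of each group; the text does not specify how windows were placed, and the function name `call_rals` and the toy p-values are hypothetical.

```python
import math
from collections import defaultdict


def call_rals(snps, min_logp=5.0, min_snps=3, window=18_000_000):
    """Group significant SNPs into candidate loci.

    snps: iterable of (chrom, pos, p_value).
    A locus is reported when at least `min_snps` significant SNPs lie within
    `window` bp of the first SNP of the group (greedy, left to right).
    Returns a list of (chrom, start, end, n_snps).
    """
    by_chrom = defaultdict(list)
    for chrom, pos, p in snps:
        if -math.log10(p) > min_logp:  # significance filter
            by_chrom[chrom].append(pos)

    loci = []
    for chrom in sorted(by_chrom):
        positions = sorted(by_chrom[chrom])
        i = 0
        while i < len(positions):
            j = i
            # extend the group while the next SNP stays inside the window
            while (j + 1 < len(positions)
                   and positions[j + 1] - positions[i] <= window):
                j += 1
            if j - i + 1 >= min_snps:
                loci.append((chrom, positions[i], positions[j], j - i + 1))
            i = j + 1
    return loci


# Hypothetical association results: (chromosome, position, p-value)
toy = [
    ("A06", 1_000_000, 1e-6), ("A06", 5_000_000, 1e-7),
    ("A06", 12_000_000, 1e-6), ("A06", 2_000_000, 1e-3),   # last one not significant
    ("C03", 3_000_000, 1e-6), ("C03", 30_000_000, 1e-6),
    ("C03", 55_000_000, 1e-6),                             # too far apart
]
print(call_rals(toy))
```

Here the three significant SNPs on A06 form one candidate locus, whereas the three on C03 are each more than 18 Mb apart and yield none.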

4.4. Hap-GWAS, WGCNA and DEGs Analysis

Hap-GWAS analysis was conducted using the R package RAINBOWR v0.1.36 [82], employing the parameters “window.size.half = 5” and “window.slide = 11”. For the WGCNA, RNA-seq data from the previous study [10] were utilized. The RNA-seq data were deposited in the NCBI Sequence Read Archive under the accession number SRP053361. WGCNA was performed using the OECloud tools (https://cloud.oebiotech.com, accessed on 6 September 2023). Differentially expressed genes between inoculated and mock-inoculated samples were identified based on strict criteria: an absolute value of log2 fold changes ≥ 1 and a false discovery rate (FDR) ≤ 0.01.
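The DEG criteria above amount to a two-condition filter. The following minimal sketch applies them to precomputed values; the function name, gene labels, and numbers are hypothetical, and how the fold changes and FDR values were computed upstream is not specified in the text and is not reproduced here.

```python
def is_deg(log2_fc: float, fdr: float,
           min_abs_log2_fc: float = 1.0, max_fdr: float = 0.01) -> bool:
    """True when |log2 fold change| >= 1 and FDR <= 0.01 (both inclusive)."""
    return abs(log2_fc) >= min_abs_log2_fc and fdr <= max_fdr


# Hypothetical results: gene -> (log2 fold change, inoculated vs mock; FDR)
results = {
    "gene1": (2.3, 0.0004),   # up-regulated, passes
    "gene2": (-1.0, 0.01),    # exactly on both thresholds, passes
    "gene3": (0.8, 0.0001),   # fold change too small
    "gene4": (-3.1, 0.03),    # FDR too high
}

# Keep DEGs and label the direction of change
degs = {gene: ("up" if fc > 0 else "down")
        for gene, (fc, fdr) in results.items() if is_deg(fc, fdr)}
print(degs)
```

Using the absolute value means that strongly down-regulated genes are retained alongside up-regulated ones.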

4.5. Machine Learning

All data related to resistance to Ss were collected and combined (RNA-seq and gene function annotations) with our dataset (WGCNA, single-SNP GWAS, and HAP-GWAS) to construct a training feature set. Protein sequences from the gene models were BLASTed against the TAIR 10 protein database to determine the gene annotation. The dataset consisted of 2001 positive samples and 1439 negative samples. We allocated 80% of the dataset for model training and the remaining 20% for model testing. The training sample had a feature dimension of 13,666, with 2 dimensions for WGCNA features, 7 dimensions for GWAS features, 2 dimensions for HAP GWAS, and 13,655 dimensions for gene function (GO and other database annotations).
For model selection, we compared four machine learning methods: LR, SVM, XGBoost, and RF. We utilized scikit-learn, a popular open-source machine learning library for Python, for data preprocessing, PCA, and model training. PCA was applied to retain 99.5% of the variance in the features, reducing the feature dimension from 13,655 to 20 while preserving most of the feature variance. During model training, we performed a grid search to find the best parameters for all models, such as the number of estimators (n_estimators), searched over the values 30, 40, 50, and 60, and the maximum number of features. To evaluate the performance of the trained models, we used accuracy, precision, recall, and F1 score. Additionally, cross-validation was employed to assess the generalization ability of the models.
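To make the data split and the evaluation metrics concrete, the sketch below computes the sizes of an 80/20 split for the reported 2001 positive and 1439 negative samples and derives accuracy, precision, recall, and F1 score from confusion-matrix counts. It uses only the Python standard library; scikit-learn provides equivalent functions in sklearn.metrics. The confusion-matrix counts are hypothetical and are not results from this study, and whether the split was stratified is not stated in the text.

```python
def split_sizes(n_samples: int, test_fraction: float = 0.2):
    """Return (n_train, n_test) for a simple hold-out split."""
    n_test = round(n_samples * test_fraction)
    return n_samples - n_test, n_test


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


n_total = 2001 + 1439  # positive + negative samples reported in the text
n_train, n_test = split_sizes(n_total)
print(n_train, n_test)

# Hypothetical confusion-matrix counts for a test set of that size
metrics = classification_metrics(tp=360, fp=40, fn=40, tn=248)
print({name: round(value, 3) for name, value in metrics.items()})
```

With 3440 samples in total, an 80/20 split leaves 2752 samples for training and 688 for testing.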

4.6. Selective Sweep Scans

The vcftools software v4.1 [83] was used to perform selective sweep scans. Nucleotide diversity (π) was calculated with the parameters “--window-pi 20000 --window-pi-step 1000” to assess genetic variation within the population. Tajima’s D value (--TajimaD 10000) and Fst value (--fst-window-size 20000 --fst-window-step 1000) were also computed to evaluate the occurrence of selection and population differentiation, respectively.
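As a rough illustration of what a sliding-window nucleotide diversity scan computes, the sketch below sums the per-site expected pairwise difference for biallelic SNPs and divides by the window length. It is a simplified sketch, not the VCFtools implementation (for example, it ignores missing genotypes and multiallelic sites); the function names and toy data are assumptions, and a small window is used only to keep the example readable.

```python
def site_pi(alt_count: int, n_chrom: int) -> float:
    """Expected pairwise difference at one biallelic site.

    alt_count: number of sampled chromosomes carrying the alternate allele.
    n_chrom:   total number of sampled chromosomes.
    """
    return 2 * alt_count * (n_chrom - alt_count) / (n_chrom * (n_chrom - 1))


def windowed_pi(sites, n_chrom, chrom_length, window=20_000, step=1_000):
    """Sliding-window nucleotide diversity.

    sites: list of (pos, alt_count) with 1-based positions.
    Returns [(start, end, pi)] with 1-based inclusive window coordinates.
    """
    out = []
    start = 1
    while start <= chrom_length:
        end = min(start + window - 1, chrom_length)
        total = sum(site_pi(k, n_chrom)
                    for pos, k in sites if start <= pos <= end)
        out.append((start, end, total / (end - start + 1)))
        start += step
    return out


# Hypothetical data: 2 diploid accessions (4 chromosomes), 3 kb region
toy_sites = [(500, 2), (1500, 1)]
windows = windowed_pi(toy_sites, n_chrom=4, chrom_length=3000,
                      window=2000, step=1000)
for w in windows:
    print(w)
```

Regions with unusually low π relative to the genome-wide background are the kind of signal that a selective sweep scan looks for.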

4.7. Statistical Analysis

Statistical analyses were conducted using GraphPad Prism 8 software. One-way ANOVA followed by Duncan’s new multiple range test (DMRT) was utilized for group comparisons. All data were presented as the mean ± standard deviation (SD).

Supplementary Materials

The supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijms25136932/s1.

Author Contributions

Conceptualization, X.-Y.W. and X.-Z.C.; methodology, X.-Y.W.; software, X.-Y.W.; formal analysis, X.-Y.W., Q.-W.F., Y.-P.X., L.-W.W. and Z.-L.M.; investigation, X.-Y.W., C.-X.R. and Q.-W.F.; writing, X.-Y.W. and X.-Z.C.; supervision, X.-Z.C.; funding acquisition, X.-Z.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Zhejiang Science and Technology Major Program on Agricultural New Variety Breeding (No. 2021C02064), the Zhejiang Provincial Natural Science Foundation of China (No. LZ18C140002), and the Hainan Provincial Natural Science Foundation of China (No. 324CXTD430).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Acknowledgments

We are grateful to the laboratory members for their help with the field resistance evaluation of oilseed rape resources.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Bolton, M.D.; Thomma, B.P.H.J.; Nelson, B.D. Sclerotinia sclerotiorum (Lib.) de Bary: Biology and Molecular Traits of a Cosmopolitan Pathogen. Mol. Plant Pathol. 2006, 7, 1–16.
  2. Ding, L.N.; Li, T.; Guo, X.J.; Li, M.; Liu, X.Y.; Cao, J.; Tan, X.L. Sclerotinia Stem Rot Resistance in Rapeseed: Recent Progress and Future Prospects. J. Agric. Food Chem. 2021, 69, 2965–2978.
  3. Adams, P.B. Ecology of Sclerotinia Species. Phytopathology 1979, 69, 896.
  4. Alkooranee, J.T.; Aledan, T.R.; Ali, A.K.; Lu, G.Y.; Zhang, X.K.; Wu, J.S.; Fu, C.H.; Li, M.T. Detecting the Hormonal Pathways in Oilseed Rape behind Induced Systemic Resistance by Trichoderma Harzianum TH12 to Sclerotinia sclerotiorum. PLoS ONE 2017, 12, e0168850.
  5. Khan, M.A.; Cowling, W.A.; Banga, S.S.; Barbetti, M.J.; Cantila, A.Y.; Amas, J.C.; Thomas, W.J.W.; You, M.P.; Tyagi, V.; Bharti, B.; et al. Genetic and Molecular Analysis of Stem Rot (Sclerotinia sclerotiorum) Resistance in Brassica Napus (Canola Type). Heliyon 2023, 9, e19237.
  6. Liu, J.; Wu, Y.; Zhang, X.; Gill, R.A.; Hu, M.; Bai, Z.; Zhao, C.J.; Zhang, Y.; Liu, Y.Y.; Hu, Q.; et al. Functional and Evolutionary Study of MLO Gene Family in the Regulation of Sclerotinia Stem Rot Resistance in Brassica napus L. Biotechnol. Biofuels Bioprod. 2023, 16, 86.
  7. Corwin, J.A.; Kliebenstein, D.J. Quantitative Resistance: More Than Just Perception of a Pathogen. Plant Cell 2017, 29, 655–665.
  8. Roux, F.; Voisin, D.; Badet, T.; Balagué, C.; Barlet, X.; Huard-Chauveau, C.; Roby, D.; Raffaele, S. Resistance to Phytopathogens e Tutti quanti: Placing Plant Quantitative Disease Resistance on the Map. Mol. Plant Pathol. 2014, 15, 427–432.
  9. Wu, J.; Cai, G.Q.; Tu, J.Y.; Li, L.X.; Liu, S.; Luo, X.P.; Zhou, L.P.; Fan, C.C.; Zhou, Y.M. Identification of QTLs for Resistance to Sclerotinia Stem Rot and BnaC.IGMT5.a as a Candidate Gene of the Major Resistant QTL SRC6 in Brassica napus. PLoS ONE 2013, 8, e67740.
  10. Wu, J.; Zhao, Q.; Yang, Q.Y.; Liu, H.; Li, Q.Y.; Yi, X.Q.; Cheng, Y.; Guo, L.; Fan, C.C.; Zhou, Y.M. Comparative Transcriptomic Analysis Uncovers the Complex Genetic Network for Resistance to Sclerotinia sclerotiorum in Brassica Napus. Sci. Rep. 2016, 6, 19007.
  11. Wei, L.J.; Jian, H.J.; Lu, K.; Filardo, F.; Yin, N.; Liu, L.Z.; Qu, C.M.; Li, W.; Du, H.; Li, J.N. Genome-wide Association Analysis and Differential Expression Analysis of Resistance to Sclerotinia Stem Rot in Brassica napus. Plant Biotechnol. J. 2016, 14, 1368–1380.
  12. Bazakos, C.; Hanemian, M.; Trontin, C.; Jiménez-Gómez, J.M.; Loudet, O. New Strategies and Tools in Quantitative Genetics: How to Go from the Phenotype to the Genotype. Annu. Rev. Plant Biol. 2017, 68, 435–455.
  13. Uffelmann, E.; Huang, Q.Q.; Munung, N.S.; De Vries, J.; Okada, Y.; Martin, A.R.; Martin, H.C.; Lappalainen, T.; Posthuma, D. Genome-Wide Association Studies. Nat. Rev. Methods Primers 2021, 1, 59.
  14. Ding, L.N.; Li, M.; Guo, X.J.; Tang, M.Q.; Cao, J.; Wang, Z.; Liu, R.; Zhu, K.M.; Guo, L.; Liu, S.Y.; et al. Arabidopsis GDSL1 Overexpression Enhances Rapeseed Sclerotinia sclerotiorum Resistance and the Functional Identification of Its Homolog in Brassica napus. Plant Biotechnol. J. 2020, 18, 1255–1270.
  15. Lorenz, A.J.; Hamblin, M.T.; Jannink, J.L. Performance of Single Nucleotide Polymorphisms versus Haplotypes for Genome-Wide Association Analysis in Barley. PLoS ONE 2010, 5, e14079.
  16. N’Diaye, A.; Haile, J.K.; Cory, A.T.; Clarke, F.R.; Clarke, J.M.; Knox, R.E.; Pozniak, C.J. Single Marker and Haplotype-Based Association Analysis of Semolina and Pasta Colour in Elite Durum Wheat Breeding Lines Using a High-Density Consensus Map. PLoS ONE 2017, 12, e0170941.
  17. Xiao, Y.J.; Liu, H.J.; Wu, L.J.; Warburton, M.; Yan, J.B. Genome-Wide Association Studies in Maize: Praise and Stargaze. Mol. Plant 2017, 10, 359–374.
  18. Flanagan, J.M. Epigenome-Wide Association Studies (EWAS): Past, Present, and Future. In Cancer Epigenetics; Verma, M., Ed.; Methods in Molecular Biology; Springer: New York, NY, USA, 2015; Volume 1238, pp. 51–63. ISBN 978-1-4939-1803-4.
  19. Gusev, A.; Ko, A.; Shi, H.; Bhatia, G.; Chung, W.; Penninx, B.W.J.H.; Jansen, R.; De Geus, E.J.C.; Boomsma, D.I.; Wright, F.A.; et al. Integrative Approaches for Large-Scale Transcriptome-Wide Association Studies. Nat. Genet. 2016, 48, 245–252.
  20. Kastenmüller, G.; Raffler, J.; Gieger, C.; Suhre, K. Genetics of Human Metabolism: An Update. Hum. Mol. Genet. 2015, 24, R93–R101.
  21. Song, Y.P.; Chen, P.F.; Xuan, A.R.; Bu, C.H.; Liu, P.; Ingvarsson, P.K.; El-Kassaby, Y.A.; Zhang, D.Q. Integration of Genome Wide Association Studies and Co-expression Networks Reveal Roles of PtoWRKY 42-PtoUGT76C1-1 in Trans -zeatin Metabolism and Cytokinin Sensitivity in Poplar. New Phytol. 2021, 231, 1462–1477.
  22. Zhu, C.S.; Li, X.R.; Yu, J.M. Integrating Rare-Variant Testing, Function Prediction, and Gene Network in Composite Resequencing-Based Genome-Wide Association Studies (CR-GWAS). G3 2011, 1, 233–243.
  23. Roy, A. A Classification Algorithm for High-Dimensional Data. Procedia Comput. Sci. 2015, 53, 345–355.
  24. Thottakkara, P.; Ozrazgat-Baslanti, T.; Hupf, B.B.; Rashidi, P.; Pardalos, P.; Momcilovic, P.; Bihorac, A. Application of Machine Learning Techniques to High-Dimensional Clinical Data to Forecast Postoperative Complications. PLoS ONE 2016, 11, e0155705.
  25. De Luis Balaguer, M.A.; Fisher, A.P.; Clark, N.M.; Fernandez-Espinosa, M.G.; Möller, B.K.; Weijers, D.; Lohmann, J.U.; Williams, C.; Lorenzo, O.; Sozzani, R. Predicting Gene Regulatory Networks by Combining Spatial and Temporal Gene Expression Data in Arabidopsis Root Stem Cells. Proc. Natl. Acad. Sci. USA 2017, 114, E7632–E7640.
  26. Ma, C.; Zhang, H.H.; Wang, X.F. Machine Learning for Big Data Analytics in Plants. Trends Plant. Sci. 2014, 19, 798–808.
  27. Yan, J.; Xu, Y.T.; Cheng, Q.; Jiang, S.Q.; Wang, Q.; Xiao, Y.J.; Ma, C.; Yan, J.B.; Wang, X.F. LightGBM: Accelerated Genomically Designed Crop Breeding through Ensemble Learning. Genome Biol. 2021, 22, 271.
  28. Lin, F.; Fan, J.; Rhee, S.Y. QTG-Finder: A Machine-Learning Based Algorithm to Prioritize Causal Genes of Quantitative Trait Loci in Arabidopsis and Rice. G3 2019, 9, 3129–3138.
  29. Lin, F.; Lazarus, E.Z.; Rhee, S.Y. QTG-Finder2: A Generalized Machine-Learning Algorithm for Prioritizing QTL Causal Genes in Plants. G3 2020, 10, 2411–2421.
  30. Singleton, A.B.; Hardy, J.; Traynor, B.J.; Houlden, H. Towards a Complete Resolution of the Genetic Architecture of Disease. Trends Genet. 2010, 26, 438–442.
  31. Tang, S.; Zhao, H.; Lu, S.P.; Yu, L.Q.; Zhang, G.F.; Zhang, Y.T.; Yang, Q.Y.; Zhou, Y.M.; Wang, X.M.; Ma, W.; et al. Genome- and Transcriptome-Wide Association Studies Provide Insights into the Genetic Basis of Natural Variation of Seed Oil Content in Brassica napus. Mol. Plant 2021, 14, 470–487.
  32. Dangl, J.L.; Dietrich, R.A.; Richberg, M.H. Death Don’t Have No Mercy: Cell Death Programs in Plant-Microbe Interactions. Plant Cell 1996, 8, 1793–1807.
  33. Luan, S.; Wang, C. Calcium Signaling Mechanisms Across Kingdoms. Annu. Rev. Cell Dev. Biol. 2021, 37, 311–340.
  34. Yuan, M.; Ngou, B.P.M.; Ding, P.T.; Xin, X.F. PTI-ETI Crosstalk: An Integrative View of Plant Immunity. Curr. Opin. Plant Biol. 2021, 62, 102030.
  35. Schönknecht, G. Calcium Signals from the Vacuole. Plants 2013, 2, 589–614.
  36. Dodd, A.N.; Kudla, J.; Sanders, D. The Language of Calcium Signaling. Annu. Rev. Plant Biol. 2010, 61, 593–620.
  37. Hannan Parker, A.; Wilkinson, S.W.; Ton, J. Epigenetics: A Catalyst of Plant Immunity against Pathogens. New Phytol. 2022, 233, 66–83.
  38. Wu, D.; Liang, Z.; Yan, T.; Xu, Y.; Xuan, L.; Tang, J.; Zhou, G.; Lohwasser, U.; Hua, S.; Wang, H.; et al. Whole-Genome Resequencing of a Worldwide Collection of Rapeseed Accessions Reveals the Genetic Basis of Ecotype Divergence. Mol. Plant 2019, 12, 30–43.
  39. Yin, L.; Zhang, H.; Tang, Z.; Xu, J.; Yin, D.; Zhang, Z.; Yuan, X.; Zhu, M.; Zhao, S.; Li, X.; et al. rMVP: A Memory-Efficient, Visualization-Enhanced, and Parallel-Accelerated Tool for Genome-Wide Association Study. Genom. Proteom. Bioinform. 2021, 19, 619–628.
  40. Qasim, M.U.; Zhao, Q.; Shahid, M.; Samad, R.A.; Ahmar, S.; Wu, J.; Fan, C.C.; Zhou, Y.M. Identification of QTLs Containing Resistance Genes for Sclerotinia Stem Rot in Brassica napus Using Comparative Transcriptomic Studies. Front. Plant Sci. 2020, 11, 776.
  41. Wu, J.; Chen, P.P.; Zhao, Q.; Cai, G.Q.; Hu, Y.; Xiang, Y.; Yang, Q.Y.; Wang, Y.P.; Zhou, Y.M. Co-location of QTL for Sclerotinia Stem Rot Resistance and Flowering Time in Brassica napus. Crop J. 2019, 7, 227–237.
  42. Zhao, J.W.; Udall, J.A.; Quijada, P.A.; Grau, C.R.; Meng, J.L.; Osborn, T.C. Quantitative Trait Loci for Resistance to Sclerotinia Sclerotiorum and Its Association with a Homeologous Non-reciprocal Transposition in Brassica napus L. Theor. Appl. Genet. 2006, 112, 509–516.
  43. Wu, J.; Zhao, Q.; Liu, S.; Shahid, M.; Lan, L.; Cai, G.; Zhang, C.; Fan, C.; Wang, Y.; Zhou, Y. Genome-Wide Association Study Identifies New Loci for Resistance to Sclerotinia Stem Rot in Brassica napus. Front. Plant Sci. 2016, 7, 225163.
  44. Maćkiewicz, A.; Ratajczak, W. Principal Components Analysis (PCA). Comput. Geosci. 1993, 19, 303–342.
  45. Breiman, L. Random Forests. Mach. Learn 2001, 45, 5–32.
  46. Boland, G.J.; Hall, R. Index of Plant Hosts of Sclerotinia sclerotiorum. Can. J. Plant Pathol. 1994, 16, 93–108.
  47. Derbyshire, M.C.; Khentry, Y.; Severn-Ellis, A.; Mwape, V.; Saad, N.S.M.; Newman, T.E.; Taiwo, A.; Regmi, R.; Buchwaldt, L.; Denton-Giles, M.; et al. Modeling First Order Additive × Additive Epistasis Improves Accuracy of Genomic Prediction for Sclerotinia Stem Rot Resistance in Canola. Plant Genome 2021, 14, e20088.
  48. Badet, T.; Léger, O.; Barascud, M.; Voisin, D.; Sadon, P.; Vincent, R.; Le Ru, A.; Balagué, C.; Roby, D.; Raffaele, S. Expression Polymorphism at the ARPC 4 Locus Links the Actin Cytoskeleton with Quantitative Disease Resistance to Sclerotinia sclerotiorum in Arabidopsis thaliana. New Phytol. 2019, 222, 480–496.
  49. Iakovidis, M.; Teixeira, P.J.P.L.; Exposito-Alonso, M.; Cowper, M.G.; Law, T.F.; Liu, Q.; Vu, M.C.; Dang, T.M.; Corwin, J.A.; Weigel, D.; et al. Effector-Triggered Immune Response in Arabidopsis thaliana Is a Quantitative Trait. Genetics 2016, 204, 337–353.
  50. Shahoveisi, F.; Oladzad, A.; Del Río Mendoza, L.E.; Hosseinirad, S.; Ruud, S.; Rissato, B. Assessing the Effect of Phenotyping Scoring Systems and SNP Calling and Filtering Parameters on Detection of QTL Associated with Reaction of Brassica napus to Sclerotinia sclerotiorum. PhytoFrontiers 2021, 1, 135–148.
  51. Wei, D.Y.; Mei, J.Q.; Fu, Y.; Disi, J.O.; Li, J.; Qian, W. Quantitative Trait Loci Analyses for Resistance to Sclerotinia sclerotiorum and Flowering Time in Brassica napus. Mol. Breeding 2014, 34, 1797–1804.
  52. Murphy, K.P. Machine Learning: A Probabilistic Perspective; Adaptive Computation and Machine Learning Series; MIT Press: Cambridge, MA, USA, 2012; ISBN 978-0-262-01802-9.
  53. Szymczak, S.; Biernacka, J.M.; Cordell, H.J.; González-Recio, O.; König, I.R.; Zhang, H.P.; Sun, Y.V. Machine Learning in Genome-Wide Association Studies. Genet. Epidemiol. 2009, 33 (Suppl. S1), S51–S57.
  54. Breiman, L. Bagging Predictors. Mach. Learn 1996, 24, 123–140.
  55. Zhao, T.; Wu, H.Y.; Wang, X.T.; Zhao, Y.Y.; Wang, L.Y.; Pan, J.Y.; Mei, H.A.; Han, J.; Wang, S.Y.; Lu, K.N.; et al. Integration of eQTL and Machine Learning to Dissect Causal Genes with Pleiotropic Effects in Genetic Regulation Networks of Seed Cotton Yield. Cell Rep. 2023, 42, 113111.
  56. Huot, B.; Yao, J.; Montgomery, B.L.; He, S.Y. Growth-Defense Tradeoffs in Plants: A Balancing Act to Optimize Fitness. Mol. Plant 2014, 7, 1267–1287.
  57. Yuan, M.H.; Jiang, Z.Y.; Bi, G.Z.; Nomura, K.; Liu, M.H.; Wang, Y.P.; Cai, B.Y.; Zhou, J.M.; He, S.Y.; Xin, X.F. Pattern-Recognition Receptors Are Required for NLR-Mediated Plant Immunity. Nature 2021, 592, 105–109.
  58. Bi, G.Z.; Su, M.; Li, N.; Liang, Y.; Dang, S.; Xu, J.C.; Hu, M.J.; Wang, J.Z.; Zou, M.X.; Deng, Y.A.; et al. The ZAR1 Resistosome Is a Calcium-Permeable Channel Triggering Plant Immune Signaling. Cell 2021, 184, 3528–3541.
  59. Jacob, P.; Kim, N.H.; Wu, F.H.; El-Kasmi, F.; Chi, Y.; Walton, W.G.; Furzer, O.J.; Lietzan, A.D.; Sunil, S.; Kempthorn, K.; et al. Plant “Helper” Immune Receptors Are Ca2+-Permeable Nonselective Cation Channels. Science 2021, 373, 420–425.
  60. Tian, W.; Hou, C.C.; Ren, Z.J.; Wang, C.; Zhao, F.G.; Dahlbeck, D.; Hu, S.P.; Zhang, L.Y.; Niu, Q.; Li, L.G.; et al. A Calmodulin-Gated Calcium Channel Links Pathogen Patterns to Plant Immunity. Nature 2019, 572, 131–135.
  61. Wang, Y.; Kang, Y.; Ma, C.; Miao, R.; Wu, C.; Long, Y.; Ge, T.; Wu, Z.; Hou, X.; Zhang, J.; et al. CNGC2 Is a Ca2+ Influx Channel That Prevents Accumulation of Apoplastic Ca2+ in the Leaf. Plant Physiol. 2017, 173, 1342–1354.
  62. Yang, D.L.; Yang, Y.N.; He, Z.H. Roles of Plant Hormones and Their Interplay in Rice Immunity. Mol. Plant 2013, 6, 675–685.
  63. Wang, C.; Tang, R.J.; Kou, S.H.; Xu, X.S.; Lu, Y.; Rauscher, K.; Voelker, A.; Luan, S. Mechanisms of Calcium Homeostasis Orchestrate Plant Growth and Immunity. Nature 2024, 627, 382–388.
  64. Zhu, X.Y.; Robe, E.; Jomat, L.; Aldon, D.; Mazars, C.; Galaud, J.P. CML8, an Arabidopsis Calmodulin-Like Protein, Plays a Role in Pseudomonas Syringae Plant Immunity. Plant Cell Physiol. 2017, 58, 307–319.
  65. He, F.X.; Wang, C.; Sun, H.L.; Tian, S.X.; Zhao, G.S.; Liu, C.; Wan, C.P.; Guo, J.; Huang, X.L.; Zhan, G.M.; et al. Simultaneous Editing of Three Homoeologues of TaCIPK14 Confers Broad-Spectrum Resistance to Stripe Rust in Wheat. Plant Biotechnol. J. 2023, 21, 354–368.
  66. Liu, X.Y.; Wang, X.Q.; Yang, C.X.; Wang, G.Y.; Fan, B.L.; Shang, Y.T.; Dang, C.; Xie, C.J.; Wang, Z.Y. Genome-Wide Identification of TaCIPK Gene Family Members in Wheat and Their Roles in Host Response to Blumeria graminis f. sp. Tritici Infection. Int. J. Biol. Macromol. 2023, 248, 125691.
  67. Xie, Q.; Yin, X.C.; Wang, Y.; Qi, Y.T.; Pan, C.C.; Sulaymanov, S.; Qiu, Q.S.; Zhou, Y.; Jiang, X.Y. The Signalling Pathways, Calcineurin B-like Protein 5 (CBL5)-CBL-interacting Protein Kinase 8 (CIPK8)/CIPK24-salt Overly Sensitive 1 (SOS1), Transduce Salt Signals in Seed Germination in Arabidopsis. Plant Cell Environ. 2024, 47, 1486–1502.
  68. Rahman, H.; Wang, X.Y.; Xu, Y.P.; He, Y.H.; Cai, X.Z. Characterization of Tomato Protein Kinases Embedding Guanylate Cyclase Catalytic Center Motif. Sci. Rep. 2020, 10, 4078.
  69. Saand, M.A.; Xu, Y.P.; Munyampundu, J.P.; Li, W.; Zhang, X.R.; Cai, X.Z. Phylogeny and Evolution of Plant Cyclic Nucleotide-Gated Ion Channel (CNGC) Gene Family and Functional Analyses of Tomato CNGCs. DNA Res. 2015, 22, 471–483.
  70. Saand, M.A.; Xu, Y.P.; Li, W.; Wang, J.P.; Cai, X.Z. Cyclic Nucleotide Gated Channel Gene Family in Tomato: Genome-Wide Identification and Functional Analyses in Disease Resistance. Front. Plant Sci. 2015, 06, 303.
  71. Zhao, Y.; Liu, W.; Xu, Y.P.; Cao, J.Y.; Braam, J.; Cai, X.Z. Genome-Wide Identification and Functional Analyses of Calmodulin Genes in Solanaceous species. BMC Plant Biol. 2013, 13, 70.
  72. Wang, J.P.; Xu, Y.P.; Munyampundu, J.P.; Liu, T.Y.; Cai, X.Z. Calcium-Dependent Protein Kinase (CDPK) and CDPK-Related Kinase (CRK) Gene Families in Tomato: Genome-Wide Identification and Functional Analyses in Disease Resistance. Mol. Genet. Genom. 2016, 291, 661–676.
  73. Wang, J.P.; Xu, Y.P.; Cai, X.Z. Phylogeny of Plant Calcium and Calmodulin-Dependent Protein Kinases (CCaMKs) and Functional Analyses of Tomato CCaMK in Disease Resistance. Front. Plant Sci. 2015, 6, 1075.
  74. Rahman, H.; Xu, Y.P.; Zhang, X.R.; Cai, X.Z. Brassica Napus Genome Possesses Extraordinary High Number of CAMTA Genes and CAMTA3 Contributes to PAMP Triggered Immunity and Resistance to Sclerotinia sclerotiorum. Front. Plant Sci. 2016, 7, 581.
  75. Rahman, H.; Yang, J.; Xu, Y.P.; Wang, J.P.; Cai, X.Z. Phylogeny of Plant CAMTAs and Role of AtCAMTAs in Nonhost Resistance to Xanthomonas oryzae Pv. Oryzae. Front. Plant Sci. 2016, 7, 177.
  76. Yan, T.; Wang, Q.; Maodzeka, A.; Wu, D.Z.; Jiang, L.X. BnaSNPDB: An Interactive Web Portal for the Efficient Retrieval and Analysis of SNPs among 1,007 Rapeseed Accessions. Comput. Struct. Biotechnol. J. 2020, 18, 2766–2773.
  77. Chang, C.C.; Chow, C.C.; Tellier, L.C.; Vattikuti, S.; Purcell, S.M.; Lee, J.J. Second-Generation PLINK: Rising to the Challenge of Larger and Richer Datasets. Gigascience 2015, 4, 7.
  78. Zhang, C.; Dong, S.S.; Xu, J.Y.; He, W.M.; Yang, T.L. PopLDdecay: A Fast and Effective Tool for Linkage Disequilibrium Decay Analysis Based on Variant Call Format Files. Bioinformatics 2019, 35, 1786–1788.
  79. Price, A.L.; Patterson, N.J.; Plenge, R.M.; Weinblatt, M.E.; Shadick, N.A.; Reich, D. Principal Components Analysis Corrects for Stratification in Genome-Wide Association Studies. Nat. Genet. 2006, 38, 904–909.
  80. Chao, J.T.; Li, Z.Y.; Sun, Y.H.; Aluko, O.O.; Wu, X.R.; Wang, Q.; Liu, G.S. MG2C: A User-Friendly Online Tool for Drawing Genetic Maps. Mol. Hortic. 2021, 1, 16.
  81. Lyu, F.; Han, F.; Ge, C.; Mao, W.; Chen, L.; Hu, H.; Chen, G.; Lang, Q.; Fang, C. OmicStudio: A Composable Bioinformatics Cloud Platform with Real-time Feedback That Can Generate High-quality Graphs for Publication. iMeta 2023, 2, e85.
  82. Hamazaki, K.; Iwata, H. RAINBOW: Haplotype-Based Genome-Wide Association Study Using a Novel SNP-Set Method. PLoS Comput. Biol. 2020, 16, e1007663.
  83. Danecek, P.; Auton, A.; Abecasis, G.; Albers, C.A.; Banks, E.; DePristo, M.A.; Handsaker, R.E.; Lunter, G.; Marth, G.T.; Sherry, S.T.; et al. The Variant Call Format and VCFtools. Bioinformatics 2011, 27, 2156–2158.
Figure 1. Optimized GLM-based single-SNP GWAS analysis with covariates for SSR resistance in oilseed rape. (A) Phenotypic variation of 300 oilseed rape germplasm accessions in resistance to sclerotinia stem rot (SSR). Lesion length was measured 14 days post inoculation (dpi) in the years 2021 and 2022. (B) The disease symptoms of oilseed rape stems after Ss inoculation. Representative resistant (R4762 and R4572) and susceptible (R4385 and R4665) rapeseed germplasm accessions are shown. Bar = 1 cm. (C) Frequency distribution of lesion length in the genome-wide association study (GWAS) population in the years 2021 and 2022. (D) Regression analysis between stem circumference (SC) and stem lesion length (LL) with respect to SSR resistance. (E) Linkage disequilibrium (LD)-decay plot. LD (r2) was estimated with PopLDdecay v3.42 and plotted as a function of physical distance in Kb for each population. (F,G) Multi-track Manhattan plot and Quantile–Quantile (Q-Q) plot based on single-SNP GWAS using GLM, MLM, and FarmCPU models for the phenotypic data collected in 2021 (F) and 2022 (G). The red dashed lines indicate the significance threshold (−log10 (p_value) = 5.0). (H,I) Q-Q plots comparing single-SNP GWAS results with and without covariates in the years 2021 (H) and 2022 (I). The covariates kinship (K), Principal Component Analysis (PCA), SC, and flowering time (FT) were included.
Figure 2. Genetic analysis and functional characterization of 48 SSR resistance-associated loci (RALs) in oilseed rape. (A) Chromosomal distribution of 24 out of the total 48 SSR RALs. The distribution of the remaining RALs can be found in Figure S2. (B) The results of gene ontology (GO) enrichment analysis of the genes within the 48 SSR RALs. (C,D) Protein structure analysis of genes within the 48 SSR RALs based on IPR (C) and ProSitePatterns (D) databases.
Figure 3. Multi-omics analysis reveals the co-predicted SSR resistance-associated RALs in oilseed rape. (A) Manhattan and Q-Q plots for haplotype-based GWAS (HAP-GWAS) for Ss resistance in oilseed rape for the years 2021 and 2022. The red dashed lines indicate the significance threshold (−log10 (p_value) = 4.0). (B) Weighted gene co-expression network analysis (WGCNA) for gene expression in response to Ss infection in susceptible and resistant oilseed rape germplasm accessions. The correlation heatmap of disease resistance modules is shown. Module colors in coordinates of the left panel correspond to those in the Y-axis of the right panel. The Pearson correlation algorithm in OECloud tools (https://cloud.oebiotech.com, accessed on 6 September 2023) was used to calculate the correlation coefficient and p value of the module’s characteristic genes and traits. “.” indicates non-significance. * p ≤ 0.05; *** p ≤ 0.001. (C) Venn diagrams illustrating the overlap of differentially expressed genes at 24 hpi, 48 hpi, and 96 hpi after Sclerotinia infection in susceptible and resistant oilseed rape accessions. (D) Venn diagrams depicting the overlap of SSR resistance-associated RALs identified through single-SNP GWAS, HAP-GWAS, WGCNA, and differentially expressed gene (DEG) analyses. (E) Venn diagrams illustrating the overlap of SSR resistance-associated genes identified through single-SNP GWAS, HAP-GWAS, WGCNA, and DEG analyses. (F) Chromosomal distribution of 7 SSR resistance-associated RALs that are unanimously identified in single-SNP GWAS, HAP-GWAS, WGCNA, and DEG analyses.
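The threshold in panel (A) and the co-localization logic of panels (D–F) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names and the locus labels are hypothetical, and only the cutoff of −log10(p) = 4.0 and the idea of intersecting the four analyses come from the caption.

```python
import math

def passes_threshold(p_value, neg_log10_cutoff=4.0):
    """True when -log10(p) reaches the cutoff; 4.0 corresponds to p = 1e-4."""
    return -math.log10(p_value) >= neg_log10_cutoff

def co_localized(*locus_sets):
    """Loci reported by every analysis (the centre of a Venn diagram)."""
    return set.intersection(*(set(s) for s in locus_sets))

# Hypothetical locus labels, for illustration only.
single_snp = {"locus1", "locus2", "locus3"}
hap_gwas = {"locus2", "locus3", "locus4"}
wgcna = {"locus2", "locus3"}
degs = {"locus2", "locus3", "locus5"}

shared = co_localized(single_snp, hap_gwas, wgcna, degs)
print(sorted(shared))
```

Requiring support from all four analyses is the strictest way to combine them; a locus missed by any single analysis is dropped.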
Figure 4. Exploration of algorithms and feature sets for genomic analysis in predictive modeling. (A) Illustration of PCA-mediated reduction of data dimensions from a three-dimensional space (left) to a two-dimensional plane (right). Different groups of data are indicated in different colors. (B) Working model of the Random Forest (RF) algorithm, which combines multiple decision trees built from randomly selected data sets and features. (C) Confusion matrices of the Logistic Regression (LR), Support Vector Machine (SVM), eXtreme Gradient Boosting (XGBoost), Neural Network (NN), and RF algorithms using single-SNP GWAS as the feature set. (D) Performance comparison of the LR, NN, XGBoost, SVM, and RF algorithms using single-SNP GWAS as the feature set in terms of accuracy, precision, recall, F1 score, and prediction time. (E) Confusion matrices of the SVM, NN, XGBoost, LR, and RF algorithms using single-SNP GWAS + HAP-GWAS + gene function (GF) + WGCNA as the feature set. (F–I) Performance evaluation of the LR, NN, XGBoost, SVM, and RF algorithms in terms of accuracy (F), precision (G), recall (H), and F1 score (I) using different feature sets for machine learning, including single-SNP GWAS, single-SNP GWAS + HAP-GWAS + GF, and single-SNP GWAS + HAP-GWAS + GF + WGCNA. (J) Performance comparison of the LR, NN, XGBoost, SVM, and RF algorithms using single-SNP GWAS + HAP-GWAS + GF + WGCNA as the feature set in terms of accuracy, precision, recall, F1 score, and prediction time.
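The metrics compared in panels (D) and (F–J) follow directly from the confusion matrices in panels (C) and (E). The sketch below uses the standard definitions for a binary classifier; the function name and the counts are hypothetical and are not taken from the figure.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts, for illustration only.
metrics = binary_metrics(tp=40, fp=10, fn=10, tn=40)
print(metrics)
```

Because F1 is the harmonic mean of precision and recall, it drops sharply when either of the two is low, which makes it a useful single number when the classes are imbalanced.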
Figure 5. Positive selection on potential key calcium signaling genes associated with Ss resistance in oilseed rape. (A) Chromosomal distribution of seven calcium signaling genes associated with SSR resistance, and their genetic diversity (π) levels and Tajima’s D in the susceptible (S) and resistant (R) subgroups. (B) The fixation index (Fst) between the susceptible (S) and resistant (R) groups across the seven calcium signaling genes. (C) LD plots illustrating the genomic region surrounding the focal SNPs of the six calcium signaling genes. (D) Haplotype frequencies for four SNPs in the coding sequence (CDS) and promoter regions of the seven calcium signaling genes. Significance was determined by one-way ANOVA followed by Duncan's multiple range test (DMRT) (ns, non-significance; * p ≤ 0.05; ** p ≤ 0.01; **** p ≤ 0.0001). Specific p-values are shown in the panels when p > 0.01. (E) The R-genotype rate for each of the three genotypes, calculated as the proportion of germplasm accessions with a lesion length of less than 30 mm among all germplasm accessions of that genotype.
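The R-genotype rate defined for panel (E) is a simple per-genotype proportion. A minimal sketch, assuming each accession is recorded as a (genotype, lesion length in mm) pair; the function name, genotype labels, and lesion lengths are hypothetical, and only the 30 mm cutoff comes from the caption.

```python
from collections import defaultdict

def r_genotype_rate(records, cutoff_mm=30.0):
    """Per genotype: share of accessions whose lesion length is below the cutoff."""
    counts = defaultdict(lambda: [0, 0])  # genotype -> [resistant, total]
    for genotype, lesion_mm in records:
        counts[genotype][1] += 1
        if lesion_mm < cutoff_mm:
            counts[genotype][0] += 1
    return {g: resistant / total for g, (resistant, total) in counts.items()}

# Hypothetical accessions, for illustration only.
records = [
    ("AA", 22.0), ("AA", 28.5), ("AA", 35.0),
    ("AG", 31.0), ("AG", 25.0),
    ("GG", 40.0), ("GG", 33.0),
]
rates = r_genotype_rate(records)
print(rates)
```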
Figure 6. Working model of iMAP. We integrate multi-omics data to perform comprehensive analyses, including single-SNP GWAS and HAP-GWAS on SNP data, and WGCNA and DEG analysis on gene expression data. Additionally, iMAP allows for the integration of SNP and expression data for TWAS. Furthermore, we incorporate various databases for functional and structural analysis of genes. Using the Random Forest algorithm, iMAP performs machine learning on different features to predict potential target genes associated with traits of interest. These predicted genes can be validated through further biological experiments to explore their functional roles.
Table 1. Physical position of RALs associated with resistance to Ss.
| RAL Name ¹ | Chr ² | Start | End | Recurring QTLs in the Reference | Reference |
|---|---|---|---|---|---|
| RSSA01a | chrA01 | 29,310 | 7,115,091 | | |
| RSSA01b | chrA01 | 13,304,776 | 22,353,741 | chrA01:12,444,829–19,857,639 | Wu et al., 2013 [9] |
| RSSA02a | chrA02 | 6,327,197 | 8,672,595 | chrA02:5,418,060–7,389,046; chrA02:8,152,495–9,281,526 | Qasim et al., 2020 [40]; Qasim et al., 2020 [40] |
| RSSA02b | chrA02 | 17,499,716 | 19,951,552 | chrA02:16,670,964–20,474,897 | Wu et al., 2013 [9] |
| RSSA03a | chrA03 | 178,592 | 4,297,931 | chrA03:2,702,453–4,262,421 | Wu et al., 2019 [41] |
| RSSA03b | chrA03 | 8,896,717 | 11,278,534 | | |
| RSSA03c | chrA03 | 15,124,067 | 22,211,498 | chrA03:15,547,362–16,064,878 | Zhao et al., 2006 [42] |
| RSSA04a | chrA04 | 9,608,450 | 12,922,254 | | |
| RSSA04b | chrA04 | 14,872,047 | 18,447,087 | | |
| RSSA05 | chrA05 | 17,966,325 | 20,397,420 | | |
| RSSA06a | chrA06 | 1,853,642 | 2,407,529 | | |
| RSSA06b | chrA06 | 6,152,638 | 8,944,061 | | |
| RSSA06c | chrA06 | 17,880,852 | 21,689,696 | chrA06:20,965,425–23,324,273; chrA06:21,629,047–23,499,018 | Wu et al., 2013 [9]; Wu et al., 2019 [41] |
| RSSA07a | chrA07 | 10,082,564 | 14,250,889 | | |
| RSSA07b | chrA07 | 20,186,060 | 21,609,303 | | |
| RSSA09a | chrA09 | 6,448,538 | 6,596,656 | | |
| RSSA09b | chrA09 | 32,071,202 | 32,105,505 | chrA09:28,638,676–31,720,464 | Qasim et al., 2020 [40] |
| RSSA10a | chrA10 | 1,236,286 | 4,041,051 | | |
| RSSA10b | chrA10 | 10,486,133 | 14,839,131 | | |
| RSSAnn_random_a | chrAnn_random | 15,679,614 | 21,810,062 | | |
| RSSAnn_random_b | chrAnn_random | 25,598,042 | 33,228,887 | | |
| RSSAnn_random_c | chrAnn_random | 39,652,449 | 46,732,598 | | |
| RSSC01 | chrC01 | 19,283,445 | 34,365,756 | | |
| RSSC02a | chrC02 | 14,388,409 | 18,630,147 | | |
| RSSC02b | chrC02 | 26,296,409 | 28,749,960 | | |
| RSSC03a | chrC03 | 4,693,455 | 9,737,761 | | |
| RSSC03b | chrC03 | 20,346,765 | 33,132,785 | chrC03:22,217,518–30,597,169; chrC03:30,597,169–47,861,855 | Qasim et al., 2020 [40]; Qasim et al., 2020 [40] |
| RSSC03_random | chrC03_random | 1,845,659 | 1,847,988 | | |
| RSSC04a | chrC04 | 19,676,382 | 20,444,349 | chrC04:11,691,778–28,720,453 | Zhao et al., 2006 [42] |
| RSSC04b | chrC04 | 23,959,219 | 26,737,257 | chrC04:11,691,778–28,720,453 | Zhao et al., 2006 [42] |
| RSSC04c | chrC04 | 39,374,823 | 43,990,961 | chrC04:40,419,964–41,916,428 | Wu et al., 2016 [43] |
| RSSC05a | chrC05 | 1,079,675 | 8,269,010 | | |
| RSSC05b | chrC05 | 10,013,537 | 17,710,073 | | |
| RSSC05c | chrC05 | 33,795,591 | 39,750,069 | | |
| RSSC06a | chrC06 | 2,873,709 | 4,819,806 | | |
| RSSC06b | chrC06 | 12,871,211 | 17,346,441 | | |
| RSSC07a | chrC07 | 22,967,050 | 27,205,509 | | |
| RSSC07b | chrC07 | 30,140,825 | 33,700,961 | chrC07:29,634,609–31,761,057 | Wu et al., 2013 [9] |
| RSSC07c | chrC07 | 40,473,495 | 44,450,958 | | |
| RSSC08a | chrC08 | 14,471,405 | 23,488,895 | chrC08:20,654,840–20,788,744 | Wu et al., 2016 [43] |
| RSSC08b | chrC08 | 29,321,327 | 33,960,528 | chrC08:31,404,544–32,040,944; chrC08:31,404,544–33,506,513 | Wu et al., 2013 [9]; Wu et al., 2013 [9] |
| RSSC09a | chrC09 | 3,611,793 | 21,444,314 | | |
| RSSC09b | chrC09 | 33,468,285 | 33,556,099 | | |
| RSSC09c | chrC09 | 41,569,762 | 45,686,680 | chrC09:43,326,113–43,593,795 | Zhao et al., 2006 [42] |
| RSSCnn_random_c | chrCnn_random | 19,417,331 | 25,686,759 | | |
| RSSCnn_random_d | chrCnn_random | 41,618,220 | 49,282,425 | | |
| RSSCnn_random_e | chrCnn_random | 55,253,392 | 60,160,894 | | |
| RSSCnn_random_f | chrCnn_random | 62,592,394 | 69,187,154 | | |

¹ RAL name: name of the resistance-associated locus. ² Chr: chromosome.
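How far a RAL coincides with a previously reported QTL can be checked by intersecting the two physical intervals. The sketch below is an illustration only: the function name is hypothetical, intervals are treated as closed and measured in bp, and the coordinates are the Table 1 entries for RSSA01b and RSSA09b. The criterion the authors used to call a QTL "recurring" is not stated in this section.

```python
def overlap_bp(start_a, end_a, start_b, end_b):
    """Length in bp of the overlap of two closed intervals (0 if they are disjoint)."""
    return max(0, min(end_a, end_b) - max(start_a, start_b) + 1)

# RSSA01b (chrA01:13,304,776-22,353,741) vs. QTL chrA01:12,444,829-19,857,639
a01b = overlap_bp(13_304_776, 22_353_741, 12_444_829, 19_857_639)

# RSSA09b (chrA09:32,071,202-32,105,505) vs. QTL chrA09:28,638,676-31,720,464
a09b = overlap_bp(32_071_202, 32_105_505, 28_638_676, 31_720_464)

print(a01b, a09b)  # 6552864 0
```

Under this strict definition RSSA01b shares about 6.55 Mb with the reported QTL, whereas RSSA09b starts roughly 0.35 Mb beyond the end of its listed QTL, so the match reported in the table presumably tolerates nearby, non-overlapping intervals.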
Table 2. Prediction of calcium signaling genes associated with Ss resistance on Chromosome A06 using iMAP.
| RAL_Name ¹ | Gene_ID | Chr ² | Start | End | At_Gene ³ | Symbol | Description | Score |
|---|---|---|---|---|---|---|---|---|
| RSSA06a | BnaA06g03950D | chrA06 | 2,392,927 | 2,389,029 | AT1G48260 | CIPK17 | CBL interacting protein kinase 17 | 0.78 |
| RSSA06b | BnaA06g12600D | chrA06 | 6,535,463 | 6,537,122 | AT1G18480 | SLP2 | Calcineurin like metallo phosphoesterase superfamily protein | 0.97 |
| RSSA06b | BnaA06g12660D | chrA06 | 6,551,737 | 6,551,264 | AT1G18530 | CML15 | Calmodulin-like 15 | 0.92 |
| RSSA06b | BnaA06g13020D | chrA06 | 6,809,184 | 6,812,279 | AT1G18840 | IQD30 | IQ domain 30 | 0.98 |
| RSSA06b | BnaA06g14070D | chrA06 | 7,550,927 | 7,554,322 | AT1G19870 | IQD32 | IQ domain 32 | 0.97 |
| RSSA06b | BnaA06g15280D | chrA06 | 8,382,965 | 8,383,757 | AT1G21550 | CML44 | Calcium binding EF hand family protein | 1.00 |
| RSSA06b | BnaA06g15970D | chrA06 | 8,783,588 | 8,786,720 | AT5G24430 | CPK4 | Calcium dependent protein kinase | 0.76 |

¹ RAL_Name: name of the resistance-associated locus. ² Chr: chromosome. ³ At_Gene: Arabidopsis homolog of the oilseed rape gene.

Wang, X.-Y.; Ren, C.-X.; Fan, Q.-W.; Xu, Y.-P.; Wang, L.-W.; Mao, Z.-L.; Cai, X.-Z. Integrated Assays of Genome-Wide Association Study, Multi-Omics Co-Localization, and Machine Learning Associated Calcium Signaling Genes with Oilseed Rape Resistance to Sclerotinia sclerotiorum. Int. J. Mol. Sci. 2024, 25, 6932. https://doi.org/10.3390/ijms25136932
