**2. Results**

### *2.1. Overview of the Transcriptome Data and Gene Expression Profile under Various Stress Treatments*

In this study, we analyzed the global gene expression profiles of sesame under various abiotic stress treatments based on the RNA-Sequencing (RNA-Seq) technology. In total, four datasets from salt stress [8], drought stress [7], waterlogging stress [6] and a newly generated RNA-Seq data under osmotic stress, comprising of 30, 24, 6 and 12 samples, respectively, were investigated (Table 1). In the selected studies, stress-sensitive and tolerant varieties were used. Additionally, all RNA-Seq datasets had three replicates per treatment, untreated controls for each treatment, and the tissue type was primarily the root. After cleaning and filtering the RNA-Seq datasets, we obtained clean reads ranging from 13.5 to 230 Gb with a total of 25,319 uniquely expressed genes among the 72 samples (Table 1).


**Table 1.** Characteristics of the RNA-Seq datasets used in this study.

#### *2.2. Identification of DEGs, Core Conserved DEGs in Response to Abiotic Stress*

For each dataset, we compared the gene expression under control condition to the stress condition in order to identify the differentially expressed genes (DEGs). We identified more DEGs under salt stress treatment compared to the other stress treatments, indicating that salt stress triggers the most intense gene regulation in sesame (Figure 1A). A total of 12,784 DEGs were identified among different samples in the datasets. We cross-compared the identified DEGs among the four datasets aiming at identifying the core conserved DEGs in response to all the four abiotic stresses. The result showed that large numbers of DEGs were stress-specific; however, 543 genes constantly participate in sesame responses to its major abiotic stresses and represent the core abiotic stress responsive genes (CARG) (Figure 1B; Table S2; Figure S1). To confirm these CARGs, we selected 10 independent sesame genotypes and evaluated the expression fold change (FC) of 50 randomly selected genes within the CARGs under drought, salt, osmotic, heat and waterlogging stresses compared to the control. As shown in Figure S2, the expression levels of all the tested genes were highly altered under stress (|FC|>2), showing that the proposed CARGs are well conserved in sesame and may be functional under a more diverse arrays of abiotic stresses.

We further searched for the enriched transcription factor (TF) families within these CARGs. As shown in Figure 1C, 18 TF families were represented but the ERF, MYB, bHLH and WRKY were overrepresented, denoting that these TF families are the major regulators of sesame responses to multiple abiotic stresses.

*Int. J. Mol. Sci.* **2019**, *20*, x FOR PEER REVIEW 4 of 19

**Figure 1.** Identification of the core abiotic stress responsive genes (CARG) in sesame. (**A**) Differentially expressed genes detected between control and stress treatments. (**B**) Venn Diagram showing stress specific genes and CARGs. (**C**) Major transcription families enriched in the sesame **Figure 1.** Identification of the core abiotic stress responsive genes (CARG) in sesame. (**A**) Differentially expressed genes detected between control and stress treatments. (**B**) Venn Diagram showing stress specific genes and CARGs. (**C**) Major transcription families enriched in the sesame CARGs.

#### CARGs. *2.3. WGCNA and Detection of Functional Modules*

*2.3. WGCNA and Detection of Functional Modules*  Weighted gene co-expression network analysis (WGCNA) was conducted on the CARGs to reveal the different modules of co-expressed genes. WGCNA divided the 543 core DEGs into three different modules named as Blue, Turquoise and Grey, containing 113, 276 and 154, genes respectively (Figure 2A,B; Table S3). Association of the detected modules and the abiotic stresses indexes showed that all the three modules respond differently to the abiotic stresses, except for salt stress. In fact, all the three modules were negatively correlated with salt stress (r = −0.47, −0.84, −0.74 for Blue, Turquoise and Grey, respectively), suggesting that the CARGs should be down-regulated to allow sesame survive under salt stress (Figure 2C). In addition, with all modules taken together, we could observe that the CARGs engage distinct responses according to the stress, highlighting the Weighted gene co-expression network analysis (WGCNA) was conducted on the CARGs to reveal the different modules of co-expressed genes. WGCNA divided the 543 core DEGs into three different modules named as Blue, Turquoise and Grey, containing 113, 276 and 154, genes respectively (Figure 2A,B; Table S3). Association of the detected modules and the abiotic stresses indexes showed that all the three modules respond differently to the abiotic stresses, except for salt stress. In fact, all the three modules were negatively correlated with salt stress (r = −0.47, −0.84, −0.74 for Blue, Turquoise and Grey, respectively), suggesting that the CARGs should be down-regulated to allow sesame survive under salt stress (Figure 2C). In addition, with all modules taken together, we could observe that the CARGs engage distinct responses according to the stress, highlighting the key roles played by the regulator genes to shape the stress-specific responses.

key roles played by the regulator genes to shape the stress-specific responses. To explore the particular biological processes involving the three modules detected by WGCNA, we performed GO enrichment analysis. Blue module related genes represent the basal defense of sesame as evidenced by the enriched GO terms such as 'defense response' and 'response to biotic stimulus' (Figure S3A). Grey module related genes were enriched in the transporter activity role (Figure S3B). Finally, in Turquoise module, 'iron ion binding' and 'heme binding' were detected as the most enriched GO terms, which are well known to be involved in abiotic stress responses in plants [46,47] (Figure S3C). These results further support the premise that the co-expressed gene To explore the particular biological processes involving the three modules detected by WGCNA, we performed GO enrichment analysis. Blue module related genes represent the basal defense of sesame as evidenced by the enriched GO terms such as 'defense response' and 'response to biotic stimulus' (Figure S3A). Grey module related genes were enriched in the transporter activity role (Figure S3B). Finally, in Turquoise module, 'iron ion binding' and 'heme binding' were detected as the most enriched GO terms, which are well known to be involved in abiotic stress responses in plants [46,47] (Figure S3C). These results further support the premise that the co-expressed gene modules of the sesame CARGs play different functions in response to abiotic stresses.

modules of the sesame CARGs play different functions in response to abiotic stresses.

*Int. J. Mol. Sci.* **2019**, *20*, x FOR PEER REVIEW 5 of 19

**Figure 2.** Detection of co-expressed modules in the sesame core abiotic stress responsive genes based on WGCNA. (**A**) Dendrogram showing the different genes clustered into co-expressed modules. (**B**) Number of assigned DEGs to the different modules. (**C**) Association between co-expressed modules and abiotic stresses in sesame. The numbers represent the Pearson correlation coefficients. Positive correlation is colored in red while negative correlation is colored in green. **Figure 2.** Detection of co-expressed modules in the sesame core abiotic stress responsive genes based on WGCNA. (**A**) Dendrogram showing the different genes clustered into co-expressed modules. (**B**) Number of assigned DEGs to the different modules. (**C**) Association between co-expressed modules and abiotic stresses in sesame. The numbers represent the Pearson correlation coefficients. Positive correlation is colored in red while negative correlation is colored in green.

#### *2.4. Networks Displaying Relationships among Genes within Co-Expressed Modules 2.4. Networks Displaying Relationships among Genes within Co-Expressed Modules*

To understand the gene interaction within each module, we constructed the gene network using the Cytoscape software. Genes in the different modules were subsequently divided into different clusters, each having network of different number of genes (Figure 3). TFs are represented with different node colors except sky blue and the size of node circle is positively correlated with the number of genes it interacts (Figure 3). Genes with biggest node sizes represent the hub genes. In Blue module, we observed two clusters of genes connected by the gene SIN\_1024285 (transmembrane protein 45B-like). We identified several hub genes, including SIN\_1003530 (2-aminoethanethiol dioxygenase), SIN\_1011060 (pyruvate decarboxylase 1), SIN\_1018022 (pentatricopeptide repeat-containing protein), SIN\_1003097 (NA), SIN\_1003294 (NA), SIN\_1006713 (cationic amino acid transporter 1-like), SIN\_1013309 (alcohol dehydrogenase 3), SIN\_1014647 (squalene monooxygenase), SIN\_1017721 (NA), SIN\_1024789 (ATP-dependent 6-phosphofructokinase) etc. Furthermore, some key TFs were also present SIN\_1005329 (ERF) and SIN\_1019627 (WRKY), which may play important regulation role in this module (Figure 3). Turquoise module contains complex clusters of genes, implying diverse biological functions within this module (Figure 3). The key hub genes detected are SIN\_1017899 (NA), SIN\_1002615 (G-type lectin S-receptor-like serine/threonine-protein kinase), SIN\_1005524 (exocyst complex component EXO70A1-like), SIN\_1017189 (NA), SIN\_1007389 (NA), SIN\_1016615 (NA), SIN\_1009465 (pectinesterase 1-like), SIN\_1021953 (WRKY) and SIN\_1013789 (MAPK). In the Grey module, one main cluster surrounded by several small clusters of genes was detected (Figure S4). Several hub genes, including SIN\_1002211 (aspartic proteinase nepenthesin-1), SIN\_1003190 (protein ECERIFERUM 1), SIN\_1019848 (GDSL esterase), SIN\_1008670 (18.2 kDa class I heat shock protein), SIN\_1014764 (3-ketoacyl-CoA synthase 4) and SIN\_1004965 (glucan endo-1,3-β-glucosidase 13), may To understand the gene interaction within each module, we constructed the gene network using the Cytoscape software. Genes in the different modules were subsequently divided into different clusters, each having network of different number of genes (Figure 3). TFs are represented with different node colors except sky blue and the size of node circle is positively correlated with the number of genes it interacts (Figure 3). Genes with biggest node sizes represent the hub genes. In Blue module, we observed two clusters of genes connected by the gene SIN\_1024285 (transmembrane protein 45B-like). We identified several hub genes, including SIN\_1003530 (2-aminoethanethiol dioxygenase), SIN\_1011060 (pyruvate decarboxylase 1), SIN\_1018022 (pentatricopeptide repeat-containing protein), SIN\_1003097 (NA), SIN\_1003294 (NA), SIN\_1006713 (cationic amino acid transporter 1-like), SIN\_1013309 (alcohol dehydrogenase 3), SIN\_1014647 (squalene monooxygenase), SIN\_1017721 (NA), SIN\_1024789 (ATP-dependent 6-phosphofructokinase) etc. Furthermore, some key TFs were also present SIN\_1005329 (ERF) and SIN\_1019627 (WRKY), which may play important regulation role in this module (Figure 3). Turquoise module contains complex clusters of genes, implying diverse biological functions within this module (Figure 3). The key hub genes detected are SIN\_1017899 (NA), SIN\_1002615 (G-type lectin S-receptor-like serine/threonine-protein kinase), SIN\_1005524 (exocyst complex component EXO70A1-like), SIN\_1017189 (NA), SIN\_1007389 (NA), SIN\_1016615 (NA), SIN\_1009465 (pectinesterase 1-like), SIN\_1021953 (WRKY) and SIN\_1013789 (MAPK). In the Grey module, one main cluster surrounded by several small clusters of genes was detected (Figure S4). Several hub genes, including SIN\_1002211 (aspartic proteinase nepenthesin-1), SIN\_1003190 (protein ECERIFERUM 1), SIN\_1019848 (GDSL esterase), SIN\_1008670 (18.2 kDa class I heat shock protein), SIN\_1014764 (3-ketoacyl-CoA synthase 4) and SIN\_1004965 (glucan endo-1,3-β-glucosidase 13), may play preponderant roles in this module.

Next, we extended our analysis to unveil the major regulators of the different co-expressed gene modules of the CARGs in sesame. First, we screened for overrepresented regulatory motifs in the 1

play preponderant roles in this module.

kb promoter regions of genes within each module. Seven TF binding motifs were enriched in the analyzed promoter regions (Table S4). Then, we constructed the gene regulatory networks predicting directional interactions between CARG regulators and targets associated with the three modules using the TF DNA binding motif information. Figure 4 presents the generated regulatory networks, in which the circular nodes represent the key regulators connected by an edge to a module. The size of the nodes is proportional to the number of the inferred regulated genes harboring the corresponding TF binding motifs in the promoter region and the nodes are colored according to the appertaining module. Our predicted networks showed the main TFs regulating gene expression within each module. An intense transcriptional activity was predicted in the Turquoise module, having the highest number of predicted regulators. The networks also highlighted the master players (SIN\_1026747 (MYB), SIN\_1012698 (bHLH) and SIN\_1008018 (Dof)), which are predicted to regulate significant numbers of target genes in their modules and represent potential genes to exploit for the enhancement of sesame tolerance to various abiotic stresses. We

**Figure 3.** Co-expressed network analysis of Blue module and Turquoise module. The size of node circle is positively correlated with the number of the interacting gene partners. The gene names marked in red are those selected for validation using transgenic Arabidopsis approach. **Figure 3.** Co-expressed network analysis of Blue module and Turquoise module. The size of node circle is positively correlated with the number of the interacting gene partners. The gene names markedin red are those selected for validation using transgenic Arabidopsis approach.

Next, we extended our analysis to unveil the major regulators of the different co-expressed gene modules of the CARGs in sesame. First, we screened for overrepresented regulatory motifs in the 1 kb promoter regions of genes within each module. Seven TF binding motifs were enriched in the analyzed promoter regions (Table S4). Then, we constructed the gene regulatory networks predicting directional interactions between CARG regulators and targets associated with the three modules using the TF DNA binding motif information. Figure 4 presents the generated regulatory networks, in which the circular nodes represent the key regulators connected by an edge to a module. The size of the nodes is proportional to the number of the inferred regulated genes harboring the corresponding TF binding motifs in the promoter region and the nodes are colored according to the appertaining module. Our predicted networks showed the main TFs regulating gene expression within each module. An intense transcriptional activity was predicted in the Turquoise module, having the highest number of predicted regulators. The networks also highlighted the master players (SIN\_1026747 (MYB), SIN\_1012698 (bHLH) and SIN\_1008018 (Dof)), which are predicted to regulate significant numbers of target genes in their modules and represent potential genes to exploit for the enhancement of sesame tolerance to various abiotic stresses. We infer that these regulators are the key genes that shape the sesame CARG specific responses to the major abiotic stresses.

*Int. J. Mol. Sci.* **2019**, *20*, x FOR PEER REVIEW 7 of 19

**Figure 4.** Predicted directional interactions of TFs and the co-expressed modules in the sesame CARGs. Network plots of inferred connections between TF and genes in the three modules. The promoter sequences of genes associated with each module were tested for overrepresentation of DNA motifs shown to be bound to TFs that are differentially transcribed following stress treatments. Each TF with a known motif is represented by a colored circle corresponding to its appertaining module. The different modules are represented by a rectangle. An edge between a TF and a module indicates significant enrichment of the corresponding binding motif in that module. The size of each TF node is proportional to the number of predicted regulated downstream genes. Logos of the seven enriched DNA binding motifs within the promoter regions of the genes belonging to each module detected by WGCNA were added. **Figure 4.** Predicted directional interactions of TFs and the co-expressed modules in the sesame CARGs. Network plots of inferred connections between TF and genes in the three modules. The promoter sequences of genes associated with each module were tested for overrepresentation of DNA motifs shown to be bound to TFs that are differentially transcribed following stress treatments. Each TF with a known motif is represented by a colored circle corresponding to its appertaining module. The different modules are represented by a rectangle. An edge between a TF and a module indicates significant enrichment of the corresponding binding motif in that module. The size of each TF node is proportional to the number of predicted regulated downstream genes. Logos of the seven enriched DNA binding motifs within the promoter regions of the genes belonging to each module detected by WGCNA were added.

#### *2.5. Validation of Hub and Non-Hub TFs from the Co-Expressed Modules of the Sesame CARGs in Transgenic Arabidopsis 2.5. Validation of Hub and Non-Hub TFs from the Co-Expressed Modules of the Sesame CARGs in Transgenic Arabidopsis*

We selected two transcription factor encoding genes: SIN\_1005329 (SiERF5) and SIN \_1026079 (SiNAC104) to confirm their involvement in various abiotic stress responses using Arabidopsis system. In fact, sesame resilience to the genetic manipulation is still significant enough to justify the use of a heterologous system such as *Arabidopsis thaliana*. SiERF5 is a hub gene from Blue module while SiNAC104 is a non-hub gene from Turquoise module (Figure 3) and both were induced at different time points after stress treatments (Table S2). We hypothesized that the over-expression of SiERF5 will confer higher abiotic stress tolerance than the over-expression of SiNAC104 in We selected two transcription factor encoding genes: SIN\_1005329 (SiERF5) and SIN \_1026079 (SiNAC104) to confirm their involvement in various abiotic stress responses using Arabidopsis system. In fact, sesame resilience to the genetic manipulation is still significant enough to justify the use of a heterologous system such as *Arabidopsis thaliana*. SiERF5 is a hub gene from Blue module while SiNAC104 is a non-hub gene from Turquoise module (Figure 3) and both were induced at different time points after stress treatments (Table S2). We hypothesized that the over-expression of SiERF5 will confer higher abiotic stress tolerance than the over-expression of SiNAC104 in Arabidopsis, given their contrasting importance in their respective modules.

Arabidopsis, given their contrasting importance in their respective modules. We generated several T3 homozygous over-expressing Arabidopsis lines for both genes, of which three independent lines for each gene (SiERF5-1, SiERF5-2, SiERF5-3 and SiNAC104-1, SiNAC104-2, SiNAC104-3) were selected for stress applications. Overexpression of SiERF05 increased the leaf biomass which may result from a pleiotropic effect as compared to We generated several T3 homozygous over-expressing Arabidopsis lines for both genes, of which three independent lines for each gene (SiERF5-1, SiERF5-2, SiERF5-3 and SiNAC104-1, SiNAC104-2, SiNAC104-3) were selected for stress applications. Overexpression of SiERF05 increased the leaf biomass which may result from a pleiotropic effect as compared to SiNAC104-overexpressing plants

SiNAC104-overexpressing plants and the vector control (VC) plants (Figure S5). qRT-PCR was used

and the vector control (VC) plants (Figure S5). qRT-PCR was used to confirm the integration and the expression of the transgene (Figure S5). Under osmotic stress induced by 250 mM Mannitol addition to the MS medium, the growth performance of all the transgenic lines was significantly better than the VC plants (*p* < 0.001), indicating that the transgenes confer osmotic stress tolerance (Figure 5A,B). Furthermore, we observed that the SiERF05-overexpressing lines maintained significantly better relative root growth than the SiNAC104-overexpressing lines (Figure 5A,B). Next, we evaluated the performance of the overexpressing lines and VC plants under drought (20 days), salinity (200 mM NaCl) and waterlogging (18 days). As shown in Figure 6A, SiERF5 overexpression had pleiotropic effect including delaying of flowering time and increase of the rosette biomass compared to the SiNAC104-overexpressing lines and VC plants, showing that SiERF5 participates in plant growth and development. This result hints that Blue and Turquoise modules related genes are functionally different, as demonstrated by the GO enrichment analysis. Under stress treatments, all the plants were affected, reflected by the reduced biomass (Figure 6A). However, similar to the osmotic stress treatment, the transgenic lines significantly better sustained drought, salt and waterlogging stresses than the VC plants as evidenced by their higher survival rate, relative rosette fresh weight and relative seed yield (Figure 6B–D). In addition, the results indicated that the two transgenes were more potent under drought and waterlogging stresses compared to the salt stress since we had a more pronounced biomass reduction and no seed yield under salt. Furthermore, we observed a significantly higher tolerance to the different abiotic stresses by SiERF5-overexpressing lines than the SiNAC104-overexpressing lines. Overall, these findings support the argument that the proposed sesame CARGs are functionally active under various abiotic stresses. In addition, this finding supports our formulated hypothesis, denoting that the position of a gene in the co-expressed modules (hub genes or non-hub genes) may reflect the importance of its function, which will guide the choice of high potential genes from the sesame CARGs for germplasm enhancement. to confirm the integration and the expression of the transgene (Figure S5). Under osmotic stress induced by 250 mM Mannitol addition to the MS medium, the growth performance of all the transgenic lines was significantly better than the VC plants (*p* < 0.001), indicating that the transgenes confer osmotic stress tolerance (Figure 5A,B). Furthermore, we observed that the SiERF05-overexpressing lines maintained significantly better relative root growth than the SiNAC104-overexpressing lines (Figure 5A,B). Next, we evaluated the performance of the overexpressing lines and VC plants under drought (20 days), salinity (200 mM NaCl) and waterlogging (18 days). As shown in Figure 6A, SiERF5 overexpression had pleiotropic effect including delaying of flowering time and increase of the rosette biomass compared to the SiNAC104-overexpressing lines and VC plants, showing that SiERF5 participates in plant growth and development. This result hints that Blue and Turquoise modules related genes are functionally different, as demonstrated by the GO enrichment analysis. Under stress treatments, all the plants were affected, reflected by the reduced biomass (Figure 6A). However, similar to the osmotic stress treatment, the transgenic lines significantly better sustained drought, salt and waterlogging stresses than the VC plants as evidenced by their higher survival rate, relative rosette fresh weight and relative seed yield (Figure 6B–D). In addition, the results indicated that the two transgenes were more potent under drought and waterlogging stresses compared to the salt stress since we had a more pronounced biomass reduction and no seed yield under salt. Furthermore, we observed a significantly higher tolerance to the different abiotic stresses by SiERF5-overexpressing lines than the SiNAC104-overexpressing lines. Overall, these findings support the argument that the proposed sesame CARGs are functionally active under various abiotic stresses. In addition, this finding supports our formulated hypothesis, denoting that the position of a gene in the co-expressed modules (hub genes or non-hub genes) may reflect the importance of its function, which will guide the choice of high potential genes from the sesame CARGs for germplasm enhancement.

*Int. J. Mol. Sci.* **2019**, *20*, x FOR PEER REVIEW 8 of 19

**Figure 5.** Functional characterization of SiERF5- and SiNAC104-overexpressing lines and their counterparts vector control (VC) plants under osmotic stress induced by 250 mM Mannitol addition to the MS medium. (**A**) Phenotypes of the transgenic and VC plants under stress. The bar represents 1 cm (**B**) relative root length estimated as the ratio of the root length recorded after 10 days under stressed and control MS mediums. Values are means ±SD from two independent measurements. Bars with different letters are significantly different (*p* ≤ 0.05). **Figure 5.** Functional characterization of SiERF5- and SiNAC104-overexpressing lines and their counterparts vector control (VC) plants under osmotic stress induced by 250 mM Mannitol addition to the MS medium. (**A**) Phenotypes of the transgenic and VC plants under stress. The bar represents 1 cm (**B**) relative root length estimated as the ratio of the root length recorded after 10 days under stressed and control MS mediums. Values are means ±SD from two independent measurements. Bars with different letters are significantly different (*p* ≤ 0.05).

*Int. J. Mol. Sci.* **2019**, *20*, x FOR PEER REVIEW 9 of 19

**Figure 6.** Functional characterization of SiERF5- and SiNAC104-overexpressing lines and their counterparts vector control (VC) plants under drought (20 days), salt (200 mM NaCl) and waterlogging (18 days). (**A**) Phenotypes of the plants. The bar represents 8 cm (**B**) Survival rate of the plants. (**C**) Relative rosette fresh weight. (**D**) Relative seed yield. The relative values are estimated as the ratio of the data recorded after stress period in stressed and control plants. Values are means ± SD from two independent measurements. For each treatment, bars with different letters are significantly different (*p* ≤ 0.05). **Figure 6.** Functional characterization of SiERF5- and SiNAC104-overexpressing lines and their counterparts vector control (VC) plants under drought (20 days), salt (200 mM NaCl) and waterlogging (18 days). (**A**) Phenotypes of the plants. The bar represents 8 cm (**B**) Survival rate of the plants. (**C**) Relative rosette fresh weight. (**D**) Relative seed yield. The relative values are estimated as the ratio of the data recorded after stress period in stressed and control plants. Values are means ± SD from two independent measurements. For each treatment, bars with different letters are significantly different (*p* ≤ 0.05).

#### **3. Discussion 3. Discussion**

Reinforcing crop's tolerance to abiotic stress has become a hot issue in the current scenario of climate change, which is boosting extreme weathers, posing a growing threat to sustainable agriculture. Because of the multitude of environmental stresses affecting crop survival and productivity in the field, the identification of potential genes conferring tolerance to multiple abiotic stresses is highly desirable [9,48]. In this study, we employed various abiotic stresses RNA-Seq datasets (waterlogging, drought, salt and osmotic stresses) from diverse sesame genotypes with contrasting levels of tolerance. Our meta-analysis unveiled 543 genes as the core abiotic stress responsive genes (CARG) modulating sesame responses to multiple abiotic stresses. We validated a subset of these CARGs in ten independent sesame genotypes, showing that these CARGs are not genotype-dependent but are well conserved in the sesame species. The alteration of the expression levels of selected CARGs under heat stress further hints that the proposed CARGs may be functional under a more diverse set of environmental stressors. Nonetheless, our studied transcriptome data were mainly from sesame root samples; hence, additional analyses are needed to check how the proposed CARGs respond to stress in other tissues. Very recently, Cohen and Leach [27] also demonstrated that meta-analysis of diverse transcriptomic data sets in rice is a valid and robust approach to develop hypotheses for how plants respond to various stress. They discovered a list of novel candidate genes for improving rice environmental stress tolerance, which were not detected in studies focused on a single stress. Similarly, meta-analysis of transcriptome data has been performed to find out key genes in responses to abiotic stresses in other important plants [45,49–52]. Reinforcing crop's tolerance to abiotic stress has become a hot issue in the current scenario of climate change, which is boosting extreme weathers, posing a growing threat to sustainable agriculture. Because of the multitude of environmental stresses affecting crop survival and productivity in the field, the identification of potential genes conferring tolerance to multiple abiotic stresses is highly desirable [9,48]. In this study, we employed various abiotic stresses RNA-Seq datasets (waterlogging, drought, salt and osmotic stresses) from diverse sesame genotypes with contrasting levels of tolerance. Our meta-analysis unveiled 543 genes as the core abiotic stress responsive genes (CARG) modulating sesame responses to multiple abiotic stresses. We validated a subset of these CARGs in ten independent sesame genotypes, showing that these CARGs are not genotype-dependent but are well conserved in the sesame species. The alteration of the expression levels of selected CARGs under heat stress further hints that the proposed CARGs may be functional under a more diverse set of environmental stressors. Nonetheless, our studied transcriptome data were mainly from sesame root samples; hence, additional analyses are needed to check how the proposed CARGs respond to stress in other tissues. Very recently, Cohen and Leach [27] also demonstrated that meta-analysis of diverse transcriptomic data sets in rice is a valid and robust approach to develop hypotheses for how plants respond to various stress. They discovered a list of novel candidate genes for improving rice environmental stress tolerance, which were not detected in studies focused on a single stress. Similarly, meta-analysis of transcriptome data has been performed to find out key genes in responses to abiotic stresses in other important plants [45,49–52].

The proposed sesame CARGs are functionally diversified as evidenced by the various biological pathways contributed by these genes (Figure S3). Several genes within the sesame CARGs are universally known to be abiotic stress responsive genes. For example, we detected the gene

The proposed sesame CARGs are functionally diversified as evidenced by the various biological pathways contributed by these genes (Figure S3). Several genes within the sesame CARGs are universally known to be abiotic stress responsive genes. For example, we detected the gene *SIN\_1012768* predicted to be a member of the late embryogenesis abundant (LEA) family. Proteins encoded by the LEA are demonstrated to play defensive roles in plants during abiotic stresses, including cold, drought, salinity [53–57]. Other well-known abiotic stress marker genes are the alcohol dehydrogenase (ADH) family members, which are key enzymes responsible for catalyzing the reduction of acetaldehyde to ethanol using NADH as reductant, particularly during the periods of anaerobic stress [58,59]. ADH genes are involved in various environmental stresses such as drought cold, salinity, hypoxia and pathogen infection [60–64]. In the present study, we identified one ADH gene (*SIN\_1013309*) in the sesame CARGs. Besides, we also detected several important plant abiotic stress marker genes within the sesame CARGs, including peroxidase (*SIN\_1026962* and *SIN\_1013457*), DREB (*SIN\_1015595*), universal stress protein (*SIN\_1022749* and *SIN\_1012609*), glutathione *S*-transferase (*SIN\_1017866* and *SIN\_1002858*), 1-aminocyclopropane-1-carboxylate oxidase (*SIN\_1024757*, *SIN\_1007659*, *SIN\_1023068*, *SIN\_1009668* and *SIN\_1026934*), heat shock protein (*SIN\_1008669*, *SIN\_1008679*, *SIN\_1008672* and *SIN\_1008670*) [65–81] etc. In future abiotic stress treatment experiments in sesame, we propose to select some of these well-known CARGs as abiotic stress marker genes in order to gauge the stress levels. Noteworthy, several uncharacterized genes were present in the CARGs, providing an exciting gene repertoire to further illuminate the complex mechanisms of plant responses to multiple abiotic stresses.

We employed the weighted gene co-expression network analysis (WGCNA) to break down the sesame CARGs into three functional modules (Figure 2A,B). Interestingly, the functional characterization of these three modules revealed that they are involved in distinct biological pathways in response to abiotic stresses (Figure S3). With the WGCNA package, we correlated the different abiotic stressors to the gene modules (Figure 2C). This analysis is cardinal because it allowed the identification of the synergistic and antagonistic gene modules of abiotic stress response in sesame. We found that the co-expressed modules of the sesame CARGs globally displayed positive correlations with waterlogging, drought and osmotic stresses, but they were all negatively correlated with salinity stress. This suggests that manipulation of master genes of these modules to simultaneously enhance tolerance to all the four investigated abiotic stresses may not be possible in sesame, because enhancing tolerance to waterlogging, drought and osmotic stresses will lead to an increase sensitivity to salinity stress. Our findings are not surprising, since previous studies have also shown plant antagonistic responses to some stresses [9,14].

Transcription factors (TF) are regulatory molecules that play central roles in gene transcription and promote plant adaptation to various environmental conditions. The sesame CARGs contained several TFs, with ERF, MYB, bHLH and WRKY being the more predominant families (Figure 1C). A similar meta-analysis in cotton also underscored the important role of these TF families in abiotic stress responses [52], indicating a conserved abiotic stress gene regulation mechanism in plants. ERF family has been one the most studied genes in plants. Extensive studies have shown that ERF genes are essential in responses to a wide range of abiotic stresses mediated by the plant hormone ethylene [82–86]. It has been reported that MYB TFs also play prominent roles in triggering the right response upon exposure of plants to abiotic stresses through the ABA-dependent and independent pathways (reviewed by Li et al. [87]; Roy [88]). The WRKY genes are among the top four TF families highly active in the transcriptional reprogramming during stress and act principally through the ABA mediated pathways [73,89]. Conversely, the role and regulatory mechanisms of bHLH genes in plant abiotic stresses responses are still elusive [90]. Therefore, the sesame bHLH genes detected as key regulators of abiotic stress responsive genes in this study may represent candidate genes for the elucidation of bHLH abiotic stress regulation mechanism in plants. Overall, the diversity of TFs within the sesame CARGs highlights the complex network of interacting pathways which shape the responses to multiple abiotic stresses. To further pinpoint the master players among the large number of detected TFs, we identified in the promoter of genes within each module, the enriched putative *cis*-regulatory

motifs. Previous works in yeast and human have demonstrated that genes with similar expression patterns are regulated by the same set of TFs, and therefore are likely to have similar *cis*-regulatory motifs in their upstream promoter regions [91,92]. Our study corroborated well these conclusions and unraveled for each module the master TFs that may regulate the gene expression under specific abiotic stress (Figure 4).

In gene networks, many genes only interact with a limited number of other genes, whereas a smaller subset of genes (hub genes) interacts with many other genes and therefore has a more central role [93]. Hub genes are expected to play preponderant and essential functions in organism's fitness and according to Jeong et al. [94], hub genes are three times more likely to be essential than genes with fewer interaction partners. To test this hypothesis in our predicted gene networks (Figure 3), we selected a hub gene (*SiERF5*) and a non-hub gene (*SiNAC104*), both being transcription factors. Over-expression of these two genes in *Arabidopsis thaliana* resulted in an enhanced tolerance to drought, waterlogging and osmotic stresses, but the over-expressing lines did not tolerate salinity stress (Figures 5 and 6). Furthermore, we observed that the transgenic lines over-expressing the hub gene had a stronger fitness and a higher performance under abiotic stresses compared to those transformed with the non-hub gene. It is worth mentioning that the over-expression of the hub gene had clear pleiotropic effects beyond abiotic stress responses in *Arabidopsis thaliana*, thus might play a central role in various biological pathways. This experiment therefore highlighted three key findings: (1) perturbation of a hub gene is likely to have a major fitness consequence than a non-hub gene; (2) proper manipulation of sesame CARGs may confer tolerance to multiple abiotic stresses; (3) genetic manipulation for generating sesame lines tolerant to all the four investigated abiotic stresses may be challenging due to the antagonistic response of the sesame CARGs in the face of some abiotic stresses. Although the main goal of this work was not to investigate the functional importance of hub genes versus non-hub genes, the preliminary result obtained from the Arabidopsis mutants will fuel a future study based on both sesame and Arabidopsis using over-expression and knock-out transgenic techniques and employing more hub genes and non-hub genes to comprehensively elucidate this important scientific question.

## **4. Materials and Methods**

## *4.1. RNA-Sequencing Datasets of Abiotic Stressed Sesame Samples*

In order to decipher the core genome involved in abiotic stress responses in plants, our group previously generated several RNA-Seq data of sesame under waterlogging [6], drought [7], salt [8] and details of the experimental procedures could be found in the respective articles. In this study, we collected the root RNA-Seq data of the waterlogging-tolerant genotype Zhongzhi13 under 3 h waterlogging stress and control condition (SRR2886790). Concerning the drought stress, we collected the root RNA-Seq data from a drought-tolerant genotype ZZM0635 (SAMN06130606) and a drought-sensitive genotype ZZM4782 (SAMN06130607) after 0, 3, 7 and 11 days drought stress. For the salt stress experiment, two contrasting genotypes (salt-tolerant WZM3063 and salt-sensitive ZZM4028) were treated with 150mM NaCl and whole seedling samples were collected at 0, 2, 6, 12 and 24 h (PRJNA524278). In addition to these released datasets, we newly generated an RNA-Seq data from root samples of osmotic stressed sesame (PRJNA552167). Two sesame genotypes (osmotic-tolerant G546 and osmotic-sensitive G259) grown in a box containing 9 L of half-strength Hoagland solution for 21 days, were treated with 2% PEG6000 for 7 days. Samples were collected in triplicate from stress and control conditions after the stress period, immediately placed in liquid nitrogen and stored at −80 ◦C until use.

#### *4.2. Total RNA Isolation and Sequencing from the PEG6000-Treated Seedlings*

Total RNA of the 12 PEG6000-treated samples was extracted using an EASYspin Plus kit (Aidlab, Beijing, China). The cDNA libraries generated from RNA samples were pair-end sequenced on an

Illumina Hiseq 4000 platform (San Diego, California, CA, USA.) according to the methods described by Dossa et al. [7].
