*Article Arabidopsis NDL-AGB1 modules* **Play Role in Abiotic Stress and Hormonal Responses Along with Their Specific Functions**

#### **Arpana Katiyar and Yashwanti Mudgil \***

Department of Botany, University of Delhi, New Delhi 110007, India; katiyar.arpana@gmail.com **\*** Correspondence: ymudgil@botany.du.ac.in

Received: 18 August 2019; Accepted: 19 September 2019; Published: 24 September 2019

**Abstract:** *Arabidopsis* N-MYC Downregulated Like Proteins (NDLs) are interacting partners of G-Protein core components. Animal homologs of the gene family N-myc downstream regulated gene (NDRG) has been found to be induced during hypoxia, DNA damage, in presence of reducing agent, increased intracellular calcium level and in response to metal ions like nickel and cobalt, which indicates the involvement of the gene family during stress responses. *Arabidopsis NDL* gene family contains three homologs *NDL1*, *NDL2* and *NDL3* which share up to 75% identity at protein level. Previous studies on NDL proteins involved detailed characterization of the role of NDL1; roles of other two members were also established in root and shoot development using miRNA knockdown approach. Role of entire family in development has been established but specific functions of NDL2 and NDL3 if any are still unknown. Our *in-silico* analysis of *NDLs* promoters reveled that all three members share some common and some specific transcription factors (TFs) binding sites, hinting towards their common as well as specific functions. Based on promoter elements characteristics, present study was designed to carry out comparative analysis of the *Arabidopsis NDL* family during different stages of plant development, under various abiotic stresses and plant hormonal responses, in order to find out their specific and combined roles in plant growth and development. Developmental analysis using GUS fusion revealed specific localization/expression during different stages of development for all three family members. Stress analysis after treatment with various hormonal and abiotic stresses showed stress and tissue-specific differential expression patterns for all three *NDL* members. All three *NDL* members were collectively showed role in dehydration stress along with specific responses to various treatments. Their specific expression patterns were affected by presence of interacting partner the *Arabidopsis* heterotrimeric G-protein β subunit 1 (AGB1). The present study will improve our understanding of the possible molecular mechanisms of action of the independent *NDL*–*AGB1* modules during stress and hormonal responses. These findings also suggest potential use of this knowledge for crop improvement.

**Keywords:** *Arabidopsis*; *N*-Myc downregulated like; abiotic stress; mannitol; PEG; promoter elements; GUS staining; plant hormones

#### **1. Introduction**

G-proteins are well characterized for signal transduction in response to many environmental stresses like drought, heat, salinity and light intensity. Internalization of AtRGS1 (Regulator of G protein signaling protein) in response to NaCl indicates role of G-protein signaling in stress responses [1]. Expression levels of *AGB1* were significantly upregulated during salt treatment, whereas showed down regulation during heat and cold stress [2]. G-protein interactome has been published using Yeast two hybrid (Y2H) and in planta interaction studies, NDL interactome has also been solved as part of same initiative and hints towards role of NDLs in various stress responses [3]. Animal homolog

of plant NDLs consist of four members NDRG1-4. NDRG1 is a ubiquitously expressed intracellular protein that is induced under number of stresses and pathological responses [4]. NDRG1 upregulates under various stress conditions like hypoxia, in the presence of a reducing agent, DNA damage and in response to increased intracellular calcium concentration [5–9]. Various metal ions like nickel, cobalt, and iron are also reported for inducing expression of NDRG1 [10].

Plant NDR proteins were first reported in sunflower (SF21) as pollen, stigma and transmitting tissue cell-specific proteins with putative role, as a signaling molecule in pollen-pistil interaction [11]. Detailed analysis of expression profiles of gene family members in sunflower revealed multiple alternative and organ-specific transcripts indicating spatial and developmental regulation and roles [12,13].

In *Arabidopsis* NDL1 was re-discovered as AGB1/AGG dimmer interacting protein, three homologs had been identified, showing presence of a conserved NDR domain and an alpha/beta hydrolase fold. All NDLs interact with AGB1/AGG (βγ) dimer and C4 domain of RGS1 [14]. Detailed characterization of *Arabidopsis* NDL1 and widespread presence of NDLs from bryophytes to angiosperm hints towards their important functions, though their specific molecular and cellular mechanism of action which still needs to be discovered.

A recent study involving correlation of the gene expression and morpho-physiological traits under severe water-deficient conditions reports expression of *NDL1* positively correlated with rate of transpiration and projected rosette area [15]. NDLs interactome predicts AtNDL1 interaction with ANNEXIN (drought-responsive gene) Sodium and Lithium-Tolerant 1 (SLT1-salt stress-responsive gene) O-Acetylserine (Thiol) Lyase (OAS-TL) Isoform A1 (characterized for its role during cadmium tolerance) *Arabidopsis* Ribosomal Protein (ARS27A involved in genotoxic stress) hinting us to speculate that AtNDL proteins playing important role in abiotic stress signaling through these NDLs stress related interactors (NSRI), stress signaling downstream components (Figure 1A showing location of all NSRI members on *Arabidopsis* genome) [3].

**Figure 1.** Mapping of the *Arabidopsis* N-MYC Downregulated Like Proteins (*AtNDL)* gene family and putative abiotic stress interacting partners, NDLs stress related interactors (NSRI) on *Arabidopsis* genome along with the stress related elements in the promoter region of *AtNDL* gene family. (**A**) Physical Map of AGI at TAIR using Chromosome Map tool to indicate Chromosomal location of *NDLs* (*NDL1*, *NDL2*, and *NDL3*) and their stress related interactors in *Arabidopsis.* The names of the genes are shown to the right of each chromosome and location of genes is indicating their absolute location on respective chromosome. *RGS1*-Regulator of G protein signaling, *ANN1*-ANNEXIN, *SLTI*-Sodium and Lithium-Tolerant 1, *RS27A*- *Arabidopsis* Ribosomal Protein, *OAS-TL*-O-Acetylserine (Thiol) Lyase. (**B**) In silico analysis of the promoter region of *AtNDL* gene family promoter regions analyzed to get information about common *cis*-regulatory elements present on each of the member. (**C**) Specific *cis*-regulatory elements present on each *AtNDL* member. Different colors are used to indicate for each *cis*-regulatory element.

Previous *NDL* characterization study used miRNA knockdown approach of the entire *NDL* gene family, as well as the detailed overexpression and localization of NDL1, revealed their important role in regulating auxin transport and distribution. Detailed expression/localization profiling of the *NDL1* homologs *NDL2* and *NDL3* had not been done, though all three proteins share 75% identity at amino acid level [14]. Current study aims to dissect out independent functional significance of all three *NDL* gene family members and their role in plant stress management, comparative transcription profiling of all three *NDL* gene family members is performed in silico and *in vivo*. Previously it has been reported that all *NDL* members play role in root as well as shoot development, AGB1 presence is not essential for transcriptional stability of *NDL1* but needed for the stability of NDL1 protein *in planta* [14,16] in order to find-out effect of AGB1 availability on expression profile of *NDL2* and *NDL3* their expression is also studied in *agb1-2* background. The main objectives of the current study are:


#### **2. Results**

#### *2.1. In Silico Analysis of the Upstream Regulatory Regions of Arabidopsis NDL1, NDL2 and NDL3*

Comparative analysis of the *cis*-elements present in the promoter sequences of all three *NDL* members was done using the PLANTPAN 2.0 software. In silico comparative analysis revealed presence of many common TFs binding sites. All three members showed presence of AtMYB2 and AtMYC2 binding sites which are already established characteristic of *NDL* gene family in animals and reason for their nomenclature [17,18]. Abiotic stress-responsive TFs that are involved in dehydration (MYB1AT and ABRELATERD1), sulfur-responsive (SURECOREATSULTR11), phosphate-responsive (PIBS) and SA-induced (WBOXATNPR1), cytokinin-responsive element (ARR1) and defense related TFs like MYB1LEPR binding motif are also present in all the three members of the family. Presence of all these common elements in all three *NDL* genes hints towards the common role of *NDL* family during abiotic and biotic responses. Light-induced TFs like GT1CONSENSUS and SORLIP1AT and developmental and cell cycle-responsive TFs like LEAFYATG and MYBCOREAYCYCB1 are also present in all three members indicating diverse role of *NDL* family members during various stages of plant growth and development (Figure 1B and Table 1).

In addition to these common regulatory elements unique TFs binding sites specific to the promoter region of *NDL1, NDL 2* and *NDL3* are also observed (Figure 1C). *NDL1* promoter showed presence of E2F binding sites indicating its cell cycle-specific functions. Other noted specific elements of *NDL1* promoter region are SBOXATRBCS, important for sugar and abscisic acid (ABA) responses, and RHERPATEXPA7, a root hair specific cis element (Table 2).

*NDL2*, promoter harbors three ABA-responsive elements, ABREATCONSENSUS, ABREATRD22, and ACGTABREMOTIFA2OSEM. It is well established that drought stress triggers production of ABA which in turn induces various other drought inducible genes [19]. Thereby presence of ABA-responsive elements in the promoter sequence of *NDL2* along with MYBATRD22 dehydration-responsive element suggests regulation and involvement of *NDL2* during dehydration/drought stress (Table 2). Along with ABA-responsive elements other specific sites present in the promoter region of *NDL2* are Gibberellic acid (GA) (GADOWNAT) and Jasmonic acid (JA) (T/GBOXATPIN2)-responsive elements. *NDL3* promoter sequence shows presence of one pathogen-responsive element (GCCCORE), and two Ca<sup>2</sup>+/Calmodulin binding sites (CGCGBOXAT) along with AGAMOUS-LIKE2 and HD-ZIP transcription binding sites.


**Table 1.** List of common transcription factor (TF) binding sites in *NDL1*, *NDL2*, and *NDL3* promoter.

**Table 2.** List of specific TFs binding sites in promoter region of *NDL1*, *NDL2*, and *NDL3*.


Presence of common TFs binding sites in the upstream promoter regions of all three *NDL* members postulates their involvement in common regulatory mechanisms, while the presence of specific TFs binding sites might attribute unique functions to each *NDL* family member.

#### *2.2. Comparative In Silico Expression Analysis of NDL Members*

Comparative expression analysis of *NDL1, NDL2,* and *NDL3* has been performed using genevestigator microarray data. All three *NDL* members showed expression across all developmental stages, *NDL2* showed expression little higher than *NDL1* and *NDL3* (Figure 2A). Microarray analysis in different plant tissues revealed that *NDL* genes exhibit differential expression pattern across various organs. *NDL1* and *NDL2* showed highest expression in pollen while *NDL3* showed highest expression in carpel and shoot apex (Figure 2B). Since, the role of *NDLs* has been already established in auxin transport during root development [14]. We closely looked at the expression levels of the *NDLs* members in different parts of roots and found that expression level of all the members were comparable, in primary roots *NDL1* and *NDL2* is comparatively little higher than *NDL3*. Overall expression pattern revealed ubiquitous expression of the gene family which indicates important function of the family members during plant growth and developmental responses (Figure 2B).

**Figure 2.** In silico expression analysis of *NDL* gene family. (**A**) GENEVESTIGATOR microarray expression analysis of *NDL1*, *NDL2* and *NDL3* during different developmental stages of plant growth. (**B**) In silico microarray analysis showing tissue-specific expression of *NDL1, NDL2,* and *NDL3* using GENEVESTIGATOR.

#### *2.3. Comparative In Vivo Expression Analysis of NDLs during Early Stages of the Plant Growth*

*In-vivo* expression analysis of all three *NDL* members showed that the localization pattern of GUS is quite unique for each member. During the initial stages of plant growth (4–12 day-old seedlings) *pNDL1-NDL1-GUS* showed localization in cotyledonary leaves, true leaves, various parts of primary and lateral roots including elongation zone and root apical meristem (RAM) (Figure 3A–C and Table 3). *pNDL2-GUS* showed GUS expression in cotyledonary leaves, true leaves, hypocotyl and maturation zone of primary root, but strikingly missing from root tip and in lateral roots (Figure 3G–I and Table 3). *pNDL3-GUS* expression was missing in cotyledonary leaves, present in newly emerging true leaves, present in the RAM of the primary root and in lateral roots (Figure 3M–O and Table 3).

**Figure 3.** *In planta* expression analysis of *NDL* gene family during different developmental stagesin presence and absence of *AGB1*: (**A**–**C**, **G**–**I** and **M**–**O**). Histochemical GUS staining of *pNDL1-NDL1-GUS*, *pNDL2-GUS* and *pNDL3-GUS* seedlings in wild type Col-0 background. (**D**–**F**, **J**–**L** and **P**–**R**) Histochemical GUS staining of *pNDL1-NDL1-GUS*, *pNDL2-GUS* and *pNDL3-GUS* seedlings in *agb1-2* mutant background. Histochemical GUS analysis in transgenic *Arabidopsis* plants during different stages of development was carried out 4 day, 8 day, and 12 day old seedlings were analyzed. GUS expression is under the control of *NDL2* and *NDL3* promoter and translational fusion in case of *NDL1*, Scale Bar = 0.2 μM.



*Int. J. Mol. Sci.* **2019** , *20*, 4736







PR= Primary Root, LR = Lateral Root, CDZ = Cell division Zone, EZ = Elongation Zone, MZ = Maturation Zone, Root tip = RT. (-) No GUS staining, (++) basal staining level in each genotype without any treatment, (+++) increased level of staining, (+) decreased levels of staining compared to basal genotypic levels.

#### *2.4. Comparative In Vivo Expression Analysis of NDLs in Absence of AGB1*

Previously, we have found that presence of AGB1 doesn't affect transcriptional levels of *NDL1* but necessary to maintain the post translational steady state level of NDL1 [14]. Since all three NDL members had shown physical interaction with AGB1 in Y2H we hypothesized that NDL2 and NDL3 might require its partner, AGB1, either to regulate expression levels or for post-translational stability similar to NDL1. In order to test this speculation, expression of *pNDL2-GUS* and *pNDL3-GUS* was studied in *agb1-2* background.

In *agb1-2* background *pNDL1-NDL1-GUS* localization was absent from cotyledonary leaves (Figure 3D–F and Table 3) but present in true leaves. In case of roots the localization was absent from both primary and lateral roots RAM (Figure 3D,E inset) whereas present in other parts of roots similar to Col-0 background. In case of *pNDL2-GUS* and *pNDL3-GUS* localization pattern of the GUS was same in *agb1-2* mutant background as compared to wild type Col-0. (Figure 3J–L for *NDL2*, Figure 3P–R for *NDL3* and Table 3). These finding indicate that expression levels of both *NDL2* and *NDL3* are also unaffected by the absence of *AGB1* similar to previously reported for *NDL1.*

#### *2.5. In Silico Expression Analysis of NDLs under Various Stress and Hormonal Treatments*

Previous study has confirmed that *NDLs* play role in auxin transport during lateral root formation [14]. Correlation of gene expression and responses of morpho-physiological traits under severe water-deficient conditions indicate substantial alteration in expression levels of *AtNDL1* during drought stress responses using tanscriptomic analysis [15]. In order to find out if independent NDL members play independent specific roles or have combined redundant functions during abiotic stress exposure and hormonal treatments, we have analyzed their comparative expression profiles from publically available databases. (Genevestigator and eFP browser).

Genevestigator Microarray data revealed that after auxin (IAA) treatment, expression levels of *NDL1* decreased to nearly 1.77 fold, expression of *NDL2* and *NDL3* showed a similar trend, decrease of around 1.3 fold. (Figure 4A). In presence of ABA, *NDL2* showed a decrease (2.21 fold) in expression level, as compared to *NDL1* and *NDL3* which showed an upregulation up to ~1.12 fold and down regulation of nearly 1.25 fold respectively. Unlike *NDL2* and *NDL3*, *NDL1* showed slight decrease (1.1 fold) in its expression in response to GA (Figure 4A).

All *NDL* members also showed a response toward various abiotic stresses (heat, cold and drought). In case of heat treatment in green tissue *NDL1* showed increase (2.08 fold) as compared to *NDL2* and *NDL3* which showed a decrease in expression (1.44 and 1.52 fold respectively) in case of roots also similar trend was there. Even for cold treatment *NDL1* showed positive regulation (1.06 fold up) while down regulation was observed for *NDL2* and *NDL3* (3.59 and 1.43 fold respectively). Under drought stress all members have positive response, *NDL2* showed upregulation in expression (3.59 fold) followed by *NDL3* and *NDL1* (increase of 2.11 and 1.60 fold respectively, Figure 4A). The observed high expression of *NDL2* gene in drought stress thereby could possibly be linked to the fact that multiple ABA-responsive factor binding elements are present in promoter sequence of *NDL2*.

To find out optimal time point for the in vivo expression analysis we also performed comparative in silico expression analysis of *NDLs* at different time points during various abiotic stress treatments using eFP browser. For *NDL1* maximum transcript levels compared to no-treatment control were detected at 24 h. In case of *NDL2* expression, levels fluctuated during different stress treatments and variation was found at different time points but overall maximum levels were observed at 24 h. A trend similar to *NDL2* was observed for *NDL3* expression during different treatments and time points (Figure 4B)

**Figure 4.** In silico expression analysis of *NDL* gene family during abiotic stress and hormonal treatment using genevestigator and eFP browser: (**A**). Comparative *in-silico* expression analysis of *NDL1, NDL2,* and *NDL3* using GENEVESTIGATOR microarray data, under different abiotic stress and hormonal treatments (**B**). In silico analysis of *NDL1, NDL2,* and *NDL3* during abiotic stress treatments at different time points (0.5, 3, 6, 12, and 24 h) comparative expression is shown as fold change compared to no stress control.

#### *2.6. In Vivo Expression Analysis under Abiotic Stresses and Hormonal Treatments*

In order to find out maximum expression levels for all three *NDLs* promoter GUS/β-glucuronidase activity was tested using 4-Methylumbelliferyl-β-D-glucuronide (MUG) assay for two time points (6 h and 24 h) in Col-0 background, no significant difference was found between expression levels of *NDLs* and two time points. (Supplementary Figure S1). Based on the in vivo GUS/β-glucuronidase activity levels and in silico analysis of the expression profiles of all three *NDL* members, treatment with various hormones and abiotic stresses were designed. Six-day-old seedlings were subjected to 24 h treatment with various hormones (ABA, GA, IAA, JA, SA, and BAP) and abiotic stresses (cold, heat, Mannitol, PEG and salt) followed by measurement of GUS activity either histochemically or by fluorometry.

*pNDL1-NDL1-GUS*, showed increase in the GUS staining intensity when treated with cold, IAA (auxin), JA, Salicyclic acid (SA), BAP (cytokinin), and Poly ethylene glycol (PEG) and almost no staining observed with heat treatment (Figure 5). GUS staining remained unaffected after treatment with ABA (Supplementary Figure S2), GA (Supplementary Figure S3) and mannitol (Supplementary Figure S4). In case of salt treatment, no difference was found in root but hypocotyle showed increase in the GUS staining intensity (Supplementary Figure S5) compared to no stress treatment control.

**Figure 5.** In vivo histochemical GUS activity for *NDL1* in wild type background in response to different abiotic stresses and hormones: (**A**). Histochemical analysis of GUS activity for *NDL1* in wild type background in response to different abiotic stress and hormones. Changes in intensity of GUS staining were detected with Heat, Cold, IA, JA, SA and BAP. GUS staining was done in seedlings that were six days old and 24 h of stress treatment followed. (**B**) Histochemical GUS staining of 6-D-old *pNDL1-NDL1-GUS* seedlings in wild type Col-0 background. Increased levels of GUS staining intensity detected with PEG at root tip (inset). Result shown is representative of three independent biological replicates (*n* ≥ 10 in each experiment). Scale Bar = 0.2 μM for seedlings and 0.02 μM for inset RAM.

**Figure 6.** In vivo histochemical analysis of GUS activity for *NDL2* in wild type Col-0 background in response to different abiotic stresses and hormones: Histochemical GUS staining of 6-D-old transgenic *Arabidopsis* seedlings (*pNDL2-GUS)* in wild type Col-0 background. Increased intensity levels of GUS staining detected with ABA, cold, and JA. Result shown is representative of three independent biological replicates (*n* ≥ 10 in each experiment). Scale Bar = 0.2 μM.

*pNDL2-GUS* showed increased levels of GUS staining in the cotyledons and hypocotyl after treatment with ABA, cold and JA (Figure 6), primary root showed no expression similar to no treatment control. BAP (Supplementary Figure S2), IAA, GA, heat (Supplementary Figure S3), Mannitol and PEG (Supplementary Figure S4) and SA (Supplementary Figure S5) and other treatments had no effect on expression level as no difference was observed compared to no stress treatment control.

In case of *pNDL3-GUS,* no difference in the GUS staining was detected when seedlings were treated with ABA, BAP, cold (Supplementary Figure S2), IAA, GA, JA, heat (Supplementary Figure S3), salt and SA (Supplementary Figure S5). Mannitol and PEG treatment resulted in decreased level of *pNDL3-GUS* expression in RAM and hypocotyl in comparison to no treatment control which showed diminished staining at both the places. GUS staining was absent in the rest of the seedlings, same as in case of no treatment control of six-day old seedling (Figure 7 and Supplementary Figure S4).

**Figure 7.** In vivo histochemical analysis of GUS activity for *NDL3* in wild type Col-0 background in response to different abiotic stresses and hormones: (**A**). Histochemical GUS staining of 6-D-old transgenic *Arabidopsis* seedlings (*pNDL3-GUS)* in wild type Col-0 background. Changes in intensity of GUS staining detected with PEG and Mannitol in RAM. (**B**) and in hypocotyl part of the seedlings. Result shown is representative of three independent biological replicates (*n* ≥ 10 in each experiment). Scale Bar = 0.2μM for seedlings and 0.02 μM for RAM.

Results indicate that all the three family members' show independent response when encountering various stress/hormonal treatments. Both *NDL1* and *NDL3* after treatment with Mannitol and PEG showed difference in GUS intensity indicates that both the members are working when plants encounter with drought and osmotic stress. Although both the members have altered expression in same stress mechanism of functioning seems opposite as *NDL1* showed up regulation after PEG treatment while *NDL3* showed down regulation when treated with PEG (compare Figures 5B and 7), while *NDL2* expression remains unaffected by both treatments in tested plant stage (Supplementary Figure S4).

#### *2.7. In Vivo Expression Analysis of NDLs in Absence of AGB1 under Abiotic Stress and Hormonal Treatments*

Previous study has reported that expression levels of *NDL1* were not affected but protein steady state levels were affected by AGB1 [14]. In order to find out role of AGB1 on expression levels of *NDL2* and *NDL3* their expression patterns were studied in *agb1-2* mutant. We found that expression of *NDL2* and *NDL3* were not affected in *agb1-2* mutant and stays same as in wild type Col-0 background (Figure 3). In order to find-out whether AGB1 has any role on expression during abiotic stress or hormonal treatments expression pattern was observed for all the three members in absence of AGB1 after abiotic (cold, heat, Mannitol, salt and PEG), and hormonal (ABA, GA, IAA, JA, SA and BAP) treatments.

*pNDL1-NDL1-GUS* showed increased GUS intensity in primary root of the seedlings compared to no treatment control for GA, Mannitol, PEG and JA while no staining was detected in the primary root after heat treatment (Figure 8A,B). MUG assay was also performed and significantly increased GUS activity was observed only for PEG and Mannitol treatment in comparison to no-treatment control seedlings (Figure 8C). GUS staining intensity remains unaffected for ABA, BAP, cold (Supplementary Figure S2), IAA (Supplementary Figure S3), salt and SA (Supplementary Figure S5) treatments. Response of ABA and salt for *pNDL1-NDL1-GUS* remained similar in both the background (in wild type Col-0 as well as in *agb1-2* mutant) meaning presence of AGB1 don't make any difference on expression, while for cold, IAA, SA and BAP upregulation was detected in the cotyledons and in primary root in comparison to no treatment control seedlings in case of wild type

**Figure 8.** In vivo histochemical GUS staining of 6-day old transgenic *Arabidopsis* seedlings (*pNDL1-NDL1-GUS*) in *agb1-2* mutant background: Histochemical GUS staining was done in transgenic *Arabidopsis* seedlings that were six days old treatment was given for 24 h. (**A**) Changes in GUS intensity levels were detected with GA, heat and JA. (**B**). Increased GUS staining detected for Mannitol and PEG at the cell division and elongation part of the primary root. (**C**) Abiotic stress treatments followed by fluorometric MUG assay was done for quantitative estimation of stress response. Significantly higher GUS activity was obtained for Mannitol and PEG treatment compared to No-stress control indicative of NDL1 involvement during osmotic and drought responses (Students *t*-test = \* *p* value < 0.05, Error bars represent SD). Result shown is representative of three independent biological replicates (*n* ≥ 10 in each experiment). Scale Bar = 0.2 μM.

Col-0 background while remains unaffected in *agb1-2* mutant background meaning presence of AGB1 is vital for NDL1 stability during cold, IAA, SA and BAP treatment, indicating stress specific role of NDL1-AGB1 module.

The expression analysis for *pNDL2-GUS* in terms of GUS staining was also observed in absence of AGB1. Seedlings were treated with ABA, cold, IAA, GA, heat, JA, Mannitol, PEG, salt, SA and BAP. No difference in GUS staining was detected in *agb1-2* mutant background compared to no treatment control (Supplementary Figures S2–S5) suggesting no role/requirement of AGB1 for *NDL2* expression during treatment in *agb1-2* background. In case of cold, ABA and JA treatment upregulation of *pNDL2-GUS* was detected in the cotyledons in wild type Col-0 background (Figure 6) but in *agb1-2* background this effect was not present (Supplementary Figures S2–S5) meaning AGB1 presence is required during cold, ABA and JA treatments for *NDL2* expression upregulation.

In case of *pNDL3-GUS*, no difference in expression was detected in for ABA, BAP, Cold (Supplementary Figure S2), IAA, GA, JA, heat (Supplementary Figure S3), salt, SA (Supplementary Figure S5) in comparison to no treatment control. For all these treatments results were similar in both the background in wild type as well as in *agb1-2* mutant background. Very interestingly after treatment with PEG and Mannitol in *agb1-2* mutant background intensity of GUS staining showed increase in the RAM which is opposite effect compared to the Col-0 background where a decline in

hypocotyl and almost absence of expression was observed in RAM (Figure 7). This means differential and opposite role of AGB1 is operating for regulating *NDL3* expression under normal and dehydration stress condition (Figure 9). Presence of AGB1 is needed under stress condition to downregulate *NDL3* expression, which might need to be downregulated under dehydration stress.

**Figure 9.** In vivo histochemical GUS staining of 6-day old transgenic *Arabidopsis* seedlings (*pNDL3-GUS*) in *agb1-2* mutant background: (**A**). Increased levels of GUS staining intensity were detected in RAM compare to almost no staining in no treatment control with Mannitol and PEG treatment. (**B**) Histochemical GUS staining was analysed in whole seedling also. ne Six days old *Arabidopsis* transgenic seedlings were subjected to treatment for 24 h. Result shown is representative of three independent biological replicates (*n* ≥ 10 in each experiment). Scale Bar = 0.2 μM for seedlings and 0.02 μM for RAM.

#### **3. Discussion**

Comparative analysis for all three members of *NDL* gene family revealed their role in many common and unique functions. *In-silico* analysis revealed that all three members share many common transcription factor binding sites (Table 1). Presence of MYC binding factors postulate evolutionary conserved regulatory mechanisms of downregulation similar to animal systems for *AtNDL* family. Regulatory regions of all the three members contain motifs for stresses (biotic and abiotic), development, and for hormonal responses. MYB recognition sites like MYB1AT and MYB1LEPR are present in all the *NDL* members. MYB1AT is a MYB recognition site which is also present in the promoter of *RD22* which is ABA-induced and drought-responsive gene in *Arabidopsis* [18]. Another dehydration-responsive motif ABRELATERD1 is also present in the *NDL* family members, this motif has been reported in the up-stream region of the *ERD1* (early response to dehydration 1). Promoters with *ERD1* motif are involved in response to ABA and also show significant upregulation under water stress [20]. Both AtMYC and AtMYB are known to function as transcriptional activators in abscisic acid signaling presence of these motifs in combination with dehydration-responsive motifs in *NDL* family members indicates their role in ABA-dependent early response to dehydration.

In the present study, we have only analysed early stage of plant growth but Rymaszewski et al., 2017 has analysed plant growth starting from early stages till reproductive stage. They found positive co-relation of morpho-physiological traits like projected rosette area, and increased transpiration rate under low water-deficient conditions and expression levels for *NDL1*. They also speculate that these acclimations may be driven by largely stress hormone ABA. In another study Du et al., 2018 [21] has also found that during low water treatment conditions plants shifts for drought escape mechanism and leads to early flowering to complete their life cycle. Activation of the flowering genes are both ABA-dependent and ABA-independent [21]. All these finding indicates the complex network between stress, hormones and developmental responses.

Interestingly both AGB1 and GPA1 both are interacting partners of NDL members [14] are also interacting partners of one of the MYB (AT3G24120) in G protein interactome using in Y2H [3].

ANAERO1CONSENSUS motif is indicative of involvement of the gene family during anaerobic conditions. MYCCONSENSUSAT regulates transcription of *CBF*/*DREB1* genes during cold. Presence MYC recognition sites had been postulated to regulate the gene transcription by a basic bHLH transcription activator during cold stress [22]. MYCCONSENSUSAT sites are present mainly at the genes which have more expression in seed and the response to cold is ABA-mediated [23]. MYCCONSENSUSAT motif was also detected in the promoter of the gene which showed response for JA. WBOXATNPR1 is a disease related motif that is established as a site for binding SA-induced WRKY TFs [23]. Presence of a P1BS and SURECOREATSULTR11 motif indicative of the involvement of the family members for nutrient availability. P1BS motifs are associated with phosphate starvation response, while SURECOREATSULTR11 are found in the genes involved during sulfur deficiency responses (*SULTR1; 1*) [24,25].

All the three *NDL* members contain various hormone-responsive transcription factors binding sites. Responses of MYCCONSENSUSAT and ABRELATERD1 are ABA-dependent during desiccation and cold responses. MYCCONSENSUSAT motif also present in the genes which are involved in JA-mediated defense responses. Cytokinin response regulator ARR1AT and SA-responsive WBOXATNPR1 is also present commonly in all the three family members. Light-responsive motifs GT1CONSENSUS and SORLIP1AT are also common in all the members. GT1CONSENSUS a light-responsive motif, also showed tissue-specific localization, it is present in the up-stream region of the right regulated genes which are highly expressed in leaf [23]. Other than these common shared regulatory elements unique TFs binding sites to *NDL1*, *NDL2* and *NDL3* were also observed (Table 2). *NDL1* showed presence of exclusive binding sites for E2F, which targets cell cycle-regulated expression. E2F controls cell cycle by the regulation of the transcription of the genes that are involved during cell cycle and DNA replication [26,27] *Arabidopsis* E2F are classified as classical E2F proteins (E2Fa-c) and atypical E2F proteins (E2Fd-f) [28]. In G-protein interactome database NDL1 shows interaction with CKS2 (cell cycle regulatory subunit) and cyclin-dependent kinase G1 hinting specific function during cell cycle. Previously it is established that sucrose and D-glucose enhances the stability of NDL1 protein [14] which goes in accordance with presence of SBOXATRBCS, important for sugar and ABA responses. Role of NDL1 during primary and lateral root growth and development is already established presence of RHERPATEXPA7, a root hair specific cis-element needs further probing to study role of NDL1 in root hair development.

Likewise, *NDL2* strikingly shows presence of various kinds of ABA-responsive elements like ABREATCONSENSUS, which play role during ABA signaling and responsible for abiotic stress tolerance [29], ABREATRD22 and ACGTABREMOTIFA2OSEM. ACGTABREMOTIFA2OSEM motif present in the promoter of rice *OsEm* gene, which is regulated by seed specific transcription factor and ABA-responsive [30]. ACGTABREMOTIFA2OSEM motif also found in maturing seeds and acts as a binding site for bZIP TFs ABI5 [31]. Those genes which are highly expressed in mature seeds of *Arabidopsis* are found to be enriched with ABRE motif [32]. A single copy of ABRE is not sufficient for ABA-mediated responses. Multiple copy number of ABRE along with a coupling element forms the active complex for ABA-mediated responses [33]. Along with ABA-responsive binding sites several other sites like, GA and JA-responsive sites GADOWNAT and T/GBOXATPIN2 are also present in the promoter region of *NDL2*. GADOWNAT is common sequence found in genes which downregulates after GA treatments. GADOWNAT is shown to be identical as ABRE [34], also hints the regulation of gene under ABA responses. T/GBOXATPIN2 is a wounding response motif, found in the promoter of JA-responsive genes [35]. As per the *in-silico* analysis presence of different kinds of ABA response factors and seed specific transcription factors which are again ABA-dependent for their responses. Since our in vivo expression analysis proved absence of the *NDL2* expression in the roots. *NDL2* members might be specifically involved during seed germination and growth responses.

The *NDL3* promoter showed presence of -ABRERATCAL, an early stress-induced motif related to ABRE and found in the up-stream region of Ca2<sup>+</sup> ion-responsive genes, CGCGBOXAT is a Ca2+-dependent Calmodulin binding motif is also present both indicative of involvement of *NDL3* during calcium and ABA-mediated stress adaptations [36–38]. GCCCORE has been found in the promoter region of the several pathogen-responsive genes showed JA-dependent defense responses [39]. HDZIPIIIAT involved in vascular differentiation and patterning [40] and required for polar auxin transport in shoot [41], *NDLs* combined role is already established for polar auxin transport in roots [14] Increased HDZIPIIIAT activity leads to formation of extra cotyledons [42], a phenotype somewhat similar to tricot phenotype of *ndl* microRNA downregulated lines [16]. While downregulation of HDZIPIIIAT showed the loss of cotyledons formation [42,43]. All these findings indicate that *NDL3* might be a Ca2+-responsive gene during stress signaling. *NDL3* also involved in similar functions like *NDL1* as both share similar in vivo localization during early stages of plant growth

The combinatorial effect of unique and common transcription factors present in *NDL* gene family essentially suggests that *NDL1, NDL2,* and *NDL3* do share a basic common regulatory mechanism of downregulation by MYC/MYB. Our in vivo analysis also shows that during different stages of life cycle, and when plant encounter specific stress conditions differential regulation of all *NDL* members is possible and this could be attributed to the unique transcription factor binding sites present in each gene respectively.

Initial developmental study for all three members of the family showed that they are very specific about their expression pattern. In wild type background, *NDL1* was expressed in cotyledonary leaves, true leaves, primary and lateral roots. Functional characterization of *NDL1* has been done in detail and it has been found that *NDL1* is required for primary root and shoot meristem initiation growth and lateral root/shoot branching [14,16]. For *NDL2* the expression was detected in cotyledonary leaves, true leaves and in maturation zone of primary root, no expression was detected in RAM and in lateral roots (Figure 3) indicating function in different organs compared to *NDL1* and *NDL3*, which share overlapping expression zones. Presence of GA and cold-responsive elements, along with different kind of ABA-responsive factors in *NDL2* promoter is suggestive of its involvement during seed germination and early growth as during low temperature ABA biosynthesis and GA catabolism is up regulated that lead to seed dormancy [44].

*In silico* and in vivo expression analysis in case of *NDL3* was detected in newly emerging true leaves and at primary and lateral root tips (Figure 3). *NDL3* show response during drought stress and expression pattern is again quite similar with *NDL1* response (Figure 7; Figure 9) meaning, both the members share overlapping expression patterns in vivo and are playing role in similar physiological processes.

Previously it has been found that in *agb1* mutant *NDL1* expression levels are of wild type level meaning *NDL1* transcript is unaffected by the AGB1 (Figure 5D) [14] Similar to *NDL1* study we found that in vivo expression levels/patterns of *NDL2* and *NDL3* remains similar to Col-0 levels even in *agb1-2* mutant, meaning AGB1 does not affect the transcript levels of *NDLs.*

NDL1 protein stability in young primary root needs presence of AGB1, as NDL1 undergoes proteosomal degradation in absence of AGB1 [14], effect of AGB1 on stability and localization of NDL2 and NDL3 is still pending and needs to be characterized.

The effect of different hormones and abiotic stress responses also shows that along with the common response of *NDL1* and *NDL3* during osmotic/drought stress the different family members show specific responses for various treatments. NDL1 localization is upregulated during cold, IAA, JA, SA and cytokinin treatment and downregulated after heat stress (Figure 5A), meaning presence of AGB1 is essential for NDL1 stability during cold, IAA, JA, SA and BAP treatment, indicating stress specific role of NDL1-AGB1 module.

Although *NDL1* and *NDL3* share similar expression domains, *NDL3* expression remains unaffected by cold, IAA, JA, SA, cytokinin and heat treatments (Supplementary Figures S2–S5) meaning differential specificity in their functions.

Various ABA binding sites are present in the promoter region of *NDL2* and in accordance after ABA treatment (also for cold and JA) the *NDL2* expression shows upregulation in Col-0 background but did not showed any difference in *agb1-2* mutant background indicating presence of AGB1 is essential for *NDL2* expression upregulation during these treatments.

In case of *NDL3* expression upon treatment with Mannitol and dehydration stress it was found that AGB1 is negatively regulating *NDL3* expression under normal and dehydration stress condition (Figure 9). Presence of AGB1 is needed under stress condition to downregulate *NDL3* expression, which may be downregulated under dehydration stress.

Although various common TFs have been found in the promoter analysis but all the three members did not show the response for each of them. Every members of the family behaves differentially even though they share the common regulatory motifs. This happens because the family members also showed differential expression pattern and response to various stresses and hormones could be specific for particular developmental stage, our analysis is limited for initial growth stage of the plant. Previously it was found that AGB1 protein is needed for NDL1 stability and here we also found that the protein stability was affected in mutant background. In addition, *AGB1* was also confirmed to be involved in drought stress [45]. *NDL1* is also predicted to be a stress marker for drought [15]. Our *in-vivo* study proves that during the combined function out of three members two of them (*NDL1* and *NDL3*) are involved during drought stress response. *NDL1* is acting as a general abiotic stress responder during heat, cold, IAA, JA, SA and in cytokinin responses as protein steady state levels show upregulation after all these treatments, while *NDL3* expression remains unaffected by all these treatments, meaning limited and specific role. *NDL2* doesn't show any alteration in expression after Mannitol and PEG treatments but showed significant upregulation for ABA, cold, and JA treatment.

In summary dehydration stress (Mannitol and PEG treatments) caused increase in steady state protein levels of NDL1 in both wild type Col-0 as well as in *agb1-2* mutant background. Whereas *NDL3* showed downregulation of expression in wild type Col-0 background while upregulation in *agb1-2* mutant background in RAM. Expression levels of *pNDL2-GUS* were affected in AGB1-dependent manner during cold, ABA and JA treatment. These results strongly support the direct involvement of *NDL1* and *NDL3* during osmotic/drought stress responses and *NDL2* might be playing ABA-dependent indirect role. AGB1 is also playing role in differential regulation of expression of these *NDLs* members during different treatments by affecting protein stability (seen in case of NDL1) or transcript levels (as seen in case of *NDL3* expression) in stress specific manner.

#### **4. Material and Methods**

#### *4.1. Plant Material and Growth Conditions*

*Arabidopsis* Col-0 ecotype was used in the present study. *Agrobacterium* (strain GV3101)-mediated floral dip method [46] was used to generate transcriptional fusion transgenic (*pNDL2, 3-GUS*) in wild type Col-0 as well as in the *agb1-2* mutant background. *pNDL1-NDL1-GUS* translational fusion lines were taken from previous study [14]. Transformed seeds were selected on <sup>1</sup> <sup>2</sup> MS medium (Himedia, Mumbai, India) containing 25 mg/L Hygromycin (Duchefa, Amsterdam, Netherland), resistant plants were moved to the soil and grown to maturity in a growth room with a photoperiod 16h light/8h dark at 22 ◦C, and the light intensity of 100 μmolm<sup>−</sup>2s−1. Three independent single insertion T3 homozygous lines were obtained and used for developmental and stress treatments. For developmental study, seeds were grown vertically for, 4-day, 8-day and 12-day respectively followed by in vivo GUS assay. For stress treatment and fluorometric analysis six day old seedlings were used.

#### *4.2. Isolation and Cloning of NDL2 and NDL3 Promoters*

Genomic DNA was isolated with minor modifications from the *Arabidopsis* thaliana Col-0 using Doyle method [47]. Primers listed in Supplementary Table S1 were used to amplify the promoter region of *NDL2* and *NDL3*. Amplified fragments were then cloned into a pENTR/D-TOPO entry vector (Invitrogen, Massachusetts, United States) gateway, followed by pGWB3 destination vectors) with C-terminus GUS reporter. For NDL1 previously published lines were used [14].

#### *4.3. In-Silico Analysis*

Sequence information and location of *NDL1, NDL2* and *NDL3* was retrieved from https://www. arabidopsis.org/. In silico analysis was done using online programs PLANTPAN 2.0 [48]. For the expression pattern of *NDL* genes in different stages of development in different tissues, and time points data from the free version of the GENEVESTIGATOR online portal (https://www.genevestigator.com/ gv/plant.jsp) and eFP browser (http://bar.utoronto.ca/efp/cgi-bin/efpWeb.cgi) was used for analysis.

#### *4.4. GUS Staining Assay*

Three independent lines for *pNDL1-NDL1-GUS* and published *pNDL1-NDL1-GUS* lines were germinated and grown on <sup>1</sup> <sup>2</sup> MS media for 4-day, 8-day and 12-days. Gus staining on the seedlings was done using Jefferson method [49].

#### *4.5. Fluorometric GUS Assay*

For quantitatively GUS activity. 0.5 g of seedlings (6-day old) were harvested and frozen into liquid N2 in 1.5 mL micro centrifuge tube, followed by grinding and protein extraction in extraction buffer (50 mM sodium phosphate buffer, pH 7.0, 10 mM EDTA, 0.1% Triton *X*-100, and 10 mM β-mercaptoethanol). The homogenate was centrifuged at 10,000 g for 20 min at 4 ◦C. Supernatant was collected and assayed for protein concentration using Bradford method [50]. Five microgram of protein was added to GUS assay buffer (1*X* = 50 mM sodium phosphate buffer, pH 7.0, 10 mM EDTA, 0.1% Triton *X*-100, and 10 mM β-mercaptoethanol containing 2 mM MUG (Himedia, Mumbai, India) total volume was made up to 500 μL using ddH2O. Samples were incubated at 37 ◦C for one hour followed termination of the reaction using 400 μL of 0.2 M Na2CO3. For quantitative GUS analysis duplicate samples were assayed. The reaction product 4-methylumbelliferon (MU) was detected fluorometrically at excitation and emission of 365 nm and 455 nm respectively [51] using Tecan Spark multimode micro plate reader. GUS activity was expressed in nmol MU min−<sup>1</sup> μg−<sup>1</sup> protein.

#### *4.6. Hormone and Abiotic Stress Treatments*

Seeds were stratified and grown vertically on <sup>1</sup> <sup>2</sup> MS media. Six day old seedlings were subjected to 24 h treatment of sodium chloride (NaCl; 150 mM), Mannitol (300 mM), cold (4 ◦C), heat (37 ◦C), abscisic acid (ABA; 20 μM), Indole-3-acetic acid (IAA; 10 μM), salicylic acid (SA; 10 μM), methyl jasmonate (JA; 10 μM), polyethylene glycol (PEG 6000, 20%) and Gibberellic acid (GA; 20 μM) in liquid MS. NaCl, Mannitol and PEG were obtained from Himedia (Himedia, Mumbai, India) all the other hormones and fine chemicals were obtained from Sigma (Sigma, Missouri, United States).

#### *4.7. Accession Numbers*

Sequence data from the article can be found from https://www.arabidopsis.org/using accession number: AT5G56750 (*NDL1*), AT5G11790 (*NDL2*), AT2G19620 (*NDL3*) and AT4G34460 (*AGB1*). Chromosomal locations of the respective genes are as follows: NDL1: Chromosome 5 (22957629- 22960916), NDL2: Chromosome 5 (3799408-3803216), NDL3: Chromosome 2 (8485991-8488963), AGB1: Chromosome 4 (16477031-16479620), Regulator of G protein signaling protein-RGS1: Chromosome 3 (9532613-9535629), ANNEXIN-ANN1: Chromosome 1 (13225168-13227239), Sodium and Lithium-Tolerant 1-SLTI: Chromosome 2 (15761295-15763451), *Arabidopsis* Ribosomal Protein -RS27A: Chromosome 3 (22611521-22612905), O-Acetylserine (Thiol) Lyase -OAS-TL: Chromosome 4 (8517960-8520596).

#### **5. Conclusions**

Activation of specific AGB1-NDL module interaction might be stress specific and hormone regulated, specificity of action is further added in different stages of growth and development due to differential expression patterns of *NDL* members. NDL1 protein stability in young primary root needs presence of AGB1, as NDL1 undergoes proteosomal degradation in absence of AGB1 [14]. Though AGB1 physically interacts with all NDLs it might differentially regulate their stability and hence their function during different stresses, effect of AGB1 on protein stability of NDL2 and NDL3 is still pending and needs to be characterized. Interestingly, Yeast 2 Hybrid (Y2H) confirmed NDL1 interactors includes-Annexin 1 (ANNAT1-has role in drought stress), Sodium and Lithium-Tolerant 1 (SLT1 involved in salt stress), Lesion Stimulating Disease 1 (LSD1 regulates cold stress), O-Acetylserine (Thiol) Lyase (OAS-TL) Isoform A1 (OASA1 role in cadmium tolerance) and *Arabidopsis* Ribosomal Protein S27 (ARS27A involved in genotoxic stress) [3]. These interactions speculate that NDL proteins might plays role in stress signaling in the form of multimeric complexes with these stress effectors and other G protein core components (RGS1 and AGB1) in stress specific manner.

All of these findings together including regulatory elements in silico and in vivo expression profiling indicate that *NDL* family members along with *AGB1* play–key differential roles in different organs during different stages of plant growth and developmental. Collectively, our data suggest that *NDL-AGB1* modules are abiotic stress and hormone treatment-responsive and could be used as stress markers. In long run they could be potential candidates for crop improvement strategies (Figure 10).

**Figure 10.** *AtNDLs* responses to various abiotic and hormonal treatments in presence and absence of AGB1: (**A**) Diagrammatic representation of *AtNDLs* in wild type background in response to various abiotic and hormonal treatments. (**B**) *AtNDLs* responses to various abiotic and hormonal treatments in *agb1-2* mutant background. The green lines indicates that the gene was up regulated while the red lines indicates the downregulation of gene after the respective treatments.

**Supplementary Materials:** Can be found at http://www.mdpi.com/1422-0067/20/19/4736/s1.

**Author Contributions:** Conceptualization, Y.M.; methodology, Y.M. and A.K.; validation, A.K.; formal analysis, A.K.; investigation, A.K.; resources, Y.M.; data curation, A.K.; writing—original draft preparation, A.K.; writing—review and editing, Y.M.; visualization, A.K.; supervision, Y.M.; project administration, Y.M.; funding acquisition, Y.M.

**Funding:** Work in the Mudgil's Lab is supported by grants from the DST-SERB (EMR/2016/002780), DBT (BT/PR20657/BPA/118/206/2016) and Delhi University R&D and DU-DST PURSE grant. A.K is supported by JRF/SRF fellowship from CSIR.

**Acknowledgments:** We thank Swati Singh (Delhi University, Botany Department) for cloning of NDL2 and NDL3 promoter in Entry Vector and Diwakar Bhardwaj (Delhi University, Botany Department) for his help in generating T3 transgenic lines used in the study. We would also like to extend our thanks to Aniket (Delhi University, Ambedkar Centre for Biomedical Research) for his help with fluorometer.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article* **Overexpression of** *GmCAMTA12* **Enhanced Drought Tolerance in Arabidopsis and Soybean**

**Muhammad Noman, Aysha Jameel, Wei-Dong Qiang, Naveed Ahmad, Wei-Can Liu, Fa-Wei Wang \* and Hai-Yan Li \***

College of Life Sciences, Engineering Research Center of the Chinese Ministry of Education for Bioreactor and Pharmaceutical Development, Jilin Agricultural University, Changchun 130118, Jilin, China **\*** Correspondence: fw-1980@163.com (F.-W.W.); hyli99@163.com (H.-Y.L.)

Received: 15 August 2019; Accepted: 26 September 2019; Published: 29 September 2019

**Abstract:** Fifteen transcription factors in the CAMTA (calmodulin binding transcription activator) family of soybean were reported to differentially regulate in multiple stresses; however, their functional analyses had not yet been attempted. To characterize their role in stresses, we first comprehensively analyzed the *GmCAMTA* family in silico and thereafter determined their expression pattern under drought. The bioinformatics analysis revealed multiple stress-related *cis*-regulatory elements including *ABRE*, *SARE*, *G-box* and *W-box*, 10 unique miRNA (microRNA) targets in *GmCAMTA* transcripts and 48 proteins in GmCAMTAs' interaction network. We then cloned the 2769 bp CDS (coding sequence) of *GmCAMTA12* in an expression vector and overexpressed in soybean and Arabidopsis through *Agrobacterium*-mediated transformation. The T3 (Transgenic generation 3) stably transformed homozygous lines of Arabidopsis exhibited enhanced tolerance to drought in soil as well as on MS (Murashige and Skoog) media containing mannitol. In their drought assay, the average survival rate of transgenic Arabidopsis lines OE5 and OE12 (Overexpression Line 5 and Line 12) was 83.66% and 87.87%, respectively, which was ~30% higher than that of wild type. In addition, the germination and root length assays as well as physiological indexes such as proline and malondialdehyde contents, catalase activity and leakage of electrolytes affirmed the better performance of OE lines. Similarly, *GmCAMTA12* overexpression in soybean promoted drought-efficient hairy roots in OE chimeric plants as compare to that of VC (Vector control). In parallel, the improved growth performance of OE in Hoagland-PEG (polyethylene glycol) and on MS-mannitol was revealed by their phenotypic, physiological and molecular measures. Furthermore, with the overexpression of *GmCAMTA12*, the downstream genes including *AtAnnexin5*, *AtCaMHSP*, *At2G433110* and *AtWRKY14* were upregulated in Arabidopsis. Likewise, in soybean hairy roots, *GmELO*, *GmNAB* and *GmPLA1-IId* were significantly upregulated as a result of *GmCAMTA12* overexpression and majority of these upregulated genes in both plants possess CAMTA binding *CGCG*/*CGTG* motif in their promoters. Taken together, we report that *GmCAMTA12* plays substantial role in tolerance of soybean against drought stress and could prove to be a novel candidate for engineering soybean and other plants against drought stress. Some research gaps were also identified for future studies to extend our comprehension of *Ca-CaM-CAMTA*-mediated stress regulatory mechanisms.

**Keywords:** arabidopsis; CaM (Calmodulin); calmodulin-binding transcription activators (CAMTA); *cis*-elements; drought; qPCR; soybean hairy roots

#### **1. Introduction**

A successful sustainable agriculture should ensure food security, must be eco-friendly and safe to humans [1]. With the existing population growth rate, the current food production rate needs to be increased at least up to 70% by 2050 [2,3]. Despite advanced farming practices, the abiotic stresses due to drought, salinity, water and temperature fluctuations are causing 50–80% losses in crop yield, and therefore, should be as effectively managed as possible [4]. Soon, the warmer earth will cause a more humid atmosphere but less humid soil, leading to more frequent drought that would negatively affect the rate of photosynthesis, uptake of CO2, accumulation of biomass and yield [5,6]. Thus, developing stress-resistant crops with stable yields under adverse conditions is an important strategy to ensure future food security [2,7]. Deep insights into the mechanisms underlying signaling crosstalk mediating stress tolerance would aid plant researchers to upgrade the plant's indigenous natural machinery through biotechnology and genetic engineering.

In plants, the ubiquitous secondary messenger calcium is the key to maintain the harmonious and homeostatic conditions via signaling [8,9]. Calcium ions perceive and encode the environmental, developmental or hormonal signals into a definite frequency that is decoded and relayed by the protein molecules next to them including calmodulin (CaM). By interacting with calcium, calmodulin (calcium-modulated) proteins sense and convey the signals to calmodulin-binding proteins [10]. Here, CaM through signal frequency readjustment channelizes them before transducing to CAMTA TFs (transcription factors), thus CaM acts like a prism (as prism dissects white light into its components). A transcription factor family reported and initially referred to as Ethylene-induced calmodulin-binding protein [11] or signal responsive (SR) protein, while now known as Calmodulin-binding transcription activator (CAMTA), is present in almost all eukaryotes. CAMTAs were first detected in *Nicotiana tabaccum* while studying calmodulin-binding proteins [11–13]. After their emergence, all multicellular eukaryotes studied to date have been reported to be equipped with variable number of *CAMTA* genes such as *Arabidopsis thaliana* (6) [13], *Lycopersicum esculantum* (7) [14], *Medicago truncatula* (7) [15], Citrus (9) [16], *Populus trichocarpa* (7) [17], *Nicotiana tabacum* (13) [18], *Musa acuminata* (5) [19] and *Phaseolus vulgaris* (8) [20]. Glycine max possesses 15 *CAMTA* genes and all of them differentially express under various stress conditions [21].

CAMTA TFs are an integral element of Calcium-mediated biotic/abiotic stress and hormonal signaling pathway [22–25]. Stress signals are conveyed and modulated through *Ca-CaM-CAMTA* pathway and a rapid and calculated response is observed by carrying out the transcription of stimulus-specific genes [26]. Calmodulin Binding Transcription Activators through their CG-1 domain recognize and specifically bind the ((*A*/*C*)*CGCG*(*C*/*G*/*T*), (*A*/*C*)*CGTGT*)) sequence of the target genes thereby directly interacting and regulating their transcription [12]. The CG-1 motif is a pivotal member of a rapid stress response element (RSRE) found in the promoters of many genes that are rapidly activated in response to stress [27]. Previous experiments report the negative role of Arabidopsis *CAMTA3* in regulating plant immunity as demonstrated in the *AtMCAMTA3* (loss-of-function) mutant Arabidopsis [28–30]. Similarly, *AtCAMTA1* and *AtCAMTA3* have been shown to function in drought and regulate auxin [22,31]. Moreover, *AtCAMTA1* and *AtCAMTA3* regulated cold response by inducing CBF (C-Repeat/DRE-Binding Factor) pathway genes as the double *ATCAMTA1* and *ATCAMTA2* mutant exhibited impaired freezing tolerance [32,33]. The role of *AtCAMTA3* in regulating SA (salicylic acid) pathway genes working in freezing tolerance was only recently determined [34]. Two recent studies highlighted the role of *AtCAMTA3* [35] and *AtCAMTA6* [36] in salt tolerance. *TaCAMTA4* in wheat was demonstrated to negatively regulate defense response against *Puccinia triticina* [25].

Soybean is an important crop cultivated globally for food, feed [37], pharmaceutical and soil nitrogen improving purposes [38]. However, environmental adversities including drought affect soybean growth and yield. Earlier, the in silico analysis of *GmCAMTAs* was conducted, yet the potential targets of miRNA in *GmCAMTA* transcripts and the protein-protein interaction network were not reported [21]. In addition, the previous study also did not analyze the expression pattern of *GmCAMTAs* in soybean leaves in response to drought [21]. In an extension to that study and in order to decipher the role of soybean CAMTA family in drought, we first comprehensively analyzed (in silico) the *GmCAMTA* family including their physicochemical properties, chromosomal distribution, *cis*-motifs, miRNA targets and protein-protein interaction network. Secondly, we determined the spatiotemporal expression pattern of GmCAMTAs in roots and leaves of soybean under PEG stress and selected an efficient member of the GmCAMTA family to functionally characterize. We constructed the overexpression construct by cloning the 2769 bp CDS (coding sequence) of *GmCAMTA12* and transformed into Arabidopsis and soybean hairy roots. Through various drought assays, we demonstrated that the transgenic Arabidopsis and chimeric soybean (OE-Overexpressing *GmCAMTA12*) plants exhibited enhanced tolerance and performed better under drought stress than their non-transgenic counterpart at phenotypic, physiological and molecular level. qPCR (quantitative PCR) of the downstream genes in Arabidopsis and soybean also displayed altered expression as a result of *GmCAMTA12* overexpression. From our analyses, we report that *GmCAMTA12* as a transcription factor plays role in drought stress by regulating the downstream genes involved in drought tolerance and could be exploited in developing drought-tolerant crops.

#### **2. Results**

#### *2.1. Physico-Chemical Properties of GmCAMTA Proteins*

Various physico-chemical properties including the number of amino acids, protein molecular weight (MW), pI (isoelectric point), number of atoms, instability and aliphatic indexes and GRAVY (grand average of hydropathy) determined online with the ProtParam tool are given in Table S1 and File S1. GmCAMTA11 and GmCAMTA9 are the shortest polypeptides comprising of 910 and 911 aa (amino acid), while GmCAMTA1 is the longest possessing 1122 aa. Overall, their average length is ~1004 aa with a range of ~200 aa mutual difference. The molecular weight of GmCAMTAs ranges between 126,989.43 and 102,394.56 kDa with an average MW of 112848.3 and the number of atoms is proportional to the molecular weight of each protein. Similarly, GmCAMTA8 has the highest pI value of 7.64 showing that GmCAMTAs have relatively lower pI. Moreover, they are also hydrophilic in nature as the GRAVY ranges between –0.625 (GmCAMTA1) and –0.394 (GmCAMTA12). Almost all of the GmCAMTAs are thermally stable as their aliphatic indexes match with that of the other globular proteins (highest in GmCAMTA12). While none of them are stable in a test tube, as the instability index of all GmCAMTAs is higher than 40. Except GmCAMTA8, the number of Asp+Glu (negatively charged aa) is higher than Arg+Lys (positively charged aa).

#### *2.2. Phylogenetics and Structure of GmCAMTAs*

As mentioned earlier, CAMTAs are multiple-stress responsive transcription factors. Enrichment of *cis*-motifs involved in signal response in the promoters of Medicago *CAMTA* genes hints that they are likely to respond variedly to various signals like other *CAMTAs* [15]. We dissected the regulatory region of *GmCAMTAs* (~2 kb upstream) online with PLANTCare, which detected stimulus-specific *cis*-motifs in their promoters. Overall, there are light (*G-box, MRE* and *AE-box*), drought (*MBS*), salt *(MYB*), pathogen (*TC-rich repeats*), wound (*WUN, WRE*), low temperature (*LTR*), gibberellin (*GARE, P-box*), auxin (*AUXRE*) and abscisic acid (*ABRE*) responsive *cis*-elements as shown in figure 1A. The presence of multiple *cis*-motifs in *GmCAMTA* genes represents their responsivity to multiple stimuli. Moreover, the *cis*-element (*CG-box*) is the binding site of CAMTA TFs; thus, the presence of *G-box* within *GmCAMTA* promoters indicates the interaction of one GmCAMTA TF with another. *GmCAMTA12* possesses *MYC*, *MYB*, *MBS* and *G-box,* which shows its potential role in drought stress.

Using the online GSDS tool, the gene structure of all the *GmCAMTAs* was visualized in order to mutually compare their structural diversity. The length of GmCAMTA genes lie in the range between 3196 bp (*GmCAMTA14*) to 3947 bp (*GmCAMTA6*) with an average length of 3607 bp. The exons (yellow), introns (black lines) as well as 5' and 3' UTR (untranslated) regions (blue) of each gene are shown in figure 1B. A close observation of the number of exon-introns reveals a similar pattern, namely, three genes (*GmCAMTA7, GCAMTA10 and GmCAMTA15*) that are comprised of 12 exons; the rest have 13 exons and 12 introns of variable length, indicating their close mutual evolutionary relationship. This fixed numbering of intron and exon is a conserved characteristic of *CAMTA*, which is descended from ancestors and is also demonstrated in the *CAMTA* family of other species [23]. Similarly, except the

last one/two exons, an ascending order in the exon size from 5' to 3' UTR can be observed across all the genes.

The four domains (CG-1, ANK (ankyrin repeat), IQ and CaMBD (CaM binding domain)) is the common conserved characteristic of all CAMTA TFs [23]. Scanning the amino acid sequences of GmCAMTAs with online protein domains illustrator tool showed the same four domains in all 15 members (figure 1C). GmCAMTA TFs through IQ (calcium-independent)/CaMBD (calcium-dependent) interact with Calmodulin, while through the CG-1 domain, they bind the DNA in a sequence-specific manner (*CGCG*/*CGTG*) at their promoter region. "ANK repeats" mediate protein-protein interactions. All these conserved domains, along with other properties, make GmCAMTA proteins, the "transcription factors". The high sequence specificity is common in the Calmodulin binding domain of Arabidopsis and soybean CAMTAs as shown in figure 1D.

An ML (Maximum Likelihood) tree was constructed which traced the evolutionary relationship among the CAMTA families of soybean, Arabidopsis, maize and tomato (Figure 1E). Using MEGAX (molecular evolutionary genetic analysis), the evolutionary tree was constructed from the full length aligned CAMTA protein sequences of the four species. A total of 37 CAMTAs including six from Arabidopsis, 15 from soybean, seven from tomato and nine from maize clustered into four distinct groups, I, II, III and IV. GmCAMTA1–6, AtCAMTA1–3, SlCAMTA1 and 2 and ZmCAMTA3, 6, 7a and 7b might have co-evolved and thus clustered together in group I, representing the largest clade. Similarly, GmCAMTA10, 11, 14, 15, AtCAMTA4, SlCAMTA3, 4 and ZmCAMTA1 clustered in group III showing their mutually high homology. GmCAMTA8, 9, 12 and 13 grouped with AtCAMTA5, 6, SlCAMTA5, 6 and ZmCAMTA5 making clade IV. Two orthologs (GmCAMTA7 and SlCAMTA7) comprised group II, which is the smallest clade. Clustering in the phylogenetic reconstruction indicates more mutual similarity and probably weak homology to the members of the other three bigger clusters. It is noticeable that except II, in the rest of clades, CAMTAs of the same species are at the tips of the same branches and vice versa retaining their intraspecific homology.

#### *2.3. miRNA Targets in GmCAMTA Transcripts*

miRNA target prediction is important in finding their role in plant growth development in normal as well as stress conditions [18]. Keeping the Expectation score ≤5, the 639 miRNAs were scanned and miRNA, where the minimum E. value was selected using the online psRNATarget tool [39]. A total of 10 unique miRNAs were predicted which potentially target the *GmCAMTA* transcripts by inhibiting translation or through cleavage (Table S2 and File S1). Their length ranges between 19 bp (gma-miR1533) to 24 bp (gma-miR343b). The accessibility of target site (UPE), which is associated with identification of target site and energy required to cleave the transcript, varies from 11.8 (gma-miR1533) to 21.6 (gma-miR5780c). The translation of *GmCAMTA1* and *GmCAMTA2* transcripts is potentially inhibited by the common gma-miR5780c, while gma-miR6299 cleaves four *GmCAMTA8, GmCAMTA9*, *GmCAMTA12* and *GmCAMTA13*). Similarly, *GmCAMTA5* and *GmCAMTA 6* have potential targets for gma-miR1533 while *GmCAMTA11* and *GmCAMTA14* are predicted to be cleaved by gma-miR2111b. gma-miR9726 is predicted to cleave *GmCAMTA3* and gma-miR1522 *GmCAMTA15*. These in silico predictions require experimental validation, which would extend our understanding the mechanisms of *Ca-CaM-CAMTA*-mediated stress tolerance in plants.

**Figure 1.** Comprehensive in silico analysis of the *GmCAMTA* family. (**A**) *Cis*-elements in the *GmCAMTA* promoter region. Each type of *cis*-motif present in *GmCAMTA* promoters is shown in unique color/color pattern. (**B**) Exon-intron assembly of *GmCAMTA* genes. (**C**) Domain organization of GmCAMTA proteins. Four different domains are represented in different colored shapes. (**D**) Alignment showing the conserved motif sequence of the Calmodulin Binding Domain of Arabidopsis and Soybean CAMTA TFs. Each conserved residue at the definite position along the row (throughout the orthologues) is shaded in unique color. The functional residues in CaMB domain of these CAMTAs are indicated in the motif below the alignment. In the square brackets "[ ]" are the amino acids allowed in this position of the motif; "X" represents any amino acid and the round brackets "( )" indicate the number of amino acids. (**E**) ML (maximum likelihood) phylogenetic tree.

#### *2.4. Chromosomal Distribution and Regulatory Network of GmCAMTAs*

The genome browser tool in NCBI (National Center for Biotechnology Information) mapped *GmCAMTA* genes to their respective chromosomes. The 15 *GmCAMTA* genes are unequally distributed over eight out of 20 chromosomes of soybean as shown in Figure 2. The figure depicts the complete size of each chromosome with the exact position of genes. Chromosome 8 has the highest number of *GmCAMTA* genes, i.e., four, while chromosome 7, 9, 11 and 18 have got only one in each. In prokaryotes, due to the absence of nucleus, transcription and translation occur simultaneously in coupling phase. On the other hand, translation of mRNA is always executed outside the nucleus in the eukaryotic cytoplasm and the proteins that work only in nucleus have a nuclear localization sequence/signal (NLS). Transcription factors also work in the nucleus; thus, after their translation in cytoplasm, they are directed to the nucleus which is mediated through their NLS. To find NLS, protein sequences of each GmCAMTA were submitted to the online cNLS mapper tool at http: //nls-mapper.iab.keio.ac.jp/cgi-bin/NLS\_Mapper\_form.cgi. Keeping a cut off score of 5, at least one NLS in all GmCAMTAs and even more than one in some proteins were detected. Likewise, all of the six Arabidopsis CAMTAs possess only one NLS in the CG-1 domain of each protein [12]. In contrast, rice CAMTAs have one NLS in the C-terminal and another in N of CG-1 domain [40]. Experimental evidence shows that these domains perform diverse functions in the regulation of gene expression [41]. Table S3 (File S1) shows the nuclear localization sequence in each of the 15 GmCAMTAs and predicts that all these transcription factors are localized in the nucleus.

**Figure 2.** Chromosomal distribution of *GmCAMTA* genes. The 15 *GmCAMTA* genes are located on chromosome 5, 7, 8, 9, 15, 17 and 18.

In order to find the interaction network of GmCAMTA proteins to relate them with other pathways, the protein sequences were individually put in the STRING (Search tool for the retrieval of interacting gene/proteins) database, which predicted a number of interactors [42]. Thus, they can aid in linking proteins of interest to other pathways and could lead to the discovery of novel pathways as well. The STRING database displayed a network of ~10 interactors for each GmCAMTA protein among which some sets were redundant. Thus, a total of 48 unique proteins were predicted for 15 GmCAMTAs as shown in Figure 3 (Table S4 and File S1). This vast interaction network (experimentally determined and software predicted) indicates the complex regulatory upstream/downstream pathways of CAMTAs. However, these *in silico* predicted interactions require experimental validation. Besides these predicted interactions, their orthologues in other species such as Arabidopsis or other legumes could also be exploited to search other potential interactors in soybean.

**Figure 3.** Protein-protein interaction network of GmCAMTAs detected in silico using the STRING database (https://string-db.org/). The experimentally determined interactions are denoted by pink strings, interactions on the basis of textmining are indicated by yellow strings while interactions on the basis of gene neighborhood shown by green strings.

#### *2.5. GmCAMTAs as Early Drought Stress-Responsive TFs*

The spatiotemporal expression in roots and leaves under 0, 1, 3, 6, 9 and 12 h of simulated drought stress is shown graphically in Figure 4. In roots, *GmCAMTA2* was highly expressed during 3 h of drought followed by *GmCAMTA7* and *GmCAMTA10*. In contrast, *GmCAMTA14* was downregulated during all the five time points followed by *GmCAMTA8, GmCAMTA9* and *GmCAMTA11*. Overall, these transcription factors upregulated abruptly during 1 and 3 h of stress and downregulated afterwards (Figure 4A). The expression profile of *GmCAMTAs* in leaves at different stress durations is different as compared to roots (Figure 4B). In leaves, they look uniform, except *GmCAMTA4,* which is the highest upregulated gene followed by *GmCAMTA5*, *GmCAMTA11* and *GmCAMTA12*. Interestingly, like roots, majority of these 15 genes retained the 3 h trend in the leaves as well. It is obvious that the expression was relatively high at 3 h, after which it decreased until 12 h.

**Figure 4.** Spatiotemporal expression analysis of GmCAMTA in drought. (**A**) Relative fold expression of *GmCAMTAs* in Roots. Soybean plants were treated with 6% PEG6000 in Hoagland's solution for five different durations (1, 3, 6, 9 and 12 h) along with a control (0 h). (**B**) Relative fold expression of *GmCAMTAs* in Leaves.

The differential expression of *GmCAMTA* family is the result of their tightly regulated transcription. We speculate that in stress conditions, although the calcium ions continuously convey the stress signals through calcium signatures to the cytoplasm as well as the nucleus; however, the intensity/amount of these signals is weighed and adjusted by the next signal relaying molecules, such as CaM, before conveying to *CAMTA* transcription factors. In case, this signal transduction to *CAMTA* is continuing with the same intensity, yet after certain period (3 h in our case), *CAMTAs'* response is not the same throughout the course and seems to be unconcerned and even downregulated as the stress period continues. From the control samples (0 h) in leaves and roots, it is also obvious that the expression of all *CAMTAs* is active at all times. In brief, the spatiotemporal expression pattern revealed that

*GmCAMTAs* are upregulated in the early phase of drought thus are early drought stress-responsive transcription factors.

#### *2.6. Arabidopsis Overexpressing GmCAMTA12 Exhibited Enhanced Drought Tolerance*

To evaluate the contribution of GmCAMTA12 protein in drought stress, we engineered Arabidopsis plants to constitutively express *GmCAMTA12* gene under 35 s promoter. Prior to Arabidopsis transformation, the *Agrobacterium tumefacians* strain EHA105 transformants were verified (through PCR) to harbor the overexpression cassette. (Figure S1 and File S1). After floral dip, several overexpressing lines were obtained of which we selected two independent stable homozygous lines (OE5 and OE12) for functional analysis. The expression of *GmCAMTA12* in two transgenic Arabidopsis lines was validated through qPCR.

For drought assay, the two lines of OE *GmCAMTA12* (OE5 and OE12) and the wild type (WT) plants were subjected to drought stress by withholding water for two weeks and then re-watered as shown in Figure 5A. Initially, all the plants were growing normally until water was withheld. However, upon encountering drought, nearly all the wild type and transgenic Arabidopsis stopped growth, wilted and started turning yellow afterwards. After 14 days of continuous drought treatment, most of the wild type plants were completely dried as obvious from their phenotype (dried leaves). Unlike WT, most of the OE lines retained life processes, which was evident from chlorophyll they retained in their leaves. After re-watering, majority of transgenic Arabidopsis rejuvenated but most of the wild type did not. The plants were then allowed to grow under normal conditions until seed harvest. As expected, the seed yield of WT and transgenic lines was unequal and OE lines developed more seeds than wild type Arabidopsis. Under well-watered conditions, the survival rate in soil under drought was ~100%; however withholding water for two weeks and then re-watering, less than 60% WT plants survived while OE5 and OE12 showed about 83% and 87% survival rate as shown graphically in Figure 5D. Obviously, the constitutive overexpression of *GmCAMTA12* had enhanced the drought survival efficiency of transgenic Arabidopsis leading to better growth and development.

In their root length assay on <sup>1</sup> <sup>2</sup> MS-mannitol medium as shown in Figure 5B, WT plants grew longer roots than OE plants at 0 mM concentration of mannitol; however, on mannitol, OE plants, specifically OE12, developed longer roots than WT at all three concentrations (50, 100 and 200 mM). Interestingly, the roots of OE12 plants were the longest at 200 mM mannitol. Root length (cm) is shown graphically in Figure 5E. As flaunted through root length assay, the comparatively longer roots of OE than WT were the result of *GmCAMTA12* overexpression in T3 seeds.

Between the two overexpression lines, OE12 performed better than OE5 in both drought assays, thus, we subsequently inoculated seeds of WT and OE12 on <sup>1</sup> <sup>2</sup> MS with various concentrations of mannitol (50, 100, 150 and 200 mM) to evaluate the germination rate (Figure 5C). As expected, we observed higher germination rate of transgenic line OE12 than that of WT at all four concentrations of mannitol. The germination rate of OE12 was ~30% higher than WT at 50 mM mannitol. At 100 mM, the germination rate decreased in both types at a similar pace (~75% in OE and ~45% in WT). As the mannitol concentration increased to 150 and 200 mM, we saw a dramatic decline in the germination efficiency of WT (30% and 25%) as compared to OE (>65% and >60%) (Figure 5F). We can say that the constitutive expression on GmCAMTA12 has enhanced the germination efficiency of transgenic seeds under drought.

**Figure 5.** Phenotypic and physiological assays of wild type (WT) and OE under drought stress. (**A**) Drought assay of wild type (WT) and transgenic (OE5 and OE12) Arabidopsis grown on soilrite subjected to 14 days of drought stress. (**B**) Root length assay on MS-mannitol. (**C**) Germination assay on MS-mannitol. (**D**) Column chart showing the survival rate (%) of WT and OE in soil. (**E**) Root length (cm) of WT, OE5 and OE12 lines. (**F**) Germination rate (%) of WT and OE plants. (**G**) Proline contents, (**H**) malonaldehyde (MDA) contents, (**I**) CAT activity and (**J**) relative electrolyte leakage (REL %) of WT and OE plants in control and drought treatment.

In order to check the performance of WT and OE lines at physiological level, the physiological indexes, such as proline and MDA contents, CAT activity and relative electrolyte leakage were determined in all plants subjected to stress. Under well-watered conditions, we observed no significant difference in the level of proline contents, which was quite low in WT and OE lines. In contrast, proline contents calculated in drought-stressed plants had significantly elevated. In WT, the average value of proline was ~400 μg/g, while in OE, it was recorded in the range of 850 to 900 μg/g (Figure 5G). Malonaldehyde (MDA) is a well-known biomarker for sorting out stress-induced membrane damage due to oxidative stress. MDA contents in WT and OE plants during normal conditions matched mutually but its level was doubled in the drought-stressed wt plants compared with drought treated OE lines (Figure 5H). Catalase (CAT) is a major antioxidant enzyme, which is accumulated during abiotic stresses and scavenges H2O2. The CAT activity in WT and OE plants under usual conditions was nearly equal (Figure 5I); however, drought treatment enhanced CAT activity in transgenic plants as compare to WT. During stress, electrolytes, specifically K<sup>+</sup> ions leak out of the cells through various channels and thus damage due to stress could be monitored by comparing the electrolyte leakage in WT and transgenic lines. We determined relative electrolyte leakage (REL) in WT and OE lines during normal as well as drought conditions. REL % was nearly leveled under well-watered conditions, but was more pronounced in WT as compared to OE lines during drought (Figure 5J). Noticeably, the amino acid sequence analysis revealed high level of sequence similarity between GmCAMTA12 and AtCAMTA5. We can speculate that GmCAMTA12 being a transcription factor interacted with the downstream target genes (including AtCAMTA5 interactors) and modulated their expression in transgenic Arabidopsis, which contributed to their better performance under drought (determined at molecular level in later section).

#### *2.7. GmCAMTA12 Overexpression Regenerated More Developed and Drought-E*ffi*cient Hairy Roots in Soybean*

For further functional validation of *GmCAMTA12* in response to drought, the hairy roots system was exploited to overexpress the target gene in soybean. Prior to generating transgenic hairy roots, the *Agrobacterium rhizogenes* strain K599 cells harboring vector control (VC) and overexpressing *GmCAMTA12* construct (OE) were verified through gene-specific PCR (Figure S2 and File S1). After 1–2 weeks of infection of soybean seedlings with K599 (VC and OE), hairy roots had started regenerating with various frequency. Prior to subsequent experiments, we made sure that the hairy roots were transgenic by using gene-specific PCR from the genomic DNA of a small piece of hairy roots. Using qPCR, we validated the overexpression of the target gene in OE roots which was over three times higher than the VC hairy roots. After hairy roots had become long and strong enough by growing for two more weeks, the primary roots were removed and the chimeric plants (with transgenic roots and non-transgenic shoots) were shifted to fresh vermiculite.

When the transgenic roots were ~10 cm long, the VC and OE chimeric soybean plants were extirpated from vermiculite and transferred to Hoagland solution as shown in Figure 6A. After 3 days of acclimation, both VC and OE chimeric plants were treated with 6% PEG6000. The plants started to wilt with leaves curling and shoot apexes drooping on encountering drought. However, the wilting was more in VC than in OE chimeric plants as VC leaves were more wilted. *GmCAMTA12* overexpression induced profuse hairy roots (Figure 6B) due to which the aerial non transgenic part of the OE chimeric soybean plant was also more developed than the VC chimeric plants. The roots were analyzed using the root scanning system (Figure 6C). The OE hairy roots showed higher values with total root length (Figure 6J), surface area (Figure 5K), root volume (Figure 6L), number of branches (Figure 6M) and projected area (Figure 6N).

**Figure 6.** Comparative phenotype and physiology of chimeric sobean plants. (**A**) Chimeric soybean plants bearing VC (Vector control) roots and OE (overexpressing GmCAMTA12) roots in control and drought conditions. (**B**) Comparison of chimeric soybean plants having VC and OE roots. (**C**) VC and OE hairy roots observed with a root scanner. (**D**) Fresh and (**E**) dry weight of VC and OE hairy roots after culturing on MS-mannitol for 10 days. Comparison of VC and OE hairy roots in terms of (**F**) Proline contents, (**G**) MDA contents, (**H**) CAT activity and (**I**) relative electrolyte leakage between VC and OE hairy roots. (**J**) Analysis of the total length, (**K**) surface area, (**L**) root volume, (**M**) number of branches and (**N**) projected area.

The proline and MDA contents, CAT activity and relative electrolyte leakage were determined to check the impact of *GmCAMTA12* overexpression at physiological level. In control samples (Hoagland), proline contents had nearly equal amount in VC and OE hairy roots; however, under drought, proline

content level was recorded to be significantly higher in OE hairy roots (Figure 6F). MDA shows the level of membrane damage as it is the final product of lipid peroxidation. In the absence of stress, MDA contents in OE type were slightly less than VC hairy roots; however, with PEG treatment, VC had substantial increase in MDA contents as compare to OE roots (Figure 6G). In contrast, CAT activity was significantly higher in OE hairy roots than VC under drought stress (Figure 6H). For REL, VC hairy roots had higher electrolyte leakage (%) than OE in response to drought (Figure 6I). Comparative physiology of OE and VC hairy roots shows that the overexpression of GmCAMTA improved the drought tolerance of OE by altering physiology.

To analyze their growth efficiency under drought stress, 0.1 g of hairy root from VC and OE hairy roots was weighed under sterile conditions and inoculated on GM media containing four various mannitol concentrations (50, 100, 150 and 200 mM) including a control (0 mM mannitol) with three replicates of each type (Figure S3 and File S1). All the plates were kept in dark in growth room with 28 ◦C for 10 days of culturing after which the fresh and dry weights were determined. The control samples of both types (VC and OE) hairy roots had nearly same fresh and dry weights; however, on stress media, OE hairy roots showed better performance as the weight of OE roots was more than VC roots at all four concentrations of mannitol (Figure 6D,E). It further flaunted that the overexpression of GmCAMTA12 has improved the drought survival efficiency of OE roots.

#### *2.8. Expression Analysis of GmCAMTA12 Orthologues' Regulatory Network in Arabidopsis*

In Arabidopsis, *AtCAMTA5* is the orthologue of *GmCAMTA12;* thus, the regulatory network of *AtCAMTA5* was predicted with STRING database. To test our hypothesis, whether the overexpression of GmCAMTA12 TF in transgenic Arabidopsis would interact with genes in the regulatory network of *AtCAMTA5*, we analyzed the expression of 10 interactors predicted with STRING database (Figure 7A). Total RNA from drought treated WT and OE plants (Figure 7B) was isolated and reverse transcribed to cDNA. Using cDNA and gene-specific primers (Table S6 and File S1), we conducted a qPCR which deciphered the differential expression pattern of the 10 genes in wt and OE plants (Figure 7C). Among them, AtNIP30 (NEFA-interacting nuclear protein—*AT3G62140*), which in humans is involved in negative regulation of proteasomal protein catabolic process, is slightly upregulated in response to drought. AtWRKY14 (*AT1G30650*) encoding WRKY transcription factor 14 possesses a DNA binding domain and specifically interacts with *W box* (a common elicitor-responsive *cis*-element). WRKY14 is also nuclear localized transcription factor and regulates many important processes through gene regulation. *GmCAMTA12* overexpression upregulated *AtWRKY14,* which indicates that along with other transcription factors including WRKY TFs, the spectrum of CAMTA TFs regulatory networks is much wider than we think. Thus, the mutual interaction of CAMTA and WRKY should be further investigated. Peptdyl-prolyl *cis-trans* isomerase AtCYP59 (*AT1G53720*) mediates posttranslational modifications specifically protein folding. With *GmCAMTA12* overexpression, the upregulation of AtCYP59 is almost equal to that of WRKY14. AtANN5 (*AT1G68090*) is a calcium binding protein, plays role in pollen development and is induced by cold, heat, drought and salt stresses. As expected, its expression is relatively the highest among the 10 interactors. Serine racemase (AtSR—*AT4G11640*) is involved in serine biosynthetic pathway. Its expression also seems to elevate with *GmCAMTA12* overexpression in OE Arabidopsis. Elongator complex 3 (AtELO3—*AT5G50320*) is a part of Elongator multiprotein complex and regulates initiation and elongation of transcription. No apparent change in the expression level of AtELO3 in WT and OE was observed. Similar to ANN5, CaMHSP (Calmodulin Binding Heat Shock Protein—*AT3G49050*), also called Alpha/beta-Hydrolases superfamily protein, exhibited a higher transcript level. It is involved in the lipid metabolic pathway. B120 (G-type lectin S-receptor-like serine/threonine-protein kinase—*AT4G21390*) is involved in protein kinase activity and recognition of pollen. Its expression level was downregulated in transgenic Arabidopsis. *AT2G43110* (U3 containing 90S pre-ribosomal complex subunit) was also upregulated with the overexpression of *GmCAMTA12*. *AT3G19850* (BTB (for BR-C, ttk and bab) or POZ (Pox virus and Zinc finger)) domain-containing) mediates protein degradation by facilitating ubiquitination. Its expression level

was elevated with *GmCAMTA12* upregulation. All of these genes possess a *CGCG*/*CGTG* motif in their promoter sequences, which also validates the sequence-specific binding of CAMTA TFs. Most of these interactions are based on text-mining and should be determined experimentally.

**Figure 7.** Expression analysis of genes in the regulatory network of *GmCAMTA12* and *AtCAMTA5*. (**A**) STRING-predicted regulatory network of *AtCAMTA5*. (**B**) Drought treated WT and OE Arabidopsis. (**C**) Expression profile of the 10 interactors of *AtCAMTA5* in WT and OE Arabidopsis. (**D**) STRING-predicted regulatory network of GmCAMTA12 in soybean. (**E**) Chimeric soybean plants with VC and OE hairy roots. (**F**) Expression analysis of the eight interactors of *GmCAMTA12* in VC and (**G**) OE leaves. (**H**) Expression analysis of the eight interactors in VC and (**I**) OE roots.

#### *2.9. GmCAMTA12 Overexpression Orchestrated Downstream Genes in Transgenic Hairy Roots*

In order to find whether the constitutive overexpression of *GmCAMTA12* modulates the genes with which GmCAMTA12 TF interacts, we selected eight genes in the *GmCAMTA12* regulatory network in soybean predicted with STRING (Figure 7D). In chimeric soybean plants possessing VC and OE hairy roots (Figure 7E) treated with 6% PEG6000, we analyzed the expression pattern of the eight of the predicted interactors of *GmCAMTA12* to know their *GmCAMTA12*-mediated regulation. Using gene-specific primers (Table S7 and File S1), qPCR of the selected interacting proteins in transgenic hairy roots as well as non-transgenic leaves was carried out. The genes displayed a differential expression profile in the control and drought treated chimeric soybean plants. In VC roots (Figure 7H), GmNIP30 (NEFA interacting protein—*GLYMA19G40910*) was slightly upregulated in response to drought and its expression was indifferent to *GmCAMTA12* overexpression in OE roots (Figure 7I). GmPLA1-IId (Phospholipase A1-IId - *GLYMA12G15430*), involved in the lipid metabolic process, was upregulated in VC roots under drought stress, while in OE roots, its upregulation was two-fold higher than that in VC. *GmNAB* (Nucleic Acids Binding—*GLYMA18G40360*) was downregulated in VC roots while upregulated with *GmCAMTA12* overexpression in OE roots in response to drought. The expression of *GmELO* (Catalytic histone acetyltransferase subunit of the RNA polymerase II elongator complex—*GLYMA06G18150*) was the highest and equally expressed gene in both VC and OE roots. GmSR1 (Serine racemase-1 involved in D-Serine biosynthetic process—*GLYMA05G37930*) was upregulated in VC roots; however, with *GmCAMTA12* overexpression, GmSRI was downregulated in OE roots under drought. Similarly, GmSR2 (Serine racemase-2—*GLYMA08G01670*) was repressed in the absence of GmCAMTA12 overexpression; however, in OE roots under drought, it was slightly upregulated. Simultaneously, GmUC1 (uncharacterized, but possessing a *Myb*-DNA binding domain—*GLYMA20G32540*) transcript was slightly upregulated in VC but was significantly induced in OE hairy roots in response to drought. In contrast, GmUC2 (Uncharacterized2—*GLYMA17G07200*) was positively regulated in VC and negatively regulated in OE roots. Unlike the roots, the expression profile of these eight genes was nearly similar in chimeric soybean leaves possessing VC (Figure 7F) and OE (Figure 7G) hairy roots in response to drought. Five out of eight of these genes possess *CGCG*/*CGTTG* motif.

#### **3. Discussion**

Calcium as a ubiquitous secondary messenger orchestrates almost every cellular process in response to environmental stimulus. Plants employ the divalent cation, calcium (Ca2<sup>+</sup>) in relaying these endogenous (developmental) and exogenous (environmental) signals to appropriate cellular responses. Calcium alone specifically encodes a myriad of distinct signals by using spatial and temporal Ca2<sup>+</sup> spikes as well as the frequency and amplitude of Ca2<sup>+</sup> oscillations [10]. Next to the secondary messenger lie the signal relaying molecules including Calmodulin, which further tune the calcium signatures and pass them to *CAMTAs*. Calmodulin Binding Transcription Activators have a short history of two decades, which after first being reported have been genome-wide identified in numerous plant species [23]. *CAMTAs* are important in the sense that they are the transcription factors and an intermediate in the calcium-mediated stress signaling (*Ca-CaM-CAMTA*).

Earlier, the identification and expression analysis of Soybean CAMTAs was conducted [21]; however, their functional characterization remained unexplored. Thus, in order to better comprehend the role of *GmCAMTAs*in drought, and take a holistic snapshot, we first attempted to fill the research gaps through comprehensive in silico analysis of *GmCAMTA* family. Soybean possesses 15 *GmCAMTAs* [21], the second highest number after *Brassica napus*, which possesses 18 genes [24]. Interestingly, such a large stress responsive transcription factor family must have substantial contribution to the drought tolerance of soybean. *CAMTA* has important role in an array of biotic/abiotic stresses as reported in earlier studies in Arabidopsis [22,31–34,43], tomato [14,28], tobacco [18] and *Brassica napus* [24], Arabidopsis [36] and wheat [25]. Dissecting *GmCAMTAs* with various bioinformatics tools, their gene structure depicted 13 exon pattern, which is consistent with its orthologues in Arabidopsis, maize, tomato and others [23]. Promoter enrichment analysis revealed the *cis*-elements including *ABRE, DRE, G-box, W-box, WRKY, ARE* and *MYB*, all of which respond to various stresses [21]. *SlCAMTAs* are differentially expressed during the development and ripening of tomato and the presence of *ERE cis*-element in all *GmCAMTAs* provides the basis for *CAMTA* role in fruit development and ripening [28]. Light responsive elements (*G-box*) are common in all *GmCAMTAs* and we suggest that their role in light stress should be investigated [44]. Additionally, their specific stress-responsive *cis*-motifs should be tested individually or collectively in designing stress-inducible synthetic promoters.

Protein structure is important to know for understanding the action mechanism of protein. The major basic domains, CG-1, ANK, IQ and CaMB are common in all CAMTAs and a close look into the motif sequence of CaMB domain shows residues which are highly conserved across the species. Phylogeny of CAMTAs of four species traced the evolutionary relationship among the homologues as well as orthologues, which is consistent with the previous results [15,17–19,21,23,24]. Some homologues show more similarity and might have co-evolved (Figure 1). Later, the same set of proteins was found to interact with these homologues. A protein interaction network analysis found a few experimentally determined as well as predicted interactions together of 48 unique proteins of various pathways (Figure 3). Thus, we recommend to experimentally determine those interacting proteins to unveil GmCAMTA TFs' link with the pathways of the interacting proteins. In Arabidopsis, 32 proteins were predicted with STRING in the interaction network of *AtCAMTAs* [23]. An experiment-based detailed map of all the *GmCAMTA* interactors would deepen our understanding of intricate mechanisms of *GmCAMTA*-mediated development and immunity against biotic/abiotic factors. Similarly, miRNAs are important transcriptional regulators by directly cleaving their target transcripts; thus, their potential target sites in *GmCAMTAs* were important to determine to understand *CAMTA*-mediated regulation. In silico analysis of *PtCAMTAs* revealed potential targets for four distinct miRNAs [18]. A set of 10 unique miRNAs was detected bringing more complexity in *CAMTA*-mediated stress response mechanisms in soybean (Table S2 and File S1). We recommend to experimentally determine the miRNAs targeting *GmCAMTAs*, which might also lead to the ability to engineer soybean and other crops against drought more accurately.

The expression analysis of *GmCAMTAs* in soybean leaf and root reveals that although all of them express constitutively, but *GmCAMTA2, GmCAMTA4, GmCAMTA5, GmCAMTA11* and *GmCAMTA12* are highly upregulated against drought. Secondly, almost all *GmCAMTAs* expressed and peaked in 3 h after which they were repressed indicating their early responsivity to drought stress (Figure 4). Based on the previous report [21], as well as our qPCR results, GmCAMTA12 was the common drought-efficient transcription factor, and thus, was selected to functionally characterize. As expected, overexpressing *GmCAMTA12* in Arabidopsis and hairy roots and drought assays thereof, validated the previous and current qPCR-based results. Interestingly, *GmCAMTA12* enhanced the drought survivability and growth performance of transgenic Arabidopsis (Figure 5), as well as hairy roots (Figure 6), which was validated at phenotypic, physiological and molecular level. Similar to G*mCAMTA12*, another transcription factor from soybean *GmNAC85* is also drought stress-responsive and its constitutive overexpression significantly enhanced the drought tolerance in transgenic Arabidopsis [45]. Similarly, RNAseq of soybean overexpressing *GmWRKY54* revealed that, *GmWRKY54* conferred drought tolerance by enhancing ABA(Abscisic acid)/Ca2<sup>+</sup> signaling to close stomata as well as regulating numerous stress responsive transcription factors [6]. Similar results were recently reported about GmWRKY12 transcription factor [46]. In contrast, *AtCAMTA1*-mutant Arabidopsis exhibited enhanced drought sensitivity while also affecting the expression of other drought responsive genes [31]. A recent global transcriptome analysis using RNAseq revealed that a large number of genes involved in diverse stress responses are regulated either directly or indirectly by *AtCAMTA3* as about 3000 genes were misregulated in the *AtCAMTA3*-mutant Arabidopsis [34]. We recommend to experimentally verify the downstream targets of GmCAMTAs through protein-protein interaction experiments such as two-hybrid/pull down assays, which might link/lead to novel pathways. *GmCAMTA12* overexpression altered the expression of the genes in the regulatory network of *AtCAMTA5*, which is the orthologue of

*GmCAMTA12* in Arabidopsis. All of the 10 interacting genes possess *CGCG*/*CGTG* motif in their 2000 bp upstream region validating the sequence-specific interaction of *GmCAMTAs* [12]. In conclusion, the better performance of OE Arabidopsis and chimeric soybean plants on MS-mannitol and Hoagland-PEG was validated at physiologic and molecular level. The comparison of Proline and MDA contents, CAT activity and relative electrolyte leakage between VC and OE hairy roots shown graphically, flaunts the improved development of chimeric plants with hairy roots overexpressing *GmCAMTA12*. This was further verified at a molecular level by determining the transcript level of the genes in the regulatory network of *GmCAMTA12*. The *GmCAMTA* overexpression should also be analyzed in other tissues including soybean flower and seeds as well as in various developmental stages to find its integrated role in processes besides stress tolerance. However, GmCAMTA12 nuclear localization, promoter-GUS analysis, CRISPR/cas9-mediated gene-knockout and global transcriptomics of stably transformed transgenic soybean and Arabidopsis are under investigation. Taken together, a more systematic approach should be adopted to decipher the integrated role of GmCAMTA TFs with the Calcium-Calmodulin signaling crosstalk in drought.

#### **4. Materials and Methods**

#### *4.1. In Silico Analysis*

By keyword search (Gene ID/locus/accession no.) in three databases (NCBI https://www.ncbi.nlm. nih.gov/, Phytozome https://phytozome.jgi.doe.gov/pz/portal.html and Plantgrnnoble http://plantgrn. noble.org/), we retrieved the genomic, transcriptomic and proteomic sequences of the 15 members of *Glycine max CAMTA* gene family (File S1). Similarly, protein sequences of the corresponding orthologues in *Arabidopsis thaliana*, *Zea mays* and *Solanum lycopersicum* were also retrieved and a dataset was created for bioinformatics analyses (File S1).

The physico-chemical properties including molecular weights, theoretical isoelectric points, Aspartate+Glutamate, Arginine+Lysine, number of atoms, instability index, aliphatic index and GRAVY (Grand average of hydrophathicity) of all *GmCAMTA* proteins were calculated using the ProtParam tool in the ExPASy (https://web.expasy.org/protparam/).

To trace their evolutionary relationship, an ML tree was constructed with MEGAX using default parameters after multiply aligning the protein sequences of *Glycine max*, *Arabidopsis thaliana*, *Zea mays* and *Solanum lycopersicum*.

The exons and introns along the length of *GmCAMTA* genomic sequences were monitored using the online tool GSDS (Gene Structure Display Server (http://gsds.cbi.pku.edu.cn/index.php) [13]. To check the *cis*-motifs within the regulatory regions of *GmCAMTAs*, (2kb upstream) 5' UTR of each gene was resolved at online database PLACE (https://sogo.dna.affrc.go.jp/). The promoter sequences are in File S1. All the 15 *GmCAMTA* protein sequences were aligned with ClustalX and the conserved motifs were identified using the online SMART tool (Simple Modular Architecture Research Tool) (http://smart.embl-heidelberg.de/). The CaMBD in Arabidopsis and Soybean CAMTA family proteins was particularly analyzed for conserved sequences. The protein alignment is shown in File S1.

The potential miRNA targets along the length of *GmCAMTA* transcripts were predicted online at psRNATarget server (http://plantgrn.noble.org/psRNATarget/). To predict the proteins interacting with *GmCAMTAs*, each protein was separately submitted to STRING database (https://string-db.org/). The online prediction software displayed the proteins experimentally proved/hypothetical proteins interacting with *GmCAMTAs*.

The loci of *CAMTA* family over 20 chromosomes were determined using NCBI Genome Data Viewer at https://www.ncbi.nlm.nih.gov/genome/gdv/. Similarly, the subcellular localization of *GmCAMTA* transcription factors was determined by screening NLS in the protein sequence with online tool cNLS mapper at http://nls-mapper.iab.keio.ac.jp/cgi-bin/NLS\_Mapper\_form.cgi. NLS (nuclear localization signal/sequence) is an amino acid sequence, which, if present in a protein, indicates its nuclear localization.

Primers for qPCR analysis were designed with Primer premier 5 software as well as NCBI online primer tool https://www.ncbi.nlm.nih.gov/tools/primer-blast/. The multiple alignments of *GmCAMTAs* CDS showed frequent conserved sequences; thus, to minimize ambiguity, the primers were designed with special care against the unique sites in each *GmCAMTA* CDS. Moreover, it was ensured that the primers amplify all alternative transcripts of respective gene. We attempted to design the pair targeting two exons amplifying a stretch of 200–300 bp cDNA. Those primers which failed to give amplification, created more than one peak or did not appear on agarose gel under UV were redesigned until corrected. Moreover, the primers specificity was determined with online nBLAST tool in NCBI https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastn&PAGE\_TYPE=BlastSearch&LINK\_ LOC=blasthome. Suzhou GENEWIZ Biological Technology Services Company, China, synthesized all primers for this experiment listed in Table S5 and File S1.

#### *4.2. Expression Analysis*

The seeds of soybean variety "Willliams 82" were washed with 75% ethanol and then sterile water. They were hydroponically cultured in Hoagland nutrient solution in a growth room with ~28 ◦C room temperature, 16–8 h (light-dark) photoperiod and a relative humidity of 50%. When the plants have opened their unifoliate leaf pair completely, they were subjected to 6% (PEG 6000) stress simulated drought. The plants were treated with drought for 1, 3, 6, 9 and 12 h in triplicate along with control (Hoagland). At the exact time points, we collected leaf and root samples, froze in liquid nitrogen and stored at –80 C for further processing.

Total RNA from the root and leaf samples (ground in liquid nitrogen) of soybean plants (treated with PEG for various time periods) was extracted using RNAiso Plus (Takara, Japan) according to the manufacturer's protocol. RNA concentration was determined with NanoDrop2000Spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). The quality of RNA samples was checked over 1% agarose gel using 0.5X TBE. (Tris/Borate/EDTA) buffer. Bright and clear bands of 25S/18S RNA showed the intact RNA. The samples that showed degraded RNA or relatively unclear bands were re-extracted until intact RNA was detected. The concentration of all good quality RNA samples was adjusted to ~500 ng/μL and it was reverse transcribed into cDNA using the PrimeScript RT reagent kit with gDNA Eraser (Takara, Japan) according to the company's protocol. After the removal of genomic DNA with gDNA eraser, 1 μg total RNA was used as template to synthesize cDNA and stored at –20 ◦C.

After the reverse transcription of RNA into cDNA and qPCR primer synthesis, qPCR was run according to the experimental design. Prior to the qPCR of all the samples, a standard curve was created with a reaction using a serially diluted cDNA in duplicate. Actin 11 was used as reference with leaf cDNA while Elongation factorα with that of root. Before determining the expression pattern of soybean *CAMTA* gene family in leaves and roots, the personal error was minimized by the standardization of experiment. An initial reaction was run on Stratagene Mx3000 P thermocycler and primers validity was determined first by confirming the single peak (amplification plot), the *C*<sup>t</sup> value for each gene, and then by running the reaction product on agarose gel. Each product gave a single band parallel to right marker band. A 20 μL reaction mixture contained 10 μL SYBR Premix Ex Taq, 0.4 μL ROX Reference Dye II, 0.4 μL of each gene-specific Primer, 2 μL of cDNA and 6.8 μL ddH2O. High PCR <sup>e</sup>fficiency is indispensable for robust and more precise RT-qPCR. The formula E <sup>=</sup> 10(−1/Slope)−<sup>1</sup> was used to calculate amplification efficiency and the slope of the calibration curve. The primers having ~100% PCR amplification efficiencies were used. The RT-qPCR profile for our samples consisted of an initial denaturation of 95 ◦C for 2 min followed by 40 cycles of 94 ◦C for 5 s and 62 ◦C for 20 s. The fold change in relative expression level was evaluated using 2−ΔΔ*C*<sup>t</sup> formula.

#### *4.3. Gene Transformation and Drought Assays*

From the bioinformatics as well as qPCR analyses, we selected the gene *GmCAMTA12* (*Glyma.17 G031900*) to clone and transform for functional validation. The complete CDS (2769 bp) of *GmCAMTA12* was PCR-amplified using cDNA as template, forward primer ACTAGTATGGCGAATAACTTAGCGG and reverse primer CCCGGGCTATGTCTTCAGTTGCATGTCAA. For amplification, LA Taq kit (Takara) was used according to the manufacturer's protocol. The PCR profile was; an initial denaturation at 94 ◦C for 1 min followed 35 cycles of 94 ◦C for 30 s, 55 ◦C for 30 s and 72 ◦C for 150 s, and a final extension at 72 ◦C for 10 min. Using CT101 cloning kit (Transgene, Beijing, China), the gene was cloned into pEASY®-T1 cloning vector and transformed into Trans1-T1 chemically competent cells (Transgene) following heat shock method according to the manufacturer's instructions. The positive clones were obtained on a selective LB-Kan<sup>+</sup> agar plate and screened through vector and gene specific PCR as well as restriction. After double check, three separate clones were sequenced using M13 forward and reverse primers. When the sequences of all three clones were analyzed without any point mutation, the gene was restricted from pEASY®-T1 and sub-cloned by restricting from cloning vector using *SpeI* and *SmaI* (Takara) and ligated into the corresponding sites of expression vector pBASTA (with Kanamycin and Glyphosate resistance genes) using T4 Ligase (Promega, Madison, WI, USA) constituting the recombinant overexpression construct, pBASTA-*GmCAMTA12*. The pBASTA-*GmCAMTA12* construct was transformed into DH5α strain of *E.coli* (Transgene) and positive clones were obtained on a selective LB-Kan<sup>+</sup> agar plate. After a double check (PCR and restriction) of single colonies harboring the overexpression construct, the colonies were preserved in sterile 80% glycerol. The recombinant vector was transformed into chemically competent *Agrobacterium tumefacians* (EHA105) and *Agrobacterium rhizogenes* (K599) by a 5 min liquid nitrogen treatment followed by a heat shock of 5 min at 37 ◦C. The positive clones of EHA105 were identified on YEP Kan+Rif plates while K599 were selected using YEP Kan+Strep plates. After PCR verification, the cells were preserved at –80 ◦C for future use.

Arabidopsis Col seeds were kindly provided by Engineering Research Center of the Ministry of Education of Bioreactor and Pharmaceutical Development. To synchronize the germination of seeds, wt seeds were soaked in water and kept at 4 ◦C for 48 h and then sowed on a mixture of humus + vermiculite (3:7) in a growth room in dark at 23 ◦C. The germinated seedlings were then allowed to grow in 16 h photoperiod until 40 days with regular watering. We transformed the inflorescence (unopened flowers) of mature healthy Arabidopsis plants through Floral Dip method [47] and harvested T1 (Transgenic generation 1) seeds. The T1 seeds were germinated in excess under the same conditions as for wild type seeds. A primary screening of T1 plants (at six leaf stage) was carried out by spraying with 1% Basta (glyphosate) and the basta-survived plants were grown in separate pots in fresh soil (humus + vermiculite) under the same conditions as for wild type. Each plant represented a separate transformation event. The non-transformants died, while from the green plants, we extracted genomic DNA from leaves by 2 X CTAB and a PCR with gene specific primers using the gDNA as template was performed for secondary screening. T1 plants positive with PCR were grown to harvest T2 seeds. Similarly, T3 seeds from homozygous lines were harvested following the primary screening by 1% Basta and secondary with PCR to ensure stable transformation.

The seeds of Arabidopsis were surface sterilized by keeping in 70% ethanol for 1 min followed by 50% bleach for 10 min after which the seeds were washed six times by sterile water. The wild-type and two T3 transgenic Arabidopsis lines were germinated under sterile conditions on half MS plates added with various mannitol concentration. For germination, we vernalized wt and OE seeds by keeping in dark for 3 days at 4 ◦C. Each petri plate was marked and divided in to three parts, one of which was allotted to wt while the other two for Line-5 (OE5) and Line-12 (OE12) seeds. The seeded plates were kept in a growth chamber with 16 h photoperiod and 23 ◦C temperature. The germination of each line was observed and recorded for one week and the germination rate was determined by the number of seeds germinated divided by the total number of seeds. For root length analysis, wt and T3 lines were first germinated on MS media under sterile conditions and one week old seedlings of same length were transferred to square plates containing MS with various concentrations of mannitol. The plates were placed in a rack vertically so that the roots grow downwards to ground. The root growth was observed and the plants were photographed. The seeds of wild-type and two T3 transgenic Arabidopsis lines were germinated and grown for one month in soil at 23 ◦C, watered regularly and then subjected to

drought stress by withholding water for 14 days and photographed before, during and on drought recovery after re-watering.

Various physiological parameters like MDA [48] and proline contents [49] as well as CAT activity [50] and relative electrolyte leakage were determined. Cell membrane integrity is lost as K<sup>+</sup> ions (a chief electrolyte) leaks out of the cell leading to cell death in stress. This measurement is an indicator of the cell damage caused by stress [51]. To determine REL, leaf sample (~2 g) was thoroughly rinsed with deionized water and then subjected to vibration for 2 h at 25 ◦C in a test tube containing 10 mL deionized H2O. Using a conductivity meter, we determined the conductivity of solution (C1). A second measurement of conductivity (C2) was taken after boiling the solution for 25 min and then cooled to RT. REL % was calculated using the formula (C1/C2) × 100.

Soybean hairy root is an established platform for the functional analysis of a gene by overexpression. For hairy roots, seeds of soybean variety JU 72 were washed with 75% ethanol and rinsed in sterile water a few times. The seeds were sown in pots containing autoclaved vermiculite and kept in a growth room with 16 h photoperiod and 28 ◦C temperature and watered regularly. In parallel, K599 from glycerol was streaked on YEP Kan + Strep plates and incubated at 28 ◦C for 48 h. A single colony was picked and inoculated in YEP containing Kanamycin and Streptomycin. The culture was incubated in a shaking incubator at 200 rpm and 28 ◦C for 48 h. A 100 μL of inoculum rom the culture was spread on YEP Kan + Strep plates and incubated at 28 ◦C for 48 h. When the seedlings had just sprouted, the cotyledons unfolded and the first unifoliate leaves had not yet appeared, they were ready to be infected with K599, harboring the overexpression construct. The *Agrobacterium rhizogenes* culture from the plate was picked with a needle and injected into the hypocotyl right at the base of cotyledon [52]. K599 harboring empty pBASTA were used to regeneration of VC (Vector Control) hairy roots. After infection, the seedlings were covered with a transparent plastic to ensure high humidity. Within two weeks, hairy roots started to sprout from the site of infection at variable frequencies with at least one root per seedling. The soybean plants with hairy roots were allowed to grow for two weeks after which excess of autoclaved vermiculite was added to cover the hairy roots and watered regularly. We extracted the genomic DNA from both type of the transgenic roots to confirm VC and OE through gene specific primers. The roots were scanned with a scanner and analyzed with the software.

After the hairy roots were ~10 cm long, the plants were uprooted, primary roots were cut and the chimeric plants (transgenic root and non-transgenic shoot) were transferred to fresh autoclaved vermiculite and watered regularly. It is important to notice that the shoot of each plant was cut after the second trifoliate leaves. Thus, the newly grown leaves were compared between the chimeric plants VC and OE transgenic roots. To check the performance of transgenic hairy roots under drought, plants with 10 cm long transgenic hairy roots were transferred to Hoagland solution and after acclimation, they were treated with 6% PEG6000. Proline and MDA contents, CAT activity and REL were determined in VC and OE hairy roots. Moreover, the 0.1 g of the hairy roots (VC and OE) were weighed in sterile conditions and grown on MS media at various concentrations of mannitol including 0, 50, 100, 150 and 200 mM. After 10 days, fresh and dry weights of the hairy roots were determined. The dry weight was measured keeping hairy roots at 60 ◦C overnight.

#### *4.4. Expression Analysis of the GmCAMTA12 Orthologue's Regulatory Network in wt and OE Arabidopsis*

The regulatory network of the orthologue of *GmCAMTA12* (*AtCAMTA5*) in Arabidopsis was predicted online with STRING database. The genomic and proteomic sequences of these 10 interactors are given in File S1. The primers for qPCR were designed against the 10 genes with Primer-BLAST tool (Table S6 and File S1). In parallel, RNA from wt and OE lines was extracted, quantified and reverse transcribed into cDNA with Takara kit following kit's instructions. To determine the expression pattern of the 10 genes, qPCR was run using the primers of each gene with three biological replicates. *AtActin11* was used an internal reference. The data was analyzed with 2-dd*C*<sup>t</sup> method.

#### *4.5. Analysis of GmCAMTA12 Regulatory Network in Chimeric Soybean Plants*

Using the STRING database, the interactors of *GmCAMTA12* in soybean were predicted. The primers for qPCR of eight genes designed with Primer-BLAST are given in Table S7 and File S1. The genomic and proteomic sequences of these 10 interactors are given in File S1. To determine the effect of overexpression of *GmCAMTA12* on its interacting proteins, RNA from all samples was extracted using RNAiso plus (Takara) and reverse transcribed into cDNA with Takara rt kit and qPCR was run for all samples in triplicate. Actin 11 was used as internal control. Relative expression level was determined using 2-dd*C*<sup>t</sup> method.

**Supplementary Materials:** Supplementary materials can be found at http://www.mdpi.com/1422-0067/20/19/ 4849/s1.

**Author Contributions:** Conceptualization, H.-Y.L.; methodology, W.-C.L. and M.N.; software, M.N.; validation, A.J.; investigation, M.N.; resources, H.-Y.L.; data curation, N.A.; writing—original draft preparation, M.N.; writing—review and editing, M.N. and F.-W.W.; visualization, W.-D.Q.; supervision, F.-W.W.; project administration, H.-Y.L.; funding acquisition, H.-Y.L. All authors read and approved the final manuscript.

**Funding:** This research was funded by the National Key R&D Program of China, grant number (2016 YFD0101005), the National Natural Science Foundation of China, grant number (31601323), the National Natural Science Foundation of Jilin Province, grant number (20170101015 JC, 20180101028 JC, 20190201259 JC) and Special Program for Research of Transgenic Plants grant number (2016 YFD0101005).

**Acknowledgments:** We are grateful to Haiyan Li for making all the materials available and critically reviewing the manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**



#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
