**Comparative Genome-wide Analysis and Expression Profiling of Histone Acetyltransferase (HAT) Gene Family in Response to Hormonal Applications, Metal and Abiotic Stresses in Cotton**

**Muhammad Imran 1,2, Sarfraz Shafiq 1,3,\*, Muhammad Ansar Farooq 4, Muhammad Kashif Naeem 2, Emilie Widemann 5, Ali Bakhsh 6, Kevin B. Jensen <sup>7</sup> and Richard R.-C. Wang 7,\***


Received: 26 August 2019; Accepted: 24 October 2019; Published: 25 October 2019

**Abstract:** Post-translational modifications are involved in regulating diverse developmental processes. Histone acetyltransferases (HATs) play vital roles in the regulation of chromation structure and activate the gene transcription implicated in various cellular processes. However, HATs in cotton, as well as their regulation in response to developmental and environmental cues, remain unidentified. In this study, 9 HATs were identified from *Gossypium raimondi* and *Gossypium arboretum*, while 18 HATs were identified from *Gossypium hirsutum*. Based on their amino acid sequences, *Gossypium* HATs were divided into three groups: CPB, GNAT, and TAFII250. Almost all the HATs within each subgroup share similar gene structure and conserved motifs. *Gossypium* HATs are unevenly distributed on the chromosomes, and duplication analysis suggests that *Gossypium* HATs are under strong purifying selection. Gene expression analysis showed that *Gossypium* HATs were differentially expressed in various vegetative tissues and at different stages of fiber development. Furthermore, all the HATs were differentially regulated in response to various stresses (salt, drought, cold, heavy metal and DNA damage) and hormones (abscisic acid (ABA) and auxin (NAA)). Finally, co-localization of HAT genes with reported quantitative trait loci (QTL) of fiber development were reported. Altogether, these results highlight the functional diversification of HATs in cotton growth and fiber development, as well as in response to different environmental cues. This study enhances our understanding of function of histone acetylation in cotton growth, fiber development, and stress adaptation, which will eventually lead to the long-term improvement of stress tolerance and fiber quality in cotton.

**Keywords:** histone acetyltransferases; genome-wide analysis; fiber; abiotic stress expression profiles; cotton

#### **1. Introduction**

Nucleosomes, the basic unit of chromatin, are composed of 147 bp of DNA wrapped around a histone octamer (two copies of each of H2A, H2B, H3, and H4 histone proteins). Nucleosomes are dynamic in response to developmental and environmental signals, thus altering the DNA accessibility and DNA-template processes to regulate the various processes in plants, including flowering time, root growth, and response to environmental changes [1,2]. Cells use several mechanisms, including post-translational histone modification and DNA methylation, to regulate the gene expression. The N-terminal tails of histone are subjected to various post-translational modifications, including histone acetylation, methylation, etc. [3]. Histone acetylation is carried out by histone acetyltransferases (HATs) in eukaryotes and is associated with transcriptional activation. Histone acetylation can be detected on different lysine residues of histone H3 and H4. For example, in *Arabidopsis*, K9, K14, K18, K23 and K27 of histone H3, and K5, K8, K12, K16, and K20 of H4 are acetylated [4,5]. HATs are well-conserved in yeast, animals, and plants, suggesting the functional conservation of histone acetylation in transcriptional activation. Plant HATs were classified into different subclasses based on their sequence homology to yeast and animal HATs and their mode of action: (1) The Gcn5-related N-acetyltransferase (GNAT)/MYST (Moz, YBF2, Sas2p, Tip) family, (2) CREB-binding Protein (CBP) family, and (3) TBP-associated factorII 250 (TAFII250) family [6].

In plants, the genome-wide analysis of HATs has been performed in several species, including *Arabidopsis* [7], rice [8], *Vitis vinifera* [9], and *Citrus sinensis* [10]. HATs have been widely reported to play an important role in various aspects of plant development, including floral development [11–13], root growth [14], and gametophyte development [15]. In addition to developmental functions, HATs are also involved in plant adaptation to various environmental fluctuations, such as salt stress [16], cold stress [17], heat stress [18], light signaling [19], abscisic acid (ABA) [20], and other hormone signaling [21]. Therefore, the understanding of HATs functions in field crops may play an important role in sustainable agriculture and food security.

Cotton, as a major source of natural and renewable textile fiber, holds a significant value in the world economy and in daily human life [22]. *Gossypium hirsutum* is a natural allotetraploid (AADD) that arose from the interspecific hybridization between *Gossypium arboretum* (AA) and *Gossypium raimondii* (DD), which occurred approximately 1–2 million years ago [23]. Because allopolyploid cotton produces a superior quality of fiber with high yield compared with their diploid progenitors [24], *G. hirsutum* is widely grown in many parts of the world and contributes to more than 90% of commercial cotton production [25]. On the one hand, world climate is changing quickly, and the abiotic stresses are severely affecting the cotton yield and fiber quality [26,27]. On the other hand, the increasing world population demands the improvement of the cotton yield to meet the requirements of the textile industry. Therefore, identifying the potential genes conferring resistance to different stresses for the molecular breeding of cotton is of the utmost importance [28]. Although the role of HATs has been investigated in the plant development [29,30], response to environmental signals [16,18,31], and hormone signaling [16] in other plants, little is known regarding the function of HATs in cotton. In this scenario, *G. hirsutum*, *G, raimondii*, and *G. arboretum*, having recently available genomic data [32–35], provide an excellent opportunity to identify the candidate genes involved in fiber development and biotic and abiotic stress tolerance and to expand our understanding of underlying epigenetic mechanisms.

In this study, we identified HAT genes from the whole genome of *G. hirsutum*, *G raimondii*, and *G. arboretum*. The identified HATs were comprehensively analyzed for phylogenetic classifications, gene structures, identification of conserved motifs and domain organization, and the presence of cis-regulatory elements in their promoters. Furthermore, gene expression profiles of *G. hirsutum HATs* were analyzed during the different stages of fiber development and in response to various abiotic stresses. Such a comprehensive analysis of HATs provides a fundamental understanding of their roles in cotton growth and development. Furthermore, this study will be useful for functional genomic studies on the regulations of histone acetylation and will eventually lead to the long-term improvement of stress tolerance in cotton.

#### **2. Results**

#### *2.1. Identification of HATs in Cotton*

A systematic blast search was performed to identify the HATs in the genomes of *G. hirsutum*, *G. raimondii* and *G. arboretum* with the query sequence of *Arabidopsis*, and candidate HAT were identified in the cotton genomes. Then, Pfam and InterProScan databases were used to further verify the candidate HAT, and a total of 18 *G. hirsutum* (GhHATs), 9 *G. raimondii* (GrHATs), and 9 *G. arboretum* (GaHATs) were identified (Table S1). The properties of the identified *Gossypium* HATs were analyzed by ExPASy and we found an open reading frame (ORF) length ranging from 1407–6876 bp, which encoded the polypeptides ranging from 468–2291 amino acids with a predicted molecular weight of 53–256 KD. In addition, the theoretical isoelectronic point (PI) values ranged from 0.28–8.84 (Table S1).

#### *2.2. Phylogenetic Analysis of the HAT Gene Family*

To investigate the evolutionary relationship of HATs among cotton (*G.raimondii, G. arboretum*, and *G. hirsutum*) and other species, we constructed a phylogenetic tree using MEGA 6.0. We used the newly identified *Gossypium* HATs and previously identified HATs from *Arabidopsis thaliana, Vitis vinifera*, *Oryza sativa*, and *Brachypodium distachyon* to confirm their evolutionary relationship with the un-rooted phylogenetic tree using the neighbor-joining (NJ) method (Figure 1). The phylogenetic tree results showed that similar to *Vitis* and *Arabidopsis*, *Gossypium* HATs could also be grouped into three distinct classes: CPB, GNAT and TAFII250. MYST homologs were found absent in the *Gossypium* genomes. To validate the phylogenetic tree constructed using the NJ method, we also used the minimum evolution method to construct a tree and found that HAT genes could be naturally classified into three groups and the members within each clade were stable with little difference between topology, which indicates that the NJ tree method could be used for further analysis.

We found that the number of genes in CPB (30) was greater than that in either GNAT (27) or TAFII250 (9). Furthermore, we found that all the three groups comprised mono and dicot species. It is noteworthy that the genes within each group clustered with a dicot- or monocot-specific pattern. The number of HAT from each species was different within each group. Compared to other species, cotton HATs showed a closer relationship with *Vitis* HATs because they always clustered closely to each other in the phylogenetic tree. Nevertheless, their gene numbers were not similar within the group.

#### *2.3. Phylogenetic Tree Construction, Conserved Motif and Gene Structure Analysis of Cotton HATs Genes*

To confirm the subgroup classification and to determine the evolutionary relationship between *Gossypium* HATs, we generated another un-rooted phylogenetic tree using the neighbor-joining method (Figure 2A). We observed one subgroup related to CBP (HACs) and TAFII250 (HAFs) in *Gossypium*, whereas three subgroups related to GNAT (HAGs) were observed. The CBP and GNAT subgroups contained four members in *G. raimondii* and *G. arboretum*, while TAFII250 contained only one member. Furthermore, each diploid progenitor had its own ortholog in allotetraploid.

**Figure 1.** Phylogenetic relationships of histone acetyltransferases (HATs) from *Gossypium hirsutum*, *Gossypium raimondii*, *Gossypium arboretum*, *Arabidopsis thaliana*, *Oryza sativa*, *Brachypodium distachyon* and *Vitis vinifera*. The un-rooted phylogenetic tree was constructed using MEGA 6 by the neighbor-joining (NJ) method, and the bootstrap analysis was performed with 1000 replicates. For the phylogenetic tree, amino acid sequences were used and the classification of CPB, GNAT and TAFII250 was performed based on the conserved signature domain of each subgroup. The HATs from *G. hirsutum*, *G. raimondii*, *G. arboretum*, *Arabidopsis*, *V. vinifera*, *Brachypodium distachyon*, and *Oryza sativa* were marked with black, green, red, pink, cyan blue, yellow, and blue dots, respectively.

**Figure 2.** Phylogenetic relationships, conserved motifs, domain organization, and gene structure analysis of HATs in *Gossypium hirsutum*, *Gossypium raimondii* and *Gossypium arboretum*. (**A**) The unrooted phylogenetic tree was constructed using MEGA 6 by the NJ method, and the bootstrap analysis was performed with 1000 replicates. The HAT proteins from *G. hirsutum*, *G. raimondii*, and *G. arboretum* were marked with black, green and red dots, respectively. The branches of each subgroup were indicated in a specific color. (**B**) Motif identification analysis in HAT proteins from *G. hirsutum*, *G. raimondii* and *G. arboretum*. Each motif is shown in unique color, and the regular expression and sequences of the 1–10 motifs are listed in Figure S1. (**C**) Domain organization of HAT proteins from *G. hirsutum*, *G. raimondii* and *G. arboretum*. One representative of each subgroup of HAT from *G. hirsutum* is presented. (**D**) The intron/exon structure of *HAT* genes from *G. hirsutum*, *G. raimondii* and *G. arboretum*. The green boxes and grey lines represent exons and introns, respectively.

To gain insight into the evolutionary origin and putative functional diversification, a Multiple Expectation Maximization for Motif Elicitation (MEME) analysis was performed, which identified a total of 10 conserved motifs in *Gossypium* HATs (Figure 2B and Figure S1). All members of the CPB subfamily contained all the conserved motifs, whereas GNAT and TAFII250 only contained a few conserved motifs. However, the motif 8 was found conserved in all the *Gossypium* HATs, suggesting that the potential catalytic site of *Gossypium* HATs was conserved. We further dissected the motif analysis of *Gossypium* HATs by investigating their domain organization (Figure 2C). The CBP subfamily of *Gossypium* HATs contained plant homeodomain (PHD), Znf-TAZ, Znf ZZ and a CBP-type HAT domain, while the GNAT subfamily of *Gossypium* HATs contained a conserved GNAT and bromodomain. Moreover, the TAFII250 subfamily of *Gossypium* HATs contained a bromodomain, ubiquitin (UBQ), TATA box–binding protein (TBP), and domain of unknown function (DUF) domain. Furthermore, an alignment information was produced to explore the amino acid conservation of the GhHATs domain sequence (Figure S2). The multiple sequence alignment analysis revealed that all GhHATs proteins shared regions of conserved polypeptide sequences, which could be involved in their molecular functions (Figure S2A–C). In general, *Gossypium* HATs contained a domain organization similar to that of their counterparts in other species.

We then analyzed the gene structure of HATs from *G. hirsutum, G raimondii* and *G. arboretum* (Figure 2D). Our results showed that the number and length of intron/exons were different among CPB, GNAT and TAFII250 classes. For example, intron/exon numbers and gene length of TAFII250 were greater than the CPB and GNAT. Furthermore, the gene structure of *G. raimondii* and *G. arboretum* was extremely similar to their orthologs in the *G. hirsutum*. However, in general, gene structure in terms of intron/exon was greatly similar within the subgroup, which was consistent with the phylogenetic analysis.

#### *2.4. Chromosomal Distribution and Duplication Analysis of HATs*

The *G. hirsutum* HATs were mapped to their corresponding chromosomes (Figure 3A), and all the *HATs* were unevenly distributed on the chromosomes of *G. hirsutum*. For example, all the 18 *G. hirsutum* HATs were assigned to 12 chromosomes out of 26 (Figure 3A). The CBP subfamily of *G. hirsutum* HATs was localized on the chromosomes 5, 6, 8 and 10, while the GNAT subfamily of *G. hirsutum* HATs was localized on the chromosomes 6, 7, 10 and 11. We also mapped the HATs from the diploids *G. raimondii* and *G. arboretum* and found that all the HATs from *G. raimondii* and *G. arboretum* were also unevenly distributed on their chromosomes, similar to *G. hirsutum* (Figure 3B). However, compared with *G. raimondii*, the HATs from *G. arboretum* were more evenly distributed on their chromosomes. We also observed that subfamilies of *Gossypium* HATs were localized to different chromosomes in both diploid progenitors (*G. raimondii* and *G. arboretum*), as well as in the allotetraploid *G. hirsutum*. For example, the TAFII250 subfamily member HAF1501 was localized on the chromosome 10 in *G. hirsutum*, and was localized on the chromosome 8 and 11 in *G. arboretum* and *G. raimondii*, respectively.

**Figure 3.** Chromosomal distribution and gene duplication of HATs in *Gossypium hirsutum* (**A**) and in *Gossypium raimondii* and *Gossypium arboretum* (**B**). The chromosome number was indicated in boxes and represented as Gh1-Gh13A/D (**A**), Ga1-Ga13, and Gr1-Gr13 (**B**) for *G. hirsutum*, *G. raimondii*, and *G. arboretum*, respectively. The orthologous HATs are connected by green, while the segment duplication of HATs is represented by light blue and red colors in different genomes.

We also investigated the contribution of gene duplication to the expansion of *Gossypium* HATs (Figure 3A,B). Two segmental duplications were found in *G. hirsutum* (*GhHAC1501A*/*GhHAC1502A*, *GhHAG1501A*/*GhHAG1502A*), indicating that duplications occurred in the A genome of *G. hirsutum.* Similarly, two segmental duplications were also found in *G. raimondii* (*GrHAC1501*/*GrHAC1502*, *GrHAC1502*/*GrHAC1503*) and in *G. arboretum* (*GaHAC1501*/*GaHAC1502, GaHAG1501*/*GaHAG1502*). Furthermore, we also checked the Ka/Ks ratio to explore the selective constraints on each pair of duplicated *Gossypium HAT* (Table S2). The Ka/Ks ratio was found to be less than 0.30 in all the duplicated gene pairs of *Gossypium* HATs, suggesting that all the duplicated gene pairs of *Gossypium* HATs experienced strong purifying selection pressure.

#### *2.5. Putative Cis-Elements in the Promoter Regions of GhHATs*

To gain more insight into the putative functions of HATs, the putative cis-regulatory elements were scanned in the 1000 bp upstream of the transcription start sites of *G. hirsutum* using the Plant CARE database (Figure S3, Tables S3 and S4). In addition to TATA- and CAAT-box core cis-elements, phytohormone response elements, stress response elements and development response elements were found in the promoters of *G. hirsutum* HATs. Most of the cis-elements were conserved among the CBP, GNAT and TAFII250 subfamilies of *G. hirsutum* HATs. However, some cis-elements were absent in some subfamilies. For example, CAT-box (cis-acting regulatory element related to meristem expression), MRE (MYB binding site involved in light responsiveness), P-box (Gibberellin-responsive element), and O2-sites (cis-acting regulatory element involved in zein metabolism regulation) cis-elements were absent in the CBP subfamily, but were found present in GNAT and TAFII250 members. Similarly, circadian (cis-acting regulatory element involved in circadian control), skn-1 motifs (cis-regulatory element required for endosperm expression), Box-W1 (fungal elicitor-responsive element), and 5 UTR Py-rich stretch (a cis-regulatory element conferring high transcription levels) cis-elements were absent in the TAFII250 subfamily. Moreover, cis-elements in the promoters of A and D genomes were largely conserved in *G. hirsutum HATs*. However, TGA cis-elements (a cis-acting regulatory element involved in the MeJA responsiveness) were only present in the D genome. We also observed that cis-elements in the promoters of orthologous gene pairs of *G. hirsutum* were largely similar. However, some exceptions were also found where they differed much for the cis-elements, e.g., GhHAC1504-A/GhHAC1504-D and GhHAG1503-A/GhHAG1503-D.

#### *2.6. Gene Expression Analysis*

2.6.1. Expression Analysis of *GhHATs* in Different Tissues, Developmental Stages and Multiple Abiotic Stresses by RNA-sequencing

To better understand the potential physiological functions of GhHATs in allotetraploid cotton, we investigated the expression of *GhHAT* genes. RNA-sequencing data were downloaded from the National Center for Biotechnology Information (NCBI) and analyzed. Their analysis revealed that *GhHATs* were widely expressed in the vegetative (root, stem, and leaf) and reproductive (torus, petal, stamen, pistil, calyx, and −3, −1, 0, 1, 3, 5, 10, 20, 25 and 35 days post-anthesis (DPA) ovule) tissues (Figure 4A), highlighting the diverse biological functions of HATs in different tissues. We also noted that some *GhHATs* did not express in vegetative tissues, but had a very weak expression in the reproductive tissue, e.g., *GhHAG1502-D*. We found that A and D genomes showed preferential expression for only some genes in the leaf, root, and stem. For instance, the expression of *GhHAG1502-A* was higher than *GhHAG1502-D* in all the analyzed tissues. However, the opposite correlation was also observed, e.g., the expression of *GhHAG1504-D* and *GhHAG1503-D* was higher in the root than *GhHAG1504-A* and *GhHAG1503-A*. In addition, we also found that *GhHATs* expression was also differently regulated during the course of development (Figure 4B). For example, *GhHAC1501-D* and *GhHAC1502-A*/*D* expressions were increased with the development of the root, while the expression of *GhHAC1502-D* decreased with seed cotyledon development. We further investigated the gene expression pattern of *GhHATs*in different

abiotic stresses. The expression of *GhHAG1501-A*/*GhHAG1501-D* and *GhHAG1504-A*/*GhHAG1504-D* was strongly induced by multiple stresses, indicating their potential involvement in stress responses. However, no clear changes in expression levels were observed for more than half of the *GhHATs* under cold, hot, salt, and polyethylene glycol (PEG) 6000 conditions.

**Figure 4.** Gene expression pattern of *Gossypium hirsutum HATs* by the analysis of RNA-sequencing in different tissues and at different stages of fiber development (**A**) and in response to cold, hot, salt, and polyethylene glycol (PEG) 6000 stresses (**B**). The illumina reads of RNA-seq data were retrieved from the National Center for Biotechnology Information Sequence Read Archive (NCBI SRA) database. The color scale at the bottom of heat map indicates the fragments per kilobase million (FPKM)-normalized log2 transformed counts.

#### 2.6.2. Expression Pattern of *GhHATs* Genes by qPCR

#### Tissue Specific Expression Patterns of HATs

We further validated the *GhHATs* gene expression in some vegetative and reproductive tissues by quantitative real-time polymerase chain reaction (qRT-PCR) (Figure 5A). Because the A and D genomes of allotetraploid cotton were extremely similar in mRNA levels, we considered *GhHAT-A* and *GhHAT-D* as one combination (*GhHAT*) and checked the expression levels by qRT-PCR. Among all the nine analyzed *GhHATs* belonging to three different classes, *GhHAG1501, GhHAG1502, GhHAC1503* and *GhHAG1504* showed the most prominent expression levels in the analyzed tissues, indicating their roles in the development of the leaf, root, stem and flowers. Although *GhHAG1502* and *GhHAC1503* were expressed in the root, stem and leaf, the highest expression was found in leaves. However, only *GhHAG1501, GhHAG1504* and *GhHAG1502* showed the highest expression in flowers, implying the specific function of these *GhHAGs* in flower development. It also indicated the dramatic functional divergence among different classes of GhHATs. Furthermore, *GhHAG1501* and *GhHAG1502* formed a paralogous pair and showed the expression conservation in all the analyzed tissues.

**Figure 5.** Gene expression validation of *HATs* from *G. hirsutum* by quantitative real-time polymerase chain reaction (qRT-PCR) in different tissues. (**A**) RNA from the root, leaf, stem and flowers was extracted and reverse transcribed. *Ubiquitin 7* (*UBQ7*) was used as an internal control for qRT-PCR. The relative expression is presented in the heat map, and the color scale at the top represents the relative signal intensity. The primer sequences can be found in Table S5. (**B**) Gene expression validation of *HATs* from *G. hirsutum* by qRT-PCR at different stages of fiber development. The data were normalized to *UBQ7*. The data presented are the average of three biological replicates. Bar = standard deviation (SD).

#### Expression of HATs at Different Fiber Development Stages

To explore the potential role of GhHATs in fiber development, we investigated their expression at different developmental stages of fiber development (0–25 DPA) by qRT-PCR (Figure 5B). Among all the *GhHATs*, only *GhHAG1501*, *GhHAG1502*, *GhHAC1503* and *GhHAF1501* showed the prominent expression levels during the different stages of fiber development (Figure 5B), indicating that these four genes may play dominant roles in the fiber development. However, among these four, the expression of *GhHAG1501* and *GhHAG1502* was the highest among the *GhHAGs* subgroup members at 0 DPA, and then gradually decreased during the fiber development (0–25 DPA). Similar to *GhHAGs* family members, the expression of *GhHAF1501* was also decreased during the fiber development, except at 20 DPA, where only the expression of *GhHAF1501, GhHAG1501* and *GhHAG1502* slightly increased and then again decreased at 25 DPA. Contrary to *GhHAF* and *GhHAG* family members, the expression of *GhHAC1503* from *GhHACs* family was gradually increased during the fiber development with a maximum at 20 DPA. Afterward, the expression of *GhHAC1503* was decreased at 25 DPA. We further separated the contribution of A and D genomes for the fiber development of *G. hirsutum* from RNA-seq data analysis and found that the A genome preferentially expressed more than D genome partners in the different fiber developmental stages (Figure 4A).

#### Expression of HATs in Response to Heavy Metals and Abiotic Stresses

To better understand the role of GhHATs in abiotic stresses, we investigated the gene expression of *GhHATs* in response to metal stress (Cd and Zn), salt stress (NaCl), cold stress and drought stress by qRT-PCR. All the abiotic stresses differentially regulated the expression of *GhHATs* (Figure 6), indicating the functional specificity of GhHATs in response to a particular stress. In response to Cd stress, the expression of *GhHAC1501, GhHAC1502* and *GhHAG1501* decreased, while the expression of *GhHAC1503, GhHAC1504, GhHAG1502* and *GhHAG1503* increased compared with the control. However, the expression of *GhHAG1501* and *GhHAG1504* did not change compared with the control. In response to Zn stress, only the expression levels of *GhHAC1503*, *GhHAC1504*, *GhHAG1501* and *GhHAG1502* increased, while all other studied genes did not show differential expression compared with the control. However, the expression levels of *GhHAC1503*, *GhHAC1504* and *GhHAG1502* were different in response to Cd and Zn stresses despite their increased expression. Furthermore, the expression of *GhHATs*

was investigated in response to cold, salt, and drought stresses. The results showed that in response to cold stress, the expression of *GhHAC1502*, *GhHAC1503*, *GhHAC1504, GhHAG1501, GhHAG1502, GhHAG1503, GhHAG1504* and *GhHAF1501* decreased compared with the control. In response to salt stress, the expression of *GhHAC1501, GhHAC1502, GhHAC1503, GhHAG1501, GhHAG1504* and *GhHAF1501* increased, while the expression of *GhHAC1504* and *GhHAG1503* decreased compared with the control. In response to PEG treatment, the expression of *GhHAG1501* and *GhHAF1501* increased, while the expression of *GhHAC1503* and *GhHAC1504* decreased compared with the control.

**Figure 6.** Expression pattern of *Gossypium hirsutum HATs* in response to abiotic stresses and hormones. RNA from roots after 24 h of CK (control), cadmium (Cd), zinc (Zn), abscisic acid (ABA), auxin (NAA), DNA damage (MMS), cold, salt (NaCl), and drought (PEG4000) was extracted and reverse transcribed. *UBQ7* was used as an internal control for qRT-PCR. The data presented are the average of three biological replicates. Different letters indicate significant difference by least significant difference (LSD) test (*p* ≤ 0.05). Bar = SD.

Expression of HATs in Response to MMS

We also investigated the *Gossypium HATs* gene expression in response to the DNA damage agent methyl methanesulfonate (MMS) (Figure 6). The results showed that in response to MMS treatment, the expression of *GhHAC1502, GhHAC1503, GhHAG1501* and *GhHAF1501* increased, while the expression of *GhHAC1504* decreased compared with the control, indicating their potential role in DNA damage repair pathways in cotton.

#### Expression of HATs in Response to Phytohormones

Cis-regulatory elements related to phytohormones were found in the promoter of *GhHATs* (Figure S2, Tables S3 and S4). We therefore investigated the gene expression of *GhHATs* in response to auxin (NAA) and abscisic acid (ABA) by qRT-PCR. Auxin and ABA treatments differentially regulated the expression of *GhHATs* (Figure 6). The results showed that in response to ABA treatment, the expression of *GhHAC1503, GhHAC1504* and *GhHAG1504* decreased, while the expression of *GhHAG1501* and *GhHAG1503* increased compared with the control. In response to NAA treatment, the expression of *GhHAC1502, GhHAG1501* and *GhHAF1501* increased, while the expression of *GhHAC1504* decreased compared with the control.

#### *2.7. Co-Localization of HATs with QTLs of Fiber Development*

To validate the potential function of GhHATs in fiber development, the co-localization of *GhHATs* with reported QTLs/SNPs of fiber development (i.e., fiber length (FL), fiber elongation (FE), fiber micronaire (FM), fiber strength (FS), and fiber uniformity (FU) was analyzed. Nine genes were mapped to eight chromosomes with reported QTLs of FL, FE, FS, FM, and FU (Figure 7). Among nine *GhHATs*, only four genes were co-localized with QTLs of FL, FE, FM, FS, and FU on four chromosomes, i.e., Chr-A05, Chr-A06, Chr-A08 and Chr-A11 of the A subgenome. *GhHAC1502* was located within the qFE-A05-2, *GhHAG1503* gene anchored in qFU-A06-1 and qFE-A06-2, and *GhHAG1504* was mapped in the qFU-A05-2 QTL region, while *GhHAC1504* was 2 Mb from the qFL-A08. Among the D subgenome *HATs*, *GhHAC1502*, *GhHAC1503* and *GhHAG1503* were anchored in the FL-QTL-9, SNP (i20058Gh, i38606Gh), and qFL-D06, respectively, while *GhHAF1501* and *GhHAG1504* were found in surroundings of reported SNP. Interestingly, some genes were co-localized with multiple QTLs related to different fiber development traits. For example, the *GhHAC1502* gene on chromosome A05 and D05 was anchored in FL and FE QTL, and the *GhHAG1503* gene was localized inside the FL, FE and FU QTL. This reveals that these *GhHATs* had pleiotropic effects on fiber development related traits.

**Figure 7.** *Cont.*

**Figure 7.** Distribution of co-localized *HAT* genes on chromosomes of A and D subgenomes of *G. hirsutum*. The scale represents the physical position of genes and quantitative trait loci (QTL)-linked markers in megabases (Mb). QTLs/Single Nucleotide Polymorphism (SNPs) related to fiber length (FL), fiber elongation (FE), fiber micronaire (FM), fiber strength (FS), and fiber uniformity (FU) are shown. Asterisks indicate that the *HAT* genes co-localized with QTLs/SNPs related to fiber.

#### **3. Discussion**

Histone acetylation is a mark of transcriptional activation and has been reported to play an important role in plant development and response to various biotic and abiotic stresses [12,14–16,18]. Moreover, levels of histone acetylation are tightly linked with gene expression regulation. HATs carry out histone acetylation and have been classified into different distinct groups in *Arabidopsis* [7], rice [8], *Vitis vinifera* [9], and *Citrus sinensis* [10]. In this study, we revealed that *Gossypium* HATs could also be classified into three major subgroups: CBP, GNAT, and TAFII250. This suggests that multiple subgroups of *Gossypium* HATs might play specialized roles in the adaptive evolution of cotton. However, similar to rice [8], the MYST domain containing homologs of HATs in all the three genomes of cotton were absent, while *Arabidopsis* [7] and *Vitis* [9] had MYST domain members. This indicates the specific functional divergence of HATs between the different field crop plants. CBP, GNAT, and MYST HATs are also considered as transcriptional co-activators in addition to their HAT activity. For example, TAZ-type, ZZ-type and PHD-type zinc finger domains of CBP have been reported to play an important role in protein recognition and protein-protein interactions [36,37]. The PHD domain has also been reported to interact with histones and other histone-related proteins [38]. Furthermore, bromodomains are known to bind to acetylated lysine residues [39,40]. Along with functional catalytic domains, all the other conserved domains of *Gossypium* GNAT, CBP and TAFII250 were largely similar to their counterparts in monocots and dicots (Figure 2C). These observations suggest that all the *Gossypium* HATs may have similar functions as described in other plant species.

Gene duplication plays a significant role in generating new gene subfamilies in the evolution of genome and genetic systems [41]. Tandem duplication, polyploidy, and segmental duplications primarily contribute to the creation of new gene families [41]. Two segmental duplications were found in *G. hirsutum* (*GhHAC1501A*/*GhHAC1502A, GhHAG1501A*/*GhHAG1502A*), *G. raimondii* (*GrHAC1501*/*GrHAC1502, GrHAC1502*/*GrHAC1503*), and *G. arboretum* (*GaHAC1501*/*GaHAC1502, GaHAG1501*/*GaHAG1502*) (Figure 3). Gene duplications occurred in these genes because the identities of the genes flanking both sides of the paralogous *Gossypium HAT* genes were found to be absolutely conserved and located on duplicated segments on two different chromosomes. Moreover, these

duplicated genes were not likely diverged much during the evolution based on their Ka/Ks ratios (Table S2), suggesting the functional conservation of duplicated genes. This observation can partially be validated by the overlapping expression patterns of *GhHAG1501* and *GhHAG1502* during the fiber development and in different tissues (Figure 5A,B). However, the expression of *GhHAG1501* did not change in response to Cd, while the expression of *GhHAG1502* increased compared with control (Figure 6), suggesting that these duplicated genes may undergo functional divergence in response to particular stimuli.

*Arabidopsis* HATs play a crucial role in different aspects of plant development [12,14–16,18]. *G. hirsutum* has a huge contribution in the textile industry. Therefore, the identification of *HATs* genes in *G. hirsutum* and their role in fiber development will provide the fundamental information for future studies. Our results showed that the expression of *GhHAC1503* (CBP subgroup) was increased during the different stages of fiber development, while the expression of *GhHAG1501*/*GhHAG1502* (GNAT subgroup) and *GhHAF1501* (TAFII250 subgroup) was generally decreased (Figure 5B). This suggests that histone acetylation levels are likely to be dynamic during the course of fiber development. However, further studies are required to investigate the function of histone acetylation in cotton fiber development. Phytohormones, including gibberellic acid (GA) [42], jasmonic acid (JA) [43], abscisic acid (ABA) [44], ethylene [45], and auxin (NAA) [46], are also known to regulate the fiber development. Cis-regulatory elements specific for ethylene (ERE), GA (GARE), ABA (ABRE), and JA (CGTCA) were found in the promoter of four highly expressed *GhHATs* (*GhHAC1503, GhHAG1501, GhHAG1502* and *GhHAF1501*) during the fiber development (Tables S3 and S4, Figure S3). This suggests that the expression of these *GhHATs* might be regulated by phytohormones. Consistent with this, the expression of *GhHAG1501*/*GhHAG1502* slightly increased in response to ABA treatment, while the expression of *GhHAC1503* decreased (Figure 6). The expression of *GhHAF1501* did not change in response to ABA treatment, but the expression was increased in response to NAA treatment (Figure 6). This suggests that these phytohormones directly or indirectly regulate the expression of these *GhHATs*, which may lead to different acetylation levels. The exogenous application of auxin has been shown to cause higher acetylation levels at the promoter of *SKP2B* [47], while the acetylation levels decrease at the promoter of *KRP7* in *Arabidopsis* [48], indicating the crosstalk between phytohormones and histone acetylation in plants. However, further studies are required to investigate the crosstalk between histone acetylation and phytohormones in cotton. Together, our results suggest that histone acetylation might play an important role in fiber development, as well as cotton growth. However, further studies are required to investigate the function of each GhHATs in phytohormone related pathways.

Histone acetylation has been reported to play a key role in DNA damage repair [49–51]. Histone acetyltransferase 1 (HAT1) is required for the incorporation of H4K5/H4K12 acetylated H3.3 histones at the sites of double strand breaks (DSB), which then facilitate the recruitment of a key DNA repair factor Rad51 in mammals [52]. Furthermore, DSB-inducing agents have been reported to induce the H4K16ac levels in mammals [53], indicating a key role of histone acetylation in DNA damage and repair pathways. In *Arabidopsis*, the histone acetyltransferases HAM1, HAM2 and HAG3 have been reported to participate in UV-B induced DNA damage [54,55]. Furthermore, a treatment with the histone acetyltransferase inhibitor curcumin increased DNA damage in *Arabidopsis* and maize [54], indicating the conserved role of histone acetylation in DNA damage in plants. Our qRT-PCR results showed that the expression of *GhHAC1502, GhHAC1503, GhHAG1501* and *GhHAF1501* increased, while the expression of *GhHAC1504* decreased compared with the control in response to MMS treatment (Figure 6). In brief, our results suggest the potential implication of GhHATs in DNA damage repair pathways in cotton. However, further studies are required to investigate whether and how GhHATs are involved in DNA damage repair.

Different abiotic stresses, including, cold, drought, salt, and metal stresses, severely affect the cotton growth and yield. Recent studies established the link of histone acetylation and abiotic stresses [32,56–58]. For example, salt treatment caused a global increase of H3K9 and H3K4 acetylations [32], while cold treatment decreased the H3K9, H4K5 and H4K4 acetylations compared with the control in maize [58]. Similarly, in response to dehydration treatment, drought responsive genes such as *RD29A*, *RD29B*, *RD20* and *RAP2.4* were found to be differentially acetylated at H3K9, H3K14, H3K23 and H3K27 in *A. thaliana* [56]. We showed that many *GhHATs* were differentially regulated in response to different abiotic stresses, suggesting the dynamic levels of histone acetylation in each abiotic stress adaptation in cotton. Our expression data also suggest the functional diversity and specificity among different subgroups of *Gossypium* HATs (GhHACs, GhHAGs and GhHAFs) in response to stress, as well as hormone treatments (Figure 6). For example, the expression of *GhHAC1502* strongly increased in response to NaCl, while the expression of *GhHAC1503* was strongly induced by Zn. Similarly, in the HAGs subgroup, the expression of *GhHAG1501* was strongly induced by NaCl, while the expression of *GhHAG1502* did not change. These observations suggest that GhHATs likely play an important role in cotton plant adaptation to various abiotic stresses. Furthermore, we also noticed that more than one gene responded to a particular stress. For example, in response to salt stress, the expression of *GhHAC1501, GhHAC1502, GhHAC1503, GhHAG1501, GhHAG1504* and *GhHAF1501* increased, while the expression of *GhHAC1504* and *GhHAG1503* decreased compared to the control. This suggests that different GhHATs may work together in response to particular stimuli and may participate in long-term resistance to different abiotic stresses. However, further molecular and biochemical studies are required to validate GhHATs function and to understand the underlying molecular mechanism.

Four *HAT* genes (*GhHAC1502*, *GhHAC1503*, *GhHAG1503* and *GhHAG1504)* on six chromosomes were anchored in fiber development-related QTL regions. These four genes were identified inside multiple QTLs, suggesting the pleiotropic role of these genes in fiber related traits. These results also support the conclusion that both the A and D genomes of *Gossypium hirsutum* jointly participate in different aspects of cotton fiber trait. Interestingly, *GhHAC1502*, *GhHAC1503*, *GhHAG1503* and *GhHAG1504* also displayed differential expression at different fiber development stages and in response to abiotic stresses, metal stress, MMS and phytohormones. This suggests that these four *HAT* genes are potentially involved in cotton growth and development, fiber-related traits, and plant response to the environment. However, in general, there were few discrepancies between RNA-seq and qRT-PCR data. RNA-seq expression levels were either from the A or D genome of *G. hirsutum* and it would be difficult to properly separate and count the reads from the homologous regions. On the contrary, we regarded *GhHACxA* and *GhHACxD* as one combination, referred to as *GhHACx*, because of the extremely high similarity between the mRNAs of the *GhHACxA-GhHACxD* gene pairs (for example, *GhHAC1502A*/*GhHAC1502D* and *GhHAG1501A*/*GhHAG1501D* have 99 and 99.5% similarity, respectively) and their nearly identical transcript sizes. With such a high similarity, we could not distinguish them using the qRT-PCR, and the results of qRT-PCR are theoretically the average of A and D genome's expression. Thus, potential difficulties of read counts on homologous regions in RNA-seq and inability to separate the A and G genome expressions by qRT-PCR may cause the discrepancy between RNA-seq and qRT-PCR results. In brief, further studies are required to comprehensively elucidate the function of HATs in different aspects of cotton plant.

In this study, we highlighted complex and diverse transcriptional regulations of *GhHATs* which significantly broadened our understanding of the underlying epigenetic mechanisms in allotetraploid cotton and unraveled an extra layer of complexity for better allotetraploid cotton adaptation in response to developmental, environmental, and hormonal cues. Furthermore, our results strongly recommend the comprehensive dissection of the biological and cellular function of GhHATS and argue for the potential implication of histone acetyltransferases in cotton molecular breeding in addition to other existing breeding strategies. This will eventually lead to the long-term improvement of stress tolerance in cotton and will allow high fiber yield and quality.

#### **4. Materials and Methods**

#### *4.1. Identification of HATs Gene Family*

The data of three cotton species, *G. raimondii* (JGI, version), *G. arboretum* (BJI, version 1.0), and *G. hirsutum* (NAU, version 1.1) were attained from the COTTONGEN (http://www.cottongen.org) [33–35]. The HAT protein sequences from *Arabidopsis*, rice, *Brachypodium distachyon*, and *Vitis Vinfera* were downloaded from Phytozome (http://phytozome.jgi.doe.gov/pz/portal.html) and then used as queries in BLASTP searches [59] against the *G. raimondii, G. arboretum* and *G. hirsutum* genomes, respectively. Genes with *E*-values < 1.0 were selected, and redundant sequences were eliminated by following the previously published method [60]. Furthermore, InterProScan (http://www.ebi.ac.uk/interpro/search/ sequence-search) was used to confirm the presence of the HAT domain [61]. The physicochemical properties were predicted by ExPASy (http://cn.expasy.org/tools).

#### *4.2. Analysis of Chromosomal Location and Gene Duplication*

The loci of *HATs* were deduced from the gff3-files of cotton genome. The localization of *HATs* genes on the chromosomes were visualized using the program Circos [62]. The duplication events of *HATs* and Ka/Ks were calculated using the previously published method [63]. The T = Ks/2λ equation was used to determine the duplication time and deviation of the *HAT* gene pairs, assuming clock-like rates of (λ) 1.5 <sup>×</sup> <sup>10</sup>−<sup>8</sup> substitutions per synonymous site per year for cotton [64].

#### *4.3. Sequence Alignment and Phylogenetic Analyses*

Multiple sequence alignment was performed for the full-length HAT proteins using Clustal W with standard settings [65]. A neighbor-joining (NJ) phylogenetic tree was constructed using the full length HATs sequences from *G. raimondii, G. arboretum* and *G. hirsutum* by MEGA 6.0 [66], with P-distance and pairwise gap deletion parameters engaged. The bootstrap test was used with 1000 replicates to evaluate the statistical consistency of each node. To confirm the grades from the NJ method, the minimal-evolution method of MEGA 6.0 was utilized with 1000 replicates as well.

#### *4.4. Gene Structure, Protein Motif, and Promoter Cis-Element Analysis*

The exon/intron structures of the *HAT* genes were acquired from bed-file and displayed using the online tool Gene Structure Display Server (http://gsds.cbi.pku.edu.cn) [67]. The NJ tree was constructed with MEGA 6.0 as explained above. The deduced HAT protein sequences of three cotton species were submitted to the online Multiple Expectation Maximization for Motif Elicitation (MEME) version 4.11.1 (http://meme-suite.org/tools/meme) [68] as described [60]. For cis-element analysis in promoter regions, the 1 kb upstream sequences were analyzed in the PlantCARE database (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) [69].

#### *4.5. Transcriptome Data Analysis and Gene Expression Heatmap*

The raw data of RNA-seq of *G. hirsutum* were downloaded from the NCBI Sequence Read Archive (SRA: PRJNA248163) to calculate the expression level. TopHat (version: 2.0.13) was used for mapping reads, cufflinks (version: 2.2.1) were used to analyze gene expression levels, and fragments per kilobase million values were used to normalize gene expression levels [70]. Finally, these values of the *GhHAT* candidates were extracted from total expression data and the heatmap was generated by MeV 4.0 (http://www.tm4.org/).

#### *4.6. Plant Materials, Stress Treatments, and qRT-PCR*

*G. hirsutum* cultivar 'CRI35' was used for gene expression analysis. All the sampled tissues obtained from cotton plants grown under field condition with standardized cultural practices to determine the expression analysis [60]. For treatments, cotton seeds were surface sterilized and germinated on a moist paper. Young seedlings of same size were selected and exposed to NaCl (200 mM), polyethylene glycol 4000 (PEG4000) (15%), cold (4 ◦C), methyl methanesulfonate (MMS) (250 ppm), auxin (NAA) (10 μM), abscisic acid (ABA) (10 μM), ZnSO4 (1 mM), and CdCl2 (1 mM) for 24 h. All treatments were performed in three biological replicates. All samples were frozen quickly in liquid nitrogen and kept at −80 ◦C. The total RNA was extracted from cotton samples using the RNAprep Pure Plant kit (TIANGEN, Beijing, China). A total of 2 μg of RNA was used as the template, and the first-strand cDNAs were synthesized using the SuperScript III (Invitrogen, Waltham, MA, USA). Quantitative real-time PCR (qRT-PCR) analysis was performed as described previously in [71] using the specific primers for each *GhHAT* gene (Table S5). Cotton *UBQ7* (UniProt accession number: AY189972) was used as an internal reference gene for normalization of expression and three biological replicates were performed for each sample. To calculate the relative expression levels, a comparative 2−ΔΔ*C*<sup>t</sup> method was used [72]. The heat map for the gene expression profiles was generated with Mev 4.0 (http://www.tm4.org/) [73].

#### *4.7. Co-localization of HATs with Fiber Related QTLs*

To identify the localization of QTLs and SNPs for fiber development related traits, QTLs and linked molecular markers were retrieved from the Cotton Gen website (https://www.cottongen.org). The sequence of each marker was fetched from Cotton Gen to obtain the physical position information. For this purpose, the sequence of each marker was BLAST against the *G. hirsutum* (AD1) and HAU genome in the CottonGFD database. HAT genes co-localized with QTLs were displayed to show *HAT* gene distribution on chromosomes, along with surrounding loci and QTLs, using mapchart software. Genes identified inside the QTL or ≤500 kb far from SNP were considered as anchored gene in QTL because cotton LD decay was approximately 0.80 Mb [74].

#### *4.8. Statistical Analysis*

After performing normal distribution of the data and the homogeneity of variance tests, analysis of variance (ANOVA) was performed, followed by the least significant difference (LSD) test at *p* value ≤ 0.05 for each parameter. Different letters indicate significant difference by LSD test (*p* ≤ 0.05). Statistical analyses were performed using the Statistical Package for Social Sciences (SPSS) software (version 11.5, SPSS Inc., Chicago, IL, USA).

#### **5. Conclusions**

In this study, 36 HATs were identified in three genomes of *Gossypium* and clustered into three groups: CPB, GNAT, and TAFII250. *Gossypium HATs* are unevenly distributed on the chromosomes, and segmental duplications contributed to the evolution of the *HATs* family. Cis-element analysis discovered several abiotic and biotic stresses and hormonal responsive elements in the promoter region of the *GhHATs*, but each member had peculiar types and numbers. Furthermore, the expression profile of the upland cotton *HAT* gene family exhibited different expression patterns in response to abiotic and hormonal stresses, which disclosed that GhHATs play roles in different aspects of upland cotton abiotic stress tolerance and hormonal signaling. Thus, our study helps to lay the foundation for the functional characterization of the *GhHATs* gene family by overexpression and knockdown/out using RNAi or CRISPR-Cas9 genome editing and provides new insight into the evolution and divergence of *HAT* genes in plants. Furthermore, these results may enhance the understanding of the molecular mechanisms of many agronomic traits of cotton, such as fiber development and other physiological processes.

**Supplementary Materials:** Supplementary Materials can be found at http://www.mdpi.com/1422-0067/20/21/ 5311/s1.

**Author Contributions:** Conceptualization, S.S.; Formal analysis, M.I., S.S., M.A.F., M.K.N., E.W. and A.B.; Funding acquisition, S.S. and R.R.-C.W.; Methodology, M.I.; Writing—original draft, M.I. and S.S.; Writing—review & editing, S.S., K.B.J. and R.R.-C.W.

**Funding:** This research was supported by the grant PD-IPFP/HRD/HEC/2013/1129 from Higher Education Commission of Pakistan.

**Acknowledgments:** We would like to thank Yasar Sajjad (Department of Biotechnology, COMSATS University Islamabad, Abbottabad campus, Pakistan) for his valuable suggestions.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article* **Cloning and Expression Analysis of the** *BocMBF1c* **Gene Involved in Heat Tolerance in Chinese Kale**

**Lifang Zou 1,2, Bingwei Yu 1, Xing-Liang Ma 2, Bihao Cao 1, Guoju Chen 1, Changming Chen <sup>1</sup> and Jianjun Lei 1,3,\***


Received: 3 September 2019; Accepted: 8 November 2019; Published: 11 November 2019

**Abstract:** Chinese kale (*Brassica oleracea* var*. chinensis* Lei) is an important vegetable crop in South China, valued for its nutritional content and taste. Nonetheless, the thermal tolerance of Chinese kale still needs improvement. Molecular characterization of Chinese kale's heat stress response could provide a timely solution for developing a thermally tolerant Chinese kale variety. Here, we report the cloning of *multi-protein bridging factor* (*MBF*) *1c* from Chinese kale (*BocMBF1c*), an ortholog to the key heat stress responsive gene *MBF1c*. Phylogenetic analysis showed that *BocMBF1c* is highly similar to the stress-response transcriptional coactivator *MBF1c* from *Arabidopsis thaliana* (*AtMBF1c*), and the *BocMBF1c* coding region conserves MBF1 and helix-turn-helix (HTH) domains. Moreover, the promoter region of *BocMBF1c* contains three heat shock elements (HSEs) and, thus, is highly responsive to heat treatment. This was verified in *Nicotiana benthamiana* leaf tissue using a green fluorescent protein (GFP) reporter. In addition, the expression of *BocMBF1c* can be induced by various abiotic stresses in Chinese kale which indicates the involvement of stress responses. The BocMBF1c-eGFP (enhanced green fluorescent protein) chimeric protein quickly translocated into the nucleus under high temperature treatment in *Nicotiana benthamiana* leaf tissue. Overexpression of *BocMBF1c* in *Arabidopsis thaliana* results in a larger size and enhanced thermal tolerance compared with the wild type. Our results provide valuable insight for the role of *BocMBF1c* during heat stress in Chinese kale.

**Keywords:** Chinese kale; thermal tolerance; *BocMBF1c*; expression; localization

#### **1. Introduction**

Chinese kale (*Brassica oleracea* var. *chinensis* Lei Syn:*Brassica alboglabra* Bailey) is a popular vegetable for its health benefits and taste [1]. Chinese kale, belonging to the *Brassica* genus, is native to China and, currently, is predominantly cultivated in South China and Southeast Asia [1]. Recently, the importance of Chinese kale has generated interest in its nutrition and genetics [2]. In particular, Chinese kale contains high levels of glucoraphanin which may possess cancer prevention properties [3].

Unfortunately, Chinese kale is very sensitive to heat stress, which often leads to decreased yield and quality [4]. Consequently, the main growing season and distribution of Chinese kale is limited to the winter season of South China.

High temperature can disturb the membrane, structure of protein, and chromatin architecture of the plant cells which respond with several signaling pathways to address the stress [5]. Many plant hormones are reported in the heat response of plants, and exogenous application of plant hormones can activate expression of heat stress-related genes thus improving thermal tolerance. For example, applying abscisic acid (ABA) can increase the thermal tolerance level by upregulating *ABA responsive transcription factors* (*ABRFs*)*, heat stress transcription factor* (*HSF*) *A2c* (*FaHSFA2c*), and *heat shock proteins* (*HSPs*) in tall fescue [6]; treatment with methyl jasmonate (MeJA) stimulates the expression of *HSP70* [7]; and exogenous salicylic acid (SA) increases the heat stress resistance of wheat by regulating the accumulation of ethylene (ET) and proline, increasing the efficiency of nitrogen usage [8]. Application of ET and jasmonic acid (JA) can activate thermal stress-related genes by upregulating *ethylene response factor* (*ERF*) *1* in *Arabidopsis*.

In plants, enormous progress has recently been made in understanding of the molecular mechanism of thermal tolerance [9]. In addition to HSF- and HSP-responsive pathways [9], the transcriptional coactivator *Multi-protein bridging factor1c* (*MBF1c*) was identified as a critical regulator for thermal tolerance responses in *Arabidopsis* [10]. In *Arabidopsis*, heat stress can lead to increased expression of *AtMBF1c* and triggers the nuclear localization of AtMBF1c proteins. Nuclear AtMBF1c then function as transcription factors to regulate the downstream SA, trehalose, and ET thermal resistance-related pathways [10]. The *AtMBF1c* is a trans-acting regulatory element which recognizes CTAGA as a potential binding sequence and can regulate 36 genes, including stress-responsive gene *dehydration-responsive element* (*DRE*)-*binding protein 2A* (*DREB2A*) [11]. In addition, overexpression of *AtMBF1c* and its ortholog from wheat can improve the performance of host plants under high temperature [11,12]. Therefore, *MBF1c* is an ideal gene for improving plant thermal tolerance and, thus, increase productivity under heat stress. The expression of the *MBF1c* gene in Antarctic Moss (*PaMBF1c*) can be induced by several abiotic stresses, and overexpression of *PaMBF1c* can enhance heat, cold, and salinity stress tolerance in *Arabidopsis* [12]. However, to date there have been no studies conducted on the *MBF1c* of Chinese kale. In this study, we reported the ortholog of *AtMBF1c* in Chinese kale, termed *BocMBF1c*, and characterized its potential role in regulating heat tolerance in Chinese kale.

#### **2. Results**

#### *2.1. Cloning and Analysis of the BocMBF1c Gene*

As the genome sequence of Chinese kale is still unavailable, we used the *Brassica rapa* MBF1c gene (LOC103828628) as a reference sequence for primer design and then cloned the coding region of *MBF1c* from Chinese kale (variety "Ai-jiao-xiang-gu") genomic DNA. The resultant amplicon of 707 bp was sequenced and confirmed to be *BocMBF1c* by sequence homology analysis (Figure 1a,b). Open reading fragment (ORF) prediction by ORF finder did not find any intron within the *BocMBF1c* gene, which was then confirmed by agarose gel analysis (Figure S1). Amino acid sequence analysis identified multi-protein bridging factor1 (MBF1) and helix-turn-helix (HTH) domains (Figure 1a). Phylogenetic analysis indicated that BocMBF1c clustered with *Arabidopsis* MBF1c (Figure 1b, Table S1). Next, we obtained the *MBF1c* gene and promoter region of *Brassica oleracea* (Bol013952) by searching the *BocMBF1c* coding sequence (CDS) using BLASTN (http://brassicadb.org/brad/blastPage.php). The 2005 bp DNA fragment containing the promoter of *BocMBF1c* was cloned with primers designed from *BoMBF1c*, from which we used 1409 bp upstream sequence from initiation codon for further analysis. The *BocMBF1c* gene was submitted to GenBank (Accession number: MH685643).

Sequence analysis of the 1409 bp *BocMBF1c* promoter revealed multiple stress and hormonal responsive cis-elements, including heat shock elements (HSEs), ethylene responsive elements (EREs), abscisic acid-responsive element (ABRE), MeJA responsive motifs (CGTCA-motif), drought stress and pathogen-responsive TC-rich repeat, and MYB binding sites (MBS). All of those cis-elements are located upstream of the TATA box (Figure 2).

**Figure 1.** Sequence feature of BocMBF1c protein. (**a**) DNA and amino acid sequence of BocMBF1c. Full-length DNA and deduced amino acid sequence of BocMBF1c. Red underline indicates multi-protein bridging factor1 (MBF1) gene family domain, and green underline signatures helix-turn-helix (HTH) domain. (**b**) Phylogenetic tree constructed by the neighbor-joining method based on the MBF1c amino acid sequences.


**Figure 2.** 1409 bp upstream of the start code are shown as the *BocMBF1c* promoter. DNA elements in the *BocMBF1c* promoter were designated. The start code is designated as bold and the TATA box as grass green. Three heat shock elements (HSEs) are boxed in red, ethylene-responsive elements (EREs) in green boxes, ABA-responsive elements (ABREs) in yellow boxes, methyl jasmonate (MeJA)-responding elements in purple, MYB binding sites (MBS) in the purple red box, and TC-rich repeats as stress-responding elements in the blue box.

We cloned the coding region of *BocMBF1c* from genomic DNA, and the deduced BocMBF1c protein contained the *MBF1* gene family domain and HTH domain (Figure 1a and Figure S2). The amino acid sequence of BocMBF1c shared 99% and 94% similarity with its ortholog from *Brassica napus* (CDY39547.1) and *Arabidopsis*, respectively (NP\_189093.1). We queried the BocMBF1 protein sequence using the BLASTP program (https://blast.ncbi.nlm.nih.gov/Blast.cgi), and the resulting sequences all belonged to angiosperm. The phylogenetic tree showed *BocMBF1c* was clustered in one group with *MBF1c* genes from dicots, while all *MBF1c* genes from monocots belonged to another group. In addition, *BocMBF1c* belonged to the *MBF1c* gene clade from the Brassicaceae family including *Brassica* *napus* and *Arabidopsis*. The *AtMBF1a*/*b* were clustered in an individual cluster, implying they belonged to different members of the *MBF1* gene family (Figure 1b, Table S1).

#### *2.2. Transient Expression Analysis Showed BocMBF1c Promoter Activity Can Be Induced by Heat Stress*

The *BocMBF1c* promoter contains several HSEs, indicating its role in heat stress response. Thus, we analyzed the activity of the *BocMBF1c* promoter under heat treatment. We combined the 1409 bp region upstream of the BocMBF1c start code with the CDS of *green fluorescent protein* (*GFP*), and then transiently expressed in *Nicotiana* (*N*.) *benthamiana* leaf tissue by agroinfiltration under ambient and heat-stressed conditions. As predicted, the activity of the *BocMBF1c* promoter was rapidly stimulated under heat stress in Tabaco leaves (Figure 3a,b), which is consistent with previous reports [13].

**Figure 3.** Heat treatment can stimulate *BocMBF1c* promoter activity. (**a**) Legend of pBI101-*pBocMBF1c*-*GFP* construct; *BocMBF1c* promoter was fused with *green fluorescent protein* (*GFP*) coding sequence (CDS) with the *NOS* terminator. (**b**) Treatment at 37 ◦C induced increased expression in the *Nicotiana* (*N*.) *benthamiana* leaves after agroinfiltration. Scale bar, 50 μm.

#### *2.3. Expression Pattern of BocMBF1c*

Under a standard growth environment, expression of *BocMBF1c* was measured in various tissues of Chinese kale (Figure 4a). The results showed *BocMBF1c* had a comparatively higher expression level in combining sites (CS) and leaf veins (LV) and maintained a comparatively low expression level in other tissues (Figure 4b).

Under heat stress, *BocMBF1c* expression rapidly increased >250 fold within 0.5 h and maintained this high expression level for at least 8 h in the leaves of Chinese kale (Figure 5a). When the Chinese kale was exposed to chilling conditions, *BocMBF1c* transcript abundance increased five-fold within 0.5 h, then slowly returned to unstressed levels by 4 h post-treatment (Figure 5b). As illustrated in Figure 2, the *BocMBF1c* promoter possessed a series of hormone and other related cis-elements. The qPCR analysis confirmed that *BocMBF1c* was moderately responsive to salinity treatment and several plant hormones including MeJA, ABA, SA, and ET (Figure 5c,d). Interestingly, the temporal response patterns of *BocMBF1c* expression were distinct among different treatments. Under heat and cold stress, the increase in the *BocMBF1c* gene reached full extent within 0.5 h. For salinity treatment,

the expression level of *BocMBF1c* reached the maximum after 8 h. When MeJA, ABA, SA, and ethephon (CEPA) were applied, the highest *BocMBF1c* transcription activities were detected after 2 h, 8 h, 2 h, and 16 h, respectively. Overall, our results indicated that other abiotic factors could also influence the transcript level of *BocMBF1c* besides responding to high-temperature conditions. These results indicate that *BocMBF1c* in Chinese kale involves resistance to multiple stresses including heat stress.

**Figure 4.** The expression pattern of *BocMBF1c* in different tissues under standard growth conditions. (**a**) Various tissues of Chinese kale. SL, senescent leaf; ML, mature leaf; YL, young leaf; LV, leaf vein; Pe, petiole; YB, young bolting stem flesh; MB, middle bolting stem flesh; BSS, bolting stem skin; FB, flower buds; CS, combining sites between stem and root; Ro, Roots. (**b**) Bolting stage tissues were harvested for expression specificity analysis. Quantitative PCR (qPCR) was used to measure *BocMBF1c* expression in different tissues, with the *BocTubulin8* gene as the control. Data shown are the mean ± SD of three biological replicates.

**Figure 5.** *Cont*.

**Figure 5.** Responses of *BocMBF1c* to various abiotic treatments. (**a**) semiRT-PCR (reverse transcription PCR) and qPCR were used to determine the expression profile under heat stress (37 ◦C) in the leaf tissue of Chinese kale. (**b**) Expression of *BocMBF1c* under cold treatment (4 ◦C). semiRT-PCR and qPCR were used to determine the expression profile under cold condition in the leaf tissue of Chinese Kale. (**c**) Responses of *BocMBF1c* to salinity treatment. (**d**) Expression of *BocMBF1c* after applying hormones. All analyses used *Tubulin8* as an internal reference gene, and all data shown were normalized to mock treatments. Data shown are the mean ± SD of three biological replicates.

#### *2.4. BocMBF1c Protein Localizes to the Nucleus under Heat Stress*

Under normal temperature, the GFP-tagged BocMBF1c protein shows no obvious distribution preference when transiently expressed in *N*. *benthamiana* leaf tissue (Figure 6b). However, the chimeric BocMBF1c-eGFP protein quickly concentrated into the nucleus when exposed to 37 ◦C for 1 h (Figure 6c). This translocation is critical for the functionality of this heat responsive transcription coactivator.

**Figure 6.** *Cont*.

**Figure 6.** Heat stress led to the nucleus localization of the BocMBF1c protein. (**a**) Schematics of the constructs for 35s::*eGFP* and 35s::*BocMBF1c*-*eGFP* (pBE-*BocMBF1c*-*eGFP*). (**b**) The GFP protein alone has no obvious localization preference either under normal temperature or high temperature in *N. benthamiana* leaf tissue. (**c**) BocMBF1c translocated to the nucleus when exposed to 37 ◦C for 1 h. Scale bar, 50 μm.

#### *2.5. Phenotype and Thermotolerance Analysis of BocMBF1c Overexpression Lines*

We selected three T4 homozygous transgenic lines (Figure S3) overexpressed with the *BocMBF1c* and β-glucuronidase (*GUS*) fusion gene driven by the 35S *cauliflower mosaic virus* (*CaMV*) promoter (Figure 7a) for further study. We named them *BocMBF1c*–OE1, *BocMBF1c*–OE2, and *BocMBF1c*–OE3. β-glucuronidase activity analyses were performed with 5 day old seedlings to validate the expression of *BocMBF1c* (Figure 7b). We also examined the growth of 2 week old *BocMBF1c-*OE plants. The *BocMBF1c-*OE plants show larger sizes compared to wild type (WT) under normal growth conditions (Figure 7c) which is consistent with a previous report [14]. The five-day-old seedlings grown on MS plates were used for thermal tolerance analysis. After treatment at 46 ◦C for 2 h, the transgenic lines showed a survival ratio of 21.7% to 30.0%, while only 10.7% of the WT plants survived after heat stress (Figure 7d).

**Figure 7.** *Cont*.

**Figure 7.** Phenotypic characterization of *BocMBF1c-OE* plants. (**a**) Schematic illustration of the 35s::*BocMBF1c-GUS* construct used for ectopic expression. (**b**) Expression of *BocMBF1c* was validated by β-glucuronidase (GUS) staining (*n* ≥ 15). (**c**) Compared with the wild type (WT) seedlings, the fresh weight of the *BocMBF1c*-OE plants indicated better growth, and the plants shown in the pictures were 14 days post-germination. Error bars represent the standard deviation of the means (*n* = 20). (**d**) All three *BocMBF1c*-OE lines showed a higher survival ratio after heat stress than WT (*n* =100). Error bars are the mean ± SD of three independent experiments. Photos were taken 2 days after heat stress. Asterisk indicates statistical significance by the Student's test (*p* < 0.05).

#### **3. Discussion**

#### *3.1. BocMBF1c Can Respond to Multiple Stresses and Hormone Treatment in Addition to Heat Stress in Chinese Kale*

We previously measured the effects of heat stress among 10 different Chinese kale varieties with different growth parameters, and the variety "Ai-jiao-xiang-gu" was found to be the most tolerable to heat stress [15]. We further measured the expression level of *BocMBF1c*; "Ai-jiao-xiang-gu" had the highest increase in expression level [15], indicating its important role in heat stress tolerance.

The promoter of *BocMBF1c* contained several HSEs; moreover, in *N. benthamiana* cells, the *BocMBF1c* promoter activity remained steady under ambient temperatures but was activated rapidly by high temperature stress. This is supported by semi-quantitative and qPCR evidence that the transcript abundance of *BocMBF1c* in planta is strongly and rapidly increased in response to heat stress treatment. Together, these data indicate that the expression of the *BocMBF1c* gene responds to heat stress via promoter activity. Our results showed that *BocMBF1c* can respond to cold, salinity, and several hormone treatments thus potentially possessing other functions besides thermal tolerance.

Bioinformatic analysis predicted that the *BocMBF1c* promoter contained ERE- and ABRE-responding cis-elements, implying ERFs and ABFs (ABRE binding factors)/ABEB could regulate the expression of *BocMBF1c*. Accumulation of ABA can initiate the expression of stress-responsive genes and participate in the regulation of the network through ABFs/AREB [16]. Ethylene and JA can stimulate the expression of downstream stress-related genes by activating ERF1 [17]. Previous reports showed that AtMBF1c functions upstream of SA and ET during thermotolerance [10]. From our results, *BocMBF1c* reacted distinctly under different hormone treatments. The expression levels of *BocMBF1c* both increased after exogenous application of MeJA and ABA although with different temporal response feature. For MeJA and ABA treatment, *BocMBF1c* expression peaked after 2 h and 4 h, respectively, which is possibly attributed to the difference in activation times for ERF1 and ABFs/AREB. After SA and ET application, the expression level of *BocMBF1c* both increased after a certain reduction period at the beginning. This indicates that *BocMBF1c* could possibly be negatively regulated by a high concentration of ET and SA, since the applied hormones would slowly degrade after application. With lower concentrations, ET can activate ERF1 thus stimulating the expression of *BocMBF1c*.

Overall, the accumulation of ABA, ET, and JA can activate corresponding transcription factors, which will then interact with the cis-elements located at the *BocMBF1c* promoter. The expressed BocMBF1c protein would then participate in the heat stress regulation network by regulating the downstream SA and ET signaling pathway (Figure 8).

**Figure 8.** A model for the relation of *BocMBF1c* and other hormones. abscisic acid (ABA) and ethylene (ET) and jasmonic acid (JA)/MeJA activate ABRE binding factors (ABFs)/ABRE and Ethylene response factor1 (ERF1), respectively. ABFs/ABRE and ERF1 then upregulate the *BocMBF1c* gene by interacting with cis-elements in the promoter region. The BocMBF1c protein can regulate the salicylic acid (SA) and ethylene (ET) heat stress responsive pathway. High concentrations of SA and ET will inhibit the expression of *BocMBF1c* through negative feedback.

Through analysis of the *Arabidopsis* public database (http://bar.utoronto.ca/efp\_arabidopsis/cgibin/efpweb.cgi), we found *AtMBF1c* could be upregulated by heat, salt, and ABA treatment, and that it does not respond to cold and MeJA treatment (Figure S4) (http://bar.utoronto.ca/efp\_arabidopsis/cgibin/efpWeb.cgi) [18]. The different responsive spectrums between *BocMBF1c* and *AtMBF1c* implies the *BocMBF1c* gene could participate in stress-related tolerance for the host plant.

The BocMBF1c protein quickly relocated to the nucleus when the cells were exposed to high temperature, further suggesting that this is a regulator for the expression of many heat stress-responsive genes. However, the mechanism of this nuclear translocation in a high-temperature environment remains unknown and should be the focus of future study. Likewise, downstream genes activated by this transcription factor remain unknown in Chinese kale, and transcriptome analysis by next-generation sequencing would be useful in the identification of candidate downstream genetic elements. Expression of these elements as well as known targets from other species can be further confirmed by qPCR. Earlier studies found the "CTAGA" element as a potential binding sequence for downstream genes; thus, it may be interesting to analyze the orthologs of the downstream genes in Chinese kale. This may

reveal the reason for the relative temperature sensitivity of Chinese kale. Thus, it is highly likely that the *BocMBF1c* gene is necessary for multiple biotic/abiotic stresses in Chinese kale.

#### *3.2. BocMBF1c Can be Applied to Improve the Productivity of Chinese Kale*

The high-level conservation of *MBF1c* sequences (Table S1 and Figure S2) implies that this gene is critical for thermal tolerance and climate adaptation. For Chinese kale, bolting stems are the main edible tissue. The genetic mechanism underlying declined yield and quality of bolting stems under heat stress are still elusive in Chinese kale; cloning of *BocMBF1c* genes as a key heat tolerance factor would shed light on improving the heat tolerance of Chinese kale.

Transformation of Chinese kale is readily feasible and has been used for improving drought stress tolerance [19]. Our results indicate that the *BocMBF1c* gene can respond to heat treatment from both transcription and protein localization level. Overexpression *BocMBF1c* in *Arabidopsis* result in larger-sized plants with enhanced heat tolerance, while knockout of the *BocMBF1c* gene will be needed to fully validate its role in the thermal tolerance of Chinese kale, for example, by the CRISPR/Cas9-based gene editing technique [20]. Ectopic expression of *AtMBF1c* can improve tolerance to bacterial infection, heat, and osmotic stress in *Arabidopsis*[14]. Further study showed that overexpression of *AtMBF1c* could improve thermal tolerance without impairing the yield in *Arabidopsis* [10], soybean [11], and rice [21] under controlled conditions. A recent report showed that overexpression of *MBF1c* in Antarctic moss could make the plant bigger and enhance tolerance to salinity stress in *Arabidopsis* [12]. Thus, it will be intriguing to overexpress the *BocMBF1c* gene as a potential solution to improving the biotic/abiotic stress resistance and productivity of Chinese kale.

#### **4. Materials and Methods**

#### *4.1. Plant Growth and Abiotic Treatment Conditions*

We collected Chinese kale samples for tissue-specific expression analysis from the greenhouse (22–25 ◦C) at the South China Agricultural University (Guangzhou, China) [2]. For abiotic treatment, Chinese kale (*B*. *oleracea* var. *chinensis* Lei) variety "Ai-jiao-xiang-gu" were grown in a culture room under controlled conditions at 25 ◦C, with a 16 h light/8 h dark cycle. Chinese kale seedlings of 4~6 true leaves were transferred to the growth chamber for 24 h with the same conditions, and then the temperature was switched to heat (37 ◦C) and cold (4 ◦C) treatments. Leaf tissues were harvested for RNA extraction at 0 h, 0.5 h, 1 h, 2 h, 4 h, and 8 h.

For salinity treatment, each Chinese kale seedling of 4~6 true leaves were sprayed and watered by 100 mL of 100 mM NaCl, and leaf tissues were collected with the timeline of 0 h, 1 h, 2 h, 4 h, 8 h, and 16 h. Similarly, hormone treatments were carried out by applying 50 μM of MeJA solution, 100 μM ABA, 100 μM SA, and 100 ppm CEPA, respectively, and leaf tissues were collected along the same timeline. All samples used in this study were collected from three separated Chinese kale.

*Arabidopsis thaliana* (ecotype Columbia) were grown in chamber under controlled condition at 22 ◦C day/18 ◦C night, 70% relative humidity, and 100 μmol m−<sup>2</sup> s−<sup>1</sup> with a 16 h light/8 h dark light cycle. *Arabidopsis* seeds were germinated on MS plates or soil (Sunshine® Mix #8, Sun Gro Horticulture) after being sterilized with 10% bleach for 10 min. All seeds were kept at 4 ◦C for at least 2 days before transfer to the growing chamber.

#### *4.2. Cloning of the BocMBF1c CDS and Promoter*

Leaves of Chinese kale were harvested for genomic DNA extraction [22]. We used the *MBF1c* gene from *Brassica rapa* (LOC103828628) as a reference sequence (Supplementary B1) to design primers for the CDS of *BocMBF1c*. The gDNA fragment containing the *BocMBF1c* CDS was amplified with primer MBF1c-up (5'-TCGAACTCTCCAGAAACTCGT-3') and MBF1c-dw (5'-CGTTTGCCAGATACGATGT-3'). Next, *BocMBF1c* promoter was amplified with primer MBF1cP-F (5'-AGAATCCCCATTGACAGCCT-3') and MBF1cP-R (5'-GGCATCGTCGCTGTTGTTTA-3'), designed

from the gene sequence of *Brassica oleracea* (Supplementary B2). All PCRs were carried out with HiFiTaq PCR StarMix (GeneStar) according to the manufacturer's protocol.

#### *4.3. Sequence Analysis of the BocMBF1c Gene*

The open reading frame (ORF) of *BocMBF1c* was predicted by the ORF finder (http:// www.bioinformatics.org/sms2/orf\_find.html), and the amino acid sequence was generated by the Expasy-Translate tool (https://web.expasy.org/translate/). Sequence query and alignment were carried out with BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi). A phylogenetic tree was constructed with the software Mega 6.05, and the conserved sites were identified by DNAMEN.

The cis-elements of the *BocMBF1c* promoter were predicted by PlantCARE Database (http: //bioinformatics.psb.ugent.be/webtools/plantcare/html/).

#### *4.4. Promoter Activity and Subcellular Localization*

The putative *BocMBF1c* promoter of the 1409 bp upstream sequence from the initiation codon was cloned into the binary vector pBI101-GFP with In-Fusion cloning, using primers designed following the manufacturer's instructions (http://bioinfo.clontech.com/infusion/convertPcrPrimersInit. do), including pBI101 infusion M-P-F (5'-GCAGGTCGACTCTAGAGGTACCGAATAGCTCTCCCC-3') and pBI101 infusion M-P-R (5'- CTTTACCCATTCTAGAGGCATCGTCGCTGTTGTTTA-3'). Similarly, CDS of the *BocMBF1c* was inserted into the pBE-eGFP construct by the In-Fusion technique, with primers MBF1-yxb–F (5'-CGCGGGCCCGGGATCCATGCCGAGCAGAT-3') and MBF1-yxb-R (5'-GGCGACCGGTGGATCCTTGACATGTTT-3'). *Xba* I was used to digest the pBI101-GFP vector, and pBE-eGFP was digested by *Bam*H I, resulting in pBI101-*pBocMBF1c*-*GFP* and pBE-*BocMBF1c*-*eGFP* constructions.

Transient expression was carried out in the leaf tissue of *Nicotiana benthamiana* seedlings with 4 to 6 true leaves, and with *Agrobacterium* strain GV3101 harboring the pBI101-*pBocMBF1c*-*GFP* and pBE-*BocMBF1c*-*eGFP* constructs. The GFP signal was observed two days after agro-infiltration under fluorescent microscopy [23].

#### *4.5. Gene Expression Patterns of BocMBF1c by qPCR and Semi-Quantitative RT-PCR*

The RNA was extracted using HiPure Plant RNA Kits (Magen), and HiScript® II Q Select RT SuperMix (Vazyme) was used for cDNA synthase. Gene expression was assessed by real-time PCR using AceQ® qPCR SYBR® Green Master Mix (Vazyme) and Bio-Rad iQ5 Multicolor Real-Time PCR Detection System. Primers MBF1c-YG1-UP (5'-GTTAACGCGGCTCTCAGAAG-3') and MBF1c YG1-DW (5'-TCCTCCAGCTTCTTCGTGTT-3') were used for *BocMBF1c*; *Tubulin8* was used as an internal control with primers Tubulin-F (5'-CTTCTTTCGTGCTCATTTTGCC-3') and Tubulin-R (5'-CCATTCCCTCGTCTCCACTTCT-3') [19]. The qPCR conditions: initial denaturation of 95 ◦C for 5 min, then 40 cycles of 95 ◦C for 10 s followed by 60 ◦C for 30 s; melting curves were obtained by gradually increasing the temperature from 60 ◦C to 95 ◦C for 15 s at a rate of 0.5 ◦C /s. Relative expression was calculated by the ΔΔCt threshold method [24]. For heat and cold treatment, semi-quantitative RT-PCR was used to detect the expression pattern of *BocMBF1c* with Recombinant Taq DNA Polymerase TaKaRa Taq™ (Takara) and the same primers of qPCR. Semi-quantitative RT-PCR conditions: initial denaturation of 94 ◦C for 2 min, then 26 cycles for *Tubulin8* and 30 cycles for *BocMBF1c* of 95 ◦C for 30 s, 56 ◦C for 15 s, 72 ◦C for 10 s. Three biological replicates with triple technical replicates were used for all samples.

#### *4.6. Arabidopsis Transformation and Phenotypic Analysis of Transgenic Plants*

The *BocMBF1c* coding sequence was fused with *GUS* and then expressed in *Arabidopsis thaliana* driven by the *35S CaMV* promoter and then integrated into the *pBI121* binary vector. The floral dipping method [25] was used for *Arabidopsis* transformation. Transgenic seeds were selected on a 0.5 MS agar plate with 50 mg/L kanamycin and 300 mg/L timentin and then transplanted to soil for setting the seeds under standard growth conditions. Three homozygous transgenic lines from independent transformation events were used for further study (Figure S3).

Five-day-old *BocMBF1c*-OE and Col-0 *Arabidopsis* seedlings were collected for GUS staining analysis after germination on MS plates [13]. Two-week-old seedlings grown on soil were harvested for measuring fresh weight. For thermal tolerance analysis, five-day-old seedlings grown on MS plates were treated at 46 ◦C for 2 h, and then these plates were put back into growth chambers under normal condition for another 2 days.

**Supplementary Materials:** Supplementary materials can be found at http://www.mdpi.com/1422-0067/20/22/ 5637/s1.

**Author Contributions:** Conceptualization, C.C. and J.L.; Data curation, L.Z.; Formal analysis, L.Z., X.-L.M. and B.C.; Investigation, L.Z.; Methodology, L.Z., B.Y. and C.C.; Project administration, G.C. and J.L.; Supervision, J.L.; Validation, L.Z. and X.-L.M.; Visualization, L.Z. and X.-L.M.

**Funding:** This work was supported by the key project of Guangdong Science and Technology Section, grant numbers 2013B051000069 and 2014B020202005, and the key project of the Guangzhou Science Technology and Innovation Commission, grant number 201508030021.

**Acknowledgments:** We acknowledge Joanne Ernest (Global Institute for Food Security) for providing very valuable comments when preparing this manuscript; we thank Yehua He for providing the pBI-101-eGFP construct and Fengpin Wang for aiding in the agro-infiltration in tobacco leaves.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**


#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
