*2.2. Phylogenetic and Genomic Distribution and Organization Analysis of PP2C Gene Family*

To study the phylogenetic relationship of the *PP2C* genes among cotton plants (*G. arboreum*, *G. barbadense*, *G. hirsutum*, and *G. raimondii*), a maximum likelihood (ML) tree was constructed with *Arabidopsis* using MEGA 7.0. The phylogenetic tree revealed that the *PP2C* genes can be further divided into 12 subgroups, as previously reported [26,27] (Supplementary Figures S1–S3). Moreover, 181 *GhPP2C* genes were clustered into 12 subgroups (A–L) with *Arabidopsis*. In the phylogenetic tree, subgroup D had the most members of the *PP2C* genes (38), followed by subgroup E (27), and the least number of *PP2C* genes was observed in subgroup L, having 4 *GhPP2C* genes (Figures 1 and 2a). Moreover, we also analyzed the conserved motifs and gene structure based on phylogenetic relationships to provide insight into the structural features of the PP2C members in *G. hirsutum*. For GhPP2C proteins, 10 conserved motifs were acquired with the help of MEME. The majority of the PP2C family contained motifs 7 and 2, except a few genes showed motifs 3 and 4. This indicates that all the identified *PP2C* genes have typical family features and the proteins classified into the same subgroup share similar amino acid sequences (Figure 2b). Alongside, their logos were obtained by online MEME server. A total of 10 types of consensus motifs were obtained in all of the GhPP2C proteins and their distribution patterns are presented in Supplementary Figure S4. To better understand the gene structure of *PP2C* genes in cotton, exon-intron organizations of these genes were also tested (Figure 2c) [26], and most of the subgroups contained 1-10 introns. These results specify that *PP2C* genes in the same subgroup show more or less similar exon-intron organization.

color.

*Int. J. Mol. Sci.* **2019**, *20*, x; doi: FOR PEER REVIEW www.mdpi.com/journal/ijms

**Figure 1.** Phylogenetic relationship of *PP2C* genes between *G*. *hirsutum* and *A. thaliana*. The phylogenetic tree was constructed by MEGA 7 using the Maximum Likelihood Method (1000 bootstrap). Genes of protein phosphatase (PP2C) from different subgroups are marked with various **Figure 1.** Phylogenetic relationship of *PP2C* genes between *<sup>G</sup>*. *hirsutum* and *A. thaliana*. The phylogenetictree was constructed by MEGA 7 using the Maximum Likelihood Method (1000 bootstrap). Genes of protein phosphatase (PP2C) from different subgroups are marked with various color.

*Int. J. Mol. Sci.* **2019**, *20*, x; doi: FOR PEER REVIEW www.mdpi.com/journal/ijms

**Figure 2.** (**a**) (**b**) and (**c**). Phylogenetic relationships, gene structure, the exon-intron, and upstream/downstream region are represented, respectively. The phylogenetic tree was constructed by MEGA 7 using the Maximum Likelihood Method (1000 bootstrap). At the bottom of the figure, the relative position is proportionally displayed based on the kilobase scale. **Figure 2.** (**a**) (**b**) and (**c**). Phylogenetic relationships, gene structure, the exon-intron, and upstream/ downstream region are represented, respectively. The phylogenetic tree was constructed by MEGA7 using the Maximum Likelihood Method (1000 bootstrap). At the bottom of the figure, the relative position is proportionally displayed based on the kilobase scale.

#### *2.3. Chromosomal Localization and Syntenic Relationships of PP2C Gene Family 2.3. Chromosomal Localization and Syntenic Relationships of PP2C Gene Family*

The chromosomal localization of A and D genomes of *G. hirsutum* was analyzed using Tbtools software. A total of 89 *GhPP2C* genes were allocated in D genome (D01–D13) ranging from 2–14 genes per chromosome and only 2 genes were found on the scaffold. Moreover, every chromosome showed variation in a number of genes, such as D04 exhibited the highest number of *GhPP2C* genes (14), followed by 12 genes in D02, and the least number of genes (2) were recorded for D07 (Figure The chromosomal localization of A and D genomes of *G. hirsutum* was analyzed using Tbtools software. A total of 89 *GhPP2C* genes were allocated in D genome (D01–D13) ranging from 2–14 genes per chromosome and only 2 genes were found on the scaffold. Moreover, every chromosome showed variation in a number of genes, such as D04 exhibited the highest number of *GhPP2C* genes (14), followed by 12 genes in D02, and the least number of genes (2) were recorded for D07 (Figure 3).

3). On the other hand, we also demonstrated the chromosomal localization for A genome (A01–A12). A total of 79 genes were found ranging from 2–15 per chromosomes and 9 of them were located on On the other hand, we also demonstrated the chromosomal localization for A genome (A01–A12). A total of 79 genes were found ranging from 2–15 per chromosomes and 9 of them were located on the scaffold. The highest number of genes was found on A05 (15), followed by A11 (8), and the least number of *PP2C* genes (2) were found on A13 (Figure 4). These findings suggested that GhPP2C were allocated unevenly to different chromosomal locations. *Int. J. Mol. Sci.* **2019**, *20*, x; doi: FOR PEER REVIEW www.mdpi.com/journal/ijms the scaffold. The highest number of genes was found on A05 (15), followed by A11 (8), and the least number of *PP2C* genes (2) were found on A13 (Figure 4). These findings suggested that GhPP2C were allocated unevenly to different chromosomal locations. *Int. J. Mol. Sci.* **2019**, *20*, x; doi: FOR PEER REVIEW www.mdpi.com/journal/ijms the scaffold. The highest number of genes was found on A05 (15), followed by A11 (8), and the least number of *PP2C* genes (2) were found on A13 (Figure 4). These findings suggested that GhPP2C

Moreover, a collinear correlation was also demonstrated between *G. hirsutum* and *Arabidopsis* (A and D genomes) (Figure 5a,b). To validate our results, we also performed a collinear relationship of *PP2C* genes using only *G. hirsutum* (Supplementary Figure S5). The results exhibited high conservation among PP2C members between the A and D genomes of cotton. Moreover, a collinear correlation was also demonstrated between *G. hirsutum* and *Arabidopsis* (A and D genomes) (Figure 5a,b). To validate our results, we also performed a collinear relationship of *PP2C* genes using only *G. hirsutum* (Supplementary Figure S5). The results exhibited high conservation among PP2C members between the A and D genomes of cotton. were allocated unevenly to different chromosomal locations. Moreover, a collinear correlation was also demonstrated between *G. hirsutum* and *Arabidopsis* (A and D genomes) (Figure 5a,b). To validate our results, we also performed a collinear relationship of *PP2C* genes using only *G. hirsutum* (Supplementary Figure S5). The results exhibited high conservation among PP2C members between the A and D genomes of cotton.

**Figure 3.** Chromosomal locations of the PP2C D genome of cotton were obtained from the Generic File Format (GFF) file and displayed using TBtools software. The red line indicates the collinear relationship for different chromosomes. **Figure 3.** Chromosomal locations of the PP2C D genome of cotton were obtained from the Generic File Format (GFF) file and displayed using TBtools software. The red line indicates the collinear relationship for different chromosomes. **Figure 3.** Chromosomal locations of the PP2C D genome of cotton were obtained from the Generic File Format (GFF) file and displayed using TBtools software. The red line indicates the collinear relationship for different chromosomes.

**Figure 4.** Chromosomal locations of PP2C A genome of cotton were obtained from the GFF file and displayed using TBtools software. The red line indicates the collinear relationship for different chromosomes. **Figure 4.** Chromosomal locations of PP2C A genome of cotton were obtained from the GFF file and displayed using TBtools software. The red line indicates the collinear relationship for different **Figure 4.** Chromosomal locations of PP2C A genome of cotton were obtained from the GFF file and displayed using TBtools software. The red line indicates the collinear relationship for different chromosomes.

chromosomes.

*Int. J. Mol. Sci.* **2019**, *20*, x; doi: FOR PEER REVIEW www.mdpi.com/journal/ijms

**Figure 5.** (**a**) and (**b**). Chromosomal locations of PP2C (A and D) genome of cotton were obtained from the GFF file and displayed using TB tools software. The pink line indicated the collinear relationship for different chromosomes. **Figure 5.** (**a**) and (**b**). Chromosomal locations of PP2C (A and D) genome of cotton were obtained from the GFF file and displayed using TB tools software. The pink line indicated the collinear relationship for different chromosomes.

#### *2.4. Analysis of Putative Regulatory Cis-Element and Gene Duplication Analysis of PP2C Gene Family in Cotton 2.4. Analysis of Putative Regulatory Cis-Element and Gene Duplication Analysis of PP2C Gene Family in Cotton*

In the promoter region of *PP2C* genes, cis-acting elements play a critical role as stress-adaptive signaling in plants by interacting with their cognate transcription factor (TF). For instance, abscisic acid (ABA)-responsive elements (ABREs) are involved in salt, drought, and ABA signaling [28]. LTR is crucial to chilling response and regulation [29]. Likewise, TCA-elements and TGACG-motif are responsive to salicylic acid (SA) and MeJA treatments [30]. The exploration of the cis-acting element of *G. hirsutum PP2C* genes was executed by using the PlantCARE database and seven common cis-regulatory elements were briefly summarized (Figure 6 and Supplementary Table S5). The results unveiled that most of the genes contributed in various signaling pathways such as phytohormones and biotic and abiotic regulatory stress factors. On the other hand, *PP2C* genes are recognized to show a major part in both biotic and abiotic stress phenomena. Most of the genes were highly (34.88%) responsive to light (AE-BOX, BOX-4, LAMP-ELEMENTS, GAG-motif, GATA-motif), followed by (24.32%) hormones (ABRE, CGTCA, TGA, TCA, AuxRe, GARE-motif), (18.61%) and other regulatory cis-elements (HD-ZIP3, o2-site, AT-Rich elements, CAT-BOX, A-Box, EIRE), while the fewest genes (0.93% and 0.85%) were mainly counted in circadian and enhancer elements (5UTR Py-rich stretch, TA-Rich Region, and GC-motif), respectively. In plants, the circadian cis-regulatory element is known to control the circadian rhythms. Moreover, these results indicate that *PP2C* genes are vital in various biotic-abiotic/hormone signaling which might be hypothesized by their diversity In the promoter region of *PP2C* genes, cis-acting elements play a critical role as stress-adaptive signaling in plants by interacting with their cognate transcription factor (TF). For instance, abscisic acid (ABA)-responsive elements (ABREs) are involved in salt, drought, and ABA signaling [28]. LTR is crucial to chilling response and regulation [29]. Likewise, TCA-elements and TGACG-motif are responsive to salicylic acid (SA) and MeJA treatments [30]. The exploration of the cis-acting element of *G. hirsutum PP2C* genes was executed by using the PlantCARE database and seven common cis-regulatory elements were briefly summarized (Figure 6 and Supplementary Table S5). The results unveiled that most of the genes contributed in various signaling pathways such as phytohormones and biotic and abiotic regulatory stress factors. On the other hand, *PP2C* genes are recognized to show a major part in both biotic and abiotic stress phenomena. Most of the genes were highly (34.88%) responsive to light (AE-BOX, BOX-4, LAMP-ELEMENTS, GAG-motif, GATA-motif), followed by (24.32%) hormones (ABRE, CGTCA, TGA, TCA, AuxRe, GARE-motif), (18.61%) and other regulatory cis-elements (HD-ZIP3, o2-site, AT-Rich elements, CAT-BOX, A-Box, EIRE), while the fewest genes (0.93% and 0.85%) were mainly counted in circadian and enhancer elements (5UTR Py-rich stretch, TA-Rich Region, and GC-motif), respectively. In plants, the circadian cis-regulatory element is known to control the circadian rhythms. Moreover, these results indicate that *PP2C* genes are vital in various biotic-abiotic/hormone signaling which might be hypothesized by their diversity in nature.

in nature. For gene duplication analysis, we used MCScanX to determine the types of gene duplications and the results suggested most of the genes were segmental (167) and few of them were dispersed (12); thus, indicating that segmental duplication plays a major contribution to the expansion of the *PP2C* gene family. Moreover, the *PP2C* genes might have experienced functional discrepancy due to gene duplications, and few of them might have lost their unique functions, developed novel functions, or preserved partition of innovative functions [31,32]. During the evolutionary processes, genes are often exposed to various selective pressures such as positive, neutral, and purifying selection. Additionally, for a better understanding of the selection pressure between the duplicated genes, we calculated the *Ka/Ks* ratios among selected genes from segmental and dispersed (Figure 7 and Table 1). It was shown that only two pairs have a positive selection (>1) *Ka/Ks*, while the rest For gene duplication analysis, we used MCScanX to determine the types of gene duplications and the results suggested most of the genes were segmental (167) and few of them were dispersed (12); thus, indicating that segmental duplication plays a major contribution to the expansion of the *PP2C* gene family. Moreover, the *PP2C* genes might have experienced functional discrepancy due to gene duplications, and few of them might have lost their unique functions, developed novel functions, or preserved partition of innovative functions [31,32]. During the evolutionary processes, genes are often exposed to various selective pressures such as positive, neutral, and purifying selection. Additionally, for a better understanding of the selection pressure between the duplicated genes, we calculated the *Ka/Ks* ratios among selected genes from segmental and dispersed (Figure 7 and Table 1). It was shown that only two pairs have a positive selection (>1) *Ka/Ks*, while the rest were purifying in nature with <1.00, reducing divergence after duplication.

were purifying in nature with <1.00, reducing divergence after duplication.

*Int. J. Mol. Sci.* **2019**, *20*, x; doi: FOR PEER REVIEW www.mdpi.com/journal/ijms

**Figure 6.** The ratio of various cis-elements in cotton. **Figure 6.** The ratio of various cis-elements in cotton. **Figure 6.** The ratio of various cis-elements in cotton.

**Figure 7.** The correlation between *Ks* and *Ka* for duplicated genes (segmental and dispersed). **Figure 7.** The correlation between *Ks* and *Ka* for duplicated genes (segmental and dispersed).

**Figure 7.** The correlation between *Ks* and *Ka* for duplicated genes (segmental and dispersed). **Table 1.** The gene with outlier *Ka/Ks* values and types of duplications i.e., segmental and dispersed **Table 1.** The gene with outlier *Ka/Ks* values and types of duplications i.e., segmental and dispersed **Table 1.** The gene with outlier *Ka/Ks* values and types of duplications i.e., segmental and dispersed genes.


*2.5. Expression Profiling of PP2C Gene Family in Different Tissues of G. hirsutum 2.5. Expression Profiling of PP2C Gene Family in Different Tissues of G. hirsutum 2.5. Expression Profiling of PP2C Gene Family in Different Tissues of G. hirsutum*

To gain insight into the tissue-specific expression patterns of the cotton, previously reported [24] transcriptome data was utilized for various tissues (root, stem, leaf, patel, and stamen) and fiber development (3, 6, 9, 12, and 15 days post anthesis). For instance, Figure 8a reveals the expression levels of *PP2C* genes drastically varied in various tissues and most numbers of them were highly expressed. However, the expression of a few genes (GhPP2C12, 16, 46, 45, 47, 50, 57, 74, 79, 95, 97, 122, 124, 125, 132, 141, 154, 159, and GhPP2C170) did not show any striking expression in any tissues. To gain insight into the tissue-specific expression patterns of the cotton, previously reported [24] transcriptome data was utilized for various tissues (root, stem, leaf, patel, and stamen) and fiber development (3, 6, 9, 12, and 15 days post anthesis). For instance, Figure 8a reveals the expression levels of *PP2C* genes drastically varied in various tissues and most numbers of them were highly expressed. However, the expression of a few genes (GhPP2C12, 16, 46, 45, 47, 50, 57, 74, 79, 95, 97, 122, 124, 125, 132, 141, 154, 159, and GhPP2C170) did not show any striking expression in any tissues. Though, some of the *PP2C* genes were expressed in one or more tissues (GhPP2C15, 17, 18, 27, 40, 44, To gain insight into the tissue-specific expression patterns of the cotton, previously reported [24] transcriptome data was utilized for various tissues (root, stem, leaf, patel, and stamen) and fiber development (3, 6, 9, 12, and 15 days post anthesis). For instance, Figure 8a reveals the expression levels of *PP2C* genes drastically varied in various tissues and most numbers of them were highly expressed. However, the expression of a few genes (GhPP2C12, 16, 46, 45, 47, 50, 57, 74, 79, 95, 97, 122, 124, 125, 132, 141, 154, 159, and GhPP2C170) did not show any striking expression in any tissues.

Though, some of the *PP2C* genes were expressed in one or more tissues (GhPP2C15, 17, 18, 27, 40, 44, 46, 64, 68, 72, 78, 117, 127, 129, 138, 151, 168, and GhPP2C171). Intriguingly, the tissue-specific

46, 64, 68, 72, 78, 117, 127, 129, 138, 151, 168, and GhPP2C171). Intriguingly, the tissue-specific

Though, some of the *PP2C* genes were expressed in one or more tissues (GhPP2C15, 17, 18, 27, 40, 44, 46, 64, 68, 72, 78, 117, 127, 129, 138, 151, 168, and GhPP2C171). Intriguingly, the tissue-specific clustering (Figure 8b) showed that root (2), stamen (3), patel (3), and leaf (1) genes were commonly involved in the tissue developmental role in cotton. To gain further insights into the connection between these *GhPP2C* genes in tissue-specific responses, a correlation analysis was established based on the Pearson correlation coefficients (PCCs) (*p* = 0.05). Results showed a higher positive correlation among various specific tissues (Figure 8c and Supplementary Table S6). 80, 86, 89, 91, 98, 100, 110, 117, 137, 148, 155, 171, and GhPP2C178, respectively. As shown in Figure 9b, the clustering analysis indicated the common developmental genes in 3 days post anthesis (DPA) (4), 6DPA (2), 9DPA (3), and 15DPA (1), respectively. The PCCs-based correlation analysis of the relative gene expression of selected genes suggested a high positive correlation and low inverse correlation between selected genes. In addition, some genes exhibited an inverse correlation among various fiber developmental stages (Figure 9c and Supplementary Table S6). These findings suggested that *PP2C* genes share diverse and high expression patterns in tissues and fiber development, which implied *PP2C* genes are conserved.

(Figure 9a), but some of them were not expressed, such as GhPP2C11, 21, 25, 26, 31, 40, 42, 45, 54, 76,

*Int. J. Mol. Sci.* **2019**, *20*, x; doi: FOR PEER REVIEW www.mdpi.com/journal/ijms

clustering (Figure 8b) showed that root (2), stamen (3), patel (3), and leaf (1) genes were commonly involved in the tissue developmental role in cotton. To gain further insights into the connection between these *GhPP2C* genes in tissue-specific responses, a correlation analysis was established based on the Pearson correlation coefficients (PCCs) (*p* = 0.05). Results showed a higher positive

Furthermore, the expression patterns of fiber developmental stages (3, 6, 9, 12, and 15 days post

correlation among various specific tissues (Figure 8c and Supplementary Table S6).

**Figure 8.** (**A**) Heat map of expression profiles (in log2-based fragments per kilobase of transcript per million fragments mapped (FPKM)) for PP2C in the five various tissues—root, stem, leaf, patel, and stamen. The expression levels are indicated by the color bar. (**B**) Venn diagram analysis of the tissue expression of PP2C. (**C**) Pearson's correlation coefficients (PCCs) of FPKM-based values of 181 GhPP2Cs in different tissues using Rstudio. **Figure 8.** (**A**) Heat map of expression profiles (in log<sup>2</sup> -based fragments per kilobase of transcript per million fragments mapped (FPKM)) for PP2C in the five various tissues—root, stem, leaf, patel, and stamen. The expression levels are indicated by the color bar. (**B**) Venn diagram analysis of the tissue expression of PP2C. (**C**) Pearson's correlation coefficients (PCCs) of FPKM-based values of 181 GhPP2Cs in different tissues using Rstudio.

Furthermore, the expression patterns of fiber developmental stages (3, 6, 9, 12, and 15 days post anthesis) exhibited a dynamic expression level. Majority of the *PP2C* genes were highly expressed (Figure 9a), but some of them were not expressed, such as GhPP2C11, 21, 25, 26, 31, 40, 42, 45, 54, 76, 80, 86, 89, 91, 98, 100, 110, 117, 137, 148, 155, 171, and GhPP2C178, respectively. As shown in Figure 9b, the clustering analysis indicated the common developmental genes in 3 days post anthesis (DPA) (4), 6DPA (2), 9DPA (3), and 15DPA (1), respectively. The PCCs-based correlation analysis of the relative

gene expression of selected genes suggested a high positive correlation and low inverse correlation between selected genes. In addition, some genes exhibited an inverse correlation among various fiber developmental stages (Figure 9c and Supplementary Table S6). These findings suggested that *PP2C* genes share diverse and high expression patterns in tissues and fiber development, which implied *PP2C* genes are conserved. *Int. J. Mol. Sci.* **2019**, *20*, x; doi: FOR PEER REVIEW www.mdpi.com/journal/ijms

**Figure 9.** (**A**) Heat map of expression profiles (in log2-based FPKM) for PP2C in the five fiber developments: 3 days post anthesis (DPA), 6, 9, 12, and 15 DPA. The expression levels are indicated by the color bar. (**B**) Venn diagram analysis of the fiber development of PP2C. (**C**) Pearson's correlation coefficients (PCCs) of FPKM-based values of 181 GhPP2Cs in different fiber development **Figure 9.** (**A**) Heat map of expression profiles (in log<sup>2</sup> -based FPKM) for PP2C in the five fiber developments: 3 days post anthesis (DPA), 6, 9, 12, and 15 DPA. The expression levels are indicated by the color bar. (**B**) Venn diagram analysis of the fiber development of PP2C. (**C**) Pearson's correlation coefficients (PCCs) of FPKM-based values of 181 GhPP2Cs in different fiber development using Rstudio.

#### using Rstudio. *2.6. qRT-PCR Analysis of the Candidate PP2C Gene Family in Response to Various Stresses*

*2.6. qRT-PCR Analysis of the Candidate PP2C Gene Family in Response to Various Stresses*  A prediction of the cis-regulatory elements indicated that *GhPP2C* genes may participate in responses to heat, cold, drought, and NaCl stress tolerance. Moreover, the expression profiling of PP2C has been studied in various species after exposure to abiotic–biotic and hormones stresses [3,26]. To verify this hypothesis, we subjected the cotton seedlings to four various abiotic stress treatments such as heat, cold, drought, and NaCl stress, and then carefully selected 30 genes for qRT-PCR. The results showed that some *GhPP2C* genes exhibited high transcript levels after exposure to multiple treatments, but a few of them were induced by one or more treatments. For instance, heat stress possesses the dominant portion of down-regulated genes (52%). However, cold A prediction of the cis-regulatory elements indicated that *GhPP2C* genes may participate in responses to heat, cold, drought, and NaCl stress tolerance. Moreover, the expression profiling of PP2C has been studied in various species after exposure to abiotic–biotic and hormones stresses [3,26]. To verify this hypothesis, we subjected the cotton seedlings to four various abiotic stress treatments such as heat, cold, drought, and NaCl stress, and then carefully selected 30 genes for qRT-PCR. The results showed that some *GhPP2C* genes exhibited high transcript levels after exposure to multiple treatments, but a few of them were induced by one or more treatments. For instance, heat stress possesses the dominant portion of down-regulated genes (52%). However, cold stress showed 70% of the genes were up-regulated and 30% decreased in transcript level, followed by drought stress,

>0.5, mild positive <0.5 and >0, and negative correlation <0). The results emphasized that both cold and salt stress showed 12 each PCC values having a highly positive correlation, while negative

stress showed 70% of the genes were up-regulated and 30% decreased in transcript level, followed by drought stress, which exhibited about 53% and 47% of up- and down-regulated *GhPP2C* genes,

which exhibited about 53% and 47% of up- and down-regulated *GhPP2C* genes, respectively. On the other hand, exposure to salt stress resulted in 56% up-regulation and 44% down-regulation in genes (Figure 10). Among these 30 genes, we also calculated the Pearson correlation coefficient (PCC) based on the expression by making three categories (i.e., highly positive >0.5, mild positive <0.5 and >0, and negative correlation <0). The results emphasized that both cold and salt stress showed 12 each PCC values having a highly positive correlation, while negative correlation was recorded in heat and drought with 6 each PCC values (Figure 11 and Supplementary Table S6). Taken together, all the 30 genes were induced by different abiotic stresses, but the diversity in the expression profiling of *GhPP2C* genes may suggest that these genes may be critical to abiotic-stress responses. *Int. J. Mol. Sci.* **2019**, *20*, x; doi: FOR PEER REVIEW www.mdpi.com/journal/ijms correlation was recorded in heat and drought with 6 each PCC values (Figure 11 and Supplementary Table S6). Taken together, all the 30 genes were induced by different abiotic stresses, but the diversity in the expression profiling of *GhPP2C* genes may suggest that these genes may be critical to abiotic-stress responses.

**Figure 10.** Relative expression analysis by qRT-PCR of *PP2C* genes under heat, cold, drought, and salt stressing in *G*. *hirsutum*. **Figure 10.** Relative expression analysis by qRT-PCR of *PP2C* genes under heat, cold, drought, and salt stressing in *G*. *hirsutum*.

*Int. J. Mol. Sci.* **2019**, *20*, x; doi: FOR PEER REVIEW www.mdpi.com/journal/ijms

**Figure 11.** The Pearson's correlation coefficient (PCC) values under response to heat, cold, drought, and salt treatments. **Figure 11.** The Pearson's correlation coefficient (PCC) values under response to heat, cold, drought, and salt treatments.

#### **3. Discussion 3. Discussion**

In this study, we systematically investigate the *PP2C* genes based on the genome-wide analysis. Although the *PP2C* gene family has been in many species, the knowledge of *PP2C* genes in cotton is yet to be elucidated and their systematic analysis has not been reported. Previously, the *PP2C* gene family has been reported in different plant species such as maize [12], rice [11], *Arabidopsis* [10], hot pepper [33], banana [34], and *Brachypodium distachyon* [3]. Thus, we observed high variations in the number of *PP2C* genes that might be present during whole-genome duplication (WGD) events. For exploring new biological functions, evolutionary implications and its expansions are mainly based on gene duplications [35]. Therefore, to study evolutionary analysis and polyploid formation, cotton is an excellent model and ideal crop by being typically allotetraploid [36]. The importance of evolutionary analysis is further reflected in the fact that most of the angiosperms have undergone either one or multiple polyploidization events [37–39]. Herein, we systematically found PP2C-encoding genes in four cotton species—87 (*Gossypium arboreum*), 147 (*Gossypium barbadense*), 181 (*Gossypium hirsutum*), and 99 (*Gossypium raimondii*). Cotton species represent the high number of genes, specifying that the *PP2C* gene family have undergone extensive expansion during the evolution of cotton. In the current study, a comprehensive genome-wide analysis was executed, such as gene identification, gene structure organization, phylogenetic characterization, syntenic relationships, chromosomal localization, and gene duplications. Moreover, transcriptional profiling of 30 *PP2C* genes was also performed, after exposure to heat, cold, drought, and salt stress was also In this study, we systematically investigate the *PP2C* genes based on the genome-wide analysis. Although the *PP2C* gene family has been in many species, the knowledge of *PP2C* genes in cotton is yet to be elucidated and their systematic analysis has not been reported. Previously, the *PP2C* gene family has been reported in different plant species such as maize [12], rice [11], *Arabidopsis* [10], hot pepper [33], banana [34], and *Brachypodium distachyon* [3]. Thus, we observed high variations in the number of *PP2C* genes that might be present during whole-genome duplication (WGD) events. For exploring new biological functions, evolutionary implications and its expansions are mainly based on gene duplications [35]. Therefore, to study evolutionary analysis and polyploid formation, cotton is an excellent model and ideal crop by being typically allotetraploid [36]. The importance of evolutionary analysis is further reflected in the fact that most of the angiosperms have undergone either one or multiple polyploidization events [37–39]. Herein, we systematically found PP2C-encoding genes in four cotton species—87 (*Gossypium arboreum*), 147 (*Gossypium barbadense*), 181 (*Gossypium hirsutum*), and 99 (*Gossypium raimondii*). Cotton species represent the high number of genes, specifying that the *PP2C* gene family have undergone extensive expansion during the evolution of cotton. In the current study, a comprehensive genome-wide analysis was executed, such as gene identification, gene structure organization, phylogenetic characterization, syntenic relationships, chromosomal localization, and gene duplications. Moreover, transcriptional profiling of 30 *PP2C* genes was also performed, after exposure to heat, cold, drought, and salt stress was also carried out [12,27].

carried out [12,27]. In this study, the subcellular predictions for most of the members of GaPP2C, GbPP2C, GhPP2C, and GrPP2C were mainly located in different organelles such as the chloroplast, nuclear region, mitochondria, cytoplasm, and others. In addition, the gene characteristics of GaPP2C, GbPP2C, GhPP2C, and GrPP2C drastically vary including the protein length (aa), molecular weight (MW), predicted isoelectric point (PI), and a grand average of hydropathicity (GRAVY), signifying In this study, the subcellular predictions for most of the members of GaPP2C, GbPP2C, GhPP2C, and GrPP2C were mainly located in different organelles such as the chloroplast, nuclear region, mitochondria, cytoplasm, and others. In addition, the gene characteristics of GaPP2C, GbPP2C, GhPP2C, and GrPP2C drastically vary including the protein length (aa), molecular weight (MW), predicted isoelectric point (PI), and a grand average of hydropathicity (GRAVY), signifying that different PP2C proteins may play complex roles in variable microenvironments.

that different PP2C proteins may play complex roles in variable microenvironments. The estimate of evolutionary patterns, such as the rate of computing the selection pressure analysis (*Ka/Ks*), provides useful information about purifying, positive, and neutral selections of gene pairs during the rate of divergence [40]. In evolutionary events, if the value of the *Ka/Ks* ratio is <1.00, this represents purifying selection, a *Ka/Ks* ratio of 1.00 indicates neutral selection, and/or *Ka/Ks* >1.00 depicts positive selection [41,42]. Similarly, during evolutionary processes and expansion of a gene family, these indicators are used for the selection history. In this study, we The estimate of evolutionary patterns, such as the rate of computing the selection pressure analysis (*Ka/Ks*), provides useful information about purifying, positive, and neutral selections of gene pairs during the rate of divergence [40]. In evolutionary events, if the value of the *Ka/Ks* ratio is <1.00, this represents purifying selection, a *Ka/Ks* ratio of 1.00 indicates neutral selection, and/or *Ka/Ks* >1.00 depicts positive selection [41,42]. Similarly, during evolutionary processes and expansion of a gene family, these indicators are used for the selection history. In this study, we calculated these values among segmental and dispersed pairs of *PP2C* genes with the help of the MEGA7.0 program. Thus,

calculated these values among segmental and dispersed pairs of *PP2C* genes with the help of the MEGA7.0 program. Thus, we estimated the selection pressure analysis of duplicated genes (i.e., we estimated the selection pressure analysis of duplicated genes (i.e., segmental and dispersed) pairs. The majority of the GhPP2C pairs showed *Ka/Ks* ratios of less than 1.00, indicating the purifying selection, and only two pairs showed values more than 1.00, speculating the positive selection [31].

Plant productivity is always uncertain due to a range of climatic challenges such as heat, cold, drought, and salt stress being the major factors in limiting crop production. Previous studies reported that *AtPP2CG1* regulates positively against salt tolerance in *Arabidopsis* and is induced by drought, salt, or exogenous ABA treatment [43]. In *Arabidopsis*, two members of *PP2C* genes responded inversely; for example, *AP2C1* expression was powerfully tempted by drought, wounding, and cold but *AP2C2* was slightly prompted by these treatments too [44]. These findings highlighted the significance of critical members of the *PP2C* gene family in the model plant *Arabidopsis*, but their specific functions may be significantly varied in cotton. For achieving gene expression patterns in diverse growth phases of GhPP2C, we systematically analyzed the previously reported transcriptome data in various tissues (root, stem, leaf, patel, and stamen) and fiber development (3, 6, 9, 12, and 15 DPA). These results showed that most of the PP2C were highly varied, speculating the high diversity in their functions. However, few genes have shown tissue-specific expression, indicating their common importance to plant development. In *GhPP2C* genes, we also explored the promoter regions for identification of common conserved cis-regulatory elements. As a consequence, these results exhibit the participation of *PP2C* genes in various hormone signaling, including both biotic and abiotic stress factors. In the present study, to provide validation to our hypothesis, we also tested 30 candidate genes under various stresses using qRT-PCR. The majority of *PP2C* genes showed high striking transcriptional changes by exposure to heat, cold, drought, and salt stress, suggesting that *PP2C* genes might be crucial to stress tolerance in cotton. Noticeably, recent studies in *Arabidopsis* and rice also uncovered the pivotal role of PP2C candidate genes [11,45], however, in cotton, it is largely obscured. Moreover, 30 candidate genes involved in expression profiling demonstrated their critical role in *Gossypium hirsutum*.

Therefore, taken together, the results of our study provided valuable insight, signifying that PP2C has provided a stepping stone to the molecular mechanism in cotton. Alongside, the differential expression profiling and gene duplications analysis might have experienced functional divergence, and their further study will help considerably in improving the crop yield and quality and cultivating new resistant varieties.

## **4. Conclusions**

We analyzed the genome analysis, evolutionary rates, and molecular characterization of *PP2C* genes in the *Gossypium* genome. Herein, we identified 87 (*Gossypium arboreum*), 147 (*Gossypium barbadense*), 181 (*Gossypium hirsutum*), and 99 (*Gossypium raimondii*) *PP2C* genes by bioinformatics analysis in cotton species. Gene synteny analysis showed that GhPP2C are highly conserved, while the gene duplications analysis reflected that only segmental duplication plays a starring role in the *PP2C* gene extension in cotton. Also, the results of the phylogenetic analysis categorized the *PP2C* genes into 12 subgroups. We further explored the previously published RNA-sequence (RNA-seq) data to compare the tissue-specific response of *PP2C* genes and their critical role in organ development and fiber. Various *PP2C* genes responded promptly to abiotic stresses, including heat, cold, salt, and drought, suggesting their crucial role in abiotic stress tolerance. As a consequence, our findings will facilitate advanced research on the functional analysis of *PP2C* genes regarding their critical role in tissues, fiber development, and abiotic stress tolerance.

## **5. Materials and Methods**
