*Article* **Exploring the Molecular Mechanism underlying the Stable Purple-Red Leaf Phenotype in** *Lagerstroemia indica* **cv. Ebony Embers**

**Zhongquan Qiao** † **, Sisi Liu** † **, Huijie Zeng, Yongxin Li, Xiangying Wang, Yi Chen, Xiaoming Wang \* and Neng Cai \***

Hunan Academy of Forestry, 658 South Shaoshan Road, Changsha 410004, China;

qiaozhongquan110@163.com (Z.Q.); liusisi274@126.com (S.L.); run507@163.com (H.Z.); yh5403@sohu.com (Y.L.); S13617494939@163.com (X.W.); lotofa@163.com (Y.C.)


Received: 18 October 2019; Accepted: 8 November 2019; Published: 11 November 2019

**Abstract:** *Lagerstroemia indica* is an important ornamental tree worldwide. The development of cultivars with colorful leaves and increased ornamental value represents one of the current main research topics. We investigated the anthocyanin profiles in two contrasting cultivars for leaf color phenotypes and explored the underlying molecular basis. Both cultivars display purple-red young leaves (Stage 1), and when the leaves mature (Stage 2), they turn green in HD (Lagerstroemia Dynamite) but remain unchanged in ZD (Lagerstroemia Ebony Embers). Seven different anthocyanins were detected, and globally, the leaves of ZD contained higher levels of anthocyanins than those of HD at the two stages with the most pronounced difference observed at Stage 2. Transcriptome sequencing revealed that in contrast to HD, ZD tends to keep a higher activity level of key genes involved in the flavonoid–anthocyanin biosynthesis pathways throughout the leaf developmental stages in order to maintain the synthesis, accumulation, and modification of anthocyanins. By applying gene co-expression analysis, we detected 19 key MYB regulators were co-expressed with the flavonoid–anthocyanin biosynthetic genes and were found strongly down-regulated in HD. This study lays the foundation for the artificial manipulation of the anthocyanin biosynthesis in order to create new *L. indica* cultivars with colorful leaves and increased ornamental value.

**Keywords:** *Lagerstroemia indica*; gene expression; ornamental value; anthocyanins; leaf coloration; directional improvement

#### **1. Introduction**

*Lagerstroemia indica* L. is a deciduous shrub and small tree of the genus Lagerstroemia with a great ornamental value thanks to its attractive blossom, long-lasting flowering period, and vase-shaped features [1]. It originated in China and has long been used in landscaping in major cities, including Anyang, Fuyang, and Jincheng [2]. *L. indica* cultivars have a wide range of flower colors (white, red, purple, and their combined variants), which contrast with a dark green foliage. However, the few existing cultivars with both colorful flowers and leaves have attracted a great interest and are the prime choice on the market [3]. Therefore, the development of new cultivars with colorful leaves and increased ornamental value has become one of the key research directions in breeding programs. In line with this, the United States Department of Agriculture has released the cultivar 'Lagerstroemia Ebony Embers', which has stable purple-red leaves throughout its leaf development [4]. So far, efforts to develop new *L. indica* cultivars have been mainly based on traditional breeding techniques [5–8]. Hence, it is still tedious to achieve the directional improvement of leaf color in *L. indica*. It is expected

that modern molecular techniques will considerably facilitate and accelerate the improvement of leaf color in *L. indica* [9]. However, this requires a detailed understanding of the molecular mechanism of color formation in leaves of *L. indica*.

Color formation is one the most investigated and fascinating research questions in ornamental plants. Flavonoids, particularly anthocyanins, have been reported as the main coloring pigments in plants [10]. Anthocyanins provide a large spectrum of colors ranging from orange/red to violet/blue. Over the past decades, numerous works have clarified the biosynthetic pathway of anthocyanins, which is a very well-conserved network in plant species [11,12]. The key structural genes that catalyze the early and late steps of anthocyanin biosynthesis have been revealed and include phenylalanine ammonia-lyase (PAL), chalcone synthase (CHS), chalcone isomerase (CHI), flavonone 3-hydroxylase (F3H), flavonoid 3'-monooxygenase (F30H), dihydroflavonol 4-reductase (DFR), anthocyanin synthase (ANS), and UDP-glucose-flavonoid 3-*O*-glucosyltrasnferase (UFGT) [13]. The specific variation in the expression levels of these structural genes through various and complex regulation mechanisms results in quantitative and qualitative variations of anthocyanins, underlying the difference of colorations observed between species, genotypes, organs, or even between various positions on the same plant tissue. Transcription factors (TF) such as MYB, basic helix loop-helix, and WD40 genes were reported to be the key modulators of the anthocyanin biosynthetic structural genes [14–16], but other regulators belonging to the TF families of WRKY and NAC have also been discovered [17–19]. Moreover, recent studies have demonstrated that genetic mutations and microRNAs represent other forms of regulation of the anthocyanin biosynthetic genes [20,21]. The species-specific peculiarity of anthocyanin regulation mechanisms justifies the numerous studies on color formation in plants.

The overall goal of this study is to clarify the molecular mechanism of color formation in leaves of *L. indica*. To achieve this objective, we explored the key anthocyanins conferring the purple-red color in leaves of 'Lagerstroemia Ebony Embers' compared to the cultivar 'Lagerstroemia Dynamite', which features green-colored mature leaves. In addition, we investigated the competition mechanism between different branches of anthocyanin biosynthesis and the TFs regulating the anthocyanin biosynthetic genes. The findings from this study will guide the artificial manipulation of the anthocyanin biosynthesis in order to create new cultivars with colorful leaves and increased ornamental value.

#### **2. Results**

#### *2.1. Anthocyanin Analysis in the Leaves of the Two* Lagerstroemia indica *Cutlivars*

Two cultivars of *Lagerstroemia indica* with different leaf color phenotypes were studied. The two cultivars display purple-red young leaves (Stage 1), and when the leaves mature (Stage 2), they turn into green color in HD (Lagerstroemia Dynamite) but remain unchanged in ZD (Lagerstroemia Ebony Embers) (Figure 1A–D). Anthocyanins are known to be the major coloring pigments in plants [10]. We characterized the anthocyanin contents in the leaf samples of the two cultivars at the two stages of development. Seven anthocyanins, including peonidin O-hexoside, rosinidin O-hexoside, cyanidin O-syringic acid, cyanidin 3-O-glucoside (kuromanin), delphinidin 3-O-glucoside (mirtillin), cyanidin 3,5-O-diglucoside (cyanin), and cyanidin were detected (Table S1). Quantitative profiles showed that cyanidin was only detected in the leaves of HD, while the six other anthocyanins were present at different concentrations in the two cultivars. Globally, the leaves of ZD contained higher levels of anthocyanins than those of HD at the two stages with the most pronounced difference observed at Stage 2 (Figure 1E). In addition, the total anthocyanin content decreased from Stage 1 to Stage 2 in both cultivars, but the decrease was more conspicuous in HD (Figure 1E). This suggests that the leaf color change observed at Stage 2 in HD is associated with a significant decrease of total anthocyanins. Next, the concentrations of each metabolite were compared between the two cultivars (ZD-1\_vs\_HD-1 and ZD-2\_vs\_HD-2) and between the two developmental stages (HD-1\_vs\_HD-2 and ZD-1\_vs\_ZD-2) in order to identify the differentially accumulated metabolites (DAM) with the following parameters: variable importance in projection ≥1 and fold change ≥2 or fold change ≤0.5 [22]. In total, we found five, two, four, and seven DAM for HD-1\_vs\_HD-2, ZD-1\_vs\_ZD-2, ZD-1\_vs\_HD-1, and ZD-2\_vs\_HD-2, respectively (Table 1). This result further supports the premise that a strategy toward maintaining a higher content of all detected anthocyanins (except cyanidin) in ZD underpins the stable leaf coloration observed throughout the developmental stages. following parameters: variable importance in projection ≥1 and fold change ≥2 or fold change ≤0.5 [22]. In total, we found five, two, four, and seven DAM for HD-1\_vs\_HD-2, ZD-1\_vs\_ZD-2, ZD-1\_vs\_HD-1, and ZD-2\_vs\_HD-2, respectively (Table 1). This result further supports the premise that a strategy toward maintaining a higher content of all detected anthocyanins (except cyanidin) in ZD underpins the stable leaf coloration observed throughout the developmental stages.

*Int. J. Mol. Sci.* **2019**, *20*, x FOR PEER REVIEW 3 of 17

**Figure 1.** The phenotypes of young (**A**,**C**) and mature (**B**,**D**) leaves of HD and ZD. The bar represents 1.2 cm. (**E**) Total anthocyanin content measured in leaves of HD and ZD at young (Stage 1, S1) and mature (Stage 2, S2) stages. HD represents the cultivar Lagerstroemia Dynamite, while ZD represent the cultivar Lagerstroemia Ebony Embers. Data are from three replicate samples and represent the ion abundance of the anthocyanin compounds. **Table 1.** Concentration of the anthocyanins detected in the two *L. indica* cutlivars and their log2 fold-**Figure 1.** The phenotypes of young (**A**,**C**) and mature (**B**,**D**) leaves of HD and ZD. The bar represents 1.2 cm. (**E**) Total anthocyanin content measured in leaves of HD and ZD at young (Stage 1, S1) and mature (Stage 2, S2) stages. HD represents the cultivar Lagerstroemia Dynamite, while ZD represent the cultivar Lagerstroemia Ebony Embers. Data are from three replicate samples and represent the ion abundance of the anthocyanin compounds.


change values. Bold values are those significantly changed between compared samples. **Compounds HD-1 ZD-1 HD-2 ZD-2 Log2 Fold Change HD-ZD-ZD-ZD-Table 1.** Concentration of the anthocyanins detected in the two *L. indica* cutlivars and their log2 fold-change values. Bold values are those significantly changed between compared samples.

two stages and in triplicate. In total, 12 RNA-seq were generated, yielding a total of 283 millions reads *2.2. De Novo Transcriptome Assembly and Gene Expression Profiles in the Two* L. indica *Cutlivars at Di*ff*erent Leaf Developmental Stages*

In order to decode the genes involved in the differential leaf color phenotype in HD and ZD, we de novo sequenced and assembled the transcriptome from leaf samples of the two cultivars at the two stages and in triplicate. In total, 12 RNA-seq were generated, yielding a total of 283 millions reads and 84 Gb of clean data with 94% of bases scoring Q30 and above (Table 2). Using the Trinity software, 45,925 unigenes were assembled. To predict the functions of these genes, various databases were searched, including COG (18475), GO (33922), KEGG (17473), KOG (26429), Pfam (36768), Swiss-Prot (32110), eggNOG (42527), and NR (43088), resulting in a total of 43,208 functionally annotated genes (Figure 2A–D). NR database homologous species distribution analysis showed that *L. indica* (Lythraceae) shares 40% of its genes with *Eucalyptus grandis* (Myrtaceae), both species belonging to the same order: Myrtales (Figure 2E). We searched for genes encoding transcription factors (TF) and obtained a total of 2504 TFs classified into various families (Table S2).


**Table 2.** Overview of the transcriptome sequencing dataset and quality check.

. **Figure 2.** Functional annotation of the unigenes detected in *L. indica* based on orthologs from various databases. (**A**) KOG; (**B**) COG; (**C**) GO; (**D**) EggNOG; and (**E**) NR database homologous species distribution analysis. **Figure 2.** Functional annotation of the unigenes detected in *L. indica* based on orthologs from various databases. (**A**) KOG; (**B**) COG; (**C**) GO; (**D**) EggNOG; and (**E**) NR database homologous species distribution analysis.

Gene expression levels were estimated with the fragments per kilobase of exon per million fragments mapped (FPKM) values ranging from 0.01 to 786,889 (Figure 3A). To assess the quality of the replicate samples, we performed hierarchical clustering analysis based on FPKM data. The result showed that all the biological replicates clustered together, suggesting a high reliability of our sequencing data (Figure 3B). Moreover, two main groups were displayed, including one group (G1) for the green-colored leaf samples (HD-2) and one group (G2) for the purple-red-colored samples (HD-1, ZD-1 and ZD-2). Also, G2 could be split into two subgroups, including G2\_1 gathering samples from the young stage of both cultivars and G2\_2 gathering samples from the mature stage of ZD. Overall, the sample clustering pattern was clearly according to the leaf color phenotype. *Int. J. Mol. Sci.* **2019**, *20*, x FOR PEER REVIEW 6 of 17

**Figure 3.** Overview of the transcriptome sequencing in *L. indica* leaves. (**A**) Gene expression profiles in the 12 libraries. HD represents the cultivar Lagerstroemia Dynamite, while ZD represents the cultivar Lagerstroemia Ebony Embers; (**B**) heatmap clustering showing correlation among samples based on global expression profiles. G1 and G2 represent the two major groups of samples, while G2\_1 and G2\_2 represent the subgroups of the group G2. **Figure 3.** Overview of the transcriptome sequencing in *L. indica* leaves. (**A**) Gene expression profiles in the 12 libraries. HD represents the cultivar Lagerstroemia Dynamite, while ZD represents the cultivar Lagerstroemia Ebony Embers; (**B**) heatmap clustering showing correlation among samples based on global expression profiles. G1 and G2 represent the two major groups of samples, while G2\_1 and G2\_2 represent the subgroups of the group G2.

#### *2.3. Differentially Expressed Genes between the Two* L. indica *Cutlivars 2.3. Di*ff*erentially Expressed Genes between the Two* L. indica *Cutlivars*

To uncover the genes involved in the different levels of anthocyanins in leaves of HD and ZD, the gene expression values expressed as FPKM were compared between cultivars and developmental stages. The differentially expressed genes (DEG) were detected with the following parameters: fold change >2 and a false discovery rate correction set at *p* < 0.01. The results showed large numbers of DEGs between compared pair of samples HD-1\_vs\_HD-2 (9247), ZD-1\_vs\_ZD-2 (9852), ZD-1\_vs\_HD-1 (13990), and ZD-2\_vs\_HD-2 (14688) (Figure 4). Gene ontology (GO) enrichment analysis was performed for these four types of DEGs (Figure S1). The metabolic process and cellular process were the most enriched GO terms in the biological process, the cell and cell part were the most enriched cellular component GO terms, while catalytic activity and binding were clearly enriched as molecular functions. These results suggest that transcription factors (binding activity) and high enzymatic activity are involved in the modulation of leaf coloration in *L. indica*. The highest numbers of DEGs (approximately 1/3 of total expressed genes) were observed by comparing the two cultivars independently of the stages, which indicates a large variation in their genetic make up. We focused our analysis on the genes related to the flavonoid–anthocyanin biosynthesis and MYB transcription factors detected within the DEGs. To uncover the genes involved in the different levels of anthocyanins in leaves of HD and ZD, the gene expression values expressed as FPKM were compared between cultivars and developmental stages. The differentially expressed genes (DEG) were detected with the following parameters: fold change >2 and a false discovery rate correction set at *p* < 0.01. The results showed large numbers of DEGs between compared pair of samples HD-1\_vs\_HD-2 (9247), ZD-1\_vs\_ZD-2 (9852), ZD-1\_vs\_HD-1 (13990), and ZD-2\_vs\_HD-2 (14688) (Figure 4). Gene ontology (GO) enrichment analysis was performed for these four types of DEGs (Figure S1). The metabolic process and cellular process were the most enriched GO terms in the biological process, the cell and cell part were the most enriched cellular component GO terms, while catalytic activity and binding were clearly enriched as molecular functions. These results suggest that transcription factors (binding activity) and high enzymatic activity are involved in the modulation of leaf coloration in *L. indica*. The highest numbers of DEGs (approximately 1/3 of total expressed genes) were observed by comparing the two cultivars independently of the stages, which indicates a large variation in their genetic make up. We focused our analysis on the genes related to the flavonoid–anthocyanin biosynthesis and MYB transcription factors detected within the DEGs.

*Int. J. Mol. Sci.* **2019**, *20*, x FOR PEER REVIEW 7 of 17

**Figure 4.** Volcano plot showing the up-regulated genes, down-regulated genes, and genes that were not regulated between pairs of compared samples. (**A**) HD-1\_vs\_HD-2, (**B**) ZD-1\_vs\_ZD-2; (**C**) ZD-1\_vs\_HD-1; and (**D**) ZD-2\_vs\_HD-2. HD represents the cultivar Lagerstroemia Dynamite, while ZD represents the cultivar Lagerstroemia Ebony Embers. **Figure 4.** Volcano plot showing the up-regulated genes, down-regulated genes, and genes that were not regulated between pairs of compared samples. (**A**) HD-1\_vs\_HD-2, (**B**) ZD-1\_vs\_ZD-2; (**C**) ZD-1\_vs\_HD-1; and (**D**) ZD-2\_vs\_HD-2. HD represents the cultivar Lagerstroemia Dynamite, while ZD represents the cultivar Lagerstroemia Ebony Embers.

#### *2.4. DEGs Related to the Flavonoid–Anthocyanin Biosynthesis and Mechanisms Underlying the Differential Leaf Color Phenotypes*  Since we observed a higher content of total anthocyanins in the leaves of ZD than HD at the two *2.4. DEGs Related to the Flavonoid–Anthocyanin Biosynthesis and Mechanisms Underlying the Di*ff*erential Leaf Color Phenotypes*

developmental stages (Figure 1E), we further compared the expressed genes related to the flavonoid– anthocyanin biosynthesis between the two cultivars at each stage. We obtained 74 and 71 DEGs at Stage 1 and Stage 2, respectively, resulting in a total of 96 DEGs with the majority of these DEGs being higher expressed in ZD than HD, particularly at Stage 2 (Table S3). This result is not intriguing, and shows that a stronger activity of genes related to the flavonoid–anthocyanin biosynthesis in the leaves of ZD promotes a higher synthesis and accumulation of anthocyanins. To elucidate the molecular mechanism underlying the change in leaf color observed in HD while the leaf color of ZD remained stable, we investigated the DEGs between the two developmental stages in each cultivar. In total, 74 (10 up-regulated and 64 down-regulated) and 52 (nine up-regulated and 43 downregulated) genes involved in the flavonoid–anthocyanin biosynthesis were differentially expressed from Stage 1 to Stage 2 in HD and ZD, respectively. The higher number of altered genes, particularly the down-regulated genes, in HD as compared to ZD highlights a mechanism to strongly limit anthocyanin biosynthesis. Thirty DEGs displayed the same pattern between the two cultivars, suggesting that they are not involved in the differential leaf color phenotype (Table S4). In contrast, 44 DEGs showed either opposite patterns between the two cultivars or a huge difference in the gene expression fold change from Stage 1 to Stage 2. We infer that these genes are crucial for the differential leaf color phenotype observed in the two cultivars. To better understand how these genes affect the Since we observed a higher content of total anthocyanins in the leaves of ZD than HD at the two developmental stages (Figure 1E), we further compared the expressed genes related to the flavonoid–anthocyanin biosynthesis between the two cultivars at each stage. We obtained 74 and 71 DEGs at Stage 1 and Stage 2, respectively, resulting in a total of 96 DEGs with the majority of these DEGs being higher expressed in ZD than HD, particularly at Stage 2 (Table S3). This result is not intriguing, and shows that a stronger activity of genes related to the flavonoid–anthocyanin biosynthesis in the leaves of ZD promotes a higher synthesis and accumulation of anthocyanins. To elucidate the molecular mechanism underlying the change in leaf color observed in HD while the leaf color of ZD remained stable, we investigated the DEGs between the two developmental stages in each cultivar. In total, 74 (10 up-regulated and 64 down-regulated) and 52 (nine up-regulated and 43 down-regulated) genes involved in the flavonoid–anthocyanin biosynthesis were differentially expressed from Stage 1 to Stage 2 in HD and ZD, respectively. The higher number of altered genes, particularly the down-regulated genes, in HD as compared to ZD highlights a mechanism to strongly limit anthocyanin biosynthesis. Thirty DEGs displayed the same pattern between the two cultivars, suggesting that they are not involved in the differential leaf color phenotype (Table S4). In contrast, 44 DEGs showed either opposite patterns between the two cultivars or a huge difference in the gene expression fold change from Stage 1 to Stage 2. We infer that these genes are crucial for the differential leaf color phenotype observed in the two cultivars. To better understand how these genes affect the leaf color, we mapped them on the flavonoid–anthocyanin biosynthesis pathways (Figure 5), which have been well characterized in plants [11,12]. The main precursors for flavonoids are 4-coumaroyl CoA and three molecules of malonyl CoA that produce chalcones by chalcone synthase (CHS) [13]. We identified 17 chalcone synthase [EC:2.3.1.74] (CHS) genes, including 16 strongly down-regulated in HD from Stage 1 to Stage 2, but these genes were unaltered or just slightly down-regulated in ZD. Flavanones are produced from chalcones via chalcone isomerase [EC:5.5.1.6] (CHI). We detected four CHI down-regulated in HD, but they were all unaffected in ZD at Stage 2. The pathway is further catalyzed by flavanone 3-hydroxylase [EC:1.14.11.9] (F3H) to yield dihydrokaempferol and subsequently by flavonoid 3'-monooxygenase [EC:1.14.14.82] (F3'H) to yield dihydroquercetin. We found 10 F3'H DEGs, nine of which were strongly silenced in HD from Stage 1 to Stage 2, but were unaffected or just slightly down-regulated in ZD. Dihydroflavonol 4-reductase (DFR) catalyzes the synthesis of leucoanthocyanidins, which could be converted into anthocyanidins (by anthocyanidin synthase [EC:1.14.20.4] (ANS)) or proanthocyanidins (by leucoanthocyanidin reductase [EC:1.17.1.3] (LAR)). In contrast to the previous flavonoid–anthocyanin biosynthetic DEGs, we found only one LAR (F01\_transcript/54491) up-regulated in ZD but not affected in HD from Stage 1 to Stage 2, denoting a mechanism toward a high accumulation of proanthocyanidins in ZD leaves. Finally, anthocyanidins are converted into anthocyanins via UDP-flavonoid glucosyl transferase (UFGT) [13]. UFGT genes are active in the last step of anthocyanin modifications and without their actions, anthocyanins are unstable and can not accumulate in the cells to give the purple-red pigmentation [23]. In this study, we observed 12 various DEGs involved in this last step of anthocyanin modification, including two flavonol 3-O-glucosyltransferase [EC:2.4.1.91] (UFGT), seven anthocyanidin 3-O-glucosyltransferase [EC:2.4.1.115] (UA3GT), two anthocyanidin 5,3-O-glucosyltransferase [EC:2.4.1.-] (GT1), and one anthocyanidin 3-O-glucoside 2000-O-xylosyltransferase [EC:2.4.2.51] (UGT). Interestingly, the expression levels of 11 out of these 12 genes were highly repressed from Stage 1 to Stage 2 in HD, while the majority was stably expressed in ZD. Distinctively, the gene F01\_transcript/29830 (anthocyanidin 3-O-glucosyltransferase (UA3GT) was strongly up-regulated in ZD but was found to be repressed in HD. Collectively, our results demonstrate that in contrast to HD, ZD tends to keep a high activity level of key genes involved in the flavonoid–anthocyanin biosynthesis pathways throughout the leaf developmental stages in order to maintain the synthesis, accumulation, and modification of anthocyanins (probably proanthocyanidins, too). *Int. J. Mol. Sci.* **2019**, *20*, x FOR PEER REVIEW 8 of 17 leaf color, we mapped them on the flavonoid–anthocyanin biosynthesis pathways (Figure 5), which have been well characterized in plants [11,12]. The main precursors for flavonoids are 4-coumaroyl CoA and three molecules of malonyl CoA that produce chalcones by chalcone synthase (CHS) [13]. We identified 17 chalcone synthase [EC:2.3.1.74] (CHS) genes, including 16 strongly down-regulated in HD from Stage 1 to Stage 2, but these genes were unaltered or just slightly down-regulated in ZD. Flavanones are produced from chalcones via chalcone isomerase [EC:5.5.1.6] (CHI). We detected four CHI down-regulated in HD, but they were all unaffected in ZD at Stage 2. The pathway is further catalyzed by flavanone 3-hydroxylase [EC:1.14.11.9] (F3H) to yield dihydrokaempferol and subsequently by flavonoid 3'-monooxygenase [EC:1.14.14.82] (F3'H) to yield dihydroquercetin. We found 10 F3'H DEGs, nine of which were strongly silenced in HD from Stage 1 to Stage 2, but were unaffected or just slightly down-regulated in ZD. Dihydroflavonol 4-reductase (DFR) catalyzes the synthesis of leucoanthocyanidins, which could be converted into anthocyanidins (by anthocyanidin synthase [EC:1.14.20.4] (ANS)) or proanthocyanidins (by leucoanthocyanidin reductase [EC:1.17.1.3] (LAR)). In contrast to the previous flavonoid–anthocyanin biosynthetic DEGs, we found only one LAR (F01\_transcript/54491) up-regulated in ZD but not affected in HD from Stage 1 to Stage 2, denoting a mechanism toward a high accumulation of proanthocyanidins in ZD leaves. Finally, anthocyanidins are converted into anthocyanins via UDP-flavonoid glucosyl transferase (UFGT) [13]. UFGT genes are active in the last step of anthocyanin modifications and without their actions, anthocyanins are unstable and can not accumulate in the cells to give the purple-red pigmentation [23]. In this study, we observed 12 various DEGs involved in this last step of anthocyanin modification, including two flavonol 3-O-glucosyltransferase [EC:2.4.1.91] (UFGT), seven anthocyanidin 3-O-glucosyltransferase [EC:2.4.1.115] (UA3GT), two anthocyanidin 5,3-Oglucosyltransferase [EC:2.4.1.-] (GT1), and one anthocyanidin 3-O-glucoside 2'''-O-xylosyltransferase [EC:2.4.2.51] (UGT). Interestingly, the expression levels of 11 out of these 12 genes were highly repressed from Stage 1 to Stage 2 in HD, while the majority was stably expressed in ZD. Distinctively, the gene F01\_transcript/29830 (anthocyanidin 3-O-glucosyltransferase (UA3GT) was strongly upregulated in ZD but was found to be repressed in HD. Collectively, our results demonstrate that in contrast to HD, ZD tends to keep a high activity level of key genes involved in the flavonoid– anthocyanin biosynthesis pathways throughout the leaf developmental stages in order to maintain the synthesis, accumulation, and modification of anthocyanins (probably proanthocyanidins, too).

**Figure 5.** Flavonoid–anthocyanin biosynthetic genes in *L. indica*. The differentially expressed genes between HD and ZD from Stage 1 (young leaves) to Stage 2 (mature leaves) are highlighted in red color. Phenylalanine ammonia-lyase (PAL), cinnamic acid 4-hydroxylase (C4H), 4 coumarate CoA ligase (4CL), chalcone synthase (CHS), chalcone isomerase (CHI), flavanone 3-hydroxylase (F3H), **Figure 5.** Flavonoid–anthocyanin biosynthetic genes in *L. indica*. The differentially expressed genes between HD and ZD from Stage 1 (young leaves) to Stage 2 (mature leaves) are highlighted in red color. Phenylalanine ammonia-lyase (PAL), cinnamic acid 4-hydroxylase (C4H), 4 coumarate CoA ligase (4CL), chalcone synthase (CHS), chalcone isomerase (CHI), flavanone 3-hydroxylase (F3H), flavonoid 3'-monooxygenase (F30H), dihydroflavonol 4-reductase (DFR), by anthocyanidin synthase (ANS), leucoanthocyanidin reductase (LAR), UDP-flavonoid glucosyl transferase (UFGT), anthocyanidin 3-O-glucosyltransferase (UA3GT), anthocyanidin 5,3-O-glucosyltransferase (GT1), and anthocyanidin 3-O-glucoside 2000-O-xylosyltransferase (UGT). The heatmap show the log2 fold change of the gene expression from Stage 1 to Stage 2. HD represents the cultivar Lagerstroemia Dynamite, while ZD represents the cultivar Lagerstroemia Ebony Embers.

#### *2.5. Active MYB Transcripion Factors Regulating Gene Expression for the Di*ff*erential Leaf Color Phenotypes*

It has been documented in several plant species that the structural genes involved in the flavonoid–anthocyanin biosynthesis pathways are mainly regulated by MYB transcription factors [24]. In total, 663 and 620 TFs DEGs were involved in gene regulation activity from Stage 1 to Stage 2 in ZD and HD, respectively (Table S5). Among these TFs, we retrieved 61 and 60 MYB TFs in ZD and HD, respectively. Comparative analysis of the gene expression fold change of these MYB TFs showed that 36 MYBs were commonly differentially expressed in both cultivars with similar fold changes within each stage (Table S6). However, we uncovered 49 other MYBs genes, which exhibited contrasting expression patterns between the two cultivars and are likely to be the key regulators of the structural genes involved in the flavonoid–anthocyanin biosynthesis pathways in *L. indica* (Table S7).

The gene co-expression network approach constructs the network of genes (co-expressed modules) with co-activation across a group of samples. Genes with similar expression patterns under multiple, but resembling experimental conditions have a high probability of sharing similar functions or being involved in related biological pathways [25,26]. To better decipher the regulation pattern of the structural genes involved in the flavonoid–anthocyanin biosynthesis pathways by the candidate regulator MYBs, we performed a gene co-expression analysis [25]. To give more power to the gene co-expression analysis, we further sequenced the transcriptome from leaves of HD at two intermediate stages (IS-1 and IS-2) between Stage 1 and Stage 2, when the leaf color gradually changes form purple-red to green (Figure S2, Table S8). Gene co-expression analysis of a total of 18 RNA-seq data resulted into 22 co-expressed gene modules (Figure S3, Table S9). Interestingly, 19 key MYB regulators and 32 flavonoid–anthocyanin biosynthetic genes were co-expressed in three different modules: dark red, yellow, and blue. In each of these modules, the MYB transcription factors have a high probability of regulating the target co-expressed flavonoid–anthocyanin biosynthetic genes. In the dark red module, six MYB regulators are co-expressed with 18 structural genes, including CHS, CHI, F30H, UFGT, and UA3G (Figure 6A,B). The MYBs were preferentially down-regulated in HD from Stage 1 to Stage 2, which correlated with the strong down-regulation of the target structural genes in HD. This suggests that MYBs from this module are positive modulators of color formation in leaves of *L. indica*. Similarly in the yellow module, six MYBs were strongly down-regulated in HD as compared to ZD, and this correlated with a more reduced expression level of the structural genes (CHS, F30H, UFGT, GT1, and UA3G) in HD (Figure 6C,D). Finally, in the blue module, seven MYBs were preferentially up-regulated in ZD to induce the expression levels of one LAR and one UA3G gene (Figure 6E,F). Globally, the gene co-expression analysis revealed that MYBs are positive modulators of the structural genes and the strong down-regulation of most of these MYB regulators from Stage 1 to Stage 2 observed in HD may limit the activity of the enzymes that catalyze the flavonoid–anthocyanin biosynthesis pathways, resulting in a reduced anthocyanin accumulation in the leaves.

To confirm the differential expression levels of the candidate structural genes and the co-expressed MYB transcription factors detected by the RNA-seq analysis, we conducted a quantitative real-time PCR on 21 selected genes from all modules. As expected, the qRT-PCR results were well correlated with the RNA-seq report (*R* <sup>2</sup> = 0.84; Figure S4), demonstrating the reliability of the report from this study.

*Int. J. Mol. Sci.* **2019**, *20*, x FOR PEER REVIEW 10 of 17

**Figure 6.** Gene co-expression analysis of the MYB regulators and their target genes related to the flavonoid–anthocyanin biosynthesis pathways in *L. indica*. (**A–B**) expression fold change and network of the genes co-expressed in the dark red module; (**B–C**) expression fold change and network of the genes co-expressed in the yellow module; (**D–E**) expression fold change and network of the genes coexpressed in the blue module. HD represents the cultivar Lagerstroemia Dynamite, while ZD represents the cultivar Lagerstroemia Ebony Embers. **3. Discussion Figure 6.** Gene co-expression analysis of the MYB regulators and their target genes related to the flavonoid–anthocyanin biosynthesis pathways in *L. indica*. (**A**,**B**) expression fold change and network of the genes co-expressed in the dark red module; (**B**,**C**) expression fold change and network of the genes co-expressed in the yellow module; (**D**,**E**) expression fold change and network of the genes co-expressed in the blue module. HD represents the cultivar Lagerstroemia Dynamite, while ZD represents the cultivar Lagerstroemia Ebony Embers.

#### Although it is well known that anthocyanins are the key pigments coloring plant organs [10], **3. Discussion**

their composition and concentration greatly vary among plant species [27]; therefore, it is impossible to predict the key molecules underlying specific colorations in plants without a detailed metabolic profiling. Anthocyanidins are the aglycone units of anthocyanins, and there are six major types found widely in plants, namely pelargonidin, cyanidin, peonidin, delphinidin, petunidin, and malvidin [28]. There was no previous report of the leaf anthocyanin profile in *L. indica*, but several studies were conducted on the flowers of different cultivars. Collectively, four anthocyanins were reported in *L. indica* flowers, including delphinidin 3-O-glucoside, petunidin 3-Oglucoside, cyanidin 3-O-glucoside, and malvidin 3-O-glucoside [29–32]. It is worth mentioning that these authors analyzed flowers with various colors, including purple-red, purple, purple-violet, violet, and white. In the present study, we obtained a less diverse set of anthocyanins, and only two of the flower anthocyanins (delphinidin 3-O-glucoside and cyanidin 3-O-glucoside) were detected in the leaves. This is understandable since only one leaf color was studied here. The intriguing findings in this study are those regarding the diversity of cyanidin glycoside-derived and methylated-derived compounds in both cultivars (cyanidin O-syringic acid, cyanidin 3-O-glucoside, cyanidin 3,5-O-diglucoside, and rosinidin Ohexoside), although cyanidin was only found in HD. The absence of cyanidin in ZD implies that it is systematically converted into the glycoside-derived and methylated-derived forms. Pelargonidin and cyanidin are the red series pigments in plants [29]. The absence of pelargonidin in the leaves of both cultivars indicate that cyanidin derivatives represent the main molecules conferring the purplered coloration. Our results are in agreement with the previous report of Zhang et al. [32], who showed that cyanidin 3-O-glucoside was mainly concentrated in cultivars with purple-red flowers. Although it is well known that anthocyanins are the key pigments coloring plant organs [10], their composition and concentration greatly vary among plant species [27]; therefore, it is impossible to predict the key molecules underlying specific colorations in plants without a detailed metabolic profiling. Anthocyanidins are the aglycone units of anthocyanins, and there are six major types found widely in plants, namely pelargonidin, cyanidin, peonidin, delphinidin, petunidin, and malvidin [28]. There was no previous report of the leaf anthocyanin profile in *L. indica*, but several studies were conducted on the flowers of different cultivars. Collectively, four anthocyanins were reported in *L. indica* flowers, including delphinidin 3-O-glucoside, petunidin 3-Oglucoside, cyanidin 3-O-glucoside, and malvidin 3-O-glucoside [29–32]. It is worth mentioning that these authors analyzed flowers with various colors, including purple-red, purple, purple-violet, violet, and white. In the present study, we obtained a less diverse set of anthocyanins, and only two of the flower anthocyanins (delphinidin 3-O-glucoside and cyanidin 3-O-glucoside) were detected in the leaves. This is understandable since only one leaf color was studied here. The intriguing findings in this study are those regarding the diversity of cyanidin glycoside-derived and methylated-derived compounds in both cultivars (cyanidin O-syringic acid, cyanidin 3-O-glucoside, cyanidin 3,5-O-diglucoside, and rosinidin O-hexoside), although cyanidin was only found in HD. The absence of cyanidin in ZD implies that it is systematically converted into the glycoside-derived and methylated-derived forms. Pelargonidin and cyanidin are the red series pigments in plants [29]. The absence of pelargonidin in the leaves of both cultivars indicate that cyanidin derivatives represent the main molecules conferring the purple-red coloration. Our results are in agreement with the previous report of Zhang et al. [32], who showed that cyanidin 3-O-glucoside was mainly concentrated in cultivars with purple-red flowers.

RNA sequencing offers the opportunity to simultaneously profile the expression levels of thousand of genes [33]. Zhang et al. [34] and Wang et al. [35] sequenced the leaf transcriptome in *L. indica* to study the flowering regulatory genes and powdery mildew disease responsive genes, respectively. Globally, these authors assembled ~37000 genes, which is lower than the number of unigenes reported in the present study (45925). The difference in the numbers of detected genes may be attributed to the advanced sequencing platform and bioinformatic packages employed for unigene assembly in this work. Our goal was to explore the molecular mechanism underlying the differential leaf color phenotypes in the two cutivars, with a focus on the genes involved in the biosynthetic pathway of anthocyanins and their regulators [11,12]. In fact, the quantitative and qualitative variation of anthocyanins in plants are strongly correlated with the differential expression of key structural genes involved in the anthocyanin biosynthesis pathways [18,36]. In this study, several classes of structural genes related to the flavonoid–anthocyanin biosynthesis were differentially expressed between the two cultivars and have been mapped to the early steps (chalcone synthase (CHS), chalcone isomerase (CHI), and flavonoid 3'-monooxygenase (F30H)) and late steps (leucoanthocyanidin reductase (LAR), UDP-flavonoid glucosyl transferase (UFGT), anthocyanidin 3-O-glucosyltransferase (UA3GT), anthocyanidin 5,3-O-glucosyltransferase (GT1), and anthocyanidin 3-O-glucoside 2000-O-xylosyltransferase (UGT)) (Figure 5) [12]. These genes were globally down-regulated from the young leaf stage to the mature stage in both cultivars; however, we noticed that ZD tends to maintain a stronger activity as compared to HD (Table S3), which presumably favors the observed high accumulation of anthocyanins in ZD (Figure 1E). More often, genes belonging to either the early steps or the late steps, but not both simultaneously have been reported to differentially modulate anthocyanin contents in contrasting colored samples. For example, Chen et al. [37] demonstrated that low expression levels of C4H, CHS, and F3H in white petals, contrarily to the red petals of peach, reduce the formation of dihydro-kaempferol (DHK), and thereby inhibit the anthocyanin accumulation. In addition, Jiao et al. [38] showed that PAL was weakly expressed in the white-flesh peach and limits anthocyanin production. In contrast, Zhuang et al. [21] showed that a strong anthocyanin accumulation in purple turnip was attributed to an up-regulation of DFR, ANS, and UFGT genes. LAR converts leucoanthocyanidins into proanthocyanidins. In this study, the up-regulation of the LAR gene (*F01\_transcript*/*54491*) in ZD suggests an increment of the proanthocyanidins content from Stage 1 to Stage 2, but this mechanism may not be relevant to the stable leaf coloration observed in ZD, since proanthocyanidins are colorless in nature [39]. The class of genes involved in the modification of anthocyanidins (UDP-flavonoid glucosyl transferase (UFGT), anthocyanidin 3-O-glucosyltransferase (UA3GT), anthocyanidin 5,3-O-glucosyltransferase (GT1), and anthocyanidin 3-O-glucoside 2000-O-xylosyltransferase (UGT)) was particularly enriched (Figure 5). This was expected, since the anthocyanins detected in leaf samples were mainly glycoside-derived compounds (Table S1). Anthocyanidins are highly unstable and easily susceptible to degradation; therefore, glycosylation is essential to stabilize them [40]. Furthermore, glycosylation serve as a signal for transport of the anthocyanins to vacuoles, where they can function as pigments [41]. Since most of these genes were higher expressed in ZD than HD, correlating with the stronger content of glycoside-derived anthocyanins, we deduce that the glycosylation of anthocyanins (particularly cyanidin) is a key mechanism for the stable purple-red colored leaf phenotype observed in ZD, exactly as previously demonstrated in peach [42].

The expression levels of structural genes involved in the flavonoid–anthocyanin pathway are in part regulated by transcription factors (TF), particularly by the MYB family members [24]. We uncovered 49 candidate MYBs that are likely to be the key regulators of the structural genes involved in the flavonoid–anthocyanin pathway (Table S7). Many studies have reported several differentially expressed MYB genes as potential regulators, but the target genes of each specific MYB regulator and the regulatory network are often overlooked. The mechanisms of gene expression regulation by a TF could be simple (direct binding to the binding motif in the promoter region of the targets) or more complex, involving other cofactors. An example of the complex regulation mechanism is the feed-forward loop mechanism where a three-gene pattern is composed of two input transcription factors, one of which regulates the other, which both jointly regulate a target gene [43]. Hence, it is essential to clearly delineate the

network of interaction between candidate MYBs and their targets in order to facilitate the directional manipulation of the expression levels of the structural genes involved in the flavonoid–anthocyanin pathway. In this study, we revealed three co-expressed modules containing candidate MYB regulators and their target structural genes (Figure 6). Overall, we found that MYBs are positive regulators of these structural genes; therefore, increasing the activity levels of some MYBs from these co-expressed modules, particularly those from the dark red and yellow modules, may have high potential to confer stable purple-red coloration in the leaves of HD and other *L. indica* cultivars.

#### **4. Materials and Methods**

#### *4.1. Plant Materials*

Two cultivars of *Lagerstroemia indica* L. were used as plant materials. The cultivar Lagerstroemia Dynamite was developed by the Carl Whitcomb breeding program, (Carl Whitcomb Lacebark Inc. Stillwater, OK, USA) and features purple-red young leaves, which gradually turn into a green color when they mature (Figure 1, Figure S2). The second cultivar, Lagerstroemia Ebony Embers released by the USDA, displays stable purple-red leaves throughout all the leaf developmental stages. In this study, Lagerstroemia Dynamite and Lagerstroemia Ebony Embers were named as HD and ZD, respectively. Both cultivars were grown under natural environmental conditions in the experimental station of the Hunan Academy of Forestry, China. Leaf blades were collected at different developmental stages (Figure 1, Figure S2) from three independent plants (2 years old) of each cultivar, quickly frozen in liquid nitrogen, and stored at −80 ◦C until further use.

#### *4.2. Anthocyanin Analysis*

The sample preparation, extract analysis, anthocyanin identification and quantification were performed at Wuhan MetWare Biotechnology Co., Ltd. (www.metware.cn) following their standard procedures and previously described by Cao et al. [22]. Before the data analysis, quality control (QC) analysis was conducted to confirm the reliability of the data. The QC sample was prepared by the mixture of sample extracts and inserted into every four samples to monitor the changes in repeated analyses. Data matrices with the intensity of the metabolite features from the samples were uploaded to the Analyst 1.6.1 software (AB SCIEX, Ontario, Canada) for statistical analyses. The supervised multivariate method, partial least squares-discriminant analysis (PLS-DA), was used to maximize the metabolome differences between the pair of samples. The relative importance of each metabolite to the PLS-DA model was checked using the parameter called variable importance in projection (VIP). Metabolites with VIP ≥1 and fold change ≥2 or fold change ≤0.5 were considered as differential metabolites for group discrimination [22].

#### *4.3. Transcriptome Sequencing and Data Analysis*

RNA extraction, transcriptome library preparation, sequencing, and bioinformatics analysis were conducted at the Biomarker Technologies (Beijing, China, www.biomarker.com.cn) following their standard procedures and previously described by Zhu et al. [44]. Briefly, total RNA was extracted from the leaf samples using a Spin Column Plant total RNA Purification Kit (Sangon Biotech, Shanghai, China) according to the manufacturer's instructions. Sequencing libraries were constructed following the protocol of the Gene Expression Sample Prep Kit (Illumina, San Diego, CA, USA). The first-strand cDNAs were synthesized from the total RNA with random hexamer primers, followed by second-strand cDNAs synthesis using DNA polymerase I (New England BioLabs, Ipswich, MA, USA) and RNase H (Invitrogen, Waltham, MA, USA). After end repair, adaptor ligation, and the addition of index codes for each sample, PCR amplification was conducted. The purity and quality of the libraries were measured by an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA) and Qubit 2.0 (Life Technologies, Carlsbad, CA, USA). Then, the libraries were pair-end sequenced by using the Illumina HiSeq 2500 platform (Illumina Inc., San Diego, USA).

The raw RNA-seq reads were quality-checked with the FastQC package (http://www. bioinformatics.babraham.ac.uk/projects/fastqc/), and adaptor sequences and low-quality reads (containing >50% bases with a Phred quality score <15 and reads with more than 1% ambiguous residues N) were removed. The high-quality reads from all the libraries were de novo assembled into transcripts using the software Trinity (version r20140717, [45]) by employing the paired-end method. Next, the transcripts were realigned to construct unigenes. The assembled unigenes were annotated by searching against various databases such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) [46], Gene Ontology (GO) [47], Clusters of Orthologous Groups (COG) [48], PfAM [49], Swiss-Prot [50], eggNOG [51], NR [52], and euKaryotic Orthologous Groups (KOG) [53] using BLAST [54] with a threshold of E-value <sup>&</sup>lt;1.0 <sup>×</sup> <sup>10</sup>−<sup>5</sup> . The software KOBAS2.0 [55] was employed to get the unigene KEGG orthology. The analogs of the unigene amino acid sequences were searched against the Pfam database [48] using the HMMER tool [56] with a threshold of E-value <sup>&</sup>lt;1.0 <sup>×</sup> <sup>10</sup>−10. The unigenes were counted and normalized into fragments per kilobase of transcript per million fragments mapped reads (FPKM) value using RSEM [57]. Differentially expressed genes (DEGs) between pairs of samples were determined using the EdgeR Bioconductor package [58]. False discovery rate values less than 0.01 and |fold change|≥2 were set as criteria to decide the significant differences in gene expression.

#### *4.4. Gene Co-Expression Analysis*

Weighted Gene Co-Expression Network Analysis (WGCNA) package version 1.61 [59] was used to construct the gene co-expression networks from the normalized log2-transformed FPKM matrix as described by Lv et al. [25] and Dossa et al. [60]. Network visualization for the co-expressed gene modules related to MYB and flavonoid–anthocyanin biosynthesis pathways was performed using the Cytoscape software version 3.6.1 [61].

#### *4.5. Quantitative RT-PCR Analysis*

Quantitative PCR was performed using the SYBR Premix Ex Taq™ Kit (Takara, Dalian, China) according to the manufacturer's instructions on the StepOne plus Real time PCR Platform (Applied Biosystems, CA, USA) with the following protocol: 95 ◦C for 10 min, followed by 40 cycles of 95 ◦C for 15 s, and at 60 ◦C for 60 s [62]. Each reaction was performed using a 20-µL mixture containing 10 µL of 2 × ChamQ SYBR qPCR Master Mix, 6 µL of nuclease-free water, 1 µL of each primer (10 mM), and 2 µL of four-fold diluted cDNA. All of the reactions were run in 96-well plates, and each cDNA was analyzed in triplicate. Specific primer pairs of 21 selected genes were designed using the Primer Premier 5.0 [63] (Table S10). The *Actin* gene was used as the internal control. Data are presented as relative transcript levels based on the 2−∆∆Ct method [64].

**Supplementary Materials:** Supplementary materials can be found at http://www.mdpi.com/1422-0067/20/22/5636/ s1. Table S1. Quantification of the detected anthocyanins in the two *L. indica* cultivars at Stage 1 (young leaves) and Stage 2 (mature leaves); Table S2. Genes encoding transcription factors detected in *L. indica* transcriptome; Table S3. List of the differentially expressed genes between ZD and HD at Stage 1 and Stage 2 and their log2 fold change values; Table S4. List of the differentially expressed genes between Stage 1 and Stage 2 in HD and ZD that display the same fold-change patterns between the two cultivars; Table S5. Differentially expressed genes encoding transcription factors form Stage 1 to Stage 2; Table S6. List of the differentially expressed MYB genes between Stage 1 and Stage 2 in HD and ZD that display the same fold-change patterns between the two cultivars; Table S7. List of the differentially expressed MYB genes between Stage 1 and Stage 2 in HD and ZD that display different patterns of fold change of a huge difference between the two cultivars; Table S8. Overview of the transcriptome sequencing dataset and quality check of the leaves from HD at the intermediate stage (IS1 and IS2); Table S9. List of the genes belonging of each of the 22 modules detected through gene co-expression analysis; Table S10. The primer sequences of genes used for quantitative real time PCR; Figure S1. Gene ontology enrichment analysis of the differentially expressed genes between (A) HD-1\_vs\_HD-2, (B) ZD-1\_vs\_ZD-2; (C) ZD-1\_vs\_HD-1; (D) ZD-2\_vs\_HD-2. HD represents the cultivar Lagerstroemia Dynamite, while ZD represents the cultivar Lagerstroemia Ebony Embers. Figure S2. The phenotypes of HD leaves at the intermediate Stage 1 and Stage 2 when the leaf color is gradually turning from purple-red to green; Figure S3. Dendrogram clustering of the genes and identification of the co-expressed modules; Figure S4. qRT-PCR (2−∆∆*C*<sup>t</sup> ) analysis of 21 selected genes within the differentially expressed genes detected in this study. Correlation analysis between qRT-PCR and RNA-seq (log2 fold change).

**Author Contributions:** Y.C. and N.C. designed and supervised the study. Z.Q., X.W., and S.L. conducted the experiments. Z.Q., H.Z., and Y.L. performed data analysis. X.W. prepared plant materials and carried out qRT-PCR analysis. Z.Q. drafted the manuscript. All authors read and approved the final manuscript.

**Funding:** This research was funded by Hunan Provincial Natural Science Foundation (2019JJ50305) and Changsha Science and Technology Plan Project (kq1801026).

**Conflicts of Interest:** The authors declare no conflict of interest.

## **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Engineered Artificial MicroRNA Precursors Facilitate Cloning and Gene Silencing in Arabidopsis and Rice**

**Dandan Zhang 1,2,**† **, Nannan Zhang 1,3,**† **, Wenzhong Shen 1,\* and Jian-Feng Li 1,\***


Received: 24 September 2019; Accepted: 8 November 2019; Published: 10 November 2019

**Abstract:** Plant genome sequences are presently deciphered at a staggering speed, due to the rapid advancement of high-throughput sequencing technologies. However, functional genomics significantly lag behind due to technical obstacles related to functional redundancy and mutant lethality. Artificial microRNA (amiRNA) technology is a specific, reversible, and multiplex gene silencing tool that has been frequently used in generating constitutive or conditional mutants for gene functional interrogation. The routine approach to construct amiRNA precursors involves multiple polymerase chain reactions (PCRs) that can increase both time and labor expenses, as well as the chance to introduce sequence errors. Here, we report a simplified method to clone and express amiRNAs in Arabidopsis and rice based on the engineered Arabidopsis miR319a or rice miR528 precursor, which harbor restriction sites to facilitate one-step cloning of a single PCR product. Stem-loop reverse-transcriptase quantitative PCR (RT-qPCR) and functional assays validated that amiRNAs can be accurately processed from these modified precursors and work efficiently in plant protoplasts. In addition, Arabidopsis transgenic plants overexpressing the modified miR319a precursor or its derived amiRNA could exhibit strong gene silencing phenotypes, as expected. The simplified amiRNA cloning strategy will be broadly useful for functional genomic studies in Arabidopsis and rice, and maybe other dicotyledon and monocotyledon species as well.

**Keywords:** plant genome; artificial microRNA; gene silencing; Arabidopsis; rice

## **1. Introduction**

With the advent of whole-genome sequencing technologies, plant genomic data are expanding at an explosive rate. In the post-genomic era, analyzing these genomic data and studying the functions of newly discovered genes is critical for understanding the nature of plant genomes and accelerating the process of crop improvement. One of the most frequently used strategies to study gene function is to create loss-of-function mutants. In past decades, a large number of mutant libraries in model plant species, such as Arabidopsis and rice, have been constructed through physical, chemical, or biological (T-DNA and transposon insertion) mutagenesis [1–3]. However, tedious large-scale screening is required to identify the genes responsible for desired mutant phenotypes [4]. Additionally, random mutagenesis could not broadly cover the whole plant genome. Recently, the powerful CRISPR/Cas9 technology, which enables targeted genome modifications, has already revolutionized plant genome research [5]. Although the CRISPR/Cas9 system is simple, efficient, and highly specific, there are still some limitations related to its application in plant research. First, CRISPR/Cas9-mediated gene disruption is less efficient in targeting heterochromatic regions [6], limiting the range of targetable genes. Second, permanent deletion of essential genes by CRISPR/Cas9 can result in lethality [7,8]. Third, transcripts of many plant genes undergo alternative splicing (AS) in the same or different cell types, producing multiple proteins with different structural domains [9]. However, the CRISPR/Cas9 system is unable to specifically inactivate a certain AS isoform in a cell type-specific manner.

MicroRNAs (miRNAs), a class of endogenous small noncoding RNAs with the size of 21–24 nucleotides, can mediate post-transcriptional and translational gene regulation. miRNAs play important roles in diverse aspects of plant development and plant responses to biotic and abiotic stresses [10,11]. The biogenesis of miRNA is a multistep process that begins with the transcription of a miRNA gene into a primary transcript (pri-miRNA) [12]. Pri-miRNA is sequentially processed into a stem-loop structured precursor (pre-miRNA) by DICER-LIKE1 (DCL1), and pre-miRNA is then processed into miRNA/miRNA\* duplex and stabilized by methyltransferase HUA ENHANCER1 (HEN1) [13]. The methylated miRNA duplex is eventually loaded into the ARGONAUTE (AGO) protein to form the so-called RNA-induced silencing complexes (RISCs), followed by the release and degradation of miRNA\* [14]. By targeting complementary sequences, RISCs negatively regulate gene expression through mRNA degradation and/or translation inhibition [14,15].

Artificial microRNA (amiRNA) technology has already been successfully developed to silence target gene expression by producing artificially designed miRNAs using the naturally existing miRNA precursor as a backbone [16,17]. Compared to genome editing tools, the amiRNA technology offers more flexibility and reversibility in generating loss-of-function mutants without altering DNA sequences. Since the expression of amiRNAs can be tightly controlled by chemical-inducible or cell/tissue-specific promoters [17], amiRNAs are widely utilized for investigating gene functions associated with mutant lethality [18,19]. Moreover, amiRNA has a high silencing specificity and only recognizes target sequences with less than 5 mismatches [17,20], making it an ideal tool to silence individual AS isoforms or multiple genes sharing short conserved sequences [17,21].

In general, amiRNA-expressing plasmids are constructed according to the method described by Schwab et al. [17], as follows: The miRNA and miRNA\* of pre-miR319a are replaced by amiRNA/amiRNA\* sequences through site-directed mutagenesis using overlapping polymerase chain reactions (PCRs). However, this method is time-consuming and cost ineffective because it involves four PCRs using three pairs of primers. Here, we report a simplified method for amiRNA cloning. We modified the most commonly used miRNA precursor backbones, pre-miR319a for Arabidopsis or related dicot species, and pre-miR528 for rice or related monocot species, by introducing restriction sites using PCR. With the modified amiRNA backbones, only one PCR is needed to amplify the stem-loop fragment containing a newly designed amiRNA/amiRNA\* duplex with restriction enzyme sites, which can then be easily inserted into the engineered pre-miR319a or pre-miR528 in the expression vectors. We also provided evidence that amiRNAs produced in this way can be equally effective in protoplasts or transgenic plants as those produced using the traditional approach.

#### **2. Results**

#### *2.1. Strategy for Simplified amiRNA Construction Using a Modified Arabidopsis miRNA319a Backbone*

The previous overlapping PCR strategy to assemble a new amiRNA precursor involves four PCRs in two rounds (Figure 1A) [17]. To simplify the procedure and accelerate the amiRNA construction process, we tried to engineer the pre-miR319a backbone by introducing minor changes in its DNA sequences to create restriction sites for amiRNA sequence insertion (Figure 1A–C). For many plant miRNA precursors, the lower stem located ~15 nt below miRNA/miRNA\* is critical for miRNA processing. A single change in the lower stem of pre-miRNA can completely abolish miRNA processing [22–24]. The ssRNA (single strand RNA) region, an unpaired region downstream of the lower stem seems to be less important for miRNA production [23,25]. Thus, we modified the pre-miR319a backbone by

mutating GAATTG and TCTTGA sequences within the ssRNA region to *Eco*RI (GAATTC) and *Xba*I (TCTAGA) restriction sites, respectively (Figure 1B,C). After the modifications, a single PCR product of the stem-loop fragment containing the amiRNA/amiRNA\* sequences can be inserted into the amiRNA backbone (Table S1) using *Eco*RI/*Xba*I, which greatly simplifies the construction procedure and enables possible high-throughput amiRNA construction. lower stem seems to be less important for miRNA production [23,25]. Thus, we modified the premiR319a backbone by mutating GAATTG and TCTTGA sequences within the ssRNA region to *Eco*RI (GAATTC) and *Xba*I (TCTAGA) restriction sites, respectively (Figure 1B,C). After the modifications, a single PCR product of the stem-loop fragment containing the amiRNA/amiRNA\* sequences can be inserted into the amiRNA backbone (Table S1) using *Eco*RI/*Xba*I, which greatly simplifies the construction procedure and enables possible high-throughput amiRNA construction.

*Int. J. Mol. Sci.* **2019**, *20*, x FOR PEER REVIEW 3 of 14

processing [22–24]. The ssRNA (single strand RNA) region, an unpaired region downstream of the

**Figure 1.** Engineered Arabidopsis miR319a precursor allows a one-step construction of a new amiRNA

precursor. (**A**) Diagram of amiRNA construction using the engineered miR319a precursor (pre-miR319a). The upper diagram describes an overlapping PCR strategy for generating a new amiRNA precursor by 4 PCRs in two rounds. In the first round, 3 independent PCRs are performed using the indicated primers. A mixture of 3 PCR products is used as a template to conduct the second round of PCR (4th PCR) using the indicated primers. The lower diagram describes a restriction enzyme-based strategy to assemble a new amiRNA precursor. The *Eco*RI/*Xba*I sites are created at the base of miR319a stem-loop in the engineered pre-miR319a. A single PCR product amplified using a pair of mega-primers containing customized amiRNA/amiRNA\* sequences, and *Eco*RI/*Xba*I sites are digested by *Eco*RI/*Xba*I and inserted into the same digested engineered pre-miR319a. In the resulting amiRNA precursor, the amiRNA and amiRNA\* are colored in magenta and blue, respectively. (**B**) The engineered pre-miR319a contains a G-to-C mutation and a T-to-A mutation that create *Eco*RI and *Xba*I sites (underlined), respectively. The nucleotides in magenta and blue correspond to amiRNA and amiRNA\* , respectively. (**C**) Diagram of the original and engineered pre-miR319a. Mutated nucleotides are highlighted in red. (pre-miR319a). The upper diagram describes an overlapping PCR strategy for generating a new amiRNA precursor by 4 PCRs in two rounds. In the first round, 3 independent PCRs are performed using the indicated primers. A mixture of 3 PCR products is used as a template to conduct the second round of PCR (4th PCR) using the indicated primers. The lower diagram describes a restriction enzyme-based strategy to assemble a new amiRNA precursor. The *Eco*RI/*Xba*I sites are created at the base of miR319a stem-loop in the engineered pre-miR319a. A single PCR product amplified using a pair of mega-primers containing customized amiRNA/amiRNA\* sequences, and *Eco*RI/*Xba*I sites are digested by *Eco*RI/*Xba*I and inserted into the same digested engineered pre-miR319a. In the resulting amiRNA precursor, the amiRNA and amiRNA\* are colored in magenta and blue, respectively. (**B**) The engineered pre-miR319a contains a G-to-C mutation and a T-to-A mutation that create *Eco*RI and *Xba*I sites (underlined), respectively. The nucleotides in magenta and blue correspond to amiRNA and amiRNA\* , respectively. (**C**) Diagram of the original and engineered pre-miR319a. Mutated nucleotides are highlighted in red.

*Int. J. Mol. Sci.* **2019**, *20*, x FOR PEER REVIEW 4 of 14

amiRNA precursor. (**A**) Diagram of amiRNA construction using the engineered miR319a precursor

#### *2.2. Engineered Pre-miR319a Generated Functional miR319a as Demonstrated by the Silencing Phenotype 2.2. Engineered Pre-miR319a Generated Functional miR319a as Demonstrated by the Silencing Phenotype*

To test whether the engineered pre-miR319a remains functional, we generated Arabidopsis transgenic plants overexpressing the original and engineered pre-miR319a, respectively. It has been previously reported that miR319a controls Arabidopsis leaf development and morphogenesis through targeting and down-regulating the expression of several TCP (Teosinte branched1/Cycloidea/Proliferating cell factor) family members [26–29]. Arabidopsis gain-of-function mutant *jaw-D* with overexpression of miR319a exhibits a jagged and wavy leaf phenotype [26]. As expected, the transgenic plants overexpressing both the original and engineered pre-miR319a showed a curly and serrated leaf phenotype (Figure 2A). The relative abundances of mature miR319a in these transgenic plants were further determined using the stem-loop RT-qPCR technique that is specialized for accurate quantification of mature miRNA [30]. We detected comparable production of mature miR319a from engineered pre-miR319a as original pre-miR319a (Figure 2B). These results suggest that mature miR319a can be generated from the modified pre-miRNA319, as well as from the native pre-miR319a. To test whether the engineered pre-miR319a remains functional, we generated Arabidopsis transgenic plants overexpressing the original and engineered pre-miR319a, respectively. It has been previously reported that miR319a controls Arabidopsis leaf development and morphogenesis through targeting and down-regulating the expression of several TCP (Teosinte branched1/Cycloidea/Proliferating cell factor) family members [26–29]. Arabidopsis gain-of-function mutant *jaw-D* with overexpression of miR319a exhibits a jagged and wavy leaf phenotype [26]. As expected, the transgenic plants overexpressing both the original and engineered pre-miR319a showed a curly and serrated leaf phenotype (Figure 2A). The relative abundances of mature miR319a in these transgenic plants were further determined using the stem-loop RT-qPCR technique that is specialized for accurate quantification of mature miRNA [30]. We detected comparable production of mature miR319a from engineered pre-miR319a as original pre-miR319a (Figure 2B). These results suggest that mature miR319a can be generated from the modified pre-miRNA319, as well as from the native pre-miR319a.

**Figure 2.** Engineered pre-miR319a retains its function in transgenic plants. **(A)** Phenotypic comparison of transgenic Arabidopsis plants expressing the original or engineered miR319a precursor. Four-week-old plants are shown. Scale bar, 0.5 cm. **(B)** Stem-loop RT-qPCR validates comparable production of mature miR319a from the original or engineered pre-miR319a in transgenic plants. The quantitative PCR data are presented as means ± SD of at least three independent repeats with endogenous *snoR101* expression level set as 1. \*\* *p* < 0.01 (student's *t* test). **Figure 2.** Engineered pre-miR319a retains its function in transgenic plants. (**A**) Phenotypic comparison of transgenic Arabidopsis plants expressing the original or engineered miR319a precursor. Four-week-old plants are shown. Scale bar, 0.5 cm. (**B**) Stem-loop RT-qPCR validates comparableproduction of mature miR319a from the original or engineered pre-miR319a in transgenic plants. The quantitative PCR data are presented as means <sup>±</sup> SD of at least three independent repeats withendogenous *snoR101* expression level set as 1. \*\* *<sup>p</sup>* <sup>&</sup>lt; 0.01 (student's *<sup>t</sup>* test).

*2.3. amiRNAs Produced from Engineered Pre-miR319a Have Comparable Efficiencies in Gene Silencing* 

*Transgenic Plants* 

## *2.3. amiRNAs Produced from Engineered Pre-miR319a Have Comparable E*ffi*ciencies in Gene Silencing*

*Int. J. Mol. Sci.* **2019**, *20*, x FOR PEER REVIEW 5 of 14

To provide more evidence that the modifications of pre-miR319a would not affect amiRNA processing and maturation, we compared side-by-side the silencing efficiencies of amiRNAs produced from the original or engineered pre-miR319a using an ETPamir assay [31]. In ETPamir assay, a target gene encoding epitope-tagged target protein is co-expressed with individual amiRNAs in protoplasts, and the silencing efficiency of each amiRNA is inversely reflected by the accumulation of target proteins, which can be monitored by immunoblotting using anti-tag antibodies [31,32]. By targeting Arabidopsis *PHYTOENE DESATURASE 3* (*PDS3*), *MAP*/*ERK KINASE KINASE 1 (MEKK1*) or *MAP KINASE KINASE KINASE 3* (*MAPKKK3*), we found that the amiRNAs produced from the engineered pre-miR319a appeared to be as efficient as or even slightly more effective than those from the original pre-miRNA319a (Figure 3A). We also measured the abundances of mature amiRNAs produced in ETPamir assays by stem-loop RT-qPCR and found that the engineered pre-miR319a could produce comparable or even higher amounts of mature amiRNAs than the original pre-miR319a (Figure 3B). These results imply that the engineered pre-miR319a is fully functional in generating mature amiRNAs. To provide more evidence that the modifications of pre-miR319a would not affect amiRNA processing and maturation, we compared side-by-side the silencing efficiencies of amiRNAs produced from the original or engineered pre-miR319a using an ETPamir assay [31]. In ETPamir assay, a target gene encoding epitope-tagged target protein is co-expressed with individual amiRNAs in protoplasts, and the silencing efficiency of each amiRNA is inversely reflected by the accumulation of target proteins, which can be monitored by immunoblotting using anti-tag antibodies [31,32]. By targeting Arabidopsis *PHYTOENE DESATURASE 3* (*PDS3*), *MAP/ERK KINASE KINASE 1 (MEKK1*) or *MAP KINASE KINASE KINASE 3* (*MAPKKK3*), we found that the amiRNAs produced from the engineered pre-miR319a appeared to be as efficient as or even slightly more effective than those from the original pre-miRNA319a (Figure 3A). We also measured the abundances of mature amiRNAs produced in ETPamir assays by stem-loop RT-qPCR and found that the engineered pre-miR319a could produce comparable or even higher amounts of mature amiRNAs than the original premiR319a (Figure 3B). These results imply that the engineered pre-miR319a is fully functional in generating mature amiRNAs.

**Figure 3.** amiRNAs produced from engineered pre-mir319a exhibit equal efficacy in silencing target gene expression. (**A**) Comparison of the performance of amiRNAs produced from the original or engineered pre-miR319a using the ETPamir assay. amiRNAs expressed from engineered pre-miR319a are slightly more effective in silencing Arabidopsis *PDS3*, *MEKK1*, and *MAPKKK3* expression than those expressed from original pre-miR319a in protoplasts. Three independent repeats with GFP-HA as an untargeted internal control produced similar results. (**B**) Detection of mature amiRNAs produced from the original or engineered pre-miR319a in the ETPamir assay. Mature amiRNAs were detected using stem-loop RT-qPCR. The quantitative PCR data represent means ± SD of at least three independent repeats using *mCherry-HA* as a transfection control. \* *p* < 0.05, \*\* *p* < 0.01 (student's *t* test). **Figure 3.** amiRNAs produced from engineered pre-mir319a exhibit equal efficacy in silencing target gene expression. (**A**) Comparison of the performance of amiRNAs produced from the original or engineered pre-miR319a using the ETPamir assay. amiRNAs expressed from engineered pre-miR319a are slightly more effective in silencing Arabidopsis *PDS3*, *MEKK1*, and *MAPKKK3* expression than those expressed from original pre-miR319a in protoplasts. Three independent repeats with GFP-HA as an untargeted internal control produced similar results. (**B**) Detection of mature amiRNAs produced from the original or engineered pre-miR319a in the ETPamir assay. Mature amiRNAs were detected using stem-loop RT-qPCR. The quantitative PCR data represent means ± SD of at least three independent repeats using *mCherry-HA* as a transfection control. \* *p* < 0.05, \*\* *p* < 0.01 (student's *t* test).

*2.4. amiRNAs Produced from Engineered Pre-miR319a Could Effectively Silence Target Gene Expression in* 

Next, we evaluated the efficiencies of amiRNAs produced from engineered pre-miR319a in planta. *CNGC4* (CYCLIC NUCLEOTIDE-GATED CATION CHANNEL 4) was selected as the target

procedure.

#### *2.4. amiRNAs Produced from Engineered Pre-miR319a Could E*ff*ectively Silence Target Gene Expression in Transgenic Plants*

Next, we evaluated the efficiencies of amiRNAs produced from engineered pre-miR319a in planta. *CNGC4* (CYCLIC NUCLEOTIDE-GATED CATION CHANNEL 4) was selected as the target gene as the null phenotype of *CNGC4* has been reported [33,34] and is easy to observe. Three amiRNAs targeting *CNGC4* were constructed using the engineered pre-miR319a as backbone and their activities were assessed first by the ETPamir assay. The results showed that three amiR-*CNGC4s* could all suppress *CNGC4* expression, but they displayed different silencing efficiencies. amiR*-CNGC4-*1 could almost completely silence *CNGC4* expression (Figure 4A), whereas amiR*-CNGC4-*2 and amiR*-CNGC4-*3 were less effective. So, we chose amiR*-CNGC4-*1 to silence endogenous *CNGC4* in our transgenic plants. The engineered or original pre-amiR-*CNGC4-*1 construct was subsequently introduced into Arabidopsis Col-0 plants. Transgenic plants overexpressing engineered or original pre-amiR-*CNGC4-*1 both exhibited smaller leaves and shorter petioles relative to the wild-type plants, resembling the dwarf phenotype of *cngc4* T-DNA null mutant (Figure 4B). These results validate that the engineered pre-miR319a can be utilized to produce effective amiRNAs for target gene silencing. *Int. J. Mol. Sci.* **2019**, *20*, x FOR PEER REVIEW 6 of 14 gene as the null phenotype of *CNGC4* has been reported [33,34] and is easy to observe. Three amiRNAs targeting *CNGC4* were constructed using the engineered pre-miR319a as backbone and their activities were assessed first by the ETPamir assay. The results showed that three amiR-*CNGC4s*  could all suppress *CNGC4* expression, but they displayed different silencing efficiencies. amiR*-CNGC4-*1 could almost completely silence *CNGC4* expression (Figure 4A), whereas amiR*-CNGC4-*2 and amiR*-CNGC4-*3 were less effective. So, we chose amiR*-CNGC4-*1 to silence endogenous *CNGC4* in our transgenic plants. The engineered or original pre-amiR-*CNGC4-*1 construct was subsequently introduced into Arabidopsis Col-0 plants. Transgenic plants overexpressing engineered or original pre-amiR-*CNGC4-*1 both exhibited smaller leaves and shorter petioles relative to the wild-type plants, resembling the dwarf phenotype of *cngc4* T-DNA null mutant (Figure 4B). These results validate that the engineered pre-miR319a can be utilized to produce effective amiRNAs for target gene silencing.

**Figure 4.** amiRNAs produced from the engineered pre-miR319a are fully functional. **(A)** Comparison of the silencing efficiency of three *CNGC4*-targeting amiRNAs expressed from the engineered pre-miR319a using the ETPamir assay. Note that amiR-*CNGC4*-1 (red) is the most potent amiRNA for silencing *CNGC4*. Three independent repeats with GFP-HA as an untargeted internal control produced similar results. **(B)** Comparison of the performance of amiR-*CNGC4*-1 produced from the original or engineered amiRNA precursor in transgenic plants. *cngc4* is a T-DNA insertion null mutant of *CNGC4*. Scale bar, 0.5 cm. **Figure 4.** amiRNAs produced from the engineered pre-miR319a are fully functional. (**A**) Comparison of the silencing efficiency of three *CNGC4*-targeting amiRNAs expressed from the engineered pre-miR319a using the ETPamir assay. Note that amiR-*CNGC4*-1 (red) is the most potent amiRNA for silencing *CNGC4*. Three independent repeats with GFP-HA as an untargeted internal control produced similar results. (**B**) Comparison of the performance of amiR-*CNGC4*-1 produced from the original or engineered amiRNA precursor in transgenic plants. *cngc4* is a T-DNA insertion null mutant of *CNGC4*. Scale bar, 0.5 cm.

#### *2.5. Strategy for Simplified amiRNA Construction Using a Modified Rice miRNA528 Backbone 2.5. Strategy for Simplified amiRNA Construction Using a Modified Rice miRNA528 Backbone*

Rice miR528 precursor (pre-miR528) is frequently used for generating amiRNAs and gene silencing in many monocot species [35–37]. To test whether the same strategy can be applied for amiRNA production using pre-miR528, we engineered pre-miR528 by mutating AGGTCT and GAAGTT sequences in the ssRNA region to *StuI* (AGGCCT) and *Eco*RI (GAATTC) restriction sites, respectively (Figure 5A–C). Therefore, PCR products of the stem-loop fragment containing the amiRNA/amiRNA\* duplex can be generated using a pair of mega primers (Table S2) and be readily inserted into the engineered pre-miR528 after *Stu*I/*Eco*RI digestion and ligation. We next evaluated the silencing efficiencies of amiRNAs produced from the engineered pre-Rice miR528 precursor (pre-miR528) is frequently used for generating amiRNAs and gene silencing in many monocot species [35–37]. To test whether the same strategy can be applied for amiRNA production using pre-miR528, we engineered pre-miR528 by mutating AGGTCT and GAAGTT sequences in the ssRNA region to *StuI* (AGGCCT) and *Eco*RI (GAATTC) restriction sites, respectively (Figure 5A–C). Therefore, PCR products of the stem-loop fragment containing the amiRNA/amiRNA\* duplex can be generated using a pair of mega primers (Table S2) and be readily inserted into the engineered pre-miR528 after *Stu*I/*Eco*RI digestion and ligation.

miR528 using the ETPamir assay. The amiRNAs produced from both original and engineered premiR528 could trigger efficient silencing of the foreign gene *GFP* and the rice endogenous gene *OsCEBiP* (*CHITIN ELICITOR BINDING PROTEIN*) in rice cells (Figure 6). There is no detectable difference in silencing efficiencies between amiRNAs produced from the original or engineered premiR528 (Figure 6). These data indicate that the same strategy could be applied to pre-miR528

*Int. J. Mol. Sci.* **2019**, *20*, x FOR PEER REVIEW 7 of 14

**Figure 5.** The engineered rice miR528 precursor allows a one-step construction of a new amiRNA precursor. **(A)** Diagram of amiRNA construction using the engineered miR528 precursor (premiR528). In the engineered pre-miR528, *Stu*I /*Eco*RI sites are created at the base of miR528 stem-loop. The amiRNA and amiRNA\* are colored in magenta and blue, respectively. **(B)** The engineered premiR528 contains mutations that can create *Stu*I and *Eco*RI sites (underlined), respectively. The nucleotides in magenta and blue correspond to amiRNA and amiRNA\* , respectively. **(C)** Diagram of the original and engineered pre-miR528. Mutated nucleotides are highlighted in red. **Figure 5.** The engineered rice miR528 precursor allows a one-step construction of a new amiRNA precursor. (**A**) Diagram of amiRNA construction using the engineered miR528 precursor (pre-miR528). In the engineered pre-miR528, *Stu*I /*Eco*RI sites are created at the base of miR528 stem-loop. The amiRNA and amiRNA\* are colored in magenta and blue, respectively. (**B**) The engineered pre-miR528 contains mutations that can create *Stu*I and *Eco*RI sites (underlined), respectively. The nucleotides in magenta and blue correspond to amiRNA and amiRNA\* , respectively. (**C**) Diagram of the original and engineered pre-miR528. Mutated nucleotides are highlighted in red.

We next evaluated the silencing efficiencies of amiRNAs produced from the engineered pre-miR528 using the ETPamir assay. The amiRNAs produced from both original and engineered pre-miR528 could trigger efficient silencing of the foreign gene *GFP* and the rice endogenous gene *OsCEBiP* (*CHITIN ELICITOR BINDING PROTEIN*) in rice cells (Figure 6). There is no detectable difference in silencing efficiencies between amiRNAs produced from the original or engineered pre-miR528 (Figure 6). These data indicate that the same strategy could be applied to pre-miR528 engineering, which leads to production of functional amiRNAs in rice, but with a simple cloning procedure. *Int. J. Mol. Sci.* **2019**, *20*, x FOR PEER REVIEW 8 of 14

**Figure 6.** amiRNAs produced from the engineered pre-miR528 are functional in rice cells. Comparison of the performance of amiRNAs produced from the original or engineered pre-miR528 was conducted by the ETPamir assay. Three independent repeats with GFP-FLAG or BAK1-HA as an untargeted internal control produced similar results. **Figure 6.** amiRNAs produced from the engineered pre-miR528 are functional in rice cells. Comparison of the performance of amiRNAs produced from the original or engineered pre-miR528 was conducted by the ETPamir assay. Three independent repeats with GFP-FLAG or BAK1-HA as an untargeted internal control produced similar results.

#### **3. Discussion 3. Discussion**

The amiRNA technology is not only a powerful genetic tool for generating loss-of-function mutants in basic plant research, but is also an effective strategy to engineer crops for beneficial agronomic traits [38,39] and enhanced disease resistance against pathogens [40–43] or pests [44,45]. As amiRNA is produced from an endogenous plant miRNA precursor, and the promoter and terminator of an amiRNA expression cassette can be derived from plants, this technology may raise minimal concerns about introducing foreign genetic elements into engineered crops. The amiRNA technology is not only a powerful genetic tool for generating loss-of-function mutants in basic plant research, but is also an effective strategy to engineer crops for beneficial agronomic traits [38,39] and enhanced disease resistance against pathogens [40–43] or pests [44,45]. As amiRNA is produced from an endogenous plant miRNA precursor, and the promoter and terminator of an amiRNA expression cassette can be derived from plants, this technology may raise minimal concerns about introducing foreign genetic elements into engineered crops.

Selecting a suitable miRNA precursor backbone to express amiRNAs is vital for successfully silencing target gene expression. Many plant miRNA precursors such as Arabidopsis miR319a, miR172a, miR395, miR390 and rice miR390, and miR528 have been used as backbones for expressing amiRNAs to confer specific gene silencing [17,35,46–48]. The Arabidopsis miR319a and rice miR528 precursors are the most commonly used amiRNA expression backbones that have been widely used to generate loss-of-function mutants in dicot and monocot plant species, since they are highly conserved across the plant kingdom [17,49,50]. However, the traditional overlapping PCR approach to construct amiRNA plasmid is tedious, time-consuming, and inefficient, especially for highthroughput application [17]. In this study, we provide a simplified method by mutating the sequence of pre-miR319a and pre-miR528 to create restriction sites for subsequent insertion of PCR products (Figures 1 and 5). Therefore, the customized amiRNA/amiRNA\* sequences can be easily inserted into the backbone of pre-miR319a and pre-miR528 by one single PCR, followed by restriction digestion and ligation. Our new strategy could dramatically improve the efficiency of amiRNA construction. Selecting a suitable miRNA precursor backbone to express amiRNAs is vital for successfully silencing target gene expression. Many plant miRNA precursors such as Arabidopsis miR319a, miR172a, miR395, miR390 and rice miR390, and miR528 have been used as backbones for expressing amiRNAs to confer specific gene silencing [17,35,46–48]. The Arabidopsis miR319a and rice miR528 precursors are the most commonly used amiRNA expression backbones that have been widely used to generate loss-of-function mutants in dicot and monocot plant species, since they are highly conserved across the plant kingdom [17,49,50]. However, the traditional overlapping PCR approach to construct amiRNA plasmid is tedious, time-consuming, and inefficient, especially for high-throughput application [17]. In this study, we provide a simplified method by mutating the sequence of pre-miR319a and pre-miR528 to create restriction sites for subsequent insertion of PCR products (Figures 1 and 5). Therefore, the customized amiRNA/amiRNA\* sequences can be easily inserted into the backbone of pre-miR319a and pre-miR528 by one single PCR, followed by restriction digestion and ligation. Our new strategy could dramatically improve the efficiency of amiRNA construction.

Many plant miRNAs such as Arabidopsis miR172a and miR169a are processed in a canonical "base to loop" manner [22–24]. For these miRNA precursors, the secondary structure of the lower stem is essential for miRNA processing. Disruption of the closing bulge structure in the lower stem by point mutation affects miRNA accumulation [23,24]. Meanwhile, other miRNAs such as Arabidopsis miR319a and miR159a have been reported to be processed in a "loop to base" direction [25]. Although complete removal of the lower stem sequences seemed to have little impact on the miR319 production, overexpressing the pre-miR319a lacking the lower stem caused a less severe leaf crinkled phenotype compared with overexpressing the full-length pre-miR319a [25]. We speculated that deletion of the lower stem bases may impair the accuracy of miR319a processing. Thus, fulllength precursors are maintained as backbones for constructing engineered amiRNA vectors. In our case, the mutated sequences of the engineered pre-miR319a and pre-miR528 are located in the unpaired ssRNA region (Figures 1 and 5), which may be less important for miRNA processing [25]. Indeed, the modifications on the pre-miR319a and pre-miR528 have little influence on their processing and functionality (Figures 2–4 and 6). It has been reported that miRNAs repress target gene expression through two modes of action, mRNA cleavage and translation inhibition [51]. Many plant miRNAs such as Arabidopsis miR172a and miR169a are processed in a canonical "base to loop" manner [22–24]. For these miRNA precursors, the secondary structure of the lower stem is essential for miRNA processing. Disruption of the closing bulge structure in the lower stem by point mutation affects miRNA accumulation [23,24]. Meanwhile, other miRNAs such as Arabidopsis miR319a and miR159a have been reported to be processed in a "loop to base" direction [25]. Although complete removal of the lower stem sequences seemed to have little impact on the miR319 production, overexpressing the pre-miR319a lacking the lower stem caused a less severe leaf crinkled phenotype compared with overexpressing the full-length pre-miR319a [25]. We speculated that deletion of the lower stem bases may impair the accuracy of miR319a processing. Thus, full-length precursors are maintained as backbones for constructing engineered amiRNA vectors. In our case, the mutated sequences of the engineered pre-miR319a and pre-miR528 are located in the unpaired ssRNA region (Figures 1 and 5), which may be less important for miRNA processing [25]. Indeed, the modifications on the pre-miR319a and pre-miR528 have little influence on their processing and functionality (Figures 2–4 and 6). It has been reported that miRNAs repress target gene expression through two modes of action, mRNA cleavage and translation inhibition [51]. However, the evaluation

However, the evaluation of efficacy of amiRNA in many studies is largely based on the measurement of target mRNA [45–48], without checking the abundance of target proteins. Using the ETPamir of efficacy of amiRNA in many studies is largely based on the measurement of target mRNA [45–48], without checking the abundance of target proteins. Using the ETPamir assay, we proved that amiRNAs produced by engineered pre-miR319a and pre-miR528 could effectively block the target protein accumulation (Figures 3 and 6). The high efficacy of amiRNA expressed from the engineered pre-miR319a was further confirmed by the dwarf phenotype of amiR-*CNGC4*-1 overexpression lines (Figure 4). Taken together, we reason that the engineered amiRNA backbones should be fully functional as their original precursors.

In comparison to other amiRNA construction methods, our approach offers two advantages, as follows: First, full-length precursors are used as backbones and no changes are made in the lower stem, allowing amiRNAs to be processed accurately. Although some simplified amiRNA construction methods have been previously reported, those studies used the precursors either without the lower stem [47,52,53] or with a mutated lower stem [46,54,55] to express amiRNAs. Although the precursors in those methods could successfully produce amiRNAs and suppress target gene expression, at least in some cases there is no convincing evidence to prove that these truncated or mutated precursors are functioning equally like the full-length natural precursors. Second, we used the conventional restriction digestion-ligation strategy to construct amiRNA vectors, which balances the cloning efficiency and cost. Compared with the Gateway cloning system [56] and TA-based cloning system [52,54], whose cloning efficiency largely relies on commercial cloning kits or relatively expensive enzymes, our method is apparently more cost-saving.

In conclusion, we explored a simple and efficient method to construct amiRNA expression cassettes by creating restriction sites within the basal region of Arabidopsis and rice amiRNA precursors. We demonstrated that these modified amiRNA precursors are fully functional in plant protoplasts and transgenic plants. Hopefully, this new amiRNA cloning strategy will be useful for genome research in dicot and monocot plant species.

#### **4. Materials and Methods**

#### *4.1. Plant Growth*

Wild-type Col-0 or transgenic *Arabidopsis thaliana* plants were grown in a plant growth room on moistened Jiffy soil (Jiffy Substrates ®, Jiffy Group, Pärnumaa, Estonia), which are high-quality sphagnum peat-based growing substrates with a high organic content and water capacity to encourage rapid rooting and uniform growth. The Arabidopsis growth conditions are fixed at 65% humidity and 75 <sup>µ</sup>mol·m−<sup>2</sup> ·s −1 light intensity under photoperiods of 12 h light at 23 ◦C and 12 h dark at 20 ◦C. Zhonghua 11 rice (*Oryza sativa*) plants were grown on Jiffy soil in a plant growth chamber under photoperiods of 12 h light (200 <sup>µ</sup>mol·m−<sup>2</sup> ·s −1 ) at 30 ◦C and 12 h dark at 27 ◦C, with a constant humidity of 70%.

#### *4.2. Plasmid Construction*

Routine molecular cloning procedures were followed for plasmid construction. The original sequences of Arabidopsis miR319a or rice miR528 backbones were mutagenized by PCR-based mutagenesis to generate engineered miRNA precursors (pre-amiRNA). The amiRNA vectors *HBT-amiR-MEKK1*, *HBT-amiR-PDS3*, and *HBT-amiR-MAPKKK3* were constructed using pre-miR319a as the backbone, while *HBT-amiR-GFP* and *HBT-amiR-OsCEBiP* were constructed using pre-miR528 as the backbone. The original amiRNA vectors were cloned by traditional overlapping PCR, described by Schwab et al. [17]. The engineered amiRNA vectors were constructed as follows. Briefly, mega-primers containing customized amiRNA/amiRNA\* sequences were used for PCR amplification of primary miRNA fragment containing a new amiRNA/amiRNA\* duplex using the original pre-miR319a or pre-miR528 as PCR template. PCR amplicons were digested by *Eco*RI/*Xba*I or *StuI*/*Eco*RI and inserted into the same digested HBT vector harboring the engineered pre-miR319a or pre-miR528. For plant

transformation, the pre-amiRNA was digested by *Bam*HI/*Pst*I and inserted into the same digested pCB302 binary vector.

To express a target gene encoding double HA- or FLAG-tagged target protein in protoplasts, the full-length coding sequences of target genes were amplified by RT-PCR, digested by *Bam*HI/*Stu*I and inserted into the same digested *HBT-2HA* or *HBT-2FLAG* vector, where the target gene expression is driven by the *35S* promoter.

#### *4.3. Protoplast Isolation*

Four-week-old Arabidopsis or 10-day-old *Oryza sativa* (Zhonghua 11 rice) seedlings were used for protoplast isolation according to the procedure described previously [32,57]. Briefly, leaves of Arabidopsis or sheaths of rice were cut into 0.5-mm strips with a sterile razor blade. The strips were digested in 10 mL enzyme solution (1.5% cellulase R10, 0.2% macerozyme R10, 0.4 M mannitol, 20 mM KCl, 20 mM MES, pH 5.7, 10 mM CaCl2, and 0.1% BSA) at room temperature for 3 h under a dark condition. After mixing with 10 mL W5 solution (154 mM NaCl, 125 mM CaCl2, 5 mM KCl, and 2 mM MES, pH 5.7), the digestion mixture was filtered through a 75 µm FALCON cell strainer. Protoplasts were collected by centrifugation in a CL2 clinical centrifuge (Thermo Scientific, Weaverville, North Carolina, USA) for 2 min at 100× *g* for Arabidopsis or 5 min at 200× *g* for rice. Cells were resuspended with 10 mL W5 solution and rested on ice for 30 min. Before transfection, protoplasts were pelleted by centrifugation for 1 min at 100× *g* for Arabidopsis or 3 min at 200× *g* for rice, and were then resuspended with MMg solution (0.4 M mannitol, 15 mM MgCl2, and 4 mM MES, pH 5.7) to a final concentration of 2 <sup>×</sup> <sup>10</sup><sup>5</sup> cells per ml.

#### *4.4. Protoplast Transfection and ETPamir Assay*

DNA transfection was performed in a 2-mL round-bottom microcentrifuge tube, where 200 µL protoplasts were mixed with 21 µL (2 µg/µL) DNA cocktail and 220 µL PEG solution (40% PEG4000, *v*/*v*, 0.2 M mannitol and 0.1 M CaCl2), gently. After incubated at room temperature for 5 min (light) for Arabidopsis protoplasts or 15 min (dark) for rice protoplasts, transfection was quenched by adding 880 µL W5 solution. Transfected protoplasts were collected by centrifugation for 2 min at 100× *g* for Arabidopsis or 5 min at 200× *g* for rice, and were resuspended with 100 µL W5 solution. The cells were then transferred into 1 mL WI solution (0.5 M mannitol, 4 mM MES, pH 5.7, and 20 mM KCl) in a 6-well plate and were incubated in the dark.

The ETPamir assay was conducted according to the method described previously [31,32]. Briefly, 200 µL protoplasts were transfected with a DNA cocktail (2 µg/µL) containing 16 µL amiRNA expression construct, 4 µL target gene-HA/FLAG expression construct, and 1 µL transfection control plasmid expressing GFP-HA/FLAG. In parallel, a negative control was set up by replacing the amiRNA expression construct with an equal amount of empty vector. After co-transfection, protoplasts were incubated for 18–36 h in dark and then were collected for western blot analysis. The amiRNA performance was inversely correlated with the target protein accumulation.

#### *4.5. Western Blot*

After centrifugation, protoplasts were directly lysed with the lysis buffer (10 mM HEPES, pH 7.5, 100 mM NaCl, 1 mM EDTA, 10% Glycerol). The lysates were mixed with 6 × SDS-PAGE loading buffer and heated at 95 ◦C for 5 min or 55 ◦C for 10 min. Total proteins were subjected to SDS-PAGE (10%) and immunoblotting with anti-HA (Roche) or anti-FLAG (Sigma-Aldrich, Saint Louis, Missouri, USA) antibodies.

#### *4.6. Generation and Screen of Transgenic Plants*

The recombinant pCB302 binary plasmids were introduced into *Agrobacterium tumefaciens GV3101* cells by electroporation, which were in turn used for floral dip-mediated Arabidopsis transformation [58]. Transgenic Arabidopsis plants were selected on 1/2 MS medium containing 12.5 mg/L glufosinate ammonium.

#### *4.7. RNA Extraction and Mature amiRNA Detection*

For mature amiRNA detection in protoplasts, a total of 400 µL Arabidopsis protoplasts co-transfected with *HBT-pre-amiRNA* (original or engineered) plasmid and *pAN-mCherry-HA* plasmid were used for RNA extraction. For mature miR319a detection in transgenic plants, 30 mg rosette leaves of Col-0 and pre-miR319a overexpression lines were used for RNA extraction. Total RNA was extracted using the RNAiso Plus reagent (TaKaRa) according to the manufacturer's instructions. The protocol described earlier [30] with minor modifications was used for mature amiRNA detection. Briefly, 1 µg total RNA was converted into the first-strand cDNA with stem-loop RT primers for amiRNA and gene specific primer of *mCherry* using a PrimeScript™ RT reagent Kit with genomic DNA Eraser (TaKaRa) according to the manufacturer's instructions. RT-qPCR was performed in a LightCycler 96 Instrument (Roche) using the SYBR® *Premix Ex Taq*TM Kit (TaKaRa). Accumulation of mature amiRNAs produced from the original or engineered precursor in Arabidopsis protoplasts or pre-miR319a overexpression transgenic plants were normalized to the transcript levels of the transfection control *mCherry-HA* or *snoR101* (Small Nucleolar RNA 101), respectively.

**Supplementary Materials:** Supplementary materials can be found at http://www.mdpi.com/1422-0067/20/22/ 5620/s1.

**Author Contributions:** D.Z. designed and constructed modified amiRNA vectors and performed preliminary tests. N.Z. and W.S. conducted the majority of experiments to evaluate the engineered amiRNA vectors in protoplasts and transgenic plants. W.S., D.Z., and J.-F.L. wrote the manuscript. J.-F.L. supervised the study.

**Funding:** This work was supported by the Foundation of Guangzhou Science and Technology Key Project (No. 201904020041).

**Acknowledgments:** We thank Jen Sheen for the support of the initiation of this research.

**Conflicts of Interest:** No conflict of interest declared.

#### **Abbreviations**


#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

#### *Article*
