*2.4. Mapping of Differential Genes and Metabolites Related to Flavonoid-Anthocyanin Biosynthesis Pathway*

The biosynthetic pathway of anthocyanins has been well characterized in plants [22]. To predict the molecular mechanisms leading to the differential skin coloration in the two turnips, we reconstructed the flavonoid-anthocyanin biosynthesis pathways (Figures 5 and 6). First, we searched the DEGs for those encoding enzymes involved in the flavonoid-anthocyanin biosynthesis pathways. We obtained four genes, namely flavonol synthase [EC:1.14.11.23] (*c43941.graph\_c0*, FLS), dihydroflavonol-4-reductase [EC:1.1.1.234] (*c39842.graph\_c0*, DFR), anthocyanidin synthase [EC:1.14.11.19] (*c45741.graph\_c0*, ANS), and UDP-flavonoid glucosyl transferase [EC:2.4.1.91] (*c48211.graph\_c0*, UFGT). All four enzymes mainly participate in the late steps of the flavonoid-anthocyanin biosynthesis pathways.

**Figure 5.** Proposed model of the molecular mechanism leading to the high anthocyanin content in the purple turnip (PT). Naringenin chalcone is isomerized by chalcone isomerase (CHI) to naringenin. Flavanone 3-hydroxylase (F3H) converts naringenin into dihydroflavonols (dihydrokaempferol, dihydroquercetin or dihydrotricetin). Then, the three dihydroflavonols are converted into colorless leucoanthocyanidins by dihydroflavonol 4-reductase (DFR) and subsequently to colored anthocyanidins by anthocyanidin synthase (ANS). Anthocyanidins are glycosylated by the enzyme flavonoid 3-*O*-glucosyltransferase (UFGT) to facilitate their accumulation in cells. Proanthocyanidins are generated from leucoanthocyanidins by the action of leucoanthocyanidin reductase (LAR). DFR, ANS and UFGT were found significantly up-regulated in PT, leading to a high content of 17 anthocyanin compounds (more than 20 times that of the green turnip). In contrast, FLS was found significantly down-regulated, which may lead to a weak accumulation of flavonols. PT tends to prioritize anthocyanin accumulation by diverting dihydroflavonols to the anthocyanin biosynthesis pathway.

The flavonoid-anthocyanin biosynthesis pathways start from the key amino acid phenylalanine, which is converted into 4-coumaroyl CoA by phenylalanine ammonia-lyase, cinnamic acid 4-hydroxylase and 4-coumarate CoA ligase [48]. The main precursors for flavonoids are 4-coumaroyl CoA and three molecules of malonyl CoA, which are condensed into chalcone by chalcone synthase (Dixon and Steele, 1999). The pathway then proceeds through several enzymes to yield flavanones (via chalcone isomerase) and dihydroflavonols (via flavanone 3-hydroxylase) [49]. Dihydroflavonols are the keystone substrates for the biosynthesis of both flavonols (via FLS) and anthocyanins (via DFR). In this study, we observed a constant down-regulation of one FLS and a significant up-regulation of one DFR in PT across all five developmental stages (Figures 5 and 6), indicating that PT tends to prioritize anthocyanin biosynthesis over flavonol biosynthesis. Next, the leucoanthocyanidins generated by DFR are converted into anthocyanidins (via ANS) [48]. As with DFR, we noticed a strong up-regulation of one ANS throughout PT growth, pointing to a high accumulation of anthocyanidins (Figures 5 and 6). Finally, anthocyanidins are converted into anthocyanins via UFGT [49]. We identified one UFGT that was significantly and constantly up-regulated in PT, pointing to a mechanism that favors a strong accumulation of anthocyanins (Figures 5 and 6). Based on the metabolite detection and quantification, we confirm that the accumulated anthocyanins conferring the purple pigmentation in PT are mainly peonidin *O*-hexoside, cyanidin 3-*O*-glucoside (kuromanin), pelargonidin, malvidin 3,5-diglucoside (malvin), pelargonin, pelargonidin 3-*O*-beta-D-glucoside (callistephin chloride) and cyanidin (Figures 5 and 6). In contrast, delphinidin and petunidin 3-*O*-glucoside are enriched in the skin of GT and may confer the greenish coloration (Figures 5 and 6).
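The branch point described above can be made concrete with a small data structure. The following is only an illustrative sketch: the step list follows the pathway as narrated in the text and figure captions, the enzyme abbreviations `PAL/C4H/4CL` and `CHS` are shorthand introduced here, and the function names `route` and `branch_summary` are hypothetical helpers, not part of any analysis pipeline used in this study.

```python
# Each step is (substrate, enzyme, product), following the pathway as described in the text.
PATHWAY = [
    ("phenylalanine", "PAL/C4H/4CL", "4-coumaroyl CoA"),
    ("4-coumaroyl CoA", "CHS", "naringenin chalcone"),
    ("naringenin chalcone", "CHI", "naringenin"),
    ("naringenin", "F3H", "dihydroflavonols"),
    ("dihydroflavonols", "FLS", "flavonols"),
    ("dihydroflavonols", "DFR", "leucoanthocyanidins"),
    ("leucoanthocyanidins", "LAR", "proanthocyanidins"),
    ("leucoanthocyanidins", "ANS", "anthocyanidins"),
    ("anthocyanidins", "UFGT", "anthocyanins"),
]

# Direction of expression change in PT relative to GT for the four reported DEGs.
PT_VS_GT = {"FLS": "down", "DFR": "up", "ANS": "up", "UFGT": "up"}


def route(start, end):
    """Return the enzymes on a path from start to end, or None if no path exists."""
    if start == end:
        return []
    for substrate, enzyme, product in PATHWAY:
        if substrate == start:
            rest = route(product, end)
            if rest is not None:
                return [enzyme] + rest
    return None


def branch_summary(branch_point="dihydroflavonols"):
    """Map each enzyme that consumes the branch-point substrate to its PT-vs-GT change."""
    return {
        enzyme: PT_VS_GT.get(enzyme, "not differentially expressed")
        for substrate, enzyme, _ in PATHWAY
        if substrate == branch_point
    }


print(route("dihydroflavonols", "anthocyanins"))  # enzymes on the anthocyanin branch
print(branch_summary())                           # competing enzymes at the branch point
```

In this toy representation, the two enzymes competing for dihydroflavonols (FLS and DFR) carry opposite expression changes in PT, which is the flux-diversion argument made in the text.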

*Int. J. Mol. Sci.* **2019**, *20*, x FOR PEER REVIEW 9 of 19

To confirm the differential expression levels of the four candidate structural genes and the four key transcription factors detected by the RNA-seq analysis, we performed quantitative real-time PCR (Table S4). The results showed that the genes *c43941.graph\_c0* (FLS) and *c42189.graph\_c1* (WRKY) were clearly down-regulated over the developmental stages in PT, while the genes *c39842.graph\_c0* (DFR), *c45741.graph\_c0* (ANS), *c48211.graph\_c0* (UFGT), *c33188.graph\_c0* (MYB), *c44079.graph\_c0* (MYB) and *c37493.graph\_c0* (bHLH) were all clearly up-regulated over the developmental stages in PT (Figure 7A–H). The qRT-PCR results were therefore consistent with the RNA-seq results.
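For readers unfamiliar with how relative expression against an internal reference gene is typically derived, the sketch below uses the widely used 2^-ΔΔCt calculation. This is an assumption for illustration: the section does not state which quantification formula was applied, the function name `relative_expression` is hypothetical, the Ct values are invented, and the formula presumes roughly 100% amplification efficiency for both primer pairs.

```python
def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Fold change of a target gene in a sample versus a calibrator (2^-ddCt).

    ct_target / ct_reference: Ct of the target and reference gene in the sample.
    ct_target_cal / ct_reference_cal: the same two Ct values in the calibrator.
    """
    delta_sample = ct_target - ct_reference              # normalize sample to reference
    delta_calibrator = ct_target_cal - ct_reference_cal  # normalize calibrator to reference
    return 2.0 ** -(delta_sample - delta_calibrator)


# Hypothetical Ct values, for illustration only (not data from this study):
# target gene in one sample versus a calibrator sample, reference gene Ct = 18 in both.
fold = relative_expression(ct_target=22.0, ct_reference=18.0,
                           ct_target_cal=25.0, ct_reference_cal=18.0)
print(fold)
```

A lower Ct for the target gene in the sample than in the calibrator, at equal reference-gene Ct, yields a fold change above 1, which corresponds to up-regulation.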

**Figure 6.** Proposed model of the mechanism leading to the low anthocyanin content in the green turnip (GT). Naringenin chalcone is isomerized by chalcone isomerase (CHI) to naringenin. Flavanone 3-hydroxylase (F3H) converts naringenin into dihydroflavonols (dihydrokaempferol, dihydroquercetin or dihydrotricetin). Then, the three dihydroflavonols are converted into colorless leucoanthocyanidins by dihydroflavonol 4-reductase (DFR) and subsequently to colored anthocyanidins by anthocyanidin synthase (ANS). Anthocyanidins are glycosylated by the enzyme flavonoid 3-*O*-glucosyltransferase (UFGT) to facilitate their accumulation in cells. Proanthocyanidins are generated from leucoanthocyanidins by the action of leucoanthocyanidin reductase (LAR). DFR, ANS and UFGT were found significantly down-regulated in GT, leading to a low content of 13 anthocyanin compounds (more than 20 times lower than in the purple turnip). In contrast, FLS was found significantly up-regulated, which may lead to a high accumulation of flavonols. GT tends to prioritize flavonol accumulation by diverting dihydroflavonols to the flavonol biosynthesis pathway.

**Figure 7.** Quantitative real-time PCR validation of selected candidate genes predicted to differentially affect the anthocyanin profiles in the two turnips. (**A–H**) Relative expression level of *c33188.graph\_c0* (MYB), *c44079.graph\_c0* (MYB), *c37493.graph\_c0* (bHLH), *c42189.graph\_c1* (WRKY), *c43941.graph\_c0* (FLS), *c39842.graph\_c0* (DFR), *c45741.graph\_c0* (ANS) and *c48211.graph\_c0* (UFGT) between PT and GT at five developmental stages (S1–S5). PT represents the purple turnip while GT represents the green turnip; they are represented by the grey and white bars, respectively. The error bar represents the SD of biological replicates. The *Actin* gene was used as the internal reference gene for normalization; (**I**) identification of a nonsense mutation in the gene *c39842.graph\_c0* (DFR) by comparing the sequences between PT and GT. The single nucleotide polymorphism (C/T) is located at position 679 within the coding sequence of the gene and is predicted to generate an amino acid (aa) Q in PT and a stop codon in GT. The white box represents the exon while the black and gray boxes represent the 5′ UTR and 3′ UTR, respectively. The arrow indicates the transcription start site and transcription orientation.

*2.5. Detection of SNPs within the Four Candidate Structural Genes Regulating the Differential Skin Coloration in Turnips*

Differential gene expression among individuals is not only caused by the modulation of transcription factors but can also result from variations in the nucleotide sequences [50]. We therefore investigated single nucleotide polymorphisms (SNPs) within the sequences of the four differentially expressed candidate structural genes (*c43941.graph\_c0*, *c39842.graph\_c0*, *c45741.graph\_c0* and *c48211.graph\_c0*) associated with the anthocyanin biosynthesis pathways, which were predicted to modulate pigment formation in turnip skin. Sequence comparison of the unigenes between the two turnips unveiled a putative SNP in *c39842.graph\_c0* (DFR). In the DFR gene, which has a single coding sequence (CDS) of 1,158 nucleotides (nt), a nonsense point mutation (C/T) was detected at position 679 nt of the CDS. The SNP has an allele depth (number of reads) of 0/51 for GT and 113,774/0 for PT. This indicates that the allele T, which introduces a stop codon in the resulting protein, is likely to be present only in GT, while the allele C is apparently present only in PT (Figure 7I).
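The arithmetic behind the predicted Q-to-stop change can be checked with a short sketch. It assumes that position 679 is counted 1-based from the first nucleotide of the start codon and uses the standard genetic code; the actual codon present in the DFR sequence is not given in this section, so both glutamine codons are tested. The helper names `locate_in_codon` and `mutate` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}   # standard genetic code
GLUTAMINE_CODONS = {"CAA", "CAG"}     # the two codons encoding Q


def locate_in_codon(cds_position):
    """Return (codon_number, offset_in_codon), both 1-based, for a 1-based CDS position."""
    index = cds_position - 1
    return index // 3 + 1, index % 3 + 1


def mutate(codon, offset, new_base):
    """Replace the base at the 1-based offset of a codon."""
    return codon[:offset - 1] + new_base + codon[offset:]


codon_number, offset = locate_in_codon(679)
# Apply the C-to-T change at that offset to every possible glutamine codon.
mutated = {codon: mutate(codon, offset, "T") for codon in sorted(GLUTAMINE_CODONS)}
total_codons = 1158 // 3

print(codon_number, offset)   # codon and position within the codon hit by the SNP
print(mutated)                # resulting codons after the C-to-T change
print(total_codons)           # number of codons in the 1,158-nt CDS
```

Under these assumptions the SNP falls on the first base of codon 227, where a C-to-T change turns either glutamine codon into a stop codon. The GT allele would therefore truncate translation after 226 residues, well short of the 386 codons in the CDS.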

#### **3. Discussion**

Anthocyanin-containing fruits and vegetables are an integral part of the human diet, without any known adverse effect [51]. In this study, we profiled the anthocyanin composition of two turnips (*Brassica rapa* ssp. rapa) widely grown in Xinjiang (China), with purple- (PT) and green-colored (GT) skins (Figure 1), and investigated the underlying genetic basis. The common aglycones of anthocyanins are pelargonidin, cyanidin, delphinidin, peonidin, petunidin, and malvidin [52]. Using ultra-performance liquid chromatography and tandem mass spectrometry, we detected 17 anthocyanin compounds in the turnip skins, classified into the six common aglycones of anthocyanins (Table 4, Table S2). In addition, we detected four proanthocyanidin compounds: procyanidin A1, procyanidin A2, procyanidin B2 and procyanidin B3 (Table 4, Table S2). There have been no formal studies on the anthocyanin composition of *Brassica rapa* ssp. rapa; however, according to previous studies, cyanidin glycosides are the major anthocyanins accumulated in Brassica crops [53–56]. Nine cyanidin anthocyanins were detected in purple cauliflower and purple cabbage [53]. Li [54] analyzed anthocyanin extracts of 'violet' purple cabbage and obtained eight different anthocyanin components, whose basic component is cyanidin-malonyl-glucoside. Later on, Guo et al. [55] examined purple seaweed sprouts, purple turnips, and purple cabbage and identified 23 anthocyanin compounds, composed of 17 cyanidins and six pelargonidins. Recently, Park et al. [56] also found 11 anthocyanins, predominantly cyanidin, in purple Kohlrabi (*Brassica oleracea* var. gongylodes). We deduce that the anthocyanin profiles vary widely among purple Brassica vegetables and that *Brassica rapa* ssp. rapa has one of the most diverse anthocyanin profiles in its root, which may confer superior health-promoting attributes. Moreover, comparative analysis of the anthocyanin contents in PT and GT skins in this study showed that the two turnips have different profiles (Figure 4, Table S2) and that PT had anthocyanin levels 20 times higher than GT, which may explain the difference in their skin coloration.

Variation in anthocyanin content in plants has been linked to the differential expression of key genes encoding structural enzymes involved in the anthocyanin biosynthesis pathways [57,58]. Some of these genes are classified as early biosynthesis genes, including chalcone synthase (CHS), chalcone isomerase (CHI), and flavanone-3-hydroxylase (F3H), while others are classified as late biosynthesis genes, including dihydroflavonol-4-reductase (DFR), anthocyanidin synthase (ANS), and UDP-flavonoid glucosyl transferase (UFGT) [22]. To uncover the key structural genes modulating the differential pigmentation in the skin of PT and GT, we sequenced and de novo assembled the whole transcriptome from skin samples collected at five developmental stages using RNA sequencing (Tables 1–3). Although Lin et al. [43] reported the full sequencing of the genome of *Brassica rapa* ssp. rapa, the sequence is not publicly available, which prompted us to perform a de novo transcript assembly. A very high number of unigenes was identified in this study (~76,000 genes) compared to previous reports in *Brassica rapa* genotypes (~40,000 genes) [42,43]. Our results will fuel further investigations on the genetic variation underlying the diverse morphotypes found in this species. Based on the differentially expressed gene (DEG) analysis and gene annotation, we searched for all DEGs related to flavonoid-anthocyanin biosynthesis. Our results revealed four DEGs between PT and GT, all classified as late biosynthesis genes. Dihydroflavonols lie midway in anthocyanin biosynthesis, at the junction between the activity of the early biosynthesis enzymes and that of the late biosynthesis enzymes, and they are the shared substrate for both anthocyanin and flavonol biosynthesis.
In various plants, it has been documented that the up-regulation of early biosynthesis genes increases the formation of dihydroflavonols, which later facilitates high anthocyanin accumulation [30,34,35,37,40]. However, in turnip, we uncovered a different mechanism leading to the differential anthocyanin content (Figures 5 and 6). The purple turnip (PT) tends to prioritize anthocyanin biosynthesis over flavonol biosynthesis by strongly down-regulating one flavonol synthase gene (*c43941.graph\_c0*, FLS), which normally converts dihydroflavonols into flavonols. We therefore predict that the dihydroflavonols are mainly diverted to anthocyanin biosynthesis through a strong up-regulation of one DFR gene (*c39842.graph\_c0*), which would result in a high accumulation of leucoanthocyanidins in PT. A similar mechanism has recently been discovered in *Mimulus lewisii* [39]: the gene *LAR1*, encoding an R2R3-MYB transcription factor, positively regulates FLS, essentially eliminating anthocyanin biosynthesis in the white region around the corolla throat of *M. lewisii* flowers by diverting dihydroflavonols from the anthocyanin pigment pathway into flavonol biosynthesis [39]. Interestingly, the putative nonsense mutation identified in the coding sequence of the DFR gene could lead to a nonfunctional protein in GT (Figure 7I), which may impair the accumulation of anthocyanins in GT skin. Moreover, PT strongly activated one ANS gene (*c45741.graph\_c0*), which will likely generate a high level of anthocyanidins. Both genes (DFR and ANS) are essential for the higher anthocyanin content in PT, but without glucosylation, anthocyanins are unstable and do not accumulate in the cells to give the purple pigmentation [59]. In this regard, we detected one UFGT gene (*c48211.graph\_c0*) that was strongly up-regulated in PT and will likely favor the high accumulation of purple anthocyanin pigments in PT (Figure 5). By analogy, we deduce that the green turnip (GT) prioritizes flavonol biosynthesis through a high activity of the FLS gene and strongly reduces anthocyanin accumulation by down-regulating the DFR, ANS and UFGT genes (Figure 6). Nonetheless, it is still unclear how the different anthocyanin profiles are generated in the two varieties. For example, how delphinidin and petunidin 3-*O*-glucoside accumulate to higher levels in GT, or why pelargonin, pelargonidin 3-*O*-beta-D-glucoside, pelargonidin *O*-acetylhexoside and cyanidin *O*-acetylhexoside are only detected in PT, will require additional investigation.

The activities of the structural genes in the flavonoid-anthocyanin biosynthetic pathways are regulated by other genes, predominantly transcription factors (TFs) from the MYB, bHLH and WD40 families, which form ternary complexes called MBW [28,29,31,33]. Accordingly, in this study we also uncovered several members of the MYB and bHLH families as the main drivers of transcriptional changes between the two turnips (Figure 3). However, we did not find any annotated gene corresponding to WD40 among the DEGs. An et al. [32] have shown that the MBW ternary complex is not indispensable for the regulation of anthocyanin genes in apple; therefore, pending in-depth investigation, we speculate that MYB and bHLH may be sufficient for the regulation of anthocyanin structural genes in turnips. Many WRKY genes were also differentially expressed between the two turnips (Figure 3), suggesting that they may play key roles. Lei et al. [60] demonstrated that *WRKY2* and *WRKY34* negatively regulate the expression of certain MYBs during plant male gametogenesis. Similarly, *AtWRKY40* binds to the W-box in the promoter of *AtMYB2* to inhibit its expression [61]. Later on, Verweij et al. [38] also showed in two different species how a WRKY gene negatively regulates the MYB-bHLH-WD40 complex. In our study, the MYB and bHLH genes were mostly up-regulated while WRKY genes were mainly down-regulated in PT, implying that WRKYs may repress the expression of MYB and bHLH members. Among others, we propose four candidate TFs, namely *c42189.graph\_c1* (WRKY), *c33188.graph\_c0* (MYB), *c44079.graph\_c0* (MYB) and *c37493.graph\_c0* (bHLH), for thorough functional characterization in future studies of turnip skin pigmentation.
