*4.2. Metabolic Profiling*

The sample preparation, extract analysis, metabolite identification and quantification were performed at Wuhan MetWare Biotechnology Co., Ltd. (www.metware.cn) following their standard procedures and previously described by Zhang et al. [48]

### *4.3. Metabolite Data Analysis*

Before the data analysis, quality control (QC) analysis was conducted to confirm the reliability of the data. The QC sample was prepared by the mixture of sample extracts and inserted into every five samples to monitor the changes in repeated analyses. Data matrices with the intensity of the metabolite features from the 30 samples were uploaded to the Analyst 1.6.1 software (AB SCIEX, Ontario, Canada) for statistical analyses. The supervised multivariate method, partial least squares-discriminant analysis (PLS-DA), was used to maximize the metabolome differences between the pair of samples. The relative importance of each metabolite to the PLS-DA model was checked using the parameter called variable importance in projection (VIP). Metabolites with VIP ≥ 1 and fold change ≥ 2 or fold change ≤ 0.5 were considered as differential metabolites for group discrimination [48].

#### *4.4. RNA Extraction, cDNA Library Construction, and Transcriptome Sequencing*

Total RNAs were extracted using Spin Column Plant total RNA Purification Kit following the manufacturer0 s protocol (Sangon Biotech, Shanghai, China). Purity of the extracted RNAs was assessed on 1% agarose gels followed by NanoPhotometer spectrophotometer (IMPLEN, Los Angeles, CA, USA). RNA quantification was performed using Qubit RNA Assay Kit in Qubit 2.0 Flurometer (Life Technologies, Carlsbad, CA, USA). Next, RNA integrity was checked by the RNA Nano 6000 Assay Kit of the Agilent Bioanalyzer 2100 system (Agilent Technologies, Santa Clara, CA, USA).

Sequencing libraries was created using NEB Next Ultra RNA Library Prep Kit following manufacturer0 s instructions. The index codes were added to each sample. Briefly, the mRNA was purified from 3 µg total RNA of each of three replicate using poly-T oligo-attached magnetic beads and then broken into short fragments to synthesize first strand cDNA. The second strand cDNA synthesis was subsequently performed using DNA Polymerase I and RNase H. PCR was carried out with Phusion High Fidelity DNA polymerase using universal PCR primers and index (×) primer. Finally, six paired-end cDNA libraries with an insert size of 300 bp were constructed for transcriptome sequencing and sequenced on Illumina HiSeq 2500 platform (Illumina Inc., San Diego, USA) by Biomarker Technology Corporation (www.biomarker.com.cn). The raw RNAseq data are submitted at: www.ncbi.nlm.nih.gov/bioproject/PRJNA558197.

#### *4.5. De novo Assembly, Functional Annotation, Classification and Metabolic Pathway Analysis*

The clean reads were retrieved after trimming adapter sequences, removal of low quality (containing > 50% bases with a Phred quality score < 15) and reads with unknown nucleotides (more than 1% ambiguous residues N) using the FastQC tool (http://www.bioinformatics.babraham.ac.uk/ projects/fastqc/). The high quality reads from all the 30 libraries were de novo assembled into transcripts using Trinity (Version r20140717, [62]) by employing paired-end method [63]. Next, the transcripts were realigned to construct unigenes. The assembled unigenes were then annotated by searching against various databases such as Kyoto Encyclopedia of Genes and Genomes (KEGG) [64], Gene Ontology (GO) [65], Clusters of Orthologous Groups (COG) [66], PfAM, Swissprot [67], egNOG [68], NR [69], euKaryotic Orthologous Groups (KOG) [70] using BLAST [71] with a threshold of E-value <sup>&</sup>lt; 1.0 <sup>×</sup> <sup>10</sup>−<sup>5</sup> .

The software KOBAS2.0 [72] was employed to get the unigene KEGG orthology. The analogs of the unigene amino acid sequences were searched against the Pfam database [73] using HMMER tool [74] with a threshold of E-value <sup>&</sup>lt; 1.0 <sup>×</sup> <sup>10</sup>−10. The sequenced reads were compared with the unigene library using Bowtie [75], and the level of expression was estimated in combination with RSEM [76]. The gene expression level was determined according to the fragments per kilobase of exon per million fragments mapped (FPKM).

#### *4.6. Di*ff*erential Expression and Enrichment Analysis*

The read count was normalized and EdgeR Bioconductor package [77] was used to determine the differential expression genes (DEGs) between the two varieties at each developmental stage with the fold change > 2 [44] and false discovery rate (FDR) correction set at *p* < 0.01. GO enrichment analysis was performed using the topGO method based on the wallenius noncentral hypergeometric distribution with *p* < 0.05 [78]. KEGG pathway enrichment analysis of the DEGs was done using KOBAS2.0 [72]. The FDR correction was employed (*p* < 0.05) to reduce false positive prediction of enriched GO terms and KEGG pathways.

## *4.7. SNP Analysis*

The reads and unigene sequences of each sample were compared using the software STAR [79] and the single nucleotide polymorphism (SNP) was identified through the pipeline (SNP Calling) for RNA-Seq by GATK2 [80]. Raw vcf files were filtered with GATK standard filter method and other parameters (clusterWindowSize: 35; MQ0 ≥ 4 and (MQ0/(1.0\*DP)) > 0.1; QUAL < 10; QUAL < 30.0 or QD < 5.0 or HRun > 5), and only SNPs with distance > 5 were retained.

## *4.8. Gene Expression Using Quantitative Real Time-PCR*

The qRT-PCR was performed on RNA extracted from root samples of both varieties at the five developmental stages as described by Dossa et al. [81] using the *Actin* gene as the internal control. Specific primer pairs of 15 selected genes were designed using the Primer Premier 5.0 [82] (Table S4). The qRT-PCR was conducted on a Roche Lightcyler® 480 instrument using the SYBR Green Master Mix (Vazyme), according to the manufacturer's protocol. Each reaction was performed using a 20 µL mixture containing 10 µL of 2 × ChamQ SYBR qPCR Master Mix, 6 µL of nuclease-free water, 1 µL of each primer (10 mM), and 2 µL of 4-fold diluted cDNA. All of the reactions were run in 96-well plates and each cDNA was analyzed in triplicate. The following cycling profile was used: 95 ◦C for 30 s, followed by 40 cycles of 95 ◦C/10 s, 60 ◦C/30 s. Data are presented as relative transcript level based on the 2−∆∆Ct method [83].
