#### 4.3.4. Data Evaluation

Based on the self-built database MWDB (Metware database) and public metabolite databases, the primary and secondary mass spectra were analyzed qualitatively using Analyst 1.6.3 software. During the qualitative analysis of some substances, interfering signals were removed, including isotope signals, duplicate signals containing K<sup>+</sup>, Na<sup>+</sup> and NH<sub>4</sub><sup>+</sup> ions, and duplicate signals of fragment ions derived from other, larger molecules. Structural analysis of the metabolites referred to existing public mass spectrometry databases such as MassBank (http://www.massbank.jp/), KNApSAcK (http://kanaya.naist.jp/KNApSAcK/), HMDB (http://www.hmdb.ca/) (Wishart et al. 2013), MoToDB (http://www.ab.wur.nl/moto/) and METLIN (http://metlin.scripps.edu/index.php) [38].

Metabolites were quantified in multiple reaction monitoring (MRM) mode. In MRM mode, the first quadrupole selected the precursor ions (parent ions) of the target substance and screened out ions corresponding to substances of other molecular weights, which provided an initial elimination of interference. The precursor ions were then fragmented in the collision chamber to form many fragment ions. Next, the fragment ions were filtered through the triple quadrupole to select a single characteristic fragment ion, which eliminated interference from non-target ions and made quantification more accurate and more repeatable. After the metabolite mass spectrometry data of the different samples were obtained, the mass spectrometry peaks of all substances were integrated, and the peaks of the same metabolite in different samples were integrated and corrected [39].
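
The two-stage selection described above can be illustrated with a minimal Python sketch. It is not the instrument software's logic: the function names, the m/z tolerance `tol`, and the example m/z values are hypothetical and chosen only to show how a precursor/fragment pair (a "transition") rejects non-target signals.

```python
def matches_transition(precursor_mz, product_mz, transition, tol=0.5):
    """Return True if an ion pair matches the target (Q1, Q3) transition.

    `tol` is a hypothetical m/z window; real instruments define their own.
    """
    q1_mz, q3_mz = transition
    return abs(precursor_mz - q1_mz) <= tol and abs(product_mz - q3_mz) <= tol


def mrm_filter(events, transition, tol=0.5):
    """Keep only (precursor_mz, product_mz, intensity) events on the transition."""
    return [e for e in events if matches_transition(e[0], e[1], transition, tol)]


# Hypothetical detector events: (precursor m/z, fragment m/z, intensity).
events = [
    (303.1, 153.0, 1200.0),  # target precursor and characteristic fragment
    (303.1, 229.0, 800.0),   # target precursor, other fragment: rejected
    (287.0, 153.0, 500.0),   # other precursor: rejected at the first stage
]
kept = mrm_filter(events, transition=(303.0, 153.0))
```

Only the first event passes both filters, which is why a single transition gives a cleaner signal for quantification than the precursor mass alone.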

#### 4.3.5. Data Analysis

The raw data were first preprocessed (noise filtering, peak matching and peak extraction) and corrected [40]. The data that passed quality control then entered the statistical analysis stage.

Multivariate statistical analysis was applied. Data were log-transformed and mean-centred using SIMCA software (V14.1, MKS Data Analytics Solutions, Umeå, Sweden) for PCA and OPLS-DA. PCA was followed by automated modelling analysis [41,42]. The first principal component (PC1) was subjected to OPLS-DA modelling, and model quality was tested by 7-fold cross-validation. The resulting R<sup>2</sup>Y (the interpretability of the model for the categorical variable Y) and Q<sup>2</sup> (the predictability of the model) were then used to evaluate the validity of the model. The permutation test was performed multiple times to generate different random Q<sup>2</sup> values, which were used to further test model validity [43].
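
As an illustration only, the preprocessing and the Q<sup>2</sup> statistic can be sketched in Python. The analysis itself was run in SIMCA, whose internal calculations may differ. The log base (10), the function names, and the use of the common definition Q<sup>2</sup> = 1 − PRESS/SS<sub>tot</sub> on cross-validated predictions are assumptions made for this sketch.

```python
import math


def log_transform_and_center(matrix, base=10.0):
    """Log-transform each value, then subtract each column (metabolite) mean.

    `matrix` is a list of rows (samples) of positive peak areas.
    The log base is an assumption; the text does not state it.
    """
    logged = [[math.log(v, base) for v in row] for row in matrix]
    n_rows, n_cols = len(logged), len(logged[0])
    means = [sum(row[j] for row in logged) / n_rows for j in range(n_cols)]
    return [[row[j] - means[j] for j in range(n_cols)] for row in logged]


def q2_from_cv_predictions(y_true, y_cv_pred):
    """Q2 = 1 - PRESS / SS_tot, using predictions from held-out folds."""
    mean_y = sum(y_true) / len(y_true)
    press = sum((y - p) ** 2 for y, p in zip(y_true, y_cv_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - press / ss_tot
```

A Q<sup>2</sup> close to 1 means the held-out class labels were predicted well. In a permutation test the same statistic is recomputed after shuffling the labels, and a valid model should score clearly higher than the shuffled ones.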

R software (version 3.0.3) was used for hierarchical cluster analysis (HCA). The data were log<sub>2</sub>-transformed, and similarity for clustering was assessed using the Euclidean distance coefficient.
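
The clustering step can be sketched in plain Python as follows. This is not the R code used in the study: the linkage rule is not stated in the text, so average linkage is assumed here, and all function names are hypothetical.

```python
import math


def log2_transform(matrix):
    """Apply log2 to every (positive) value of a samples-by-metabolites matrix."""
    return [[math.log2(v) for v in row] for row in matrix]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hierarchical_clustering(rows):
    """Agglomerative clustering with average linkage (an assumption).

    Returns the merges in order as (members_a, members_b, distance).
    """
    clusters = [[i] for i in range(len(rows))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between the two clusters.
                total = sum(euclidean(rows[a], rows[b])
                            for a in clusters[i] for b in clusters[j])
                dist = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        dist, i, j = best
        merges.append((clusters[i], clusters[j], dist))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges
```

The log<sub>2</sub> transform matters because Euclidean distance on raw peak areas would be dominated by the most abundant metabolites. After the transform, a doubling contributes the same distance at any abundance level.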

Based on the OPLS-DA results, differential metabolites were screened using the following criteria: (1) metabolites whose content differed between the control group and the experimental group by a fold change of more than 2 or less than 0.5 were considered significantly different; (2) of these, metabolites with a variable importance in projection (VIP) value ≥ 1 were selected.
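
The two screening criteria can be expressed as a short Python sketch. The field names (`control`, `treated`, `vip`), the function names, and the choice to compute fold change as experimental mean divided by control mean are assumptions made for illustration. The strict inequalities follow the wording "more than 2" and "less than 0.5".

```python
def fold_change(control_mean, treated_mean):
    """Fold change as experimental / control (direction is an assumption)."""
    return treated_mean / control_mean


def is_differential(fc, vip, fc_up=2.0, fc_down=0.5, vip_min=1.0):
    """Criterion (1): fc > 2 or fc < 0.5; criterion (2): VIP >= 1."""
    return (fc > fc_up or fc < fc_down) and vip >= vip_min


def screen_differential(metabolites):
    """Return the metabolites (dicts) that satisfy both criteria."""
    return [
        m for m in metabolites
        if is_differential(fold_change(m["control"], m["treated"]), m["vip"])
    ]


# Hypothetical values, for illustration only.
example = [
    {"name": "A", "control": 10.0, "treated": 25.0, "vip": 1.3},  # passes both
    {"name": "B", "control": 10.0, "treated": 4.0, "vip": 1.0},   # passes both
    {"name": "C", "control": 10.0, "treated": 30.0, "vip": 0.8},  # VIP too low
    {"name": "D", "control": 10.0, "treated": 15.0, "vip": 1.5},  # fold change too small
]
selected = [m["name"] for m in screen_differential(example)]
```

Applying the fold-change filter first and the VIP filter second gives the same result as applying both at once, because the two conditions are combined with a logical AND.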

Finally, the metabolites were mapped to the Kyoto Encyclopaedia of Genes and Genomes (KEGG) Pathway database (http://www.kegg.jp/kegg/pathway.html) [44] and the MW-database (Metware Biotechnology Co., Ltd, Wuhan, China), which are centred on metabolic reactions and link them into possible metabolic pathways.

#### *4.4. Enzyme Activity Determination*

The activities of 10 enzymes were determined by enzyme-linked immunosorbent assay (ELISA). Leaves were ground to powder in liquid nitrogen, and 0.1 g of sample was accurately weighed into a centrifuge tube. The same volume of precooled 0.1 mol/L PBS solution was then added, and the mixture was centrifuged at about 3000 rpm at low temperature for 10 min. Then, 1 mL of the supernatant was collected for testing. Enzyme activity was determined according to the manual of the ELISA detection kit for the corresponding enzyme (ProNets Biotechnology Co., Ltd., Wuhan, China).

#### *4.5. Total RNA Extraction and qRT-PCR Analysis*

Total RNA of *C. sinense* 'Red Sun' was extracted using an RNA extraction kit (TIANGEN Biotech, Beijing, China). The RNA content of three samples was determined with a NanoDrop 2000 (Thermo Fisher, Waltham, MA, USA). From each sample, 500 ng of RNA was reverse-transcribed with the HiScript Q RT SuperMix for qPCR kit (Vazyme Biotech Co., Nanjing, China) to obtain cDNA. A 10-fold dilution of the cDNA was used for qRT-PCR analysis.

qRT-PCR was performed on a CFX96™ Real-Time System (Bio-Rad, Hercules, CA, USA) following the instructions of the ChamQ SYBR qPCR Master Mix kit (Vazyme Biotech Co., Nanjing, China). CsACTIN was used as the normalization standard for gene expression. Relative gene expression was calculated by the 2<sup>−ΔΔCT</sup> method. The primers for qRT-PCR are listed in Table S5.
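
For reference, the 2<sup>−ΔΔCT</sup> calculation can be sketched in Python. The function and parameter names are hypothetical, the CT values in the example are invented for illustration, and the sketch assumes the usual form of the method in which the target gene is first normalized to the reference gene (here CsACTIN) and then to a control sample.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCT) method.

    dCT  = CT(target) - CT(reference gene), per sample
    ddCT = dCT(sample) - dCT(control)
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Invented CT values: the target amplifies two cycles earlier in the
# sample than in the control, relative to the reference gene.
fold = relative_expression(
    ct_target_sample=22.0, ct_ref_sample=18.0,
    ct_target_control=24.0, ct_ref_control=18.0,
)
```

With these numbers ΔΔCT is −2, so `fold` is 4.0, that is, a fourfold higher expression than in the control. The calculation assumes near-100% amplification efficiency for both primer pairs.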
