1. Introduction
Quantitative reverse transcription polymerase chain reaction (qRT-PCR) is an efficient, sensitive and reliable technique for quantifying the expression profiles of target genes in different tissues, following different hormone treatments and under various stresses [
1]. qRT-PCR enables changes in the expression of a target gene, whether it is an abundant or a low-abundance transcript, to be compared against reference genes. Thus, qRT-PCR is widely preferred over classic transcriptome analysis methods [
2]. Appropriate normalization is a key factor in quantifying transcript expression levels to avoid bias [
3]. Normalization requires the selection of one or more reference genes showing constant and stable expression levels in different tissues and following different treatments [
4,
5]. To correct for differences in the initial amount of cDNA template, the quality of RNA from different tissues and the efficiency of enzymatic reactions, reference genes are typically required for data correction in qRT-PCR analysis [
6].
An ideal reference gene has the following characteristics: first, it is expressed stably in different tissues and cells; second, its expression is not greatly affected by environmental conditions, biotic or abiotic stress, or other factors; finally, its expression level is similar to that of the target gene. Numerous studies of reference genes have been conducted in model plants and other species, such as
Arabidopsis [
7],
Populus euphratica [
8,
9],
Lycopersicon esculentum [
10],
Solanum tuberosum [
11],
Eucalyptus robusta [
12] and slash pine [
13]. According to previous studies, the most stable reference genes in many species are related to basic cellular processes, such as the genes for actin [
14], tubulin, elongation factor 1-α [
15], glyceraldehyde-3-phosphate dehydrogenase [
16] and 18S ribosomal RNA [
17]. However, many studies reported that the expression levels of these genes are not constant under all experimental conditions or in different species. No universal reference gene has been identified [
18,
19]. Comparative analysis of reference genes showed that widely used reference genes are not stable under different experimental conditions [
20,
21]. Therefore, to obtain accurate target gene expression profiles, we determined the most suitable reference genes in
Metasequoia across different plant tissues and hormone treatments before normalizing gene expression.
Several bioinformatics tools including geNorm [
22], NormFinder [
23] and BestKeeper [
24] have been utilized to analyze and assess the expression stability of reference genes for qRT-PCR data normalization. geNorm is a popular algorithm which reveals the most stable reference genes from among many tested candidate reference genes. The NormFinder algorithm identifies the optimal normalization gene among a set of candidates. BestKeeper is an Excel-based tool for selecting the best candidate using pairwise correlations. In addition to these methods, the ΔCt [
25] and GrayNorm [
26] algorithms have been widely employed for data analysis. The RankAggreg algorithm has been used to produce a comprehensive ranking of gene stability [
27].
Metasequoia glyptostroboides Hu & Cheng, a national first-class protected plant in China, is a “living fossil” discovered as a relict population in a remote area near the border of Hubei in South-Central China in the 1940s [
28]. The discovery of
Metasequoia was one of the most important events contributing to species protection in China in the past century [
29].
Metasequoia Miki ex Hu et Cheng belongs to the Cupressaceae family and
Metasequoia glyptostroboides Hu & Cheng is the only species in this genus.
Metasequoia is a coniferous tree species widely distributed in southern China and is considered an ideal species for forest evolution studies. It is a high-quality timber species that has conserved various characteristics of ancient trees but has also evolved unique features over its long evolutionary history [
30,
31]. The unique wood features of
Metasequoia include its beautiful and delicate texture, durability and light weight [
32]. It is widely used as a raw material for producing floors, walls and furniture. In addition to these applications,
Metasequoia is a popular research material in the fields of pharmacology and biochemistry [
33,
34]. Although
Metasequoia has high economic and scientific value, its use is limited by long growth and reproduction times, as observed for other gymnosperms.
Metasequoia requires approximately 25 years to form the first female cone. After a long juvenile period, the cone matures and can form flower buds. Approximately 45 years are required to form a large amount of fruit [
35]. Studies of the molecular mechanisms of floral bud formation and abscisic acid (ABA)-response signaling in this tree species have revealed that ABA promotes the growth and development of this gymnosperm.
Gene expression analyses are important for studying floral bud differentiation and abiotic stress in gymnosperms [
36]. The conversion from vegetative growth to reproductive growth is an important process in plant development and a key event in plant reproduction. This process is regulated by both external environmental and internal factors. In this study, we sought to evaluate, for the first time, candidate reference genes in
Metasequoia, a valuable resource for examining gene expression levels. Our results provide a foundation for other researchers to choose reference genes for the normalization of mRNA levels by qRT-PCR in this tree species.
3. Discussion
The selection of genes that are stably expressed across different tissues and treatments is a key step in qRT-PCR analysis. However, in gymnosperms, reference gene selection has been reported only for
Cycas elongata [
37],
Abies alba [
38] and maritime pine [
39].
Metasequoia originated in the Cretaceous period of the Mesozoic era and was thought to be extinct until 1941, when Chinese botanists discovered surviving trees of this ancient and rare species in Hubei province, where it had persisted through the Quaternary Ice Age.
Metasequoia is referred to as a “living fossil” and is a tree species unique to China that has changed little over its evolutionary history. Few studies have examined reference genes in
Metasequoia. Therefore, we evaluated the stability of the expression of candidate reference genes under specific conditions and then analyzed the usefulness of these genes as internal controls to normalize the expression of other genes.
In our study, 14 candidate reference genes were successfully identified and their expression characteristics were further analyzed using six statistical algorithms: ΔCt, BestKeeper, NormFinder, geNorm, RankAggreg and GrayNorm. As shown in
Figure 1 and
Figure 2, the Ct values of each candidate reference gene varied among samples, and differences in stability among the genes were evident.
ACT2 and
TATA exhibited the most stable expression pattern. Our data demonstrate that
TATA ranked highest across the six tissues and the different hormone treatments. The TATA-box binding protein interacts with the TATA-box, which was the first promoter element identified in eukaryotes. The TATA-box is relatively fixed in eukaryotic promoters and plays an important role in regulating the initiation of gene transcription [
40]. Notably, the TATA-box binding protein gene has been preferred over other commonly used genes for qRT-PCR analysis in various species, such as mouse [
41], human [
42] and
Aphis glycines [
43]. Recently, Izabela also recommended using this gene for normalization in molecular studies of primary and secondary dormancy in
Avena fatua L. [
44]. An important feature of an ideal reference gene is that its expression should be constant regardless of the environmental and experimental conditions. However, recent studies showed that widely used reference genes are stably expressed only under certain conditions.
TUB was predicted to be an unstable reference gene in this study.
TUB is a member of the tubulin gene family, whose members are commonly used as reference genes [
45], but in
Metasequoia, its expression stability was low across different tissues and hormone treatments.
TUB also showed less stable expression than several other reference genes under abiotic stresses in seashore paspalum [
46]. The results show that not all common reference genes are suitable as reference genes without selection and validation and the same reference genes may have different expression characteristics in different tissue samples and experimental conditions.
Notably, several
Metasequoia reference genes exhibited expression patterns under ABA treatment that differed from those reported in other species. It was previously reported that
UBQ is the most stable reference gene in NaCl-treated and ABA-treated
Platycladus orientalis [
45]; however,
UBQ exhibited unstable expression following ABA treatment in this study. Several other widely used reference genes, including
EF1α,
elF-5A and
TATA, were comparatively more stable under ABA treatment. In previous studies,
TATA was used as a stable reference gene under heavy metal stress [
47], but few reports have examined its expression under other abiotic stresses.
EF1α is a commonly used reference gene and was shown to be one of the best reference genes under ABA treatment in
Populus euphratica [
8]. Additionally,
EF1α was selected as the best choice for
Petunia hybrida [
48], soybean [
49] and
B. brizantha [
50]. These results indicate that the reference genes validated in
Metasequoia are consistent with previously reported reference genes. We focused on the influence of hormones on gene expression patterns. Evaluation of hormone treatment revealed the large impact of ABA on gene expression (
Table 5,
Figure 5D,H). ABA treatment is a type of abiotic stress that leads to activation of numerous stress-related genes and the synthesis of multiple functional proteins in plants. The expression patterns of these reference genes are likely to be greatly affected during this process.
TATA and
elF-5A were stably expressed in leaves treated with the flowering-induction hormone solution and in ABA-treated leaves, suggesting that the type of hormone applied has little effect on the expression of these two genes in leaves. In future studies,
TATA and
elF-5A can be used as valuable reference genes in qRT-PCR assays of hormone treatment in
Metasequoia. Additionally,
HIS and
ACT2, which were stably expressed in treated buds, did not show stable expression in treated leaves. These results demonstrate that different tissues under the same treatment conditions exhibit variable gene expression patterns. Our research highlights the importance of choosing suitable reference genes for different tissues.
The expression levels of the target genes were significantly changed when different reference genes were used for normalization, leading to unreliable experimental results. To identify stably expressed reference genes and compare the advantages of these different algorithms, we analyzed and compared the relative expression levels of two functional genes following different hormone treatments. The results revealed numerous differences in the calculated expression patterns of these two target genes when different combinations of reference genes were used. Our results in
Figure 7 indicate that a combination of reference genes is preferable to a single internal reference gene. The results of this study provide important information on ABA-response signaling in gymnosperms. Gymnosperms have large genomes, but genome-wide information for them is limited; the current results therefore provide a useful resource for qRT-PCR analysis in other species related to
Metasequoia.
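The effect of reference gene choice on a calculated fold change can be illustrated with a minimal sketch. It assumes an amplification efficiency of 2 and a 2^(−ΔΔCt)-style calculation in which the Ct values of several reference genes are averaged (equivalent to taking the geometric mean of their quantities). The function name and all Ct values below are hypothetical and are not data from this study.

```python
from statistics import mean

def relative_expression(target_ct, ref_cts, cal_target_ct, cal_ref_cts):
    """Fold change of a target gene versus a calibrator sample.

    Assumes an amplification efficiency of 2 for every gene.
    ref_cts / cal_ref_cts: Ct values of one or more reference genes
    measured in the treated sample and in the calibrator sample.
    """
    delta_sample = target_ct - mean(ref_cts)
    delta_calibrator = cal_target_ct - mean(cal_ref_cts)
    return 2.0 ** -(delta_sample - delta_calibrator)

# Hypothetical Ct values: the target drops from 26 to 24 after treatment.
# Two stable reference genes keep the same Ct in both samples.
fold_stable_pair = relative_expression(24.0, [20.0, 21.0], 26.0, [20.0, 21.0])
# A single reference gene that drifts from Ct 22 to Ct 23 after treatment.
fold_unstable_single = relative_expression(24.0, [23.0], 26.0, [22.0])
```

With these made-up numbers the same target gene appears to be induced four-fold with the stable pair and eight-fold with the drifting single reference, which is the kind of discrepancy that motivates validating reference genes first.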
4. Materials and Methods
4.1. Hormone Treatment and Sample Collection
Metasequoia trees growing in the natural environment on the campus of Beijing Forestry University (latitude 40.0° N, longitude 116.2° E, 57 m above sea level) were sampled. Different tissues were collected at 17:00, including leaves, stems, roots, buds, male cones and female cones. After collection, all samples were immediately frozen in liquid nitrogen and stored at −80 °C until RNA extraction.
Metasequoia flowering-induction hormone solution contained indole butyric acid, zeatin ribonucleoside (Sigma-Aldrich, St. Louis, MO, USA) and 0.5% Tween-20 [
51]. Current-year leaves and buds were treated with this hormone solution, which we named
MglFlora. The treatment was conducted on three
Metasequoia trees and each represented a biological replicate. The treated shoots were evenly sprayed with
MglFlora until liquid began to drip from the leaves at 17:00 on 11 July 2017. Mature leaves and buds from the same position were collected at 1, 7 and 14 days after treatment. For ABA treatment, the leaves were sprayed with 200 μM ABA (Sigma-Aldrich, St. Louis, MO, USA) solution at 17:00 on 11 July 2017. Leaves from the same position were collected at 0, 0.5, 1, 2, 4, 8 and 12 h after treatment. After collection, all samples were immediately frozen in liquid nitrogen and stored at −80 °C until RNA extraction. The average temperature at Beijing Forestry University from 21 to 25 July 2017 was 28.28 °C; the average rainfall during this period was about 170 mm.
4.2. Selection of Candidate Reference Genes and Primer Design
In a previous study, the first large-scale dataset of expressed sequence tags (ESTs) of
Metasequoia from vegetative and reproductive tissues was reported using 454 pyrosequencing technology [
52]. Based on the ESTs, 14 putative reference genes were selected as candidates for qRT-PCR normalization and two target genes were selected to investigate the effect of the choice of reference genes on normalization.
The genes selected were
ACT2 (ACTIN 2),
AP-2 (AP-2 complex subunit mu-like),
Cpn60β (60-kDa chaperonin β-subunit),
EF1α (elongation factor-1 α),
elF-5A (eukaryotic initiation factor 5A),
GAPDH (glyceraldehyde-3-phosphate dehydrogenase),
GIIα (glucosidase II α-subunit),
HIS (histone superfamily protein H3),
RA (rubisco activase),
RP (ribosomal L27e protein family),
RPL17 (ribosomal protein L17),
TATA (TATA binding protein 2),
TUB (tubulin β chain),
UBQ (ubiquitin family 6),
FT (FLOWERING LOCUS T) and
PYL (pyrabactin resistance-like 8). All primers for these genes were designed using the online tool from Integrated DNA Technologies (
http://sg.idtdna.com/calc/analyzer; Coralville, IA, USA). The primers were 18–32 bp long and produced amplicons of 150–250 bp, with an optimal melting temperature (Tm) of 60 ± 2 °C and a GC content of 40–60%. Detailed information for these primers is shown in
Table 1, including the primer sequences and annealing temperatures of the 14 candidate reference genes. Primer specificity was confirmed by melting-curve analysis after qRT-PCR and amplicon sizes were confirmed by 2.0% agarose gel electrophoresis.
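The length and GC-content criteria stated above can be screened programmatically before ordering primers. The following is a minimal sketch only: the function name and the example sequence are hypothetical, the defaults simply mirror the 18–32 bp and 40–60% GC ranges given in the text, and Tm is deliberately not checked because it depends on the thermodynamic model used by the design tool.

```python
def primer_passes(seq, min_len=18, max_len=32, gc_min=40.0, gc_max=60.0):
    """Return (passes, gc_percent) for a primer sequence.

    Checks only length and GC content; Tm and secondary structure
    are left to the primer design tool.
    """
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / len(seq)
    passes = min_len <= len(seq) <= max_len and gc_min <= gc_percent <= gc_max
    return passes, gc_percent

# Made-up 20-nt sequence used purely for illustration
ok, gc = primer_passes("ATGCGTACCTGAAGCTTGCA")
```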
4.3. Total RNA Isolation and cDNA Synthesis
Frozen samples were ground into a fine powder under liquid nitrogen with a pestle and mortar. Total RNA was isolated from all samples using cetyltrimethyl ammonium bromide (CTAB) buffer according to a published protocol [
53]. Dithiothreitol (DTT, Thermo Fisher Scientific, Waltham, MA, USA) is a strong reducing agent that inhibits RNase A (Thermo Fisher Scientific, Waltham, MA, USA) activity by reducing its disulfide bonds, thereby limiting RNA degradation. DTT was therefore added to the CTAB buffer to improve the extraction efficiency. After chloroform/isoamyl alcohol extraction, RNA was precipitated with 10 M LiCl,
followed by extraction with SSTE buffer (1 M NaCl, 0.5% SDS, 10 mM Tris–HCl, 1 mM EDTA) and a second precipitation with ethanol. Finally, 75% ethanol was used to wash the RNA precipitate. After the remaining ethanol had evaporated, all RNA samples were dissolved in RNase-free water for subsequent cDNA synthesis. RNA quantity and purity were assessed by measuring the OD260/OD280 and OD260/OD230 absorbance ratios using a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). The integrity of the total RNA was checked on 0.8% agarose gels. After all RNA samples were adjusted to the same concentration, 1000 ng of RNA from each sample was used for first-strand cDNA synthesis. cDNA synthesis was performed using PrimeScript RT Enzyme (TaKaRa, Shiga, Japan) following the manufacturer’s instructions.
4.4. qRT-PCR Conditions and Amplification Efficiency
qRT-PCR was performed on a LightCycler 480 instrument (Roche, Basel, Switzerland) using TB Green™ Premix Ex Taq™ II (Tli RNaseH Plus) (TaKaRa, Shiga, Japan). The 20 µL reaction mixture contained 1 μL cDNA, 10 μL 2× TB Green™ Premix Ex Taq™ II (Tli RNaseH Plus), 0.6 μL 10 μM forward primer, 0.6 μL 10 μM reverse primer and 7.8 μL ddH2O. The amplification conditions were as follows: 95 °C for 10 s, followed by 45 cycles of 95 °C for 5 s, 60 °C for 30 s and 72 °C for 30 s. Each assay included three technical replicates. For each gene, the PCR efficiency (E) was calculated from a standard curve generated by qRT-PCR of a five-fold dilution series of the mixed cDNA template, using the equation E = 10^(−1/slope), where slope is the slope of the standard curve given by the LightCycler software (Roche, Basel, Switzerland).
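The efficiency calculation can be reproduced outside the instrument software. The sketch below fits a least-squares line to Ct versus log10 of the relative template amount and applies E = 10^(−1/slope). The function name and the Ct values are hypothetical; the example series is constructed so that the template doubles exactly every cycle.

```python
import math

def pcr_efficiency(relative_amounts, mean_ct):
    """Return (slope, E) from a standard curve, with E = 10 ** (-1 / slope).

    relative_amounts: relative template amount at each dilution (e.g. 1, 1/5, 1/25)
    mean_ct: mean Ct of the technical replicates at each dilution
    """
    x = [math.log10(a) for a in relative_amounts]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(mean_ct) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, mean_ct))
    slope = sxy / sxx  # ordinary least-squares slope of Ct against log10(amount)
    return slope, 10 ** (-1.0 / slope)

# Hypothetical five-fold dilution series with perfect doubling per cycle:
# each five-fold dilution delays the Ct by log2(5) cycles.
amounts = [5.0 ** -i for i in range(5)]
cts = [20.0 + i * math.log2(5.0) for i in range(5)]
slope, efficiency = pcr_efficiency(amounts, cts)
```

In this idealized series the slope is about −3.32 and E is 2, i.e. 100% efficiency when expressed as (E − 1) × 100%.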
4.5. Statistical Analysis
The expression stabilities of the 14 reference genes in different tissues and hormone treatments were evaluated using six algorithms: ΔCt [
25], BestKeeper [
54], NormFinder [
23], geNorm [
22], RankAggreg [
27] and GrayNorm [
26].
The ΔCt algorithm compares the relative expression of ‘gene pairs’ in each sample to identify stably expressed reference genes. The difference in Ct (ΔCt) between two genes is first calculated for every sample. If the ΔCt of the two genes remains the same across different cDNA samples, the two genes are considered stably expressed. The standard deviation (SD) reflects the variation in the ΔCt of the two genes across all samples. Next, the ΔCt between one gene and each of the other 13 genes is calculated, and the 13 SD values are averaged to obtain the mean standard deviation (mSD); a lower mSD indicates more stable expression.
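The pairwise comparison described above can be sketched in a few lines. The function name, gene labels and Ct values are hypothetical, and the sample standard deviation from the Python standard library is an assumption about how the SD is computed.

```python
from statistics import mean, stdev

def delta_ct_stability(ct):
    """Mean SD of pairwise delta-Ct values for each gene (lower = more stable).

    ct: dict mapping gene name -> list of Ct values, one per sample,
        with the same sample order for every gene.
    """
    genes = list(ct)
    msd = {}
    for gene in genes:
        pair_sds = []
        for other in genes:
            if other == gene:
                continue
            deltas = [a - b for a, b in zip(ct[gene], ct[other])]
            pair_sds.append(stdev(deltas))  # SD of delta-Ct across samples
        msd[gene] = mean(pair_sds)
    return msd

# Made-up Ct values for three hypothetical genes in four samples
example_ct = {
    "geneA": [20.0, 21.0, 22.0, 23.0],
    "geneB": [25.0, 26.0, 27.0, 28.0],
    "geneC": [18.0, 22.0, 17.0, 24.0],
}
msd = delta_ct_stability(example_ct)
ranking = sorted(msd, key=msd.get)  # most stable first
```

Here geneA and geneB move in parallel, so their pairwise ΔCt is constant and both obtain a lower mSD than geneC.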
BestKeeper determines the stability of reference genes from the standard deviation (SD) and coefficient of variation (CV) of the raw Ct values, together with pairwise correlations. Candidate reference genes showing the lowest CV values were considered the most stable. BestKeeper also reports the correlation coefficient (r) between each candidate and the BestKeeper index as a measure of stability. r was calculated to determine a credible normalization factor (NF), rather than to evaluate each reference gene’s expression stability separately. In this study, we used r to rank the stability values of the 14 candidate genes.
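The descriptive statistics involved can be approximated as follows. This is a simplified sketch rather than the Excel tool itself: CV is computed here as the coefficient of variation (SD as a percentage of the mean Ct), the index is taken as the per-sample geometric mean of the Ct values of all candidates, and the sample standard deviation is an assumption. Function names, gene labels and Ct values are hypothetical.

```python
import math
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def bestkeeper_like(ct):
    """Per-gene SD, CV (%) and correlation r with a geometric-mean index.

    ct: dict mapping gene name -> list of raw Ct values (same sample order).
    """
    genes = list(ct)
    n_samples = len(ct[genes[0]])
    # Index: geometric mean of the Ct values of all candidates in each sample
    index = [
        math.exp(mean(math.log(ct[g][i]) for g in genes))
        for i in range(n_samples)
    ]
    stats = {}
    for g in genes:
        sd = stdev(ct[g])
        stats[g] = {
            "sd": sd,
            "cv": 100.0 * sd / mean(ct[g]),
            "r": pearson_r(ct[g], index),
        }
    return stats

# Made-up Ct values for three hypothetical genes in four samples
example_ct = {
    "geneA": [20.0, 20.5, 21.0, 20.5],
    "geneB": [25.0, 27.0, 29.0, 27.0],
    "geneC": [18.0, 18.2, 18.1, 18.3],
}
stats = bestkeeper_like(example_ct)
```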
NormFinder evaluates gene stability based on the expression variation of the candidate reference genes, with a lower stability value indicating a more stable reference gene. The algorithm estimates inter-group and intra-group variation to compute the NF and requires at least three candidate genes and more than two samples per group.
geNorm evaluates the expression stability of a candidate reference gene according to its stability value (M), with smaller M values indicating better stability. Reference genes were selected on the basis of M, with values below 1.5 considered to indicate stable expression. geNorm also calculates pairwise variation values (V), which indicate the minimum number of reference genes required for reliable normalization.
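A geNorm-style ranking can be sketched as below. The sketch assumes an amplification efficiency of 2, so that the log2 relative quantity of a gene equals −Ct, defines M for a gene as the average SD of its log2 expression ratios with every other candidate, and removes the gene with the highest M step by step until two genes remain. Function names, gene labels and Ct values are hypothetical, and the pairwise variation V is not computed here.

```python
from statistics import mean, stdev

def m_values(log2_q):
    """Stability value M per gene: mean SD of log2 ratios with all other genes."""
    genes = list(log2_q)
    result = {}
    for j in genes:
        sds = []
        for k in genes:
            if k != j:
                ratios = [a - b for a, b in zip(log2_q[j], log2_q[k])]
                sds.append(stdev(ratios))
        result[j] = mean(sds)
    return result

def genorm_like_ranking(ct):
    """Stepwise exclusion of the least stable gene.

    Returns (best_pair, M of the best pair, [(gene, M at exclusion), ...])
    with the excluded genes ordered from more stable to less stable.
    """
    # Assumption: efficiency of 2, so log2(relative quantity) = -Ct
    remaining = {g: [-c for c in values] for g, values in ct.items()}
    excluded = []
    while len(remaining) > 2:
        m = m_values(remaining)
        worst = max(m, key=m.get)
        excluded.append((worst, m[worst]))
        del remaining[worst]
    final_m = m_values(remaining)
    best_pair = sorted(remaining)
    return best_pair, final_m[best_pair[0]], excluded[::-1]

# Made-up Ct values for four hypothetical genes in four samples
example_ct = {
    "geneA": [20.0, 21.0, 22.0, 23.0],
    "geneB": [25.0, 26.0, 27.0, 28.2],
    "geneC": [18.0, 22.0, 17.0, 24.0],
    "geneD": [30.0, 31.5, 31.8, 33.5],
}
best_pair, pair_m, later = genorm_like_ranking(example_ct)
```

In this toy data set geneC is excluded first and geneD second, leaving geneA and geneB as the most stable pair, well below the M threshold of 1.5 mentioned above.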
GrayNorm is an algorithm that can identify a combination of reference genes with minimal deviation from non-normalized data. The principle of GrayNorm is to calculate the NF for each treatment group and each possible combination of reference genes. The closer the mean value of 1/NF for each treatment group is to 1.0, the smaller the biological variation and the more accurately the expression level of the gene of interest can be calculated.
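The 1/NF criterion can be illustrated with a simplified sketch that follows the description above rather than the published implementation. It assumes an efficiency of 2, expresses each reference gene relative to the mean Ct of the control group, takes the NF of a combination as the geometric mean of those relative quantities, and scores each combination by the summed absolute deviation of the group means of 1/NF from 1. The function name, group labels, gene labels and Ct values are all hypothetical.

```python
import math
from itertools import combinations
from statistics import mean

def graynorm_like(ct, groups, control):
    """Score every combination of reference genes by how far 1/NF is from 1.

    ct: dict gene -> list of Ct values (one per sample)
    groups: list of group labels, one per sample
    control: label of the control group
    Returns dict: tuple of genes -> summed |mean(1/NF) - 1| over groups.
    """
    genes = list(ct)
    n = len(groups)
    # Relative quantity versus the control-group mean Ct (efficiency assumed = 2)
    rq = {}
    for g in genes:
        ctrl_mean = mean(c for c, lab in zip(ct[g], groups) if lab == control)
        rq[g] = [2.0 ** (ctrl_mean - c) for c in ct[g]]
    scores = {}
    for size in range(1, len(genes) + 1):
        for combo in combinations(genes, size):
            # NF per sample: geometric mean of the relative quantities
            nf = [
                math.exp(mean(math.log(rq[g][i]) for g in combo))
                for i in range(n)
            ]
            deviation = 0.0
            for lab in sorted(set(groups)):
                inv_nf = mean(1.0 / nf[i] for i in range(n) if groups[i] == lab)
                deviation += abs(inv_nf - 1.0)
            scores[combo] = deviation
    return scores

# Made-up Ct values: two control and two ABA-treated samples
groups = ["control", "control", "ABA", "ABA"]
example_ct = {
    "geneA": [20.0, 20.0, 20.0, 20.0],  # unchanged by treatment
    "geneB": [25.0, 25.0, 27.0, 27.0],  # repressed by treatment
    "geneC": [18.0, 18.0, 17.0, 17.0],  # induced by treatment
}
scores = graynorm_like(example_ct, groups, "control")
best = min(scores, key=scores.get)
```

In this toy example the unchanged gene alone gives 1/NF equal to 1 in both groups and therefore the lowest score.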
The RankAggreg package (v0.6.4) was run in R (v3.0.1). RankAggreg performs rank aggregation, integrating the stability rankings obtained from the four methods to establish a comprehensive ranking of reference genes. RankAggreg contains various algorithms, such as the Cross-Entropy Monte Carlo algorithm, a genetic algorithm and a brute-force algorithm. The detailed algorithm process can be referenced at
https://cran.r-project.org/web/packages/RankAggreg/RankAggreg.pdf. According to the ranking orders of these four methods, the Cross-Entropy Monte Carlo algorithm (CE) was used to perform the rank aggregation. The result was visualized as a line graph.
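The idea of combining several ranked lists into one consensus list can be illustrated with a much simpler stand-in. The sketch below uses a Borda-style mean rank, not the Cross-Entropy Monte Carlo algorithm of RankAggreg, and the function name, method labels and rankings are hypothetical placeholders rather than results of this study.

```python
def mean_rank_consensus(rankings):
    """Borda-style aggregation: order genes by their mean rank across methods.

    rankings: dict mapping method name -> list of genes, most stable first.
    Every method must rank the same set of genes.
    """
    gene_set = set(next(iter(rankings.values())))
    totals = {g: 0 for g in gene_set}
    for method, order in rankings.items():
        if set(order) != gene_set or len(order) != len(gene_set):
            raise ValueError(f"{method} does not rank every gene exactly once")
        for position, gene in enumerate(order, start=1):
            totals[gene] += position
    n_methods = len(rankings)
    # Ties are broken alphabetically so that the output is deterministic
    return sorted(gene_set, key=lambda g: (totals[g] / n_methods, g))

# Hypothetical rankings from four methods
example_rankings = {
    "deltaCt": ["geneA", "geneB", "geneC", "geneD"],
    "BestKeeper": ["geneB", "geneA", "geneC", "geneD"],
    "NormFinder": ["geneA", "geneC", "geneB", "geneD"],
    "geNorm": ["geneA", "geneB", "geneD", "geneC"],
}
consensus = mean_rank_consensus(example_rankings)
```

A mean-rank consensus is easy to verify by hand, whereas the Cross-Entropy Monte Carlo approach searches for the list that minimizes a distance to all input lists, so the two can disagree when the methods conflict strongly.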