A Two-Step PCR Protocol Enabling Flexible Primer Choice and High Sequencing Yield for Illumina MiSeq Meta-Barcoding

Ko-Hsuan Chen; Reid Longley; Gregory Bonito; Hui-Ling Liao

doi:10.3390/agronomy11071274

¹ North Florida Research and Education Center, Soil and Water Sciences Department, University of Florida, Quincy, FL 32351, USA

² Biodiversity Research Center, Academia Sinica, Taipei 115, Taiwan

³ Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, MI 48824, USA

⁴ Department of Plant Soil Microbial Sciences, Michigan State University, East Lansing, MI 48824, USA

Agronomy 2021, 11(7), 1274; https://doi.org/10.3390/agronomy11071274

This article belongs to the Section Crop Breeding and Genetics

Abstract

High-throughput amplicon sequencing that primarily targets the 16S ribosomal DNA (rDNA) (for bacteria and archaea) and the Internal Transcribed Spacer rDNA (for fungi) has facilitated microbial community discovery across diverse environments. A three-step PCR that utilizes flexible primer choices to construct the library for Illumina amplicon sequencing has been applied to several studies in forest and agricultural systems. The three-step PCR protocol, while producing high-quality reads, often yields a large number (up to 46%) of reads that cannot be assigned to a specific sample according to their barcodes. Here, we improve this technique through an optimized two-step PCR protocol. We tested and compared the improved two-step PCR meta-barcoding protocol against the three-step PCR protocol using four different primer pairs (fungal ITS: ITS1F-ITS2 and ITS1F-ITS4, and bacterial 16S: 515F-806R and 341F-806R). We demonstrate that the sequence quantity and recovery rate were significantly improved with the two-step PCR approach (fourfold more read counts per sample; determined reads ≈90% per run) while retaining high read quality (Q30 > 80%). Given that synthetic barcodes are incorporated independently from any specific primers, this two-step PCR protocol can be broadly adapted to different genomic regions and organisms of scientific interest.

Keywords:

MiSeq; Illumina; ITS; 16S rDNA; amplicon sequencing; metabarcoding

1. Introduction

The advancement of high-throughput sequencing has transformed our ability to explore microbial diversity across different environments. The overall sequence read output in combination with the ability to multiplex samples (i.e., pooling PCR products generated from many samples together in one sequencing effort and assigning the reads back to the original samples) makes MiSeq amplicon sequencing more cost-effective than other approaches. However, various technical difficulties may occur during library preparation, leading to low read quality or low clustering and read recovery rates [1]. To optimize sequencing quality, the manufacturer (Nicolas Devos, personal communication) recommends adding PhiX, short DNA fragments derived from a well-characterized bacteriophage genome, at the time of sequencing. With a 10% PhiX addition, about 10% unassigned reads are expected. If more than 10% of the reads are unassigned, the likely cause is mistagging or unsuccessful tagging of barcodes to the intermediate PCR products during library construction. Many existing protocols utilize a one-step PCR for DNA amplicon sequencing [2,3,4]. In this one-step PCR approach, a barcode is linked to a primer that targets a specific genomic region [3]. However, when a new primer set is tested, new primers with barcodes are required (e.g., 96–384 synthetic oligonucleotides of ca. 60 bp each), which is costly. Compared to one-step PCR, the multi-step PCR library construction approach offers flexibility in experimental design, allowing different genomic regions and organisms to be explored by using the same barcode set with different primer targets.

Recently, Chen et al., 2018 [5] described a three-step PCR (3P) protocol that enabled flexible primer choice given the PCR-ligation of the barcode region to samples during library construction. This protocol also implemented a “frame shift” within the ordinary universal primer regions, which provides the sequence diversity that Illumina requires for accurate base calling and therefore eliminates the need for PhiX addition [6]. This 3P protocol has been applied to several study systems [5,7,8,9,10] using the Illumina MiSeq sequencing platform. These studies span both mycobiome and microbiome systems and cover soil and plant samples. However, across these studies, the 3P approach generated 7 to 46 percent undetermined reads even when the same sequencing provider was used (e.g., Duke University Center for Genomic and Computational Biology) [5,10]. The unpredictable read loss across sequencing runs can challenge the estimation of sequence depth and downstream data analysis. With the goal of achieving a lower percentage of undetermined reads consistently across sample types and runs, we modified and improved upon the 3P method through a two-step PCR (2P) protocol that is presented here.

Compared to 3P (reported by Chen et al., 2018 [5]), the three major modifications of the 2P protocol are (1) the implementation of touchdown PCR cycling at the final PCR step, (2) the application of a long primer (sequencing primer-frame shift-linker-gene specific primer) directly in the first step of PCR amplification, and (3) the addition of a magnetic bead clean-up step between PCR reactions. The “touchdown” PCR cycling at the final PCR step steadily reduces the annealing temperature during the annealing step of the PCR thermocycler program. This improves the specificity and sensitivity of PCR amplification. Such an improvement leads to a significant decrease in the percentage of unassigned reads and an increase in the read recovery rate. The touchdown approach enables the exploration of a wide range of annealing temperatures that may be necessary to successfully amplify diverse gene regions of interest. In the second modification, removing one PCR step and using the long primer (originally used in the second step of 3P by Chen et al., 2018 [5]) in the first step of PCR amplification saves PCR preparation time and reagents. This also reduces the number of DNA templates in the final PCR step, lowering primer carry-over from the previous step. Meanwhile, the total number of PCR cycles remains at 30. In the third modification, leftover primers from the previous PCR were removed using a bead clean-up approach. This further limits the carryover of primers that may form primer dimers or may bias DNA quantification. To evaluate the PCR performance and sequencing results of 2P compared to the original 3P protocol, a trial was carried out using soil and plant materials collected from forests and farmlands (Figure 1). We prepared metabarcoding libraries using 2P and 3P procedures and compared the sequencing results. With this study, we aim to provide metabarcoding users with a robust protocol that can be applied to various targeted regions in different organisms.

Figure 1. Examples of sample sources tested in this study, including (A) roots and soils from a cotton field, (B,C) roots from pine forests, (D) leaves of grass, and (E,F) soil cores collected from a grass field.

2. Materials and Methods

Overview of Three-Step and Two-Step PCR Library Construction

To compare the sequence outcomes of two-step PCR (2P) and three-step PCR (3P) [5], we sequenced the 16S and ITS rDNA gene regions. We compared the microbial communities (bacteria, archaea, and/or fungi) from diverse environmental samples including soil and leaf samples from the forest and agricultural lands as well as root samples of pine trees (Figure 1). The references of each primer set tested are provided in Supplementary Table S1. The methods used for DNA extraction from plant and soil samples were applied according to Liao et al., 2014 [11], and Beule et al., 2019 [8], respectively.

Briefly, the 3P protocol started with a template enrichment step with the amplification of a targeted genomic DNA region using organism/genomic region-specific primers, generating PCR product 3P_1st. Product 3P_1st was then supplied as the DNA template for the next PCR, which used organism/genomic region-specific primers with Illumina sequencing primers attached, producing product 3P_2nd. The 3P_2nd product was used as a DNA template for the final PCR step that PCR-ligates Illumina adaptors to both ends and adds a 10-bp barcode to the 3′ end of the targeted region, resulting in PCR product 3P_3rd (Table 1 and Table 2) [5]. To assess the effect of magnetic bead clean-up, a “3P+cleanup” (Table 1) protocol that only differs from 3P by including a clean-up step before the final PCR reaction was tested as well.

Table 1. Cross comparison of the steps applied for two-step PCR (2P), three-step PCR (3P), and three-step PCR with bead clean-up (3P+cleanup) protocols. Forward Gene Region-Specific Primer (FGRSP, e.g., ITS1F, 341F, and 515F) and Reverse Gene Region-Specific Primer (RGRSP, e.g., ITS2, ITS4, and 806R) can be replaced by forward and reverse primers of interest, respectively. x = barcode ID. Each of the 2P steps (Steps 1–7) is described in detail in Figure 2.

Table 2. PCR program differences between two-step PCR (2P) and three-step PCR (3P) protocols, including cycle number, annealing approach, and template concentration.

The overall workflow of 2P is illustrated in Figure 2. The core steps of 2P include the isolation of the total genomic DNA from the samples (Step 1), followed by an initial PCR amplification using a pair of organism/genomic region-specific primers with Illumina sequencing primers attached (Step 2) to generate the PCR product 2P_1st (Figure 3). The 2P_1st product was cleaned up (Step 3) with a size-selection magnetic bead system (AMPure XP, Beckman Coulter, Inc., Indianapolis, IN, USA) to generate product C_2P_1st. C_2P_1st was used as the starting DNA template for the second PCR (Step 4). The second PCR further amplified product C_2P_1st with a universal primer set, adding Illumina adaptors and a 10-bp barcode that enabled the bioinformatic separation of reads derived from individual samples (Figure 3). The PCR product (2P_2nd) generated from the second PCR was purified with magnetic beads to generate C_2P_2nd (Step 5). The C_2P_2nd product was examined with gel electrophoresis to ensure the correct size of the amplicons. The DNA concentration was measured, and the samples were normalized and pooled (Step 6). The pooled DNA library was submitted to the sequencing facility for sequencing (Step 7). For samples sequenced at Duke University Center for Genomic and Computational Biology, demultiplexing was carried out using Illumina bcl2fastq Conversion Software v2 with an allowance of one base error. The detailed 2P protocol is described in the next section. The sequences of the oligonucleotides required for 2P are provided in Supplementary Tables S1–S3.
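The idea behind demultiplexing with a one-base error allowance can be illustrated with a small sketch. This is not how bcl2fastq is implemented; it is a minimal Python illustration that assumes the barcode of each read is already available as a 10-bp string, and the barcodes and sample names below are hypothetical (the real barcodes are listed in Supplementary Table S2).

```python
from typing import Dict, Optional


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index_read: str,
                  barcodes: Dict[str, str],
                  max_mismatch: int = 1) -> Optional[str]:
    """Return the sample whose barcode is within max_mismatch of index_read.

    Returns None (an "undetermined" read) when no barcode, or more than
    one barcode, falls within the allowance.
    """
    hits = [sample for sample, bc in barcodes.items()
            if hamming(index_read, bc) <= max_mismatch]
    return hits[0] if len(hits) == 1 else None


# Hypothetical 10-bp barcodes, for illustration only.
barcodes = {"sample_A": "ACGTACGTAC", "sample_B": "TTGCAAGGCT"}

print(assign_sample("ACGTACGTAC", barcodes))  # exact match
print(assign_sample("ACGTACGTAA", barcodes))  # one mismatch, still assigned
print(assign_sample("ACGTACGGAA", barcodes))  # two mismatches, undetermined
```

Reads whose barcode was never attached, or was attached incorrectly, end up in the undetermined category under this kind of matching, which is why barcode attachment efficiency during library construction matters.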

Figure 2. Schematic diagram of the workflow of two-step PCR protocol. The primers (forward: FGRSP_F1-F6, reverse: RGRSP_F1-F6) added in Step 2 include the “gene region-specific primer (GRSP)” (colored in green; e.g., ITS1F and ITS4); the “linker” region (orange), which contains two base pairs; the “frame shift” regions (pink), which are one to six randomized nucleotides (F1–F6); and the sequencing primers (blue) specific to the sequencing platform selected (e.g., Illumina). In Step 4, the forward primer (PCR_F) added includes sequencing primers and Illumina primers (purple). The reverse primer (PCR_R_bc_(X), X = barcode ID) has an additional 10 bp barcode region (yellow) that is used to identify the sample from which each read originated. PCR product clean-up was carried out after each PCR reaction (Steps 3 and 5). The oligonucleotides and sequences of Read1_seq and Read2_seq are provided to the sequencing facility upon sample submission. Synthetic oligonucleotide sequences are provided in Supplementary Tables S1–S3. Thermocycler programs are illustrated in Figure 3.
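The layout of the Step 2 primers (sequencing primer, frame shift, linker, gene region-specific primer) can be sketched as simple string concatenation. In this sketch the sequencing primer and linker are bracketed placeholders rather than real sequences (the actual oligonucleotides are in Supplementary Table S1), and writing the randomized frame-shift bases as `N` is an assumption made for illustration. The ITS1F sequence shown is the commonly published one.

```python
def build_frameshift_primers(seq_primer: str,
                             linker: str,
                             gene_primer: str,
                             max_shift: int = 6) -> dict:
    """Return primers F1..F{max_shift}, written 5'->3'.

    Each variant carries n randomized bases (shown as 'N') between the
    sequencing primer and the linker, so amplicons start at staggered
    positions during sequencing.
    """
    return {
        f"F{n}": seq_primer + "N" * n + linker + gene_primer
        for n in range(1, max_shift + 1)
    }


# Placeholders, NOT real sequences: see Supplementary Table S1.
SEQ_PRIMER = "[seq_primer]"
LINKER = "[linker]"
# Gene region-specific primer ITS1F (5'->3').
ITS1F = "CTTGGTCATTTAGAGGAAGTAA"

primers = build_frameshift_primers(SEQ_PRIMER, LINKER, ITS1F)
for name, seq in primers.items():
    print(name, seq)
```

Swapping `ITS1F` for another gene region-specific primer leaves the other segments untouched, which mirrors how the same barcode set can be reused with different primer targets.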

Figure 3. Thermocycler programs for two-step PCR (2P) workflow. * Decrease by 0.3 degree Celsius every cycle; ** We recommend using 15 cycles. However, if PCR yielded low DNA concentrations, the number of cycles can be increased up to 20.
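The touchdown footnote (a 0.3 °C decrease every cycle) translates into a simple per-cycle annealing schedule. The sketch below is illustrative only: the starting temperature of 60 °C is a hypothetical value, not one taken from Figure 3, and the default of 15 cycles is used simply because it is the recommended cycle number mentioned in the caption.

```python
def touchdown_schedule(start_temp_c: float,
                       cycles: int = 15,
                       step_c: float = 0.3) -> list:
    """Annealing temperature (deg C) for each cycle of a touchdown program.

    The first cycle anneals at start_temp_c and every later cycle is
    step_c lower than the one before it.
    """
    return [round(start_temp_c - step_c * i, 1) for i in range(cycles)]


# Hypothetical starting temperature; use the value from your own program.
schedule = touchdown_schedule(start_temp_c=60.0, cycles=15)
print(schedule)
print("total drop:", round(schedule[0] - schedule[-1], 1), "deg C")
```

With 15 cycles the annealing temperature falls by 4.2 °C in total, and extending to 20 cycles would widen the range to 5.7 °C, which is how the program covers a range of annealing temperatures within a single run.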

The major differences between 2P protocol and the original 3P protocol [5] include (Table 1 and Table 2): (1) a decrease in the number of PCR steps from three to two by eliminating the enrichment step but by maintaining the same total number of PCR cycles (30 cycles); (2) the use of a steady decrease in annealing temperature setting (touchdown approach) during PCR instead of a constant temperature for primer annealing, improving primer specificity and annealing and reducing primer dimer; (3) a decrease in the amount of DNA template added to the final PCR cycle; and (4) the implementation of a magnetic bead clean-up step before the final PCR reaction.

To evaluate the performance of 2P vs. 3P protocols, we first compared the lengths of PCR products based on the same DNA extractions to check if barcodes/adaptors were successfully added (dataset 1). Second, we compared the sequencing results of samples prepared with three protocols, including 2P, 3P, and 3P+cleanup (sample number = 7 for each protocol, dataset 2) (Table 1, Supplementary Table S4). The libraries were based on exactly the same DNA extractions with the same primer set (515F-806R) and pooled with equal moles of PCR products for sequencing. The differences in the sequencing results (i.e., read number and quality) across the three protocols were evaluated. We then compared the critical components of the sequencing reports from 10 independent MiSeq runs, of which 5 runs were prepared with the 2P protocol and the other 5 with the 3P protocol (dataset 3, Supplementary Table S5). All 10 MiSeq runs were conducted with a single sequencing provider (Duke University Center for Genomic and Computational Biology) using the same platform (Illumina MiSeq v3 300PE). For both the 2P and 3P libraries, we included various sample types (different soil or plant samples) representing heterogeneous DNA extractions targeting bacteria/archaea or fungi. While the heterogeneous nature of the samples tested could bring about inconsistency, their diverse contents also offer a valuable opportunity to evaluate the consistency across sample types and individual runs. To evaluate the consistency of sequencing quality and the percentage of undetermined reads across independent runs, we performed a Welch two-sample t-test (or a Wilcoxon rank-sum test for non-normally distributed data) and a Levene test for homogeneity of variance to assess the differences between the 2P and the 3P protocols and the variation within the same protocol, respectively. All of the statistical tests were conducted in R [12]. To validate that the effects of 2P vs. 3P are consistent across sequencing facilities and different microbial groups, we independently replicated the 2P vs. 3P protocols and sequencing at Michigan State University (MSU), targeting fungi and bacteria from agricultural soils and from small mammal scat (dataset 4, Supplementary Method S1, Table S6). We compared the read number, richness, and taxonomic composition of bacteria and fungi based on the same DNA extractions. Finally, to evaluate the scale of increase/decrease of sequencing performance on the same sample, we calculated the ratio between 2P and 3P for read number and read quality generated by Duke and MSU sequencing facilities.

The Shapiro–Wilk test was applied to assess the normality of the data. If the data followed a normal distribution, an ANOVA or Welch two-sample t-test was conducted. A post hoc Tukey HSD test was then performed for the significant outcomes evaluated using ANOVA. If data normality was rejected, a Kruskal–Wallis or Wilcoxon rank-sum test was carried out. When a Kruskal–Wallis test was significant, pair-wise comparisons were performed using the Wilcoxon rank-sum test with FDR correction for multiple comparisons. Raw reads of datasets 2 and 3 were deposited at the Sequence Read Archive of NCBI (Bioproject: PRJNA736330).

3. Detailed Workflow of Two-Step PCR (2P) Protocol

The recommended steps for 2P are listed here. This protocol has been tested on soil and root materials (Figure 1).

3.1. Step 1: DNA Preparation

3.1.1. Sample Collection and Processing

Samples collected from the field need to be stored immediately at 4 °C in the field. Sample processing (e.g., soil sieving and root picking) must be carried out within 24 h, and the samples are subsequently stored in a −20 °C or −80 °C freezer for longer sample preservation.

3.1.2. DNA Extraction

Soil DNAs are extracted with Qiagen PowerSoil kit. Depending on the soil type, a different amount of soil might be needed. Generally, 0.2–0.5 g of soil is recommended [8,10]. Root DNAs are extracted following a CTAB-based protocol described in Liao et al., 2014 [11]. Typically, more than three root tips or clusters from the Pinus species are recommended to obtain adequate DNA.

3.1.3. Methods (Optional) to Prevent PCR Inhibitors in Extracted DNA Affecting the Following PCR Steps

DNA extracted from roots may be cleaned up with AMPure XP beads (Beckman Coulter, Inc., Indianapolis, IN, USA). This is an optional step to reduce co-extracted PCR inhibitors from root tissues. The ratio of volume of beads to DNA extraction is 1:1. The clean-up follows the procedure described in Step 3.

Loading equal volumes of the substrate for DNA extraction and normalizing DNA concentration across sample types, such as soils or root tissue, helps with generating even libraries. Still, it is recommended that PCR on a series of DNA concentrations (1–20 ng/μL, determined by Thermo Scientific™ NanoDrop, or Qubit™) be performed first on a few samples to determine the optimal DNA concentration before bulk sample processing. In special cases when excessive PCR inhibitors (i.e., substrates in DNA extraction that inhibit PCR reaction) or low levels of targeted organisms are present in the DNA extraction, diluting or adding more original DNA into the PCR could be tested [13].

3.2. Performing the First PCR Using Thermocycler Program “2P_1st”

3.2.1. PCR Reagent Preparation

Use the DNA extraction from Step 1, and follow the PCR recipe 2P_1st in Table 3.

Table 3. PCR mix recipes for two-step PCR protocol.

3.2.2. PCR Amplification

Place the PCR tubes on a thermocycler, and use the PCR program 2P_1st setting illustrated in Figure 3. Briefly, this step amplifies specific genomic regions with a pair of forward (e.g., fungi: ITS1F; bacteria: 515F and 341F) and reverse (e.g., fungi: ITS2 and ITS4; bacteria: 806R) gene region-specific primers (FGRSP and RGRSP) that include frame-shift features (Figure 2, sequences provided in Supplementary Table S1). The steadily decreasing annealing temperature of the “touchdown” approach implemented in the 2P_1st thermocycler program facilitates the primer annealing efficiency for a wide range of primer sequences. This step yields PCR product 2P_1st.

3.3. First PCR Product Clean-Up

The PCR purification step is performed using the solid-phase reversible immobilization (SPRI) bead size selection system [14] that removes remaining dNTPs, primers, primer-dimers, and salt.

3.3.1. AMPure Bead Preparation

Take the Beckman Coulter™ Agencourt AMPure XP beads out of the refrigerator and warm them to room temperature for approximately 30 min before use. Mix the AMPure XP beads well by gently vortexing or shaking until the beads are fully suspended in the solution (i.e., the solution becomes homogeneously brown).

3.3.2. Mix AMPure Beads and PCR Products

Add 12.5 μL of AMPure XP beads into each PCR tube (the ratio of volume of beads to the PCR product is 1:1), and pipet at least 10 times to fully mix the PCR product and beads.

3.3.3. Separate Solution versus AMPure Beads

Place the PCR tube onto the magnetic stand (e.g., Agencourt SPRIPlate 96R Super Magnet Plate), and wait until the tube becomes clear (about 5 min). Take out 10 μL, and discard the liquid.

3.3.4. Ethanol Wash

Add 200 μL of 80% ethanol; be careful not to disturb the pellet. Discard all liquid. Let the tubes air dry for 15 min to allow for evaporation of the adhered ethanol.

3.3.5. Beads and DNA Resuspension

Remove the PCR tube from the magnetic stand, add 12.5 μL water into the PCR tube, mix well, and incubate at room temperature for 2 min.

3.3.6. AMPure Bead Separation and Removal

Place the tube on the magnetic stand, and wait until the tube becomes clear. Move 10 μL supernatant to a new tube. The cleaned-up PCR product of this step is C_2P_1st.

3.4. Performing the Second PCR Using Thermocycler Program “2P_2nd”

3.4.1. PCR Reagent Preparation

Use the cleaned PCR product from Step 3 (C_2P_1st) as the DNA template. Follow the PCR recipe in Table 3. The sequences of primers are provided in Supplementary Table S2.

3.4.2. PCR Amplification

Place the PCR tubes on a thermocycler, and use the PCR program 2P_2nd setting illustrated in Figure 3. This step adds an Illumina adaptor to the forward end and another adaptor with a barcode region to the reverse end of the DNA template. As the primers used herein are long (forward ≈56 bp and reverse ≈60 bp), the touchdown annealing temperature setting is optimized so that the primers adhere to the template DNA. The PCR product is 2P_2nd.

3.5. Second PCR Product Clean-Up

3.5.1. AMPure Beads Preparation

Take Beckman Coulter™ Agencourt AMPure XP beads from the refrigerator 30 min prior to use. Shake and vortex the AMPure XP beads to mix and spin them from the sides of the wells.

3.5.2. Mix AMPure Beads and PCR Products

Add 25 μL of AMPure XP beads into each PCR tube (the ratio of volume of beads to PCR product is 1:1), and pipet at least 10 times to fully mix the PCR product and beads.

3.5.3. Separate Solution versus AMPure Beads

Place the PCR tube onto the magnetic stand (e.g., Agencourt SPRIPlate 96R Super Magnet Plate), and wait until the tube becomes clear. Take out 48 μL, and discard the liquid.

3.5.4. Ethanol Wash

Add 200 μL of 80% ethanol; be careful not to disturb the pellet. Discard all liquid. Let the tubes air dry for 15 min to evaporate the adhered ethanol.

3.5.5. Beads and DNA Resuspension

Remove the PCR tube from the magnetic stand, add 25 μL water into the PCR tube, mix well, and incubate at room temperature for 2 min.

3.5.6. AMPure Bead Separation and Removal

Place the tube on the magnetic stand, and wait until the tube becomes clear. Move 23.5 μL of the supernatant to a new tube. This step generates the product C_2P_2nd.

3.6. PCR Product Evaluation and Multiplex

Each C_2P_2nd needs to be examined for amplicon size and DNA concentration.

3.6.1. Gel Preparation and Electrophoresis

Prepare a 1% gel using electrophoresis-grade agarose in 1X TAE (Tris-Acetate-EDTA) buffer and appropriate fluorescence dye (e.g., SYBR-safe) for DNA examination. After the gel solidifies, load 5 μL of C_2P_2nd (mixed with 6× loading dye) for every sample into separate wells of the gel. A 100 bp DNA ladder should also be loaded as a reference for DNA amplicon lengths.

3.6.2. Gel Examination

The gel is then examined under UV or blue light box to determine the size of DNA fragments and their integrity. The gel examination can also be applied to examine the potential issues with primer dimers. Primer dimers with sizes below 200 bp may or may not be sufficiently removed by the bead clean-up during Step 5.

3.6.3. DNA Quantity and Purity Assessment

Evaluating C_2P_2nd products with a spectrophotometer (e.g., NanoDrop, Thermo Scientific™, Waltham, MA, USA) is recommended before multiplexing. This step assesses the DNA concentration and detects potential contaminants, including ethanol and polysaccharides [13]. Ethanol and polysaccharides could interfere with concentration measurements and downstream PCR performance. Due to the high absorbance of such contaminants at 230 nm, DNA contaminated with ethanol or polysaccharides would have an A260/A230 ratio lower than 2.0 [15]. DNA concentration can be further checked with fluorometric quantification methods, such as the Qubit fluorometer (Thermo Scientific™, Waltham, MA, USA).

3.6.4. Sample Multiplexing

Based on the quantification report of every C_2P_2nd sample using NanoDrop or Qubit (Step 6.3), each amplicon sample is normalized to a 10 nM concentration in pure water prior to pooling. Approximately 200–300 pooled samples are prepared for a run of MiSeq 300 PE sequencing.
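Normalizing to 10 nM requires converting the measured mass concentration (ng/μL) into a molar concentration, which depends on amplicon length. The sketch below uses the standard approximation of 660 g/mol per base pair of double-stranded DNA. The example concentration, amplicon length, and final volume are hypothetical values chosen for illustration and are not taken from this study.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA (standard approximation)


def ng_per_ul_to_nM(conc_ng_per_ul: float, amplicon_bp: int) -> float:
    """Convert a dsDNA concentration from ng/uL to nM."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * amplicon_bp)


def dilution_to_target(conc_ng_per_ul: float,
                       amplicon_bp: int,
                       target_nM: float = 10.0,
                       final_ul: float = 20.0) -> tuple:
    """Return (sample_ul, water_ul) giving final_ul at target_nM."""
    stock_nM = ng_per_ul_to_nM(conc_ng_per_ul, amplicon_bp)
    if stock_nM < target_nM:
        raise ValueError("sample is already below the target concentration")
    sample_ul = target_nM * final_ul / stock_nM
    return sample_ul, final_ul - sample_ul


# Hypothetical example: 15 ng/uL of a 900-bp amplicon.
stock = ng_per_ul_to_nM(15.0, 900)
sample_ul, water_ul = dilution_to_target(15.0, 900)
print(f"stock: {stock:.2f} nM")
print(f"mix {sample_ul:.2f} uL sample with {water_ul:.2f} uL water")
```

Because amplicon length enters the conversion, libraries built with different primer pairs (for example ITS versus 16S) need their own length estimate, which the gel or fragment analysis in Step 6 can provide.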

3.7. Submit Samples for Sequencing

3.7.1. Sample Submission to the Sequencing Facility

In addition to the pooled samples, two sequencing primers are needed: the Read1_seq sequencing primer and the Read2_seq sequencing primer. The Read1_seq and Read2_seq primers initiate sequencing on the Illumina platform. Users need to provide the pooled product from Step 6 and the custom Read1_seq Illumina sequencing primer (100 μM concentration) (Figure 2, Supplementary Table S3) [5] to the sequencing facility. In general, the Read2_seq primer is available at the sequencing facility, so submitting it is likely not necessary.

3.7.2. Submission Condition Inquiry

It is recommended to reach out to the sequencing facility prior to sample submission for primer design, multiplexing strategy, and sequencing depth requirements. Depending on fragment size, 10–15 picomolar (for 300–400 bp) to 25 picomolar (for 600–700 bp) can be loaded. The sequencing performed in this study was based on the platform Illumina MiSeq Paired-End 300 bp. Many sequencing facilities also assist with the demultiplexing process (i.e., assigning the reads to the sample they belong to). If so, provide the barcode list associated with individual samples (Supplementary Table S2).

3.8. Hardware Requirements

- Thermocycler: any thermocycler allowing temperature to decrease per cycle should work.
- Magnetic stands: PCR purification step is essential to remove remaining dNTPs and primer dimers that might be present in PCR products. To perform purification with paramagnetic beads, a 96-well magnetic plate is essential.
- Nucleotide spectrophotometer or fluorometer: to obtain an accurate concentration of DNA across the cleaned-up 2P_2nd product, a spectrophotometer or fluorometer is required. In addition to reporting the DNA concentration, the spectrophotometer also reveals common contaminations (e.g., ethanol and phenolic compounds) in the PCR product. The fluorometer, on the other hand, is believed to provide a more accurate estimation of DNA concentration.
- Gel electrophoresis system: horizontal electrophoresis system.
- Gel imaging system: gel documentation system.

3.9. Synthetic Oligonucleotides

Three categories of oligonucleotide primer sets are applied to prepare 2P libraries for sequencing. These include (1) forward gene region-specific primers (FGRSP_F1-F6) and reverse gene region-specific primers (RGRSP_F1-F6), (2) PCR_F, and (3) PCR_R_bc1-PCR_R_bc300. The full sequences of six newly designed frame-shift primers targeting fungi (forward: ITS1F and 5.8S_fun; reverse: ITS2, ITS4, and ITS4_fun) [16,17] and bacteria (forward: 341F) [18] are listed in Supplementary Table S1. The barcodes, PCR_F primer, and Read1 and Read2 sequencing primers were published in the Supplementary Tables S1 and S2 in Chen et al., 2018 [5]. For the user’s convenience, the sequences are provided herein in Supplementary Tables S2 and S3.

4. Results and Discussion

In this study, the efficiency of the PCR enrichment process is defined as successful amplification of the targeted templates. When the efficiency is high, the final PCR products should yield adequate PCR fragments of desired lengths. In cases where there is a decrease in the PCR efficiency, this can often be attributed to critical issues such as (1) primers binding non-specifically to undesired genomic regions and, thus, failing to amplify the targeted DNA fragment; (2) primers carried over from the previous amplification reactions (instead of newly added primer set) consumed in the current PCR cycles; and (3) insufficient primer-template annealing or low template concentration leading to inadequate PCR products. The second issue may lead to high proportions of undetermined reads in sequencing results [5]. To resolve these issues, the PCR efficiency of 3P [5] and 2P (this study) protocols were compared and their sequence outcomes were evaluated. Specifically, we examined whether the Illumina adaptors and barcodes were successfully attached to the amplified DNA regions during PCR library constructions. We also compared the recovery rate and quality of the reads generated from the two protocols. Our results indicate that the 2P protocol that implements a “touchdown” approach, an additional beads clean-up step, and a lowered DNA input of the final PCR largely improved the PCR efficiency. Such an improvement was further confirmed by receiving fewer undetermined reads in the sequencing data.

4.1. Evaluation of PCR Efficiency by Assessing the Amplicon Length

We evaluated the length of PCR products generated from 2P_1st and 2P_2nd. Successful attachment of the Illumina adaptors and barcode increases the length of the second PCR products by 63 bp relative to the first PCR products. The 63 bp comprise the Illumina adaptors added to the 5′ and 3′ ends of the PCR products (forward = 29 bp; reverse = 24 bp) plus a barcode at the reverse end (10 bp) (Figure 4). In contrast, incorrect PCR amplification occurs when primers carried over from the 2P_1st step are used instead of the primers newly added at the 2P_2nd step. Such unintended primer usage may lead to a smaller increment in the overall length of the targeted community in final PCR products (3P_3rd and 2P_2nd). The DNA fragments of the PCR products amplified for the two samples at each step were around 650 to 1000 bp in length (shown by the black bands in Figure 4A). As shown in Figure 4A, the 2P_2nd products were approximately 63 bp larger than the 2P_1st products, suggesting that the 2P approach successfully amplified the desired template and attached the barcode.

Figure 4. PCR product lengths detected by Fragment Analyzer™ measurements. PCR libraries were prepared with two-step (2P) and three-step (3P) protocols from the same DNA extractions (one soil and one root). PCR was performed with a fungus-specific primer set ITS1F and ITS4 (dataset 1). PCR products resulting from each of the three PCR steps of protocol 3P are referred to as 3P_1st, 3P_2nd, and 3P_3rd (Table 2). Similarly, the PCR steps of protocol 2P are referred to as 2P_1st and 2P_2nd (Table 2 and Figure 2). (A) Simulated gel image with DNA fragments shown as black bands. The Y-axis corresponds to the size of the fragment. The DNA fragments for both the soil and root samples demonstrated an increase in PCR product size from 2P_1st to 2P_2nd. Bands at 35 and 1500 bp are size standards; bands at ca. 66 bp are primer dimers. (B) An overlay of the DNA size spectra generated from each PCR step using a soil and a root sample. The X-axis indicates the size of the DNA fragment measured. The Y-axis corresponds to the intensity in Relative Fluorescence Units (RFU), viewed as a proxy for fragment abundance at the given size. The root sample yielded PCR products with peaks at 845 bp (2P_1st), 900 bp (2P_2nd), and 861 bp (3P_3rd). The soil sample yielded PCR products with peaks at 849 bp (2P_1st), 911 bp (2P_2nd), and 875 bp (3P_3rd). The DNA standards of 35 and 1500 bp are shown. 2P = two-step PCR protocol; 3P = three-step PCR protocol.

The Fragment Analyzer was further used to resolve the fragment sizes of the PCR products from the 2P_1st, 2P_2nd, and 3P_3rd steps at finer resolution (Figure 4B). The amounts of PCR end-products from the 3P_1st and 3P_2nd steps (only 10 PCR cycles each) were below the detection range of the Fragment Analyzer and are therefore not presented here. In Figure 4B, the quantity of PCR end-products at each step is presented in relative fluorescence units (RFU) (Y-axis). The amplified DNA fragments appear as peaks between ca. 600 and 1000 bp (highlighted in yellow) (Figure 4B). For the soil sample, the highest peak (amplified with the ITS1F and ITS4 primer set) shifted from 849 bp (2P_1st, in black) to 911 bp (2P_2nd, in blue), whereas the 3P_3rd peak was only at 875 bp (in red). Similarly, for the root sample, the highest peak shifted from 845 bp (2P_1st, in black) to 900 bp (2P_2nd, in blue) but was only at 861 bp for 3P_3rd (in red). The higher quantity of PCR end-products detected for 2P_2nd than for 3P_3rd indicates that the 2P protocol added the Illumina adaptors and barcodes more efficiently. The barcode sequence serves as the sample-specific tag for each sequence read. Successful attachment of the barcodes during 2P amplification is therefore critical for assigning reads to individual samples during demultiplexing.
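The length arithmetic above can be checked with a short sketch. It assumes only the adaptor and barcode lengths and the Figure 4B peak sizes stated in the text; using the 2P_1st peak as the common baseline for both final products is a simplification made here for illustration, and the function and variable names are hypothetical.

```python
# Expected length added by the final PCR (values from the text):
# forward adaptor 29 bp + reverse adaptor 24 bp + 10 bp barcode.
EXPECTED_INCREMENT_BP = 29 + 24 + 10  # 63 bp

# Peak sizes (bp) reported for Figure 4B.
peaks = {
    "soil": {"2P_1st": 849, "2P_2nd": 911, "3P_3rd": 875},
    "root": {"2P_1st": 845, "2P_2nd": 900, "3P_3rd": 861},
}


def increment(sample: str, product: str) -> int:
    """Size shift of a final product relative to the 2P_1st peak (assumed baseline)."""
    return peaks[sample][product] - peaks[sample]["2P_1st"]


for sample in peaks:
    for product in ("2P_2nd", "3P_3rd"):
        shift = increment(sample, product)
        diff = shift - EXPECTED_INCREMENT_BP
        print(f"{sample} {product}: +{shift} bp ({diff:+d} bp vs. expected)")
```

Under these assumptions, the 2P_2nd shifts fall within a few base pairs of the expected 63 bp, whereas the 3P_3rd shifts fall well short of it.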

4.2. The Effect of PCR Library Protocols (2P, 3P, and 3P+Cleanup) on the Quantity and Quality of Sequence Reads

We investigated the sequencing results of seven soil samples prepared with the 2P, 3P, and 3P+cleanup protocols (Table 1). To understand the effect of the magnetic bead clean-up between PCR steps on sequencing results, we first compared the 3P+cleanup protocol to the original 3P protocol. Compared to the 3P protocol, 3P+cleanup includes an extra step to clean up the intermediate PCR product (3P_2nd) before it is used as the DNA template to generate the final PCR product (3P_3rd) (Table 2). The 3P+cleanup protocol yielded a higher read number (Wilcoxon test, FDR < 0.001) and better read quality (Wilcoxon test, FDR < 0.01) than 3P (Figure 5A,B), suggesting that the clean-up step improved both read quantity and quality. After confirming the beneficial effect of the bead clean-up step, we compared 3P with the 2P protocol, which includes the bead clean-up step but uses a simpler library preparation procedure (Table 1 and Table 2). According to the sequencing reports, 2P yielded a significantly higher number of reads per sample than 3P (Wilcoxon test, FDR < 0.001) (Figure 5A). Read quality was similar between 2P and 3P (Figure 5B). The 3P+cleanup protocol yielded significantly higher read quality than both 2P and 3P. However, because the 2P protocol significantly increased read quantity while retaining read quality relative to 3P, we consider 2P an improved protocol overall.

Figure 5. Sequencing result comparisons. All sequencing was conducted at the sequencing core at Duke University Center for Genomic and Computational Biology (datasets 2 and 3). (A,B) Individual samples prepared with the three-step PCR (3P), three-step PCR with bead clean-up (3P+cleanup), and two-step PCR (2P) protocols (Supplementary Table S4). All samples were prepared with the 515F-806R primer set targeting bacteria and archaea (dataset 2). (A) Number of reads received and (B) average quality of reads (Phred score) for 3P, 3P+cleanup, and 2P per sample. (C,D) Independent Illumina MiSeq runs of 2P vs. 3P protocols (dataset 3, Supplementary Table S5). (C) Percentage of undetermined reads (i.e., reads that could not be assigned to a sample due to an unrecognizable barcode) and (D) average quality of reads in 3P vs. 2P across independent MiSeq runs. Significance labels for p-value and FDR: ns = not significant (>0.05), ** <0.01, *** <0.001.

4.3. Across Run Comparison for the Proportion of Undetermined Barcoded Sequences Generated with 2P vs. 3P Approaches

To evaluate the quality of sequences generated with 2P and 3P across independent sequencing runs, we compared key components of the MiSeq sequencing reports, including the percentage of undetermined reads (i.e., reads that could not be assigned to a specific sample) and the average read quality (Phred score) (Supplementary Table S5). Independent Illumina MiSeq runs were compared between 2P (5 runs) and 3P (5 runs). The percentage of undetermined reads was lower for 2P (8.19% ± 1.22%) than for 3P (22.8% ± 25.5%), although this difference was not statistically significant (Wilcoxon test, p = 0.09) (Figure 5C). The 2P protocol yielded a more consistent percentage of undetermined reads, with significantly less variation between runs than 3P (Levene test, p = 0.05). The average read quality was similar between 2P (32.5 ± 2.04) and 3P (31.8 ± 2.50) (Welch two-sample t-test, p = 0.61) (Figure 5D) and showed similar degrees of variation across samples (Levene test, p = 0.77).
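To make the notion of "undetermined reads" concrete, the following minimal sketch assigns index reads to samples by exact barcode match and reports the undetermined percentage. It is an illustration only: the two 10 bp barcodes and sample names are hypothetical (the actual barcodes are listed in Supplementary Table S2), and production demultiplexers typically tolerate mismatches, which this sketch does not.

```python
from collections import Counter

# Hypothetical 10 bp barcodes mapped to hypothetical sample names.
SAMPLE_BARCODES = {
    "ACGTACGTAC": "soil_1",
    "TGCATGCATG": "root_1",
}


def demultiplex(index_reads):
    """Count reads per sample by exact barcode match; all others are undetermined."""
    counts = Counter()
    for index_seq in index_reads:
        counts[SAMPLE_BARCODES.get(index_seq, "undetermined")] += 1
    return counts


def percent_undetermined(counts):
    """Percentage of reads that could not be assigned to any sample."""
    total = sum(counts.values())
    return 100.0 * counts["undetermined"] / total if total else 0.0


# Toy input: 6 soil reads, 3 root reads, 1 read with an unreadable barcode.
index_reads = ["ACGTACGTAC"] * 6 + ["TGCATGCATG"] * 3 + ["NNNNNNNNNN"]
counts = demultiplex(index_reads)
print(dict(counts), f"{percent_undetermined(counts):.1f}% undetermined")
```

A read whose barcode was never attached, or was attached incorrectly, ends up in the "undetermined" bin in this scheme, which is why barcode attachment efficiency drives the read recovery rate.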

The higher number of sequence reads generated per sample (Figure 5A), combined with the lower percentage of undetermined reads per run (Figure 5C), indicates that the barcodes were properly attached to the amplicons generated with the 2P protocol. This major improvement reduces the chance of discarding valuable reads simply because their sample of origin is uncertain. The high read quality scores in both 2P and 3P reflect a low error rate. Taken together, the 2P protocol allows more reads to be retained for downstream analysis and assessment (Figure 2). Compared to 3P, the 2P protocol slightly increased the average read quality across runs (Figure 5D). Read quality varied among samples for both 2P and 3P. We suspect that the read quality issue is sample-specific, but further examination is required to discern the specific cause(s).
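The link between a Phred score and an error rate follows from the standard definition Q = −10·log10(P), where P is the probability of an incorrect base call. The sketch below applies that definition to the mean scores reported above; treating a run-level mean Phred score as if it were a single per-base score is a simplification made here for illustration, and the function name is hypothetical.

```python
def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality score Q to a base-call error probability P."""
    return 10 ** (-q / 10)


# Mean scores from the text, plus the common Q30 benchmark for reference.
for label, q in (("2P", 32.5), ("3P", 31.8), ("Q30", 30.0)):
    p = phred_to_error_probability(q)
    print(f"{label}: Q = {q} -> P = {p:.2e} (about 1 error in {1 / p:,.0f} bases)")
```

Both protocols therefore sit above the Q30 benchmark, which corresponds to one expected error per 1000 bases.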

4.4. Evaluation of 2P and 3P Performance for Libraries Targeting Different Taxonomic Groups

To test whether the advantage of 2P over 3P is consistent regardless of the taxonomic group targeted by each library, we compared libraries targeting bacteria and fungi prepared with the 3P and 2P protocols. In addition to read number, we compared ecological inferences, including alpha-diversity (i.e., the number of Operational Taxonomic Units (OTUs)) and the taxonomic composition recovered by each protocol. The bioinformatic pipeline is described in Supplementary Method S1. For bacteria, 2P-prepared samples consistently yielded higher read numbers and higher alpha-diversity than 3P-prepared samples from the same DNA extraction (Welch two-sample t-test, p < 0.01) (Figure 6). For fungi, the read number and alpha-diversity of libraries prepared with 2P were also higher than those prepared with 3P, but the increases were not statistically significant (Figure 6). In addition to quality assessments, the number of chimeras removed from samples sequenced at MSU was compared between the two amplification methods. For ITS samples, the number of chimeras was the same regardless of amplification method (41). For 16S samples, more chimeras were removed from samples amplified with the 2P protocol than with the 3P protocol (1072 vs. 915). However, this is consistent with the greater number of OTUs and reads produced by the 2P protocol.

Figure 6. Comparison of sequence results obtained for bacteria and fungi (dataset 4, Supplementary Method S1, Table S6) using the three-step (3P) and two-step (2P) protocols. All library preparation and sequencing were conducted at the RTSF Genomics Core at Michigan State University. (A–C) Sequence results of bacteria. (A) Read number and alpha diversity (observed number of species) detected with the 2P and 3P protocols. (B,C) Order-level taxonomic composition of the 3P- and 2P-prepared libraries. (D–F) Sequence results of fungi. (D) Read number and alpha diversity (observed number of species) detected with the 2P and 3P protocols. (E,F) Genus-level taxonomic composition of the 3P- and 2P-prepared libraries. Welch two-sample t-test was conducted. ** p value < 0.01, ns = not significant.

4.5. 2P and 3P Comparison between Sequencing Facilities

To confirm that the improvement of the 2P protocol is consistent regardless of the sequencing facility, we compared sequencing results independently generated by the sequencing facilities at Duke University and Michigan State University. Samples prepared with 2P yielded higher read numbers than 3P at both institutions, ranging from 1.19-fold (fungi_MSU) to 4.74-fold (bacteria_Duke) (Figure 7). Samples prepared with 2P had slightly lower sequencing quality than 3P (all 0.99-fold) at both sequencing facilities (Figure 7).

Figure 7. Comparison of the 2P and 3P sequencing results generated by independent sequencing facilities (datasets 2 and 4). (A) The ratios of two-step PCR (2P)-generated read numbers to those of three-step PCR (3P) per sample. (B) The ratios of 2P-generated average read quality to those of 3P per sample. bacteria_Duke = libraries targeting bacterial/archaeal communities and sequenced at Duke University, bacteria_MSU = libraries targeting bacterial/archaeal communities and sequenced at Michigan State University, and fungi_MSU = libraries targeting fungal community and sequenced at Michigan State University.

4.6. Touchdown Technique in Improving Multi-Step PCR for Next-Generation Amplicon Sequencing

The touchdown PCR technique has been widely used in molecular applications to resolve incorrect primer binding [19,20]. In most PCR thermocycler programs, a single annealing temperature is set to optimize primer binding to the template. The higher the annealing temperature, the higher the specificity of primer binding. However, a high annealing temperature, and thus high specificity, comes at the cost of a low binding rate, which can result in insufficient PCR amplification. Finding a temperature that balances annealing specificity and efficiency often requires trial and error and is time-consuming for individual primer sets or samples. Touchdown PCR takes advantage of exponential DNA amplification. The annealing temperature is decreased gradually over the cycles, so the early, high-temperature cycles favor highly specific priming; the correct amplicons produced first then serve as templates in successive cycles and outnumber non-specific products. In the presented approach, each thermocycler program (2P_1st and 2P_2nd) spans a wide range of annealing temperatures, allowing a single program to be applied to diverse primer sets and samples under different conditions. The MiSeq next-generation sequencing platform typically requires long oligonucleotides that include primers, adaptors, and barcodes for amplicon sequencing. Such long oligonucleotides complicate the annealing or ligating steps. Therefore, the touchdown technique for amplicon sequencing that we provide here is a promising modification to existing microbiome protocols. While the current 2P protocol is optimized for the Illumina MiSeq sequencing platform, it could be adapted to other sequencing platforms (Oxford Nanopore, PacBio) with minor adjustments.
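The touchdown principle can be sketched as a per-cycle annealing schedule. The 0.3 °C decrement per cycle and the recommended 15 cycles are taken from Figure 3 and Table 2; the 60.0 °C starting temperature is a placeholder assumed here for illustration only (the actual program temperatures are given in Figure 3), and the function name is hypothetical.

```python
def touchdown_schedule(start_temp_c: float, cycles: int, step_c: float = 0.3):
    """Annealing temperature (°C) for each cycle, lowered by step_c per cycle."""
    return [round(start_temp_c - step_c * i, 1) for i in range(cycles)]


# ASSUMPTION: 60.0 °C is an illustrative starting temperature, not the protocol's value.
schedule = touchdown_schedule(start_temp_c=60.0, cycles=15)
for cycle, temp in enumerate(schedule, start=1):
    print(f"cycle {cycle:2d}: anneal at {temp:.1f} °C")
```

With these placeholder values, the program sweeps a window of about 4 °C over 15 cycles, so primer sets with somewhat different optimal annealing temperatures can share one program.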

5. Conclusions

The improved 2P protocol described herein was demonstrated to be suitable for studying soil and plant microbiomes in agricultural and forest ecosystems. The simplified PCR protocol yields a superior read recovery rate; thus, we recommend switching from the 3P to the 2P protocol. To date, this protocol has only been tested on microbes (i.e., bacteria and fungi), yet the flexible primer, adaptor, and barcode attachment system is transferable to other target taxonomic groups or markers to benefit enterprises and future applications. For instance, this method is amenable to the examination of microfauna (e.g., nematodes in soil), plant identification in mixed crop products, viruses, and general environmental metabarcoding.

Supplementary Materials

The following are available online at https://www.mdpi.com/article/10.3390/agronomy11071274/s1, Table S1: Sequences of frame-shift primers, Table S2: PCR_F and barcode sequences, Table S3: Read1_seq, Read2_seq, Barcode_seq sequences, Table S4: Comparison of 2P, 3P, and 3P+cleanup based on the same DNA extraction, Table S5: Sequencing results across MiSeq runs for 2-step (2P) and 3-step PCR (3P), Table S6: Comparison of sequencing and microbial community inferences targeting bacteria and fungi, Method S1: Sample and library preparation at Michigan State University.

Author Contributions

K.-H.C. modified the protocol. K.-H.C. and R.L. tested the protocol and analyzed the results. G.B. and H.-L.L. supervised the work. K.-H.C., R.L., G.B. and H.-L.L. all contributed to writing and editing the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

HLL was supported by an award (USDA-NIFA 2019-67013-29107) from the US Department of Agriculture, GB and RL were supported by the United States National Science Foundation award (DEB 1737898), and KHC was supported by the Taiwanese Ministry of Science and Technology (MOST #109-2621-B-001-006-MY3 grant). Support for this research was also provided by the NSF Long-term Ecological Research Program (DEB 1832042) at the Kellogg Biological Station and by Michigan State University AgBioResearch.

Data Availability Statement

Sequences are available in the Sequence Read Archive of NCBI (Bioproject: PRJNA736330) (https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA736330, accessed on 9 June 2021).

Acknowledgments

We thank Yuxi Guo, Jianyu Li, Bodh Raj Paudel, Chih-Ming Hsu, and Valerie Mendez for sharing their sequencing reports and Lukas Beule for the valuable discussion regarding the protocol modification. We thank Nicolas Devos for the discussion about MiSeq sequencing and Sean Sultaire for collecting the chipmunk scat used in this study.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Schnell, I.B.; Bohmann, K.; Gilbert, M. Tag jumps illuminated—reducing sequence-to-sample misidentifications in metabarcoding studies. Mol. Ecol. Resour. 2015, 15, 1289–1303.
2. Fadrosh, D.W.; Ma, B.; Gajer, P.; Sengamalay, N.; Ott, S.; Brotman, R.M.; Ravel, J. An improved dual-indexing approach for multiplexed 16S rRNA gene sequencing on the Illumina MiSeq platform. Microbiome 2014, 2, 6.
3. Smith, D.P.; Peay, K.G. Sequence Depth, Not PCR Replication, Improves Ecological Inference from Next Generation DNA Sequencing. PLoS ONE 2014, 9, e90234.
4. Truong, C.; Gabbarini, L.A.; Corrales, A.; Mujic, A.B.; Escobar, J.M.; Moretto, A.; Smith, M.E. Ectomycorrhizal fungi and soil enzymes exhibit contrasting patterns along elevation gradients in southern Patagonia. New Phytol. 2019, 222, 1936–1950.
5. Chen, K.-H.; Liao, H.-L.; Arnold, A.E.; Bonito, G.; Lutzoni, F. RNA-based analyses reveal fungal communities structured by a senescence gradient in the moss Dicranum scoparium and the presence of putative multi-trophic fungi. New Phytol. 2018, 218, 1597–1611.
6. Lundberg, D.S.; Yourstone, S.; Mieczkowski, P.; Jones, C.D.; Dangl, J.L. Practical innovations for high-throughput amplicon sequencing. Nat. Methods 2013, 10, 999–1002.
7. Benucci, G.M.N.; Burnard, D.; Shepherd, L.D.; Bonito, G.; Munkacsi, A.B. Evidence for Co-evolutionary History of Early Diverging Lycopodiaceae Plants With Fungi. Front. Microbiol. 2020, 10, 2944.
8. Beule, L.; Chen, K.-H.; Hsu, C.-M.; Mackowiak, C.; Dubeux, J.C.B., Jr.; Blount, A.; Liao, H.-L. Soil bacterial and fungal communities of six bahiagrass cultivars. PeerJ 2019, 7, e7014.
9. Longley, R.; Noel, Z.A.; Benucci, G.M.N.; Chilvers, M.I.; Trail, F.; Bonito, G. Crop Management Impacts the Soybean (Glycine max) Microbiome. Front. Microbiol. 2020, 11, 1116.
10. Wang, X.; Hsu, C.; Dubeux, J.C.B., Jr.; Mackowiak, C.; Blount, A.; Han, X.; Liao, H. Effects of rhizoma peanut cultivars (Arachis glabrata Benth.) on the soil bacterial diversity and predicted function in nitrogen fixation. Ecol. Evol. 2019, 9, 12676–12687.
11. Liao, H.-L.; Chen, Y.; Bruns, T.D.; Peay, K.G.; Taylor, J.W.; Branco, S.; Talbot, J.M.; Vilgalys, R. Metatranscriptomic analysis of ectomycorrhizal roots reveals genes associated with Piloderma-Pinus symbiosis: Improved methodologies for assessing gene expression in situ. Environ. Microbiol. 2014, 16, 3730–3742.
12. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2018; Available online: https://www.r-project.org (accessed on 1 November 2018).
13. Schrader, C.; Schielke, A.; Ellerbroek, L.; Johne, R. PCR inhibitors—occurrence, properties and removal. J. Appl. Microbiol. 2012, 113, 1014–1026.
14. Stortchevoi, A.; Kamelamela, N.; Levine, S.S. SPRI Beads-based Size Selection in the Range of 2–10kb. J. Biomol. Tech. JBT 2020, 31, 7–10.
15. Lucena-Aguilar, G.; Sánchez-López, A.M.; Barberán-Aceituno, C.; Carrillo-Ávila, J.A.; López-Guerrero, J.A.; Aguilar-Quesada, R. DNA source selection for downstream applications based on DNA quality indicators analysis. Biopreserv. Biobank. 2016, 14, 264–270.
16. Taylor, D.L.; Walters, W.A.; Lennon, N.J.; Bochicchio, J.; Krohn, A.; Caporaso, J.G.; Pennanen, T. Accurate Estimation of Fungal Diversity and Abundance through Improved Lineage-Specific Primers Optimized for Illumina Amplicon Sequencing. Appl. Environ. Microbiol. 2016, 82, 7217–7226.
17. White, T.J.; Bruns, T.; Lee, S.; Taylor, J. Amplification and direct sequencing of fungal ribosomal RNA genes for phylogenetics. In PCR Protocols: A Guide to Methods and Applications; Academic Press: Cambridge, MA, USA, 1990; pp. 315–322.
18. Takahashi, S.; Tomita, J.; Nishioka, K.; Hisada, T.; Nishijima, M. Development of a Prokaryotic Universal Primer for Simultaneous Analysis of Bacteria and Archaea Using Next-Generation Sequencing. PLoS ONE 2014, 9, e105592.
19. Don, R.H.; Cox, P.T.; Wainwright, B.J.; Baker, K.; Mattick, J.S. ‘Touchdown’ PCR to circumvent spurious priming during gene amplification. Nucleic Acids Res. 1991, 19, 4008.
20. Korbie, D.J.; Mattick, J.S. Touchdown PCR for increased specificity and sensitivity in PCR amplification. Nat. Protoc. 2008, 3, 1452.

Figure 1. Examples of sample sources tested in this study, including (A) roots and soils from a cotton field, (B,C) roots from pine forests, (D) leaves of grass, and (E,F) soil cores collected from a grass field.

Figure 2. Schematic diagram of the workflow of the two-step PCR protocol. The primers (forward: FGRSP_F1-F6, reverse: RGRSP_F1-F6) added in Step 2 include the “gene region-specific primer (GRSP)” (colored in green; e.g., ITS1F and ITS4); the “linker” region (orange), which contains two base pairs; the “frame shift” regions (pink), which are one to six randomized nucleotides (F1–F6); and the sequencing primers (blue) specific to the sequencing platform selected (e.g., Illumina). In Step 3, the forward primer (PCR_F) added includes sequencing primers and Illumina primers (purple). The reverse primer (PCR_R_bc_(X), X = barcode ID) has an additional 10 bp barcode region (yellow) that is used to identify the sample of origin of each read. PCR product clean-up was carried out after each PCR reaction (Steps 3 and 5). The oligonucleotides and sequences of Read1_seq and Read2_seq are provided to the sequencing facility upon sample submission. Synthetic oligonucleotide sequences are provided in Supplementary Tables S1–S3. Thermocycler programs are illustrated in Figure 3.

Figure 3. Thermocycler programs for two-step PCR (2P) workflow. * Decrease by 0.3 degree Celsius every cycle; ** We recommend using 15 cycles. However, if PCR yielded low DNA concentrations, the number of cycles can be increased up to 20.

Table 1. Cross comparison of the steps applied for two-step PCR (2P), three-step PCR (3P), and three-step PCR with bead clean-up (3P+cleanup) protocols.

Task Summary (Description)	2P (This Study): Step No.	3P (Chen et al., 2018): Step Included or Not	3P+Cleanup: Step Included or Not
DNA extraction	Step 1	Yes	Yes
PCR amplification (Amplify with FGRSP/RGRSP primers)	NA	Yes	Yes
PCR amplification (Amplify with FGRSP_F1-F6/RGRSP_F1-F6)	Step 2	Yes	Yes
Beads clean-up (1st PCR product clean-up)	Step 3	NA	Yes
PCR amplification (Amplify with PCR_F, PCR_R_bc (x))	Step 4	Yes	Yes
Beads clean-up (2nd PCR product clean-up)	Step 5	Yes	Yes
PCR product evaluation and multiplex	Step 6	Yes	Yes
Sequencing	Step 7	Yes	Yes

Note: Forward Gene Region-Specific Primer (FGRSP, e.g., ITS1F, 341F, and 515F) and Reverse Gene Region-Specific Primer (RGRSP, e.g., ITS2, ITS4, and 806R) can be replaced by forward and reverse primers of interest, respectively. x = barcode ID. Each of the 2P steps (Steps 1–7) is described in detail in Figure 2.

Table 2. PCR program differences between two-step PCR (2P) and three-step PCR (3P) protocols, including cycle number, annealing approach, and template concentration.

	Two-Step PCR (2P) (This Study)				Three-Step PCR (3P) (Chen et al., 2018)
Primer Sets	Cycle Number	Product Name	Annealing T (°C)	DNA Input (μL)	Cycle Number	Product Name	Annealing T. (°C)	DNA Input (μL)
FGRSP, RGRSP	N/A	N/A	N/A	N/A	10	3P_1st	Constant	0.5
FGRSP_F1-F6, RGRSP_F1-F6	15	2P_1st	Touchdown	1	10	3P_2nd	Constant	2.5
PCR_F, PCR_R_bc (x)	15	2P_2nd	Touchdown	1.6	10	3P_3rd	Constant	10

Note: Forward Gene Region-Specific Primer (FGRSP, e.g., ITS1F, 341F, and 515F) and Reverse Gene Region-Specific Primer (RGRSP, e.g., ITS2, ITS4, and 806R) can be replaced by forward and reverse primers of interest, respectively. x = barcode ID.

Table 3. PCR mix recipes for two-step PCR protocol.

2P_1st
Reagent	Volume (μL)	Note
10× PCR buffer	1.25
MgCl₂ (50 mM)	0.375
FGRSP_F1-F6 (10 μM)	0.25
RGRSP_F1-F6 (10 μM)	0.25
dNTP (10 mM)	0.25
Taq (5 U/μL)	0.05
Water	9.075
Master mix (subtotal)	11.5
DNA (1:50×) *	1	Add individually
Total	12.5

2P_2nd
Reagent	Volume (μL)	Note
10× PCR buffer	2.5
MgCl₂ (50 mM)	0.75
PCR_F primer (10 μM)	0.5
dNTP (10 mM)	0.5
Taq (5 U/μL)	0.08
Water	18.57
Master mix (subtotal)	22.9
Product from C_2P_1st	1.6	Add individually
Barcode_primer	0.5	Add individually
Total	25

Note: * Adjusted to appropriate concentration range 1–20 ng/μL. Sequences of the Forward/Reverse Gene Region-Specific Primers F1–F6 (FGRSP/RGRSP_F1–F6), PCR_F primer sequence are provided in Supplementary Table S1. The barcode sequences are provided in Supplementary Table S2.
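When preparing many reactions at once, the per-reaction volumes in Table 3 can be scaled into a master mix. The sketch below uses the Table 3 volumes as given; the 10% pipetting overage and all function and variable names are assumptions introduced here for illustration and are not part of the published protocol.

```python
# Per-reaction master-mix volumes (μL) from Table 3.
# Template DNA and barcode primer are added individually and are excluded.
MIX_2P_1ST = {
    "10x PCR buffer": 1.25,
    "MgCl2 (50 mM)": 0.375,
    "FGRSP_F1-F6 (10 uM)": 0.25,
    "RGRSP_F1-F6 (10 uM)": 0.25,
    "dNTP (10 mM)": 0.25,
    "Taq (5 U/uL)": 0.05,
    "Water": 9.075,
}
MIX_2P_2ND = {
    "10x PCR buffer": 2.5,
    "MgCl2 (50 mM)": 0.75,
    "PCR_F primer (10 uM)": 0.5,
    "dNTP (10 mM)": 0.5,
    "Taq (5 U/uL)": 0.08,
    "Water": 18.57,
}


def master_mix(recipe, n_reactions, overage=0.10):
    """Scale per-reaction volumes to n_reactions plus a pipetting overage.

    ASSUMPTION: the 10% default overage is illustrative, not from the protocol.
    """
    factor = n_reactions * (1 + overage)
    return {reagent: round(vol * factor, 3) for reagent, vol in recipe.items()}


for name, recipe in (("2P_1st", MIX_2P_1ST), ("2P_2nd", MIX_2P_2ND)):
    per_reaction = sum(recipe.values())
    print(f"{name}: {per_reaction:.3f} uL master mix per reaction")
    for reagent, volume in master_mix(recipe, n_reactions=24).items():
        print(f"  {reagent}: {volume} uL for 24 reactions")
```

The per-reaction sums reproduce the master-mix subtotals in Table 3 (11.5 μL and 22.9 μL), which is a quick consistency check before scaling.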

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
