1. Introduction
The advent of Next-Generation Sequencing (NGS) technology has enabled rapid, cost-effective microbiome profiling that can be used to study the impact of microbial community structure on health and disease. Sequence-based profiling of microbiomes, whether using 16S rRNA gene amplicons or shotgun methodology, depends on a lysis and DNA isolation method that is stringent enough to lyse all cells while not severely damaging the DNA. It is well established that lysis methods can be a major determinant of microbiome profile bias in studies performed with stool [
1,
2,
3,
4,
5,
6,
7] and oral samples [
8,
9,
10]. In order to obtain an unbiased representation of heterogeneous bacterial communities within a sample, the lysis method employed must be robust enough to lyse all microbes while at the same time preserving DNA quality. Gram-positive Firmicutes species tend to contain thicker cell walls, which makes them more difficult to lyse, and it is known that Firmicutes are under-represented by the HMP protocol. Gentle lysis methods will tend to under-represent Gram-positive bacteria, while more stringent bead-beating methods are difficult to automate and can introduce variability because they can rapidly damage the DNA of easily lysed microbes. The intensity of bead beating can vary depending on the position of a sample in the bead beater instrument, or because of minor variations in tube placement, as well as variations in bead beater duration and intensity. Gram-positive bacteria tend to have thicker cell walls with several layers of peptidoglycan that can vary in chemical structure [
11], biasing enzymatic/lysozyme lysis methods. The major structural differences between bacteria provide significant challenges to the development of a DNA extraction method with uniform lysis properties across complex bacterial communities.
We developed and tested a non-bead beating, non-enzymatic, novel ‘Rapid’ microbiome DNA extraction procedure suitable for 16S rRNA gene-based microbiome profiling applications. The ‘Rapid’ protocol was designed for uniform lysis of diverse populations of microbial cells, including difficult-to-lyse bacteria, by applying a unique, single alkaline/heat/lysis buffer combination to sample volumes of a few milligrams, which simplifies sample handling while rapidly supplying a sufficient high-quality DNA yield for both short and long amplicon rRNA PCR applications.
Alkaline lysis methods disrupt the bacterial cell by denaturing and solubilizing membrane components. Since 1979 [
12], alkaline lysis using sodium hydroxide (NaOH) has been widely used for isolation of plasmid DNA. In 2014, a potassium hydroxide (KOH) based DNA extraction method was shown to be effective in the target lysis of the Gram-positive genera
Bacillus,
Streptomyces,
Micromonospora,
Nonomuraea,
Microbispora, and
Staphylococcus [
13]. Unfortunately, KOH-based methods result in the rapid precipitation of commonly used lysis buffers, which is perhaps why KOH methods were not more widely used. Application of heat to precipitation-resistant lysis buffers in a basic KOH solution result in a simultaneous attack on the cell wall and membrane that may be more effective than independent treatments. Although alkaline protocols were previously demonstrated to effectively break down targeted Gram-positive complex cell walls, patent US 10,774,322 B2 [
14] for the ‘Rapid’ technique was granted for this method because there was previously no report exploring the potential benefit of an alkaline KOH DNA extraction method combined with heat and lysis buffer mixtures in 16S rRNA gene amplicon-based microbiome studies targeting the entire bacterial community.
In the present study, we evaluated a novel alkaline-based DNA extraction method utilizing a mock community to test the true representation of bacterial communities against various other commercially available lysis protocols. We further evaluated alkaline-based DNA extraction methods using stool samples from 20 human subjects to demonstrate the effectiveness of this protocol for 16S rRNA sequencing-based microbiome studies. Bacterial 16S rRNA gene sequence data from both mock communities and human fecal microbiomes were generated as V1V3, V4 amplicons for MiSeq sequencing (Illumina, San Diego, CA, USA), and full-length 16S rRNA V1V9 amplicons for Sequel sequencing (Pacific Biosciences, Menlo Park, CA, USA) for the in-depth microbiome analyses and comparisons.
2. Methods
2.1. Commercial Mock Community—ZymoBIOMICS Microbial Community Standard
The ZymoBIOMICS Microbial Community Standard containing three Gram-negative bacteria with relatively fragile cell walls, five Gram-positive bacteria with thicker cell walls, and two fungal species were tested. Theoretical composition based on genomic DNA consist of 12% Listeria monocytogenes, 12% Pseudomonas aeruginosa, 12% Bacillus subtilis, 12% Escherichia coli, 12% Salmonella enterica, 12% Lactobacillus fermentum, 12% Enterococcus faecalis, 12% Staphylococcus aureus, 2% Saccharomyces cerevisiae, and 2% Cryptococcus neoformans. Yeast genes were not included as part of the study and were absent from the microbiome profiles presented as part of this study.
2.2. Mock Community
A custom mock community consisting of 10 bacterial species was assembled (
Supplementary Table S1). Bacterial species were selected to include members of the phyla Bacteroidetes, Firmicutes, Proteobacteria, and Actinobacteria, all of which are commonly found in the human gut. Primer specificity was another consideration. Gut bacterial species with 0 to 3 bases mismatched to the 27F forward primer was included in the mock community. Based on these criteria,
Lactobacillus paracasei (isolate from our lab),
Enterococcus faecalis ATCC
® 47077
TM OG1RF,
Escherichia coli ATCC
® 700926
TMMG1655,
Bifidobacterium dentium ATCC
® 27678
TM,
Bacteroides vulgatus ATCC
® 8482
TM,
Prevotella oralis ATCC
® 33269
TM,
Bacteroides caccae ATCC
® 43185
TM,
Prevotella copri DSM18205,
Ruminococcus lactaris ATCC
® 29176
TM, and
Faecalibacterium prausnitzii (isolate from our lab) were selected. The commercially available mock community tested using the V4 amplicon was obtained from Zymobiomics, containing
Pseudomonas aeruginosa, Escherichia coli,
Salmonella enterica,
Listeria monocytogenes,
Bacillus subtilis,
Lactobacillus fermentum,
Staphylococcus aureus, and
Enterococcus faecalis. The mock also contained DNA from the eukaryotes
Saccharomyces cerevisiae and
Cryptococcus neoformans. To make the mock community, bacterial cells were freshly cultured and counted using a Neubauer Chamber hemocytometer and diluted to contain equal numbers of cells (10
7 cells uL
−1), and then normalized for 16S rRNA gene copy numbers (4~7) in order to include an equal number of 16S rRNA gene copies for each taxon. Normalized bacterial cell pools were centrifuged (5000×
g), and pellets were used for 6 different DNA extraction methods.
2.3. Human Subject Fecal Sample Collection
Fecal samples from 20 subjects were included in this study. Subjects were enrolled at the Seoul National University, under an IRB-approved protocol. Fecal samples were collected from 10 hospitalized individuals undergoing treatment for lung cancer and 10 control subjects. Fecal samples were stored at −80 °C until processing.
2.4. DNA Extraction
Six DNA extraction methods that employ a variety of lysis steps were evaluated. Methods included the novel Rapid protocol (chemical only), a new bead pasting protocol (mechanical only), and four commercially available protocols that included combinations of enzymatic, chemical, and mechanical methods, one of which was used as a standard protocol in the HMP (Qiagen PowerSoil kit). Two of the four commercial kits included enzyme-based lysis (Epicentre MasterPure and QiaAmp Stool), and two included mechanical bead beading (Qiagen PowerSoil kit and Zymo DNA/RNA Mini). Samples were purified in duplicate for each DNA extraction method. All six DNA extraction protocols were used to evaluate the custom mock community, which contains organisms with widely varying resistance to lysis. A Zymobiomics commercial mock community was processed using the Rapid protocol, and results were compared to the V4 amplicon results published by Zymobiomics. Human fecal samples were processed with the Rapid protocol and HMP protocols to determine whether the increased Firmicutes lysis seen in the mock translates to increases in Firmicutes representation in complex microbiome samples. Since the yield of extracted DNA varied according to the cell input amounts according to sample type and manufacturer’s instructions, and recommended input varied over 100-fold across mock microbiome and fecal material, individual DNA extractions were considered successful if the PCR resulted in sufficient amplicon for sequencing.
2.5. Novel ‘Rapid’ KOH Alkaline Based Protocol (K)
DNA was extracted using the DNA Purification and 16S rRNA Amplification Kit (Shoreline Biome, Farmington, CT, USA) as per the manufacturer’s instructions. Briefly, 1–3 mg of fecal material or mock microbiome was picked up on a 1 µL calibrated inoculating loop and dispersed into 50 µL lysis buffer by briefly twisting the loop, and 50 µL KOH solution was added. Samples were heated to 95 °C for 5 min and cooled until precipitate formed. The precipitate was pelleted, and the supernatant containing lysed DNA was added to 50 µL of a purification buffer. Samples were incubated at 50 °C to bind DNA to capture beads. Capture beads were washed 2× with 70% ethanol, and DNA was eluted in 40 µL TE. An amount of 160 µL TE was added to the diluted DNA for a final volume of 200 µL DNA. Diluted DNA (10 µL) was used in PCR for all amplicons.
2.6. New Bead Pasting Protocol (B)
Samples were processed according to the manufacturer’s instructions using the Bead Beater Lysis Kit (Shoreline Biome). Cell pellets or 1–3 mg fecal samples were suspended in a 50 µL bead beater buffer, and a 15 µL aliquot was added to the tube containing the bead mixture. The sample was subjected to bead beating for 60 s, after which the sample was diluted in 100 µL TE. The sample was vortexed briefly, and the supernatant containing the DNA was transferred to a purification buffer. After a 5 min incubation, the beads containing DNA were captured by a magnet and washed 2× with 70% ethanol. DNA was eluted from capture beads in 40 µL TE. The sample was further diluted 1:5 in TE prior to use in PCR.
2.7. MasterPure Complete DNA and RNA Purification Kit (E)
DNA was extracted using the MasterPure Complete DNA and RNA Purification Kit (Epicentre) as per the manufacturer’s instructions. The provided 50 µg/µL proteinase K and tissue and cell lysis solution was mixed with bacterial cells and incubated at 65 °C for 15 min. Cell debris was pelleted by centrifugation, followed by DNA precipitation using the provided MPC Protein Precipitation Reagent. DNA was pelleted by centrifugation and rinsed with isopropanol and 70% ethanol. DNA was resuspended in TE buffer.
2.8. Qiagen PowerSoil Kit (HMP)
This protocol has been used to process stool samples for the HMP [
15] and yielded strain-level microbiome taxonomic resolution for a few genera [
16]. As per the HMP recommended protocol, samples were pre-treated at 65 °C for 10 min and then at 95 °C for 10 min. After pre-treatment, samples were processed as per the manufacturer’s instructions. Briefly, the pre-heated fecal suspension was transferred to PowerBead Tubes and mixed with C1 lysis solution. Bacterial cells were subjected to mechanical bead beating using a MO BIO Vortex Adapter tube holder for 10 min, after which cell debris was pelleted by centrifugation. Proteins were precipitated from the supernatant by mixing and incubating with buffer C2 at 4 °C and pelleted by centrifugation. Supernatants were next incubated with buffer C3 at 4 °C to remove inhibitors. Column binding buffer C4 was added, and supernatants were applied to spin columns and centrifuged to bind DNA. The columns were washed by centrifugation with buffer C5 and eluted by centrifugation in buffer C6.
2.9. QIAamp DNA Stool Kit (Q)
DNA was extracted using the QIAamp DNA Stool Kit as per the manufacturer’s instructions. The provided lysis butter ASL was mixed with samples, vortexed to suspend cells, and incubated at 70 °C for 5 min. Debris was pelleted by centrifugation. InhibitEX tablets provided with the kit were incubated with samples, and debris was pelleted by centrifugation. Proteinase K was added to the supernatant, and after 10 min incubation at 70 °C, samples were centrifuged to pellet debris. The lysate was applied to spin columns, and the columns were washed by centrifugation with AW1 and then by AW2. Following an additional spin to remove residual buffer, DNA was eluted by centrifugation in buffer AE (elution buffer) provided in the kit.
2.10. ZymoBIOMICSTM DNA/RNA Mini Kit (Z)
The standard procedure recommended by ZymoBIOMICS was followed. Briefly, bacterial cells were lysed by bead beating using a DNA/RNA Shield Lysis Tube, followed by centrifugation to pellet debris. The supernatant was mixed with lysis buffer, transferred to a spin column, and centrifuged. Bound DNA was washed by centrifugation once with DNA/RNA Prep Buffer and twice with DNA/RNA Wash Buffer. DNA was eluted by centrifugation with DNase/RNase-free water provided in the kit. A Zymo-Spin III-HRC Filter was prepared by centrifugation with HRC Prep Solution, the eluted DNA was transferred to the filter and recovered by centrifugation. Eluted DNA was used in PCR.
2.11. Library Construction and 16S rRNA Amplicon-Based Sequencing
Three different amplicons from different regions of the 16S rRNA gene were prepared from bacterial DNA purified from custom mock and human fecal samples. Amplicons ranged in size from the 292 base V4 region, the 526 base V1V3 region, and the 1506 base V1V9 amplicon that included all variable regions of the 16S rRNA gene (
Figure S1). PCR was performed as per the manufacturer’s instructions (Shoreline Biome). Briefly, 10 μL of extracted DNA (2 ng/μL) from each sample was added to the 96-well plate containing dried PCR primers with barcodes. An amount of 10 µL 2× PCR mix was added to each well. PCR conditions were unique to each amplicon, as described in
Table S2. Primer sequences for V1V3, V4, and V1V9 amplicons that were used for PCR in custom mock experiments are detailed in
Table S1. Amplicons V1V3 and V4 were sequenced on the MiSeq (Illumina), and V1V9 was sequenced on the Sequel (Pacific Bioscience) instrument.
2.12. Sequence Pre-Processing and Classification
V1V3 and V4 sequence reads generated using the Illumina platform were processed by removing the sequences with low quality (average qual < 25) and ambiguous codons (N’s). Chimeric amplicons were removed using UChime software (Ver 6.0.307). Sequence reads were analyzed using the mothur phylotype pipeline SOP [
17] by direct classification. V1V9 sequencing reads were generated using the PacBio platform. Circular consensus (ccs) reads were created from the raw reads using PacBio software (Ver 3.4.0) with standard cutoffs of 3 passes and a minimum of 90% accuracy. The mean number of passes was much higher than 3 for most reads because the amplicons were less than ~1500 bases, so the resulting base calling accuracy was generally over 99.9%. PacBio sequences were demultiplexed and classified using SBanalyzer (Ver 2.2-3), a GUI-based software package. A custom algorithm was used to identify sample-specific DNA barcodes on each read, trim barcodes from each read, and sort the reads into individual files based on the barcodes. A custom reference database called ‘Athena’ was built for classification, which contains contiguous 16S–23S gene regions from organisms sequenced and assembled such that individual 16S genes could be positioned in the assembly. The mapping algorithm was based on BLASTplus v2.8.0 (
ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ (accessed on 13 December 2018)). The mock community was analyzed at the species level, and human stool samples were analyzed at the genus level.
2.13. Statistical Analyses
Differences between lysis methods HMP and K applied to human stool samples were tested and evaluated using the paired t-test for alpha-diversity and Analysis of Molecular Variance (AMOVA) for beta-diversity analyses. Differentially abundant taxa depicted in heatmaps and significant driver taxa in non-metric multidimensional scaling (NMDS) plots were selected using the Wilcoxon Signed Rank test based on their relative abundances. The Benjamini–Hochberg false discovery rate method was used for multiple comparison adjustments.
2.14. Data Sharing
Sequences were submitted to the Sequence Reads Archive (PRJNA531279).
4. Discussion
In this study, we benchmark a novel ‘rapid’ alkaline microbiome DNA extraction procedure ‘K’ that does not contain bead beating or enzymatic methods to assess suitability for 16S rRNA gene or other amplicon-based microbiome survey applications.
Bead beating is a time-consuming process that is not easily automated, can vary in effectiveness depending on how it is implemented, and has the potential to degrade the DNA of the easily lysed organisms while missing DNA from more difficult Gram-positive bacteria. Digestion with proteases and/or lysozyme adds time, complexity, and variability to DNA preparation methods because bacteria can vary widely in susceptibility to enzymatic digestion. Effective lysis methods that avoid bead beating and enzymatic steps have the potential to eliminate costs associated with bead beater hardware, reagents and disposables, and enzyme costs and reduce lysis variability, decrease time to result, and enable high throughput handling and/or automation.
A comparison of multiple commercial methods and a commonly used ‘HMP’ method to the novel ‘K’ method clearly demonstrated that bead beating and enzymatic digestion are not sufficient for comprehensive bacterial lysis. In fact, the ‘K’ method improved the recovery of Gram-positive Firmicutes as compared to multiple commercial methods for both a custom mock community as well as human fecal samples. Although mock communities can be useful for assessing lysis effectiveness, it can be difficult to determine whether any given lysis method is universally effective using a mock, because not all microbes can be present in a mock microbiome, and the susceptibility of different strains to lysis conditions can vary. For example, method ‘K’ yielded results similar to both the expected profile and the manufacturer’s optimized protocol for a commercially available mock community based on the proportions of microbes measured using the V4 16S rRNA gene amplicon and Illumina sequencing. However, it was apparent from similar tests with V1V3 and V1V9 amplicon that the choice of PCR primers and amplicon, lysis methods, and bacteria selected for the mock community can be sources of variability in the microbial DNA profile. To isolate and study the effects of lysis methods, the relative performance of multiple lysis and DNA purification methods was assessed by sequencing the purified DNA from a custom mock community and fecal samples using multiple PCR amplicons and sequencing technologies. A custom mock microbiome containing selected Gram-positive Firmicutes strains that were known to be difficult to lyse demonstrated that the ‘K’ method provided an improved representation of Firmicutes. Additional comparisons of the ‘K’ and ‘HMP’ methods on twenty human fecal samples demonstrated that the microbial profiles of specific bacteria depended on the lysis method and that the ‘K’ method yielded improved representation of multiple Firmicutes, which is in agreement with the custom mock community results. Representation of certain Actinobacteria species was also improved by the ‘K’ method, demonstrating that not all difficult-to-lyse bacteria are Firmicutes.
Furthermore, 16S rRNA amplicon-based sequencing profiling of microbial populations remains popular because it is cost-effective, and taxonomic identification of sequencing reads can be facilitated by multiple well-understood data analysis tools. These tools enable rapid mapping of sequenced data against large datasets such as the RDP [
18], SILVA [
19], GreenGenes [
20], EasyTaxon [
21], and Athena databases used in the present study. It should be possible to extend this methodology to other amplicon-based targets such as fungi and other non-bacterial organisms in the community. Thus, a rapid, comprehensive lysis solution for amplicon-based analysis is a useful addition to reduce costs, variability, and time to result in microbiome profiling.
Lysis and DNA purification is arguably one of the most important steps in sequencing-based microbiome profiling because bacteria need to be lysed to release the DNA for sequencing. Unfortunately, lysis is also one of the most variable steps in microbiome profiling, because minor variations in standard methodologies can result in under-treatment (un-lysed organisms) or over-treatment (DNA damage). Contributions to the variability of lysis include differential resistance of cell wall and cell membrane structures to lysis methods such as protease and lysozyme treatment, as well as variability inherent in physical processes like bead beating. In order to determine which protocols provide the best DNA extraction outcomes, where the DNA recovered from bacterial lysis closely represents the original bacterial community, several different DNA extraction methods that employ lysozyme [
22], proteinase K [
23], physical disruption methods such as bead beating [
15] and temperature change [
24], chemicals [
25], or combinations of each [
26] were tested. Lysozymes can be used to target bacterial cell walls, but different lysozymes have different effectiveness against the variety of bacterial cell wall chemistries and layers of phospholipid, peptidoglycan, and protein that bacteria employ. Physical disruption can promote lysis of Gram-positive bacteria with tougher cell walls, but bead beating suffers from sample-to-sample variability, is not easily scalable for high throughput sample processing or automation, damages DNA [
27], and negatively affects recovery from low-input samples. It can be difficult to standardize results across different microbiome investigations because each sample type requires independently optimized DNA extraction methods based on the sample type for optimal yield [
1,
28]. In this report, we assess different methods, including an alternative alkaline DNA extraction method for 16S rRNA amplicon sequencing strategies that were designed to eliminate variable processes such as bead beating and enzyme treatment, increase lysis efficiency for difficult bacteria, and enable high throughput applications by reducing the number of steps, decreasing handling time, and improving efficiency.
In summary, we benchmark a novel K DNA extraction protocol that avoids bead beating and enzymatic treatments, while at the same time demonstrating improved performance compared to commonly used DNA lysis and purification methods for the accurate representation of mock communities and human fecal gut microbiome samples. We conclude that the novel ‘K’ DNA extraction protocol offers a reliable alternative for preparing fecal specimens for 16S rRNA gene amplicon sequencing that maintains the representation of microbial populations in a sample.