Deciphering Gut Microbiome in Colorectal Cancer via Robust Learning Methods
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Source and Processing
2.2. Statistical Methods
3. Results
3.1. Gut Microbial Diversity in Colorectal Cancer
3.1.1. Differences in Alpha Diversity Between the CRC and Healthy Groups
3.1.2. Differences in Beta Diversity Between the CRC and Healthy Groups
3.2. Differentially Abundant Taxa in Colorectal Cancer
3.2.1. Exploratory Analysis
3.2.2. Differential Abundance Analysis
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
Study | Cases | Controls | Region | Sequencer | 16S Region |
---|---|---|---|---|---|
Baxter (2016) [22] | 120 | 172 | USA | MiSeq | V4 |
Chen (2012) [23] | 21 | 22 | China | 454 | V1–V3 |
Wang (2012) [24] | 44 | 54 | China | 454 | V3 |
Zeller (2014) [26] | 41 | 75 | France | MiSeq | V4 |
Zackular (2014) [25] | 30 | 30 | USA | MiSeq | V4 |
Method | Model Formula | Key Features |
---|---|---|
LEfSe | Compares taxon abundance between healthy and diseased individuals | Linear discriminant analysis for effect size estimation; non-parametric tests (Kruskal–Wallis/Wilcoxon rank sum) |
ANCOM-BC | Terms: count of taxon j in individual i from group k (healthy or diseased); effect of the sampling fraction for individual i from group k; effect of the true count of taxon j for an individual from group k | Corrects bias introduced by differences in sampling fractions; provides a statistically valid test; provides confidence intervals for the effect size of each taxon; controls the false discovery rate and maintains adequate power |
sccomp | Terms: count of taxon j in individual i; baseline count (healthy); effect of disease on the count of taxon j | Count follows a sum-constrained beta-binomial distribution; parameters estimated by Hamiltonian Monte Carlo via Bayesian inference; captures the mean–variability relationship of microbial abundances; developed for single-cell data and adapted for microbiome data |
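As a rough illustration of the non-parametric testing step listed for LEfSe in the table above, the following Python sketch runs a two-sided Wilcoxon rank-sum test on one taxon's relative abundances in healthy and diseased samples. It is a minimal, standard-library-only sketch under stated assumptions: the function names (`average_ranks`, `rank_sum_test`) and the toy abundances are hypothetical, the p-value uses the normal approximation with a tie correction, and nothing here reproduces LEfSe's linear discriminant analysis step or the authors' actual pipeline.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(healthy, diseased):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (U statistic of the healthy group, p-value).
    """
    n1, n2 = len(healthy), len(diseased)
    n = n1 + n2
    pooled = list(healthy) + list(diseased)
    ranks = average_ranks(pooled)

    # U statistic for the healthy group from its rank sum.
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2

    # Variance of U with the standard correction for tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:  # all values identical: no evidence of a difference
        return u1, 1.0

    z = (u1 - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # 2 * (1 - Phi(|z|))
    return u1, p_value


# Hypothetical relative abundances of a single taxon (illustration only).
healthy_abundance = [0.01, 0.02, 0.03, 0.04, 0.05]
diseased_abundance = [0.10, 0.20, 0.30, 0.40, 0.50]
u_stat, p = rank_sum_test(healthy_abundance, diseased_abundance)
print(f"U = {u_stat}, two-sided p = {p:.4f}")
```

For small groups an exact test is usually preferable to the normal approximation; the approximation is used here only to keep the sketch dependency-free.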
Taxa | Estimate | Lower CI | Upper CI | p-Value | q-Value |
---|---|---|---|---|---|
Porphyromonas | −0.346 | −0.353 | −0.313 | <0.001 | <0.001 |
Lactococcus | −0.233 | −0.281 | −0.132 | 0.001 | 0.006 |
Anaerostipes | 2.633 | 2.520 | 2.659 | <0.001 | <0.001 |
Clostridium_XlVb | −0.725 | −1.145 | −0.512 | <0.001 | <0.001 |
Peptostreptococcus | −0.218 | −0.280 | −0.091 | <0.001 | <0.001 |
Acetanaerobacterium | −0.367 | −0.366 | −0.366 | <0.001 | <0.001 |
Pseudoflavonifractor | −0.273 | −0.299 | −0.214 | 0.001 | 0.003 |
Fusobacterium | −0.229 | −0.286 | −0.101 | <0.001 | <0.001 |
Desulfovibrio | −0.242 | −0.284 | −0.159 | <0.001 | <0.001 |
Taxa | Estimate | Lower CI | Upper CI | p-Value | q-Value |
---|---|---|---|---|---|
Porphyromonas | 1.449 | 1.095 | 1.803 | <0.001 | <0.001 |
Gemella | 0.326 | 0.100 | 0.553 | 0.004 | 0.024 |
Anaerostipes | −0.581 | −0.818 | −0.343 | <0.001 | <0.001 |
Clostridium_XlVb | 1.031 | 0.737 | 1.324 | <0.001 | <0.001 |
Peptostreptococcus | 0.487 | 0.248 | 0.726 | <0.001 | <0.001 |
Acetanaerobacterium | 0.961 | 0.703 | 1.219 | <0.001 | <0.001 |
Pseudoflavonifractor | 0.338 | 0.103 | 0.572 | 0.005 | 0.024 |
Ruminococcus | −0.577 | −1.002 | −0.152 | 0.008 | 0.036 |
Holdemania | 0.319 | 0.121 | 0.516 | 0.002 | 0.010 |
Fusobacterium | 1.063 | 0.567 | 1.560 | <0.001 | <0.001 |
Desulfovibrio | 0.929 | 0.520 | 1.337 | <0.001 | <0.001 |
Acinetobacter | 0.549 | 0.223 | 0.874 | 0.001 | 0.007 |
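The tables report a q-value alongside each p-value, but the surviving text does not say which multiple-testing adjustment produced them. Purely as an illustration of how such a column can be derived, the sketch below applies the Benjamini–Hochberg procedure, which is one common choice and is assumed here rather than taken from the source. The function name `benjamini_hochberg` and the input p-values are hypothetical, and because the tables list only a subset of the tested taxa, the sketch cannot be used to recompute the reported q-values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that the adjusted values
    # stay monotone in the rank of the raw p-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four taxa (illustration only).
raw_p = [0.01, 0.04, 0.03, 0.005]
for p, q in zip(raw_p, benjamini_hochberg(raw_p)):
    print(f"p = {p:.3f} -> adjusted = {q:.3f}")
```

The adjustment depends on the total number of tests `m`, which is why adjusted values computed on a filtered list of significant taxa would differ from those computed on the full set.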
Taxa | Estimate | Lower CI | Upper CI | p-Value | q-Value |
---|---|---|---|---|---|
Fusobacterium | 0.772 | 0.279 | 1.272 | 0.014 | 0.006 |
Olsenella | 1.253 | 0.547 | 1.901 | 0.002 | 0.003 |
Peptostreptococcus | 1.371 | 0.698 | 2.044 | 0.001 | 0.002 |
References
1. Sung, H.; Ferlay, J.; Siegel, R.L.; Laversanne, M.; Soerjomataram, I.; Jemal, A.; Bray, F. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 2021, 71, 209–249.
2. Jemal, A.; Bray, F.; Center, M.M.; Ferlay, J.; Ward, E.; Forman, D. Global cancer statistics. CA Cancer J. Clin. 2011, 61, 69–90.
3. Nakatsu, G.; Li, X.; Zhou, H.; Sheng, J.; Wong, S.H.; Wu, W.K.K.; Ng, S.C.; Tsoi, H.; Dong, Y.; Zhang, N.; et al. Gut mucosal microbiome across stages of colorectal carcinogenesis. Nat. Commun. 2015, 6, 8727.
4. Yachida, S.; Mizutani, S.; Shiroma, H.; Shiba, S.; Nakajima, T.; Sakamoto, T.; Watanabe, H.; Masuda, K.; Nishimoto, Y.; Kubo, M.; et al. Metagenomic and metabolomic analyses reveal distinct stage-specific phenotypes of the gut microbiota in colorectal cancer. Nat. Med. 2019, 25, 968–976.
5. Yang, L.; Li, A.; Wang, Y.; Zhang, Y. Intratumoral microbiota: Roles in cancer initiation, development and therapeutic efficacy. Signal Transduct. Target. Ther. 2023, 8, 35.
6. Beghini, F.; McIver, L.J.; Blanco-Míguez, A.; Dubois, L.; Asnicar, F.; Maharjan, S.; Mailyan, A.; Manghi, P.; Scholz, M.; Thomas, A.M.; et al. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. eLife 2021, 10, e65088.
7. Feng, Q.; Liang, S.; Jia, H.; Stadlmayr, A.; Tang, L.; Lan, Z.; Zhang, D.; Xia, H.; Xu, X.; Jie, Z.; et al. Gut microbiome development along the colorectal adenoma–carcinoma sequence. Nat. Commun. 2015, 6, 6528.
8. Xiao, L.; Zhang, F.; Zhao, F. Large-scale microbiome data integration enables robust biomarker identification. Nat. Comput. Sci. 2022, 2, 307–316.
9. Ling, W.; Lu, J.; Zhao, N.; Lulla, A.; Plantinga, A.M.; Fu, W.; Zhang, A.; Liu, H.; Song, H.; Li, Z.; et al. Batch effects removal for microbiome data via conditional quantile regression. Nat. Commun. 2022, 13, 5418.
10. Yang, L.; Chen, J. A comprehensive evaluation of microbial differential abundance analysis methods: Current status and potential solutions. Microbiome 2022, 10, 130.
11. Segata, N.; Izard, J.; Waldron, L.; Gevers, D.; Miropolsky, L.; Garrett, W.S.; Huttenhower, C. Metagenomic biomarker discovery and explanation. Genome Biol. 2011, 12, R60.
12. Lin, H.; Peddada, S.D. Analysis of compositions of microbiomes with bias correction. Nat. Commun. 2020, 11, 3514.
13. Mangiola, S.; Roth-Schulze, A.J.; Trussart, M.; Zozaya-Valdés, E.; Ma, M.; Gao, Z.; Papenfuss, A.T. sccomp: Robust differential composition and variability analysis for single-cell data. Proc. Natl. Acad. Sci. USA 2023, 120, e2203828120.
14. Duvallet, C.; Gibbons, S.M.; Gurry, T.; Irizarry, R.A.; Alm, E.J. Meta-analysis of gut microbiome studies identifies disease-specific and shared responses. Nat. Commun. 2017, 8, 1784.
15. Duvallet, C.; Gibbons, S.; Gurry, T.; Irizarry, R.; Alm, E. MicrobiomeHD: The human gut microbiome in health and disease. Zenodo 2017.
16. Gurry, T.; Alm Lab. Alm Lab’s In-House 16S Processing Pipeline. Available online: https://github.com/thomasgurry/amplicon_sequencing_pipeline (accessed on 10 February 2025).
17. Schloss, P.D.; Westcott, S.L.; Ryabin, T.; Hall, J.R.; Hartmann, M.; Hollister, E.B.; Weber, C.F. Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl. Environ. Microbiol. 2009, 75, 7537–7541.
18. Caporaso, J.G.; Kuczynski, J.; Stombaugh, J.; Bittinger, K.; Bushman, F.D.; Costello, E.K.; Knight, R. QIIME allows analysis of high-throughput community sequencing data. Nat. Methods 2010, 7, 335–336.
19. Quast, C.; Pruesse, E.; Yilmaz, P.; Gerken, J.; Schweer, T.; Yarza, P.; Peplies, J.; Glöckner, F.O. The SILVA ribosomal RNA gene database project: Improved data processing and web-based tools. Nucleic Acids Res. 2013, 41, D590–D596.
20. Cole, J.R.; Wang, Q.; Fish, J.A.; Chai, B.; McGarrell, D.M.; Sun, Y.; Tiedje, J.M. Ribosomal Database Project: Data and tools for high throughput rRNA analysis. Nucleic Acids Res. 2014, 42, D633–D642.
21. Altschul, S.F.; Gish, W.; Miller, W.; Myers, E.W.; Lipman, D.J. Basic local alignment search tool. J. Mol. Biol. 1990, 215, 403–410.
22. Baxter, N.T.; Ruffin, M.T., IV; Rogers, M.A.M.; Schloss, P.D. Microbiota-based model improves the sensitivity of fecal immunochemical test for detecting colonic lesions. Genome Med. 2016, 8, 37.
23. Chen, W.; Liu, F.; Ling, Z.; Tong, X.; Xiang, C. Human Intestinal Lumen and Mucosa-Associated Microbiota in Patients with Colorectal Cancer. PLoS ONE 2012, 7, e39743.
24. Wang, T.; Cai, G.; Qiu, Y.; Fei, N.; Zhang, M.; Pang, X.; Jia, W.; Cai, S.; Zhao, L. Structural segregation of gut microbiota between colorectal cancer patients and healthy volunteers. ISME J. 2012, 6, 320–329.
25. Zackular, J.P.; Rogers, M.A.M.; Ruffin, M.T., IV; Schloss, P.D. The Human Gut Microbiome as a Screening Tool for Colorectal Cancer. Cancer Prev. Res. 2014, 7, 1112–1121.
26. Zeller, G.; Tap, J.; Voigt, A.Y.; Sunagawa, S.; Kultima, J.R.; Costea, P.I.; Amiot, A.; Böhm, J.; Brunetti, F.; Habermann, N.; et al. Potential of fecal microbiota for early-stage detection of colorectal cancer. Mol. Syst. Biol. 2014, 10, 766.
27. Leek, J.T.; Scharpf, R.B.; Bravo, H.C.; Simcha, D.; Langmead, B.; Johnson, W.E.; Geman, D.; Baggerly, K.; Irizarry, R.A. Tackling the widespread and critical impact of batch effects in high-throughput data. Nat. Rev. Genet. 2010, 11, 733–739.
28. Fu, C.; Lu, J.; Zhao, N.; Ling, W. MOSAIC: A Pipeline for MicrobiOme Studies Analytical Integration and Correction. bioRxiv 2024.
29. Aitchison, J. The Statistical Analysis of Compositional Data. J. R. Stat. Soc. B 1982, 44, 139–177.
30. Martín-Fernández, J.-A.; Hron, K.; Templ, M.; Filzmoser, P.; Palarea-Albaladejo, J. Bayesian-Multiplicative Treatment of Count Zeros in Compositional Data Sets. Math. Geosci. 2015, 15, 2.
31. Gower, J.C. Some distance properties of latent root and vector methods used in multivariate analysis. Biometrika 1966, 53, 325–338.
32. Anderson, M.J. A new method for non-parametric multivariate analysis of variance. Austral Ecol. 2001, 26, 32–46.
33. Zhao, N.; Chen, J.; Carroll, I.M.; Ringel-Kulka, T.; Epstein, M.P.; Zhou, H.; Wu, M.C. Testing in microbiome-profiling studies with MiRKAT, the microbiome regression-based kernel association test. Am. J. Hum. Genet. 2015, 96, 797–807.
34. Carpenter, B.; Gelman, A.; Hoffman, M.D.; Lee, D.; Goodrich, B.; Betancourt, M.; Brubaker, M.; Guo, J.; Li, P.; Riddell, A. Stan: A probabilistic programming language. J. Stat. Softw. 2017, 76, 1–32.
35. Stephens, M. False discovery rates: A new deal. Biostatistics 2017, 18, 275–294.
36. Gu, Z.; Eils, R.; Schlesner, M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics 2016, 32, 2847–2849.
37. Gu, Z. Complex heatmap visualization. iMeta 2022, 1, e43.
38. Mangiola, S.; Papenfuss, A.T. tidyHeatmap: An R package for modular heatmap production based on tidy principles. J. Open Source Softw. 2020, 5, 2472.
39. Silva, M.; Brunner, V.; Tschurtschenthaler, M. Microbiota and colorectal cancer: From gut to bedside. Front. Pharmacol. 2021, 12, 760280.
40. Ahn, J.; Sinha, R.; Pei, Z.; Dominianni, C.; Wu, J.; Shi, J.; Goedert, J.J.; Hayes, R.B.; Yang, L. Human gut microbiome and risk for colorectal cancer. J. Natl. Cancer Inst. 2013, 105, 1907–1911.
41. Ai, D.; Pan, H.; Li, X.; Gao, Y.; Liu, G.; Xia, L.C. Identifying gut microbiota associated with colorectal cancer using a zero-inflated lognormal model. Front. Microbiol. 2019, 10, 826.
42. Thomas, A.M.; Manghi, P.; Asnicar, F.; Pasolli, E.; Armanini, F.; Zolfo, M.; Beghini, F.; Manara, S.; Karcher, N.; Pozzi, C.; et al. Metagenomic analysis of colorectal cancer datasets identifies cross-cohort microbial diagnostic signatures and a link with choline degradation. Nat. Med. 2019, 25, 667–678.
43. Loftus, M.; Hassouneh, S.A.-D.; Yooseph, S. Bacterial community structure alterations within the colorectal cancer gut microbiome. BMC Microbiol. 2021, 21, 98.
44. Liu, J.; Huang, X.; Chen, C.; Wang, Z.; Huang, Z.; Qin, M.; He, F.; Tang, B.; Long, C.; Hu, H.; et al. Identification of colorectal cancer progression-associated intestinal microbiome and predictive signature construction. J. Transl. Med. 2023, 21, 373.
45. Zhang, H.; Jin, K.; Xiong, K.; Jing, W.; Pang, Z.; Feng, M.; Cheng, X. Disease-associated gut microbiome and critical metabolomic alterations in patients with colorectal cancer. Cancer Med. 2023, 12, 12889–12902.
46. Gethings-Behncke, C.; Coleman, H.G.; Jordao, H.W.T.; Longley, D.B.; Crawford, N.; Murray, L.J.; Kunzmann, A.T. Fusobacterium nucleatum in the Colorectum and Its Association with Cancer Risk and Survival: A Systematic Review and Meta-analysis. Cancer Epidemiol. Biomark. Prev. 2020, 29, 539–548.
47. Kostic, A.D.; Chun, E.; Robertson, L.; Glickman, J.N.; Gallini, C.A.; Michaud, M.; Clancy, T.E.; Chung, D.C.; Lochhead, P.; Hold, G.L.; et al. Fusobacterium nucleatum potentiates intestinal tumorigenesis and modulates the tumor immune microenvironment. Cell Host Microbe 2013, 14, 207–215.
48. Rubinstein, M.R.; Baik, J.E.; Lagana, S.M.; Han, R.P.; Raab, W.J.; Sahoo, D.; Dalerba, P.; Wang, T.C.; Han, Y.W. Fusobacterium nucleatum promotes colorectal cancer by inducing Wnt/β-catenin modulator Annexin A1. EMBO Rep. 2019, 20, e47638.
49. Long, X.; Wong, C.C.; Tong, L.; Chu, E.S.H.; Szeto, C.H.; Go, M.Y.Y.; Coker, O.O.; Chan, A.W.H.; Chan, F.K.L.; Sung, J.J.Y.; et al. Peptostreptococcus anaerobius promotes colorectal carcinogenesis and modulates tumour immunity. Nat. Microbiol. 2019, 4, 2319–2330.
50. Cheng, Y.; Ling, Z.; Li, L. The Intestinal Microbiota and Colorectal Cancer. Front. Immunol. 2020, 11, 615056.
51. Fang, C.Y.; Chen, J.S.; Hsu, B.M.; Hussain, B.; Rathod, J.; Lee, K.H. Colorectal Cancer Stage-Specific Fecal Bacterial Community Fingerprinting of the Taiwanese Population and Underpinning of Potential Taxonomic Biomarkers. Microorganisms 2021, 9, 1548.
52. Ohland, C.L.; Macnaughton, W.K. Probiotic bacteria and intestinal epithelial barrier function. Am. J. Physiol. Gastrointest. Liver Physiol. 2010, 298, G807–G819.
53. Okumura, S.; Konishi, Y.; Narukawa, M.; Sugiura, Y.; Yoshimoto, S.; Arai, Y.; Sato, S.; Yoshida, Y.; Tsuji, S.; Uemura, K.; et al. Gut bacteria identified in colorectal cancer patients promote tumourigenesis via butyrate secretion. Nat. Commun. 2021, 12, 5674.
54. Silva, Y.P.; Bernardi, A.; Frozza, R.L. The Role of Short-Chain Fatty Acids From Gut Microbiota in Gut-Brain Communication. Front. Endocrinol. 2020, 11, 25.
55. Blaak, E.E.; Canfora, E.E.; Theis, S.; Frost, G.; Groen, A.K.; Mithieux, G.; Nauta, A.; Scott, K.; Stahl, B.; van Harsselaar, J.; et al. Short chain fatty acids in human gut and metabolic health. Benef. Microbes 2020, 11, 411–455.
56. Liu, G.; Tang, J.; Zhou, J.; Dong, M. Short-chain fatty acids play a positive role in colorectal cancer. Discov. Oncol. 2024, 15, 425.
57. Cong, J.; Liu, P.; Han, Z.; Ying, W.; Li, C.; Yang, Y.; Wang, S.; Yang, J.; Cao, F.; Shen, J.; et al. Bile acids modified by the intestinal microbiota promote colorectal cancer growth by suppressing CD8+ T cell effector functions. Immunity 2024, 57, 876–889.e11.
58. Rebersek, M. Gut microbiome and its role in colorectal cancer. BMC Cancer 2021, 21, 1325.
59. Saus, E.; Iraola-Guzmán, S.; Willis, J.R.; Brunet-Vega, A.; Gabaldón, T. Microbiome and colorectal cancer: Roles in carcinogenesis and clinical potential. Mol. Asp. Med. 2019, 69, 93–106.
60. Gallo, G.; Vescio, G.; De Paola, G.; Sammarco, G. Therapeutic Targets and Tumor Microenvironment in Colorectal Cancer. J. Clin. Med. 2021, 10, 2295.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Han, H.; Li, Y.; Qi, Y.; Mangiola, S.; Ling, W. Deciphering Gut Microbiome in Colorectal Cancer via Robust Learning Methods. Genes 2025, 16, 452. https://doi.org/10.3390/genes16040452