Multi-Task Learning for Compositional Data via Sparse Network Lasso
Abstract
1. Introduction
2. Multi-Task Learning Based on a Network Lasso
3. Regression Modeling for Compositional Data
4. Proposed Method
4.1. Model
4.2. Estimation Algorithm
Algorithm 1 Estimation algorithm for (12) via ADMM
Algorithm 2 Estimation algorithm for constrained Weber problem (11) via ADMM
5. Simulation Studies
6. Application to Gut Microbiome Data
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
ADMM | alternating direction method of multipliers |
Appendix A. Derivations of Update Formulas in ADMM
Appendix A.1. Update of w
Appendix A.2. Update of a
Appendix A.3. Update of b
Appendix A.4. Update of Q
Appendix B. Update Algorithm for Constrained Weber Problem via ADMM
Appendix B.1. Update of $w_i^*$
Appendix B.2. Update of e
Appendix B.3. Update of u and v
Value | Proposed (i) | Proposed (ii) | CL |
---|---|---|---|
MSE (SD) |
Value | Proposed (i) | CL |
---|---|---|
LOOCV |
Variable | Kingdom | Phylum | Class | Order | Family | Genus | Species |
---|---|---|---|---|---|---|---|
OTU1 | Bacteria | ||||||
OTU2 | Bacteria | ||||||
OTU3 | Bacteria | Bacteroidetes | |||||
OTU4 | Bacteria | Bacteroidetes | Bacteroidetes | Bacteroidales | |||
OTU5 | Bacteria | Bacteroidetes | Bacteroidetes | Bacteroidales | Bacteroidaceae | Bacteroides | |
OTU6 | Bacteria | Bacteroidetes | Bacteroidetes | Bacteroidales | Bacteroidaceae | Bacteroides | |
OTU7 | Bacteria | Bacteroidetes | Bacteroidetes | Bacteroidales | Bacteroidaceae | Bacteroides | |
OTU8 | Bacteria | Bacteroidetes | Bacteroidetes | Bacteroidales | Bacteroidaceae | Bacteroides | |
OTU9 | Bacteria | Bacteroidetes | Bacteroidetes | Bacteroidales | Bacteroidaceae | Bacteroides | |
OTU10 | Bacteria | Bacteroidetes | Bacteroidetes | Bacteroidales | Bacteroidaceae | Bacteroides | |
OTU11 | Bacteria | Bacteroidetes | Bacteroidetes | Bacteroidales | Porphyromonadaceae | Parabacteroides | |
OTU12 | Bacteria | Bacteroidetes | Bacteroidetes | Bacteroidales | Prevotellaceae | ||
OTU13 | Bacteria | Firmicutes | Clostridia | ||||
OTU14 | Bacteria | Firmicutes | Clostridia | Clostridiales | |||
OTU15 | Bacteria | Firmicutes | Clostridia | Clostridiales | |||
OTU16 | Bacteria | Firmicutes | Clostridia | Clostridiales | |||
OTU17 | Bacteria | Firmicutes | Clostridia | Clostridiales | |||
OTU18 | Bacteria | Firmicutes | Clostridia | Clostridiales | |||
OTU19 | Bacteria | Firmicutes | Clostridia | Clostridiales | Lachnospiraceae | ||
OTU20 | Bacteria | Firmicutes | Clostridia | Clostridiales | Lachnospiraceae | ||
OTU21 | Bacteria | Firmicutes | Clostridia | Clostridiales | Lachnospiraceae | ||
OTU22 | Bacteria | Firmicutes | Clostridia | Clostridiales | Lachnospiraceae | ||
OTU23 | Bacteria | Firmicutes | Clostridia | Clostridiales | Lachnospiraceae | ||
OTU24 | Bacteria | Firmicutes | Clostridia | Clostridiales | Lachnospiraceae | ||
OTU25 | Bacteria | Firmicutes | Clostridia | Clostridiales | Lachnospiraceae | ||
OTU26 | Bacteria | Firmicutes | Clostridia | Clostridiales | Lachnospiraceae | ||
OTU27 | Bacteria | Firmicutes | Clostridia | Clostridiales | Lachnospiraceae | ||
OTU28 | Bacteria | Firmicutes | Clostridia | Clostridiales | Lachnospiraceae | ||
OTU29 | Bacteria | Firmicutes | Clostridia | Clostridiales | Lachnospiraceae | Roseburia | |
OTU30 | Bacteria | Firmicutes | Clostridia | Clostridiales | Ruminococcaceae | ||
OTU31 | Bacteria | Firmicutes | Clostridia | Clostridiales | Ruminococcaceae | ||
OTU32 | Bacteria | Firmicutes | Erysipelotrichia | Erysipelotrichales | Erysipelotrichaceae | Catenibacterium | |
OTU33 | Bacteria | Firmicutes | Erysipelotrichia | Erysipelotrichales | Erysipelotrichaceae | Erysipelotrichaceae.Incertae.Sedis | |
OTU34 | Bacteria | Proteobacteria | |||||
OTU35 | Bacteria | Proteobacteria | Gammaproteobacteria | Enterobacteriales | Enterobacteriaceae | ||
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Okazaki, A.; Kawano, S. Multi-Task Learning for Compositional Data via Sparse Network Lasso. Entropy 2022, 24, 1839. https://doi.org/10.3390/e24121839