GraphPPL.jl: A Probabilistic Programming Language for Graphical Models
Abstract
1. Introduction
- Section 2 reviews factor graphs, variational Bayesian inference, and CBFE minimization, and discusses why CBFE minimization on a factor graph can serve as a general, customizable inference method.
- Section 3 discusses related works.
- Section 4.2 delves into the core design philosophy of GraphPPL.jl and explains the rationale for implementing the CBFE minimization plugin as the default in GraphPPL.jl.
- Section 4.3 introduces the @model macro and syntax, which serves as the primary entry point for creating any GraphPPL.jl model.
- Section 4.4 showcases how models defined with the @model macro can be reused in larger models, thereby adding modularity to the language.
- Section 4.7 presents the @constraints macro, which specifies factorization and functional-form constraints on the variational posterior for the inference engine, using a clear and intuitive constraint language. Together, @model and @constraints exemplify how models specified within GraphPPL.jl integrate with a particular inference backend.
- Section 5 demonstrates the integration of GraphPPL.jl in the RxInfer.jl inference ecosystem with an inference example.
2. Background
2.1. Bayesian Inference
2.2. Factor Graphs
2.3. Variational Inference
2.4. Constrained Bethe Free Energy
3. Related Works
4. The GraphPPL.jl Engine
4.1. Representing Graphical Models
4.2. Language Philosophy
- GraphPPL.jl models should be as readable as Julia programs. Drawing inspiration from PyTorch [32] and its approach of treating ’deep learning models as Python programs’, we want GraphPPL.jl models to read and feel like ordinary Julia programs. As a result, any GraphPPL.jl model should be usable as a component within a larger GraphPPL.jl model.
- A materialized GraphPPL.jl model should be compatible with various inference backends, for example, backends that perform CBFE minimization. The model should be extendable so that any inference backend can inject all the information it requires to perform Bayesian inference.
4.3. The Model Specification Language
- Code Block 1. GraphPPL.jl code for the coin toss model defined in Equation (11).
- Code Block 2. The GCV model from Equation (12) implemented with compound statements.
- Code Block 3. The GCV model from Equation (12) implemented with deterministic statements.
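A minimal sketch of the kind of code Code Blocks 1–3 contain is shown below. It assumes the keyword-style Normal parameterization used by the RxInfer.jl/ReactiveMP.jl backend and illustrative hyperparameters; the exact priors and the GCV parameterization of Equations (11) and (12) in the paper may differ.

```julia
using RxInfer  # re-exports GraphPPL.jl's @model macro together with the ReactiveMP.jl backend

# Sketch of Code Block 1: Beta-Bernoulli coin-toss model.
@model function coin_toss(y)
    θ ~ Beta(1.0, 1.0)                 # prior on the coin bias (hyperparameters assumed)
    for i in eachindex(y)
        y[i] ~ Bernoulli(θ)            # one Bernoulli factor per observed toss
    end
end

# Sketch of Code Block 2: the GCV relation written as a single compound statement;
# the macro unrolls the nested calls into intermediate deterministic nodes.
@model function gcv_compound(x, x_prev, z, κ, ω)
    x ~ Normal(mean = x_prev, variance = exp(κ * z + ω))
end

# Sketch of Code Block 3: the same relation with explicit deterministic statements (:=).
@model function gcv(x, x_prev, z, κ, ω)
    log_σ := κ * z + ω                 # deterministic (delta) node
    σ := exp(log_σ)                    # deterministic nonlinearity
    x ~ Normal(mean = x_prev, variance = σ)
end
```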
4.4. Modular Definition and Usage of Models
- Code Block 4. Model chaining a gcv submodel with a Gaussian likelihood.
- Code Block 5. The hierarchical Gaussian filter defined in GraphPPL.jl using the gcv and gcv_lm submodels.
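The sketch below illustrates Code Blocks 4 and 5 by reusing the gcv submodel from the previous sketch. It assumes GraphPPL.jl's keyword-style submodel invocation, in which the left-hand side of ~ binds to the one interface that is not passed explicitly; interface names, priors, and noise variances are illustrative.

```julia
using RxInfer  # assumes the gcv submodel from the previous sketch is in scope

# Sketch of Code Block 4: a gcv transition chained with a Gaussian likelihood.
@model function gcv_lm(y, x, x_prev, z, κ, ω)
    x ~ gcv(x_prev = x_prev, z = z, κ = κ, ω = ω)   # reuse gcv as a submodel
    y ~ Normal(mean = x, variance = 1.0)            # Gaussian observation of the new state
end

# Sketch of Code Block 5: a two-layer hierarchical Gaussian filter built from gcv_lm.
@model function hgf(y)
    κ ~ Normal(mean = 1.0, variance = 1.0)          # volatility coupling (prior assumed)
    ω ~ Normal(mean = 0.0, variance = 1.0)          # volatility offset (prior assumed)
    z[1] ~ Normal(mean = 0.0, variance = 5.0)       # top-layer initial state
    x[1] ~ Normal(mean = 0.0, variance = 5.0)       # bottom-layer initial state
    y[1] ~ Normal(mean = x[1], variance = 1.0)
    for i in 2:length(y)
        z[i] ~ Normal(mean = z[i - 1], variance = 1.0)                      # top-layer random walk
        x[i] ~ gcv_lm(y = y[i], x_prev = x[i - 1], z = z[i], κ = κ, ω = ω)  # bottom layer + likelihood
    end
end
```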
4.5. Example: Bayesian Neural Network
- Code Block 6. Dot product with nonlinearity applied as the basic building block of a neural network.
- Code Block 7. Artificial neuron.
- Code Block 8. Fully connected neural network layer defined as a composition of neurons.
- Code Block 9. Example of a multilayer perceptron.
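The following sketch mirrors the compositional structure of Code Blocks 6–9. The tanh nonlinearity, the isotropic Gaussian prior on the weights, the layer widths, and the ReactiveMP.jl-style distribution name MvNormalMeanCovariance are assumptions; the paper's building blocks may differ in these details.

```julia
using RxInfer, LinearAlgebra

# Sketch of Code Block 6: weighted sum of inputs pushed through a nonlinearity.
# A plain Julia helper is used as the function of a deterministic node.
activation(w, x) = tanh(dot(w, x))

@model function weighted_sum(output, input, w)
    output := activation(w, input)
end

# Sketch of Code Block 7: an artificial neuron places a Gaussian prior over its weights.
@model function neuron(output, input, dim)
    w ~ MvNormalMeanCovariance(zeros(dim), Diagonal(ones(dim)))
    output ~ weighted_sum(input = input, w = w)
end

# Sketch of Code Block 8: a fully connected layer as a collection of neurons sharing the input.
@model function dense_layer(output, input, in_dim, out_dim)
    for i in 1:out_dim
        output[i] ~ neuron(input = input, dim = in_dim)
    end
end

# Sketch of Code Block 9: a multilayer perceptron composed of layers (widths are illustrative).
@model function mlp(output, input)
    h ~ dense_layer(input = input, in_dim = 4, out_dim = 8)
    output ~ dense_layer(input = h, in_dim = 8, out_dim = 1)
end
```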
4.6. Extensibility
4.7. Constraint Specification
- Code Block 10. Example constraint specification for an HGF model.
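A sketch of what a constraint specification for the hgf sketch given earlier (Section 4.4) could look like is shown below. The submodel-scoped "for q in gcv" form, the chosen factorization, and the backend-specific functional-form constraint PointMassFormConstraint are assumptions; the actual constraints in Code Block 10 may differ.

```julia
using RxInfer

# Sketch of a constraint specification for the hgf sketch above (cf. Code Block 10).
@constraints function hgf_constraints()
    # Top level: factorize states and parameters (structured mean-field).
    q(x, z, κ, ω) = q(x)q(z)q(κ)q(ω)
    # Push a structured factorization into every gcv submodel instance.
    for q in gcv
        q(x, x_prev, z, κ, ω) = q(x, x_prev)q(z)q(κ)q(ω)
    end
    # Functional-form constraint; the constraint name comes from the ReactiveMP.jl backend.
    q(ω) :: PointMassFormConstraint()
end
```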
5. Inference Example with the ReactiveMP.jl Backend
- Code Block 11. Simple hierarchical state-space model in GraphPPL.jl.
- Code Block 12. Inference constraints for the hierarchical state-space model.
- Code Block 13. Running inference in the hierarchical_ssm model with generated data.
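The sketch below illustrates the workflow of Code Blocks 11–13 with the RxInfer.jl backend: a hypothetical two-layer hierarchical state-space model, a structured mean-field constraint between its layers, and a call to infer on synthetic data. The model structure, noise levels, and the infer/@initialization keyword names follow recent RxInfer.jl releases and are assumptions; the paper's hierarchical_ssm model and inference settings may differ.

```julia
using RxInfer, Random

# Sketch of Code Block 11: a two-layer hierarchical state-space model in which the upper
# layer z drives the drift of the lower layer x.
@model function hierarchical_ssm(y)
    z[1] ~ Normal(mean = 0.0, variance = 1.0)
    x[1] ~ Normal(mean = 0.0, variance = 1.0)
    y[1] ~ Normal(mean = x[1], variance = 0.1)
    for i in 2:length(y)
        z[i] ~ Normal(mean = z[i - 1], variance = 0.01)        # slowly varying drift
        x[i] ~ Normal(mean = x[i - 1] + z[i], variance = 1.0)  # lower layer follows the drift
        y[i] ~ Normal(mean = x[i], variance = 0.1)             # noisy observations
    end
end

# Sketch of Code Block 12: structured mean-field factorization between the two layers.
@constraints function ssm_constraints()
    q(x, z) = q(x)q(z)
end

# Sketch of Code Block 13: generate synthetic data and run variational inference.
Random.seed!(42)
n      = 100
drift  = cumsum(0.1 .* randn(n))
states = cumsum(drift .+ randn(n))
data   = states .+ 0.1 .* randn(n)

init = @initialization begin
    q(z) = NormalMeanVariance(0.0, 1.0)   # initial marginals for the upper layer
end

results = infer(
    model          = hierarchical_ssm(),
    data           = (y = data,),
    constraints    = ssm_constraints(),
    initialization = init,
    iterations     = 10,
)

posterior_x = results.posteriors[:x]      # marginals over the lower-layer states per iteration
```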
6. Discussion
- It should be possible to use any GraphPPL.jl model as a submodel within any other model.
- A materialized GraphPPL.jl model should contain all information necessary to perform Bayesian inference. The model should be extendable by backend developers to include additional information.
- A GraphPPL.jl model should look, as much as possible, like the mathematical representation of the generative model, exposing as few implementation details as possible.
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Additional Examples
Appendix A.1. Recursive Generative Model
- Code Block A1. Recursively defined generative model. Depth is determined by the input size.
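A minimal sketch of a recursively defined model is shown below. Here the recursion depth is passed explicitly as an argument (e.g., derived from the size of the data at model construction time); the paper's Code Block A1 derives the depth from the input size directly, and its priors may differ.

```julia
using RxInfer

# Sketch of a recursively defined prior: each recursion level adds one latent layer, and the
# recursion bottoms out at depth 1. The depth argument is assumed to be derived from the input
# size when the model is constructed, e.g. depth = length(data).
@model function recursive_prior(y, depth)
    if depth <= 1
        y ~ Normal(mean = 0.0, variance = 1.0)
    else
        z ~ recursive_prior(depth = depth - 1)   # the model invokes itself as a submodel
        y ~ Normal(mean = z, variance = 1.0)
    end
end
```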
Appendix A.2. Variational Autoencoder
- Code Block A2. Neural network used as both encoder and decoder in the variational autoencoder example.
- Code Block A3. A variational autoencoder, with the decoder implemented as a mirrored version of the encoder.
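The sketch below gives a rough impression of Code Blocks A2 and A3: a small network submodel and a generative model that decodes a latent code into an observation. The network architecture, the fixed weights passed as interfaces, and the noise model are assumptions; in the paper the same network is mirrored to act as the encoder, which is not reproduced here.

```julia
using RxInfer, LinearAlgebra

# Plain Julia helper used as the function of a deterministic node inside the models below.
dense(W, x) = tanh.(W * x)

# Sketch of Code Block A2: a small two-layer network submodel; the weights are passed in as
# interfaces, so the same structure can be reused on the encoder and decoder side.
@model function small_net(output, input, W1, W2)
    h := dense(W1, input)      # hidden layer
    output := W2 * h           # linear read-out
end

# Sketch of Code Block A3 (decoder side only): a latent code is decoded into the mean of a
# multivariate Gaussian observation model.
@model function vae(y, W1, W2, z_dim, out_dim)
    z ~ MvNormalMeanCovariance(zeros(z_dim), Diagonal(ones(z_dim)))   # latent code prior
    x ~ small_net(input = z, W1 = W1, W2 = W2)                        # decode the latent code
    y ~ MvNormalMeanCovariance(x, Diagonal(ones(out_dim)))            # observation noise
end
```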
References
1. Phan, D.; Pradhan, N.; Jankowiak, M. Composable Effects for Flexible and Accelerated Probabilistic Programming in NumPyro. arXiv 2019, arXiv:1912.11554.
2. Ge, H.; Xu, K.; Ghahramani, Z. Turing: A Language for Flexible Probabilistic Inference. In Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, PMLR, Playa Blanca, Spain, 9–11 April 2018; pp. 1682–1690, ISSN 2640-3498.
3. Semenova, E.; Williams, D.P.; Afzal, A.M.; Lazic, S.E. A Bayesian neural network for toxicity prediction. Comput. Toxicol. 2020, 16, 100133.
4. Da Silva, S.L.E.; Karsou, A.; Moreira, R.M.; Cetale, M. Bayesian weighted time-lapse full-waveform inversion using a receiver-extension strategy. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5921522.
5. Griffiths, T.L.; Chater, N.; Kemp, C.; Perfors, A.; Tenenbaum, J.B. Probabilistic models of cognition: Exploring representations and inductive biases. Trends Cogn. Sci. 2010, 14, 357–364.
6. Spiegelhalter, D.; Thomas, A.; Best, N.; Gilks, W. BUGS 0.5: Bayesian inference using Gibbs sampling manual (version ii). In MRC Biostatistics Unit, Institute of Public Health; Citeseer: Cambridge, UK, 1996; pp. 1–59.
7. Gelman, A.; Lee, D.; Guo, J. Stan: A Probabilistic Programming Language for Bayesian Inference and Optimization. J. Educ. Behav. Stat. 2015, 40, 530–543.
8. Cox, M.; van de Laar, T.; de Vries, B. ForneyLab.jl: Fast and flexible automated inference through message passing in Julia. In Proceedings of the International Conference on Probabilistic Programming, Cambridge, MA, USA, 5–6 October 2018.
9. Luttinen, J. BayesPy: Variational Bayesian inference in Python. J. Mach. Learn. Res. 2016, 17, 1419–1424.
10. Bagaev, D.; Podusenko, A.; de Vries, B. RxInfer: A Julia package for reactive real-time Bayesian inference. J. Open Source Softw. 2023, 8, 5161.
11. Duane, S.; Kennedy, A.D.; Pendleton, B.J.; Roweth, D. Hybrid Monte Carlo. Phys. Lett. B 1987, 195, 216–222.
12. Hoffman, M.D.; Gelman, A. The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo. arXiv 2011.
13. Bagaev, D.; de Vries, B. Reactive Message Passing for Scalable Bayesian Inference. arXiv 2021.
14. Şenöz, İ. Message Passing Algorithms for Hierarchical Dynamical Models. Ph.D. Thesis, Eindhoven University of Technology, Eindhoven, The Netherlands, 2022; ISBN 9789038655321.
15. Loeliger, H.A.; Dauwels, J.; Hu, J.; Korl, S.; Ping, L.; Kschischang, F.R. The Factor Graph Approach to Model-Based Signal Processing. Proc. IEEE 2007, 95, 1295–1322.
16. Kullback, S.; Leibler, R.A. On Information and Sufficiency. Ann. Math. Stat. 1951, 22, 79–86.
17. Blei, D.M.; Kucukelbir, A.; McAuliffe, J.D. Variational Inference: A Review for Statisticians. J. Am. Stat. Assoc. 2017, 112, 859–877.
18. Heskes, T. Convexity Arguments for Efficient Minimization of the Bethe and Kikuchi Free Energies. J. Artif. Intell. Res. 2006, 26, 153–190.
19. Chertkov, M.; Chernyak, V.Y. Loop calculus in statistical physics and information science. Phys. Rev. E 2006, 73, 065102.
20. Pearl, J. Reverend Bayes on inference engines: A distributed hierarchical approach. In Proceedings of the AAAI National Conference on AI, Pittsburgh, PA, USA, 18–20 August 1982; pp. 133–136.
21. Yedidia, J.S.; Freeman, W.; Weiss, Y. Constructing free-energy approximations and generalized belief propagation algorithms. IEEE Trans. Inf. Theory 2005, 51, 2282–2312.
22. Lunn, D.J.; Thomas, A.; Best, N.; Spiegelhalter, D. WinBUGS - a Bayesian modelling framework: Concepts, structure, and extensibility. Stat. Comput. 2000, 10, 325–337.
23. Carpenter, B.; Gelman, A.; Hoffman, M.D.; Lee, D.; Goodrich, B.; Betancourt, M.; Brubaker, M.; Guo, J.; Li, P.; Riddell, A. Stan: A Probabilistic Programming Language. J. Stat. Softw. 2017, 76, 1.
24. Goodman, N.; Mansinghka, V.; Roy, D.M.; Bonawitz, K.; Tenenbaum, J.B. Church: A language for generative models. arXiv 2014.
25. Pfeffer, A. IBAL: A Probabilistic Rational Programming Language. In IJCAI; Citeseer: Cambridge, UK, 2001.
26. Murray, L.M. Bayesian State-Space Modelling on High-Performance Hardware Using LibBi. arXiv 2013.
27. Bingham, E.; Chen, J.P.; Jankowiak, M.; Obermeyer, F.; Pradhan, N.; Karaletsos, T.; Singh, R.; Szerlip, P.; Horsfall, P.; Goodman, N.D. Pyro: Deep Universal Probabilistic Programming. arXiv 2018.
28. Mansinghka, V.; Selsam, D.; Perov, Y. Venture: A higher-order probabilistic programming platform with programmable inference. arXiv 2014.
29. Dillon, J.V.; Langmore, I.; Tran, D.; Brevdo, E.; Vasudevan, S.; Moore, D.; Patton, B.; Alemi, A.; Hoffman, M.; Saurous, R.A. TensorFlow Distributions. arXiv 2017.
30. Krapu, C.; Borsuk, M. Probabilistic programming: A review for environmental modellers. Environ. Model. Softw. 2019, 114, 40–48.
31. Bezanson, J.; Edelman, A.; Karpinski, S.; Shah, V.B. Julia: A Fresh Approach to Numerical Computing. arXiv 2015.
32. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. arXiv 2019.
33. Şenöz, İ.; de Vries, B. Online Variational Message Passing in the Hierarchical Gaussian Filter. In Proceedings of the 2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP), Aalborg, Denmark, 17–20 September 2018; pp. 1–6, ISSN 1551-2541.
34. Mathys, C.D. Hierarchical Gaussian filtering: Construction and variational inversion of a generic Bayesian model of individual learning under uncertainty. Ph.D. Thesis, ETH Zurich, Zürich, Switzerland, 2012.
35. Mathys, C.D.; Lomakina, E.I.; Daunizeau, J.; Iglesias, S.; Brodersen, K.H.; Friston, K.J.; Stephan, K.E. Uncertainty in perception and the Hierarchical Gaussian Filter. Front. Hum. Neurosci. 2014, 8, 825.
36. Martin, R.C. Agile Software Development: Principles, Patterns, and Practices; Pearson Education: Upper Saddle River, NJ, USA, 2003.
37. Bagaev, D.; van Erp, B.; Podusenko, A.; de Vries, B. ReactiveMP.jl: A Julia package for reactive variational Bayesian inference. Softw. Impacts 2022, 12, 100299.
38. Dauwels, J. On Variational Message Passing on Factor Graphs. In Proceedings of the IEEE International Symposium on Information Theory, Nice, France, 24–29 June 2007; pp. 2546–2550.
39. Winn, J.; Bishop, C. Variational Message Passing. J. Mach. Learn. Res. 2005, 6, 661–694.
40. Knill, D.C.; Pouget, A. The Bayesian brain: The role of uncertainty in neural coding and computation. Trends Neurosci. 2004, 27, 712–719.
41. Friston, K. The free-energy principle: A unified brain theory? Nat. Rev. Neurosci. 2010, 11, 127–138.
42. Kirchhoff, M.; Parr, T.; Palacios, E.; Friston, K.; Kiverstein, J. The Markov blankets of life: Autonomy, active inference and the free energy principle. J. R. Soc. Interface 2018, 15, 20170792.
43. Annila, A.; Kuismanen, E. Natural hierarchy emerges from energy dispersal. Biosystems 2009, 95, 227–233.
44. Dritsas, I. Stochastic Optimization: Seeing the Optimal for the Uncertain; BoD—Books on Demand: Norderstedt, Germany, 2011.
45. Koenderink, J.J. The structure of images. Biol. Cybern. 1984, 50, 363–370.
46. Kingma, D.P.; Welling, M. Auto-Encoding Variational Bayes. arXiv 2022.