A Hierarchical Bayesian Model for Inferring and Decision Making in Multi-Dimensional Volatile Binary Environments
Abstract
1. Introduction
2. Hierarchical Bayesian Perceptual Model
2.1. Beyond Independency
2.2. Perceiving Tendency and Volatility
3. Perceptual Inference Approximated by Variational Approximation
4. Decision Making in Volatile Multi-Armed Bandits
5. Simulation Results
5.1. Dynamics of Bayesian Decision Making
- (1) Generating synthetic data. According to the expected states of the arms (Figure 3), we randomly generated a sequence of multivariate binary inputs. Then the series of ideal actions was computed by Equation (38). The random reward sequence was generated from a uniform distribution based on the reward set.
- (2) Initializing sufficient statistics of all random parameters. To allow our model to work well for sensory inputs, we chose particular initial sufficient statistics of the random parameter vector and thereby determined its prior distribution. The configuration for the parameters of the Bayesian agent (Figure 4) is shown in Table 2.
- (3)
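The data-generation step above can be sketched as follows. This is a minimal illustration only: the block schedule of Bernoulli probabilities is hypothetical and does not reproduce the expected arm states of Figure 3, the function name `generate_binary_inputs` is an assumption, and the ideal actions of Equation (38) and the reward sampling are not modeled here.

```python
import random

def generate_binary_inputs(prob_schedule, seed=0):
    """Sample one sequence of multivariate binary inputs.

    prob_schedule is a list of (n_trials, probs) blocks; probs[i] is the
    assumed probability that input dimension i equals 1 within that block.
    """
    rng = random.Random(seed)  # seeded so the sequence is reproducible
    sequence = []
    for n_trials, probs in prob_schedule:
        for _ in range(n_trials):
            # One independent Bernoulli draw per input dimension.
            sequence.append(tuple(int(rng.random() < p) for p in probs))
    return sequence

# Hypothetical two-dimensional schedule with two reversals (an assumption,
# not the schedule used in the paper).
schedule = [(100, (0.8, 0.2)), (100, (0.2, 0.8)), (50, (0.5, 0.5))]
inputs = generate_binary_inputs(schedule, seed=42)
```

Changing the block probabilities abruptly, as in the hypothetical schedule, is one simple way to mimic a volatile environment; a drifting schedule would work equally well with the same function.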
5.2. Bayesian Model Selection
- (1) Generating the synthetic dataset. According to Figure 3, we randomly generated 100 sequences of multivariate binary inputs. Then the series of ideal actions was computed according to Equation (38). Random reward sequences were generated from a uniform distribution based on the reward set. The dataset comprises the set of sensory and action sequences.
- (2) Initializing sufficient statistics of all random parameters in our Bayesian agent. We chose particular initial sufficient statistics of the parameter vector to allow the Bayesian agent to work well on all sequences of sensory inputs, and thereby determined its prior distribution. All parameter configurations of the agent based on GHBF (Figure 4) are shown in Table 2.
- (3) Initializing sufficient statistics of all random parameters in the RW agent. We set a particular initial value of the parameter vector for this agent. All parameter configurations of the agent based on the Rescorla–Wagner model are shown in Table A2. The response model of the RW agent uses the same parameter configuration as the Bayesian agent in step 2.
- (4) Maximizing negative free energy. On each sequence of sensory inputs, we ran an optimization to obtain the optimal prior parameters for the Bayesian agent and for the RW agent, each according to Equation (A21). We implemented the quasi-Newton Broyden–Fletcher–Goldfarb–Shanno method within a line search framework [49] to maximize the negative free energy (Equations (A19)–(A21)).
- (5)
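The optimization in step (4) can be illustrated with a small, self-contained sketch of BFGS with a backtracking (Armijo) line search. The objective is an assumption: a toy concave function stands in for the negative free energy of Equations (A19)–(A21), which are not reproduced here, and the names `bfgs_minimize` and `toy_negative_free_energy` are hypothetical. Maximization is done by minimizing the negated objective.

```python
import math

def bfgs_minimize(f, grad, x0, tol=1e-8, max_iter=200):
    """Minimize f using BFGS with a backtracking (Armijo) line search."""
    n = len(x0)
    x = list(x0)
    # Running estimate of the inverse Hessian, initialized to the identity.
    H = [[float(i == j) for j in range(n)] for i in range(n)]
    g = grad(x)
    for _ in range(max_iter):
        if math.sqrt(sum(gi * gi for gi in g)) < tol:
            break
        # Quasi-Newton search direction p = -H g.
        p = [-sum(H[i][j] * g[j] for j in range(n)) for i in range(n)]
        slope = sum(gi * pi for gi, pi in zip(g, p))
        # Halve the step until the sufficient-decrease condition holds.
        step, fx = 1.0, f(x)
        while step > 1e-12 and f([xi + step * pi for xi, pi in zip(x, p)]) > fx + 1e-4 * step * slope:
            step *= 0.5
        x_new = [xi + step * pi for xi, pi in zip(x, p)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy > 1e-12:  # update only when the curvature condition holds
            rho = 1.0 / sy
            Hy = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
            yHy = sum(yi * hi for yi, hi in zip(y, Hy))
            # Expanded form of H <- (I - rho s y^T) H (I - rho y s^T) + rho s s^T.
            for i in range(n):
                for j in range(n):
                    H[i][j] += (rho * (1.0 + rho * yHy) * s[i] * s[j]
                                - rho * (Hy[i] * s[j] + s[i] * Hy[j]))
        x, g = x_new, g_new
    return x

def toy_negative_free_energy(theta):
    # Hypothetical stand-in objective with its maximum at (1.0, -0.5).
    return -(theta[0] - 1.0) ** 2 - 2.0 * (theta[1] + 0.5) ** 2

def loss(theta):
    return -toy_negative_free_energy(theta)

def loss_grad(theta):
    return [2.0 * (theta[0] - 1.0), 4.0 * (theta[1] + 0.5)]

theta_hat = bfgs_minimize(loss, loss_grad, [0.0, 0.0])
```

In practice the gradient of the negative free energy would come from the model equations or from numerical differentiation; the analytic gradient here exists only because the stand-in objective is a simple quadratic.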
6. Discussion
6.1. Contributions of This Work
6.2. Related Works
6.3. Strengths and Limitations
6.4. Future Work
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
GHBF | General Hierarchical Brownian Filter |
SI | Sampling interval |
RL | Reinforcement Learning |
RW | Rescorla–Wagner |
BF | Bayes Factor |
BIC | Bayesian Information Criterion |
Appendix A. Bayesian Agent
Appendix B. Variational Bayesian Inference
Appendix C. Probabilistic Representation of Parameters
Appendix D. Variational Bayesian Learning
Appendix E. Evaluating Negative Free Energy
Appendix F. Bayesian Model Selection
Bayes Factor | Interpretation |
---|---|
Decisive evidence for | |
Strong evidence for | |
Moderate evidence for | |
Weak evidence for | |
Weak evidence for | |
Moderate evidence for | |
Strong evidence for | |
Decisive evidence for |
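As a hedged illustration of the quantities used in model comparison, the sketch below computes a log Bayes factor from two log model evidences and the BIC in its standard form (parameter count times log sample size, minus twice the maximized log-likelihood). It assumes that each log evidence is approximated by the maximized negative free energy of the corresponding agent; the function names and the numbers are hypothetical, and the evidence thresholds of the table above are not reproduced.

```python
import math

def log_bayes_factor(log_evidence_1, log_evidence_2):
    """Log Bayes factor of model 1 over model 2.

    Each argument is a log model evidence, here assumed to be
    approximated by the maximized negative free energy of that model.
    """
    return log_evidence_1 - log_evidence_2

def bic(max_log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L_max)."""
    return n_params * math.log(n_obs) - 2.0 * max_log_likelihood

# Hypothetical values for illustration only.
log_bf = log_bayes_factor(-100.0, -103.0)
bayes_factor = math.exp(log_bf)
bic_value = bic(max_log_likelihood=-50.0, n_params=2, n_obs=100)
```

A positive log Bayes factor favors the first model and a negative one favors the second; working in the log domain avoids overflow when the evidences differ by many orders of magnitude.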
Appendix G. Rescorla–Wagner Model
Name | Description | Initial Value | Fixed or Free |
---|---|---|---|
Parameters of Rescorla–Wagner model | |||
Dimension of | 2 | constant | |
Dimension of | 2 | constant | |
Prior initial state | Fixed | ||
Mean of | |||
Covariance of | |||
Learning rate | Free | ||
Mean of | 0 | ||
Covariance of |
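For orientation, the textbook Rescorla–Wagner update moves each value estimate toward the latest observation by a fraction given by the learning rate. The sketch below applies it independently to each input dimension. It is an assumption-laden illustration: the paper's exact parameterization (for example, any transform applied to the learning rate) is not recoverable from the table above, and the name `rescorla_wagner` is hypothetical.

```python
def rescorla_wagner(inputs, learning_rate, v0):
    """Track value estimates with the Rescorla-Wagner delta rule.

    inputs: sequence of binary tuples, one entry per dimension.
    learning_rate: step size in [0, 1].
    v0: initial value estimate for each dimension.
    Returns the list of value estimates, starting with v0.
    """
    v = list(v0)
    trajectory = [tuple(v)]
    for u in inputs:
        # Prediction error (u_i - v_i) scaled by the learning rate.
        v = [vi + learning_rate * (ui - vi) for vi, ui in zip(v, u)]
        trajectory.append(tuple(v))
    return trajectory

# Two-dimensional example with a hypothetical learning rate.
values = rescorla_wagner([(1, 0), (1, 0)], learning_rate=0.5, v0=(0.5, 0.5))
```

Because the learning rate is fixed, this model cannot speed up or slow down its updating when the environment becomes more or less volatile, which is the contrast a volatility-tracking Bayesian agent is meant to address.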
References
- Cisek, P.; Puskas, G.A.; El-Murr, S. Decisions in changing conditions: The urgency-gating model. J. Neurosci. 2009, 29, 11560–11571.
- Weiss, A.; Chambon, V.; Lee, J.K.; Drugowitsch, J.; Wyart, V. Interacting with volatile environments stabilizes hidden-state inference and its brain signatures. Nat. Commun. 2021, 12, 2228.
- Vargas, D.V.; Lauwereyns, J. Setting the space for deliberation in decision-making. Cogn. Neurodyn. 2021, 15, 743–755.
- Knill, D.C.; Richards, W. Perception as Bayesian Inference; Cambridge University Press: Cambridge, UK, 1996.
- Ernst, M.O.; Banks, M.S. Humans integrate visual and haptic information in a statistically optimal fashion. Nature 2002, 415, 429–433.
- Weilnhammer, V.A.; Stuke, H.; Sterzer, P.; Schmack, K. The neural correlates of hierarchical predictions for perceptual decisions. J. Neurosci. 2018, 38, 5008–5021.
- Zhang, W.; Wu, S.; Doiron, B.; Lee, T.S. A Normative Theory for Causal Inference and Bayes Factor Computation in Neural Circuits. In Advances in Neural Information Processing Systems; Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché Buc, F., Fox, E., Garnett, R., Eds.; Curran Associates, Inc.: New York, NY, USA, 2019; Volume 32.
- Friston, K.; FitzGerald, T.; Rigoli, F.; Schwartenbeck, P.; Pezzulo, G. Active inference and learning. Neurosci. Biobehav. Rev. 2016, 68, 862–879.
- Shikauchi, Y.; Miyakoshi, M.; Makeig, S.; Iversen, J.R. Bayesian models of human navigation behaviour in an augmented reality audiomaze. Eur. J. Neurosci. 2021, 54, 8308–8317.
- Zhang, J.; Gu, Y.; Chen, A.; Yu, Y. Unveiling Dynamic System Strategies for Multisensory Processing: From Neuronal Fixed-Criterion Integration to Population Bayesian Inference. Research 2022, 2022, 9787040.
- Zhou, L.; Gu, Y. Cortical Mechanisms of Multisensory Linear Self-motion Perception. Neurosci. Bull. 2022, 1–13.
- Chikkerur, S.; Serre, T.; Tan, C.; Poggio, T. Attention as a Bayesian inference process. In Human Vision and Electronic Imaging XVI; Rogowitz, B.E., Pappas, T.N., Eds.; Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series; SPIE Press: California, USA, 2011; Volume 7865, p. 786511.
- Vossel, S.; Mathys, C.; Stephan, K.E.; Friston, K.J. Cortical coupling reflects Bayesian belief updating in the deployment of spatial attention. J. Neurosci. 2015, 35, 11532–11542.
- Lawson, R.P.; Mathys, C.; Rees, G. Adults with autism overestimate the volatility of the sensory environment. Nat. Neurosci. 2017, 20, 1293–1299.
- Friston, K. The free-energy principle: A unified brain theory? Nat. Rev. Neurosci. 2010, 11, 127–138.
- Friston, K. A theory of cortical responses. Philos. Trans. R. Soc. B Biol. Sci. 2005, 360, 815–836.
- Stefanics, G.; Heinzle, J.; Horváth, A.A.; Stephan, K.E. Visual mismatch and predictive coding: A computational single-trial ERP study. J. Neurosci. 2018, 38, 4020–4030.
- Wang, B.A.; Schlaffke, L.; Pleger, B. Modulations of insular projections by prior belief mediate the precision of prediction error during tactile learning. J. Neurosci. 2020, 40, 3827–3837.
- Sun, Y.; Gomez, F.; Schmidhuber, J. Planning to Be Surprised: Optimal Bayesian Exploration in Dynamic Environments. In Artificial General Intelligence; Schmidhuber, J., Thórisson, K.R., Looks, M., Eds.; Springer: Berlin/Heidelberg, Germany, 2011; pp. 41–51.
- Daunizeau, J.; den Ouden, H.E.M.; Pessiglione, M.; Kiebel, S.J.; Stephan, K.E.; Friston, K.J. Observing the Observer (I): Meta-Bayesian Models of Learning and Decision-Making. PLoS ONE 2010, 5, e15554.
- Daunizeau, J.; Den Ouden, H.E.; Pessiglione, M.; Kiebel, S.J.; Friston, K.J.; Stephan, K.E. Observing the observer (II): Deciding when to decide. PLoS ONE 2010, 5, e15555.
- Beal, M.J. Variational Algorithms for Approximate Bayesian Inference. Ph.D. Thesis, University College London (UCL), London, UK, 2003.
- Mathys, C.D.; Daunizeau, J.; Friston, K.J.; Stephan, K.E. A Bayesian Foundation for Individual Learning Under Uncertainty. Front. Hum. Neurosci. 2011, 5, 39.
- Vossel, S.; Mathys, C.; Daunizeau, J.; Bauer, M.; Driver, J.; Friston, K.; Stephan, K. Spatial Attention, Precision, and Bayesian Inference: A Study of Saccadic Response Speed. Cereb. Cortex 2013, 24, 1436–1450.
- Diaconescu, A.O.; Mathys, C.; Weber, L.A.; Kasper, L.; Mauer, J.; Stephan, K.E. Hierarchical prediction errors in midbrain and septum during social learning. Soc. Cogn. Affect. Neurosci. 2017, 12, 618–634.
- Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction; MIT Press: Cambridge, MA, USA, 2018.
- Si, B.; Herrmann, J.M.; Pawelzik, K. Gain-based Exploration: From Multi-armed Bandits to Partially Observable Environments. In Proceedings of the International Conference on Natural Computation, Haikou, China, 24–27 August 2007; pp. 177–182.
- Atan, O.; Tekin, C.; van der Schaar, M. Global bandits. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 5798–5811.
- Xu, X.; Xie, H.; Lui, J.C.S. Generalized Contextual Bandits with Latent Features: Algorithms and Applications. IEEE Trans. Neural Netw. Learn. Syst. 2021, 1–13.
- Behrens, T.E.J.; Woolrich, M.W.; Walton, M.E.; Rushworth, M.F.S. Learning the value of information in an uncertain world. Nat. Neurosci. 2007, 10, 1214–1221.
- Walton, M.E.; Behrens, T.E.; Buckley, M.J.; Rudebeck, P.H.; Rushworth, M.F. Separable learning systems in the macaque brain and the role of orbitofrontal cortex in contingent learning. Neuron 2010, 65, 927–939.
- Costa, V.D.; Mitz, A.R.; Averbeck, B.B. Subcortical substrates of explore-exploit decisions in primates. Neuron 2019, 103, 533–545.
- Hampton, A.N.; Bossaerts, P.; O’Doherty, J.P. Neural correlates of mentalizing-related computations during strategic interactions in humans. Proc. Natl. Acad. Sci. USA 2008, 105, 6741–6746.
- Heuer, L.; Orland, A. Cooperation in the Prisoner’s Dilemma: An experimental comparison between pure and mixed strategies. R. Soc. Open Sci. 2019, 6, 182142.
- Hill, C.A.; Suzuki, S.; Polania, R.; Moisa, M.; O’doherty, J.P.; Ruff, C.C. A causal account of the brain network computations underlying strategic social behavior. Nat. Neurosci. 2017, 20, 1142–1149.
- Bolis, D.; Balsters, J.; Wenderoth, N.; Becchio, C.; Schilbach, L. Beyond autism: Introducing the dialectical misattunement hypothesis and a Bayesian account of intersubjectivity. Psychopathology 2017, 50, 355–372.
- Konishi, T.; Kubo, T.; Watanabe, K.; Ikeda, K. Variational Bayesian Inference Algorithms for Infinite Relational Model of Network Data. IEEE Trans. Neural Netw. Learn. Syst. 2015, 26, 2176–2181.
- Chien, J.T.; Ku, Y.C. Bayesian Recurrent Neural Network for Language Modeling. IEEE Trans. Neural Netw. Learn. Syst. 2016, 27, 361–374.
- Qi, Y.; Liu, B.; Wang, Y.; Pan, G. Dynamic ensemble modeling approach to nonstationary neural decoding in Brain-computer interfaces. In Proceedings of the Advances in Neural Information Processing Systems 32 (Nips 2019), Vancouver, BC, Canada, 8–14 December 2019.
- Li, H.; Barnaghi, P.; Enshaeifar, S.; Ganz, F. Continual Learning Using Bayesian Neural Networks. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 4243–4252.
- Wang, H.; Yeung, D.Y. Towards Bayesian deep learning: A framework and some existing methods. IEEE Trans. Knowl. Data Eng. 2016, 28, 3395–3408.
- Du, C.; Zhu, J.; Zhang, B. Learning Deep Generative Models With Doubly Stochastic Gradient MCMC. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 3084–3096.
- Mirza, M.B.; Adams, R.A.; Mathys, C.; Friston, K.J. Human visual exploration reduces uncertainty about the sensed world. PLoS ONE 2018, 13, e0190429.
- Adolphs, R. Cognitive neuroscience of human social behaviour. Nat. Rev. Neurosci. 2003, 4, 165–178.
- Pezzulo, G.; Friston, K.J. The value of uncertainty: An active inference perspective. Behav. Brain Sci. 2019, 42, e47.
- Zhu, C.; Zhou, K.; Han, Z.; Tang, Y.; Tang, F.; Si, B. General hierarchical Brownian filter in multi-dimensional volatile environments. 2022; submitted.
- Mathys, C.D.; Lomakina, E.I.; Daunizeau, J.; Iglesias, S.; Brodersen, K.H.; Friston, K.J.; Stephan, K.E. Uncertainty in perception and the Hierarchical Gaussian Filter. Front. Hum. Neurosci. 2014, 8, 825.
- Al-Nowaihi, A.; Dhami, S. Probability Weighting Functions; University of Leicester: Leicester, UK, 2010.
- Nocedal, J.; Wright, S.J. Numerical Optimization; Springer: New York, NY, USA, 2006.
- Ando, T. Bayesian Model Selection and Statistical Modeling; CRC Press: Cleveland, OH, USA, 2010.
- Zhang, L.; Gläscher, J. A brain network supporting social influences in human decision-making. Sci. Adv. 2020, 6, eabb4159.
- Berger, J.O. Statistical Decision Theory and Bayesian Analysis; Springer Inc.: New York, NY, USA, 2013.
- Zeng, T.; Si, B. A brain-inspired compact cognitive mapping system. Cogn. Neurodyn. 2021, 15, 91–101.
- Chen, S.; Tang, J.; Zhu, L.; Kong, W. A multi-stage dynamical fusion network for multimodal emotion recognition. Cogn. Neurodyn. 2022, 1–10.
- Walkenbach, J.; Haddad, N.F. The Rescorla-Wagner theory of conditioning: A review of the literature. Psychol. Rec. 1980, 30, 497–509.
- Zhang, L.; Lengersdorff, L.; Mikus, N.; Gläscher, J.; Lamm, C. Using reinforcement learning models in social neuroscience: Frameworks, pitfalls and suggestions of best practices. Soc. Cogn. Affect. Neurosci. 2020, 15, 695–707.
- Zheng, N.; Ding, J.; Chai, T. DMGAN: Adversarial Learning-Based Decision Making for Human-Level Plant-Wide Operation of Process Industries Under Uncertainties. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 985–998.
- Chen, X.; Yang, T. A neural network model of basal ganglia’s decision-making circuitry. Cogn. Neurodyn. 2021, 15, 17–26.
- Mao, D. Neural Correlates of Spatial Navigation in Primate Hippocampus. Neurosci. Bull. 2022, 1–13.
- Zheng, L.; Liu, W.; Long, Y.; Zhai, Y.; Zhao, H.; Bai, X.; Zhou, S.; Li, K.; Zhang, H.; Liu, L.; et al. Affiliative bonding between teachers and students through interpersonal synchronisation in brain activity. Soc. Cogn. Affect. Neurosci. 2020, 15, 97–109.
- Wang, Y.; Yang, X.; Tang, Z.; Xiao, S.; Hewig, J. Hierarchical neural prediction of interpersonal trust. Neurosci. Bull. 2021, 37, 511–522.
- Wang, W.; Fu, C.; Kong, X.; Osinsky, R.; Hewig, J.; Wang, Y. Neuro-behavioral dynamic prediction of interpersonal cooperation and aggression. Neurosci. Bull. 2022, 38, 275–289.
- Dong, W.; Chen, H.; Sit, T.; Han, Y.; Song, F.; Vyssotski, A.L.; Gross, C.T.; Si, B.; Zhan, Y. Characterization of exploratory patterns and hippocampal–prefrontal network oscillations during the emergence of free exploration. Sci. Bull. 2021, 66, 2238–2250.
- Friston, K.; FitzGerald, T.; Rigoli, F.; Schwartenbeck, P.; Pezzulo, G. Active Inference: A Process Theory. Neural Comput. 2017, 29, 1–49.
- Friston, K.; Mattout, J.; Trujillo-Barreto, N.; Ashburner, J.; Penny, W. Variational free energy and the Laplace approximation. NeuroImage 2007, 34, 220–234.
- Jeffreys, H. Theory of Probability; Clarendon Press: Oxford, UK, 1961.
- Schwarz, G. Estimating the dimension of a model. Ann. Stat. 1978, 6, 461–464.
a | |||
---|---|---|---|
0 | 1 | ||
(0,0) | 0 | ||
(1,1) | 0 | ||
(1,0) | 0 | ||
(0,1) | 0 |
Name | Description | Initial Value | Fixed or Free |
---|---|---|---|
Parameters of our Bayesian perceptual model | |||
Dimension of sensory input | 2 | constant | |
Dimension of | 2 | constant | |
Dimension of | 3 | constant | |
Sampling interval | 1 | constant | |
Upper bound on | constant | ||
Volatility of | Free | ||
Mean of | |||
Covariance of | |||
Upper bound on | constant | ||
Coupling strength | Free | ||
Mean of | |||
Covariance of | |||
Coupling bias | Fixed | ||
Mean of | |||
Covariance of | |||
Prior mean of | Free | ||
Mean of | |||
Covariance of | |||
Prior covariance of | Free | ||
Mean of | |||
Covariance of | |||
Prior mean of | Free | ||
Mean of | |||
Covariance of | |||
Prior covariance of | Free | ||
Mean of | |||
Covariance of | |||
Coefficient | Fixed | ||
Mean of | |||
Covariance of | |||
Parameters of our response model | |||
Dimension of a | 1 | Fixed | |
Coefficient | Fixed | ||
Mean of | |||
Covariance of | 0 |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhu, C.; Zhou, K.; Tang, F.; Tang, Y.; Li, X.; Si, B. A Hierarchical Bayesian Model for Inferring and Decision Making in Multi-Dimensional Volatile Binary Environments. Mathematics 2022, 10, 4775. https://doi.org/10.3390/math10244775