Effect of Private Deliberation: Deception of Large Language Models in Game Play
Abstract
1. Introduction
- We formalize LLM-agent-based games using the partially observable stochastic game (POSG) framework.
- We validate the elements of POSGs for finding optimal solutions. We also identify weaknesses in the underlying LLM when sampling from probability distributions and drawing conclusions from those samples. These weaknesses reveal an inability to perform basic Bayesian reasoning, which is crucial in POSGs (a minimal illustration of such an update follows this list).
- We introduce the concept of a private LLM agent, implemented using in-context learning (ICL) and chain-of-thought (CoT) prompting, which is equipped to deliberate privately on past and future actions. We compare the private agent with a baseline and examine its deception strategy.
- We conduct an extensive performance evaluation of the private agent across normal-form games with different inherent characteristics, covering behavior in games that feature different equilibrium types.
- We perform a sensitivity analysis of LLM agents using an experiment design that varies the input parameters of the normal-form games so that the reward matrix shifts from competitive to cooperative. As part of this design, we also examine the impact of different underlying LLMs, agent types, and numbers of game steps.
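To make the Bayesian-reasoning requirement above concrete, here is a minimal sketch (our illustration, not the paper's code) of the posterior update over an opponent's type that a POSG agent would ideally perform after each observed move; the 0.6 prior mirrors the cooperation bias discussed in Section 3.1, and the likelihood values are assumptions chosen for the example.

```python
# Minimal sketch (our illustration, not the paper's code): the Bayesian update over
# the opponent's type that a POSG agent would ideally perform after each observed move.
# The 0.6 prior mirrors the cooperation bias discussed in Section 3.1; the likelihoods
# are assumed values for the example.

def update_belief(prior_coop: float, observed_move: str,
                  p_coop_if_cooperator: float = 0.9,
                  p_coop_if_defector: float = 0.2) -> float:
    """Posterior probability that the opponent is a cooperator."""
    if observed_move == "cooperate":
        like_coop, like_def = p_coop_if_cooperator, p_coop_if_defector
    else:  # "defect"
        like_coop, like_def = 1 - p_coop_if_cooperator, 1 - p_coop_if_defector
    evidence = prior_coop * like_coop + (1 - prior_coop) * like_def
    return prior_coop * like_coop / evidence

belief = 0.6  # cooperation-biased prior
for move in ["defect", "defect", "cooperate"]:
    belief = update_belief(belief, move)
    print(f"after observing '{move}': P(cooperator) = {belief:.2f}")
```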
2. Related Work
2.1. Generative Agents
2.2. Decision Making Using LLMs
2.3. Modeling Social Dynamics
3. Problem Setting
3.1. Agent Types in POSG
- Listing 1. Example output of a private agent to the environment and to its own context-window memory.
- N: represents the finite set of all agents. We experimented with two-player games, i.e., $N = \{1, 2\}$. If $i \in N$ denotes an agent, its opponent is denoted as $-i$.
- S: represents the finite, countable, non-empty set of all states. A state is the accumulation of the dialogue text between the two agents, including public and private thoughts (if they exist), actions, and rewards.
- $b^0$: represents the initial distribution of beliefs agent $i$ has over the state of the other player $-i$, denoted by $b_i^0$, where $b_i^0 \in \Delta(S)$. Each agent receives a unique initial prompt contained in its initial state. The initial belief distribution of the LLM agents is biased towards fairness and cooperation, with a >60% cooperation rate [33,34].
- $A_i$: represents the finite, countable, non-empty action space of agent $i$. An action is the text the agent produces. For a public agent it has two parts: (1) communicating with the other agent; and (2) deciding which move to make from the available set of actions. For a private agent it has three parts: (1) developing a communication strategy and a decision strategy in private thoughts; (2) communicating with the other agent; and (3) deciding which move to make in public thoughts.
- $O_i$: represents an observation agent $i$ receives in state $s$, and the joint observation of all agents is denoted as $o = (o_i, o_{-i})$. The public agent has an incomplete observation because private thoughts are unavailable to it, while a private agent's observation is complete only if it is the only private agent in the game. However, it may be unaware of that fact due to the agents' beliefs.
- Z: represents the probability of generating observation $o_i$ given player $i$'s current state and action and the opposing player's current state and action, denoted as $Z(o_i \mid s, a)$. The observations are generated by the environment with which the agents interact. This prevents agents from directly influencing each other's observations.
- T: represents the state transition probability of moving from the current state $s$ to a new state $s'$ under joint action $a$, denoted as $T(s' \mid s, a)$. A state transition is the concatenation of the states and rewards achieved in each round. State transitions are derived from the environment in which the agents interact. In this problem setting, transitions are deterministic, as we use deterministic games.
- R: represents the immediate reward for an agent $i \in N$ given a joint state and an action profile, denoted as $R_i(s, a)$. The language model environment assigns a reward in each round. LLM agents communicate, generating dialogue text, and in the end provide their choices. After all agents have made their choices, the environment assigns a reward to each agent. (An illustrative code sketch of this environment loop, using assumed names, follows Listing 2 below.)
- Listing 2. Initial prompt to the PD game.
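As referenced above, the following is a hedged sketch of how the two-player dialogue POSG described by this tuple could be implemented. All names (DialogueState, step, observe) and the payoff values are our assumptions for illustration, not the authors' implementation; the sketch only makes concrete that the state is accumulated text, that transitions are deterministic concatenations, and that only a private agent's observation includes its own private thoughts.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

@dataclass
class DialogueState:
    transcript: str = ""                                          # public dialogue, moves, and rewards
    private_notes: Dict[str, str] = field(default_factory=dict)   # per-agent private thoughts

# Illustrative Prisoner's Dilemma payoffs (not necessarily the paper's values).
PAYOFF: Dict[Tuple[str, str], Tuple[int, int]] = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}

def step(state: DialogueState,
         thoughts: Dict[str, str],
         messages: Dict[str, str],
         moves: Dict[str, str]) -> Tuple[DialogueState, Dict[str, int]]:
    """One round: private deliberation, public communication, then moves and rewards."""
    notes = dict(state.private_notes)
    for agent, note in thoughts.items():          # private thoughts never enter the public transcript
        notes[agent] = notes.get(agent, "") + note + "\n"
    public = "".join(f"{agent}: {msg}\n" for agent, msg in messages.items())
    r_a, r_b = PAYOFF[(moves["A"], moves["B"])]
    rewards = {"A": r_a, "B": r_b}
    transcript = state.transcript + public + f"moves: {moves}, rewards: {rewards}\n"
    return DialogueState(transcript, notes), rewards   # deterministic transition: concatenation

def observe(state: DialogueState, agent: str, is_private: bool) -> str:
    """O_i: the public transcript, plus the agent's own notes if it deliberates privately."""
    return state.transcript + (state.private_notes.get(agent, "") if is_private else "")
```

Keeping the private notes outside the shared transcript is what gives the public agent an incomplete observation in the sense of $O_i$ above, while the private agent observes the transcript plus its own deliberation.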
3.1.1. Language Generation through In-Context Learning
3.1.2. Chain of Thought Prompting
3.1.3. Action Selection Strategy in Agent Types
3.2. Game Setting
- Listing 3. Initial prompt to the SH game.
4. Experiments
4.1. Experiment Setup
- Listing 4. An example of a private agent's thoughts.
- Listing 5. An example of a public agent's thoughts.
4.2. Achieving Equilibrium
4.3. Results
4.4. Parameterized Game
4.5. Sensitivity Analysis
4.6. Limitations and Constraints
5. Discussion
6. Conclusions
7. Limitations
8. Ethics Statement
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
LLM | Large Language Model |
POSG | Partially Observable Stochastic Game |
OOP | Object-Oriented Programming |
SLU | Spoken Language Understanding |
CoT | Chain of Thought |
ICL | In-Context Learning |
PD | Prisoner’s Dilemma |
SH | Stag Hunt |
GPT | Generative Pre-trained Transformer |
AI | Artificial Intelligence |
References
- Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 2020, 33, 1877–1901. [Google Scholar]
- Hoglund, S.; Khedri, J. Comparison Between RLHF and RLAIF in Fine-Tuning a Large Language Model. Available online: https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-331926 (accessed on 1 May 2024).
- Wei, J.; Wang, X.; Schuurmans, D.; Bosma, M.; Xia, F.; Chi, E.; Le, Q.V.; Zhou, D. Chain-of-thought prompting elicits reasoning in large language models. Adv. Neural Inf. Process. Syst. 2022, 35, 24824–24837. [Google Scholar]
- Creswell, A.; Shanahan, M.; Higgins, I. Selection-inference: Exploiting large language models for interpretable logical reasoning. arXiv 2022, arXiv:2205.09712. [Google Scholar]
- Meta Fundamental AI Research Diplomacy Team (FAIR); Bakhtin, A.; Brown, N.; Dinan, E.; Farina, G.; Flaherty, C.; Fried, D.; Goff, A.; Gray, J.; Hu, H.; et al. Human-level play in the game of diplomacy by combining language models with strategic reasoning. Science 2022, 378, 1067–1074. [Google Scholar] [PubMed]
- OpenAI. GPT-4 technical report. arXiv 2023, arXiv:2303.08774. [Google Scholar]
- Park, J.S.; O’Brien, J.C.; Cai, C.J.; Morris, M.R.; Liang, P.; Bernstein, M.S. Generative agents: Interactive simulacra of human behavior. arXiv 2023, arXiv:2304.03442. [Google Scholar]
- Wei, J.; Tay, Y.; Bommasani, R.; Raffel, C.; Zoph, B.; Borgeaud, S.; Yogatama, D.; Bosma, M.; Zhou, D.; Metzler, D.; et al. Emergent abilities of large language models. arXiv 2022, arXiv:2206.07682. [Google Scholar]
- Andreas, J. Language models as agent models. arXiv 2022, arXiv:2212.01681. [Google Scholar]
- Li, G.; Hammoud, H.A.A.K.; Itani, H.; Khizbullin, D.; Ghanem, B. Camel: Communicative agents for “mind” exploration of large scale language model society. arXiv 2023, arXiv:2303.17760. [Google Scholar]
- Silver, D.; Schrittwieser, J.; Simonyan, K.; Antonoglou, I.; Huang, A.; Guez, A.; Hubert, T.; Baker, L.; Lai, M.; Bolton, A.; et al. Mastering the game of go without human knowledge. Nature 2017, 550, 354–359. [Google Scholar] [CrossRef]
- Poje, K.; Brcic, M.; Kovač, M.; Krleža, D. Challenges in collective intelligence: A survey. In Proceedings of the 2023 46th MIPRO ICT and Electronics Convention (MIPRO), Opatija, Croatia, 22–26 May 2023; pp. 1033–1038. [Google Scholar]
- Başar, T.; Olsder, G.J. Dynamic Noncooperative Game Theory; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 1998. [Google Scholar]
- Isufi, S.; Poje, K.; Vukobratovic, I.; Brcic, M. Prismal view of ethics. Philosophies 2022, 7, 134. [Google Scholar] [CrossRef]
- Shoham, Y.; Leyton-Brown, K. Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations; Cambridge University Press: Cambridge, UK, 2008. [Google Scholar]
- Chawla, K.; Ramirez, J.; Clever, R.; Lucas, G.; May, J.; Gratch, J. Casino: A corpus of campsite negotiation dialogues for automatic negotiation systems. arXiv 2021, arXiv:2103.15721. [Google Scholar]
- Webb, T.; Holyoak, K.J.; Lu, H. Emergent analogical reasoning in large language models. Nat. Hum. Behav. 2023, 7, 1526–1541. [Google Scholar] [CrossRef] [PubMed]
- Dong, Q.; Li, L.; Dai, D.; Zheng, C.; Wu, Z.; Chang, B.; Sun, X.; Xu, J.; Sui, Z. A survey for in-context learning. arXiv 2022, arXiv:2301.00234. [Google Scholar]
- Fu, Y.; Peng, H.; Khot, T.; Lapata, M. Improving language model negotiation with self-play and in-context learning from ai feedback. arXiv 2023, arXiv:2305.10142. [Google Scholar]
- Zhao, W.X.; Zhou, K.; Li, J.; Tang, T.; Wang, X.; Hou, Y.; Min, Y.; Zhang, B.; Zhang, J.; Dong, Z.; et al. A survey of large language models. arXiv 2023, arXiv:2303.18223. [Google Scholar]
- Qin, Y.; Liang, S.; Ye, Y.; Zhu, K.; Yan, L.; Lu, Y.; Lin, Y.; Cong, X.; Tang, X.; Qian, B.; et al. ToolLLM: Facilitating large language models to master 16,000+ real-world APIs. arXiv 2023, arXiv:2307.16789. [Google Scholar]
- Shinn, N.; Cassano, F.; Gopinath, A.; Narasimhan, K.R.; Yao, S. Reflexion: Language agents with verbal reinforcement learning. In Proceedings of the Thirty-Seventh Conference on Neural Information Processing Systems, New Orleans, LA, USA, 10–16 December 2023; Volume 36. [Google Scholar]
- Huang, W.; Xia, F.; Xiao, T.; Chan, H.; Liang, J.; Florence, P.; Zeng, A.; Tompson, J.; Mordatch, I.; Chebotar, Y.; et al. Inner monologue: Embodied reasoning through planning with language models. arXiv 2022, arXiv:2207.05608. [Google Scholar]
- Yang, D.; Chen, K.; Rao, J.; Guo, X.; Zhang, Y.; Yang, J.; Zhang, Y. Tackling vision language tasks through learning inner monologues. Proc. AAAI Conf. Artif. Intell. 2024, 38, 19350–19358. [Google Scholar]
- Zhou, J.; Pang, L.; Shen, H.; Cheng, X. Think Before You Speak: Cultivating Communication Skills of Large Language Models via Inner Monologue. arXiv 2023, arXiv:2311.07445. [Google Scholar]
- Bommasani, R.; Hudson, D.A.; Adeli, E.; Altman, R.; Arora, S.; von Arx, S.; Bernstein, M.S.; Bohg, J.; Bosselut, A.; Brunskill, E.; et al. On the opportunities and risks of foundation models. arXiv 2021, arXiv:2108.07258. [Google Scholar]
- Kurvinen, E.; Koskinen, I.; Battarbee, K. Prototyping social interaction. Des. Issues 2008, 24, 46–57. [Google Scholar] [CrossRef]
- Schön, D.A. The Reflective Practitioner: How Professionals Think in Action; Routledge: London, UK, 2017. [Google Scholar]
- Gordon, M.L.; Zhou, K.; Patel, K.; Hashimoto, T.; Bernstein, M.S. The disagreement deconvolution: Bringing machine learning performance metrics in line with reality. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, New York, NY, USA, 8–13 May 2021; pp. 1–14. [Google Scholar]
- Gordon, M.L.; Lam, M.S.; Park, J.S.; Patel, K.; Hancock, J.; Hashimoto, T.; Bernstein, M.S. Jury learning: Integrating dissenting voices into machine learning models. In Proceedings of the CHI Conference on Human Factors in Computing Systems, New Orleans, LA, USA, 29 April 2022; pp. 1–19. [Google Scholar]
- Lee, M.; Srivastava, M.; Hardy, A.; Thickstun, J.; Durmus, E.; Paranjape, A.; Gerard-Ursin, I.; Li, X.L.; Ladhak, F.; Rong, F.; et al. Evaluating human-language model interaction. arXiv 2022, arXiv:2212.09746. [Google Scholar]
- Albrecht, S.V.; Christianos, F.; Schäfer, L. Multi-Agent Reinforcement Learning: Foundations and Modern Approaches; The MIT Press: Cambridge, MA, USA, 2024. [Google Scholar]
- Brookins, P.; DeBacker, J.M. Playing Games with GPT: What Can We Learn about a Large Language Model from Canonical Strategic Games? 2023. Available online: https://ssrn.com/abstract=4493398 (accessed on 1 May 2024).
- Guo, F. GPT in game theory experiments. arXiv 2023, arXiv:2305.05516. [Google Scholar]
- Zhou, Z.; Liu, G.; Tang, Y. Multi-agent reinforcement learning: Methods, applications, visionary prospects, and challenges. arXiv 2023, arXiv:2305.10091. [Google Scholar]
- Zhang, K.; Yang, Z.; Başar, T. Multi-agent reinforcement learning: A selective overview of theories and algorithms. Handb. Reinf. Learn. Control. 2021, 325, 321–384. [Google Scholar]
- Chen, Z.; Zhou, D.; Gu, Q. Almost optimal algorithms for two-player zero-sum linear mixture markov games. In Proceedings of the International Conference on Algorithmic Learning Theory, Paris, France, 29 March–1 April 2022; pp. 227–261. [Google Scholar]
- Ji, Z.; Lee, N.; Frieske, R.; Yu, T.; Su, D.; Xu, Y.; Ishii, E.; Bang, Y.J.; Madotto, A.; Fung, P. Survey of hallucination in natural language generation. ACM Comput. Surv. 2023, 55, 1–38. [Google Scholar] [CrossRef]
- Zhu, Z.; Cheng, X.; Li, Y.; Li, H.; Zou, Y. Aligner2: Enhancing joint multiple intent detection and slot filling via adjustive and forced cross-task alignment. Proc. AAAI Conf. Artif. Intell. 2024, 38, 19777–19785. [Google Scholar] [CrossRef]
- Liu, B.; Lane, I. Attention-based recurrent neural network models for joint intent detection and slot filling. arXiv 2016, arXiv:1609.01454. [Google Scholar]
- Aggarwal, M.; Hanmandlu, M. On modeling ambiguity through entropy. Int. Trans. Oper. Res. 2023, 30, 1407–1426. [Google Scholar] [CrossRef]
- Jiang, H. A latent space theory for emergent abilities in large language models. arXiv 2023, arXiv:2304.09960. [Google Scholar]
- Liu, Q. Does GPT-4 play dice? ChinaXiv 2023. [Google Scholar] [CrossRef]
- Bravetti, A.; Padilla, P. An optimal strategy to solve the prisoner’s dilemma. Sci. Rep. 2018, 8, 1948. [Google Scholar] [CrossRef]
- Tulli, S.; Correia, F.; Mascarenhas, S.; Gomes, S.; Melo, F.S.; Paiva, A. Effects of agents’ transparency on teamwork. In International Workshop on Explainable, Transparent Autonomous Agents and Multi-Agent Systems; Springer: Berlin/Heidelberg, Germany, 2019; pp. 22–37. [Google Scholar]
- Chase, H. LangChain. Available online: https://github.com/langchain-ai/langchain (accessed on 5 April 2024).
- Fudenberg, D.; Levine, D.K. The Theory of Learning in Games; MIT Press: Cambridge, MA, USA, 1998; Volume 2. [Google Scholar]
- Neyman, A. Correlated equilibrium and potential games. Int. J. Game Theory 1997, 26, 223–227. [Google Scholar] [CrossRef]
- Daskalakis, C.; Goldberg, P.W.; Papadimitriou, C.H. The complexity of computing a nash equilibrium. Commun. ACM 2009, 52, 89–97. [Google Scholar] [CrossRef]
- Iancu, D.A.; Trichakis, N. Pareto efficiency in robust optimization. Manag. Sci. 2014, 60, 130–147. [Google Scholar] [CrossRef]
- van der Rijt, J.-W. The quest for a rational explanation: An overview of the development of focal point theory. In Focal Points in Negotiation; Springer International Publishing: New York, NY, USA, 2019; pp. 15–44. [Google Scholar]
- Thawani, A.; Pujara, J.; Ilievski, F. Numeracy enhances the literacy of language models. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Virtual Event, 7–11 November 2021; pp. 6960–6967. [Google Scholar]
- Spithourakis, G.P.; Riedel, S. Numeracy for language models: Evaluating and improving their ability to predict numbers. arXiv 2018, arXiv:1805.08154. [Google Scholar]
- Došilović, F.K.; Brcic, M.; Hlupić, N. Explainable artificial intelligence: A survey. In Proceedings of the 2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), Opatija, Croatia, 21–25 May 2018; pp. 0210–0215. [Google Scholar]
- Brcic, M.; Yampolskiy, R.V. Impossibility results in AI: A survey. ACM Comput. Surv. 2023, 56, 1–24. [Google Scholar] [CrossRef]
- Longo, L.; Brcic, M.; Cabitza, F.; Choi, J.; Confalonieri, R.; Del Ser, J.; Guidotti, R.; Hayashi, Y.; Herrera, F.; Holzinger, A.; et al. Explainable artificial intelligence (XAI) 2.0: A manifesto of open challenges and interdisciplinary research directions. Inf. Fusion 2024, 106, 102301. [Google Scholar] [CrossRef]
- Enßlin, T.; Kainz, V.; Bœhm, C. A Reputation Game Simulation: Emergent Social Phenomena from Information Theory. Ann. Der Phys. 2022, 534, 2100277. [Google Scholar] [CrossRef]
- Kopp, C.; Korb, K.B.; Mills, B.I. Information-theoretic models of deception: Modelling cooperation and diffusion in populations exposed to “fake news”. PLoS ONE 2018, 13, e0207383. [Google Scholar]
- Dai, Z.; Yang, Z.; Yang, Y.; Carbonell, J.; Le, Q.V.; Salakhutdinov, R. Transformer-xl: Attentive language models beyond a fixed-length context. arXiv 2019, arXiv:1901.02860. [Google Scholar]
- Azamfirei, R.; Kudchadkar, S.R.; Fackler, J. Large language models and the perils of their hallucinations. Crit. Care 2023, 27, 120. [Google Scholar] [CrossRef] [PubMed]
- Peng, B.; Quesnelle, J.; Fan, H.; Shippole, E. Yarn: Efficient context window extension of large language models. arXiv 2023, arXiv:2309.00071. [Google Scholar]
- Li, R.; Xu, J.; Cao, Z.; Zheng, H.T.; Kim, H.G. Extending Context Window in Large Language Models with Segmented Base Adjustment for Rotary Position Embeddings. Appl. Sci. 2024, 14, 3076. [Google Scholar] [CrossRef]
Term | Explanation |
---|---|
Prisoner’s Dilemma | In the prisoner’s dilemma, two suspects are arrested, and each has to decide whether to cooperate with or betray their accomplice. The optimal outcome for both is to cooperate, but the risk is that if one cooperates and the other betrays, the betrayer goes free while the cooperator faces a harsh penalty. This game illustrates a situation where rational individuals may not cooperate even when it is in their best interest, leading to a sub-optimal outcome. |
Stag Hunt | The stag hunt game involves two hunters who can choose to hunt either a stag (high reward) or a hare (low reward). To successfully hunt a stag, both hunters must cooperate. However, if one chooses to hunt a hare while the other hunts a stag, the stag hunter gets nothing. It exemplifies a scenario where cooperation can lead to a better outcome, but there is a risk of one player defecting for a smaller, more certain reward. |
Chicken game | In the chicken game, two players drive toward each other, and they must decide whether to swerve (cooperate) or continue driving straight (defect). If both players swerve, they are both safe, but if both continue straight, they crash (a disastrous outcome). This game highlights the tension between personal incentives (not swerving) and the mutual interest in avoiding a collision (swerving). |
Head-tail game | The head-tail game involves two players simultaneously choosing between showing either the head or tail on a coin. If both players choose the same side (both heads or both tails), one player wins. If they choose differently, the other player wins. This game illustrates a simple coordination problem, where players have to predict and match each other’s choices to win. |
The battle of sexes | In the battle of the sexes game, a couple has to decide where to go for an evening out, with one preferring a football game and the other preferring the opera. Each player ranks the options: the highest payoff is when both go to their preferred event, but they prefer being together over going alone. It demonstrates the challenge of coordinating when preferences differ and highlights the potential for multiple equilibria. |
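The payoff structures described in prose above can be made concrete with a small pure-strategy Nash equilibrium check. The sketch below is our illustration: the stag-hunt payoff numbers are textbook-style values we chose, not necessarily those used in the experiments.

```python
# Illustrative sketch: enumerate pure-strategy Nash equilibria of a 2x2 normal-form game.
# Payoff values are textbook-style assumptions, not necessarily the paper's matrices.
from typing import Dict, List, Tuple

Game = Dict[Tuple[str, str], Tuple[float, float]]

def pure_nash(game: Game, actions_a: List[str], actions_b: List[str]) -> List[Tuple[str, str]]:
    """Return all action pairs where neither player can gain by unilaterally deviating."""
    equilibria = []
    for a in actions_a:
        for b in actions_b:
            u_a, u_b = game[(a, b)]
            a_best = all(game[(a2, b)][0] <= u_a for a2 in actions_a)
            b_best = all(game[(a, b2)][1] <= u_b for b2 in actions_b)
            if a_best and b_best:
                equilibria.append((a, b))
    return equilibria

stag_hunt: Game = {("stag", "stag"): (4, 4), ("stag", "hare"): (0, 3),
                   ("hare", "stag"): (3, 0), ("hare", "hare"): (3, 3)}
print(pure_nash(stag_hunt, ["stag", "hare"], ["stag", "hare"]))
# -> [('stag', 'stag'), ('hare', 'hare')]: both coordination outcomes are pure Nash equilibria
```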
Equilibrium type per game:

Game | Correlated | Nash | Pareto | Focal Point
---|---|---|---|---
Prisoner’s Dilemma | ✓ | |||
Stag Hunt | ✓ | |||
Chicken game | ✓ | |||
Head-tail game | ✓ | |||
The battle of sexes | ✓ |
  | Player B (cooperates) | Player B (defects)
---|---|---
Player A (cooperates) | 3, 3 | 0, x |
Player A (defects) | x, 0 | 1, 1 |
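As a rough guide to how the parameter x in the matrix above shifts the game from competitive to cooperative, the following sketch gives our reading of the table (illustrative threshold logic, not the paper's exact procedure): for x > 3 defection strictly dominates and the game is Prisoner's-Dilemma-like, while for x <= 3 mutual cooperation is a Nash equilibrium and the game is Stag-Hunt-like.

```python
# Our reading of the parameterized payoff matrix above; thresholds are illustrative,
# and the specific x values tested in the paper are not reproduced here.

def classify(x: float) -> str:
    """Classify the parameterized 2x2 game for a given temptation payoff x."""
    payoff = {("C", "C"): (3, 3), ("C", "D"): (0, x),
              ("D", "C"): (x, 0), ("D", "D"): (1, 1)}
    # Row player's best response to a cooperating opponent: defect pays x, cooperate pays 3.
    if x > payoff[("C", "C")][0]:
        return "competitive: defection strictly dominates; unique equilibrium (D, D) (PD-like)"
    return "cooperative: (C, C) is a Nash equilibrium alongside (D, D) (Stag-Hunt-like)"

for x in (5, 3, 2):
    print(f"x = {x}: {classify(x)}")
```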