2.1. Basic Principles
In the following, we summarize the most important aspects of the Reputation Game’s mechanisms. For the details, we refer to [8].
The game follows the interactions of n agents that exchange information of only one kind, namely, the honesty of other agents. In each round, every agent once takes on the role of the speaker (denoted agent a): he chooses a communication partner b and a topic agent c, transmits a message about c to b, and then receives a reply from b about c. This means that a and b tell each other how honest they believe c’s statements to be in general, which can either be the agents’ actual belief, in the form of an honest statement, or a lie. The model is built around this type of information, called the reputation. The agents use this image of others to decide about the trustworthiness of received messages and to properly incorporate those into their beliefs, while striving to maximize their own reputation in the eyes of their fellows. A higher reputation of an agent also means a stronger influence of his messages on the group. So far, various strategies for achieving this goal are implemented in the RGS, varying in the way lies are used to deceptively influence other agents’ beliefs.
The agents’ beliefs are represented as a set of two parameters that denote the number of honest and dishonest statements, respectively, that a believes b to have told during the game. Following Bayesian logic, these two counts define a probability distribution over the possible lie frequency of b in the eyes of a. The mean of a belief, calculated from this probability distribution, is the reputation of b for agent a and measures how frequently a believes b to tell the truth.
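To make this representation concrete, the following Python sketch stores such a belief as a pair of statement counts and computes the reputation as the mean of the implied Beta distribution. The class name, the flat prior (adding one to each count), and the example numbers are illustrative assumptions, not the RGS implementation.

```python
from dataclasses import dataclass

@dataclass
class Belief:
    """Belief of one agent about another, stored as counts of
    statements judged honest (mu) and dishonest (la)."""
    mu: float = 0.0
    la: float = 0.0

    def reputation(self) -> float:
        # Mean of Beta(x; mu + 1, la + 1), i.e. the expected
        # truth-telling frequency (flat prior assumed).
        return (self.mu + 1.0) / (self.mu + self.la + 2.0)

# Example: after 3 statements judged honest and 1 judged dishonest,
# the estimated honesty of the other agent is (3 + 1) / (3 + 1 + 2) ~ 0.67.
belief_ab = Belief(mu=3.0, la=1.0)
print(belief_ab.reputation())
```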
Besides their beliefs on the behavior of others, the agents’ mental states also include first-order Theory of Mind (ToM) values, denoting what a thinks b thinks about c. Accurate information on the beliefs and plans of others, on the one hand, protects the agents against being manipulated and, on the other hand, makes it easier to successfully deceive their counterparts.
After receiving a message, the updated belief of an agent would ideally be a superposition of two probability distributions: one for the old belief, which is to be kept in case the agent was lied to, and one for the new belief, in case the message was honest. The two components are weighted according to the receiver’s assessment of whether the message was a lie or the truth. For the sake of keeping the two-parameter form, the superposition is then compressed back into a single Beta distribution by minimizing the appropriate information distance, namely, the Kullback–Leibler divergence.
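To illustrate this compression step numerically, the sketch below builds the two-component mixture and searches for the single Beta distribution with the smallest Kullback–Leibler divergence to it on a grid. The flat prior (adding one to each count), the grid-based optimization, and all function names are our own illustrative assumptions rather than the procedure used in the RGS code.

```python
import numpy as np
from scipy.stats import beta as beta_dist
from scipy.optimize import minimize

x = np.linspace(1e-4, 1 - 1e-4, 2000)  # grid over possible honesty values

def mixture_pdf(x, old, new, trust):
    """Superposition of the kept (old) and updated (new) belief, weighted
    by the assessed probability `trust` that the message was honest."""
    p_old = beta_dist.pdf(x, old[0] + 1, old[1] + 1)
    p_new = beta_dist.pdf(x, new[0] + 1, new[1] + 1)
    return (1 - trust) * p_old + trust * p_new

def compress_to_beta(old, new, trust):
    """Find the single Beta distribution closest (in KL divergence) to the mixture."""
    p = mixture_pdf(x, old, new, trust)
    p /= np.trapz(p, x)

    def kl(params):
        a, b = np.exp(params)        # enforce positivity of the Beta parameters
        q = beta_dist.pdf(x, a, b)
        return np.trapz(p * np.log(p / q), x)

    res = minimize(kl, x0=np.log([2.0, 2.0]), method="Nelder-Mead")
    a, b = np.exp(res.x)
    return a - 1, b - 1              # back to (honest, dishonest) counts

# Example: old belief (2, 2), message would shift it to (5, 2), trusted at 70%.
print(compress_to_beta((2, 2), (5, 2), trust=0.7))
```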
Whether an agent a lies or tells the truth in a specific situation is not decided by himself but by a random process characterized by his intrinsic, agent-specific lie frequency.
If an agent is chosen to lie in the first place, his lies are constructed around his first-order ToM values, i.e., what a thinks that b thinks about c: a positive lie transmits a message slightly above this value, and a negative lie a message slightly below it, with an offset that a expects to keep the statement credible for b. This means the agents present the topic slightly better or worse than they believe the receiver to currently think about it. Transmitting the exact first-order ToM values is what is referred to as a white lie. After sending a message, the agents also update their belief on their own honesty according to whether they have just lied or not. This value is furthermore influenced by the information the agents receive from others about themselves.
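A minimal sketch of this lie construction is given below. Representing the message as a pair of statement counts and the credibility offset `delta` as a shift in those counts are assumptions made purely for illustration.

```python
def construct_message(tom_mu, tom_la, direction, delta=1.0):
    """Build a lie around a's first-order ToM of what b thinks about c.

    direction: +1 (positive lie), -1 (negative lie), 0 (white lie).
    delta:     offset that the speaker expects b to still find credible
               (assumed to be a shift in statement counts, for illustration).
    """
    if direction > 0:       # present the topic a bit better than b is thought to see it
        return tom_mu + delta, tom_la
    if direction < 0:       # present the topic a bit worse
        return tom_mu, tom_la + delta
    return tom_mu, tom_la   # white lie: repeat b's assumed view unchanged

# a thinks that b counts c at 4 honest / 2 dishonest statements;
# a positive lie then claims roughly 5 honest / 2 dishonest.
print(construct_message(4, 2, direction=+1))
```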
The agents use several indicators to detect lies in order to protect themselves from deception and from adopting false information into their belief state. These include blushing (when agents tell a lie, they run the risk of exposing themselves by subsequently blushing with a certain frequency) and unusual surprise. The latter is the most crucial part of lie detection and means evaluating the trustworthiness of a message according to the surprise it generates in the agent’s mind when hearing it, i.e., the amount of information gained if the agent updated its initial belief to the information contained in the message. This surprise is calculated as the Kullback–Leibler divergence between the agent’s current belief on the topic and the message. The value is furthermore weighted by an agent-specific factor that denotes the mean of the message surprises of the last ten conversations and therefore represents the current “social atmosphere”.
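The following sketch illustrates the surprise measure, using the closed-form Kullback–Leibler divergence between two Beta distributions and a rolling mean over the last ten surprises as the atmosphere factor. How exactly this factor enters the trust assessment is simplified here (we merely divide by it), and all names, priors, and numbers are assumptions for illustration.

```python
from collections import deque
from scipy.special import betaln, digamma

def beta_kl(a1, b1, a2, b2):
    """KL divergence between Beta(a1, b1) and Beta(a2, b2)."""
    return (betaln(a2, b2) - betaln(a1, b1)
            + (a1 - a2) * digamma(a1)
            + (b1 - b2) * digamma(b1)
            + (a2 - a1 + b2 - b1) * digamma(a1 + b1))

recent_surprises = deque(maxlen=10)   # surprises of the last ten conversations

def weighted_surprise(belief, message):
    """Surprise of a message, relative to the current 'social atmosphere'."""
    # belief and message are (honest, dishonest) count pairs; +1 assumes a flat prior.
    s = beta_kl(message[0] + 1, message[1] + 1, belief[0] + 1, belief[1] + 1)
    kappa = sum(recent_surprises) / len(recent_surprises) if recent_surprises else 1.0
    recent_surprises.append(s)
    return s / kappa

print(weighted_surprise((2, 2), (8, 1)))  # a rather surprising, very positive message
```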
Another indicator for honesty is the so-called confession. It means that, if a statement of an agent about himself is more negative than the receiver’s current belief, the message is seen to be ultimately honest.
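A compact way to express this rule, assuming beliefs and messages are summarized by their mean reputation (the function name and return convention are illustrative only):

```python
def trust_in_self_statement(statement_mean, receivers_belief_mean):
    """Confession rule: a statement of agent a about himself that is more negative
    than the receiver's current belief about a is taken to be certainly honest."""
    if statement_mean < receivers_belief_mean:
        return 1.0        # confession: fully trusted
    return None           # otherwise trust follows from the usual lie-detection cues

print(trust_in_self_statement(0.4, 0.6))  # a talks himself down -> message fully trusted
```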
Besides the pure reputation-driven dynamics, the RGS contains a second principle that decides whether an agent should be lied about positively or negatively: whether they are regarded as a friend or an enemy. If an agent receives a message about himself in which the mean reputation is above the perceived average, the sender is added to the receiver’s list of friends, or to the list of enemies in the opposite case. If an agent a decides to communicate false information, he will lie positively about friends, causing the receiver to put more trust in the expected positive subsequent messages of that friend about a. Following the same reasoning, a aims at minimizing the reputation of his enemies.
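In code, the friend/enemy bookkeeping could look like the sketch below; the dictionary-of-sets representation and the handling of the tie case are assumptions made for illustration.

```python
def update_relation(receiver, sender, message_mean, perceived_average):
    """Classify the sender as friend or enemy based on a message about the receiver.

    message_mean:      mean reputation the message assigns to the receiver himself
    perceived_average: what the receiver perceives as the average reputation
    """
    if message_mean > perceived_average:
        receiver["friends"].add(sender)
        receiver["enemies"].discard(sender)
    else:                                  # ties treated as the negative case here
        receiver["enemies"].add(sender)
        receiver["friends"].discard(sender)

agent_a = {"friends": set(), "enemies": set()}
update_relation(agent_a, sender="b", message_mean=0.8, perceived_average=0.5)
print(agent_a)   # b is now on a's friend list
```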
2.2. Implementation of a Second-Order Theory of Mind
We will now replace the lie direction rule for a new type of agent (called ToM-agent in the following), based on the following considerations. The standard mental model of agents in the RGS includes a ToM up to the first order, representing a’s belief about b’s zeroth-order belief on c. The ToM-agent’s knowledge includes not only those first-order ToM values but also the very fact that this is the highest order that other agents conceive, as well as an understanding of the mental processes leading to those beliefs. At a minimum, we require the ToM-agents to be able to “imagine” how a hypothetical message affects a hypothetical mind. The rational consequence is to extend the ToM by an extra order, as this allows the ToM-agent to build up a complete picture of his fellows’ mental states and, therefore, to examine how they are influenced by possible messages and to choose the option promising the best result in terms of his expected reputation. The ToM strategy therefore consists of using this extended picture of the surroundings to simulate the impact of different behavioral options on it. Following rational theory, the best response is again to undermine this strategy by increasing one’s ToM by one more level and simulating the ToM-agent’s simulations, and so on for each further response, resulting in an infinite number of iterations [10]. This is neither reasonable to model nor a process to be expected in real life. Sociology finds that, in practice, people do not apply deductive rational reasoning to the highest level but rather act according to a so-called bounded rationality, which basically emerges from limitations in human brain capacities and from the resulting belief that others also react according to a bounded rationality [11]. Moreover, in the RGS, the information contained in the ToM becomes more uncertain with each order, making the mental simulations more inaccurate and mostly useless at some depth. The proper strategic solution is, therefore, to identify the level of others’ reasoning and undermine their strategy by thinking exactly one order deeper than their counterparts, which is exactly what ToM-agents in the RGS do, as they know the others’ depth of reasoning to be one.
We also see this approach legitimized by its practical relevance, as sociological studies show that humans in general are able to form and apply second-order ToM beliefs, while higher orders are observed less frequently [1,2,3,10].
2.4. Finding the Optimal Message
If an agent has chosen communication partner b and topic c, he finds himself with different options for the message that he is about to send, which are a white, a positive, and a negative lie or a’s honest opinion. In the first step, the ToM-agent constructs all these possible messages on the basis of his first-order ToM. Then, five virtual dummy minds for b are created, of which four are updated under one of the possible messages each, whereas the last is confronted with an arbitrary lie followed by blushing. We denote these options in the following by w, +, −, t, and blush, respectively.
a needs to consider the possibility of being exposed as a liar by blushing, as this diminishes the expected gain in reputation from telling a lie. Blushing happens with a fixed frequency that is the same for, and known to, all agents. This procedure is illustrated in Figure 1. The quality of a message is then measured by the expected reputation of a after the conversation. For the honest statement, this is equal to the reputation of a in the eyes of the simulated dummy after the update under the honest information. For the lies, the expected value is calculated as the sum of the reputation after the update under the successful lie, multiplied by the frequency of not blushing, and the reputation after the update under the lie followed by blushing, multiplied by the frequency of blushing. The message promising the highest expected value is then actually sent to the receiver. As only one positive and one negative lie are simulated, the chosen message is in general close to, but not necessarily, the very best option.
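The expected-value computation for the message options can be sketched as follows. The reputations of the dummy minds and the blushing frequency are plain illustrative numbers here, and the dictionary keys follow the w, +, −, t labels introduced above.

```python
def expected_reputation(option_reps, blush_rep, f_blush):
    """Expected reputation of a after each message option.

    option_reps: maps 'w', '+', '-', 't' to a's reputation in the simulated
                 dummy mind after that message succeeded
    blush_rep:   a's reputation in the dummy that caught a lie via blushing
    f_blush:     blushing frequency (known to all agents)
    """
    expected = {}
    for opt, rep in option_reps.items():
        if opt == "t":                       # honest statement: no blushing risk
            expected[opt] = rep
        else:                                # lies: weight success vs. exposure
            expected[opt] = (1 - f_blush) * rep + f_blush * blush_rep
    return expected

reps = {"w": 0.62, "+": 0.70, "-": 0.58, "t": 0.60}      # illustrative dummy results
exp = expected_reputation(reps, blush_rep=0.35, f_blush=0.1)
print(max(exp, key=exp.get), exp)            # the option actually sent
```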
When the ToM-agent talks about the receiver b himself, b’s update on a is not influenced by the content of the message but only by the amount of trust that b places in it being true. Therefore, in this scenario, the white lie or the truth turn out to be the most promising options (as the ToM-agent currently does not aim for friendship and its benefits). When talking about oneself, no general answer can be given as to which lie, if not the truth, is the optimal message, as it arises from a more complex interplay of parameters. Besides the intuitive result that lying positively is the most promising option in some cases, lying in the downward direction can sometimes boost the reputation even more. This is due to the RGS’s concept of confession: if a receiver b gets a message from an agent a in which a presents himself worse than b currently thinks of a, b considers the statement to be ultimately true, as the only purpose of sending a potentially damaging message must lie in the intention of making a true statement. Therefore, if a manages to place a negative lie about himself that lies only slightly below b’s current opinion, the negative impact of the message’s content is overcompensated by the one certainly true statement that b adds to his belief on a.
In the case of talking about a third agent c that is neither the receiver nor the ToM-agent a himself, a influences b’s opinion not only on a but also on c and therefore also takes into account possible subsequent messages from c to b about a. Depending on what a expects c to tell b about him, it might be reasonable to accept a different effect on his reputation in the first communication with b. If a’s statement on c leads to a different reputation of c in the eyes of b, then c’s positive or negative messages about a will be evaluated differently by b, and the expected values of the message options change. In order to investigate the effect of the conversation from c to b about the ToM-agent, he now imagines the four possible messages that c could tell b on the basis of his knowledge about c’s mental state, as shown in Figure 2. These are, again, the honest statement, a positive and a negative lie, and a lie followed by unintended blushing. The honest statement that a expects c to make is given by the corresponding first-order ToM value, which represents what a thinks c believes about him. As c will construct his lies to b around his own first-order ToM values, a constructs his expectation for them around his second-order ToM values.
For making predictions about c’s behavior, a assumes friendships to be symmetrical: if a sees c as a friend, he automatically assumes that c also considers a to be a friend, and the same logic applies if c is an enemy of a. This makes a expect c to tell b a positive lie about him in the first case and a negative one in the latter. If c is neither friend nor enemy in the eyes of a, then a expects positive and negative lies with a probability of 1/2 each.
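A small sketch of this prediction rule, reusing the friend/enemy sets from the bookkeeping example above (the random tie-breaking for neutral relations follows the 1/2 probability stated in the text; names are illustrative):

```python
import random

def expected_lie_direction(a, c):
    """Direction in which a expects c to lie about a, assuming symmetric friendship.

    Returns +1 (positive lie) or -1 (negative lie); neutral relations are
    resolved at random with probability 1/2 each.
    """
    if c in a["friends"]:       # a assumes c also sees a as a friend
        return +1
    if c in a["enemies"]:       # a assumes c also sees a as an enemy
        return -1
    return random.choice([+1, -1])

agent_a = {"friends": {"c"}, "enemies": set()}
print(expected_lie_direction(agent_a, "c"))   # +1: a expects c to praise him to b
```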
The ToM-agent calculates the expected reputations for each combination of a message from himself about c followed by a possible message from c to b. He then evaluates the expected value for each of his options by considering the possible messages by c and the possibility of blushing. The honest message that a expects c to make about a is given by a’s first-order ToM of c’s belief about him. However, when b receives that message, he will extract the new information by comparing its content to his own first-order ToM of c’s belief about a. As we construct the ToM-agent so that he assumes his second-order representation of this value to be equal to his first-order ToM, the content of the honest subsequent message in the simulation only influences b’s belief in the trustworthiness of c, but not b’s belief about a, which is what is of actual interest. Therefore, the analysis of the strategy shows that it is advantageous for the ToM-agent to lie positively about agents that he considers to consider him a friend and negatively about agents that he thinks see him as an enemy. Due to the assumed symmetry, this is equivalent to the existing rule of lying positively/negatively about agents the sender himself considers a friend/enemy. Only if the first-order friendship status and the second-order ToM values were tracked and updated as independent parameters would the rule become more differentiated. However, the current structure of the RGS does not provide a mechanism to do so. Moreover, the uncertainty of ToM values increases with the order, which was also a reason to introduce the bounded rationality in the first place.
2.5. ToM-Agent Update Rules
We also introduced additional information update rules that are unique to ToM-agents compared to the agents of the previous version. First, after sending the chosen message to b, the ToM-agent replaces his representation of b’s mind with the dummy mind that was updated under that message.
We also assume the ToM-agents to be deaf towards messages about themselves. As their lie decision does not depend on random processes but on prudent considerations, we find it reasonable to assume that the ToM-agents are aware of their exact lie and truth counts and should therefore not be influenceable in their self-reputation. Nevertheless, those messages are still used to update the ToM-agents’ view on the honesty of the sender.
An agent that understands the general structure of a mind in the RGS well enough to perform the described simulations should also be aware that other agents, like himself, update their self-reputation after sending a message. Therefore, after receiving a message from a, the ToM-agent b will update his first-order belief about a’s view of himself accordingly. In a two-agent game, the ToM-agent is thereby capable of keeping track of his counterpart’s exact mental state at all times, given known starting parameters. Moreover, a lie can be spotted easily if the content of a message does not precisely match the ToM-agent’s first-order ToM. In this constellation, ToM-agents therefore always manage to reach and keep the top reputation after a small number of rounds by effectively outplaying their counterpart. In larger groups, the first-order ToM representation is in general not equal to the real values, which is why we stick to smart lie detection (as described in Enßlin et al. [8]) for more than two agents.
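The two update rules specific to the two-agent case can be sketched as follows; the count-pair representation of the first-order ToM and the exact-match lie test are simplified illustrations of the mechanisms described above, not the RGS implementation.

```python
def mirror_self_update(tom_a_about_a, message_was_lie):
    """Mirror a's self-reputation update in b's first-order ToM after a spoke.

    tom_a_about_a:   (honest, dishonest) counts b believes a keeps about himself
    message_was_lie: b's verdict on the received message
    """
    mu, la = tom_a_about_a
    return (mu, la + 1) if message_was_lie else (mu + 1, la)

# In a two-agent game b's first-order ToM is exact, so any deviation of the
# message content from it exposes the lie with certainty.
def is_lie(message, tom_about_topic):
    return message != tom_about_topic

received = (4, 3)                      # content of a's message about the topic
lie = is_lie(received, (4, 2))         # b expected a to report (4, 2)
print(mirror_self_update((6, 1), lie)) # -> (6, 2): a is assumed to count one more lie
```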