Article

Bayesian Social Learning with Local Interactions

by
Antonio Guarino
1 and
Antonella Ianni
2,*
1
Department of Economics ELSE, University College London, London WC1E 6BT, UK
2
Economic Division, School of Social Sciences, University of Southampton, Southampton SO17 1BJ, UK
*
Author to whom correspondence should be addressed.
Games 2010, 1(4), 438-458; https://doi.org/10.3390/g1040438
Submission received: 8 September 2010 / Accepted: 2 October 2010 / Published: 20 October 2010
(This article belongs to the Special Issue Social Networks and Network Formation)

Abstract

We study social learning in a large population of agents who only observe the actions taken by their neighbours. Agents have to choose one of two reversible actions, each optimal in one of two unknown states of the world. Each agent chooses rationally, on the basis of private information and of the observation of his neighbours’ actions. Agents can repeatedly update their choices at revision opportunities that they receive in a random sequential order. We show that if agents receive equally informative signals and observe both neighbours, then actions converge exponentially fast to a configuration where some agents are permanently wrong. In contrast, if agents are unequally informed (in that some agents receive a perfectly informative signal and others are uninformed) and observe one neighbour only, then everyone will eventually choose the correct action. Convergence, however, obtains very slowly, at rate $\sqrt{t}$.

1. Introduction

In many economic and social situations, we make our decisions after observing the choices of others. We can learn from such choices, since they can reveal information that others hold. Observing others can help us to form a more precise evaluation and make a better choice. Of course, observing others can also lead to conformism, in so far as we decide to just follow the crowd.
The process of learning from others was first studied by Bikhchandani et al. [1] and Banerjee [2]. These papers focused on a simple case in which agents, endowed with some private information, act sequentially and make one irreversible choice after observing the entire history of actions taken by their predecessors. This setup was, of course, very convenient for the analysis, but, at the same time, quite restrictive.
One of the features that restrict the applicability considerably is that every agent can observe the whole set of choices made by others. In most cases, we do not observe everyone’s decisions but only those made by agents that we know, like our friends or neighbours. We know, for example, which restaurant our friends go to, which bank they use, which car they bought. We may try to infer information from their actions. But this inference is, of course, very complicated. While we observe their actions, and can take these into account to make our decision, they themselves could have gone through a similar process when it was their turn to make a choice. They could have observed our previous decisions and those of their friends and neighbours, that maybe we do not observe. And their neighbours, in turn, could have observed the decisions of others, and so on. Clearly, when we take all this into account, we realize that the process of social learning is quite an intricate and complex phenomenon.
The purpose of our paper is to shed some light on social learning when agents can only observe the actions taken by their neighbours. The seminal papers by Bikhchandani et al. [1] and Banerjee [2] illustrated a striking phenomenon: in their set up, eventually agents decide to disregard their private information and just conform to the prevailing action chosen by their predecessors. Conformism prevails, and it may well be that the entire population settles on the wrong action. Beliefs may never converge to the truth, even though the population consists entirely of rational agents. In situations in which agents can only observe their neighbours’ decisions, should we still expect uniformity of behavior? Can we still expect agents to neglect their private information? Will the population as a whole eventually learn the right decision? And if so, will this convergence process be slow or fast?
When the agents’ observation is limited to their neighbours’ choices, the amount of information that they receive is limited and the possibility of learning seems reduced. On the other hand, the way in which this information is disseminated may be more efficient, since agents may rely more on their private information, and feed this into the social learning process by their choice of actions. Therefore, the social learning process under local interactions can differ in many ways from the one studied in the canonical models.
To address these issues, we build a simple model that departs from the canonical one in various ways. First, as we said, we assume that agents can only observe the behavior of their neighbours, that is, a subset of other individuals who live close by, on an appropriately characterized spatial structure. Second, we let each agent revise his original decision repeatedly and postulate that updating opportunities occur in a random sequential order. Third, we assume that agents do not know the time of their decision and have no recall of past (own and neighbours’) experiences. This last assumption makes the inference problem easier but far from trivial. We believe that it not only helps to make the model tractable (assuming perfect memory would make the inference problem overwhelmingly complicated), but is also plausible in the analysis of the dynamics of choices taken by a large population.
In this set up, we address two issues. The first concerns the social learning process in terms of its asymptotic properties. Starting from an initial random configuration of beliefs in the population, we ask whether social learning is complete, in the sense that beliefs converge to the truth, or at least adequate, in the sense that all agents choose the action that is optimal given the state of the world. The second issue concerns the speed of learning. In our view, this issue is particularly relevant as the distinction between slow convergence to the truth and fast convergence to the false is not an obvious one, neither in practical terms, nor in terms of efficiency. We are able to provide an analytical solution for the speed of convergence of actions, and, by pursuing a space-time analysis (i.e., by relating the two dimensions, time and space, over which our process is defined), for the process of cluster formation.
Our main results are the following. When agents are equally informed (i.e., endowed with a signal of equal precision) learning is not complete and the process of actions converges exponentially fast to a configuration where somebody is permanently wrong. When agents are unequally informed, in the sense that some receive a fully informative signal and some receive a completely uninformative signal, social learning is adequate, that is, there is convergence to a state in which everybody chooses the correct action. Convergence, however, obtains very slowly.
The paper is organized as follows. Section 2 describes the model. Section 3 contains the main results. Section 4 relates our work to the existing literature. Finally, Section 5 concludes and Appendices A and B contain the proofs.

2. The Economy

We consider a set $\Omega = \{0, 1\}$ of possible states of nature. The two states are equally likely. In the economy there is a set $X$ of countably many agents. Each agent $x \in X$ has to choose an action in the action space $A = \{0, 1\}$. Time $t$ runs continuously ($t \geq 0$). Each agent $x$ makes a decision at time $t = 0$ and then may be called to revise it more than once. In particular, each agent may have to choose a new action at a random exponential time, with mean 1. In any small time interval, at most one agent can reassess his decision, and every agent is equally likely to receive an updating opportunity. We denote the action of agent $x$ at time $t$ by $\eta_t(x) \in A$. Let $\{\tau_x^l\}$ be the sequence of times when $x$ receives an updating opportunity, with $\tau_x^0 \equiv 0$ and $l = 0, 1, 2, \ldots$ Agent $x$'s choice at $t$ is equal to the decision he made at his last updating opportunity, that is, $\eta_t(x) = \eta_{\tau_x^l}(x)$ for $\tau_x^l \leq t < \tau_x^{l+1}$.
Each agent gathers information on the state of the world in two ways. At time $t = 0$ he observes a private symmetric binary signal on the realized state of nature. We denote the signal observed by agent $x$ by $\theta^\sigma(x): \Omega \to \{0, 1\}$, where the index $\sigma \in \{h, l\}$ refers to the precision of the signal. An agent can receive a signal of high precision $q_h \equiv \Pr[\theta^h(x) = 1 \mid \omega = 1] = \Pr[\theta^h(x) = 0 \mid \omega = 0] \in (0.5, 1]$ or a signal of low precision $q_l \equiv \Pr[\theta^l(x) = 1 \mid \omega = 1] = \Pr[\theta^l(x) = 0 \mid \omega = 0] \in [0.5, q_h]$. Note that, conditional on a state of nature, the signals that agents receive are independently distributed. Note also that $q_l$ is weakly lower than $q_h$. We will analyze a case in which the precisions are identical and one in which one precision is strictly lower than the other.
Having observed the signal at time 0, each agent $x$ makes his first choice, $\eta_0(x)$. When the agent receives another opportunity to take a decision (i.e., to revise the choice previously made), he observes the decisions taken by a subset of other agents in the population. This is the second way in which he gathers information on the state of nature. We provide each agent with a spatial location on a 1-dimensional lattice $\mathbb{Z}^1$ (an address), and assume that he can only interact with the set of agents who live in his vicinity. Formally, we take $X \equiv \mathbb{Z}^1$ and define the set of $x$'s nearest neighbours as $N(x) = \{y : |y - x| = 1\}$, that is, the set of two agents who live at Euclidean distance 1 from agent $x$. We denote these two agents by $x - 1$ and $x + 1$, and the information set upon which agent $x$ takes a decision at time $t$ by $I_t(x)$. In order for agents to be able to fully draw inference upon the observation of their neighbours’ actions, one should endow each agent with a very rich information set, including the history of actions chosen in their neighbourhood and the exact order in which updating opportunities have been assigned until that point. Although we assume that agents are able to perform Bayesian updating, we take the view that these requirements are unreasonable in the setup of a large population of agents. On these grounds, we assume that agents have limited memory, in the sense that, if at time $t$ agent $x$ has to choose an action, his information set is:
$$I_t(x) \equiv \begin{cases} \{\theta^\sigma(x), \sigma\} & \text{for } t = 0 \\ \{\eta_t(x), \eta_t(y),\ y \in \{x \pm 1\}, \sigma\} & \text{for } t > 0 \end{cases}$$
where $\sigma \in \{h, l\}$. For the entire paper, we keep the assumption that at time $t = 0$ agents observe only their private signal. For $t > 0$, instead, we will consider two cases, one in which the agent observes both neighbours, and one in which he only observes one neighbour (randomly drawn). Note that, because of limited memory, agents revise their decisions every time completely ignoring past history. Note also that the time $t$ does not belong to the information set, that is, the agent does not know the time of his decision, nor does he know how many times he has already had an opportunity to revise his choice. On the other hand, $\sigma$ does belong to agent $x$'s information set, meaning that each time a decision is to be made, $x$ knows the precision of the signal he received.
Agents’ payoff only depends on the realized state of nature ω and on the chosen action. While action 0 always gives a payoff of zero, action 1 gives a payoff of one if the state is 1 and a payoff of minus one if the state is 0. On the basis of the available information, at time t agent x chooses η t ( x ) to maximize E [ U ( η t ( x ) , ω ) | I t ( x ) ] and sticks to this decision until a new updating opportunity arises.
If we denote the belief at time $t$ that $\omega = 1$ by $\pi_t(x) = \Pr[\omega = 1 \mid I_t(x)]$, then $x$'s optimal action is
$$\eta_t^*(x) \begin{cases} = 1 & \text{if } \pi_t(x) > 0.5 \\ \in \{0, 1\} & \text{if } \pi_t(x) = 0.5 \\ = 0 & \text{if } \pi_t(x) < 0.5 \end{cases}$$
or, equivalently,
$$\eta_t^*(x) \begin{cases} = 1 & \text{if } \lambda_t(x) > 0 \\ \in \{0, 1\} & \text{if } \lambda_t(x) = 0 \\ = 0 & \text{if } \lambda_t(x) < 0 \end{cases}$$
where $\lambda_t(x) \equiv \log \frac{\Pr[\omega = 1 \mid I_t(x)]}{\Pr[\omega = 0 \mid I_t(x)]}$ denotes the log-likelihood ratio (LLR).
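To fix ideas, the decision rule above reduces to a sign test on the LLR. A minimal Python sketch follows; the tie-breaking choice at $\lambda = 0$ is an arbitrary assumption, since either action is optimal there.

```python
import math

def llr(p_state1: float) -> float:
    """Log-likelihood ratio log(Pr[w=1|I] / Pr[w=0|I]) from the posterior on state 1."""
    return math.log(p_state1) - math.log(1.0 - p_state1)

def optimal_action(lam: float, tie_break: int = 1) -> int:
    """Sign test on the LLR: action 1 if positive, 0 if negative, any action on a tie."""
    if lam > 0:
        return 1
    if lam < 0:
        return 0
    return tie_break  # both actions are optimal when lam == 0

# Example: a posterior of 0.7 on state 1 yields action 1; 0.3 yields action 0.
print(optimal_action(llr(0.7)), optimal_action(llr(0.3)))
```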
Essentially, agents choose their best action whenever they have the opportunity to do so. A simple interpretation is that the actual payoff is realized only when at a random time the game is over. Since agents do not know the end of the game, maximizing the payoff every time is in fact their optimal behavior.
We now define the equilibrium in our game.
Definition 1 (Equilibrium) 
An equilibrium is a profile of strategies $\{\eta_t^*(x)\}_{x \in X}$ such that, for all $x \in X$, $\eta_t^*(x) \in \arg\max E[U(\eta(x), \omega) \mid I_t(x)]$.
Clearly, an equilibrium is absorbing for our learning processes if there exists a time $T$ such that for any $x \in X$ and for any $t > T$, $\eta_t^*(x) = \eta^*(x)$. The dynamics are as follows. Agent $x$ makes a decision at time $t = 0$ and then, whenever he has an updating opportunity, chooses an action according to (1), that is,
$$\eta_t(x) = \eta_{\tau_x^l}^*(x) \quad \text{for } \tau_x^l \leq t < \tau_x^{l+1} \text{ and } l = 0, 1, 2, \ldots$$
At the beginning of time, $\omega$ is realized and each agent $x$ receives a signal $\theta^\sigma(x)$, which determines $\eta_0(x)$. The process then evolves stochastically in continuous time. We refer to the process of social learning as the dynamic process generated by the collection of all individual actions, and we are interested in analyzing its properties. We shall denote the state of the process at time $t$ by $\eta_t \in \{0, 1\}^X$ and we are interested in characterizing its evolution over time and over space.

3. Social Learning

Before proceeding to the analysis, we find it useful to discuss the relation between a canonical model of social learning and a model of social learning with local interactions such as ours. Consider the standard model of sequential social learning of Bikhchandani et al. [1] and suppose that each agent can directly observe the signals received by his predecessors before making a decision. Suppose, for simplicity, that $q_h = q_l \equiv q \in (0.5, 1)$ and that agents are indexed by $1, 2, 3, \ldots, n$. Since the precisions of the signals are identical, we omit the superscript for $\theta$. Then, the LLR for agent $n$ on the basis of his information set $I(n) = \{\{\theta(y)\}_{y \leq n}\}$ is
$$\lambda(n) \equiv 2\log\frac{q}{1-q}\left(\theta(n) - \frac{1}{2}\right) + 2\log\frac{q}{1-q}\sum_{y < n}\left(\theta(y) - \frac{1}{2}\right)$$
Suppose, for instance, that the true state of nature is $\omega = 0$. Then, the random variables $\theta(\cdot)$ have mean $1 - q$. This implies that $\lambda(n)$ tends to $-\infty$ as $n$ tends to infinity. In other words, the assessment of the probability that state $\omega = 1$ is true, $\pi(n) \equiv \exp[\lambda(n)]/[1 + \exp[\lambda(n)]]$, will tend to zero exponentially fast, as the number of observations increases. Therefore, in a canonical model of social learning in which agents can observe all signals, learning is complete in the sense that beliefs converge to the truth (and, hence, actions to the correct decisions), that is, $\Pr[\lim_{n \to \infty} \pi(n) = \omega] = 1$ and $\eta(n) = \omega$ for all $n$ large enough. In particular, convergence obtains exponentially fast.
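As an illustration of this benchmark (and only of the benchmark, not of the model of this paper), the sketch below simulates the LLR as a random walk with drift when all past signals are observed; the parameter values $q = 0.7$ and $\omega = 0$ are purely illustrative assumptions.

```python
import math
import random

random.seed(0)
q, omega, n_agents = 0.7, 0, 200                  # illustrative values only
step = 2 * math.log(q / (1 - q))

lam = 0.0
for n in range(1, n_agents + 1):
    # draw agent n's signal conditional on the true state
    theta = 1 if random.random() < (q if omega == 1 else 1 - q) else 0
    lam += step * (theta - 0.5)                   # each observed signal shifts the LLR
    if n % 50 == 0:
        pi = math.exp(lam) / (1 + math.exp(lam))  # belief that the state is 1
        print(f"n={n:4d}  Pr[w=1 | all signals] = {pi:.3e}")
# With omega = 0 the LLR drifts to -infinity, so the belief decays to 0 exponentially in n.
```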
Essentially, in the canonical model of social learning, the observability of private information guarantees that the outcome of the learning process is informationally efficient. Actions are clearly not as informative as signals in this setting: potential inefficiencies may arise when agents only observe the actions taken by their predecessors and not their signals. Indeed, given a discrete action space, these inefficiencies may take the extreme form of an informational cascade, in which beliefs do not converge to the truth and the entire population settles on the wrong action.
Things are different if interaction is local, as in our framework. To see this, assume that agents do observe signals (and not actions), but only those of their nearest neighbours. Since signals are noisy and each agent observes three signals only, agents’ beliefs, as measured by their LLR, are bounded. As a result, we cannot expect any convergence in beliefs to the truth in this case, and agents will continue choosing the same action (perhaps the incorrect one) indefinitely. Observing actions may in fact make agents better off, as actions taken by one’s neighbour may convey information on signals received by that neighbour’s neighbours and so on.
We are interested in characterizing the properties of social learning processes with local interactions in terms of the degree of informational efficiency. For the reasons just mentioned, requiring complete learning would be too demanding in our model. A weaker requirement is, instead, that all agents eventually choose the correct action, as in the following definition:
Definition 2 (Adequate Learning) 
The social learning process shows adequate learning if
$$\lim_{t \to \infty} \Pr[\eta_t(x) = \omega \ \text{for all } x \in X] = 1$$
In fact, $\lim_{t \to \infty} \Pr[\eta_t(x) = \omega]$ is the limit measure of agents who choose the action appropriate for the true state of the world and is our measure of how informationally efficient the social learning process is.
We now move to address these issues with reference to our specific models.

3.1. Equally Informed Agents

We start by studying the case in which each agent receives a signal of equal precision, $q_h = q_l \equiv q \in (0.5, 1)$, and observes both his neighbours. At any revising opportunity, agent $x$'s information set is:
$$I_t(x) = \begin{cases} \{\theta(x), \sigma\} & \text{at } t = 0 \\ \{\eta_t(x), \eta_t(y),\ y \in \{x \pm 1\}, \sigma\} & \text{at each } t = \tau_x \end{cases}$$
where the superscript $\sigma$ is dropped for notational convenience. We now show that, in this setup, extreme inefficiencies arise in equilibrium.
Theorem 3 
If each agent $x$ receives a symmetric binary signal with precision $q \in (0.5, 1)$ and can observe both neighbours, then the process of social learning is not adequate, and $\lim_{t \to \infty} \Pr[\eta_t(x) = \omega \ \text{for all } x \in X] = 0$. The process converges exponentially fast to a configuration where some agents are permanently wrong.
The proof of the Theorem is contained in Appendix A. It relies on the explicit characterization of the process of inference underlying agents’ optimal choices. Let { τ i } be the sequence of random times at which agents in the population have an opportunity to revise their choice. In other words, τ 1 is the first time an agent (randomly chosen) has the opportunity to change his choice, τ 2 is the second time, etc. Consider the following strategy S1:
At any time $\tau_x$:
if $\eta_{\tau_x}(x-1) = \eta_{\tau_x}(x+1)$, then choose $\eta_{\tau_x}(x \pm 1)$;
if $\eta_{\tau_x}(x-1) \neq \eta_{\tau_x}(x+1)$, then stick to $\eta_{\tau_x}(x)$.
Recall that by the assumptions of the model, Pr [ τ x = τ y = τ i ] = 0 (i.e., no two agents act at the same time) and Pr [ τ x = τ i ] = Pr [ τ y = τ i ] for all x , y (i.e., within any time period agents are equally likely to receive an updating opportunity). Hence, to show that the above strategy is optimal, we need to show that it is so at any time, that is, for any τ x = τ i , and at any stage of the revision process, that is, for any τ x l , l = 1 , 2 , . . .
To start the analysis, it is useful to notice that this strategy is clearly optimal if x is the first agent to receive an updating opportunity. Indeed, at time t = 0 , upon receiving the signal θ ( x ) , any agent x has an LLR equal to
$$\lambda_0(x) = 2\log\frac{q}{1-q}\left(\theta(x) - \frac{1}{2}\right)$$
and, given the incentive structure, η 0 ( x ) = θ ( x ) is the optimal choice at time t = 0 . Since at time t = 0 all agents are playing their signals, if agent x is the first to act, his information set will consist of { θ ( x ) = η 0 ( x ) , θ ( x ± 1 ) = η 0 ( x ± 1 ) } and
$$\lambda_{\tau_1}(x) = 2\log\frac{q}{1-q}\left\{\left(\theta(x) - \frac{1}{2}\right) + \left(\theta(x-1) - \frac{1}{2}\right) + \left(\theta(x+1) - \frac{1}{2}\right)\right\}$$
As a result, the above strategy is the only optimal strategy if x is the first agent to receive an updating opportunity. As such, it is a candidate to be an optimal strategy at any time τ x > 0 . By induction, in the proof, Remark 5 shows that this strategy is optimal as long as it is followed by all other agents. On the basis of this result, Remark 6 proceeds to characterize the equilibrium of the social learning process, as well as to compute the rate of convergence of the process of action choices. Finally, Remark 7 emphasizes that this social learning process with local interactions gives rise to an extreme form of informational inefficiency, as the probability that the whole population learns to behave optimally, given the state of the world, is in fact zero. The result shows that the process of social learning may get absorbed in one of an infinite number of states where someone chooses the correct action and someone does not. In essence, the reason for this endemic multiplicity of stable limit configurations is that agents are extremely inward looking, in the sense that their choices are entirely determined by what happens inside their small neighbourhood, and although neighbourhoods are overlapping, information fails to be transmitted. To gain further intuition, consider the border between a cluster (of at least two agents) choosing action 0 and a cluster (of at least two agents) choosing action 1. As each of the two bordering agents has at least one neighbour choosing the same action as he does, neither of them will ever flip and information transmission will come to a halt.
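To make the absorption mechanism concrete, the following sketch simulates the equally informed model under strategy S1 on a finite ring; the finite ring is an assumption made only for this illustration (the paper's lattice is infinite), and $N$, $q$ and the number of updates are illustrative.

```python
import random

random.seed(1)
N, q, omega, updates = 200, 0.7, 1, 50_000   # finite ring as a stand-in for Z^1

# t = 0: every agent plays his own signal
eta = [1 if random.random() < (q if omega == 1 else 1 - q) else 0 for _ in range(N)]

for _ in range(updates):                      # random sequential revision opportunities
    x = random.randrange(N)
    left, right = eta[(x - 1) % N], eta[(x + 1) % N]
    if left == right:                         # strategy S1: conform if both neighbours agree...
        eta[x] = left
    # ...otherwise stick to the current action

wrong = sum(a != omega for a in eta)
print(f"agents permanently choosing the wrong action: {wrong}/{N}")
# Any pair of adjacent agents who start with the same (wrong) signal never flips,
# so a positive fraction of the population typically remains wrong forever.
```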
It is interesting to consider what would happen if such bordering agents did not rely so much on their private (possibly wrong) information and allowed for the possibility of changing action in any situation where the actions chosen by their neighbours were in conflict. In what follows, we build on this intuition.

3.2. Unequally Informed Agents

We now move to a different scenario, in which agents receive signals of different precisions. We study, in particular, the case in which some agents in the population are perfectly informed, while others receive an uninformative signal. In terms of our notation, this means that $q_h = 1$, $q_l = 0.5$, and that the probability that each agent receives the perfectly informative signal is denoted by $r \in (0, 1)$. We assume that at each time $t > 0$, when $x$ is to take a decision, he observes the action currently chosen by one of his two neighbours, drawn with equal probabilities from $\{x \pm 1\}$.
The main difference with respect to the model previously analyzed is that now information is not homogeneous among agents: while agents who receive a fully informative signal will always choose the correct action independently of their neighbours, agents who receive a (fully) uninformative signal will draw Bayesian inference on the basis of their observation, that now consists of the action currently chosen by a single neighbour. The next result shows that the properties of the entailed social learning process with local interactions are very different in this set-up.
Theorem 4 
If each agent is perfectly informed with probability $r \in (0, 1)$ and perfectly uninformed otherwise, and can observe a randomly drawn neighbour, then the process of social learning with local interactions is adequate, as $\Pr[\lim_{t \to \infty} \eta_t(x) = \omega \ \text{for all } x \in X] = 1$. The process converges slowly (at rate $\sqrt{t}$) to a configuration where all agents choose the correct action.
The proof is contained in Appendix B and its logic parallels that of Theorem 3. Since agents who are perfectly informed always choose the correct action, the focus is on the characterization of the behaviour of the remaining uninformed agents. For convenience, we denote agents who are perfectly informed by $\bar{x}$ and agents who are perfectly uninformed simply by $x$. Under the assumptions of this model,
$$I_t(x) = \begin{cases} \{\theta^\sigma(x), \sigma\} & \text{at } t = 0 \\ \{\eta_t(x), \eta_t(y),\ \Pr[y = x-1] = \Pr[y = x+1] = 0.5, \sigma\} & \text{at each } t = \tau_x \end{cases}$$
Consider the following strategy S2 for agent x:
At any time $\tau_x$: choose $\eta_{\tau_x}(x) = \eta_{\tau_x}(y)$.
To understand why this strategy is optimal, suppose agent x is the first to act and observes η τ 1 ( y ) = 1 . As x is uninformed, λ 0 ( x ) = 0 and
$$\lambda_{\tau_1}(x) \equiv \log\frac{\Pr[\eta(y) = 1 \mid \omega = 1]\Pr[\omega = 1]}{\Pr[\eta(y) = 1 \mid \omega = 0]\Pr[\omega = 0]} = \log\frac{r + (1-r)\frac{1}{2}}{(1-r)\frac{1}{2}} = \log\frac{1+r}{1-r} > 0$$
where, we recall, $r$ is the probability that agent $y$ is perfectly informed. By the same token, observing $\eta_{\tau_1}(y) = 0$ leads agent $x$ to revise his LLR to $\log\frac{1-r}{1+r} < 0$. Since each neighbour is equally likely to be observed, choosing $\eta_{\tau_1}(x) = \eta_{\tau_1}(y)$ is optimal for agent $x$ at his first updating opportunity, $\tau_1$. De facto, under this strategy, when the evidence available in $x$'s neighbourhood is strong (i.e., when both neighbours currently choose the same action), agent $x$ ends up agreeing with them (as he did in the previous model), whereas when the neighbours' actions provide only weak evidence on the unknown state (i.e., when they disagree), agent $x$ ends up choosing either of the two actions with equal probability.
Remark 8 shows that strategy S2 is optimal for any agent x at any stage of the revision process, as long as it is followed by all other uninformed agents. Remark 9 characterizes the limit behaviour of this social learning process and Remark 10 shows that this social learning process is adequate, in the sense that, albeit very slowly, all agents will eventually choose the correct action.
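The sketch below simulates this second model under strategy S2, again on a finite ring; the values of $N$ and $r$ and the update schedule are illustrative assumptions, and the printed fraction of wrong agents should shrink towards zero, slowly, as the theorem indicates.

```python
import random

random.seed(2)
N, r, omega = 400, 0.1, 1                      # illustrative values on a finite ring
informed = [random.random() < r for _ in range(N)]
# informed agents start (and stay) on the correct action; uninformed agents start at random
eta = [omega if informed[x] else random.randrange(2) for x in range(N)]

updates_done = 0
for block in (1, 10, 100, 1000):               # report after increasingly long runs
    for _ in range(block * N):                 # random sequential revision opportunities
        x = random.randrange(N)
        if informed[x]:
            continue                           # perfectly informed agents never switch
        y = (x + random.choice((-1, 1))) % N   # observe one randomly drawn neighbour
        eta[x] = eta[y]                        # strategy S2: copy the observed action
    updates_done += block * N
    wrong = sum(a != omega for a in eta) / N
    print(f"after {updates_done:7d} updates: fraction wrong = {wrong:.3f}")
# The wrong fraction drifts to zero, but only slowly: correct clusters spread by a
# voter-model mechanism, consistent with the sqrt(t) convergence rate.
```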
The intuition behind this result is that, in this model, those agents who are aware of being uninformed will disregard their private information and be more prone to changing actions. On the other hand, those agents who are aware of being perfectly informed do unerringly play the correct action. As a result, this modeled heterogeneity in available private information significantly improves the efficiency of the mechanism of information transmission.

4. Related Literature

The literature on social learning has been growing very fast over the last two decades. Early models of social learning (Bikhchandani et al. [1], Banerjee [2]), typically referred to as models of herding and informational cascades, show that individuals who take choices sequentially and observe the choices made by all predecessors may neglect their private information and base their decisions entirely on what is publicly observed (an informational cascade). Herds may occur because the informational content of the public history of choices may overwhelm the information contained in the agents’ private signals. Since in this case agents’ private information is not revealed through their actions, learning may come to a halt and beliefs never converge to the truth.
After this seminal work, many papers have extended the analysis in various directions and applied it to study different issues (for comprehensive surveys see, among others, Gale [3], Devenow and Welch [4], Hirshleifer and Teoh [5], Chamley [6], Bikhchandani et al. [7], Vives [8]). The papers most related to ours study social learning when agents can only observe other individuals to which they are connected in a network. This is the case in Gale and Kariv [9]. In their model, agents act simultaneously, have perfect recall, and can revise their previous decisions. Their results show that, under some conditions, despite the fact that agents cannot observe the entire population, eventually, uniformity of actions occurs. Acemoglu et al. [10] analyze a situation in which agents observe the past actions of a stochastically-generated neighbourhood of individuals. In their set up, when beliefs are unbounded, there is asymptotic learning (defined as convergence of the actions to the correct one) as long as there is some minimal amount of “expansion in observations”. For many common deterministic and stochastic networks, bounded private beliefs are, instead, incompatible with asymptotic learning, as in the canonical model of social learning. Nevertheless, the authors find conditions under which asymptotic learning obtains even with bounded private beliefs for a large class of stochastic network topologies.
Other models of social learning in networks include Bala and Goyal [11], De Marzo et al. [12], Acemoglu et al. [13] and Ellison and Fudenberg [14]. What these papers have in common is that they assume some form of bounded rationality. In Bala and Goyal [11], agents in a network choose after observing their neighbours’ actions and payoffs. It should be noticed that this is a model of social experimentation rather than social learning: agents learn by observing the outcome (payoff) of an experiment (choice of action) rather than by inferring another agent’s private information from his action. There is private information in their model, but agents are assumed to ignore it to some extent. By assumption, each agent learns from his neighbour’s actions, but does not ask what information might have led the neighbour to choose those actions. They show that, in a connected network, in the long run, everyone adopts the same action and that the action chosen can be suboptimal. De Marzo et al. [12] and Acemoglu et al. [13] also focus on networks, but learning in these models is non-Bayesian. In Ellison and Fudenberg [14], finally, agents consider the experiences of their neighbours and learn using rules of thumb. In some cases, even naive rules can lead to efficient decisions, but adjustment to an innovation can be slow.
Beyond the papers that focus on social learning in networks, our paper is also related in the motivation to studies that analyze social learning when agents can only observe a subset of other agents’ actions. Banerjee and Fudenberg [15] and Smith and Sørensen [16] study social learning when agents observe a sample of predecessors’ actions. Banerjee and Fudenberg [15] present a model in which, at every time, a continuum of agents choose a binary action after observing a sample of previous decisions (and, possibly, of signals on the outcomes). This can be interpreted as a model of word of mouth communication in large populations. The authors find sufficient conditions for herding to arise, and conditions for all agents to settle on the correct choice. Smith and Sørensen [16] study a sequential decision model in which agents can only observe unordered random samples of predecessors’ actions. They characterize different conditions on the sampling procedure and on the beliefs to have complete or incomplete learning. When the past is not over-sampled, that is, not affected forever by any one individual, and when beliefs are unbounded, complete learning eventually obtains.
The theme of imperfect observability of other agents’ actions is common to a series of other papers in the social learning literature. Çelen and Kariv [17] extend the standard model of sequential social learning by allowing each agent to observe the decision of his immediate predecessor only. The prediction of these authors is that behavior does not settle on a single action. Long periods of herding can be observed, but switches to the other action may occur. As time passes, the periods of herding become longer and longer, and the switches increasingly rare. Larson [18] analyzes a situation in which agents observe a weighted average of past actions before making a choice in a continuous action space. Similarly to our work, the focus is on the speed of learning (since the continuous action space guarantees that complete learning eventually occurs). An interesting observation of this study is that the speed of learning depends on how effectively the noise coming from early actions is purged. Guarino et al. [19] introduce a model of aggregate information cascades where only one of two possible actions is observable to others. Agents make a binary decision in a random order and agents are not aware of their own position in the sequence. When called upon, they are only informed about the total number of others who have chosen the observable action before them. The result of this study is that only one type of cascade arises in equilibrium, the cascade on the observable action. Callander and Hörner [20] present a model in which an agent observes only the total number of choices of each type (in a binary action setting), rather than the full sequence of actions. They characterize conditions under which later agents optimally imitate the minority, rather than the majority action.
Finally the issue of imperfect observability is also discussed in recent papers by Bohren [21], Eyster and Rabin [22] and Guarino and Jehiel [23] in contexts in which agents are not fully rational. The imperfect observability can actually alleviate some biases that bounded rationality produces in a classical model of learning with continuous action space similar to that of Lee [24].

5. Conclusions

In our economy, a large population of individuals have to choose one out of two available actions. Each action is optimal in one of two unknown states of the world. Agents repeatedly and reversibly choose an action, the payoff to which materializes only when the state of the world is realized. Agents derive a posterior probability on the basis of a symmetric binary signal that they receive and by observing a sample of other agents, called their neighbours. Observed choices can be informative, since signals are, and this raises an issue of informational externality. While signals are generated by a probability distribution that is exogenously given to each agent, observed choices are endogenous to the model and, given the postulated spatial structure of the process, show a potentially high degree of spatial correlation.
We have studied two social learning processes, one in which agents are homogenous in the quality of the private information they receive (as measured by the precision of their signals) and one in which the quality of the information differs (in that some agents are perfectly informed, while others are completely uninformed). We have compared the two social learning processes in terms of the probability with which they may prove to be adequate, that is reach a configuration where every agent adopts the action that is optimal given the true state of the world. As we pointed out, since beliefs are bounded by the local nature of the social interaction, complete learning is out of reach within this class of models. We have shown that the specific kind of heterogeneity embedded in the second model guarantees that, albeit very slowly, the social learning process is adequate, since it converges to a configuration where all agents adopt the correct action. This cannot be so in the first model, as the social learning process gets absorbed exponentially quickly in a configuration where some agents permanently adopt the incorrect action. The explicit characterization of the rates of convergence proves to be relevant if one wants to compare the two models in terms of informational efficiency: while in the first model we observe a quick and complete blockage of information transmission, in the second information does get disseminated, but this occurs very slowly.
We conclude with a few remarks and conjectures.
In the model of Section 3.1 social learning is not adequate. One may wonder whether learning could be adequate in the presence of some agents who are perfectly informed about the state of the world. To address this issue, suppose that $1 = q_h > q_l > 0.5$ and $0 < r < 1$, that is, suppose that all agents receive an informative signal and a positive measure of them are perfectly informed. We conjecture that Theorem 3 would carry over to this case as well. Any perfectly informed agent, say agent 0, knows the true state of the world, $\omega$, and sticks to $\eta(0) = \omega$ independently of the actions adopted in his neighbourhood. This relevant information is not necessarily transmitted to others. Since agents $\pm 1$ are themselves informed and observe both neighbours, if the other neighbour, $\pm 2$ respectively, chooses the wrong action, $1 - \omega$, they would stick to the wrong action. Also, the speed at which the social learning process converges would still be driven by the use of strategy S1 on the part of the less informed agents, and hence would still be exponentially fast. The result would hold, a fortiori, if $q_h < 1$, since then agents know that no neighbour is perfectly informed about the state of the world. This suggests that heterogeneity in the quality of information per se is not sufficient to guarantee that the social learning process is adequate.
In the model of Section 3.2 an extreme form of heterogeneity in the quality of information leads to adequate social learning. One may wonder whether the result extends to the case in which the population contains uninformed agents together with agents who receive a noisy signal: $1 > q_h > q_l = 0.5$ and $0 < r < 1$. We conjecture that this heterogeneity is not sufficient to guarantee adequate learning. Strategy S2 would still be optimal for the uninformed agents in this case. Also, for a non-empty set of values of $q_h$, a better informed agent, say 0, would play his signal independently of his neighbours. As a result, with positive probability, agents $\pm 1$ (if uninformed) will learn agent 0's action, and transmit it to agents $\pm 2$ (if uninformed), etc. The difference with Section 3.2 is that with positive probability some better informed agents receive the wrong signal and never correct their initial decision, which precludes adequate learning.
In essence, sufficient conditions that guarantee that the social learning process is adequate are the existence of some agents who know the truth and unerringly choose the correct action, together with the existence of some agents who, being uninformed, are willing to change their original choice. Perhaps contrary to intuition, in this setup of limited memory a process of slow clustering on the correct decision ensues, thus guaranteeing that the social learning process reaches an informationally efficient state. Our explicit characterization of the convergence rates shows that the speed at which this cluster grows does not depend on the proportion of perfectly informed agents in the population: our estimate is in fact unaltered for any value of the parameter $0 < r < 1$.
Finally, one may wonder whether endowing agents with the ability to convey more information to their neighbours, for example by fully revealing their posterior beliefs to each other, may enhance informational efficiency. Duffie and Manso [25] show that this is the case in a model that does not account for a spatial dimension, and they are able to characterize explicitly the convergence of the cross-sectional distribution of beliefs to a common posterior. Their results do not apply to our setting, but the question warrants future research.

Acknowledgements

The authors wish to thank the Editor and two anonymous referees for constructive comments and suggestions, and S. Camussi and C. Zedan for valuable research assistance.

References

  1. Bikhchandani, S.; Hirshleifer, D.; Welch, I. A Theory of Fads, Fashion, Custom, and Cultural Change as Informational Cascades. J. Polit. Econ. 1992, 100, 992–1026. [Google Scholar] [CrossRef]
  2. Banerjee, A.V. A Simple Model of Herd Behavior. Quart. J. Econ. 1992, 107, 797–817. [Google Scholar] [CrossRef]
  3. Gale, D. What Have We Learned from Social Learning. Eur. Econ. Rev. 1996, 40, 617–628. [Google Scholar] [CrossRef]
  4. Devenow, A.; Welch, I. Rational Herding in Financial Economics. Eur. Econ. Rev. 1996, 40, 603–615. [Google Scholar] [CrossRef]
  5. Hirshleifer, D.; Teoh, S.H. Herd Behaviour and Cascading in Capital Markets: A Review and Synthesis. Eur. Financ. Manag. 2003, 9, 25–66. [Google Scholar] [CrossRef]
  6. Chamley, C.P. Rational Herds: Economic Models of Social Learning; Cambridge University Press: Cambridge, UK, 2004. [Google Scholar]
  7. Bikhchandani, S.; Hirshleifer, D.; Welch, I. Informational Cascades and Observational Learning. The New Palgrave Dictionary of Economics. Available online: www.dictionaryofeconomics.com/article?id=pde2008000103 (accessed on 10 September 2009).
  8. Vives, X. Information and Learning in Markets; Princeton University Press: Princeton, NJ, USA, 2008. [Google Scholar]
  9. Gale, D.; Kariv, S. Bayesian Learning in Social Networks. Game. Econ. Behav. 2003, 45, 329–346. [Google Scholar] [CrossRef]
  10. Acemoglu, D.; Dahleh, M.A.; Lobel, I.; Ozdaglar, A. Bayesian Learning in Social Networks; MIT: Cambridge, MA, USA, 2010; mimeo. [Google Scholar]
  11. Bala, V.; Goyal, S. Learning from Neighbours. Rev. Econ. Stud. 1998, 65, 595–621. [Google Scholar] [CrossRef]
  12. De Marzo, P.; Vayanos, D.; Zwiebel, J. Persuasion Bias, Social Influence, and Unidimensional Opinions. Quart. J. Econ. 2003, 118, 909–968. [Google Scholar] [CrossRef]
  13. Acemoglu, D.; Ozdaglar, A.; ParandehGheibi, A. Spread of (Mis)Information in Social Networks; MIT: Cambridge, MA, USA, 2009; WP 09-15. [Google Scholar]
  14. Ellison, G.; Fudenberg, D. Rules of Thumb for Social Learning. J. Polit. Econ. 1993, 101, 612–643. [Google Scholar] [CrossRef]
  15. Banerjee, A.; Fudenberg, D. Word of Mouth Learning. Game. Econ. Behav. 2004, 46, 1–22. [Google Scholar] [CrossRef]
  16. Smith, L.; Sørensen, P. Rational Social Learning with Random Sampling; University of Michigan: Ann Arbor, MI, USA, 2008; mimeo. [Google Scholar]
  17. Çelen, B.; Kariv, S. Observational Learning under Imperfect Information. Game. Econ. Behav. 2004, 47, 72–86. [Google Scholar] [CrossRef]
  18. Larson, N. In with the New Or Out with the Old? Bottlenecks in Social Learning; University of Virginia: Charlottesville, VA, USA, 2008; mimeo. [Google Scholar]
  19. Guarino, A.; Harmgart, H.; Huck, S. Aggregate Information Cascades; University College London: London, UK, 2008; mimeo. [Google Scholar]
  20. Callander, S.; Hörner, J. The Wisdom of the Minority. J. Econ. Theor. 2009, 144, 1421–1439. [Google Scholar] [CrossRef]
  21. Bohren, A. Information-Processing Uncertainty in Social Learning; University of San Diego: San Diego, CA, USA, 2009; mimeo. [Google Scholar]
  22. Eyster, E.; Rabin, M. Naive Herding; Department of Economics, London School of Economics: London, UK, 2008. [Google Scholar]
  23. Guarino, A.; Jehiel, P. Social Learning with Coarse Inference; ESRC Centre for Economic Learning and Social Evolution: London, UK, 2009; ELSE Working Papers 337. [Google Scholar]
  24. Lee, I.H. On the Convergence of Informational Cascades. J. Econ. Theor. 1993, 61, 395–411. [Google Scholar] [CrossRef]
  25. Duffie, D.; Manso, G. Information Percolation in Large Markets. Amer. Econ. Rev. Paper. Proc. 2007, 97, 203–209. [Google Scholar] [CrossRef]
  26. Durrett, R.; Steif, J.E. Fixation Results for Threshold Voter Systems. Ann. Probab. 1993, 21, 232–247. [Google Scholar] [CrossRef]
  27. Liggett, T.M. Interacting Particle Systems; Springer-Verlag: Berlin, Germany, 1985. [Google Scholar]
  28. Bramson, M.; Griffeath, D. Clustering and Dispersion Rates for Some Interacting Particle Systems on Z1. Ann. Probab. 1980, 8, 183–213. [Google Scholar] [CrossRef]

Appendix A

Proof of Theorem 3. 
The proof of Theorem 3 is split into a few Remarks: Remark 5 shows that the model admits an equilibrium; Remark 6 characterizes the limit behaviour and convergence rates of this process of social learning with local interactions; finally, Remark 7 evaluates the degree of informational efficiency of the process.
Remark 5 
Suppose all agents $y \neq x$ choose stationary strategy S1. Then this strategy is also optimal for $x$ at any time $\tau_x^l$.
Proof. 
First, suppose that $\tau_x^1 = \tau_1$. Then, the statement is true, since $\eta_0(y) = \theta(y)$ for all $y$. Suppose now that $\tau_x^1 > \tau_1$. Let us describe the process of inference undertaken by agent $x$ in such a case (i.e., if he knew that at least one other agent had received an updating opportunity before). We drop the time subscript for notational convenience. Due to the symmetry of the model, WLOG we consider $\theta(x) = 0$.
Let us consider first the case in which η ( x - 1 ) = 1 . Agent x needs to infer θ ( x - 1 ) on the basis of I ( x ) = { θ ( x ) = 0 = η ( x ) , η ( x - 1 ) = 1 , η ( x + 1 ) } . By Bayesian updating,
$$\Pr[\theta(x-1) = 1 \mid I(x)] = \frac{\Pr[\eta(x-1) = 1 \mid \theta(x-1) = 1, \theta(x) = 0, \eta(x+1)]\,\Pr[\theta(x-1) = 1 \mid \theta(x) = 0, \eta(x+1)]}{\sum_{j \in \{0,1\}} \Pr[\eta(x-1) = 1 \mid \theta(x-1) = j, \theta(x) = 0, \eta(x+1)]\,\Pr[\theta(x-1) = j \mid \theta(x) = 0, \eta(x+1)]}$$
If $\tau_{x-1}^1 > \tau_x^1$, this probability is one, as $x-1$ is playing his signal, by construction. If $\tau_{x-1}^1 < \tau_x^1$ and agent $x-1$ has followed strategy S1, this probability is also equal to one, since $\Pr[\eta(x-1) = 1 \mid \theta(x-1) = 0, \theta(x) = 0, \eta(x+1)] = 0$. Hence an agent who observes a neighbour choosing an action different from the action he himself is choosing infers that the neighbour is playing his signal:
$$\Pr[\theta(x-1) = 1 \mid I(x)] = 1$$
Let us now consider the case of $\eta(x-1) = 0$. Agent $x$ needs to infer $\theta(x-1)$ on the basis of $I(x) = \{\theta(x) = 0 = \eta(x), \eta(x-1) = 0, \eta(x+1)\}$. If $\tau_{x-1}^1 > \tau_x^1$, clearly $\theta(x-1) = \eta(x-1) = 0$. If $\tau_{x-1}^1 < \tau_x^1$ and agent $x-1$ has followed strategy S1, then by a logic analogous to that followed in the previous paragraph,
$$\Pr[\eta(x-1) = 0 \mid \theta(x-1) = 1, \theta(x) = 0, \eta(x+1)] = \Pr[\eta(x-2) = 0 \mid I(x)]$$
and
$$\Pr[\eta(x-1) = 0 \mid \theta(x-1) = 0, \theta(x) = 0, \eta(x+1)] = 1$$
Let $\Pr[\eta(x-2) = 0 \mid I(x)] \equiv 1 - \alpha$ and $\Pr[\theta(x-1) = 1 \mid \theta(x) = 0, \eta(x+1)] \equiv 1 - \beta$. Then,
$$\Pr[\theta(x-1) = 1 \mid I(x), \tau_{x-1}^1 < \tau_x^1] = \frac{(1-\alpha)(1-\beta)}{(1-\alpha)(1-\beta) + \beta}$$
Note that β is the belief that agent x has on the signal of x - 1 being equal to zero. As such, β depends on the value of η ( x + 1 ) , i.e., either
$$\Pr[\theta(x-1) = 1 \mid \theta(x) = 0, \eta(x+1) = 1] = 0.5$$
or
$$\Pr[\theta(x-1) = 1 \mid \theta(x) = 0, \eta(x+1) = 0] < \Pr[\theta(x-1) = 1 \mid \theta(x) = 0] < 0.5$$
Hence $\beta \geq 0.5$ and, for all $0 \leq \alpha \leq 1$,
$$\Pr[\theta(x-1) = 1 \mid I(x), \tau_{x-1}^1 < \tau_x^1] \leq \frac{(1-\alpha)}{(1-\alpha) + \frac{\beta}{1-\beta}} \equiv \gamma < \frac{1}{2}$$
As a result,
$$\Pr[\theta(x-1) = 1 \mid I(x)] \leq \Pr[\tau_{x-1}^1 < \tau_x^1]\,\gamma < \frac{1}{2}$$
Hence an agent who observes a neighbour choosing the same action as he himself is choosing, infers that the neighbour is more likely to be playing his signal (than to have used an updating opportunity).
As a result of the above considerations, for $y \in \{x \pm 1\}$ the conditional expectations of $\theta(y)$ are
$$E[\theta(y) \mid \theta(x) = 0, \eta(y) = 1] = 1, \qquad E[\theta(y) \mid \theta(x) = 0, \eta(y) = 0] < \frac{1}{2}$$
We now proceed to show that, given these conditional expectations, the strategy is optimal for x at time τ x 1 for any possible I ( x ) = { θ ( x ) = 0 = η ( x ) , η ( x ± 1 ) } .
Let us first prove the "if" part. Suppose that I ( x ) = { θ ( x ) = 0 , η ( x - 1 ) = η ( x + 1 ) = 1 } . By the above considerations θ ( x - 1 ) = θ ( x + 1 ) = 1 . Hence:
$$\lambda_1(x) \equiv 2\log\frac{q}{1-q}\left(E[\theta(x-1) \mid I(x)] + \theta(x) + E[\theta(x+1) \mid I(x)] - \frac{3}{2}\right) = 2\log\frac{q}{1-q}\left(2 - \frac{3}{2}\right) = \log\frac{q}{1-q} > 0$$
Now let us prove the "only if" part. We have to consider different cases.
Case a): Suppose that $I(x) = \{\theta(x) = 0, \eta(x-1) = \eta(x+1) = 0\}$. By the above considerations $E[\theta(x \pm 1) \mid I(x)] < 0.5$. Hence:
$$\lambda_1(x) \equiv 2\log\frac{q}{1-q}\left(E[\theta(x-1) \mid I(x)] + \theta(x) + E[\theta(x+1) \mid I(x)] - \frac{3}{2}\right) < 2\log\frac{q}{1-q}\left(1 - \frac{3}{2}\right) = -\log\frac{q}{1-q} < 0$$
since q > 0 . 5 , as assumed.
Case b): Suppose that $I(x) = \{\theta(x) = 0, \eta(x-1) = 0, \eta(x+1) = 1\}$ (or vice versa). By the above considerations $E[\theta(x-1) \mid I(x)] < 0.5$ and $E[\theta(x+1) \mid I(x)] = 1$. Hence:
$$\lambda_1(x) \equiv 2\log\frac{q}{1-q}\left(E[\theta(x-1) \mid I(x)] + \theta(x) + E[\theta(x+1) \mid I(x)] - \frac{3}{2}\right) < 2\log\frac{q}{1-q}\left(\frac{1}{2} + 1 - \frac{3}{2}\right) = 0$$
since q > 0 . 5 , as assumed.
This concludes the proof that, under the stated assumptions, this strategy is optimal at time τ x 1 = τ i for i = 1 , 2 , . . . (i.e., at the first updating opportunity that agent x gets, independently of when this opportunity arises). We now show that the statement holds at any time τ x l for l = 2 , 3 . . .
Consider $\tau_x^2 > \tau_x^1$ and let $I_2(x) = \{\theta(x) = 0, \eta_{\tau_x^2}(x \pm 1)\}$ denote $x$'s information set at time $\tau_x^2$. If $\eta_{\tau_x^1}(x) = \theta(x)$, clearly the previous part of the proof holds in this case as well. Suppose instead that $\eta_{\tau_x^1}(x) \neq \theta(x)$. In this case $x$ has flipped to $\eta(x) = 1$ at time $\tau_x^1$, because $\eta_{\tau_x^1}(x \pm 1) = 1$. By the reasoning above, this means that $x$ could perfectly infer that $\theta(x \pm 1) = 1$. Hence, strategy S1 is optimal at time $\tau_x^2$ as well. In other words, at the second updating opportunity and for any $x$:
$$\text{either } \eta_{\tau_x^1}(x) = 1 \text{ and } \lambda_1(x) = \log\frac{q}{1-q}, \quad \text{or } \eta_{\tau_x^1}(x) = 0 \text{ and } \lambda_1(x) = -\log\frac{q}{1-q}$$
An entirely analogous reasoning shows that the strategy is also optimal at any time $\tau_x^l$ for $l > 2$, once one notices that between any $\tau_x^l$ and $\tau_x^{l+1}$ the number of agents $x \pm 1$ in $x$'s neighbourhood such that $\eta(x \pm 1) = \eta(x)$ cannot decrease. To see this, consider $\tau_x^3 > \tau_x^2$ and let $I_3(x) = \{\theta(x) = 0, \eta_{\tau_x^3}(x \pm 1)\}$. If $\eta_{\tau_x^2}(x) = \theta(x) = 0$, it must be that also $\eta_{\tau_x^1}(x) = \theta(x) = 0$ (since flipping at $\tau_x^1$ would have required $\eta_{\tau_x^1}(x \pm 1) = 1$, after which neither neighbour could flip back), and the first part of the proof holds. If $\eta_{\tau_x^2}(x) = 1 \neq \theta(x) = 0$, then $x$ must have flipped either at time $\tau_x^1$ or at time $\tau_x^2$, and again we know from the previous part of the proof that strategy S1 was optimal in those cases. As a corollary, the above reasoning shows that, within this model, each agent can flip at most once.
Remark 6 
If agents use strategy S1, the characterization of the limit behaviour of the social learning process is as follows5.
Let $\{\hat{\eta}\}$ be the set of configurations such that, for each $x$ in $X$, there is at least one $y \in N(x) = \{x \pm 1\}$ such that $\eta(y) = \eta(x)$. Then, starting from any given initial distribution, $\mu_\omega$, the process converges in probability to a configuration $\eta \in \{\hat{\eta}\}$:
$$P_{\mu_\omega}\left[\lim_{t \to \infty} \eta_t = \eta\right] = 1$$
Convergence obtains exponentially fast:
$$P_{\mu_\omega}[\eta_t \neq \eta] \approx \exp[-t]$$
Proof. 
We shall find it convenient to model transitions in terms of flip rates, i.e., the rates at which $\eta_t(x)$ flips to $1 - \eta_t(x)$. By flip rate $c$ we mean that the probability that the transition occurs in an infinitesimal time $dt$ is $c\,dt$. We shall denote flip rates by $c(x, \eta_t)$ to emphasize their dependence on the current state of actions chosen in the population, and assume that $\Pr[\eta_t(x) \to 1 - \eta_t(x) \text{ in } (t, t + \Delta t)] = c(x, \eta_t)\,\Delta t + o(\Delta t)$.
By Remark 5, the flip rates for this process are:
$$c(x, \eta) = \begin{cases} 1 & \text{if } \eta(y) \neq \eta(x) \text{ for all } y \in \{x \pm 1\} \\ 0 & \text{otherwise} \end{cases}$$
and the characterization of η ^ follows by simple inspection of these.
To show that the process of actions converges, let $\delta_{x,y}(t) = 1$ if $\eta_t(x) \neq \eta_t(y)$, and $0$ otherwise. Recall that agents live on a one-dimensional lattice $X = \mathbb{Z}^1 = \{\ldots, -2, -1, 0, +1, +2, \ldots\}$. We follow the lines of Durrett and Steif [26] and define the following function:
$$\Upsilon_t = \sum_{x \in X} \sum_{y \in N(x)} \exp[-|x + y|]\,\delta_{x,y}(t)$$
Note that, by construction, $0 \leq \Upsilon_t < \bar{\Upsilon} < \infty$. We shall show that, starting from $\Upsilon_0$, at any time at which any $x$ flips from $\eta(x)$ to $1 - \eta(x)$, this function decreases by a strictly positive amount. To this aim, let $\Upsilon_t(x)$ be:
$$\Upsilon_t(x) = \sum_{y \in N(x)} \exp[-|x + y|]\,\delta_{x,y}(t)$$
and for simplicity take $x = 0$, with neighbours $y \in \{-1, +1\}$:
$$\Upsilon_t(0) = \sum_{y \in \{-1, +1\}} \exp[-|y|]\,\delta_{0,y}(t)$$
Note that agent $x = 0$ will flip if and only if $\sum_y \delta_{0,y}(t) = 2$, and, by the construction of the model, this can happen with positive probability. Suppose that this happens. Then the drop in $\Upsilon$ at site 0, after the flip has occurred, is equal to $2\exp[-1]$, which is strictly positive. As the same argument applies to any generic site, this implies that the function $\Upsilon_t$ is strictly decreasing at any time at which an agent flips action.
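As a numerical illustration (a finite truncation of the lattice, which is an assumption made only for this check), the hypothetical helper below recomputes a finite-window version of $\Upsilon_t$ after every flip under strategy S1 and records whether any flip ever increased it.

```python
import math
import random

random.seed(3)
N, q, omega = 60, 0.7, 1                       # finite window of the lattice, illustrative values
eta = [1 if random.random() < (q if omega == 1 else 1 - q) else 0 for _ in range(N)]

def upsilon(cfg):
    # finite-window analogue of the functional: weighted count of disagreeing neighbouring pairs
    return sum(math.exp(-abs(x + y))
               for x in range(N) for y in (x - 1, x + 1)
               if 0 <= y < N and cfg[x] != cfg[y])

prev, increased = upsilon(eta), False
for _ in range(20_000):
    x = random.randrange(1, N - 1)
    # under S1 an interior agent flips exactly when both neighbours take the opposite action
    if eta[x - 1] == eta[x + 1] != eta[x]:
        eta[x] = eta[x - 1]
        cur = upsilon(eta)
        increased |= cur > prev
        prev = cur
print("did any flip increase Upsilon?", increased)   # expected: False
```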
Let $\hat{\Upsilon} \equiv \hat{\Upsilon}(\hat{\eta})$ be the value of this function at any stable configuration $\hat{\eta}$ such that $\lim_{t \to \infty} \eta_t = \hat{\eta}$, and consider $\hat{\Upsilon}_t \equiv \Upsilon_t - \hat{\Upsilon}$ along the realizations of the process leading to $\hat{\eta}$. To show that convergence obtains exponentially fast, we will show that there exist $k > 0$ and $\varepsilon > 0$ such that:
$$P_{\Upsilon_0}[\hat{\Upsilon}_t > 0] \leq k\,\Upsilon_0 \exp[-\varepsilon t]$$
To this aim, we need to make the transition from convergence along integer times (as in $\hat{\Upsilon}_{nt}$) to convergence in real time (for $\eta_t$). Let $\hat{\Upsilon}_{(n-1)t, nt}$ denote the change accumulated over the interval $[(n-1)t, nt]$ for $n \geq 1$, so that $\hat{\Upsilon}_{nt} = \sum_{k=1}^{n} \hat{\Upsilon}_{(k-1)t, kt}$. Since, by construction, $\hat{\Upsilon}$ is finite, $E[\hat{\Upsilon}_t]$ is also finite. Since $\hat{\Upsilon} \leq \bar{\Upsilon} < \infty$, the normalized quantity $\hat{\hat{\Upsilon}} \equiv (\bar{\Upsilon})^{-1}\hat{\Upsilon} \leq 1$. Let
$$\tilde{\Upsilon} \equiv E\left[\exp[\xi\,\hat{\hat{\Upsilon}}_{nt}]\right] < 1$$
which holds for a small positive $\xi$. This implies that:
$$\Pr[\hat{\Upsilon}_{nt} > 0] \leq \tilde{\Upsilon}^{\,n}$$
and since $P_{\Upsilon_0}[\hat{\Upsilon}_s > 0]$ is monotonic in $s$, the assertion is proved for $k \equiv \tilde{\Upsilon}^{-1}$ and $\varepsilon \equiv t^{-1}\log\tilde{\Upsilon}^{-1}$.
Remark 7 
If agents use strategy S1, the process of social learning with local interactions is not adequate and
$$\Pr[\lim_{t \to \infty} \eta_t(x) = \omega \ \text{for all } x \in X] = 0$$
Proof. 
Recall that the initial choice of actions is produced by a product measure: $\Pr_{\mu_{\omega=1}}[\eta : \eta(x) = 1] = q$ or $\Pr_{\mu_{\omega=0}}[\eta : \eta(x) = 1] = 1 - q$, respectively. Since $0.5 < q < 1$, the probability that any two adjacent agents receive the same signal is strictly positive. Once this happens, as the above Remark shows, these agents will never flip. Hence, this process fails to satisfy Definition (2) and learning is not adequate.
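For a concrete sense of the magnitudes involved (the value $q = 0.7$ below is purely illustrative), the probability that a given adjacent pair both receive the wrong signal, and hence remains stuck on the wrong action forever, is
$$\Pr[\theta(x) = \theta(x+1) = 1 - \omega] = (1 - q)^2 = 0.09,$$
and since the lattice contains infinitely many disjoint adjacent pairs, such a blocked pair occurs somewhere with probability one.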

Appendix B

Proof of Theorem 4. 
The proof of Theorem 4 is split into three Remarks: Remark 8 shows that an equilibrium exists, Remark 9 characterizes the limit behaviour and Remark 10 evaluates the degree of informational efficiency of this model. The logic of the proof parallels that of Theorem 3.
Remark 8 
Suppose all agents $y \neq x$ choose strategy S2. Then this strategy is also optimal for $x$ at any $\tau_x^l$.
Proof. 
We follow exactly the same logic as in Remark 5 and describe the process of inference undertaken by agent $x$ at time $\tau_x^l$ (we drop the time subscript for notational convenience). Notice that, within this model, agent $x$ cannot draw any inference from his own signal. Also, agent $x$ cares about his neighbours' signals only insofar as they are informed.
Since the precision of $x$'s signal is 0.5 by construction, $\lambda_0(x) = 0$.
Suppose $\tau_x^1 = \tau_1$ (i.e., agent $x$ is the first to receive an updating opportunity) and $\eta(x) = 1$. Agent $x$ needs to compute $\lambda_1(x)$ on the basis of $I(x) = \{\eta(x), \eta_{\tau_x^1}(y)\}$, for $y \in \{x \pm 1\}$, resulting in:
$$\lambda_1(x) \equiv \log\frac{\Pr[\eta(y) \mid \omega = 1]\Pr[\omega = 1]}{\Pr[\eta(y) \mid \omega = 0]\Pr[\omega = 0]} = \begin{cases} \log\dfrac{r + (1-r)\frac{1}{2}}{(1-r)\frac{1}{2}} = \log\dfrac{1+r}{1-r} > 0 & \text{if } \eta(y) = \eta(x) = 1 \\[2ex] \log\dfrac{(1-r)\frac{1}{2}}{r + (1-r)\frac{1}{2}} = \log\dfrac{1-r}{1+r} < 0 & \text{if } \eta(y) \neq \eta(x) = 1 \end{cases}$$
Since Pr [ y = x - 1 ] = Pr [ y = x + 1 ] = 0 . 5 , this shows that S2 is optimal at time τ x 1 = τ 1 .
Consider $\tau_x > \tau_1$ and let $I_{\tau_x}(x) = \{\eta_{\tau_x}(x), \eta_{\tau_x}(y)\}$ denote $x$'s information set at time $\tau_x$. When drawing inference, agent $x$ now has to consider the possibility that agent $y$ may have received an updating opportunity and may have used strategy S2. As $x$ is uninformed, $\lambda(x) = 0$. Let $s \equiv \Pr[\tau_y < \tau_x]$, i.e., the probability that agent $y$ has received an updating opportunity before agent $x$.
Suppose y = x - 1 , η ( x - 1 ) = 1 and η ( x ) = 1 . Then:
$$\Pr[\eta(x-1) = 1 \mid \omega = 1] = r + (1-r)\left\{(1-s)\tfrac{1}{2} + s\left[\tfrac{1}{2}\eta(x-2) + \tfrac{1}{2}\right]\right\}$$
$$\Pr[\eta(x-1) = 1 \mid \omega = 0] = (1-r)\left\{(1-s)\tfrac{1}{2} + s\left[\tfrac{1}{2}\eta(x-2) + \tfrac{1}{2}\right]\right\}$$
where the term in square brackets refers to the possibility that $x-1$ might have chosen $\eta(x-1) = 1$ as a result of S2 (and hence observed either $\eta(x-2) = 1$ or $\eta(x) = 1$). Notice that, by construction, $\eta(x-2) \in \{0, 1\}$ is given at time $\tau_x$.
Suppose y = x + 1 , η ( x + 1 ) = 0 and η ( x ) = 1 .
Then:
$$\Pr[\eta(x+1) = 0 \mid \omega = 1] = (1-r)\left\{(1-s)\tfrac{1}{2} + s\left[\tfrac{1}{2}(1 - \eta(x+2))\right]\right\}$$
$$\Pr[\eta(x+1) = 0 \mid \omega = 0] = r + (1-r)\left\{(1-s)\tfrac{1}{2} + s\left[\tfrac{1}{2}(1 - \eta(x+2))\right]\right\}$$
where the term in square brackets refers to the possibility that $x+1$ might have chosen $\eta(x+1) = 0$ as a result of S2 (and hence observed either $\eta(x+2) = 0$ or $\eta(x) = 1$). Notice that, by construction, $\eta(x+2) \in \{0, 1\}$ is given at time $\tau_x$.
As a result,
$$\lambda(x) \equiv \log\frac{\Pr[\eta(y) \mid \omega = 1]\Pr[\omega = 1]}{\Pr[\eta(y) \mid \omega = 0]\Pr[\omega = 0]} = \begin{cases} \log\dfrac{r + (1-r)\left\{(1-s)\frac{1}{2} + s\left[\frac{1}{2}\eta(x-2) + \frac{1}{2}\right]\right\}}{(1-r)\left\{(1-s)\frac{1}{2} + s\left[\frac{1}{2}\eta(x-2) + \frac{1}{2}\right]\right\}} > 0 & \text{if } y = x-1,\ \eta(y) = 1 = \eta(x) \\[3ex] \log\dfrac{(1-r)\left\{(1-s)\frac{1}{2} + s\left[\frac{1}{2}(1-\eta(x+2))\right]\right\}}{r + (1-r)\left\{(1-s)\frac{1}{2} + s\left[\frac{1}{2}(1-\eta(x+2))\right]\right\}} < 0 & \text{if } y = x+1,\ \eta(y) = 0 \neq \eta(x) \end{cases}$$
Since $\Pr[y = x-1] = \Pr[y = x+1] = 0.5$, this shows that S2 is optimal at any time $\tau_x$.
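For concreteness, here is a hedged sketch (our own helper names, uniform prior assumed, case $\eta(x)=1$) of the two cases of $\lambda(x)$ displayed above; it confirms that the sign, and hence the optimality of S2, is preserved for $s \in [0,1)$.

```python
# Hedged sketch (our own): evaluates lambda(x) at a later revision opportunity,
# where s is the probability that the observed neighbour has already had an
# updating opportunity, for the two cases displayed above (eta(x) = 1).
from math import log

def lambda_later(case: str, r: float, s: float, eta_second: int) -> float:
    """case 'agree':    y = x-1, eta(y) = 1 = eta(x), eta_second = eta(x-2)
       case 'disagree': y = x+1, eta(y) = 0 != eta(x), eta_second = eta(x+2)"""
    if case == "agree":
        a = (1 - s) * 0.5 + s * (eta_second * 0.5 + 0.5)
        return log((r + (1 - r) * a) / ((1 - r) * a))
    b = (1 - s) * 0.5 + s * 0.5 * (1 - eta_second)
    return log((1 - r) * b / (r + (1 - r) * b))

# The sign is what drives the choice under S2:
print(lambda_later("agree", r=0.4, s=0.3, eta_second=1) > 0)     # True
print(lambda_later("disagree", r=0.4, s=0.3, eta_second=0) < 0)  # True
```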
Remark 9 
If agents use strategy S2, the characterization of the limit behaviour of the social learning process is as follows.
Let $\eta^\omega$ be the configuration where $\eta(x) = \omega$ for all $x \in X$. Then, starting from any given initial condition, $\mu_\omega$, the process converges in probability to configuration $\eta^\omega$:
$$
P_{\mu_\omega}\left[\lim_{t \to \infty} \eta_t = \eta^\omega\right] = 1
$$
Convergence obtains slowly, namely at rate $\sqrt{t}$:
$$
P_{\mu_\omega}\left[\eta_t \neq \eta^\omega\right] \sim \frac{1}{\sqrt{t}}
$$
Proof. 
Let us denote the population of agents as $X \cup \bar{X}$, where $x \in X$ are the informed agents and $\bar{x} \in \bar{X}$ are the uninformed agents. By construction, the flip rates for this process, $\{c(x,\eta), c(\bar{x},\eta)\}$, are:
$$
c(x,\eta) = 0, \qquad
c(\bar{x},\eta) =
\begin{cases}
\frac{1}{2}\sum_{y \in \{\bar{x} \pm 1\}} \eta(y) & \text{if } \eta(\bar{x}) = 0\\[1ex]
\frac{1}{2}\sum_{y \in \{\bar{x} \pm 1\}} \bigl(1 - \eta(y)\bigr) & \text{if } \eta(\bar{x}) = 1
\end{cases}
$$
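The flip rates above translate directly into code; the following sketch (our own data structures, purely illustrative) returns the rate at which an agent revises its current action:

```python
# Sketch (our own illustration) of the flip rates just displayed: informed
# agents never flip; uninformed agents flip at rate 1/2 times the number of
# neighbours currently choosing the opposite action.
from typing import Dict

def flip_rate(x: int, eta: Dict[int, int], informed: Dict[int, bool]) -> float:
    if informed[x]:
        return 0.0
    disagreements = sum(1 for y in (x - 1, x + 1) if eta[y] != eta[x])
    return 0.5 * disagreements
```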
By simple inspection, it is clear that only the state for which $\eta(x) = \omega$ at every site is stationary for this process. However, since the process $\eta$ defines a continuous-time Markov chain on the state space $S = \mathbb{Z}^1$, which is countable but infinite, we need to prove that the process is ergodic, that is, that starting from any initial distribution $\mu_\omega$ the process will converge to $\eta^\omega$ with probability one (first part of the claim).
We proceed as follows. Let $S_N$ be finite sets that increase to $S$, such that $\lim_{N \to \infty} S_N = S$. Define the following flip rates:
$$
c_i^N =
\begin{cases}
\{c(x,\eta), c(\bar{x},\eta)\} & \text{if } x, \bar{x} \in S_N\\
0 & \text{if } x, \bar{x} \notin S_N \text{ and } \eta(x) = i\\
1 & \text{if } x, \bar{x} \notin S_N \text{ and } \eta(x) \neq i
\end{cases}
$$
Let us call the process defined by these flip rates $S^{i,N}(t)$. Notice that this process is equal to the original process for all $x, \bar{x}$ in $S_N$, and is characterized by all coordinates set equal to $i$ for $x, \bar{x}$ not in $S_N$.
Let $\mu_0 S^{0,N}(t)$ be the law of the process characterized by flip rates $c_0^N$ when the initial distribution is given by all $0$'s at time $0$, and let $\mu_1 S^{1,N}(t)$ be the law of the process characterized by flip rates $c_1^N$ when the initial distribution is given by all $1$'s at time $0$. As the original process is attractive7, so are the processes defined by the $c_i^N$ and, by Theorem 2.7 in Liggett [26]:
$$
\mu_0 S^{0,N}(t) \;\leq\; \mu_\theta S(t) \;\leq\; \mu_1 S^{1,N}(t)
$$
for $\theta \in (0,1)$, and
$$
\lim_{N \to \infty} \lim_{t \to \infty} \mu_0 S^{0,N}(t) = \lim_{t \to \infty} \mu_0 S(t), \qquad
\lim_{N \to \infty} \lim_{t \to \infty} \mu_1 S^{1,N}(t) = \lim_{t \to \infty} \mu_1 S(t)
$$
WLOG suppose $\omega = 1$. Then $\lim_{t \to \infty} \mu_0 S^{0,N}(t) = \lim_{t \to \infty} \mu_1 S^{1,N}(t) = \mu^{1,N}$, that is, as $t \to \infty$, independently of the initial distribution, the process restricted to $S_N$ converges to the configuration of all ones. In fact, $S^{i,N}(t)$ is a finite Markov chain over $S_N$ and, as there is a unique absorbing state ($\eta_N^1 \equiv \{\eta(x) = 1 \text{ for all } x \in S_N\}$), we know that the unique ergodic distribution puts point mass one on this state. As $\lim_{N \to \infty} S_N = S$, it follows that
$$
\lim_{N \to \infty} \lim_{t \to \infty} \mu_0 S^{0,N}(t) = \lim_{N \to \infty} \lim_{t \to \infty} \mu_1 S^{1,N}(t) = \lim_{N \to \infty} \mu^{1,N} = \mu^1
$$
and the first part of the claim follows.
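The absorbing-state step of the argument can be checked by brute force on a small truncation; the toy below (our own sketch, with every interior site uninformed and the coordinates outside $S_N$ frozen at $i = 1$) enumerates all configurations and finds that only the all-ones configuration has zero total flip rate:

```python
# Tiny enumeration sketch (our own toy version of the truncated chain): with
# the sites outside S_N frozen at 1, the unique absorbing state of the finite
# chain is the all-ones configuration on S_N.
from itertools import product

def total_flip_rate(config, boundary: int = 1) -> float:
    padded = (boundary, *config, boundary)          # frozen coordinates outside S_N
    rate = 0.0
    for x in range(1, len(padded) - 1):             # interior sites, all uninformed here
        rate += 0.5 * sum(1 for y in (x - 1, x + 1) if padded[y] != padded[x])
    return rate

absorbing = [c for c in product((0, 1), repeat=5) if total_flip_rate(c) == 0.0]
print(absorbing)  # [(1, 1, 1, 1, 1)]
```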
To prove the second part of the statement, we need to compute the rate of convergence for this process. Notice that the rate at which social learning takes place is given by the speed with which the uninformed agents flip. Hence we need to study the dynamics of the choices of the individuals in $\bar{X}$. We notice that these dynamics are analogous to those of the Voter's model (Liggett [27], Sections 1 and 3 of Chapter V, or Bramson and Griffeath [28]), well studied in the statistical literature. In the Voter's model, a voter at $x \in \mathbb{Z}^d$ changes his opinion at an exponential rate (with mean one) proportional to the number of his $2d$ nearest neighbours holding the opposite opinion. If all $2d$ neighbours disagree with the person at $x$, the flip rate is 1. It can be seen from equation (3) that this is exactly the dynamics of the uninformed agents in our model. Hence, although the asymptotics of our model are substantially different from those of the Voter's model, the dynamics are exactly the same.
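To make the analogy concrete, here is a hedged simulation sketch (our own finite-ring approximation of the infinite system; all names and parameters are illustrative): informed agents stay at the true action, while uninformed agents copy a randomly observed neighbour, exactly as in a Voter's model with some sites frozen.

```python
# Hedged simulation sketch (our own finite-ring approximation): uninformed
# agents follow Voter-model dynamics, informed agents are frozen at omega.
import random

def simulate(n: int = 200, r: float = 0.1, omega: int = 1,
             t_max: float = 5000.0, seed: int = 1):
    rng = random.Random(seed)
    informed = [rng.random() < r for _ in range(n)]
    eta = [omega if informed[x] else rng.randint(0, 1) for x in range(n)]
    t = 0.0
    while t < t_max:
        t += rng.expovariate(n)                  # revision opportunities arrive at rate 1 per agent
        x = rng.randrange(n)
        if informed[x]:
            continue                             # informed agents never flip
        y = (x - 1) % n if rng.random() < 0.5 else (x + 1) % n
        eta[x] = eta[y]                          # S2: follow the observed neighbour
        if all(a == omega for a in eta):
            return t                             # first time everyone is correct
    return None                                  # not yet correct by t_max

print(simulate())
```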
To show that learning occurs at rate $\sqrt{t}$, we proceed as follows. As the process is defined in the two dimensions of time and space, we shall find it useful to relate these two dimensions in a space-time analysis. In particular, we characterize a clustering process, by relying on the local specification of the model. With the term “cluster” we mean the length of a segment over which all connected individuals choose the same action. In order to see how the size of a cluster increases with time, we shall later express the length of a cluster as a function of $t$. Formally, given a configuration $\eta$, we define a cluster as a connected component of $\{x : \eta(x) = 0\}$ or of $\{x : \eta(x) = 1\}$; the size of a cluster of ones in the segment $[-l,l]$ around the origin as:
$$
\eta_l = \left|\{x : \eta(x) = 1;\ x \in [-l, l]\}\right|
$$
and the mean cluster size of η around the origin as:
$$
C(\eta) = \lim_{l \to \infty} \frac{2l}{\text{number of clusters of } \eta \text{ in } [-l,l]}
$$
whenever this limit exists.
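A finite-window version of these definitions is straightforward to compute; the helper below (our own, purely illustrative) counts maximal runs of identical actions and returns the empirical mean cluster size:

```python
# Sketch (our own helper): empirical mean cluster size of a configuration on a
# finite segment, i.e. (segment length) / (number of maximal runs).
from typing import List

def mean_cluster_size(eta: List[int]) -> float:
    clusters = 1 + sum(1 for a, b in zip(eta, eta[1:]) if a != b)
    return len(eta) / clusters

print(mean_cluster_size([1, 1, 0, 0, 0, 1, 1, 1, 1]))  # 9 sites, 3 clusters -> 3.0
```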
Given the asymptotics described, we already know that the mean cluster size tends to grow indefinitely. To prove the statement, we need to show that the mean cluster size, $C_{\mu_\omega}(\eta_t)$, grows in probability at rate $\sqrt{t}$, in the sense that:
$$
\frac{C_{\mu_\omega}(\eta_t)}{t^{1/2}} \xrightarrow{\ p\ } K
$$
where $K$ is a positive constant depending on $\omega$. Since, as stated before, this model reproduces the dynamics of the Voter's model, this statement is proved in Bramson and Griffeath [9]. In fact, Theorem 7, p. 211 of that paper also provides the following estimates for the lower and upper bounds of the limit expected value of the above quantity (re-written with our parametrization):
$$
\frac{\pi^{1/2}}{\frac{1+r}{2}\,\frac{1-r}{2}} \;\leq\; \lim_{t \to \infty} E\!\left[\frac{C_{\mu_\omega}(\eta_t)}{t^{1/2}}\right] \;\leq\; \frac{2\left[\left(\frac{1+r}{2}\right)^2 + \left(\frac{1-r}{2}\right)^2\right]}{\frac{1+r}{2}\,\frac{1-r}{2}}\,\pi
$$
where $\pi \approx 3.1416$.
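A rough Monte Carlo check of the $\sqrt{t}$ growth (our own sketch, not part of the proof): it runs a plain Voter model on a large ring from a product measure with density $\frac{1+r}{2}$, so quadrupling $t$ should roughly double the measured mean cluster size.

```python
# Hedged Monte Carlo sketch (our own): mean cluster size of a Voter model on a
# ring, started from a product measure, measured at increasing times; the
# values grow roughly like sqrt(t).
import random

def voter_mean_cluster(n: int, theta: float, t_end: float, seed: int = 0) -> float:
    rng = random.Random(seed)
    eta = [1 if rng.random() < theta else 0 for _ in range(n)]
    t = 0.0
    while t < t_end:
        t += rng.expovariate(n)                  # each site revises at rate 1
        x = rng.randrange(n)
        y = (x + rng.choice((-1, 1))) % n
        eta[x] = eta[y]                          # copy a randomly chosen neighbour
    clusters = 1 + sum(1 for a, b in zip(eta, eta[1:]) if a != b)
    return n / clusters

theta = (1 + 0.2) / 2                            # initial density under omega = 1 with r = 0.2
for t_end in (25.0, 100.0, 400.0):
    print(t_end, voter_mean_cluster(5000, theta, t_end))
```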
Remark 10 
If agents use strategy S2, the process of social learning with local interactions is adequate, in that
$$
\lim_{t \to \infty} \Pr[\eta_t(x) = \omega \ \text{for all}\ x \ \text{in}\ X] = 1
$$
Proof. 
Recall that the initial condition is produced by a product measure: $\Pr_{\mu_{\omega=1}}[\eta : \eta(x)=1] = \frac{1+r}{2}$ and $\Pr_{\mu_{\omega=0}}[\eta : \eta(x)=1] = \frac{1-r}{2}$, respectively. Hence, by the previous Remark, the claim follows.
  • 1.An entirely equivalent specification could be obtained by enlarging the information set and by imposing that agents only use stationary strategies, that is, strategies that do not depend on time, nor on location.
  • 2.Note that this is the case contemplated by the canonical model of social learning.
  • 3.Clearly, for r = 0 the social learning process could fail to be adequate, as it would only show consensus on a particular action, not necessarily the correct one.
  • 4.Due to the amount of spatial correlation, the information sets upon which our agents base their decisions are not disjoint, which contradicts a basic assumption of their model.
  • 5.In stating the results, we use the following additional notation. We denote any probability distribution over the state space by $\mu_t$, and the initial distribution by $\mu_\omega$. Since at time $t = 0$ choices are determined by the signals, and in any given state of the world $\omega$ these are stochastically independent, this initial distribution is by construction a product measure. As for any $t > 0$ choices may instead depend on the spatial configuration of actions chosen within neighbourhoods, $\mu_t$ will typically display some amount of spatial correlation.
  • 6.This is done WLOG, since the initial distribution that determines the initial condition is (a product measure and hence) translation invariant.
  • 7.We say that, for $\eta, \zeta \in \{0,1\}^{\mathbb{Z}^1}$, $\eta \leq \zeta$ if $\eta(x) \leq \zeta(x)$ for all $x \in \mathbb{Z}^1$. Then a process is defined to be attractive (or monotonic) if, whenever $\eta \leq \zeta$, the flip rates satisfy the following:
    $$
    c(x,\eta) \leq c(x,\zeta) \ \text{if}\ \eta(x) = \zeta(x) = 0, \qquad
    c(x,\eta) \geq c(x,\zeta) \ \text{if}\ \eta(x) = \zeta(x) = 1
    $$
