1. Introduction
In the parable of the blind men and the elephant, individuals attempt to comprehend the concept of an elephant solely through the sense of touch. Each person, limited to exploring only a specific part of the elephant, such as the wriggling trunk, flapping ears, or swinging tail, reaches their own conclusion about the shape of an elephant. One perceives the elephant as resembling a snake, another a fan, and yet another a rope. The parable serves as an analogy for situations in which economic agents have incomplete perceptions of the true state of the world. Players choose actions based on their limited perception of the state of the economy, but, eventually, the true state of the world is revealed, and players’ actions yield rewards or penalties accordingly.
To incorporate the above aspect of reality into a formal model, this paper considers a class of dynamic Bayesian games in which types evolve stochastically according to a first-order Markov process on a continuous type space (“Bayesian stochastic games”). It is well known that equilibria of stochastic games with a continuous state space are elusive. This paper overcomes the challenge by exploiting the Bayesian feature. Dynamic Bayesian games with serially correlated types, however, are notorious for the curse of dimensionality: the dimension of players’ beliefs grows over time, so equilibria of such games are generally intractable. This paper introduces delayed revelation of private information, referred to as “periodic revelation”, to overcome the dimensionality issue of beliefs. Types remain private when players choose actions, but they are revealed alongside actions when payoffs are obtained. In this framework, there exists a class of stationary Markov perfect equilibria. The game structure and equilibrium concept can be applied to the analysis of dynamic oligopoly with asymmetric information and to many other economic environments with delayed information revelation.
Suppose that, at each time t, a player has a limited piece of information about the true state of the world. This information is defined as the type of the player in the t-stage Bayesian game. The type profile of all the players is the true state of the economy, and it is hidden when players choose actions. Type and action profiles are, however, revealed to everyone at the end of the stage, when individual payoffs of the stage Bayesian game are obtained. Toward the next period, the type profile evolves stochastically according to a first-order Markov process based on the revealed type and action profiles. In particular, the previous type and action profiles are publicly known at the beginning of time t, and the probability distribution over the new type profile is common knowledge among all the players.
To see the structure in a familiar setting, consider a duopoly engaged in dynamic Cournot competition. At time t, firm i learns its cost type, which is drawn from a compact interval of the real line. However, firm i does not know firm j’s cost type, and vice versa. Firm i is aware that the cost type profile, which is the state of the economy, evolves stochastically over time according to a first-order Markov process, and that the stochastic process is based on the previous type and action profiles. Suppose the previous stage types and actions are publicly known, say, after the financial statements of firms are revealed. Then, firm i can form a belief over firm j’s current cost type based on the previous type profile and action profile. Conditional on their private cost types and beliefs about each other, firms choose optimal quantities to produce. Although similar approaches have appeared frequently in the dynamic Bayesian games literature, to the best of my knowledge, those models have suffered from the curse of dimensionality due to history-dependent beliefs. This paper, by contrast, allows players to have time-invariant beliefs as long as the previous actions and types are the same under periodic revelation. Also, such models have never been treated in a continuous type space under a first-order Markov process. This paper shows existence of a class of stationary Markov perfect equilibria in this environment, which are termed “stationary Bayesian–Markov equilibria”.
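The time-invariance of beliefs in the Cournot example can be illustrated with a small numerical sketch. Everything specific below is an assumption made for illustration only and does not come from the paper: the function names `common_prior` and `belief_of_firm_1`, the correlated-normal law of motion, and the parameters `rho`, `kappa`, `sigma`, and `corr`. The point is only that the prior takes the last revealed type and action profiles as its sole arguments, so the same revealed profiles yield the same belief at any calendar time.

```python
import numpy as np

def common_prior(prev_types, prev_actions, grid,
                 rho=0.8, kappa=0.05, sigma=0.1, corr=0.5):
    """Hypothetical common prior over (c_1, c_2) on a grid.

    It depends only on the last revealed type and action profiles,
    never on calendar time or on earlier history.
    """
    prev_types = np.asarray(prev_types, dtype=float)
    prev_actions = np.asarray(prev_actions, dtype=float)
    # Assumed law of motion: costs are persistent and fall with past output.
    mean = rho * prev_types - kappa * prev_actions
    c1, c2 = np.meshgrid(grid, grid, indexing="ij")
    z1 = (c1 - mean[0]) / sigma
    z2 = (c2 - mean[1]) / sigma
    # Unnormalized log-density of a correlated bivariate normal.
    log_w = -(z1**2 - 2.0 * corr * z1 * z2 + z2**2) / (2.0 * (1.0 - corr**2))
    w = np.exp(log_w - log_w.max())
    return w / w.sum()

def belief_of_firm_1(prior, own_index):
    """Firm 1's belief over firm 2's cost, conditional on its own grid cost."""
    row = prior[own_index]
    return row / row.sum()

grid = np.linspace(0.0, 1.0, 21)
prior = common_prior(prev_types=(0.5, 0.6), prev_actions=(1.0, 1.2), grid=grid)
belief = belief_of_firm_1(prior, own_index=8)
```

Because `common_prior` has no time argument and no history argument, the belief vector has a fixed dimension (the grid size) in every period, in contrast with beliefs conditioned on an ever-growing public history.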
Developing a framework for the analysis of dynamic oligopolies with asymmetric information has long been an open question because of the difficulty of dealing with beliefs. Fershtman and Pakes [4] propose a framework for dynamic games with asymmetric information over a discrete type space, focusing on empirical tractability. Their theoretical equilibrium concept is history-dependent, but they sidestep the dimensionality issue by assuming that accumulated data include all the information about the history. Cole and Kocherlakota [5] and Athey and Bagwell [6], respectively, suggest new equilibrium concepts in a class of dynamic Bayesian games in which players’ beliefs are a function of the full history of public information. Hörner, Takahashi, and Vieille [7] characterize a subset of equilibrium payoffs, focusing on the case where players report their private information truthfully in dynamic stochastic Bayesian games.
The key distinction of this paper from the existing literature on dynamic Bayesian games is the addition of periodic revelation to mitigate the dimensionality problem. In the literature, it is conventional to assume that private information remains hidden and never becomes public. If types are serially correlated and there is no periodic revelation, players form their beliefs about other players’ current types based on the history of available past information. Then, the dimension of each player’s beliefs grows exponentially over time. Periodic revelation in this paper, however, gives players a common prior for the current stage type distribution based on the revealed information. Players thus have time-invariant beliefs, consistent with the underlying type evolution, as long as the revealed information is the same, regardless of calendar time. This helps to establish the stationary equilibrium concept even when types are serially correlated. Practically, it captures economic environments where private information is periodically disclosed, either by legal requirement or voluntarily. Therefore, the model can be applied to dynamic oligopolies under periodic disclosure requirements on firms’ financial performance, e.g., Form 10-K in the U.S., or to the setting of Barro and Gordon [8] to analyze the effects of the release of the Federal Open Market Committee (FOMC) transcripts with a five-year delay (see Ko and Kocherlakota [9]).
Levy and McLennan [10] show that, for stochastic games with complete information in a continuous state space, existence of stationary Markov equilibria is not guaranteed. In such environments, Nowak and Raghavan [11] show that there exists a stationary correlated equilibrium. Duggan [12] shows that there exists a stationary Markov equilibrium in cases where the state variable has an additional random component. The random component, so-called noise, can be viewed as an embedded randomization device for each state. From this perspective, a stationary Markov equilibrium in noisy stochastic games can be seen as isomorphic to a stationary correlated equilibrium in the sense of Nowak and Raghavan [11]. Barelli and Duggan [13] formulate a dynamic stochastic game in which players’ strategies depend on past and current information to show that Nowak and Raghavan’s stationary correlated equilibrium can be uncorrelated. By contrast, this paper considers the case where players have private information: the state is defined by a type profile of the game of incomplete information. As the players compute their interim payoffs by integrating over possible scenarios with respect to their beliefs, the convexity of the set of interim payoffs is obtained naturally. Obtaining convexity from the Bayesian structure is more realistic than previous approaches that rely on public randomization devices (Nowak and Raghavan [11]) or random noise (Duggan [12]).
The concept of the stationary Bayesian–Markov equilibrium is closely related to a stationary correlated equilibrium in Nowak and Raghavan [11]. That is, in a Bayesian game, players have beliefs about the type profile of their game, which is the state of the economy. Players then choose actions to maximize their expected payoffs considering their individual beliefs about the state of the economy. According to Aumann [14], a resulting Bayesian equilibrium can be regarded as a correlated equilibrium distribution, utilizing the collection of beliefs as a randomization device.
To prove existence of stationary Markov equilibria in conventional stochastic games, it is common to consider an induced game of the original stochastic game. The induced game is defined by a stage game that is indexed by the state of the world and a profile of continuation value functions. The next step is to find a fixed point of the expected continuation value function in the induced game. Finally, one may apply the generalized implicit function theorem and a measurable selector theorem to extract an equilibrium strategy profile. During this process, it is crucial to have a convex set of expected continuation value functions in order to ensure existence of a fixed point. Therefore, the key to the proof is convexification of the set of continuation values in each state of the economy. Nowak and Raghavan [11] introduce a public randomization device, a so-called sunspot, as a means of convexification. In this paper, the idea of convexification is extended to the case where each player obtains private information according to a common prior that depends on the past information from periodic revelation.
The model of a Bayesian stochastic game with periodic revelation is formally described in Section 2. Section 3 contains the existence theorem and the proof. Section 4 is devoted to a computational algorithm. Section 5 sketches the extension of the proof to K-periodic revelation and concludes. In the Supplementary Materials, I present an application, specifically an incomplete information version of an innovation race between two pharmaceutical companies with periodic revelation.
3. Existence Theorem
Theorem 1 (Existence Theorem). For every Bayesian stochastic game with periodic revelation, there exists a stationary Bayesian–Markov equilibrium.
First, I construct the set V of interim expected continuation value function profiles in the following two paragraphs.
Fix
. Let
be the collection of
-equivalence classes of
-essentially bounded, measurable extended real-valued functions from
to
;
is equipped with the usual norm
, which gives the smallest essential upper bound; that is,
if
for
-almost all
;
is the collection of
-equivalence classes of integrable functions from
to
;
is equipped with
such that
.
By the Riesz representation theorem (see Royden–Fitzpatrick [19], hereafter RF, p. 400),
is the dual space of
, and it is endowed with the weak-* topology. By Proposition 14.21 of RF (p. 287),
is a locally convex Hausdorff topological vector space. Let
denote the Cartesian product
. Then, endowed with the product topology,
is a locally convex Hausdorff topological vector space.
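For reference, the standard objects behind this paragraph can be written out as follows. The pair $(\Omega, \mu)$ is a placeholder assumed here for the measure space fixed above, with $\mu$ taken to be $\sigma$-finite; these symbols are not the paper's own notation.

```latex
\begin{align*}
\|f\|_{\infty} &= \inf\bigl\{ M \ge 0 : |f(\omega)| \le M
  \text{ for } \mu\text{-almost all } \omega \in \Omega \bigr\}, \\
\|g\|_{1} &= \int_{\Omega} |g(\omega)| \, \mu(d\omega), \\
f_n \to f \ \text{weak-* in } L^{\infty}(\Omega,\mu)
  &\iff \int_{\Omega} f_n \, g \, d\mu \to \int_{\Omega} f \, g \, d\mu
  \quad \text{for every } g \in L^{1}(\Omega,\mu).
\end{align*}
```

The last line is what the duality $L^{1}(\Omega,\mu)^{*} = L^{\infty}(\Omega,\mu)$ delivers: weak-* convergence of value functions only has to be checked against integrable test functions.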
Let
consist of functions
with
; that is,
for
-almost all
. The constant
is of which
for all
. Clearly,
is nonempty and convex. Then, the Cartesian product
is also nonempty and convex. By Alaoglu’s theorem (RF, p. 299),
is compact. The finite product V is also compact (see Theorem 26.7, Munkres [20], p. 167).
Now I show that V is metrizable in the weak-* topology. Since
is a separable metric space, its Borel σ-algebra is countably generated.
is separable.
By Corollary 15.11 of RF (p. 306),
is metrizable in the weak-* topology. The Cartesian product V is metrizable in the product weak-* topology, so it is compact if and only if it is sequentially compact (see Theorem 28.2, Munkres [20], p. 179).
To show that the correspondence
is nonempty, closed graph, and convex valued, I consider the following induced game. Ultimately, I want to show that the set of interim expected payoff profiles
from the induced game in fact coincides with its convexification
. Let
denote a state-s induced game of a Bayesian stochastic game given a profile of continuation value functions v and the realized state-action profiles
in the previous period:
Here,
is defined as follows: for each
, each
,
In the interim stage of a general one-shot Bayesian game, suppose players use behavioral strategies. Knowing their realized type (
), player i chooses a mixed action (
), induced by their behavioral strategy (a measurable mapping
). In the induced game of the Bayesian stochastic game in this paper, players use stationary Bayesian–Markov strategies, which are essentially the same as behavioral strategies in a one-shot Bayesian game. The mixed actions induced by a stationary Bayesian–Markov strategy profile
of all players determine a product probability measure
. The difference between a stationary Bayesian–Markov strategy and a behavioral strategy is that, in the former, beliefs are given by the outcome of the previous stage game, and, thus, mixed actions depend on the previously realized type and action profiles as well as on the current type of player i.
The spaces
and
are endowed with the product weak topology. By Theorem 2.8 of Billingsley ([16], p. 23),
in
if and only if
in
for each i.
Lemma 1. Given , for each v and each σ, is measurable in . Then, is measurable in . Given , for each , is jointly continuous in .
Proof. Define
and
For each
,
is measurable in
;
is bounded, so Theorem 19.7 of Aliprantis and Border ([22], hereafter AB, p. 627) implies
is measurable in
. Recall that, given
, for each
,
is measurable. Since
is bounded,
is measurable. Similarly, since for each
,
is measurable,
is also measurable.
Fix
. Consider a sequence
, where for each i,
in the weak-* topology. I have
Since
is continuous,
, and
The first inequality is by the triangle inequality. The second inequality is by the fact that is essentially bounded by and that .
As
in the weak-* topology, by Proposition 3.13 of Brezis ([23], p. 63), for any given point of
,
. This induces
. Moreover, as
in X,
, because
for given s is continuous, and
is a continuous linear functional of
. This gives us
. Then, combined with norm continuity of
in a, the RHS of the last inequality converges to 0. Now, continuity of
follows from the fact that, given s, the family of real-valued functions
is equicontinuous at each a, and the result of Rao [24] under the absolutely continuous information structure holds. □
Fix
and v. For each s, let
be the set of mixed action profiles induced by Bayesian–Nash equilibria of
(s). Then
The condition holds if and only if player j’s equilibrium mixed action makes every other player i indifferent, in terms of the interim expected continuation value, among the pure actions
that belong to the support of
. This condition induces player i to mix their actions. As a result, every pure action in the support of
gives player i the same payoff (in terms of the interim expected continuation value).
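The indifference logic can be checked numerically in a deliberately simplified setting. The sketch below uses a complete-information 2×2 game (matching pennies) as a stand-in for the induced game; the function name `indifference_mix` and the payoff numbers are hypothetical and are not taken from the paper. It solves for the opponent's mixing probability that equalizes player i's expected payoff across both pure actions.

```python
import numpy as np

def indifference_mix(payoff_i):
    """Probability q that player j puts on their first action so that
    player i is indifferent between i's two pure actions.

    payoff_i[a_i, a_j] is player i's payoff (hypothetical numbers).
    """
    A = np.asarray(payoff_i, dtype=float)
    # Solve A[0,0] q + A[0,1] (1 - q) = A[1,0] q + A[1,1] (1 - q) for q.
    denom = A[0, 0] - A[0, 1] - A[1, 0] + A[1, 1]
    if denom == 0.0:
        raise ValueError("no unique indifference point")
    q = (A[1, 1] - A[0, 1]) / denom
    if not 0.0 <= q <= 1.0:
        raise ValueError("indifference requires a probability outside [0, 1]")
    return q

payoff_i = [[1.0, -1.0], [-1.0, 1.0]]   # matching pennies, player i's payoffs
q = indifference_mix(payoff_i)
# Expected payoff of each pure action of player i against the mix (q, 1 - q).
expected = np.asarray(payoff_i) @ np.array([q, 1.0 - q])
```

In the paper's setting, the payoffs being equalized are interim expected continuation values rather than one-shot payoffs, but the equalization across the support is the same requirement.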
Now define
Then, for all
. Below I follow Duggan’s [12] approach.
Lemma 2. For each v, is nonempty, compact valued, and lower measurable.
Proof. A correspondence
is lower measurable, nonempty, and compact valued. To see this, I want, for each open subset G of X, the lower inverse image of G to be measurable;
. Since
is generated by measurable rectangles,
if and only if
, where
is a subbasis element for the product topology on X. From
,
is lower measurable for each i, it is clear that
is lower measurable. It is also nonempty and compact valued because so is each
. Then,
is lower measurable, nonempty, and compact valued by Himmelberg and Van Vleck [25].
Notice that
where, given
,
, and
. Similarly,
, and
is the Dirac delta measure concentrated at
. Define
Then,
implies
.
By Lemma 2 of Duggan [12] and measurability of
for each
,
is lower measurable. Given
, Balder [18] gives us nonemptiness of
for each s. Recall that the interim expected payoff function
is continuous, and the finite product of
is compact. By the maximum theorem (Theorem 17.31, AB, p. 570),
is a compact subset of
for each s. □
Given s as the realized type profile, define the set of realized payoffs for player i from as . Then, is nonempty, compact valued, and lower measurable, since is continuous. By the Kuratowski–Ryll-Nardzewski measurable selection theorem (see Theorem 18.13, AB, p. 600), it admits a measurable selector. Then, the correspondence is integrable. The set of interim expected payoffs for player i is denoted as . Let denote the Cartesian product .
Lemma 3. For each v, each , .
Proof. By Theorem 18.10 of AB (p. 598), the correspondence for each player i’s realized payoffs
is, in fact, measurable. For any
, consider the
-section of the correspondence,
; it is clearly measurable. By Theorem 4 of Hildenbrand ([26], p. 64),
Hence, for each
,
. Since the Cartesian product of convex sets of
is convex in
,
. □
Lemma 4. The mapping is lower measurable, nonempty, compact, and convex valued.
Proof. Notice that for each
,
is nonempty and convex. Recall that
admits a measurable selector. Let
denote the set of measurable selectors from the correspondence. Note that
. Since
is compact, for each
, there is a family of functions that converge pointwise to
at each
. Notice that
is bounded, so a family of functions that converges pointwise to
is uniformly integrable. Recall that each
is a probability measure. Then,
is a set of finite measure; the finite Cartesian product
is also of finite measure. Obviously, the aforementioned family of functions is tight. Applying the Vitali convergence theorem (RF, p. 377), I obtain compactness of
for each i. The finite product
is therefore compact at each
. Applying Tonelli’s theorem (RF, p. 420), I have
measurable. Thus, I also get the lower measurability of
. Then, by Proposition 2.3 (ii) of Himmelberg ([27], p. 67),
is lower measurable. □
By the Kuratowski–Ryll-Nardzewski measurable selection theorem (Theorem 18.13, AB, p. 600), has a measurable selector. Given v, define to be the set of all -equivalence classes of measurable selectors of .
Lemma 5. The mapping is nonempty, closed graph, and convex valued.
Proof. By construction, for each v,
is nonempty, closed, and convex. Recall that the interim expected payoff function
is continuous. By the maximum theorem,
is upper hemicontinuous. Suppose a sequence
, where
and
. Lemma 1 tells us that for each player i,
Now I proceed similarly to Lemma 7 of Nowak and Raghavan [11]. Suppose
and
in the product weak-* topology on V. I want to show that
. By Mazur’s theorem and Alaoglu’s theorem (Brezis [23], p. 61 and p. 66), there is a sequence
made up of convex combinations of the
s that satisfies
as
. This implies that
converges to g pointwise almost everywhere on T. Let
for all
where
. Recall that
. Then, for each
, for each m, if
, then
; since
,
; thus,
for each
and clearly
. Hence,
is upper hemicontinuous. In addition, I observe that for each m,
is closed and
is closed. By the closed graph theorem for correspondences (Theorem 17.11, AB, p. 561),
has a closed graph. □
Proof of Existence Theorem. Obviously,
. By the Kakutani–Fan–Glicksberg theorem (see Theorem 17.55, AB, p. 583), I have a fixed point of
. Then, I have
such that given
,
for all
,
. Recall that
is measurable in
and continuous in
(Lemma 1). Now, by Filippov’s implicit function theorem (see Theorem 18.17, AB, p. 603), there exists a measurable mapping
such that, given
, for each s,
and for all i, each
,
For each i,
,
admits a measurable selector by the Kuratowski–Ryll-Nardzewski measurable selection theorem (Theorem 18.13, AB, p. 600). Let
be any measurable selector of
. Given
, put
Then,
for all s, and
Therefore, f is a stationary Bayesian–Markov equilibrium strategy profile. □
4. Computational Algorithm
The flow of the proof in Section 3 can be used as a computational algorithm to find an equilibrium. The algorithm is especially useful for environments with heterogeneous beliefs. First, find a fixed point in a set of interim expected continuation value functions; second, extract an equilibrium strategy that generates the fixed point value function. In this sense, the computational algorithm for dynamic Bayesian games in a continuous type space with periodic revelation is close to value function iteration in macroeconometrics. The difference between the popular macroeconometrics technique and the following algorithm is, however, that there are multiple agents who have heterogeneous beliefs about the others’ types in dynamic Bayesian games. That is, the highlight of the computational algorithm in this section is to find a fixed point for an array of interim expected continuation value functions given the heterogeneous beliefs.
Start the algorithm by approximating the interim expected continuation value function
for player i by a polynomial, preferably a Chebyshev orthogonal polynomial. Accordingly, the type space and action space are given by grids of Chebyshev nodes. In addition, build the beliefs and transition probability distribution over different sets of grids for numerical integration, for example, a type space with equidistant nodes
and an action space with equidistant nodes,
for all i. Over the grids, construct the probability distribution over the collection of types for the current stage,
, given any previous type and action profiles,
. From this
, beliefs for the other players
and the future type distribution
can be obtained.
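The approximation step can be sketched as follows, assuming a one-dimensional type interval. The helper names (`cheb_nodes`, `fit_coeffs`, `evaluate`, `trapezoid_weights`), the interval [0, 1], the grid sizes, and the test function exp(-t) are all illustrative assumptions rather than the paper's choices. The sketch fits a Chebyshev interpolant on Chebyshev nodes and then integrates it on a separate equidistant grid, mirroring the two kinds of grids described above.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def cheb_nodes(n, lo, hi):
    """n Chebyshev nodes (first kind) mapped from [-1, 1] to [lo, hi]."""
    return lo + 0.5 * (hi - lo) * (C.chebpts1(n) + 1.0)

def to_unit(t, lo, hi):
    """Map points in [lo, hi] back to [-1, 1]."""
    return 2.0 * (np.asarray(t, dtype=float) - lo) / (hi - lo) - 1.0

def fit_coeffs(values, nodes, lo, hi):
    """Coefficients of the interpolating Chebyshev polynomial at the nodes."""
    V = C.chebvander(to_unit(nodes, lo, hi), len(nodes) - 1)
    return np.linalg.solve(V, values)

def evaluate(coeffs, t, lo, hi):
    """Evaluate the Chebyshev approximation at points t in [lo, hi]."""
    return C.chebval(to_unit(t, lo, hi), coeffs)

def trapezoid_weights(m, lo, hi):
    """Equidistant integration grid and trapezoid weights on [lo, hi]."""
    grid = np.linspace(lo, hi, m)
    w = np.full(m, (hi - lo) / (m - 1))
    w[0] *= 0.5
    w[-1] *= 0.5
    return grid, w

lo, hi = 0.0, 1.0
nodes = cheb_nodes(12, lo, hi)
target = lambda t: np.exp(-t)             # stand-in for a value function
coeffs = fit_coeffs(target(nodes), nodes, lo, hi)
grid, w = trapezoid_weights(201, lo, hi)  # separate grid for integration
max_err = np.max(np.abs(evaluate(coeffs, grid, lo, hi) - target(grid)))
integral = w @ evaluate(coeffs, grid, lo, hi)
```

Interpolating at Chebyshev nodes keeps the approximation error small and well behaved near the endpoints, while the equidistant grid is only used to form weighted sums against beliefs and transition probabilities.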
Then, the main algorithm is given by the following backward iteration process: start with the jth guess of
and plug it into the right-hand side. After numerical integration using
and
, the result from the right-hand side is fed to the loop as the (j + 1)th guess of
. Repeat the iteration until
converges. By the existence theorem, a fixed point for
exists, and, thus, for large enough j,
is guaranteed to be sufficiently close to
.
Define an operator
that implements Equation (29). That is,
and, more precisely, it is an operation on the coefficients of Chebyshev polynomials. As there are multiple players
, by stacking all the vectors of
for all i, in the jth iteration,
The sequence of the vectors
for all i is guaranteed to converge as
is a contraction mapping. Denote the solution as
. Then, the associated equilibrium stationary Bayesian strategy profile
in Equation (29) can be obtained by plugging the solution
into both sides.
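The iteration on stacked coefficient vectors can be sketched with a toy operator. This is not the operator of Equation (29), which is not reproduced here: the stand-in below discounts a fixed flow payoff against a hypothetical row-stochastic transition matrix `Q`, so that it is a contraction by construction (discount factor `delta` below one). The names `make_toy_problem`, `operator`, and `iterate`, and all parameter values, are assumptions for illustration.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def make_toy_problem(n_players=2, n_nodes=10, delta=0.9, sigma=0.3):
    x = C.chebpts1(n_nodes)                    # type grid on [-1, 1]
    V = C.chebvander(x, n_nodes - 1)           # node values = V @ coefficients
    # Hypothetical transition: Gaussian weights around a mean-reverting type.
    K = np.exp(-0.5 * ((x[None, :] - 0.7 * x[:, None]) / sigma) ** 2)
    Q = K / K.sum(axis=1, keepdims=True)       # each row sums to one
    # Hypothetical flow payoffs, one row per player.
    u = np.stack([1.0 - (x - 0.2 * i) ** 2 for i in range(n_players)])
    return V, Q, u, delta

def operator(coeffs, V, Q, u, delta):
    """One update of the stacked coefficient array (players x coefficients)."""
    values = coeffs @ V.T                      # values at the nodes
    new_values = (1.0 - delta) * u + delta * values @ Q.T
    return np.linalg.solve(V, new_values.T).T  # back to coefficients

def iterate(coeffs, step, tol=1e-10, max_iter=10_000):
    """Feed the j-th guess into the operator until successive guesses agree."""
    for j in range(1, max_iter + 1):
        new = step(coeffs)
        if np.max(np.abs(new - coeffs)) < tol:
            return new, j
        coeffs = new
    raise RuntimeError("no convergence within max_iter")

V, Q, u, delta = make_toy_problem()
start = np.zeros_like(u)                       # initial guess: all zeros
coeffs_star, n_iter = iterate(start, lambda c: operator(c, V, Q, u, delta))
values_star = coeffs_star @ V.T
```

In this toy, the fixed point can be verified against a direct linear solve, which gives a simple correctness check for the loop. For the equilibrium operator itself, each update would additionally require solving the stage Bayesian game at every grid point, and convergence would have to be monitored rather than assumed.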
5. Conclusions and Extension
In reality, economic agents rarely have a true understanding of the state of the economy. Often they have their own biased perceptions, which reflect only part of the true state of the economy, and these perceptions may remain private. Moreover, such biased perceptions tend to be serially correlated. To reflect such aspects of reality more closely, dynamic games with asymmetric information can be a better modeling choice than ones with symmetric information. This paper constructs a class of dynamic Bayesian games where types evolve stochastically according to a first-order Markov process depending on the type and action profiles from the previous period. In this class of dynamic Bayesian games, however, the dimension of players’ beliefs grows exponentially over time. To mitigate the curse of dimensionality in the dynamic Bayesian game structure, this paper considers an environment where past asymmetric information becomes symmetric with a delay (periodic revelation). Specifically, this paper considers the case where the previous type profile is revealed as public information with a one-period delay. That is, the type profile remains hidden when players choose actions, but type and action profiles are revealed when players obtain their payoffs. As the common prior for the next stage Bayesian game is pinned down by the type and action profiles, players’ beliefs do not explode over time.
Theoretically, in the dynamic Bayesian game of this paper, the type space is a complete separable metric space (Polish space) and the action space is a compact metric space. A stationary Bayesian–Markov strategy is constructed as a measurable mapping that maps a tuple of the previous type profile, action profile, and player i’s current type, i.e.,
, to a probability distribution over actions. Then, there exists a stationary Bayesian–Markov equilibrium. This stationary Bayesian–Markov equilibrium concept is related to the stationary correlated equilibrium concept in Nowak and Raghavan [11] through Aumann [14].
Similarly, the proof can be extended to a game with K-period lagged revelation, i.e., in each period t, players observe type and action profiles up to period
, whereas for periods from
to t, they only observe the history of their own types, their own actions, and their own payoffs. Then, existence of a stationary Bayesian–Markov equilibrium with K-period lagged revelation can be proved once players’ heterogeneous beliefs
are adjusted judiciously: the unrevealed history from period
to period t plays the role of each individual player’s new “type”, and beliefs are assumed to be induced by a common prior conditional on the lagged information
.
Figure 2 depicts the timeline. See the Supplementary Materials for the formal proof.