Characterizing Agent Behavior in Revision Games with Uncertain Deadline

Zhuohan Wang; Dong Hao

doi:10.3390/g13060073

School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610056, China

* Author to whom correspondence should be addressed.

Games 2022, 13(6), 73; https://doi.org/10.3390/g13060073

This article belongs to the Section Algorithmic and Computational Game Theory

Abstract

Revision games are a recent advance in dynamic game theory and can be used to analyze trading in the pre-opening stock market. In such games, players prepare actions that will be implemented at a given deadline, before which they may have opportunities to revise them. For the first time, we study the role of the deadline, which is the core component that distinguishes revision games from classic games. We introduce a deadline distribution into the revision game model and characterize the necessary and sufficient condition for players’ strategies to constitute an equilibrium. The equilibrium strategy under deadline uncertainty is given by a simple set of differential equations. Governed by these equations, players initially fully cooperate, and the cooperation level decreases as time progresses. The uncertainty has a great impact on players’ behavior. As the uncertainty increases, players become more risk averse, in the sense that they prefer a lower mutual cooperation rate to a higher payoff with higher uncertainty. Specifically, they stay in full cooperation for a shorter time, and after they deviate from full cooperation, they adjust their plans more slowly and cautiously. Deadline uncertainty can improve competition and prevent collusion in games, which could be utilized for auction design and pre-opening stock market regulation.

Keywords:

revision game; uncertain deadline; stochastic process; multiagent cooperation; pre-opening stock market

1. Introduction

Revision game [1] is a multiplayer continuous-time game with continuous action space. It starts at time

- T

and ends at a fixed deadline time 0. Players can prepare an initial action at time

- T

and thereafter revise their action according to a Poisson process, which is called revision opportunity. During the game, players can fully observe each other’s action. When a revision opportunity arrives, players can change their actions simultaneously. While players can change their actions many times, the payoff is only obtained at the deadline, which depends on the last action players choose.

Some work explores general properties of revision games. For the case where players do not act simultaneously, Moroni et al. propose asynchronous revision games and prove the existence of trembling-hand and sequential equilibria [2]. The use of asynchronous revision games in common and opposing interest games is studied in [3]. Stochastic revision games are defined in [4], where players’ payoffs depend not only on the last action but also on the environment state; the authors prove the existence of a Markov perfect equilibrium. Gensbittel et al. study the equilibrium payoff of revision games, which they call the revision value, and characterize the equilibrium strategy for zero-sum revision games [5].

Many real-world scenarios can be modeled as revision games. A well-studied case is the pre-opening phase in stock markets such as Nasdaq or Euronext. Traders can submit orders before the market opens, and these orders can be changed until the opening time. The opening price is determined by the submitted order prices and quantities, which can be seen on the public screen. Another case is online auction websites like eBay, where bidders effectively play a revision game during the auction. eBay auctions usually have a deadline, before which bidders can revise their bids many times. Bidders’ opportunities to use eBay and to change bids follow a stochastic process (many human activities can be characterized by a Poisson process). The authors of [1] apply the revision game model to the stock pre-opening period, where traders can revise orders based on the refreshment of the public screen until the market opens. They point out that traders may have an incentive to revise their orders over time during the pre-opening phase and form collusion. Although the result cannot be interpreted as a precise description of the pre-opening phase, it suggests the possibility of implicit collusion among market participants. Kamada and Sugaya also use the revision game model to analyze candidates’ behavior in election campaigns. They explain why candidates use ambiguous language in campaigns and change their policies as the campaign progresses [6].

However, when players engage in games with a fixed deadline, several problems may arise. For example, a fixed deadline may lead to price manipulation [7]. Traders can profit from manipulating the opening price by submitting large orders in the very last few seconds [8]. To tackle this challenge, many exchanges, such as Euronext, Deutsche Börse and the Tel Aviv Stock Exchange, switched from an opening trade that ends at a fixed time to one that ends at a random time [9]. Introducing an uncertain deadline is also meaningful for online auction platforms such as eBay, where auctions usually have fixed deadlines [10]. Roth and Ockenfels find that there are many late bids (called sniping), where some bidders do not bid until the last possible moment, thus reducing the seller’s revenue [11]. Ockenfels and Roth show that sniping is sensitive to the rules governing how an auction ends [12]. Füllbrunn and Sadrieh conduct experiments on random-deadline auctions and show that bidders in such auctions bid more frequently in the early stage than in fixed-deadline auctions [13]. Google also has a patent on a random ending time system for online auctions, designed so that bidders have no preference over the time of bidding [14]. Häfner and Stewart show that choosing an appropriate ending time distribution can mitigate the front-running problem in a discrete blockchain auction [15].

In this work, we introduce a deadline distribution into the revision game model and characterize the necessary and sufficient condition for players’ strategies to constitute an equilibrium, where the equilibrium strategy is given by a simple set of differential equations. With theoretical analysis and experimental results on the Cournot game and the public goods game, we find that as deadline uncertainty increases, players become less cooperative than in the fixed-deadline situation, so collusion can be controlled. Therefore, our work can be helpful in auction design and pre-opening stock market regulation.

2. Preliminaries

Consider a symmetric game with two players

i = 1, 2

(our results easily extend to the n-player case), where player i’s action and payoff are denoted by

a_{i} \in A_{i}

and

π_{i} (a_{1}, a_{2})

, respectively. Players share the same action space A, which is a convex subset of

\mathbb{R}

. The game starts at time

- T

and ends at time 0. The ending time is called deadline. Players prepare an initial action at time

- T

and can revise actions when revision opportunities arrive during time interval

(- T, 0]

. Revision opportunities’ arrival is based on a Poisson process with arrival rate

λ > 0

. Players can fully observe the opponents’ initial action and subsequent revised actions. Players revise their actions according to revision opportunities simultaneously without any cost. There is only one payoff for each player, which is realized at the deadline. In this paper, we will analyze the symmetric equilibrium for this game, thus we denote players’ payoffs at a symmetric action profile as

π (a) : = π_{1} (a, a) = π_{2} (a, a)

. We adopt the following assumptions of revision games [1].

Assumption 1.

For each stage game at time

- t

, there exists a unique pure symmetric Nash equilibrium action profile

(a^{N}, a^{N})

, at which both players fully defect, with payoffs

π^{N} : = π (a^{N}, a^{N})

; there is also a unique optimal symmetric action profile

(a^{*}, a^{*})

and

a^{*} = {arg max}_{a} π (a, a)

. If

a^{N} < a^{*}

,

π (a)

is strictly increasing for

a \in [a^{N}, a^{*}]

(symmetrically holds if

a^{*} < a^{N}

).

Assumption 2.

Assume player i’s action

a_{i}

and payoff

π_{i}

are continuous and

{max}_{a_{i}} π_{i} (a_{1}, a_{2})

always exists. We denote player i’s maximum deviation gain at a symmetric action profile

(a, a)

by

d (a) : = {max}_{a_{i}} π_{i} (a_{i}, a) - π_{i} (a, a)

. Moreover,

d (a)

is strictly increasing for

a \in [a^{N}, a^{*}]

(symmetrically holds if

a^{*} < a^{N}

).

Assumption 1 requires two distinct action profiles: the Nash equilibrium action profile and the optimal action profile. The symmetric payoff

π (a)

monotonically decreases as we move away from the optimal action

a^{*}

. Assumption 2 requires the continuity of players’ actions and payoffs, and defines the maximum deviation gain. The deviation gain monotonically increases as we move away from the Nash equilibrium. The assumptions above are common in continuous-action games and hold in classic games such as the continuous prisoner’s dilemma, Cournot competition and Bertrand competition.
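The timing structure described in this section can be illustrated with a short simulation. The sketch below is not part of the original model code; the function names `sample_revision_times` and `prob_no_arrival` and the parameter values (T = 10, λ = 0.2) are assumptions chosen for illustration. It draws the arrival times of revision opportunities in the interval from −T to 0 from a Poisson process with rate λ and evaluates the probability that no opportunity arrives in a window of length t.

```python
import math
import random


def sample_revision_times(T, lam, rng):
    """Return the sorted arrival times of revision opportunities in (-T, 0].

    Inter-arrival gaps of a Poisson process with rate lam are exponential,
    so we accumulate exponential gaps starting from time -T.
    """
    times = []
    clock = -T
    while True:
        clock += rng.expovariate(lam)
        if clock > 0.0:
            return times
        times.append(clock)


def prob_no_arrival(t, lam):
    """Probability that no revision opportunity arrives in a window of length t."""
    return math.exp(-lam * t)


rng = random.Random(0)  # fixed seed so the illustration is reproducible
arrivals = sample_revision_times(T=10.0, lam=0.2, rng=rng)
print("revision opportunities at times:", [round(s, 2) for s in arrivals])
print("P(no opportunity in 10 time units):", prob_no_arrival(10.0, 0.2))
```

Only the action in place at the deadline matters for payoffs, so the last arrival time in each sampled path is the one that determines the implemented action.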

3. Trigger Equilibrium for Fixed Deadline

The currently existing equilibrium strategy is a grim trigger strategy. By grim trigger, a revision game player initially follows her plan, but punishes the opponent if a certain level of defection (i.e., the trigger) is observed.

Denote by t the time remaining until the deadline, so that the current time is

- t

. A player’s plan is a function

x : [0, T] \to A

, which specifies the action to be chosen when the remaining time is t. We say that action a is cooperative (or collusive) or achieves (some degree of) cooperation (or collusion) if it provides a higher payoff than the Nash equilibrium:

π (a) = π_{i} (a, a) > π^{N}

.

A symmetric grim trigger strategy for revision games is defined as follows. Players start with the initial action

x (T)

, and when a revision opportunity arrives at time

- t

, they change their actions to

x (t)

. If any player fails to choose

x (t)

, which we regard as betrayal, both players choose the Nash action

a^{N}

in all future revision opportunities.

Next, we characterize the set of trigger strategy equilibria. Let

λ

be the arrival rate of the Poisson process. With this arrival rate, the probability that no revision opportunity arrives in the remaining time t is

e^{- λ t}

. The probability density that the last revision opportunity arrives when the remaining time is t (an opportunity arrives at that moment and none arrives afterwards) is

λ e^{- λ t}

. Therefore, the expected payoff of each player associated with strategy

x (t)

over period

(- T, 0]

can be calculated as:

V (x) : = π (x (T)) e^{- λ T} + \int_{0}^{T} π (x (t)) λ e^{- λ t} d t

(1)
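Equation (1) can be evaluated numerically for any given plan. The following is a minimal sketch, assuming a composite trapezoid rule and a hypothetical helper name `expected_payoff`; the payoff used in the example is the symmetric Cournot payoff with a = 4 and c = 1 that appears in Section 6. For a constant plan the weights in Equation (1) sum to one, so the expected payoff should equal the stage payoff, which gives a simple sanity check.

```python
import math


def expected_payoff(plan, payoff, T, lam, n=2000):
    """Evaluate Eq. (1): pi(x(T)) e^{-lam T} + int_0^T pi(x(t)) lam e^{-lam t} dt.

    plan   : function t -> action, where t is the time remaining until the deadline
    payoff : symmetric stage payoff pi(a)
    n      : number of trapezoid panels (an arbitrary choice for this sketch)
    """
    h = T / n

    def integrand(t):
        return payoff(plan(t)) * lam * math.exp(-lam * t)

    integral = 0.5 * (integrand(0.0) + integrand(T))
    for k in range(1, n):
        integral += integrand(k * h)
    integral *= h
    return payoff(plan(T)) * math.exp(-lam * T) + integral


def cournot_symmetric_payoff(x):
    """pi(x) = (a - 2x - c) x with a = 4, c = 1 (values from Section 6)."""
    return (3.0 - 2.0 * x) * x


# A constant plan at the fully cooperative quantity 0.75.
value = expected_payoff(lambda t: 0.75, cournot_symmetric_payoff, T=10.0, lam=0.2)
print(value)
```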

To form subgame-perfect equilibrium at time t, the incentive constraint for the trigger strategy with plan

x (t)

is:

\underbrace{d (x (t)) e^{- λ t}}_{\text{deviation}} \leq \underbrace{\int_{0}^{t} [π (x (s)) - π^{N}] λ e^{- λ s} d s}_{\text{punishment}}

(2)

The left side means that, if the player who deviates from the plan

x (t)

wants to realize the deviation gain, then no revision opportunity should arrive during

(- t, 0]

. The right side says that, if there is one revision opportunity that arrives at time

- s

, the player who deviates from the plan gets the punishment

π (x (s)) - π^{N}

. The incentive constraint shows that a player can not increase her payoff by deviating from

x (t)

. There are many

x (t)

that satisfy the incentive constraint, so many trigger strategy equilibria exist.

The plan which satisfies the binding constraint of Equation (2) (i.e., the LHS is equal to the RHS) is called the trigger strategy equilibrium plan, or equilibrium plan. As time

- t

approaches the deadline 0, the RHS becomes zero. So the deviation gain on the LHS decreases as time goes by. This means the plan

x (t)

should be more uncooperative as the deadline approaches. When time is very close to the deadline,

d (x (t))

is near zero, so it must be that

x (t)

is close to the Nash equilibrium action

a^{N}

. At the deadline itself, we have

x (0) = a^{N}

. Equation (2) is the condition for subgame-perfect equilibrium: no player can obtain an excess payoff by deviating from the plan at any time

- t

. By solving Equation (2), one can have the following Theorem 1.

Theorem 1.

Assume d is differentiable on

(a^{N}, a^{*}]

and

d^{'} > 0

if

a^{N} < a^{*}

(or symmetrically

d^{'} < 0

if

a^{*} < a^{N}

), then differentiating both sides of the binding incentive constraint of Equation (2) by t, we obtain a differential equation about the continuation plan x as:

\frac{d x}{d t} = \frac{λ (d (x) + π (x) - π^{N})}{d^{'} (x)},

(3)

with the boundary condition

x (0) = a^{N}

. Equation (3) gives the equilibrium plan for revision games with fixed deadline.

Note that this differential equation is a first-order ordinary differential equation, where

x (t)

is monotonically decreasing for

a^{*} < a^{N}

(symmetrically holds if

a^{*} > a^{N}

). To solve it, we only need one boundary condition, which is

x (0) = a^{N}

. In summary, we can use the solution to Equation (3) with boundary condition

x (0) = a^{N}

as the trigger strategy in the fixed-deadline revision game. As long as every player obeys the trigger strategy, cooperation is achieved and everyone gets a higher payoff than Nash Equilibrium payoff.
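As a worked illustration of Equation (3), consider the Cournot duopoly used later in Section 6, with demand intercept 4 and unit cost 1 (so that the Nash action is 1 and the optimal action is 0.75). The expressions for d and d' below are derived here from that payoff function and are not stated in the original text; this is a sketch under those assumptions.

```latex
\pi(x) = (3 - 2x)\,x, \qquad \pi^{N} = 1, \qquad
d(x) = \tfrac{9}{4}(1 - x)^{2}, \qquad d'(x) = -\tfrac{9}{2}(1 - x)

\frac{dx}{dt}
  = \frac{\lambda\left[\tfrac{9}{4}(1 - x)^{2} + (2x - 1)(1 - x)\right]}{-\tfrac{9}{2}(1 - x)}
  = -\frac{\lambda\,(5 - x)}{18}

x(t) = 5 - 4\,e^{\lambda t / 18}, \qquad x(0) = 1 = a^{N}
```

The common factor (1 − x) cancels, so the right-hand side is well defined at the boundary. The plan decreases from the Nash quantity as the remaining time t grows and reaches the optimal quantity 0.75 at t* = (18/λ) ln(17/16); for example, with λ = 0.2 this gives t* of about 5.46. For larger remaining times the plan stays at full cooperation.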

4. Games with Bernoulli Deadline Distribution

In Section 3, the deadline (at time 0) is fixed and is common knowledge to every player. In this section, we will give the form of trigger strategy and equilibrium plan in the two-point random deadline situation.

4.1. Extended Revision Game

Consider a revision game with two possible deadlines, where the deadline follows a Bernoulli (two-point) distribution. The scenario is depicted in Figure 1. The deadline is either time 0 with probability p or time

- a

with probability

1 - p

. Note that when players are in time period

[- t, - a)

, they do not know the deadline time; they only know its probability distribution. But when they are at time

- a

, they will know the exact deadline time because the game either ends at time

- a

or continues. When the game continues, players know that the deadline is not time

- a

, but time 0. Once players pass through time

- a

, the two-point deadline game reduces to the fixed-deadline situation. For each case of the deadline, we can use the trigger strategy described in Section 3 to determine players’ plans.

Figure 1. Two possible deadlines, one is at 0 with probability p, and the other one is at

- a

with probability

1 - p

. Players make a plan that takes both possible deadlines into consideration.

Therefore, by extending the LHS of Equation (2), the expectation of deviation gain at time

- t

in the two-point deadline situation can be written as follows.

d (x (t)) e^{- λ t} \cdot p + d (x (t)) e^{- λ (t - a)} \cdot (1 - p) .

(4)

The first term in Equation (4) is the expectation of deviation gain if the deadline is time 0, the second term is the expectation of deviation gain if the deadline is time

- a

.

Similarly, by extending the RHS of Equation (2), we can also rewrite the expectation of continuation punishment in the future as follows:

\begin{matrix} {\int_{0}^{t} [π (x (s)) - π^{N}] λ e^{- λ s} d s} \cdot p + \\ {\int_{a}^{t} [π (x (s)) - π^{N}] λ e^{- λ s} d s} \cdot (1 - p), \end{matrix}

(5)

where the first integral is the punishment if the deadline is time 0, while the second integral is that if the deadline is time

- a

. It can be simplified into

\int_{a}^{t} [π (x (s)) - π^{N}] λ e^{- λ s} d s + p \int_{0}^{a} [π (x (s)) - π^{N}] λ e^{- λ s} d s

This simplified form indicates that, no matter when the deadline is realized, players will punish the deviation in the time period

[- t, - a]

as long as there is a revision opportunity. But only if the deadline is time 0 can they take punishment measures in the time period

[- a, 0]

. So the punishment from

[- a, 0]

is multiplied by the probability factor p, while the punishment from

[- t, - a]

is not.
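The simplification of Equation (5) follows from splitting the first integral at a. The shorthand h(s) below is introduced only for this worked step and is not notation from the original text.

```latex
h(s) := [\pi(x(s)) - \pi^{N}]\,\lambda e^{-\lambda s}

p \int_{0}^{t} h(s)\,ds + (1 - p) \int_{a}^{t} h(s)\,ds
  = p \int_{0}^{a} h(s)\,ds + p \int_{a}^{t} h(s)\,ds + (1 - p) \int_{a}^{t} h(s)\,ds
  = \int_{a}^{t} h(s)\,ds + p \int_{0}^{a} h(s)\,ds
```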

4.2. Risk-Averse Equilibrium Plan

With Equation (4) denoting the expectation of deviation gain, and Equation (5) denoting the expectation of future punishment, we write the incentive constraint like Equation (2) to form the subgame perfect equilibrium in time period

[- t, - a)

:

\begin{matrix} d (x (t)) e^{- λ t} \cdot p + d (x (t)) e^{- λ (t - a)} \cdot (1 - p) \leq \\ \int_{a}^{t} [π (x (s)) - π^{N}] λ e^{- λ s} d s + p \int_{0}^{a} [π (x (s)) - π^{N}] λ e^{- λ s} d s \end{matrix}

(6)

While there could be many trigger strategies which can satisfy Equation (6), we only focus on the strategy that can bring the player highest expected payoff.

Comparing Equation (2) and Equation (6), we find that with deadline uncertainty introduced, the deviation gain in Equation (6) becomes larger than that in Equation (2), while the future punishment becomes smaller. This means that with deadline uncertainty, players are more tempted to deviate at time

- t

. Once a player has deviated from the predetermined plan, the uncertainty of the deadline diminishes the harshness of the punishment. Therefore, when players are confronted with an uncertain deadline rather than a fixed one, they become less cooperative. Technically, the binding constraint of Equation (6) gives us the equilibrium plan in the following lemma.

Lemma 1.

For revision games with Bernoulli deadline distribution, the equilibrium plan on

[- t, - a)

satisfying the binding constraint of Equation (6) is:

\frac{d x}{d t} = \frac{λ \cdot d (x) + (π (x) - π^{N}) \cdot \frac{λ}{p + e^{λ a} (1 - p)}}{d^{'} (x)}

(7)

Proof.

Differentiating the LHS of Equation (6) by t, we can get

[d^{'} (x (t)) \frac{d x}{d t} - λ d (x (t))] \cdot e^{- λ t} \cdot [p + e^{λ a} (1 - p)] .

As for the RHS of Equation (6), notice that only the first item contains t, so we transfer the first item from the integral on

[a, t]

to the difference between integral on

[0, t]

and integral on

[0, a]

. After differentiating by t, the RHS of Equation (6) becomes

[π (x (t)) - π^{N}] \cdot λ e^{- λ t} .

Let the two differentials be equal, we can get Equation (7) as long as

d^{'} (x (t)) \neq 0

. □

Comparing Equation (7) with Equation (3), we can find that in the equilibrium plan for revision games with uncertain deadline, the players’ risk aversion is captured by the term

RA = \frac{1}{p + e^{λ a} (1 - p)},

(8)

where the denominator is greater than or equal to 1. Thus the gradient

\frac{d x}{d t}

for adjusting the plan becomes smaller, which indicates that even when confronted with only one additional possible deadline, players become more conservative in adjusting their actions. Thus we refer to the term in Equation (8) as the degree of risk aversion. As time point

- a

moves toward time point 0, the denominator becomes closer to 1, which means that players are more inclined to move to a non-cooperative state. In the extreme case, if

- a = 0

, then

p + e^{λ a} (1 - p)

becomes 1 and Equation (7) degenerates to Equation (3). We will further interpret the meaning of this change in Section 5.3. It is worth noting that Equation (7) only describes the form the equilibrium strategy takes during the time period

[- t, - a)

, but its exact solution cannot be determined because a boundary condition is lacking. We give the method for calculating the equilibrium strategy in Section 5.
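The behavior of the term in Equation (8) is easy to check numerically. The snippet below is an illustrative sketch; the function name `risk_aversion_bernoulli`, the argument name `early_offset` (the distance a between the early deadline and time 0) and the parameter values are assumptions made for this example.

```python
import math


def risk_aversion_bernoulli(p, early_offset, lam):
    """Eq. (8): RA = 1 / (p + e^{lam * a} (1 - p)), with a = early_offset >= 0.

    p is the probability that the deadline is time 0; the deadline is
    time -a with probability 1 - p.
    """
    return 1.0 / (p + math.exp(lam * early_offset) * (1.0 - p))


lam = 0.2
for a in (0.0, 1.0, 2.0, 5.0):
    print(a, risk_aversion_bernoulli(p=0.5, early_offset=a, lam=lam))
```

With a = 0 or p = 1 the function returns 1, which corresponds to the fixed-deadline case; moving the early deadline further from time 0 lowers the value.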

5. Games with Multiple Deadline Distribution

The previous section investigated revision games with an uncertain deadline in the simple Bernoulli case as a warm-up. This section extends these games to arbitrary deadline distributions.

5.1. Multiple Possible Deadlines

Consider a game with a set of multiple possible deadlines

\mathbf{z} = [0, - z_{1}, - z_{2}, \dots, - z_{n}]

and the corresponding probability distribution

\mathbf{p} = [p_{0}, p_{1}, p_{2}, \dots, p_{n}]

. During time period

[- t, - z_{n})

, players cannot be sure when the deadline will come. Similarly to Equation (4), we can derive the expectation of deviation gain as follows.

\begin{matrix} G (t) = & d (x (t)) e^{- λ t} \cdot p_{0} + d (x (t)) e^{- λ (t - z_{1})} \cdot p_{1} \\ + \dots + d (x (t)) e^{- λ (t - z_{n})} \cdot p_{n} \end{matrix}

(9)

Each term in Equation (9) represents one expected deviation gain for a possible deadline. For instance, for deadline

- z_{k}

, the value of

e^{- λ (t - z_{k})}

denotes the probability no revision opportunity arrives during time period

[- t, - z_{k})

, while the deadline at time

- z_{k}

is realized with probability

p_{k}

.

Similarly, we can derive the continuation punishment when someone deviates from the predetermined plan at time

- t

:

\begin{matrix} P (t) = & {\int_{0}^{z_{1}} [π (x (s)) - π^{N}] λ e^{- λ s} d s} \cdot p_{0} + \\ {\int_{z_{1}}^{z_{2}} [π (x (s)) - π^{N}] λ e^{- λ s} d s} \cdot (p_{0} + p_{1}) + \dots + \\ {\int_{z_{n}}^{t} [π (x (s)) - π^{N}] λ e^{- λ s} d s} \cdot (p_{0} + p_{1} + \dots + p_{n}) \end{matrix}

(10)

This summation shows that, if the deadline arrives at time

- z_{n}

, the one who deviates can only be punished in time period

(- t, - z_{n}]

with probability

p_{0} + p_{1} + \dots + p_{n} = 1

. But, if the deadline arrives at time 0, the deviator has to suffer from the punishment all the way to time 0. The first item in Equation (10) represents the expectation of punishment in time period

(- z_{1}, 0]

, the second item represents the expectation of punishment in time period

(- z_{2}, - z_{1}]

, ⋯, the last item represents the expectation of punishment in period

(- t, - z_{n}]

.

For plan x to constitute an equilibrium, it is required that

G (t) \leq P (t) .

(11)

Equalizing

G (t)

and

P (t)

gives us the binding constraint for plan x, together with the grim trigger mechanism, to be an equilibrium strategy. According to the binding constraint, we easily get the following proposition.

Proposition 1.

When

t \to 0

, the value of

P (t)

in Equation (10) tends to zero. By the binding constraint

G (t) = P (t)

, it should be that

d (x (t)) \to 0

, which further requires that

x (t)

at time 0 is the Nash action

a^{N}

.

5.2. Equilibrium Plan as Differential Equation Set

To formally represent the equilibrium plan for any time, we do the following reasoning. Assume

- t \in (- T, 0]

is the current time and let

- y \in z

denote an arbitrary possible deadline. Let

- z^{'}

be the nearest possible deadline after time

- t

. That is,

z^{'} = max y

among all

y < t

. On the one hand, for each

- y

, denote a cumulative distribution function (CDF) of the time distance

t - y

by

F (t - y) = e^{- λ (t - y)}

, which is the cumulative probability that no revision opportunity arrives during period

[- t, - y]

. Therefore, the vector

[e^{- λ t}, e^{- λ (t - z_{1})}, \dots, e^{- λ (t - z_{n})}]

is a vector of CDFs. On the other hand, let

g (- y)

denote the probability density function (PDF) of deadline at time

- y

.

We differentiate Equation (9) and the last term of Equation (10), which is the only term that depends on the variable t, and set the two results equal. We then obtain the equilibrium plan for the multiple possible deadline situation. The result is given as follows.

Theorem 2.

The equilibrium plan for revision games with multiple possible deadlines in the whole game period

(- T, 0]

is characterized by a differential equation:

\begin{matrix} \frac{d x}{d t} = & \frac{λ \cdot d (x) + λ \cdot [π (x) - π^{N}] \cdot RA (t)}{d^{'} (x)} . \end{matrix}

(12)

The value

RA (t)

is the time-sensitive risk aversion rate and

RA (t) = \frac{F (t)}{\int_{y = 0}^{y = z^{'}} F (t - y) \cdot g (- y) d y},

(13)

where

- t \in (- T, 0]

,

y \in z

,

z^{'} = max y

among all

y < t

.

The corresponding discrete form of

RA (t)

in Equation (13) is:

\begin{matrix} RA (t) = & \frac{e^{- λ t}}{[e^{- λ t}, \dots, e^{- λ (t - z_{n})}] \cdot p} . \end{matrix}

(14)

Comparing the two risk aversion rates in Equation (8) and Equation (14), we find that in the multiple possible deadline case, only the value of RA is different. Note that when

p_{1} = p_{2} = \dots = p_{n} = 0

or time points

- z_{1}, - z_{2} \dots, - z_{n}

are close enough to time point 0, Equation (12) degenerates to Equation (3), i.e., the multiple possible deadline case degenerates to the fixed deadline case where the monotonicity of

x (t)

remains the same. As time

- t

approaches time 0,

x (t)

approaches

a^{N}

.
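The discrete form in Equation (14) can be computed directly. The following is a minimal sketch with a hypothetical function name `risk_aversion`; `offsets` holds the distances z_k of the still-possible deadlines from time 0 and `probs` the current belief over them. The example also illustrates that the factor e^{-λt} cancels, so the value depends on t only through which deadlines remain possible and how the belief has been updated.

```python
import math


def risk_aversion(t, lam, offsets, probs):
    """Eq. (14): e^{-lam t} divided by the inner product of the vector
    [e^{-lam (t - z_k)}] with the belief vector p over remaining deadlines."""
    numerator = math.exp(-lam * t)
    denominator = sum(p * math.exp(-lam * (t - z)) for z, p in zip(offsets, probs))
    return numerator / denominator


lam = 0.2
# Fixed deadline: a single possible deadline at time 0.
print(risk_aversion(5.0, lam, [0.0], [1.0]))
# Two-point case: deadline 0 or deadline -2, each with probability 0.5.
print(risk_aversion(5.0, lam, [0.0, 2.0], [0.5, 0.5]))
# Three possible deadlines (illustrative values).
print(risk_aversion(5.0, lam, [0.0, 2.0, 4.0], [0.5, 0.3, 0.2]))
```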

Now we discuss how to implement Theorem 2 and compute the equilibrium plan x. Proposition 1 tells us that the final action at the deadline should be the Nash action

a^{N}

. Taking this fact as a terminal condition, we can apply Equation (12) recursively to get x for all

- t \in (- T, 0]

. In the first loop of the recursion, we obtain the first part of

x (t)

where

- t \in (- z_{1}, 0]

by using

x (0) = a^{N}

and by substituting

z^{'} = 0

into Equation (13). Then we know the agents’ action at time

- z_{1}

, which is a new seed for generating the second part of

x (t)

for

- t \in (- z_{2}, - z_{1}]

. Repeating this operation for all

- t \in (- T, 0]

, we can generate every part of the equilibrium plan x.

When recursively implementing Equation (12), the value of function g in Equation (13) can vary over time. This is because once players pass through a possible deadline

- z_{k}

, it is certain that this deadline sample

- z_{k}

did not materialize, meaning that the probability distribution

p

(or equivalently, PDF g) over the remaining possible deadlines

[- z_{k - 1}, - z_{k - 2}, \dots, - z_{1}, 0]

is updated. This update follows Bayes’ rule as shown below, where

p^{'}

is the new probability belief over the remaining deadline points.

p^{'} = \frac{[p_{k - 1}, p_{k - 2}, \dots, p_{0}]}{1 - p_{k}} .

(15)

Therefore, at different time point

- t

, different g further results in different risk aversion rate

RA (t)

, which finally affects the gradient of plan x at each time point

- t

.
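The recursive procedure described above can be sketched in code. The block below is an illustration under several assumptions that are not taken from the original text: it uses the Cournot game of Section 6 (a = 4, c = 1), for which d(x) = 2.25 (1 − x)^2 and π(x) − π^N = (2x − 1)(1 − x); it integrates Equation (12) with a classical Runge–Kutta step; it caps the plan at the optimal action, in line with the statement that players fully cooperate far from the deadline; it requires p_0 > 0; and the deadline offsets and probabilities in the example are made up. The belief over remaining deadlines is obtained by renormalizing the original probabilities, which is equivalent to applying Equation (15) each time a possible deadline is passed.

```python
import math

A_NASH, A_COOP = 1.0, 0.75   # Cournot Nash and fully cooperative quantities


def slope(x, ra, lam):
    """Right-hand side of Eq. (12) for the Cournot game with a = 4, c = 1.

    The factor (1 - x) shared by the numerator and d'(x) has been cancelled,
    which removes the 0/0 at x = a^N.
    """
    return -lam * (2.25 * (1.0 - x) + (2.0 * x - 1.0) * ra) / 4.5


def risk_aversion(offsets, probs, lam):
    """Eq. (14) with e^{-lam t} cancelled and the belief renormalized
    over the deadlines that are still possible."""
    total = sum(probs)
    return total / sum(p * math.exp(lam * z) for z, p in zip(offsets, probs))


def equilibrium_plan(T, offsets, probs, lam=0.2, steps_per_unit=200):
    """Integrate Eq. (12) from t = 0, where x = a^N, up to t = T.

    offsets : increasing list [0, z_1, ..., z_n]; probs : matching probabilities.
    Returns a list of (t, x(t)) pairs, t being the time remaining until time 0.
    """
    h = 1.0 / steps_per_unit
    x, path = A_NASH, [(0.0, A_NASH)]
    for i in range(int(round(T * steps_per_unit))):
        t_mid = (i + 0.5) * h
        # deadlines -z_k with z_k < t still lie ahead of the players at time -t
        live = [(z, p) for z, p in zip(offsets, probs) if z < t_mid]
        ra = risk_aversion([z for z, _ in live], [p for _, p in live], lam)
        k1 = slope(x, ra, lam)
        k2 = slope(x + 0.5 * h * k1, ra, lam)
        k3 = slope(x + 0.5 * h * k2, ra, lam)
        k4 = slope(x + h * k3, ra, lam)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x = max(x, A_COOP)   # assumption: the plan stops at full cooperation
        path.append(((i + 1) * h, x))
    return path


fixed = equilibrium_plan(10.0, [0.0], [1.0])
uncertain = equilibrium_plan(10.0, [0.0, 2.0, 4.0], [0.5, 0.3, 0.2])
for t_index in (0, 400, 800, 1200, 2000):
    print(fixed[t_index][0], fixed[t_index][1], uncertain[t_index][1])
```

In this sketch the two plans coincide while only the deadline at time 0 remains possible, and the plan under uncertainty stays closer to the Nash quantity for larger remaining times.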

5.3. Risk Aversion Rate in Equilibrium

Theorem 2 characterizes agents’ behavior in the equilibrium of general revision games with multiple possible deadlines. The equilibrium plan is given in a simple form as a differential equation, where the risk aversion rate quantifies how players handle the deadline uncertainty, and this uncertainty is time-sensitive. The risk aversion rate depends on the CDF induced by the Poisson arrival rate, the PDF of the deadline distribution, as well as the current time

- t

and the nearest possible deadline

- z^{'}

. Basically, it has the following features.

Proposition 2.

The risk aversion rate RA is bounded in

(0, 1]

.

Proof.

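Multiplying the numerator and the denominator of Equation (14) by

e^{λ t}

, the numerator becomes 1 and the denominator becomes

p_{0} + e^{λ z_{1}} p_{1} + \dots + e^{λ z_{n}} p_{n}

, so we only need to check the range of this denominator. Note that

p_{0} + p_{1} + \dots + p_{n} = 1

and

e^{λ z_{n}} > \dots > e^{λ z_{1}} > 1

, so the denominator ranges in

[1, + \infty)

and RA ranges in

(0, 1]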

. □

Proposition 3.

Assume the deadline probability distribution is continuous uniform distribution on

[- γ, 0]

. As

γ \to 0

, it will be that

RA \to 1

, and Equation (12) degenerates to that in the fixed-deadline situation. As

γ \to \infty

, it will be that

RA \to 0

, and Equation (12) becomes

\frac{d x}{d t} = \frac{λ d (x)}{d^{'} (x)}

.

Proof.

We can rewrite RA as

RA (γ) = \frac{1}{(e^{λ d s \cdot 0} + \dots + e^{λ γ}) \cdot \frac{d s}{γ}} = \frac{γ}{\int_{0}^{γ} e^{λ s} d s} = \frac{γ \cdot λ}{e^{λ γ} - 1} .

When

γ \to 0

, RA approaches 1, which means that Equation (12) degenerates to Equation (3). As

γ

approaches infinity, which means that the coverage of the deadline distribution becomes unbounded, RA tends to zero. This means that the term

π (x) - π^{N}

vanishes and the differential equation becomes

\frac{d x}{d t} = \frac{λ d (x)}{d^{'} (x)}

. □
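The two limits in Proposition 3 can be checked numerically. The sketch below uses hypothetical function names; `ra_uniform` implements the closed form from the proof, and `ra_uniform_discrete` is an assumed discretization with n + 1 equally likely deadlines on the interval, used only to illustrate that the discrete form approaches the closed form.

```python
import math


def ra_uniform(gamma, lam):
    """Closed form gamma * lam / (e^{lam * gamma} - 1) from the proof of Proposition 3."""
    if gamma == 0.0:
        return 1.0  # limiting value as gamma -> 0
    return gamma * lam / math.expm1(lam * gamma)


def ra_uniform_discrete(gamma, lam, n):
    """Discrete analogue: n + 1 equally likely deadlines spread over [-gamma, 0]."""
    mean_growth = sum(math.exp(lam * gamma * k / n) for k in range(n + 1)) / (n + 1)
    return 1.0 / mean_growth


lam = 0.2
for gamma in (0.01, 1.0, 5.0, 20.0, 100.0):
    print(gamma, ra_uniform(gamma, lam), ra_uniform_discrete(gamma, lam, 2000))
```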

According to the Poisson process with arrival rate

λ

, on the one hand, the numerator in Equation (13),

F (t) = e^{- λ t}

, is simply “the probability that no revision opportunity arrives from

- t

to deadline 0 for the revision games with a fixed deadline”. On the other hand, in the denominator, each term

e^{- λ (t - z_{k})}

is the probability that no revision opportunity arrives from time

- t

to the k-th possible deadline

- z_{k}

, while each term

p_{k}

in the second vector

p

is the probability of the k-th possible deadline. Therefore, the denominator as a whole denotes “the probability that no revision opportunity arrives from

- t

to the uncertain deadline for the revision games with multiple possible deadlines”.

Theorem 3.

(a)

. The risk aversion rate RA is monotonically non-decreasing in time

- t

.

(b)

. For a given time

- t

and a fixed density

g (\cdot)

of possible deadlines, RA is monotonically decreasing as the deadline range

(- z_{n}, 0]

expands.

Proof.

Assume that players are in the time period

[- z_{k + 1}, - z_{k})

, the time points ahead are

- z_{k}, - z_{k - 1}, \dots, 0

and the corresponding deadline distribution is

p

= [p_{k}, p_{k - 1}, \dots, p_{0}]

. After they pass through time point

- z_{k}

, they will update the belief of deadline probability on

- z_{k - 1}, - z_{k - 2}, \dots, 0

from

p

to

p^{'}

by Equation (15). Let

A = p_{0} + e^{λ z_{1}} p_{1} + \dots + e^{λ z_{k - 1}} p_{k - 1}

, and multiply both the numerator and the denominator of RA by

e^{λ t}

; then the denominator of the new RA changes from

A + e^{λ z_{k}} p_{k}

to

\frac{A}{1 - p_{k}}

. Consider the difference:

A + e^{λ z_{k}} p_{k} - \frac{A}{1 - p_{k}} = (e^{λ z_{k}} (1 - p_{k}) - A) \cdot \frac{p_{k}}{1 - p_{k}} .

If

p_{k} = 0

, which means that the probability that the deadline arrives at time

- z_{k}

is zero, then the difference is zero and neither the denominator nor RA changes. If

p_{k} \neq 0

, we only need to check the sign of

e^{λ z_{k}} (1 - p_{k}) - A

, which equals

e^{λ z_{k}} - (p_{0} + e^{λ z_{1}} p_{1} + \dots + e^{λ z_{k}} p_{k}) > 0 .

The inequality holds because the subtracted sum is a probability-weighted average of exponential terms, none of which exceeds the largest one. This means that the denominator of RA becomes smaller and RA becomes larger as players pass through a possible deadline. Thus, we obtain two conclusions. First, as players progress in time, RA is monotonically non-decreasing. Second, the wider the range the distribution spans, the smaller RA becomes. □
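The step used in the proof, where passing a possible deadline updates the belief by Equation (15) and raises RA, can be illustrated numerically. This is a sketch with hypothetical helper names and made-up deadline offsets and probabilities; it assumes the last entry of the belief vector is strictly below 1.

```python
import math


def risk_aversion(offsets, probs, lam):
    """RA = 1 / (p_0 + e^{lam z_1} p_1 + ... ), i.e. Eq. (14) after multiplying
    the numerator and the denominator by e^{lam t}."""
    return 1.0 / sum(p * math.exp(lam * z) for z, p in zip(offsets, probs))


def drop_passed_deadline(offsets, probs):
    """Eq. (15): remove the furthest deadline (last entry) once it has passed
    without the game ending, and renormalize the remaining belief."""
    p_k = probs[-1]
    return offsets[:-1], [p / (1.0 - p_k) for p in probs[:-1]]


lam = 0.2
offsets, probs = [0.0, 2.0, 4.0], [0.5, 0.3, 0.2]   # illustrative values
history = [risk_aversion(offsets, probs, lam)]
while len(offsets) > 1:
    offsets, probs = drop_passed_deadline(offsets, probs)
    history.append(risk_aversion(offsets, probs, lam))
print(history)
```

The printed sequence is non-decreasing and ends at the fixed-deadline value once only the deadline at time 0 remains.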

The pattern of RA revealed in Theorem 3 is important. By point

(a)

, now we can know and explain the underlying mechanism that, as time goes by, players decay to mutual betrayal (i.e., action profile

(a^{N}, a^{N})

) faster and faster. By point

(b)

, it can be seen that as the deadline uncertainty increases, players’ plans become more conservative, in the sense that they prefer not to change actions much. Thus Equation (13) and Theorem 3 quantify agents’ risk aversion and their behavior under different levels of deadline uncertainty. Figure 2 shows example equilibrium plans which are computed by Theorem 2 and have the features in Theorem 3. We have the following observations from Figure 2. (i) By introducing deadline uncertainty, players deviate from full cooperation earlier and become less cooperative in the remaining time. Thus, setting an uncertain deadline can reduce collusion among players. (ii) As the coverage of possible deadlines increases, RA monotonically decreases. Thus, with a larger deadline coverage, players not only stay in the full cooperation state for a shorter time, but also adjust their actions more slowly.

Figure 2. Plans for Cournot game with various deadline distribution.

In summary, on the one hand, by applying Theorem 2 directly, we can derive the equilibrium plans. The solution of Equation (12) is essentially a generalized plan for players engaging in games with uncertain deadlines, and the existing plan derived in [1] is a special case of our result. On the other hand, by using these results in reverse, a mechanism designer can introduce an uncertain deadline into markets or auctions to reduce agents’ collusion and improve the system.

6. Experiments

We apply our theoretical results to the conventional Cournot duopoly game and the public goods game.

6.1. Cournot Duopoly Game

There are two firms $i = 1, 2$, whose production is denoted by $x_{i}$. The payoff function for firm $i$ is given by $\pi_{i} = (a - (x_{1} + x_{2}) - c)\, x_{i}$. The two firms prepare an initial action at time $-T$ and revise their production amounts according to a Poisson process until the unknown deadline. We set $a = 4$ and $c = 1$. In the one-shot Cournot game, the full cooperation action is $0.75$ and the Nash action is $1$; the corresponding payoffs are $1.125$ and $1$, respectively. The revision Cournot game starts at time $-T = -10$.
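The one-shot benchmark values quoted above can be checked directly from the payoff function. The following Python sketch is illustrative only: the helper names (`payoff`, `nash_quantity`, `cooperative_quantity`) are hypothetical, and the closed forms $(a - c)/3$ and $(a - c)/4$ are the standard symmetric Cournot and joint-profit solutions for this linear payoff, not formulas quoted from the paper.

```python
from fractions import Fraction

A = Fraction(4)  # demand intercept a
C = Fraction(1)  # marginal cost c


def payoff(x_own, x_other, a=A, c=C):
    """Firm payoff pi_i = (a - (x_1 + x_2) - c) * x_i."""
    return (a - (x_own + x_other) - c) * x_own


def nash_quantity(a=A, c=C):
    """Symmetric fixed point of the best response x_i = (a - c - x_j) / 2."""
    return (a - c) / 3


def cooperative_quantity(a=A, c=C):
    """Symmetric quantity maximizing the joint profit 2 * (a - c - 2x) * x."""
    return (a - c) / 4


x_n = nash_quantity()
x_c = cooperative_quantity()
print(float(x_c), float(payoff(x_c, x_c)))  # 0.75 1.125
print(float(x_n), float(payoff(x_n, x_n)))  # 1.0 1.0
```

Exact rational arithmetic is used so that the comparison with the quoted values does not depend on floating-point rounding.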

We introduce a normal deadline distribution into our Cournot duopoly game. We vary $\mu$ and $\sigma$ of the normal deadline distribution and the arrival rate $\lambda$ of the Poisson process to check how these parameters affect the players’ behavior and payoff. We investigate the effects from four perspectives: the first is the full cooperation duration, depicted as the red line in Figure 2; the second is the players’ average return, which represents the producers’ payoff; the third is the Price of Anarchy (POA) [16], which measures the efficiency of the system; and the fourth is the consumers’ surplus [17], which represents how much consumers benefit from the producers’ game. The red, blue and green colors in Figure 3 represent deadline distribution coverages of $[-5, 0]$, $[-7, 0]$ and $[-10, 0]$, respectively, and the corresponding $\mu$ is $2.5$, $3.5$ and $5$.
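To make the notion of a normal deadline distribution with a given coverage concrete, the sketch below evaluates the probability that the deadline has not yet arrived at a given time. It rests on assumptions that this section does not spell out: the normal law is truncated to the coverage interval, and the reported $\mu$ is read as a distance before time 0, so that the distribution is centered at the midpoint of the coverage (for example, at $-2.5$ for $[-5, 0]$). The function names are hypothetical.

```python
import math


def normal_cdf(t, mu, sigma):
    """CDF of N(mu, sigma^2) evaluated at t."""
    return 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0))))


def deadline_cdf(t, lo, hi, mu, sigma):
    """P(deadline <= t) for a normal law truncated to [lo, hi] (assumed form)."""
    if t <= lo:
        return 0.0
    if t >= hi:
        return 1.0
    mass = normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma)
    return (normal_cdf(t, mu, sigma) - normal_cdf(lo, mu, sigma)) / mass


# Hypothetical settings: coverage [lo, 0] centered at its midpoint, sigma = 1.
for lo in (-5.0, -7.0, -10.0):
    mu = lo / 2.0  # assumption: center of the coverage interval
    survival = 1.0 - deadline_cdf(-3.0, lo, 0.0, mu, 1.0)
    print(f"coverage [{lo}, 0]: P(deadline after t = -3) = {survival:.3f}")
```

Only the standard library is used; a different restriction of the normal law to the coverage (for example, discretization into finitely many deadline points) would change the numbers but not the structure of the calculation.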

Figure 3. Performance of the equilibrium plan in the revision Cournot game under various deadline uncertainty settings. Each column varies one of $\mu$ and $\sigma$ of the normal deadline distribution and the Poisson rate $\lambda$; each row shows the effect on one measure of the players’ behavior and payoff.

(i) In the first column of Figure 3, we set $\sigma = 1$ and $\lambda = 0.2$ and vary $\mu$ of the normal distribution. We find that as $\mu$ approaches the starting time $-T$, the cooperation time and the expected return decrease. Our interpretation is that moving $\mu$ towards $-T$ increases the chance that the deadline arrives soon, so producers deviate from full cooperation earlier and obtain a lower payoff, and the POA of the system becomes lower. As $\mu$ approaches the starting time $-T$, consumers obtain more surplus because producers become less cooperative.

(ii) In the second column, we set $\mu = 2.5, 3.5, 5$ for the three coverages, respectively, fix $\lambda = 0.2$, and examine the impact of $\sigma$. When we increase $\sigma$ of the normal distribution, the cooperation time increases but the average payoff goes down. Given the plan, the players’ payoff depends only on the time of the last revision opportunity before the deadline. With a larger $\sigma$, the probability that the deadline arrives far away from $\mu$ increases, and so does the probability that the last revision opportunity arrives near time 0. When this happens, the producers’ last action is close to the Nash action, so the payoff decreases and the POA becomes higher. As the producers’ profit decreases, the consumers’ surplus increases.

(iii) In the last column, we set $\mu = 2.5, 3.5, 5$, respectively, and $\sigma = 3$, and increase the Poisson rate $\lambda$ from $0.2$ to $0.5$. Doing so decreases the expected deviation gain in Equation (9) and increases the expected future punishment in Equation (10), which lengthens the cooperation time, raises the return and reduces the POA. It also decreases the consumers’ surplus because of the strengthened cooperation between producers.
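The remark in (ii) that, given the plan, the payoff depends only on the last revision time suggests a simple Monte Carlo check. The sketch below is a rough illustration under several assumptions not stated in this section: revision opportunities are common to both firms, the normal deadline is restricted to its coverage by rejection sampling, $\mu$ is placed at the negative midpoint of the coverage, and `toy_plan` is a hypothetical placeholder, not the equilibrium plan of Theorem 2.

```python
import random


def sample_deadline(rng, lo, hi, mu, sigma):
    """Rejection-sample a normal deadline restricted to [lo, hi] (assumed form)."""
    while True:
        d = rng.gauss(mu, sigma)
        if lo <= d <= hi:
            return d


def last_decision_time(rng, start, deadline, lam):
    """Time at which the implemented action was chosen.

    Looking backward from the deadline, the gap to the most recent Poisson
    arrival is exponential with rate lam; if it reaches past `start`, no
    revision occurred and the initial action prepared at `start` stands.
    """
    return max(start, deadline - rng.expovariate(lam))


def toy_plan(t, t_dev=-4.0):
    """Hypothetical plan: cooperate (0.75) until t_dev, then move linearly to Nash (1)."""
    if t <= t_dev:
        return 0.75
    return 0.75 + 0.25 * (t - t_dev) / (0.0 - t_dev)


def symmetric_payoff(x, a=4.0, c=1.0):
    """Cournot payoff of each firm when both firms produce x."""
    return (a - 2.0 * x - c) * x


def mean_payoff(lo, mu, sigma, lam, start=-10.0, n=20000, seed=0):
    """Monte Carlo average payoff when both firms follow toy_plan."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        deadline = sample_deadline(rng, lo, 0.0, mu, sigma)
        t_last = last_decision_time(rng, start, deadline, lam)
        total += symmetric_payoff(toy_plan(t_last))
    return total / n


for sigma in (1.0, 3.0):
    print(sigma, round(mean_payoff(lo=-5.0, mu=-2.5, sigma=sigma, lam=0.2), 4))
```

Because the plan here is a placeholder, the printed averages only show how the calculation is organized; reproducing Figure 3 would require the plan obtained from the differential equations of Section 5.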

6.2. Public Goods Game

There are $N$ players in our public goods game, each of whom has an endowment of $M = 5$. We set the multiplication factor $\delta = 2$, which means that the output of the public pool is twice the total amount donated to it. The total output is then divided equally among all players, regardless of whether they donated. The Nash equilibrium of this game is that everyone donates nothing, while the group’s total payoff is maximized when everyone donates everything (full cooperation). Players prepare an initial donation amount at time $-T$ and revise the amount according to the Poisson process until the uncertain deadline arrives. The revision public goods game starts at time $-T = -10$.
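The one-shot public goods game described above can be written down in a few lines. The sketch below assumes the standard linear form in which each player keeps the money that is not donated and receives an equal share of the multiplied pool; the function name `payoffs` and the choice of six players are hypothetical.

```python
M = 5.0      # endowment of each player
DELTA = 2.0  # multiplication factor of the public pool


def payoffs(donations, m=M, delta=DELTA):
    """Payoff of every player: kept money plus an equal share of the pool output."""
    n = len(donations)
    share = delta * sum(donations) / n
    return [m - d + share for d in donations]


n = 6
full = payoffs([M] * n)                      # everyone donates everything
none = payoffs([0.0] * n)                    # everyone donates nothing
lone_donor = payoffs([M] + [0.0] * (n - 1))  # player 0 donates alone

print(full[0], none[0])          # 10.0 5.0
print(lone_donor[0] < none[0])   # True
```

Full cooperation doubles every payoff relative to the Nash outcome, yet a lone donor ends up worse off than by donating nothing, which is the tension that the revision process has to resolve.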

We introduce a normal deadline distribution into our public goods game. We vary the number of players $N$, $\mu$ and $\sigma$ of the normal deadline distribution, and the arrival rate $\lambda$ of the Poisson process to check how these parameters affect the players’ behavior and payoff. As in the Cournot duopoly game, we use the players’ full cooperation time, the players’ average return and the Price of Anarchy of the game as our indicators. The red, blue and green colors in Figure 4 represent deadline distribution coverages of $[-5, 0]$, $[-7, 0]$ and $[-10, 0]$, respectively, and the corresponding $\mu$ is $2.5$, $3.5$ and $5$.

Figure 4. Performance of the equilibrium plan in the revision public goods game under various player numbers and deadline uncertainty settings. Each column varies one of the number of players $N$, $\mu$ and $\sigma$ of the normal deadline distribution, and the Poisson rate $\lambda$; each row shows the effect on one measure of the players’ behavior and payoff.

(i) In the first column of Figure 4, we set $\mu = 5$, $\sigma = 3$ and $\lambda = 0.5$, and vary the number of players from 3 to 12. As the number of players increases, players become less cooperative and obtain lower payoffs. Meanwhile, the POA increases with the number of players.

(ii) In the second column, we set $\sigma = 3$, $\lambda = 0.5$ and the number of players $N = 6$. As $\mu$ ranges from time $0$ to time $-10$, the full cooperation time and the players’ average return decrease and the POA increases.

(iii) In the third column, we set $\mu = 2.5, 3.5, 5$, $\lambda = 0.5$ and $N = 6$, and vary $\sigma$ from 1 to 5. We find that as $\sigma$ increases, the full cooperation time increases, as does the average return for the $[-10, 0]$ coverage, while the average return for the $[-7, 0]$ and $[-5, 0]$ coverages remains essentially unchanged.

(iv) In the last column, we set $\mu = 5$, $\sigma = 3$ and $N = 6$, and vary the Poisson rate $\lambda$ from $0.5$ to $0.8$. As $\lambda$ increases, the cooperation time and the average return both increase and the POA decreases correspondingly.
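One heuristic way to read observation (i) is through the one-shot incentive to free ride. Assuming the standard linear payoff $u_i = M - d_i + \frac{\delta}{N}\sum_j d_j$ (our reading of the game description in Section 6.2, where $d_i$ is a hypothetical symbol for the donation of player $i$), the gain from donating nothing while all other players donate $M$ is

```latex
\begin{aligned}
u_i^{\text{deviate}} - u_i^{\text{cooperate}}
  &= \Bigl(M + \tfrac{\delta}{N}(N-1)M\Bigr) - \delta M \\
  &= M\Bigl(1 - \tfrac{\delta}{N}\Bigr).
\end{aligned}
```

With $M = 5$ and $\delta = 2$, this gain is $5/3 \approx 1.67$ for $N = 3$ and $25/6 \approx 4.17$ for $N = 12$, so the temptation to deviate grows with the number of players. This one-shot comparison is only suggestive; the equilibrium analysis itself runs through the expected deviation gain and expected punishment of Equations (9) and (10).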

7. Conclusions and Future Works

This work is the first to identify the subgame perfect equilibrium in revision games with an uncertain deadline. Our derived equilibrium plan is characterized by a set of differential equations, each of which contains a risk averse rate. We establish two properties of the risk averse rate ra: as time progresses, ra is monotonically non-decreasing; as the deadline distribution expands, ra is monotonically decreasing. With deadline uncertainty, players become less cooperative, so collusion can be controlled.

We must point out that, although many scenarios such as pre-opening markets and online auctions can be modeled as revision games, the equilibrium strategy proposed in [1] or in this paper captures only one possible outcome in reality. Thus, more work remains to make our strategy more robust and practical. We plan to consider different payoff functions [18] and characterize players’ behavior under an uncertain deadline. Moreover, how to maintain cooperation in a revision game when players cannot fully observe others’ actions [19] is a realistic problem.

Author Contributions

Original draft, Z.W.; Review and editing, D.H. Both authors contributed to the writing and editing of the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Acknowledgments

We would like to thank the referees for suggestions and comments that improved this paper and the academic editors for their work.

Conflicts of Interest

The authors declare no conflict of interest.

References

[1] Kamada, Y.; Kandori, M. Revision games. Econometrica 2020, 88, 1599–1630.
[2] Moroni, S. Existence of Trembling Hand Perfect and Sequential Equilibrium in Games with Stochastic Timing of Moves; Working Paper Series 19/005; Department of Economics, University of Pittsburgh: Pittsburgh, PA, USA, 2018.
[3] Calcagno, R.; Kamada, Y.; Lovo, S.; Sugaya, T. Asynchronicity and coordination in common and opposing interest games. Theor. Econ. 2014, 9, 409–434.
[4] Lovo, S.; Tomala, T. Markov Perfect Equilibria in Stochastic Revision Games; HEC Paris Research Paper No. ECO/SCD-2015-1093; HEC Paris: Paris, France, 2015.
[5] Gensbittel, F.; Lovo, S.; Renault, J.; Tomala, T. Zero-sum revision games. Games Econ. Behav. 2018, 108, 504–522.
[6] Kamada, Y.; Sugaya, T. Valence Candidates and Ambiguous Platforms in Policy Announcement Games; University of California Berkeley: Berkeley, CA, USA, 2014.
[7] Goldstein, I.; Guembel, A. Manipulation and the allocational role of prices. Rev. Econ. Stud. 2008, 75, 133–164.
[8] Ni, S.X.; Pearson, N.D.; Poteshman, A.M. Stock price clustering on option expiration dates. J. Financ. Econ. 2005, 78, 49–87.
[9] Hauser, S.; Kamara, A.; Shurki, I. The effects of randomizing the opening time on the performance of a stock market under stress. J. Financ. Mark. 2012, 15, 392–415.
[10] Sailer, K. Searching the eBay Marketplace, CESifo Working Paper No. 1848. 2006. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=949418 (accessed on 8 September 2022).
[11] Roth, A.E.; Ockenfels, A. Last-minute bidding and the rules for ending second-price auctions: Evidence from eBay and Amazon auctions on the Internet. Am. Econ. Rev. 2002, 92, 1093–1103.
[12] Ockenfels, A.; Roth, A.E. Late and multiple bidding in second price Internet auctions: Theory and evidence concerning different rules for ending an auction. Games Econ. Behav. 2006, 55, 297–320.
[13] Füllbrunn, S.; Sadrieh, A. Sudden termination auctions: An experimental study. J. Econ. Manag. Strategy 2012, 21, 519–540.
[14] Megiddo, N. Smooth End of Auction on the Internet. U.S. Patent 6,665,649, 16 December 2003.
[15] Häfner, S.; Stewart, A. Blockchains, Front-Running, and Candle Auctions. 2021. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3846363 (accessed on 10 September 2022).
[16] Guo, X.; Yang, H. The price of anarchy of Cournot oligopoly. In Proceedings of the International Workshop on Internet and Network Economics, Hong Kong, China, 15–17 December 2005; pp. 246–257.
[17] Anderson, S.P.; Renault, R. Efficiency and surplus bounds in Cournot competition. J. Econ. Theory 2003, 113, 253–264.
[18] Selten, R. Re-examination of the perfectness concept for equilibrium in extensive games. Int. J. Game Theory 1975, 4, 22–25.
[19] Kreps, D.M.; Wilson, R. Reputation and imperfect information. J. Econ. Theory 1982, 27, 253–279.

Figure 1. Two possible deadlines: one at $0$ with probability $p$, and the other at $-a$ with probability $1 - p$. The player makes a plan that takes both possible deadlines into account.


Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
