Article

Shapley Polygons in 4 x 4 Games

Department of Mathematics, University Vienna, c/o Josef Hofbauer, Nordbergstraße 15, A-1090 Wien, Austria
Games 2010, 1(3), 189-220; https://doi.org/10.3390/g1030189
Submission received: 9 June 2010 / Accepted: 14 July 2010 / Published: 15 July 2010

Abstract

We study 4 × 4 games for which the best response dynamics contain a cycle. We give examples in which multiple Shapley polygons occur for these kinds of games. We derive conditions under which Shapley polygons exist and conditions for the stability of these polygons. It turns out that there is a very strong connection between the stability of heteroclinic cycles for the replicator equation and Shapley polygons for the best response dynamics. It is also shown that chaotic behaviour cannot occur in this kind of game.

1. Introduction

Identifying cycles in games is easy, but analysing the qualitative behaviour of these cycling structures is not. It is not clear a priori whether such systems converge to an interior fixed point, converge to a periodic attractor, or behave chaotically. For example, even the simple structure of the RSP (rock-scissors-paper) game can lead either to convergence to a Nash equilibrium or to convergence to a periodic orbit under the best response dynamics. In [1] the RSP game (besides some other low dimensional games) was analysed for the best response dynamics and the results were compared to results for the replicator equation. A strong connection was found between the limit set of time averages of the orbits of the replicator equation and the ω-limits of the best response dynamics. A later paper [2] shows that this connection is not specific to these games but holds in a more general sense: the limit set of the time averages (of orbits starting in the interior) for the replicator equation is a subset of the maximal invariant set for the best response dynamics. Cycles in low dimensional games for the replicator equation are thoroughly analysed in, for example, [3,4]. In these papers conditions are given under which so-called heteroclinic cycles are attracting or repelling. Permanence is an important property in evolutionary game theory, and the existence of an attracting heteroclinic cycle always excludes the possibility that a system is permanent. Biologically speaking, permanence means that all species are safe from extinction. We will show that, for so-called monocyclic payoff matrices, permanence for the replicator equation is equivalent to the non-existence of Shapley polygons for the best response dynamics.
We will also analyse games with embedded RSP cycles for the best response dynamics and give a classification that is as complete as possible. The results for the best response dynamics are compared to those for the replicator dynamics in the light of [2], where the strong connection between the time averages for the replicator equation and the invariant sets for the best response dynamics is shown. For the analysis of the best response dynamics we construct a two dimensional return map. Surprisingly, this return map is very similar to the return map used for the analysis of the replicator equation in [3]; more precisely, the transition matrices are identical. These transition matrices play an important role in the analysis, and because they are identical we obtain the equivalence of the existence of Shapley polygons for the best response dynamics and the existence of a relatively asymptotically stable heteroclinic cycle¹. This paper also gives counterexamples to some conjectures that might be drawn from the analysis of the RSP game for the best response dynamics. It is shown that the set $V_0$, which is invariant for the RSP game, does not have to be invariant in 4 × 4 games. It is also shown that $V(\mathbf{x}(t))$ is not always a good Lyapunov function. As a 'side product' we get that no chaotic behaviour can occur for this class of games.
The paper is structured as follows. It starts with some general assumptions on the payoff matrix. The main results of this paper and their discussion with respect to earlier results for the replicator equation can be found in Section 3. In Section 4 examples are provided for the different asymptotic behaviour of these kinds of games. The remaining sections contain the construction and analysis of the return map.

2. Preliminaries and Assumptions on the Payoff Matrix A

In this work we deal with finite, symmetric two-person normal form games. The players' payoffs are summarised in the $(n \times n)$ matrix $A$, where $a_{ij}$ is the payoff of strategy $i$ against strategy $j$:
$$A = (a_{ij}), \quad i, j = 1, \dots, n$$
The expression $A|_I$ always denotes the payoff matrix of the game restricted to the strategies contained in the index set $I$. (If there is no risk of confusion the index set $I$ will not be written down explicitly.)
We use the following notation in this work:
$$\Delta_n := \Big\{ \mathbf{x} \in \mathbb{R}^n : x_i \ge 0 \ \text{and} \ \sum_{i=1}^n x_i = 1 \Big\}$$
which represents the $(n-1)$-dimensional simplex. Vectors are written in bold letters, e.g. $\mathbf{x}$. The $i$-th vertex of the simplex is denoted by $\mathbf{e}_i$. The standard scalar product of two vectors $\mathbf{u}$ and $\mathbf{v}$ is written in the following way:
$$\mathbf{u} \cdot \mathbf{v} := \sum_{i=1}^n u_i v_i$$
Lastly we define the set $\mathbb{R}_0^n$ by
$$\mathbb{R}_0^n := \Big\{ \mathbf{x} \in \mathbb{R}^n : \sum_{i=1}^n x_i = 0 \Big\}$$
If there is no risk of confusion, we will loosely refer to the vertex $\mathbf{e}_i$ as $i$. In this paper we will analyse games for the best response (or best reply, used equivalently) dynamics (see [5]), which is of the following form:
$$\dot{\mathbf{x}} \in BR(\mathbf{x}) - \mathbf{x} \tag{1}$$
where $BR(\mathbf{x})$ stands for the set of best responses to a given state $\mathbf{x}$. As mentioned, we will compare the qualitative behaviour to that of the replicator equation [6]:
$$\dot{x}_i = x_i \big( (A\mathbf{x})_i - \mathbf{x} \cdot A\mathbf{x} \big) \tag{2}$$
We define the set $B_i$ as the set of all $\mathbf{x} \in \Delta_n$ for which $i$ is the unique best reply against $\mathbf{x}$. More generally, for $K \subseteq \{1, \dots, n\}$, $K \neq \emptyset$, we define $B_K$ as the set of all $\mathbf{x} \in \Delta_n$ for which all pure strategies in $K$ are best responses against $\mathbf{x}$ and there are no other pure best responses:
$$B_K := \{ \mathbf{x} \in \Delta_n : k \in BR(\mathbf{x}) \ \forall k \in K \ \text{and} \ j \notin BR(\mathbf{x}) \ \text{for} \ j \notin K \} \tag{3}$$
Definition 1
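In this paper we call a game generic, if
$$\text{(i)} \quad B_K = \emptyset \ \text{or} \ \dim B_K = n - \operatorname{card}(K) \ \text{for all nonempty} \ K \subseteq \{1, \dots, n\}, \ \text{and} \tag{4}$$
$$\text{(ii)} \quad \bigcup_{i=1}^n \overline{B_i} = \Delta_n \tag{5}$$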
hold.
Following [6] and [7] we state the following lemma
Lemma 1
Let
$$V : \Delta_n \to \mathbb{R}, \quad V(\mathbf{x}) := \max_i (A\mathbf{x})_i \tag{6}$$
and let the payoff matrix $A = (a_{ij})$ be normalised such that $a_{ii} = 0$ holds for all $i = 1, \dots, n$. Then $V$ satisfies $\dot{V}(\mathbf{x}) = -V(\mathbf{x})$ for all $\mathbf{x} \in B_j$, and hence $|V(\mathbf{x}(t))|$ is strictly decreasing along the piecewise linear solutions to Equation (1) inside each $B_j$.
Moreover, if $\mathbf{x}(t)$ is in $\bigcup_j B_j$ for almost all $t > 0$, we get $V(\mathbf{x}(t)) \to 0$ as $t \to \infty$. In other words, the orbits converge to the set $V_0$, which is defined as
$$V_0 := \{ \mathbf{x} \in \Delta_n : V(\mathbf{x}) = 0 \} \tag{7}$$
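The decay of $V$ stated in Lemma 1 can be checked in one line. The following derivation is only a sketch; it uses nothing beyond the normalisation $a_{jj} = 0$ and the fact that inside $B_j$ the best response dynamics reduces to $\dot{\mathbf{x}} = \mathbf{e}_j - \mathbf{x}$.

```latex
\begin{align*}
V(\mathbf{x}) &= (A\mathbf{x})_j && \text{for } \mathbf{x} \in B_j, \\
\dot{V}(\mathbf{x}) &= (A\dot{\mathbf{x}})_j
   = \big(A(\mathbf{e}_j - \mathbf{x})\big)_j
   = a_{jj} - (A\mathbf{x})_j = -V(\mathbf{x}), \\
V(\mathbf{x}(t)) &= e^{-t}\, V(\mathbf{x}(0)) && \text{as long as } \mathbf{x}(t) \in B_j .
\end{align*}
```

Since $V$ is continuous across the boundaries between the sets $B_j$, the exponential decay carries over to any solution that spends almost all of its time in $\bigcup_j B_j$.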
Definition 2
A periodic orbit Γ under the best response dynamic is called a Shapley polygon.
Let us now introduce better and best reply cycles.
Definition 3
If $a_{ji} > a_{ki}$ for all $k \neq j$ and $a_{jj} > a_{ij}$ hold, we call this a best response arc² from $\mathbf{e}_i$ to $\mathbf{e}_j$ and write symbolically $\mathbf{e}_i \to \mathbf{e}_j$.
This simply means that j is the unique best reply to i and j is a better reply against itself than i. This gives rise to a directed graph G for all best response arcs.
Definition 4
We call a cycle in this directed graph G a best response cycle.
For example, if the graph is $\mathbf{e}_1 \to \mathbf{e}_2 \to \mathbf{e}_3 \to \mathbf{e}_4 \to \mathbf{e}_2$, then the best response cycle connects $\mathbf{e}_2$, $\mathbf{e}_3$ and $\mathbf{e}_4$. Clearly, a game can have more than one best response cycle, but these cycles are disjoint. Note that neither a best response cycle nor a best response arc has to exist.
In analogy we define a better reply arc from $\mathbf{e}_i$ to $\mathbf{e}_j$ if $a_{ji} > a_{ii}$ and $a_{jj} > a_{ij}$ hold, and write symbolically $\mathbf{e}_i \to \mathbf{e}_j$. Again we get a directed graph $G$, and cycles in this graph $G$ are called better reply cycles. Note that better reply cycles do not have to be disjoint and that every best reply cycle is also a better reply cycle. If there is no risk of confusion we will write only the best response (or reply) cycle 1234 instead of $\mathbf{e}_1 \to \mathbf{e}_2 \to \mathbf{e}_3 \to \mathbf{e}_4 \to \mathbf{e}_1$. We also write the better reply cycle 1234 instead of $\mathbf{e}_1 \to \mathbf{e}_2 \to \mathbf{e}_3 \to \mathbf{e}_4 \to \mathbf{e}_1$.
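The two arc definitions translate directly into a small check on the payoff matrix. The sketch below is illustrative only: the function names and the sample matrix (all $e_i = c_i = 1$, $t_3 = 1/2$, the other $t_i = -2$) are assumptions made for this example and do not come from the paper.

```python
from fractions import Fraction as F

def best_response_arcs(A):
    """Arcs i -> j (1-based) where j is the unique best reply to i and a_jj > a_ij."""
    n = len(A)
    arcs = []
    for i in range(n):
        for j in range(n):
            if j == i:
                continue
            # a_ji must strictly exceed every other entry of column i
            unique_best = all(A[j][i] > A[k][i] for k in range(n) if k != j)
            if unique_best and A[j][j] > A[i][j]:
                arcs.append((i + 1, j + 1))
    return arcs

def better_reply_arcs(A):
    """Arcs i -> j (1-based) where a_ji > a_ii and a_jj > a_ij."""
    n = len(A)
    return [(i + 1, j + 1)
            for i in range(n) for j in range(n)
            if j != i and A[j][i] > A[i][i] and A[j][j] > A[i][j]]

# Hypothetical sample matrix with a zero diagonal (not taken from the paper).
A = [
    [F(0), F(-1), F(1, 2), F(1)],
    [F(1), F(0), F(-1), F(-2)],
    [F(-2), F(1), F(0), F(-1)],
    [F(-1), F(-2), F(1), F(0)],
]

best = best_response_arcs(A)
better = better_reply_arcs(A)
print(best)
print(better)
```

For this sample matrix the best response arcs form the single cycle 1234, while the better reply graph additionally contains the arc from 3 to 1, so the better reply cycles 123 and 1234 share strategies, in line with the remark that better reply cycles need not be disjoint.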
In this paper we are especially interested in games with a full better reply cycle, which means that a cycle exists that uses all pure strategies. This also means that no pure strategy can be a Nash equilibrium. Despite this cycling behaviour, a periodic orbit need not exist, for example because Nash equilibria can attract all orbits (see for example Section 7). On the one hand it will turn out that the number of Nash equilibria is very hard to predict (see Section 4); on the other hand it will turn out that for certain classes of payoff matrices only solutions for a set of starting values of measure 0 converge to a Nash equilibrium.
Throughout this paper we consider the following payoff matrix A
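$$A = (a_{ij}) = \begin{pmatrix} 0 & -c_2 & t_3 & e_4 \\ e_1 & 0 & -c_3 & t_4 \\ t_1 & e_2 & 0 & -c_4 \\ -c_1 & t_2 & e_3 & 0 \end{pmatrix} \tag{8}$$
and assume the following for all $i$ for the payoff matrix $A$, unless stated otherwise:
$$\text{(i)} \quad e_i > 0, \tag{9}$$
$$\text{(ii)} \quad c_i > 0, \tag{10}$$
$$\text{(iii)} \quad \text{if } t_i > 0 \text{ then } t_{i+2} < 0 \text{ (indices taken modulo 4), and} \tag{11}$$
$$\text{(iv)} \quad \text{the game is generic as in Definition 1.} \tag{12}$$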
This notation is from [3], in which the stability of so-called heteroclinic cycles for the replicator dynamics was analysed. A heteroclinic cycle corresponds to a better reply cycle. The $e_i$ stand for the expanding eigenvalues in the direction of the heteroclinic cycle, the $c_i$ correspond to the contracting eigenvalues along the heteroclinic cycle, and the $t_i$ are the eigenvalues transverse to the heteroclinic cycle. Note that we can only speak of eigenvalues with respect to the replicator equation³.
Remark 
The assumptions (9) and (10) guarantee the existence of a heteroclinic cycle for the replicator equation. A heteroclinic cycle is a union of orbits $\bigcup_{j=1}^n \mathbf{x}_j(t)$ (together with their α- and ω-limits, which are fixed points of the system), for which the ω-limit of $\mathbf{x}_i(t)$ is the α-limit of $\mathbf{x}_{i+1}(t)$ (with $i + 1$ taken modulo $n$). The stability of such heteroclinic cycles has been studied thoroughly in [3,4].
Remark 
(9) and (10) assure that a better reply cycle $\mathbf{e}_1 \to \mathbf{e}_2 \to \mathbf{e}_3 \to \mathbf{e}_4 \to \mathbf{e}_1$ exists for $A$, whereas (11) assures that no restricted two-strategy game has a stable interior Nash equilibrium. This is easy to see: if both $t_i$ and $t_{i+2}$ are greater than zero, we get the following sign structure in the corresponding matrix of the restricted game
$$P_{i,i+2} = \begin{pmatrix} 0 & + \\ + & 0 \end{pmatrix}$$
which means that the interior Nash equilibrium of this restricted game is attracting. This is the only way to get a stable interior equilibrium. If the two signs are opposite, there is no interior equilibrium, and if there are two minus signs, the interior equilibrium is repelling.
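The sign argument for the restricted two-strategy game can be made explicit with a short computation. In this sketch the symbols $a = t_{i+2}$, $b = t_i$ and $y$ (the weight on strategy $i$) are shorthand introduced only for the illustration.

```latex
\begin{align*}
P_{i,i+2} &= \begin{pmatrix} 0 & a \\ b & 0 \end{pmatrix},
   \qquad \mathbf{x} = (y,\ 1-y), \\
(P\mathbf{x})_1 &= a\,(1-y), \qquad (P\mathbf{x})_2 = b\,y, \\
(P\mathbf{x})_1 = (P\mathbf{x})_2 &\iff y^* = \frac{a}{a+b},
   \qquad y^* \in (0,1) \iff a \text{ and } b \text{ have the same sign}.
\end{align*}
```

For $a, b > 0$ and $y > y^*$ the second strategy earns more, so the best response dynamics lowers $y$ back towards $y^*$ and the equilibrium is attracting. For $a, b < 0$ the inequality is reversed, each strategy is the best reply on its own side of $y^*$, and the equilibrium is repelling.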
The assumption in (11) may seem somewhat arbitrary, but we want to exclude a kind of forced movement⁴, as Example 1 illustrates.
Example 1
We take the following payoff matrix A
$$A = \begin{pmatrix} 0 & -\frac{9}{64} & \frac{15}{64} & \frac{47}{60} \\ \frac{9}{32} & 0 & -\frac{3}{16} & \frac{5}{16} \\ \frac{3}{32} & \frac{3}{8} & 0 & -\frac{1}{4} \\ -\frac{5}{32} & \frac{1}{8} & \frac{1}{2} & 0 \end{pmatrix}$$
Note that our assumptions (9)-(10) hold for this matrix, but (11) is violated, because all the $t_i$ are positive. This example also shows that $V(\mathbf{x}(t))$ is not always a good Lyapunov function. This matrix has a unique Nash equilibrium
$$N_{1234} = \left( \tfrac{860}{1899}, \tfrac{361}{1899}, \tfrac{181}{633}, \tfrac{15}{211} \right)$$
But the minimum of $V(\mathbf{x}) = \max_i (A\mathbf{x})_i$ is not attained at the equilibrium $N_{1234}$, but at the point $\mathbf{m} = \left( \tfrac{5}{9}, \tfrac{1}{9}, \tfrac{1}{3}, 0 \right)$. Its value there is
$$V(\mathbf{m}) = \tfrac{3}{32} = 0.09375$$
whereas its value at the Nash equilibrium is
$$V(N_{1234}) = \tfrac{81}{844} \approx 0.09597$$
Along the line segment $s$, which connects $\mathbf{m}$ and $N_{1234}$, the pure strategies 2, 3 and 4 are best responses.
Figure 1. Sufficiently close to the line segment $s$, the dynamics is similar to the dynamics for the restricted game $A|_{234}$. Every orbit hits the set $B_{24}$, shown by the arrow in the figure. Inside this set there is a forced movement towards the unique Nash equilibrium of the restricted game.
Therefore, if we start in this segment, the solution $\mathbf{x}(t)$ heads towards the unique Nash equilibrium $N_{234} = \left( 0, \tfrac{7}{13}, \tfrac{1}{13}, \tfrac{5}{13} \right)$ of the restricted game $A|_{234}$, and $N_{1234}$ is reached in finite time. (Along such an orbit $V(\mathbf{x}(t))$ does not decrease!) Moreover, there is a two dimensional manifold in which 2 and 4 are the best responses; hence every orbit inside this set tends towards the Nash equilibrium $N_{24} = \left( 0, \tfrac{5}{7}, 0, \tfrac{2}{7} \right)$ of the restricted game. After finite time $s$ is reached, and then $N_{1234}$ is again reached in finite time. The movement inside $B_{24}$ is forced in the sense that nearby orbits want to cross $B_{24}$ from both sides. Thus the resulting movement is towards $N_{24}$. We can describe the dynamics of the game as follows. Some orbits that head for pure strategies cycle towards $N_{1234}$ and reach it in finite time. All other orbits (especially those starting at $\mathbf{x}$ with $V(\mathbf{x}) < V(N_{1234})$) converge via $s$ to $N_{1234}$ in finite time.
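The numbers quoted in Example 1 can be reproduced with exact rational arithmetic. The sketch below is not part of the original paper. It assumes that the entries of $A$ carry the signs prescribed by the general form of the payoff matrix (negative $c_i$ entries, positive $e_i$ and $t_i$ entries), and the helper names `payoffs` and `V` are chosen only for this illustration.

```python
from fractions import Fraction as F

# Payoff matrix of Example 1; the signs are an assumption reconstructed from
# the general form of the payoff matrix (entries -c_i, e_i, t_i).
A = [
    [F(0), F(-9, 64), F(15, 64), F(47, 60)],
    [F(9, 32), F(0), F(-3, 16), F(5, 16)],
    [F(3, 32), F(3, 8), F(0), F(-1, 4)],
    [F(-5, 32), F(1, 8), F(1, 2), F(0)],
]

def payoffs(A, x):
    """Return the payoff vector A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def V(A, x):
    """V(x) = max_i (A x)_i."""
    return max(payoffs(A, x))

m = [F(5, 9), F(1, 9), F(1, 3), F(0)]
N1234 = [F(860, 1899), F(361, 1899), F(181, 633), F(15, 211)]

print(payoffs(A, m))      # strategies 2, 3 and 4 tie at 3/32, strategy 1 earns 1/16
print(payoffs(A, N1234))  # all four payoffs equal 81/844
print(V(A, m) < V(A, N1234))
```

All four pure strategies earn the same payoff at $N_{1234}$, which confirms that it is an interior equilibrium, while $V(\mathbf{m}) < V(N_{1234})$ confirms that $V$ is not minimised there.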

3. Main Results

In this section the main results of this paper are summarised. The following lemma is central to the analysis. It gives the form of the return map, which plays a crucial role in the analysis of the best response dynamics. One big advantage is that this map is of lower dimension than the game. Due to the lower dimension the map is easier to analyse, but no information on the global dynamics is lost. This is in contrast to the analysis of the replicator equation in [3], where the cross-sections for the return map are placed near fixed points (the vertices) via a linearisation. These fixed points cannot be Nash equilibria of the game, and hence no information on the behaviour close to Nash equilibria can be obtained directly by this method. This is especially true for the interior Nash equilibrium. For the best response dynamics, in contrast, the cross-sections for the return map arise more naturally and no linearisation is needed. Almost all orbits of the best response dynamics cross a transition face, which allows us to give results for almost all orbits. Moreover, in some cases we are also able to give general results about the existence and stability of Nash equilibria. In particular, for the interior Nash equilibrium $N_{1234}$, if it exists, we can give very precise results about its stability. Clearly, a result on the existence of a Nash equilibrium for the best response dynamics automatically applies to the replicator equation. The focus in [3,4] was on the study of the heteroclinic cycle, and hence no results on the Nash equilibria were given.
Lemma 2
If there is an orbit from $B_{i-1,i}$ to $B_{i,i+1}$, the map from $B_{i-1,i}$ to $B_{i,i+1}$ can be written in the form (after an appropriate parametrisation):
$$T_i(\mathbf{u}) := \frac{P_i \mathbf{u}}{1 + \mathbf{d}_i \cdot \mathbf{u}}, \quad T_i : \mathbb{R}_+^2 \to \mathbb{R}_+^2$$
where $P_i$ is a $2 \times 2$ matrix. This is a central projection and the centre of the projection (see Section 6.3) is the best response vertex.
If there is an orbit from $B_{i-1,i}$ to $B_{i-1,i}$ following the full cycle, the map from $B_{i-1,i}$ to $B_{i-1,i}$ can be written in the form (after an appropriate parametrisation):
$$\Pi(\mathbf{u}) := \frac{P \mathbf{u}}{1 + \mathbf{d} \cdot \mathbf{u}}, \quad \Pi : \mathbb{R}_+^2 \to \mathbb{R}_+^2$$
where $P$ is a $2 \times 2$ matrix.
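That the full return map has the same fractional linear form as the single transition maps follows from a general fact: maps of the form $\mathbf{u} \mapsto P\mathbf{u} / (1 + \mathbf{d} \cdot \mathbf{u})$ are closed under composition, with $P = P_2 P_1$ and $\mathbf{d} = \mathbf{d}_1 + P_1^{\mathsf T} \mathbf{d}_2$. The sketch below illustrates this with exact arithmetic. The function names and the sample data `P1`, `d1`, `P2`, `d2` are hypothetical and are not the transition matrices of the paper.

```python
from fractions import Fraction as F

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def mat_vec(P, u):
    return [dot(P[0], u), dot(P[1], u)]

def apply_map(P, d, u):
    """Evaluate T(u) = P u / (1 + d . u)."""
    denom = 1 + dot(d, u)
    return [c / denom for c in mat_vec(P, u)]

def compose(P2, d2, P1, d1):
    """Return (P, d) such that u -> P u / (1 + d . u) equals T2 after T1."""
    P = [[sum(P2[i][k] * P1[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    # d = d1 + P1^T d2
    d = [d1[j] + sum(P1[k][j] * d2[k] for k in range(2)) for j in range(2)]
    return P, d

# Hypothetical sample data, chosen only to exercise the formulas.
P1, d1 = [[F(2), F(1)], [F(0), F(1)]], [F(1), F(0)]
P2, d2 = [[F(1), F(0)], [F(1), F(3)]], [F(0), F(2)]
u = [F(1), F(1)]

step_by_step = apply_map(P2, d2, apply_map(P1, d1, u))
P, d = compose(P2, d2, P1, d1)
in_one_go = apply_map(P, d, u)
print(step_by_step, in_one_go)
```

Equivalently, each map corresponds to the $3 \times 3$ matrix with blocks $P$, $\mathbf{0}$, $\mathbf{d}^{\mathsf T}$, $1$ acting on homogeneous coordinates, and composition of maps is multiplication of these matrices.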
Surprisingly, the transition matrices for the stability analysis of the replicator equation in [3] are identical to the transition matrices found for the best response dynamics (after a careful choice of variables). Some technical difficulties arise in the analysis of the best response dynamics. They result from the placement of the cross-sections and from the fact that the return map $\Pi$ is a fractional linear map, and hence a priori not defined for all $\mathbf{x} \in \mathbb{R}^2$. For the replicator equation there must exist an orbit from one cross-section to the next, provided the neighbourhood is chosen small enough. This follows directly from the form of the differential equation: $\dot{x}_i = x_i f_i(\mathbf{x})$. (In fact there are some other technical difficulties for the replicator equation regarding the mapping from one cross-section to the next. These problems concern the shape of the image of the domain, which plays an important role in the analysis.) For the best response dynamics it is not clear whether there is always an orbit from one cross-section to the next. At first this can only be guaranteed if an interior Nash equilibrium exists (because in that case no strategy is dominated and so each pure strategy is used for some $\mathbf{x}$), but as the examples in Section 4 below show, an interior Nash equilibrium $N_{1234}$ does not always exist. Thus we have to show that no strategy is dominated in order to prove that there is in fact always an orbit from one cross-section to the next. The problem with the denominator of the map can be solved by compactifying $\mathbb{R}^2$, that is, by introducing the real projective plane $\mathbb{RP}^2$ (in short notation $\mathbb{P}^2$). This is a detour that was not necessary for the analysis of the replicator equation, but the gain for the best response dynamics is that we get more global results. To be more precise, all results in [3,4] are local results. In contrast, all statements in this work apply at least to all orbits following a specific cycle. In many cases this means a result about almost all orbits.
Let us now turn to the concrete results on the best response dynamics. We start with a result on so-called monocyclic matrices. The terminology follows [7]; it describes a payoff matrix for which all $t_i$ are negative, so that the matrix contains only one better reply cycle:
$$A = \begin{pmatrix} 0 & -c_2 & t_3 & e_4 \\ e_1 & 0 & -c_3 & t_4 \\ t_1 & e_2 & 0 & -c_4 \\ -c_1 & t_2 & e_3 & 0 \end{pmatrix}$$
where $c_i > 0$, $e_i > 0$ and $t_i < 0$ for $i = 1, \dots, 4$. In [10] it is shown that for games with a monocyclic payoff matrix and an interior Nash equilibrium $\mathbf{x}^*$ all orbits converge to the interior Shapley polygon if $\mathbf{x}^* \cdot A\mathbf{x}^* < 0$ holds. We use a different approach to the return map than [10] and get more information on the global dynamics. Proposition 16 below gives a complete classification of the attractors for monocyclic matrices and hence a more precise result on the existence of a Shapley polygon. Additionally we get conditions for the stability of the interior Nash equilibrium. In [6] a list of equivalences for monocyclic games is given. With the help of the return map and some considerations on the index of the Nash equilibria we can completely classify 4 × 4 monocyclic games; see Section 7 below. The only parameters needed for the classification are $\det(A)$, $\prod e_i$ and $\prod c_i$.
For monocyclic payoff matrices a behaviour similar to a supercritical Hopf bifurcation can be observed. If $N_{1234} \cdot A N_{1234} < 0$, there exists a Shapley polygon. If $N_{1234} \cdot A N_{1234} = 0$ holds, this polygon has shrunk to the point $N_{1234}$. We will call this a degenerated Shapley polygon. If $N_{1234} \cdot A N_{1234} > 0$, no Shapley polygon exists.
Theorem 3
Let A be a monocyclic payoff matrix. Then the following statements are equivalent
(i) There is no (possibly degenerated) Shapley polygon for the best response dynamics.
(ii) All orbits converge in finite time to the interior Nash equilibrium $N_{1234}$.
(iii) The system is permanent for the replicator equation.
This result shows the strong connection between the replicator equation and the best response dynamics and it follows immediately from [2] that at least the time average of every interior orbit for the replicator equation converges to the interior Nash equilibrium. So by giving a classification for the best response dynamics for monocyclic games we automatically improve some results for the replicator equation.
The next two theorems give some results on payoff matrices with one additionally embedded RSP cycle.
Remark 
There are some recurring terms in the following results and throughout the analysis in the sections below (mainly in Lemma 11). To enhance readability we summarise these terms here.
Σ 1 = ( c 1 e 2 t 3 + c 2 e 3 t 1 + t 1 t 2 t 3 ) Σ 2 = ( c 2 e 3 t 4 + c 3 e 4 t 2 + t 2 t 3 t 4 ) e = Π e i , c = Π c i and q = e c 2
$\Sigma_1$ and $\Sigma_2$ can be interpreted as the stability criteria for the Shapley triangle to exist in the full game. If the restricted game $A|_{\{1,2,3\}}$ has a Shapley triangle, then $\Sigma_1 < 0$ assures that it still exists in the full game. The same holds for $\Sigma_2$ and the game restricted to 234. The condition $\det(A) > q$ is equivalent to the requirement that the return matrix (13) has a real eigenvalue.
Theorem 4
Let $A$ be a payoff matrix as in (8) with $t_1, t_2, t_4 < 0$ and $t_3 > 0$. Then the game has an interior Shapley polygon, which is locally attracting if $\det A > 0$ or $e < c$ and one of the following conditions holds.
(i) $\Sigma_1 > 0$
(ii) 
Σ 1 < 0 , Σ 2 > 0 a n d det A > max e c + 2 t 4 A ( 24 ) 2 e 4 A ( 14 ) , q
In case (ii) there are two interior Shapley polygons. One is locally attracting and one is a saddle. To be more precise, every orbit following the 1234 cycle is either attracted by the attracting Shapley polygon or repelled by the saddle-type Shapley polygon (except on a two dimensional manifold), so that it follows a different cycle after some time.
This is exactly the same result as in [3] for the existence of a relatively asymptotically stable heteroclinic cycle following the cycle 1234. Again, note that this result is not simply local. It is a result for all orbits following the 1234 cycle at least once. The next theorem shows that the return map for the best response dynamics is again more 'powerful' than the one for the replicator equation, and it gives a result on the existence of the interior Nash equilibrium and its stability.
Theorem 5
Let $A$ be a matrix as in (8) and let $t_1, t_2, t_4 < 0$ and $t_3 > 0$ hold. If $e_1 e_2 t_3 < |t_1| c_2 c_3$ holds and
(a) $\Sigma_1 < 0$ holds, then an attracting Shapley polygon following the 123 cycle exists.
(b) $\Sigma_1 > 0$ holds and no interior Shapley polygon exists (see Theorem 4), then an asymptotically stable interior Nash equilibrium $N_{1234}$ exists.
Theorem 4 together with Theorem 5 also give conditions under which an attracting Shapley triangle, an attracting Shapley polygon in the interior and a second interior Shapley polygon, which is a saddle, exist. The theorem above also provides a connection between the existence of an attracting Shapley polygon and the stability of the interior Nash equilibrium. And hence it draws a connection to the existence of heteroclinic cycles and the interior Nash equilibrium, which was not done in [3] and [4].
The next theorem presents some statements on games with two embedded RSP cycles.
Theorem 6
Let $A$ be a matrix as in (8) and let $t_1 < 0$, $t_2 < 0$, $t_3 > 0$, $t_4 > 0$, $\Sigma_1 > 0$ and $\Sigma_2 > 0$ hold. Then if
(i) $\det(A) > 0$ or $e < c$ holds, an interior Shapley polygon exists, which attracts almost all orbits.
(ii) $q < \det(A) < 0$ and $e > c$ hold, an interior Nash equilibrium $N_{1234}$ exists, which is globally asymptotically stable.
Note that the results from Theorem 5 can also be applied, so with two embedded RSP cycles it is also possible to construct examples with two Shapley polygons on the boundary, one following the 123 cycle and one following the 234 cycle. It is important to note that the return map per se only allows the statement that the interior Nash equilibrium $N_{1234}$ is relatively asymptotically stable (for a definition see [3]) with respect to the set of orbits that follow the 1234 cycle. On the one hand, however, we know for the best response dynamics (especially when no orbit can follow a cycle different from 1234 infinitely many times) that only a set of measure 0 does not converge to the interior Nash equilibrium. On the other hand, in the cases mentioned above we were able to show with some additional considerations that no other Nash equilibrium can exist, and hence the interior equilibrium is globally asymptotically stable.

4. Examples

In this section some examples for the different cases in the sections above are presented. These examples show that there is no connection between the support of the Shapley polygon and the support of the Nash equilibria. There is also no connection between the global dynamics and the Nash equilibria. The examples will also show that the support of the best reply cycle and the support of the cycle the Shapley polygon follows, do not have to be identical. There is also an example, where the support of the unique Nash equilibrium is not contained in the best reply cycle. Under this assumption the Shapley polygon can be seen as another solution concept for games which only have unstable Nash equilibria.
Example 2
This example has a unique Shapley polygon. It lies on the 123 face and the game has no interior Nash equilibrium. Note that the best response cycle connects the strategies 234, while the attracting Shapley polygon follows the 123 cycle (Theorem 4). So the support of the best response cycle does not have to be contained in the support of the attracting Shapley polygon and vice versa.
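$$A = \begin{pmatrix} 0 & -1 & \frac{3}{20} & 1 \\ 1 & 0 & -\frac{1}{5} & 2 \\ -1 & 1 & 0 & -4 \\ -1 & -15 & 1 & 0 \end{pmatrix}$$
The unique Nash equilibrium is given by
$$N_{123} = \left( \tfrac{11}{81}, \tfrac{10}{81}, \tfrac{20}{27}, 0 \right)$$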
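A candidate equilibrium such as $N_{123}$ can be verified mechanically: a state $\mathbf{x}$ is a symmetric Nash equilibrium exactly when every pure strategy in its support attains $\max_i (A\mathbf{x})_i$. The following sketch applies this test to Example 2. The function name is hypothetical, and the signs of the matrix entries are an assumption reconstructed from the general form of the payoff matrix and from the stated best response cycle 234; they are not given explicitly in the text as extracted.

```python
from fractions import Fraction as F

def is_symmetric_nash(A, x):
    """True if every strategy in the support of x is a best reply to x."""
    pay = [sum(a * xj for a, xj in zip(row, x)) for row in A]
    best = max(pay)
    return all(p == best for p, xi in zip(pay, x) if xi > 0)

# Example 2 matrix; entry signs are an assumption (see the lead-in).
A = [
    [F(0), F(-1), F(3, 20), F(1)],
    [F(1), F(0), F(-1, 5), F(2)],
    [F(-1), F(1), F(0), F(-4)],
    [F(-1), F(-15), F(1), F(0)],
]

N123 = [F(11, 81), F(10, 81), F(20, 27), F(0)]
barycentre = [F(1, 4)] * 4

print(is_symmetric_nash(A, N123))
print(is_symmetric_nash(A, barycentre))
```

At $N_{123}$ the strategies 1, 2 and 3 all earn $-\tfrac{1}{81}$ while strategy 4 earns strictly less, so the test succeeds; at the barycentre the payoffs differ and the test fails.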
Example 3
The second example has a unique Shapley polygon on the 123 face (Theorem 4) and a unique Nash equilibrium on the 124 face. This example stresses the non-correlation of Nash equilibria and Shapley polygons as neither the support of the Nash equilibrium is contained in the support of the Shapley polygon nor is the support of the Shapley polygon contained in the support of the Nash equilibrium.
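$$A = \begin{pmatrix} 0 & -3 & \frac{3}{2} & 1 \\ 1 & 0 & -3 & -15 \\ -15 & 1 & 0 & -1 \\ -1 & -\frac{39}{20} & 1 & 0 \end{pmatrix}$$
The unique Nash equilibrium is given by
$$N_{124} = \left( \tfrac{276}{697}, \tfrac{20}{41}, 0, \tfrac{81}{697} \right)$$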
Example 4
The following example has one Nash equilibrium, but no interior one. Consequently, a Shapley polygon with full support can exist without an interior Nash equilibrium. The Shapley polygon on the boundary lies on the 123 face, whereas the best response cycle connects 234 (Theorem 5).
$$A = \begin{pmatrix} 0 & -1 & \frac{1}{3} & 1 \\ 1 & 0 & -2 & 2 \\ -1 & 1 & 0 & -85 \\ -2 & -2 & 1 & 0 \end{pmatrix}$$
The unique Nash equilibrium is given by
$$N_{123} = \left( \tfrac{13}{30}, \tfrac{4}{15}, \tfrac{3}{10}, 0 \right)$$
Example 5
The last example has two Shapley polygons on the boundary and one in the interior. The two Shapley polygons on the boundary lie in the 123 and the 234 face (Theorem 6).
$$A = \begin{pmatrix} 0 & -2 & \frac{12623}{20000} & 1 \\ 1 & 0 & -\frac{11}{100} & \frac{31}{10} \\ -3 & 1 & 0 & -10 \\ -1 & -3 & 1 & 0 \end{pmatrix}$$
The unique Nash equilibrium is given by
$$N_{123} = \left( \tfrac{19223}{256315}, \tfrac{57092}{256315}, \tfrac{36000}{51263}, 0 \right)$$
We get two attractors (Shapley polygons) on the boundary and an interior unstable Shapley polygon. Additionally there is one orbit from the 234 face that moves towards $N_{123}$.

5. General Properties of the Dynamics

The following lemma gives more insight into the solutions of the game with payoff matrix $A$ under the assumptions made in the beginning.
Lemma 7
Let $A$ be a payoff matrix of the form (8) with the assumptions (9)–(12). Let $\mathbf{x} \in B_i$ with $i \in \{1, \dots, 4\}$. Then there exists a $T \in (0, \infty]$ such that the solution $\mathbf{x}(t)$ is unique for $t \in [0, T)$ and $\mathbf{x}(t) \in \bigcup_{i=1}^4 B_i$ holds for almost all $0 \le t \le T$. For $t > T$ the solution $\mathbf{x}(t)$ is no longer unique, or there is an interval $I$ such that $\mathbf{x}(t) \notin \bigcup_{i=1}^4 B_i$ for $t \in I$.
If $T < \infty$ holds, then either an interior Nash equilibrium is reached in finite time or $\mathbf{x}(t)$ converges for $t \to \infty$ to a Nash equilibrium on the boundary. If $T = \infty$ holds, then $\mathbf{x}(t) \to V_0$ as $t \to \infty$.
Remark 
Note that assumption (11), which excludes forced movement, is needed for this lemma. In general we cannot predict the behaviour of the orbit after the interior Nash equilibrium $N_{1234}$ is reached via a forced movement. If $N_{1234}$ is the unique equilibrium, the orbit cannot leave $N_{1234}$; but if the game has several Nash equilibria, then uniqueness is lost and it is possible to continue the path towards each equilibrium at any time. If, for example, $N_{13}$ exists, the orbit can move towards $N_{13}$ and can then move towards strategy 1 or 3 at any time. In this case the orbit re-enters the set $\bigcup_{i=1}^4 B_i$.
Proof 
$\mathbf{x} \in B_i$ means that
$$(A\mathbf{x})_i > (A\mathbf{x})_j \quad \forall j \neq i$$
which means that the orbit has to move towards $\mathbf{e}_i$ and hence can be written as
$$\mathbf{x}(t) = e^{-t}\mathbf{x} + (1 - e^{-t})\mathbf{e}_i$$
for $t$ small enough. If we look at (after the change of time $t \mapsto 1 - e^{-t}$)
$$A\mathbf{x}(t) = A\mathbf{x} + tA(\mathbf{e}_i - \mathbf{x}) = (1 - t)A\mathbf{x} + tA\mathbf{e}_i$$
we can calculate which strategy becomes the next best response. $A\mathbf{e}_i$ is simply the $i$-th column of the matrix $A = (a_{ij})$. We know by (9) and (10) that $a_{ii} = 0$, $a_{i+1,i} > 0$ and $a_{i-1,i} < 0$. Now there are two possibilities for $a_{i+2,i}$, which corresponds to $t_i$: it can be greater or smaller than zero.
  • $a_{i+2,i} < 0$:
    In this case it is obvious that only the payoff of the $(i+1)$-st strategy is increasing and for some $\tilde{t}$
    $$(A\mathbf{x}(\tilde{t}))_i = (A\mathbf{x}(\tilde{t}))_{i+1} > (A\mathbf{x}(\tilde{t}))_{i+2,\, i+3}$$
    which means that $\mathbf{x}(\tilde{t}) \in B_{i,i+1}$, or in other words the best response is not unique at this moment. To find the possible solutions we have to look at the restricted game in which only those strategies are used that are best replies at the moment. The orbit can move in the direction of any Nash equilibrium of this restricted game $A|_{lm} = (a_{lm})$, $l, m = i, i+1$.
    This restricted game $A|_{lm}$ has only one Nash equilibrium, $\mathbf{e}_{i+1}$ (because of (9) and (10)). Thus the path can only be continued in one way, so $\mathbf{x}(t)$ enters $B_{i+1}$, which means that the solution stays unique.
  • $a_{i+2,i} > 0$:
    By (11) we know that in this case $a_{i,i+2} < 0$ must hold. Now three strategies can become best replies, which means that we have three possibilities:
    $\mathbf{x}(\tilde{t}) \in B_{i,i+1}$
    as before $\mathbf{x}(t) \in B_{i+1}$ for $t > \tilde{t}$.
    $\mathbf{x}(\tilde{t}) \in B_{i,i+2}$
    as before $\mathbf{x}(t)$ enters $B_{i+2}$ for $t > \tilde{t}$.
    $\mathbf{x}(\tilde{t}) \in B_{i,i+1,i+2}$
    We have to look at the restricted 3 × 3 game and check for all possible Nash equilibria. The matrix has the following sign pattern.
    $$A|_{i,i+1,i+2} = \begin{pmatrix} a_{i,i} = 0 & a_{i,i+1} < 0 & a_{i,i+2} < 0 \\ a_{i+1,i} > 0 & a_{i+1,i+1} = 0 & a_{i+1,i+2} < 0 \\ a_{i+2,i} > 0 & a_{i+2,i+1} > 0 & a_{i+2,i+2} = 0 \end{pmatrix}$$
    so it is easy to see that this game also has only one Nash equilibrium (the pure strategy $i+2$). Hence the solution $\mathbf{x}(t)$ remains unique and enters $B_{i+2}$.
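The case distinction above amounts to finding the first rescaled time at which another strategy ties with the current best response $i$. From $A\mathbf{x}(t) = (1-t)A\mathbf{x} + tA\mathbf{e}_i$, strategy $j$ ties with $i$ at $t_j = g_j / (g_j + a_{ji} - a_{ii})$, where $g_j = (A\mathbf{x})_i - (A\mathbf{x})_j > 0$, and this only happens for $a_{ji} > a_{ii}$. The sketch below computes the earliest such time. The function name and the sample matrix and starting points are hypothetical, and strategies are indexed from 0 in the code.

```python
from fractions import Fraction as F

def next_switch(A, x, i):
    """For x in B_i (0-based i), return (t, ties): the earliest rescaled time
    t in (0, 1) at which another strategy ties with i, and the tying strategies.
    Assumes some a_ji > a_ii exists, which holds under e_i > 0."""
    n = len(A)
    pay = [sum(a * xj for a, xj in zip(row, x)) for row in A]
    times = {}
    for j in range(n):
        gain = A[j][i] - A[i][i]      # rate advantage of j while moving to e_i
        if j != i and gain > 0:
            gap = pay[i] - pay[j]     # current payoff lead of i over j
            times[j] = gap / (gap + gain)
    t = min(times.values())
    return t, sorted(j for j, tj in times.items() if tj == t)

# Hypothetical sample matrix of the general form (not taken from the paper).
A = [
    [F(0), F(-1), F(1, 2), F(1)],
    [F(1), F(0), F(-1), F(-2)],
    [F(-2), F(1), F(0), F(-1)],
    [F(-1), F(-2), F(1), F(0)],
]

x_in_B1 = [F(1, 10), F(1, 10), F(1, 10), F(7, 10)]   # strategy 1 is best here
x_in_B3 = [F(1, 10), F(7, 10), F(1, 10), F(1, 10)]   # strategy 3 is best here

print(next_switch(A, x_in_B1, 0))
print(next_switch(A, x_in_B3, 2))
```

Starting in $B_1$, where $t_1 < 0$ in the sample matrix, only strategy 2 can tie, which is the first bullet. Starting in $B_3$, where $t_3 > 0$, both strategy 4 and strategy 1 are candidates, and the earlier tie decides which set the orbit enters, which is the second bullet. The original time is recovered from the rescaled one via $-\ln(1 - t)$.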
To show the second part of the lemma we first assume that T = then we know from Lemma 1 that x ( t ) converges to the set V 0 . Now assume that T < . Using Lemma 8 (see below) we know that the orbit x ( t ) ultimately follows only one better reply cycle. We distinguish between the following two cases.
  • The orbit follows the cycle 1234 for t < T .
    We show that in this case the interior Nash equilibrium is reached in finite time. Take the sequence x ( t k ) of turning points (note that at these points at least two strategies have the same payoff), then the sequence t k converges to T. The map t A x ( t ) i is continuous for i = 1 , , 4 . As the orbit follows the cycle 1234 along one cycle of the orbit the payoff of each strategy is equal to its successor, so because of continuity as the difference between t k and t k + 1 can be chosen arbitrarily small, the difference between the different payoffs gets also arbitrarily small. Thus for t = T all payoffs must be equal. This means that all strategies have the same payoff after finite time. It is not possible to reach the boundary (from int Δ n ) so the orbit must reach an interior Nash equilibrium in finite time. To make this verbal argument more precise take a sequence of t 1 k such that ( A x ) 1 = ( A x ) 2 holds and t 2 k such that ( A x ) 2 = ( A x ) 3 and so on. Each of these sequences t i k , i = 1 , 2 , 3 , 4 converges to T and hence at T the payoffs for all strategies must be equal.
  • The orbit follows the cycle 123 (which we can choose without loss of generality). Note that this can only happen if the 123 better reply cycle exists, e.g. $t_1 < 0$ and $t_3 > 0$. We can use the same argument as for the cycle 1234 to see that at $x(T)$ at least the strategies 123 must have the same payoff. This means that at $x(T)$ the following holds
    $$e_1 \cdot Ax = e_2 \cdot Ax = e_3 \cdot Ax \ge e_4 \cdot Ax$$
    If all strategies have the same payoff a Nash equilibrium is reached, so we assume now that we reach a point $x(T)$ where only 123 are the best replies. To continue the orbit, we have to look at the restricted game $A|_{123}$. It follows from assumptions (9)-(11) that this is an RSP game. Thus if $\operatorname{supp} x(T) = 123$, then the orbit has reached a Nash equilibrium. This can only happen if the starting point of the orbit was already in the 123 face.
    If $\operatorname{supp} x(T) \neq 123$ then the orbit moves towards the unique Nash equilibrium of the RSP game. Along this movement the payoffs of the strategies 123 change at the same rate; only the payoff of strategy 4 changes differently. Therefore the orbit either hits an interior Nash equilibrium (if it exists) or converges to the unique Nash equilibrium of the RSP game (which is also a Nash equilibrium of the full game). This means we either get $x(T) = N_{1234}$ or $x$ moves towards $N_{123}$.
Remark 
We can formulate the last statement in the proof above a bit more strongly. Suppose $T$ is finite and the orbit $x(t)$ follows the 123 cycle and reaches at time $T$ a point $p$ with payoff $Ap = (a, a, a, b)$ with $a > b$. Then the orbit reaches an interior Nash equilibrium $N_{1234}$ if and only if the Nash equilibrium $N_{123}$ of the restricted RSP game $A|_{123}$ is not a Nash equilibrium of the full game. This follows directly from the fact that the payoff of the fourth strategy changes differently from the payoffs of the strategies 123.
Lemma 8
An orbit $x(t)$ can switch between better reply cycles at most once.
Proof 
We show this without loss of generality for the 123 and 1234 cycles. First note that for a switch between these two better reply cycles to be possible, there must be some orbits which follow the 123 cycle and some orbits which follow the 1234 cycle. Now let $x$ be an orbit which follows the 123 cycle (at least once). This orbit must cross the sets $B_{12}$, $B_{23}$ and $B_{31}$. Let $y$ be an orbit which follows 1234. This orbit must cross the sets $B_{12}$, $B_{23}$, $B_{34}$ and $B_{14}$. If we start in $B_{23}$ then 31 and 34 can become best replies. Thus there is a subset of $B_{23}$ which is mapped to $B_{134}$, which is a one dimensional set by our assumption that $A$ is generic. Hence the preimage of $B_{134}$ in $B_{23}$ is a line segment $s$. The orbits starting in $s$ and moving to $B_{134}$ span a plane $P$ in $B_3$, which separates the orbits that follow the 123 cycle from the orbits that follow 1234. Because of the uniqueness of solutions in each $B_i$ this plane $P$ can only be crossed in one direction, and hence switching between cycles is only possible once.
An immediate consequence of this lemma is
Corollary 9
For a matrix fulfilling (8) no chaotic behaviour can occur.
The results for the RSP game with respect to $V_0$ may suggest that the behaviour is identical for a 4 × 4 payoff matrix, in the sense that if $V_0$ is not empty it is already an attractor. The following Example 6 shows that the set $V_0$ in general need be neither attracting nor invariant. It is possible to enter or leave this set in finite time.
Example 6
We take the following payoff matrix
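$$A = \begin{pmatrix} 0 & -1 & 1 & \frac{1}{16} \\ 1 & 0 & -8 & \frac{1}{4} \\ -1 & 3 & 0 & -\frac{1}{10} \\ -\frac{1}{7} & -2 & 5 & 0 \end{pmatrix}$$
This payoff matrix has a unique Nash equilibrium $N_{1234} = \left(\frac{21}{631}, \frac{162}{3155}, \frac{88}{3155}, \frac{560}{631}\right)$ with
$$V(N_{1234}) = N_{1234} \cdot A N_{1234} = \frac{101}{3155} > 0$$
but the minimum of $V(x)$ is attained at $\hat{x} = \left(\frac{77}{111}, \frac{118}{555}, \frac{52}{555}, 0\right)$ with $V(\hat{x}) = -\frac{31}{555}$. The results from Theorem 6 above show that almost all orbits converge to the interior Nash equilibrium.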
Figure 2. The set $V_0$ is shown by the green pyramid. The red orbit connects $\hat{x}$, the top of the pyramid, the Nash equilibrium $N_{1234}$ and the target point $N_{234}$ for the line segment $s$ spanned by $\hat{x}$ and $N_{1234}$.
Beyond the result of this theorem we can give a complete description of this game. The set $V_0$ is a pyramid with top $p = \left(\frac{43}{157}, \frac{121}{1099}, \frac{57}{1099}, \frac{620}{1099}\right)$. At this point the payoff vector $Ap$ is of the form $(w, 0, 0, 0)$ where $w < 0$. This means the trajectory starting at $p$ continues towards the unique Nash equilibrium $N_{234} = \left(0, \frac{51}{1180}, \frac{29}{1180}, \frac{55}{59}\right)$ of the restricted 234 game. (Note that $N_{234}$ is not a Nash equilibrium of the full game!) Inside the pyramid $V(x) < 0$ holds and outside the pyramid $V(x) > 0$ holds. So every orbit that starts inside the pyramid reaches the line segment $s$ (spanned by $\hat{x}$ and $N_{1234}$) in finite time, where 234 are the best replies. (This means the orbit reaches a point where its payoff vector is of the form $(b, a, a, a)$ with $a > b$.) In $s$ the orbit moves towards $N_{234}$, passes $p$ and reaches $N_{1234}$ in finite time. Close to $s$ the dynamics is the same as for an RSP game with attracting Nash equilibrium.
Every orbit that starts in $V_0$ spirals (following the cycle 234) towards $p$ and reaches it in finite time, but after an infinite number of turning points! Then again the orbit reaches $N_{1234}$ in finite time. Orbits outside the pyramid behave similarly to the orbits inside: they spiral towards $s$, reach it in finite time and thus reach the Nash equilibrium $N_{1234}$ in finite time. In this example every orbit reaches $N_{1234}$ in finite time. Therefore $N_{1234}$ is globally asymptotically stable.
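The numbers in Example 6 can be checked with exact rational arithmetic. The following Python sketch is an illustration only. It assumes the sign pattern of $A$ written in the code (the signs are a reconstruction, chosen so that all four payoffs at $N_{1234}$ coincide), and the helper name `payoffs` is hypothetical.

```python
from fractions import Fraction as F

# Payoff matrix of Example 6. The minus signs are an assumption: they are
# chosen so that all payoffs at N_1234 are equal.
A = [
    [F(0), F(-1), F(1), F(1, 16)],
    [F(1), F(0), F(-8), F(1, 4)],
    [F(-1), F(3), F(0), F(-1, 10)],
    [F(-1, 7), F(-2), F(5), F(0)],
]

def payoffs(A, x):
    """Return the payoff vector A x as a list of exact fractions."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

N1234 = [F(21, 631), F(162, 3155), F(88, 3155), F(560, 631)]
x_hat = [F(77, 111), F(118, 555), F(52, 555), F(0)]
p_top = [F(43, 157), F(121, 1099), F(57, 1099), F(620, 1099)]
N234 = [F(0), F(51, 1180), F(29, 1180), F(55, 59)]

for name, x in [("N1234", N1234), ("x_hat", x_hat), ("p", p_top), ("N234", N234)]:
    assert sum(x) == 1  # every point lies in the simplex
    print(name, [str(v) for v in payoffs(A, x)])
```

Under these assumptions all payoffs at $N_{1234}$ are equal, strategies 2, 3 and 4 are tied as best replies at $\hat{x}$ and at $p$, and at $N_{234}$ strategy 1 earns strictly more than the other three, which is why $N_{234}$ is not a Nash equilibrium of the full game.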

6. Construction and Properties of the Return Map

The previous example showed that the analysis of the set $V_0$ is not sufficient to analyse 4 × 4 games, so another tool is needed. In this section we introduce the concept of Poincaré maps, also called return maps. We just give a rough concept, which should be made intuitively clear by Figure 3 below. For more detailed information and all the technical assumptions needed see for example [6]. To create a return map we simply take a cross-section and a starting point $p$ in this cross-section. We now follow the orbit $p(t)$ until it returns to the cross-section for the first time. We denote this new point in the cross-section by $p'$. The map $p \mapsto p'$ is called the return map. This map can be used to find periodic orbits and fixed points and to check their stability.
Figure 3. A simplified drawing of the kind of return map we will use below.
We will use a set $B_{i-1,i}$ as the cross-section and hence this set is the domain for our return map (which will be defined later). Since solutions are piecewise linear it is not possible to calculate the return map at once. Instead we have to calculate four transition maps and glue them together to obtain a return map. First we have to find a proper parametrisation of our four cross-sections. (These cross-sections are simply given by $B_{12}$, $B_{23}$, $B_{34}$ and $B_{41}$, which correspond to the cycle given by the structure of the matrix.) It turns out that the following parametrisation of the set $B_{i-1,i}$ is very useful
$$m = (m_j) = \big((Ax)_i - (Ax)_j\big) \quad \text{for } x \in B_{i-1,i}$$
We get $m \ge 0$. Without loss of generality we assume that $m \neq 0$ (if $m = 0$ then $x$ is an interior Nash equilibrium, as all payoffs are equal there). This parametrisation can be interpreted as an incentive function, since it measures the relative success of strategy $i$ compared to all others. Hence the domain for our special parametrisation is a convex subset of $\mathbb{R}^4_+$ or is empty, which means that no orbit follows the full cycle.
We calculate the transition $B_{i-1,i} \to B_{i,i+1}$. We do this by calculating the time it takes a point from the inset $B_{i-1,i}$ to reach the outset $B_{i,i+1}$.
Remark 
The terminology in- and outset is chosen since the set $B_i$ can be entered through $B_{i-1,i}$ and can be left through $B_{i,i+1}$. Note that these need not be the only ways to enter or leave the set $B_i$.

6.1. Proof of Lemma 2

Proof 
To improve the readability of the proof we prove this exemplarily for an orbit from $B_{41}$ to $B_{12}$. So we start with a point $x \in B_{41}$ and parametrise it as in (16). Since 4 and 1 are the best replies the orbit moves towards 1. For $t \ge 0$ small enough the following holds:
$$A x(t) = e^{-t} A x + (1 - e^{-t}) A \mathbf{e}_1$$
We assume now that 2 becomes the next best reply. We have to solve
$$(A x(t))_1 = (A x(t))_2$$
for (17). The solution can be easily calculated and is given by
$$e^{-t^*} = \frac{e_1}{e_1 + (Ax)_1 - (Ax)_2}$$
where $t^*$ represents the transition time from $B_{41}$ to $B_{12}$. So we get the following mapping for $m$:
$$m = \begin{pmatrix} 0 \\ (Ax)_1 - (Ax)_2 \\ (Ax)_1 - (Ax)_3 \\ 0 \end{pmatrix} \mapsto \lambda \begin{pmatrix} 0 \\ 0 \\ e_1\big((Ax)_1 - (Ax)_3\big) - t_1\big((Ax)_1 - (Ax)_2\big) \\ c_1\big((Ax)_1 - (Ax)_2\big) \end{pmatrix} = m'$$
with $\lambda = \frac{e_1}{e_1 + (Ax)_1 - (Ax)_2}$. Clearly we get $m' \ge 0$. As there are two zeros in this map it is possible to interpret it as a two dimensional map. If we set $u/e_1 = (Ax)_1 - (Ax)_2$, $v/e_1 = (Ax)_1 - (Ax)_3$ and $\mathbf{u} = (u, v)$ we obtain the following:
$$f : \begin{pmatrix} u \\ v \end{pmatrix} \mapsto \frac{1}{1 + u/e_1^2} \begin{pmatrix} v - \frac{t_1}{e_1} u \\ \frac{c_1}{e_1} u \end{pmatrix}$$
This map (20) is of the form:
$$T_1(\mathbf{u}) := \frac{P_1 \mathbf{u}}{1 + \mathbf{d}_1 \cdot \mathbf{u}}, \qquad T_1 : B_{41} \to B_{12}$$
It is easy to see that for an arbitrary map $T_i$ from $B_{i-1,i}$ to $B_{i,i+1}$ we get
$$P_i = \begin{pmatrix} -t_i/e_i & 1 \\ c_i/e_i & 0 \end{pmatrix}, \qquad \mathbf{d}_i = \begin{pmatrix} 1/e_i^2 \\ 0 \end{pmatrix}$$
where $P_i$ is called the transition matrix. This map is a projective map; properties of such maps can be found in Section 6.3, including that the composition of two projective maps is again a projective map (this follows directly from (31)). Glueing (via composition) the four transition maps together gives us the return map $\Pi(\mathbf{u})$ from $B_{41}$ to itself by
$$\Pi(\mathbf{u}) = \frac{P \mathbf{u}}{1 + \mathbf{d} \cdot \mathbf{u}}$$
where $P = P_4 P_3 P_2 P_1$ and $\mathbf{d}$ are given by
$$P = \frac{1}{e} \begin{pmatrix} c_1 c_3 e_2 e_4 + c_3 e_4 t_1 t_2 - t_4 \Sigma_1 & e_1 \Sigma_2 \\ c_4 \Sigma_1 & \Sigma_3 \end{pmatrix}$$
$$\mathbf{d} = \begin{pmatrix} d_1 \\ d_2 \end{pmatrix} = \begin{pmatrix} \frac{1}{e_1^2} + \frac{c_1}{e_1 e_3^2} - \frac{t_1}{e_1 e_2^2} - \frac{c_2 t_1}{e_1 e_2 e_4^2} + \frac{t_1 t_2}{e_1 e_2 e_3^2} - \frac{c_1 t_3}{e_1 e_3 e_4^2} - \frac{t_1 t_2 t_3}{e_1 e_2 e_3 e_4^2} \\ \frac{1}{e_2^2} + \frac{c_2}{e_2 e_4^2} - \frac{t_2}{e_2 e_3^2} + \frac{t_2 t_3}{e_2 e_3 e_4^2} \end{pmatrix}$$
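The composition of the four transition maps can be reproduced numerically. The sketch below is an illustration under stated assumptions: it represents each transition map by the block matrix with first row $(1, \mathbf{d}_i^T)$ and lower right block $P_i$ (the homogeneous form used in Section 6.3), with $P_i$ having rows $(-t_i/e_i, 1)$ and $(c_i/e_i, 0)$ and $\mathbf{d}_i = (1/e_i^2, 0)$. The function names and the parameter values are hypothetical.

```python
from fractions import Fraction as F

def transition(e_i, c_i, t_i):
    """Block matrix [[1, d_i^T], [0, P_i]] of one transition map T_i."""
    return [
        [F(1), 1 / e_i**2, F(0)],
        [F(0), -t_i / e_i, F(1)],
        [F(0), c_i / e_i, F(0)],
    ]

def matmul(X, Y):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def return_map(e, c, t):
    """Compose T_4 T_3 T_2 T_1 and read off P (2x2) and d (length 2)."""
    T = [[F(int(i == j)) for j in range(3)] for i in range(3)]
    for e_i, c_i, t_i in zip(e, c, t):
        T = matmul(transition(e_i, c_i, t_i), T)
    return [row[1:] for row in T[1:]], T[0][1:]

# Hypothetical parameters with e_i, c_i > 0 and t_i < 0.
e = [F(1), F(2), F(3), F(4)]
c = [F(2), F(1), F(1), F(3)]
t = [F(-1), F(-2), F(-1), F(-3)]
P, d = return_map(e, c, t)
print(P, d)
```

Working with the block matrices keeps the bookkeeping of $\mathbf{d}$ automatic: the first row of the product collects the contributions of all four transitions.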

6.2. A Necessary Condition for Shapley Polygons

The important property of the return map $\Pi$ is that its fixed points correspond either to an interior fixed point (Nash equilibrium) or to a periodic orbit (Shapley polygon) for the best response dynamics. More precisely, the origin $0 \in \mathbb{R}^2_+$ corresponds to an interior Nash equilibrium and every other fixed point in $B_{41}$ to a Shapley polygon. To find a fixed point of $\Pi$ we have to solve the equation
$$\mathbf{u} = \frac{P \mathbf{u}}{1 + \mathbf{d} \cdot \mathbf{u}}$$
which is simply an eigenvalue problem
$$(1 + \mathbf{d} \cdot \mathbf{u}) \mathbf{u} = P \mathbf{u}$$
Note that the vector $m$ in (16) is nonnegative. Hence for a full cycle a nonnegative vector must be mapped onto a nonnegative vector under the return map $\Pi$. If this is not possible no orbit follows the full cycle. The following corollary follows directly.
Corollary 10
A Shapley polygon following the full cycle can only exist if the return matrix P in Equation (13) has a nonnegative eigenvector.
Proof 
A Shapley polygon corresponds to a fixed point of the return map $\Pi$ in (13). This fixed point is an eigenvector of $P$ by (25), and only nonnegative vectors can correspond to points of the set $B_{41}$, where the return map initially started.
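The eigenvalue formulation gives a direct recipe for a nonzero fixed point: if $P\mathbf{u}_+ = \lambda \mathbf{u}_+$, then $\mathbf{u}^* = k\mathbf{u}_+$ with $k = (\lambda - 1)/(\mathbf{d} \cdot \mathbf{u}_+)$ satisfies $1 + \mathbf{d} \cdot \mathbf{u}^* = \lambda$ and is therefore fixed. The following Python sketch illustrates this; the matrix `P`, the vector `dvec` and all function names are hypothetical and not taken from a concrete game.

```python
import math

def dominant_eigenpair(P):
    """Larger real eigenvalue of a 2x2 matrix and an eigenvector (x, 1)."""
    (a, b), (c, d) = P
    disc = (a + d) ** 2 - 4 * (a * d - b * c)
    if disc < 0 or c == 0:
        raise ValueError("needs real eigenvalues and a nonzero lower-left entry")
    lam = 0.5 * (a + d + math.sqrt(disc))
    return lam, ((lam - d) / c, 1.0)

def fixed_point(P, dvec):
    """Nonzero fixed point of u -> P u / (1 + d.u) on the eigenvector above."""
    lam, u = dominant_eigenpair(P)
    k = (lam - 1.0) / (dvec[0] * u[0] + dvec[1] * u[1])
    return lam, (k * u[0], k * u[1])

def return_map(P, dvec, u):
    """Evaluate the fractional linear map at the point u."""
    den = 1.0 + dvec[0] * u[0] + dvec[1] * u[1]
    return ((P[0][0] * u[0] + P[0][1] * u[1]) / den,
            (P[1][0] * u[0] + P[1][1] * u[1]) / den)

P = [[2.0, 1.0], [1.0, 1.0]]   # hypothetical return matrix
dvec = (1.0, 0.5)              # hypothetical vector d
lam, u_star = fixed_point(P, dvec)
print(lam, u_star, return_map(P, dvec, u_star))
```

For a positive eigenvector and positive $\mathbf{d}$ the factor $k$ is positive only if $\lambda > 1$, which is one way to see why the size of the dominating eigenvalue decides whether a nonnegative fixed point other than the origin exists.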
We state some general results for a certain class of two by two matrices T from [3]. In [3] these are conditions that the replicator equation has a relatively asymptotically stable heteroclinic cycle following the full better reply cycle, whereas for the best response dynamics they are necessary conditions for an attracting interior Nash equilibrium or the existence of an interior Shapley polygon.
Lemma 11
Let $\Sigma_1 = -(c_1 e_2 t_3 + c_2 e_3 t_1 + t_1 t_2 t_3)$, $\Sigma_2 = -(c_2 e_3 t_4 + c_3 e_4 t_2 + t_2 t_3 t_4)$, $e = \prod e_i$, $c = \prod c_i$ and $q = -(\sqrt{e} - \sqrt{c})^2$, let $P$ be as in (13), and assume that $p_{ij} \neq 0$ for all $i, j = 1, 2$ and that $p_{11} \neq p_{22}$. Then $P$ has a positive eigenvalue $\lambda_+$ with a corresponding positive eigenvector $u_+$ if and only if one of the following statements holds
(i) $\Sigma_1 > 0$, $\Sigma_2 > 0$ and $\det(A) \ge q$
(ii) $\Sigma_2 < 0$, $\Sigma_1 > 0$ and $\det(A) > \max\{e - c - 2 e_1 A^{(21)},\, q\}$
(iii) $\Sigma_2 > 0$, $\Sigma_1 < 0$ and $\det(A) > \max\{e - c + 2 t_4 A^{(24)} - 2 e_4 A^{(14)},\, q\}$
where $A^{(ij)}$ is the determinant of the matrix we obtain by omitting the $i$th row and the $j$th column in $A$. Additionally, for the eigenvalue $\lambda_+$ we have $\lambda_+ > 1$ iff
(a) $\det(A) > 0$ or (b) $e < c$

6.3. Projective Geometry and Fractional Linear Maps

The next step is to analyse the behaviour of the iteration of the return map. This can lead to technical difficulties because (22) cannot be defined for all $x \in \mathbb{R}^2$. So before we start with the analysis of the concrete return map as described in Section 2 we give some general results on fractional linear maps of the form
$$\Pi(x) := \frac{P x}{1 + \mathbf{d} \cdot x}, \qquad \Pi : \mathbb{R}^2 \setminus \{x \in \mathbb{R}^2 : \mathbf{d} \cdot x = -1\} \to \mathbb{R}^2$$
where $P = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, $\mathbf{d} = \begin{pmatrix} d_1 \\ d_2 \end{pmatrix}$ with $\det(P) \neq 0$. (We also implicitly assume that the matrix $P$ has no entry equal to zero. The results would not differ, but otherwise the proofs would require many case distinctions.)
Maps of this form are also called central projections or projective maps. We have already mentioned in Section 6.2 that a nonnegative vector has to be mapped onto a nonnegative vector. Consequently we have to find conditions under which an invariant set in $\mathbb{R}^2_+$ exists. If there is no such set, no orbit can follow the full cycle infinitely often and hence no Shapley polygon in the interior can exist. We also know the fixed points of (22) from Section 6.2; to study their stability we have to analyse $\Pi^n(x) = \Pi \circ \Pi \circ \dots \circ \Pi(x)$.
To do so we use projective geometry; for more details about projective geometry see [11]. We use homogeneous coordinates and get the following: if $x_0 \neq 0$ then $(x_0, x_1, x_2) \sim \left(1, \frac{x_1}{x_0}, \frac{x_2}{x_0}\right)$. These points are called regular, and this method creates a bijective mapping from the regular points in $\mathbb{P}^2$ to $\mathbb{R}^2$. Points with $x_0 = 0$ are called points at infinity and the set of all $(0, x_1, x_2)$ is called the line at infinity. With this concept we can write the two dimensional fractional linear map $\Pi$ from (30) in the following form
$$T : \mathbb{P}^2 \to \mathbb{P}^2, \qquad T(x) = T x = \begin{pmatrix} 1 & d_1 & d_2 \\ 0 & a & b \\ 0 & c & d \end{pmatrix} \begin{pmatrix} x_0 \\ x_1 \\ x_2 \end{pmatrix}$$
In this map $T$ the real points which satisfy $1 + d_1 \bar{x}_1 + d_2 \bar{x}_2 = 0$ (with $\bar{x}_1 = x_1/x_0$ and $\bar{x}_2 = x_2/x_0$) are mapped to the line at infinity. We denote by $T$ both the matrix and the map, but as the map is only a matrix times a vector, there is no problem in doing so. The matrix $T$ is block triangular and hence the product of two such matrices is again block triangular. This gives an easy proof that the composition of two central projections is again a central projection. Any line $\hat{l}$ with starting point $(x_1, x_2)$ and direction $(l_1, l_2)$
$$\hat{l} : X = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \mu \begin{pmatrix} l_1 \\ l_2 \end{pmatrix}$$
in $\mathbb{R}^2$ can be embedded in $\mathbb{P}^2$ in the following form
$$l : X = \begin{pmatrix} 1 \\ x_1 \\ x_2 \end{pmatrix} + \mu \begin{pmatrix} 0 \\ l_1 \\ l_2 \end{pmatrix}$$
Here $l$ from (32) contains only regular points, which correspond exactly to the points defined by $\hat{l}$. A line through the origin in $\mathbb{R}^2$ is mapped under $T$ to
$$T l = \begin{pmatrix} d_1 l_1 \mu + d_2 l_2 \mu + 1 \\ \mu (a l_1 + b l_2) \\ \mu (c l_1 + d l_2) \end{pmatrix}$$
To calculate which lines in $\mathbb{R}^2$ embedded in $\mathbb{P}^2$ are invariant, we simply have to calculate the eigenvectors of $P$ (not of $T$! But note that if we project the eigenvectors of $T$ onto their second and third components, we get the eigenvectors of $P$ and the origin in $\mathbb{R}^2$). These eigenvectors $u_{+,-}$ are given by (note that we assumed in the beginning of this section that the matrix has no zero entry and that its determinant is also not zero!)
$$u_{+,-} = \left( \frac{a - d \pm \sqrt{\operatorname{tr}(P)^2 - 4 \det(P)}}{2c},\ 1 \right) = k \left( 1,\ -\frac{a - d \mp \sqrt{\operatorname{tr}(P)^2 - 4 \det(P)}}{2b} \right)$$
with $k \in \mathbb{R}$ and their corresponding eigenvalues
$$\lambda_{+,-} = \frac{1}{2} \left( \operatorname{tr}(P) \pm \sqrt{\operatorname{tr}(P)^2 - 4 \det(P)} \right)$$
chosen such that $|\lambda_+| \ge |\lambda_-|$. We are only interested in real eigenvalues and hence we assume for all following considerations
$$\operatorname{tr}(P)^2 - 4 \det(P) > 0$$
which means that one (and hence both) eigenvalues are real. Note that due to the block form of the matrix $T$ its eigenvalues are given by 1 and the eigenvalues of $P$. We also get from (35) that $P$ is similar to a diagonal matrix, and as a consequence $T$ is similar to (a matrix we will also call $T$)
$$T = \begin{pmatrix} 1 & d_1 & d_2 \\ 0 & \lambda_+ & 0 \\ 0 & 0 & \lambda_- \end{pmatrix}$$
It is easy to see (by induction) that
$$T^n x = \begin{pmatrix} x_1 + d_1 \sum_{j=0}^{n-1} \lambda_+^j x_2 + d_2 \sum_{j=0}^{n-1} \lambda_-^j x_3 \\ \lambda_+^n x_2 \\ \lambda_-^n x_3 \end{pmatrix} \sim \begin{pmatrix} 1 \\ \lambda_+^n x_2 / N \\ \lambda_-^n x_3 / N \end{pmatrix}$$
where $N = x_1 + d_1 \sum_{j=0}^{n-1} \lambda_+^j x_2 + d_2 \sum_{j=0}^{n-1} \lambda_-^j x_3$, and hence $T^n x \to (1, 0, 0)$ for $n \to \infty$ if $|\lambda_{+,-}| \le 1$. This means that if the absolute values of both eigenvalues of $P$ are smaller than or equal to 1, the iteration of the map $\Pi$ in (30) leads to convergence to the origin. If we use that
$$\lim_{n \to \infty} \lambda_+^{-n} T^n x = k u_+^*$$
where $u_+^*$ is the equivalent of $u_+$ in $\mathbb{P}^2$ (the regular point corresponding to $u_+$), $k \in \mathbb{R}$, and $x$ is an arbitrary vector not in $\operatorname{span}\{(1, 0, 0), u_-^*\}$ (this set corresponds to the eigenspace $E(\lambda_-)$ in $\mathbb{R}^2$), and that $|\lambda_+| > \max\{1, |\lambda_-|\}$ holds, we get that all vectors $x$ which are not eigenvectors converge to the eigenspace of the eigenvalue with greatest absolute value. (This can be seen easily by using the fact that $T$ is diagonalizable.) It follows from (35) that $T$ has three different eigenvalues and hence three different eigenvectors. These eigenvectors are regular points in $\mathbb{P}^2$ and therefore correspond to points in $\mathbb{R}^2$. Hence the map $\Pi$ has three fixed points if $\lambda_{+,-} > 0$ holds, and each line connecting two of these fixed points is invariant under the map $\Pi$. More precisely, one eigenvector of $T$ corresponds to the origin in $\mathbb{R}^2$ and hence two of these three invariant lines correspond to the eigenspaces $E(\lambda_+)$ and $E(\lambda_-)$ of $P$. The third invariant line connects the fixed points inside these eigenspaces and is generically disjoint from the origin. This third line corresponds to the set $V_0$ for the best response dynamics.
Proposition 12
Let $\det(P) > 0$ and $\lambda_+ > 1$. Then all $x \in \mathbb{P}^2$ (except the two other eigenvectors of $T$) converge to the fixed point in $E(\lambda_+)$ in a monotone way. If $0 < \lambda_+ \le 1$, all $x \in \mathbb{P}^2$ converge to the origin.
Proof 
The convergence properties follow from (38). From $\det(P) > 0$ (both eigenvalues are positive) it follows that $\det(T) > 0$; hence $T$ is orientation preserving and convergence is monotone.
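The two regimes of Proposition 12 can be observed by simply iterating the fractional linear map. The sketch below is illustrative only: the matrices `expanding` and `contracting`, the vector `dvec`, the starting point and the function name are hypothetical choices, with $\det(P) > 0$ in both cases.

```python
def iterate_return_map(P, dvec, u, n):
    """Apply u -> P u / (1 + d.u) n times to the point u = (u1, u2)."""
    for _ in range(n):
        den = 1.0 + dvec[0] * u[0] + dvec[1] * u[1]
        u = ((P[0][0] * u[0] + P[0][1] * u[1]) / den,
             (P[1][0] * u[0] + P[1][1] * u[1]) / den)
    return u

dvec = (1.0, 0.5)                        # hypothetical vector d
expanding = [[2.0, 1.0], [1.0, 1.0]]     # eigenvalues (3 +- sqrt(5))/2, so lambda_+ > 1
contracting = [[0.5, 0.2], [0.1, 0.3]]   # both eigenvalues lie in (0, 1)

print(iterate_return_map(expanding, dvec, (0.1, 0.1), 200))
print(iterate_return_map(contracting, dvec, (0.1, 0.1), 200))
```

With these values the first iteration should settle on a positive fixed point on the eigenline of $\lambda_+$, while the second should shrink towards the origin.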
Corollary 13
It is impossible for an attracting interior Nash equilibrium $N_{1234}$ and an interior Shapley polygon to exist simultaneously.
Proof 
An interior Shapley polygon only exists if the dominating eigenvalue $\lambda_+$ is greater than 1. Conversely, $N_{1234}$ is only attracting (with respect to the orbits that follow the 1234 cycle) if $\lambda_+$ is smaller than or equal to 1.

6.4. The Domain of the Return Map

Definition 5
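A strategy $y$ is called dominated if there exists $p \in \Delta_n$ such that
$$\sum_{i=1}^{n} p_i a_{ij} \ge \sum_{i=1}^{n} y_i a_{ij}$$
for all $j = 1, \dots, n$ and the inequality is strict for one $j$. If the inequality is strict for all $j$ then $y$ is called strictly dominated.
Finally we define the domain of the return map in the following way
$$\Pi(\mathbf{u}) = \frac{P \mathbf{u}}{1 + \mathbf{d} \cdot \mathbf{u}}, \qquad \Pi : D \to B_{14}$$
where
$$D = \left\{ \mathbf{u} \in B_{14} : \frac{P \mathbf{u}}{1 + \mathbf{d} \cdot \mathbf{u}} \in \overline{B_{14}} \right\}$$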
A crucial point is to show that D is not empty. To do this we use the proof of a theorem from [6] (page 90, Theorem 8.3.2). We directly get the following
Lemma 14
Let $x \in \operatorname{int} \Delta_n$. If $e_k$ is contained in the ω-limit $\omega(x(t))$ for (2) then $k$ cannot be a dominated strategy.
The lemma above shows that if the replicator equation has a heteroclinic cycle following the cycle 1234 which attracts an interior orbit, then no strategy can be dominated. Hence, whenever $P$ has a positive eigenvector, $D$ is not empty! This follows directly from Lemma 11. In [3] the conditions from Lemma 11 guarantee the existence of a relatively asymptotically stable heteroclinic cycle following the cycle 1234. Hence at least one interior orbit is attracted by this heteroclinic cycle.

6.5. Shapley Triangles in 4×4 Games

This section does not apply the return map method. We constructed the return map to analyse orbits that follow the full cycle. But as we are interested in games with an embedded RSP cycle, we have to find conditions under which these RSP cycles have a Shapley triangle. It turns out that a direct approach is easier than an approach via return maps for finding these conditions.
Definition 6
If the payoffs of all unused strategies along a Shapley polygon are negative, the Shapley polygon is called regular.
Proposition 15
Given a payoff matrix A as in (8) then
  • The 123 cycle has an asymptotically stable regular Shapley polygon iff all of the following conditions are satisfied
    (i) $t_3 > 0$, $t_1 < 0$,
    (ii) $e_1 e_2 t_3 < |t_1| c_2 c_3$ and
    (iii) $\Sigma_1 = -(c_1 e_2 t_3 + c_2 e_3 t_1 + t_1 t_2 t_3) < 0$.
  • The 234 cycle has an asymptotically stable regular Shapley polygon iff all of the following conditions are satisfied
    (i) $t_4 > 0$, $t_2 < 0$,
    (ii) $e_2 e_3 t_4 < |t_2| c_3 c_4$ and
    (iii) $\Sigma_2 = -(c_2 e_3 t_4 + c_3 e_4 t_2 + t_2 t_3 t_4) < 0$.
Proof 
We take the following payoff matrix
$$\tilde{A} = \begin{pmatrix} 0 & -b_2 & a_3 & d_1 \\ a_1 & 0 & -b_3 & d_2 \\ -b_1 & a_2 & 0 & d_3 \\ s_1 & s_2 & s_3 & 0 \end{pmatrix}$$
Restricted to the strategies 123, $\tilde{A}|_{\{123\}}$ is a rock-scissors-paper game, so we have simply added an arbitrary fourth strategy to a typical RSP game. It is well known [6] that a Shapley polygon exists for the RSP game iff
$$a_1 a_2 a_3 < b_1 b_2 b_3$$
holds. We will call the $s_i$ the transversal incentives, since they are the positive or negative incentives to leave the 123 face.
We denote the vertices of the Shapley polygon by $S_1$, $S_2$ and $S_3$. We need $(\tilde{A} S_i)_4 < 0$ to hold for $i = 1, 2, 3$, because 4 must not be the best reply to a vertex of the Shapley polygon: in that case the dynamics would leave the 123 face and enter the interior of $\Delta_4$. This leads to the following inequalities:
$$b_2 b_3 s_1 + a_1 a_3 s_2 + b_2 a_1 s_3 < 0$$
$$b_3 a_2 s_1 + b_1 b_3 s_2 + a_1 a_2 s_3 < 0 \quad \text{and}$$
$$a_2 a_3 s_1 + b_1 a_3 s_2 + b_1 b_2 s_3 < 0$$
Clearly one of the $s_i$ must be negative to fulfil these inequalities. We assume without loss of generality that $s_1 < 0$ holds. Note that these inequalities are trivially fulfilled if $s_i < 0$ holds for all $i = 1, 2, 3$.
  • Firstly assume that $s_3 > 0$ holds:
    Now suppose (44) holds. We get
    $$-a_2 s_1 > \frac{b_1 a_3 s_2 + b_1 b_2 s_3}{a_3}$$
    and hence we obtain
    $$-a_2 b_3 s_1 > b_3 \cdot \frac{b_1 a_3 s_2 + b_1 b_2 s_3}{a_3} = b_1 b_3 s_2 + b_1 b_2 b_3 \cdot \frac{s_3}{a_3} > b_1 b_3 s_2 + a_1 a_2 s_3$$
    Hence (45) also holds.
    We also have (together with (42))
    $$b_2 b_3 > \frac{a_1 a_2 a_3}{b_1} \quad \text{and} \quad -s_1 > \frac{b_1 a_3 s_2 + b_1 b_2 s_3}{a_2 a_3}$$
    so we get
    $$-b_2 b_3 s_1 > \frac{a_1 a_2 a_3}{b_1} \cdot \frac{b_1 a_3 s_2 + b_1 b_2 s_3}{a_2 a_3} = a_1 a_3 s_2 + a_1 b_2 s_3$$
    and (43) also holds.
  • Secondly assume that $s_2 > 0$ and $s_1, s_3 < 0$ hold.
    Suppose (44) holds. We have
    $$-s_2 > \frac{b_3 a_2 s_1 + a_1 a_2 s_3}{b_1 b_3}$$
    We get
    $$-b_1 a_3 s_2 > b_1 a_3 \cdot \frac{b_3 a_2 s_1 + a_1 a_2 s_3}{b_1 b_3} = a_2 a_3 s_1 + \frac{a_1 a_2 a_3 s_3}{b_3}$$
    Since $s_3 < 0$ is satisfied we have (see (42))
    $$\frac{a_1 a_2 a_3}{b_3} < b_1 b_2 \quad \text{and} \quad \frac{a_1 a_2 a_3 s_3}{b_3} > b_1 b_2 s_3$$
    and (44) holds.
    We also have
    $$-a_1 a_3 > -\frac{b_1 b_2 b_3}{a_2} \quad \text{and} \quad -s_2 > \frac{b_3 a_2 s_1 + a_1 a_2 s_3}{b_1 b_3}$$
    so we get
    $$-a_1 a_3 s_2 > \frac{b_1 b_2 b_3}{a_2} \cdot \frac{b_3 a_2 s_1 + a_1 a_2 s_3}{b_1 b_3} = b_2 b_3 s_1 + b_2 a_1 s_3$$
    and (43) holds. A simple relabelling of the payoffs proves the proposition.
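The three conditions of Proposition 15 are easy to test mechanically. The following Python sketch is only an illustration: the function name is hypothetical, the parameters are passed as 0-based sequences `e`, `c`, `t`, and the values attributed to Example 6 rest on the assumption that the entries of its payoff matrix are read as $e_i = a_{i+1,i}$, $t_i = a_{i+2,i}$ and $c_i = -a_{i-1,i}$ (indices cyclic), with the sign pattern of that matrix itself being a reconstruction.

```python
from fractions import Fraction as F

def stable_regular_shapley_triangle(e, c, t, i):
    """Conditions (i)-(iii) of Proposition 15 for the cycle i, i+1, i+2.

    e, c, t are length-4 sequences (0-based); i = 0 is the 123 cycle and
    i = 1 the 234 cycle.
    """
    j, k = (i + 1) % 4, (i + 2) % 4
    sigma = -(c[i] * e[j] * t[k] + c[j] * e[k] * t[i] + t[i] * t[j] * t[k])
    return (t[k] > 0 and t[i] < 0
            and e[i] * e[j] * t[k] < abs(t[i]) * c[j] * c[k]
            and sigma < 0)

# Hypothetical parameters for which the 123 cycle satisfies all conditions.
e_h, c_h, t_h = [1, 1, 1, 1], [4, 2, 2, 1], [-1, -1, 1, -1]
print(stable_regular_shapley_triangle(e_h, c_h, t_h, 0))

# Example 6 under the assumed reading of its payoff matrix.
e6 = [F(1), F(3), F(5), F(1, 16)]
c6 = [F(1, 7), F(1), F(8), F(1, 10)]
t6 = [F(-1), F(-2), F(1), F(1, 4)]
print(stable_regular_shapley_triangle(e6, c6, t6, 0),
      stable_regular_shapley_triangle(e6, c6, t6, 1))
```

Under this reading neither the 123 nor the 234 cycle of Example 6 satisfies all three conditions, which agrees with the observation that every orbit there reaches $N_{1234}$.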

7. A Complete Classification of the Attractors of Monocyclic 4×4 Matrix

This section splits up in two parts. The first part gives a complete classification of monocyclic payoff matrices and provides a connection of the global stability of the interior Nash equilibrium N 1234 for the best response dynamics and permanence for the replicator equation. The second part provides the very technical but necessary part on the properties of the payoff matrix.

7.1. Classification of the Asymptotical Behaviour

As shown in Section 6 the return map $\Pi$ is of the following form
$$\Pi(\mathbf{u}) = \frac{P \mathbf{u}}{1 + \mathbf{d} \cdot \mathbf{u}}, \qquad \Pi : B_{14} \to B_{14}$$
In the case of a monocyclic matrix we get $D = B_{14}$, as the transition matrices are nonnegative; as a consequence $P > 0$ (as a product of four nonnegative matrices) and $\mathbf{d} > 0$ (because of (24)) hold. Hence $\mathbb{R}^2_+$ is invariant under the map $\Pi$ and, as we have shown, every $x \in B_{14}$ returns to this set after one cycle if our matrix is monocyclic (see Lemma 7). The remaining question is whether there are any fixed points in $\mathbb{R}^2_+$ besides the origin $0$.
To calculate the fixed points of $\Pi$ we need to find the eigenvectors of the return matrix $P$. The Perron-Frobenius theorem states that $P$ has a dominating eigenvalue $\lambda_+$ with a positive eigenvector $u_+$ and that any other eigenvector $v$ which is not in the eigenspace of $\lambda_+$, symbolically $v \notin E(\lambda_+)$, is not positive. The remaining question is under which conditions $\lambda_+$ is greater or smaller than 1. The following proposition answers this question and provides a complete classification for monocyclic payoff matrices:
Proposition 16
For a monocyclic matrix (14) one of the following statements holds for the best response dynamics (1):
(i) $\lambda_+ < 1 \Leftrightarrow \det(A) < 0$ and $e \ge c$: All orbits converge in finite time to the unique interior Nash equilibrium $N_{1234}$.
(ii) $\lambda_+ = 1 \Leftrightarrow \det(A) = 0$ and $e > c$: All orbits converge (no longer in finite time) to the interior Nash equilibrium $N_{1234}$, which we will call a degenerated Shapley polygon.
(iii) $\lambda_+ > 1 \Leftrightarrow \det(A) > 0$ or $e < c$: Almost all orbits converge to the interior Shapley polygon.
Proof 
First note that the classification is complete because $\det(A) = 0$ and $e = c$ cannot hold simultaneously, as shown in Lemma 19 below. The statements about $\lambda_+$ follow directly from Lemma 11, because for a monocyclic matrix $\Sigma_{1,2} > 0$ is automatically fulfilled. Convergence to $N_{1234}$ or to an interior Shapley polygon follows from Proposition 12. The equivalence in (ii) follows from some easy calculations. In case of convergence to the interior Nash equilibrium $N_{1234}$ it remains to show that all orbits converge to it (not only those following the 1234 cycle). This follows from the uniqueness of the equilibrium by Lemma 18 below. Convergence in finite time in (i) is an application of Lemma 1: if convergence did not occur in finite time, $V(x(t))$ would go to zero, but $V(N_{1234}) > 0$ holds.
In [6] (p. 178) it is shown that a system is permanent under the replicator equation (2) iff we obtain an M-matrix from A by moving the top row of A to the bottom. The following corollary contains the statements from Theorem 3 and provides a connection between M-matrices, permanence and the interior Nash equilibrium N 1234 . It is a collection of statements in [6], Proposition 16 and Lemma 20 below.
Corollary 17
For a monocyclic matrix (14) the following statements are equivalent:
(i) For the best response dynamics (1) there exists no (degenerated) Shapley polygon.
(ii) For the best response dynamics all orbits converge in finite time to the interior Nash equilibrium $N_{1234}$.
(iii) There is an interior Nash equilibrium $N_{1234}$ and $N_{1234} \cdot A N_{1234} > 0$.
(iv) For the replicator equation (2) the system is permanent.
(v) $\det(A) < 0$ and $e \ge c$ hold.
(vi) The matrix $C$ obtained by moving the top row of $A$ to the bottom is an M-matrix.
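Conditions (v) and (vi) can be compared on concrete numbers. The sketch below is an illustration under stated assumptions: the layout of the payoff matrix in `payoff_matrix` ($a_{i+1,i} = e_i$, $a_{i+2,i} = t_i$, $a_{i-1,i} = -c_i$, indices cyclic) is inferred from the formulas of this section rather than quoted from (14), monocyclic is taken to mean $e_i, c_i > 0$ and $t_i < 0$, the M-matrix test uses the usual criterion of nonpositive off-diagonal entries and positive leading principal minors, and all function names and parameter values are hypothetical.

```python
from fractions import Fraction as F

def det(M):
    """Determinant by Laplace expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * a * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j, a in enumerate(M[0]))

def payoff_matrix(e, c, t):
    """Assumed layout: a[i+1][i] = e_i, a[i+2][i] = t_i, a[i-1][i] = -c_i."""
    A = [[F(0)] * 4 for _ in range(4)]
    for i in range(4):
        A[(i + 1) % 4][i] = e[i]
        A[(i + 2) % 4][i] = t[i]
        A[(i - 1) % 4][i] = -c[i]
    return A

def classify(e, c, t):
    """Case of Proposition 16 for monocyclic parameters."""
    dA = det(payoff_matrix(e, c, t))
    prod_e = e[0] * e[1] * e[2] * e[3]
    prod_c = c[0] * c[1] * c[2] * c[3]
    if dA > 0 or prod_e < prod_c:
        return "(iii)"
    return "(ii)" if dA == 0 else "(i)"

def is_m_matrix(C):
    """Nonpositive off-diagonal entries and positive leading principal minors."""
    n = len(C)
    off_ok = all(C[i][j] <= 0 for i in range(n) for j in range(n) if i != j)
    minors_ok = all(det([r[:k] for r in C[:k]]) > 0 for k in range(1, n + 1))
    return off_ok and minors_ok

e, c, t = [F(2)] * 4, [F(1)] * 4, [F(-1, 10)] * 4   # hypothetical values
A = payoff_matrix(e, c, t)
C = A[1:] + A[:1]   # move the top row of A to the bottom
print(classify(e, c, t), is_m_matrix(C))
```

For these values the determinant of $A$ is negative and $e > c$, so the game falls into case (i), and the row-shifted matrix passes the M-matrix test, as the corollary predicts.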
Remark 
As the examples in Section 4 show, in the other cases the existence and stability of Nash equilibria is not so easy to predict. There are games for which $\det(A) > 0$ holds but which do not have an interior Nash equilibrium. These examples also show that in case of $e > c$ neither the existence of Nash equilibria on the boundary nor the uniqueness of a Nash equilibrium is guaranteed.

7.2. General Remarks on the Payoff Matrix

We start with a result on the interior Nash equilibrium.
Lemma 18
Let $A$ be a generic monocyclic payoff matrix as in (14). Suppose $N_{1234}$ exists and $\det(A) \le 0$ and $e > c$ hold. Then $N_{1234}$ is the unique Nash equilibrium of the game.
Proof 
First we show that no Nash equilibrium on the 13 or 24 face can exist. We show this only for the 24 face as the proof for the 13 face is the same (modulo some permutations of the indices).
We start with
$$F_{24} = \left( 0,\ \frac{t_4}{t_2 + t_4},\ 0,\ \frac{t_2}{t_2 + t_4} \right)$$
and calculate the payoffs there. We have to check whether $(A F_{24})_1 < (A F_{24})_2$ and $(A F_{24})_3 < (A F_{24})_2$ can hold simultaneously. Therefore we have to check whether the following system of inequalities has a solution.
(i) $e_1 e_2 e_3 e_4 - c_1 c_2 c_3 c_4 > 0$
(ii) $\det(A) \le 0$
(iii) $e_4 t_2 - c_2 t_4 \ge t_2 t_4$ and
(iv) $e_2 t_4 - c_4 t_2 \ge t_2 t_4$
First note that if (47) holds then also
$$-c_1 c_2 c_3 c_4 + c_2 c_4 e_1 e_3 + c_1 c_3 e_2 e_4 - e_1 e_2 e_3 e_4 < 0$$
holds, as we only omit positive terms. Similarly, if (48) and (49) hold then
$$e_4 t_2 \ge c_2 t_4 \quad \text{and}$$
$$e_2 t_4 \ge c_4 t_2$$
also hold. Combining these inequalities leads to the following system
(i) $c_2 c_4 e_1 e_3 + c_1 c_3 e_2 e_4 < e_1 e_2 e_3 e_4 + c_1 c_2 c_3 c_4$
(ii) $c_2 c_4 \ge e_4 e_2$ and
(iii) $e_1 e_3 \ge c_1 c_3$
Now (53) can be transformed into
$$e_2 (c_1 c_3 e_4 - e_1 e_3 e_4) < c_1 c_2 c_3 c_4 - c_2 c_4 e_1 e_3$$
Using (55) for (56) we get
$$e_2 > \frac{c_2 c_4}{e_4}$$
which is a contradiction to (54). Hence $F_{24}$ cannot be a Nash equilibrium of the game. (Similarly it is impossible for $F_{13}$ to be a Nash equilibrium.)
The next step is to check all possible Nash equilibria $p$ with $\operatorname{card}(\operatorname{supp}(p)) = 3$. To do so we look at the restricted game 123. (The proof is the same for any other face.) We assume that this restricted game has an interior equilibrium $N_{123}$. Now we use the index theorem (see for example [6] page 159). Clearly the pure strategy 3 is a Nash equilibrium of the restricted game; if there is also an interior equilibrium $N_{123}$ there must be a third equilibrium $N_{13}$. We now check the indices. We get
$$i(e_3) = 1$$
$$i(N_{13}) = -1$$
and hence $i(N_{123}) = 1$ (as the indices must sum up to 1). Every Nash equilibrium with $\operatorname{card}(\operatorname{supp}(p)) = 3$ has index 1. As a consequence only three Nash equilibria can exist in the full game and the index of the interior Nash equilibrium must be $-1$. We will show that this leads to a contradiction. To do so we use that for a unique interior Nash equilibrium $N_{1234}$ the following holds (see [6] page 165)
$$i(N_{1234}) = (-1)^{n-1} \operatorname{sgn} \frac{\det(A)}{N_{1234} \cdot A N_{1234}} = (-1)^{n-1} \operatorname{sgn} \det(A_n)$$
where $A_n$ is the matrix we obtain if we replace the last column of $A$ with the vector whose entries are all 1. It turns out (see below) that in case of $e > c$, $\det(A_n)$ has the same sign as $\det(A)$ and hence we get $i(N_{1234}) = 1$, which is a contradiction.
$$\det(A_n) = c_1 c_2 c_3 - c_2 e_1 e_3 - e_1 t_2 t_3 + c_1 c_3 e_2 - e_1 e_2 e_3 + c_3 t_1 t_2 + c_2 e_3 t_1 + t_1 t_2 t_3 + c_1 e_2 t_3 + e_1 e_2 t_3 + c_2 c_3 t_1$$
Suppose now that $\det(A_n)$ is greater than zero. Then clearly the following also holds, as we have only omitted negative terms
$$c_1 c_2 c_3 - c_2 e_1 e_3 + c_1 c_3 e_2 - e_1 e_2 e_3 + c_3 t_1 t_2 > 0$$
Again we get a system of inequalities
(i) $-c_1 c_2 c_3 c_4 + c_1 c_3 e_2 e_4 - e_1 e_2 e_3 e_4 + c_3 e_4 t_1 t_2 + c_2 c_4 e_1 e_3 < 0$
(ii) $c_1 c_2 c_3 - c_2 e_1 e_3 + c_1 c_3 e_2 - e_1 e_2 e_3 + c_3 t_1 t_2 > 0$ and
(iii) $e_1 e_2 e_3 e_4 > c_1 c_2 c_3 c_4$
We have to show that this system has no solution. We can transform this system into
(i) $c_2 c_4 (e_1 e_3 - c_1 c_3) < e_4 (-c_1 c_3 e_2 + e_1 e_2 e_3 - c_3 t_1 t_2)$
(ii) $c_2 e_4 (c_1 c_3 - e_1 e_3) > e_4 (-c_1 c_3 e_2 + e_1 e_2 e_3 - c_3 t_1 t_2)$ and
(iii) $e_1 e_2 e_3 e_4 > c_1 c_2 c_3 c_4$
This system can be written as
(i) $c_2 c_4 (e_1 e_3 - c_1 c_3) < e_4 (-c_1 c_3 e_2 + e_1 e_2 e_3 - c_3 t_1 t_2) < c_2 e_4 (c_1 c_3 - e_1 e_3)$ and
(ii) $e_1 e_2 e_3 e_4 > c_1 c_2 c_3 c_4$
From (62) it follows that
$$(e_1 e_3 - c_1 c_3) < 0$$
If (62) and (63) hold then also
$$e_2 e_4 (c_1 c_3 - e_1 e_3) < c_2 c_4 (c_1 c_3 - e_1 e_3)$$
holds. We get
$$e_2 e_4 < c_2 c_4$$
but this is, together with (64), a contradiction to $e > c$. Therefore $\det(A)$ and $\det(A_n)$ cannot be of opposite sign.
Lemma 19
For a monocyclic payoff matrix $A$ as in (14), $\det(A) = 0$ and $e = c$ cannot hold simultaneously.
Proof
We have to solve this system of equations:
$$\begin{aligned} & -c_1c_2c_3c_4 + c_2c_4e_1e_3 + c_1c_3e_2e_4 + c_3e_4t_1t_2 + c_4e_1t_2t_3 \\ & \quad + c_2e_3t_1t_4 + c_1e_2t_3t_4 + t_1t_2t_3t_4 = e_1e_2e_3e_4 \end{aligned} \tag{67}$$
$$e_1e_2e_3e_4 = c_1c_2c_3c_4 \tag{68}$$
Obviously each term in (67) that contains a $t_i$ is positive, so it suffices to show that a system
$$-c_1c_2c_3c_4 + c_2c_4e_1e_3 + c_1c_3e_2e_4 - e_1e_2e_3e_4 + p = 0 \tag{69}$$
$$e_1e_2e_3e_4 = c_1c_2c_3c_4$$
with $p > 0$ has no solution. First note that from (69) it follows that $e_1e_3 \neq c_1c_3$ must hold. We assume now (without loss of generality) that $e_1e_3 > c_1c_3$ holds, hence $c_2c_4 > e_2e_4$ must also hold. If we write (69) in the following form
$$c_2c_4(e_1e_3 - c_1c_3) + e_2e_4(c_1c_3 - e_1e_3) + p = 0$$
it is easy to see that the left side is always greater than zero and hence the system has no solution. Therefore $\det(A) = 0$ and $e = c$ cannot hold simultaneously.
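The positivity of the left side can be made explicit by factoring. This is only a rearrangement of the expression in the proof and uses nothing beyond $p > 0$ and $e_1e_2e_3e_4 = c_1c_2c_3c_4$:

```latex
\begin{aligned}
c_2c_4(e_1e_3 - c_1c_3) + e_2e_4(c_1c_3 - e_1e_3) + p
  &= (c_2c_4 - e_2e_4)(e_1e_3 - c_1c_3) + p .
\end{aligned}
```

If $e_1e_3 > c_1c_3$, the equality of the two products forces $c_2c_4 > e_2e_4$, so both factors are positive. In the opposite case both factors are negative. Either way the product is positive and the left side exceeds $p > 0$.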
Proposition 20
For (14) the following statements are equivalent:
(i) 
$\det(A) < 0$ and $e \geq c$ hold.
(ii) 
The matrix $C$ obtained by moving the top row of $A$ (see (14)) to the bottom is an M-matrix.
Proof 
To prove this we will show that all leading principal minors are positive. Clearly
$$e_1 > 0, \quad e_1e_2 > 0 \quad \text{and}$$
$$\det(C) > 0 \tag{71}$$
hold. Thus we only have to check whether
$$-c_1c_3e_2 + e_1e_2e_3 - c_3t_1t_2 > 0$$
also holds. We assume that the converse is true and show that the resulting system of inequalities leads to a contradiction. We have
$$-c_1c_3e_2 + e_1e_2e_3 - c_3t_1t_2 < 0 \tag{72}$$
$$e_1e_2e_3e_4 - c_1c_2c_3c_4 \geq 0 \tag{73}$$
$$c_1c_2c_3c_4 + e_1e_2e_3e_4 > c_3e_4t_1t_2 + c_1c_3e_2e_4 + c_2c_4e_1e_3 \tag{74}$$
where (74) comes from (71) after omitting some negative terms on the left side. Multiplying (72) by $e_4$ and putting these inequalities together we get
$$c_1c_2c_3c_4 \leq e_1e_2e_3e_4 < e_4c_3(c_1e_2 + t_1t_2) < e_1e_2e_3e_4 + c_1c_2c_3c_4 - c_2c_4e_1e_3$$
and hence
$$0 \leq e_1e_2e_3e_4 - c_1c_2c_3c_4 < c_1c_3(e_2e_4 - c_2c_4) + c_3e_4t_1t_2 < e_1e_3(e_2e_4 - c_2c_4)$$
From this it follows that $e_1e_3 > c_1c_3$ and $e_2e_4 > c_2c_4$ hold and hence we can write
$$0 \leq e_1e_2e_3e_4 - e_1c_2e_3c_4 < c_1c_3(e_2e_4 - c_2c_4) + c_3e_4t_1t_2 < e_1e_3(e_2e_4 - c_2c_4)$$
which is a contradiction, since the leftmost and rightmost expressions are equal. The proof also works in the other direction: clearly, if $\det(C) > 0$ holds then $\det(A) < 0$ also holds, and if additionally all leading principal minors are positive then $e > c$ holds.
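As a concrete illustration of the proposition, the sketch below builds one numerical instance, rotates the top row of $A$ to the bottom, and lists the leading principal minors of the resulting matrix $C$. It relies on the standard characterization that a matrix with nonpositive off-diagonal entries and positive leading principal minors is a nonsingular M-matrix. The parameter values are hypothetical, the layout of matrix (14) is assumed (it is not shown in this section), and the helper names are illustrative only.

```python
from itertools import permutations


def det(m):
    """Exact determinant via the Leibniz permutation expansion (fine for 4x4)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        # Parity of the permutation, from its number of inversions.
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = -1 if inversions % 2 else 1
        for row, col in enumerate(perm):
            term *= m[row][col]
        total += term
    return total


def leading_principal_minors(m):
    """Determinants of the top-left k x k blocks, k = 1..n."""
    return [det([row[:k] for row in m[:k]]) for k in range(1, len(m) + 1)]


# Hypothetical parameter values (not from the paper): e_i = 3, c_i = 1, t_i = -1.
e1 = e2 = e3 = e4 = 3
c1 = c2 = c3 = c4 = 1
t1 = t2 = t3 = t4 = -1

# ASSUMED layout of the monocyclic payoff matrix (14).
A = [[0, -c2, t3, e4],
     [e1, 0, -c3, t4],
     [t1, e2, 0, -c4],
     [-c1, t2, e3, 0]]

# Move the top row of A to the bottom; an odd row permutation, so det(C) = -det(A).
C = A[1:] + A[:1]

minors = leading_principal_minors(C)
is_z_matrix = all(C[i][j] <= 0 for i in range(4) for j in range(4) if i != j)

print("det(A) =", det(A))
print("leading principal minors of C:", minors)
print("off-diagonal entries of C nonpositive:", is_z_matrix)
```

For these values $e = 81 \geq c = 1$, so statement (i) of the proposition applies whenever the printed $\det(A)$ is negative, and statement (ii) can be read off from the printed minors and the off-diagonal check.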

8. Proofs

8.1. Proof of Theorem 4

We split the proof into two parts. The first part concerns a certain property of the return map.
Lemma 21
With the assumptions from Theorem 4 we get
$$\Pi \begin{pmatrix} 0 \\ v \end{pmatrix} \in \mathbb{R}_+^2$$
for $v > 0$.
Proof
It holds that
$$\Pi \begin{pmatrix} 0 \\ v \end{pmatrix} = (1 + d_2 v) \begin{pmatrix} p_{12} v \\ p_{22} v \end{pmatrix}$$
and hence we have to show that, under the assumptions made, $d_2$, $p_{12}$ and $p_{22}$ are greater than 0. That $p_{22} > 0$ holds can be seen by using the fact that $\operatorname{tr}(P) > 2 \det P > 0$ holds. From
$$p_{22} = \frac{c_2c_4e_3 + c_4t_2t_3}{e_2e_3e_4} > 0 \;\Leftrightarrow\; c_2c_4e_3 + c_4t_2t_3 > 0 \;\Leftrightarrow\; c_2e_3 + t_2t_3 > 0$$
it now follows that
$$d_2 = \frac{1}{e_2^2} + \frac{c_2}{e_2e_4^2} - \frac{t_2}{e_2e_3^2} + \frac{t_2t_3}{e_2e_3e_4^2} > 0 \;\Leftrightarrow\; e_3^2e_4^2 + c_2e_2e_3^2 - e_2e_4^2t_2 > -e_2e_3t_2t_3$$
holds. Lastly, note that $p_{21} = e_1 \Sigma_2$ and that $\Sigma_2 > 0$ holds under the assumptions made (Mathematica can be used to avoid some lengthy calculations here; the Mathematica file is provided as a supplementary file called additionalmaterial.nb). This proves the lemma.
Proof of Theorem 4
It follows from Lemma 11 that the conditions guarantee the existence of a positive eigenvalue $\lambda_+$ and a corresponding positive eigenvector. It follows from Lemma 14 that no strategy is dominated. Hence each strategy is used and the domain of the return map is not empty. It follows from Lemma 21 that an invariant set in $\mathbb{R}_+^2$ exists, which is at least given by the set that lies between $u_+$ and $(0, v)$.

8.2. Proofs of Theorems 5 and 6

Proof of Theorem 5
The first part is from Lemma 15 and assures the existence of an attracting Shapley polygon on the 123 face. For the asymptotic stability of the Nash equilibrium it only remains to show that there are no other Nash equilibria in the game. This can be seen by using the index theorem [6]. If there are only regular Nash equilibria, there must be a Nash equilibrium with index $-1$ for more than one Nash equilibrium to exist. With the help of Mathematica we find that $N_{24}$ (which has index $-1$) never exists if $\Sigma_1 > 0$, $\Sigma_2 > 0$ and $e_1e_2t_3 < |t_1| c_2c_3$ hold. Again we use Mathematica to show that the index of the interior Nash equilibrium $N_{1234}$ is $+1$ (with the formula from (60)), so it must be the unique Nash equilibrium and hence it is asymptotically stable. Note that no other Shapley polygon can exist. The Mathematica file is provided as a supplementary file called additionalmaterial.nb.
Proof of Theorem 6
This follows directly from the proofs of Theorem 5 and Theorem 4.

9. Conclusions

In this paper we presented some results on the existence and stability of Shapley polygons. It turns out that it is possible to give a complete classification of the attractors for monocyclic matrices, whereas, as the examples in Section 4 show, it seems hardly possible to give a classification for general matrices. It should also be mentioned that the supports of Nash equilibria, Shapley polygons and best response cycles may be disjoint, making it even harder to predict the asymptotic behaviour. As shown in [2], there is a strong connection between the replicator equation and the best response dynamics. This connection is especially stressed by the fact that we were able to construct a return map with a return matrix identical to the one found in the analysis of the replicator equation. The return map found in this work also turned out to deliver results on the interior Nash equilibrium. From this point of view it might be interesting to take a closer look at the replicator equation and examine the orbits whose time averages converge to the saddle-type Shapley polygon.

Acknowledgements

I thank Josef Hofbauer, Bernhard von Stengel, Ulrich Berger and my reviewers for their useful hints and suggestions. All errors remain my own.

References

  1. Gaunersdorfer, A.; Hofbauer, J. Fictitious Play, Shapley Polygons and the Replicator Equation. Game. Econ. Behav. 1995, 11, 279–303.
  2. Hofbauer, J.; Sorin, S.; Viossat, Y. Time average replicator and best reply dynamics. Math. Oper. Res. 2009, 34, 406–429.
  3. Brannath, W. Heteroclinic networks on the tetrahedron. Nonlinearity 1994, 7, 1367–1384.
  4. Kirk, V.; Silber, M. A competition between heteroclinic cycles. Nonlinearity 1994, 7, 1605–1621.
  5. Matsui, A. Best response dynamics and socially stable strategies. J. Econ. Theor. 1992, 57, 343–362.
  6. Hofbauer, J.; Sigmund, K. Evolutionary Games and Population Dynamics; Cambridge University Press: Cambridge, UK, 1998.
  7. Hofbauer, J. Stability for the best response dynamics. Working paper. 1995.
  8. Bang-Jensen, J.; Gutin, G. Digraphs: Theory, Algorithms and Applications, 2nd ed.; Springer-Verlag: London, UK, 2008.
  9. Filippov, A.F. Differential Equations with Discontinuous Righthand Sides; Springer: Dordrecht, The Netherlands, 1988; Volume 1.
  10. Benaïm, M.; Hofbauer, J.; Hopkins, E. Learning in games with unstable equilibria. Working paper. 2006.
  11. Cederberg, J.N. A Course in Modern Geometry; Springer Verlag: Berlin, Germany, 1989.
  • 1. See [3] for a precise definition of relatively asymptotically stable. In this context it means that the heteroclinic cycle $\Gamma$ attracts an open set $U$; $\Gamma$ is possibly not contained in $U$, but $\Gamma$ is contained in the closure of $U$.
  • 2. We follow here the terminology from graph theory, see for example [8].
  • 3. These eigenvalues are calculated with the help of the Jacobian at the rest point. It turns out that $a_{ij} - a_{ii}$ is the eigenvalue at $e_i$ in direction $e_j$, see [6].
  • 4. Forced movement occurs quite often in the field of differential inclusions, but we do not want to deal with it for this class of payoff matrices, see for example [9]. An example of forced movement also occurs in Example 1, where the orbit is forced to continue in a certain direction. This direction is the result of two adjacent vector fields.

Hahn, M. Shapley Polygons in 4 x 4 Games. Games 2010, 1, 189-220. https://doi.org/10.3390/g1030189
