Characterizing Manipulation via Machiavellianism

Sanchez-Rabaza, Jacqueline; Rocha-Martinez, Jose Maria; Clempner, Julio B.

doi:10.3390/math11194143


by Jacqueline Sanchez-Rabaza, Jose Maria Rocha-Martinez * and Julio B. Clempner

Escuela Superior de Física y Matemáticas (School of Physics and Mathematics), Instituto Politécnico Nacional (National Polytechnic Institute), Edificio 9 U.P. Adolfo Lopez Mateos, Col. San Pedro Zacatenco, Mexico City 07730, Mexico

* Author to whom correspondence should be addressed.

Mathematics 2023, 11(19), 4143; https://doi.org/10.3390/math11194143

Submission received: 25 April 2023 / Revised: 4 June 2023 / Accepted: 13 June 2023 / Published: 30 September 2023

(This article belongs to the Topic Game Theory and Applications)


Abstract:

Machiavellianism refers to the propensity to take advantage of other people within a society. Machiavellians have reputations for being cunning and competitive, and they are skilled long-term strategists and planners. For them, the only successful outcome is their own “victory.” The belief component of Machiavellianism includes cynical views of human nature (e.g., manipulated and manipulating individuals), interpersonal exploitation as a technique (e.g., strategic thinking), and a lack of traditional morality that would forbid their behaviors (e.g., immoral behaviors). This paper focuses on a game that involves manipulation. As part of our contribution, the game is conceptualized using the best and worst Nash equilibrium points. We restrict the problem to homogeneous, finite, ergodic, and controllable Bayesian–Markov games. Machiavellian players pretend to be in one state when they are actually in another. Moreover, they pretend to perform one action while actually playing another. All Machiavellian individuals engage in some form of interpersonal manipulation. Manipulating players exhibit a higher preference compared to manipulated participants. The Pareto frontier is defined as the line where manipulating players play the best Nash equilibrium and manipulated players play the worst Nash equilibrium. We also consider a sequential Bayesian–Markov manipulation game involving multiple manipulating players and manipulated players. Finally, a tractable characterization of the manipulation equilibrium is provided. To guarantee that the game’s solution converges to a unique solution, we use Tikhonov’s penalty regularization method. A numerical example illustrates the results of our model.

Keywords:

Machiavellianism; manipulation; Markov chain; game theory

MSC:

91A10; 91A40; 91A80; 91E30; 62C10

1. Introduction

1.1. Brief Review

The advantages and disadvantages of adding more believable behavioral assumptions to a traditional economic model have been extensively discussed in the literature. The focus on manipulation is one of the most obvious issues. Game-theoretic studies have discovered considerable individual variances in situations where the game allows for unstable behavior, such as manipulation. In this paper, we concentrate on the literature on decision-making and personality linked to Machiavellianism, a personality trait connected to manipulativeness, callousness, and indifference to morality in the study of personality psychology [1].

Christie and Geis [2] employed modified and abridged remarks derived from Niccolo Machiavelli’s works to explore variances in human behavior. Niccolo Machiavelli, in his conception of the world, classifies individuals as manipulated and manipulating. Machiavellian individuals often exploit others, hold pessimistic views on human nature, and lack the moral conscience that would make them accountable for their deeds. Machiavellianism captures the lack of coordination in games where individuals are selfish and may have conflicted interests.

The Mach IV exam that Christie and Geis created is now the accepted self-evaluation instrument and scale used for the Machiavellianism concept [3,4]. High-Mach individuals are more likely to exhibit dishonest behavior and possess a cynical, unfeeling disposition compared to low-Mach individuals. Christie established that a Machiavellian individual would have the following qualities: (a) Their worldview is determined by manipulators and manipulated individuals (i.e., views component), where manipulators do not feel sympathy for their prey, and manipulated individuals also manipulate to some degree. (b) Machiavellians have a lack of regard for conventional morality (e.g., morality component) and do not care about actions, such as lying and cheating. (c) Machiavellians frequently place more emphasis on pragmatic problem-solving than on ideological commitments and they are more likely to employ power strategies (e.g., tactical component) to further their personal goals rather than ideological ones.

The Nash equilibrium is the classical concept used to characterize the solution of a non-cooperative game involving two or more players [5]. In a Nash equilibrium, each player is aware of the equilibrium strategies of the other players, and no player would benefit from unilaterally changing their own strategy. The concept dates back to Cournot, who in 1838 used it to explain how rival enterprises determine their output levels [6]. The current strategy set represents an action plan based on the events that have transpired thus far in the game, and no player can improve their personal expected payoff by altering their strategy while the other players keep theirs unchanged. In optimization problems (with one player), the scenario is the same: if the cost function is strongly convex and we are at the minimal point, then any movement away from this point results in a strictly larger cost. From this viewpoint, the Nash equilibrium generalizes the conventional notion of an optimum from situations involving a single participant. We roughly conceptualize Machiavellianism as a welfare system of the worst-case Nash equilibrium (manipulated individuals) and the non-cooperative optimal welfare system (manipulating individuals).
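The unilateral-deviation condition described above can be checked mechanically in a small finite game. The following is a minimal sketch, assuming a two-player game in pure strategies; the function name `pure_nash_equilibria` and the prisoner's-dilemma payoffs are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def pure_nash_equilibria(payoff_row, payoff_col):
    """Return all (i, j) pairs where neither player gains by deviating alone."""
    equilibria = []
    n_rows, n_cols = payoff_row.shape
    for i in range(n_rows):
        for j in range(n_cols):
            # Row player: action i must be a best reply to column j.
            row_ok = payoff_row[i, j] >= payoff_row[:, j].max()
            # Column player: action j must be a best reply to row i.
            col_ok = payoff_col[i, j] >= payoff_col[i, :].max()
            if row_ok and col_ok:
                equilibria.append((i, j))
    return equilibria

# Hypothetical prisoner's-dilemma payoffs (0 = cooperate, 1 = defect).
A = np.array([[3, 0], [5, 1]])   # row player's payoffs
B = np.array([[3, 5], [0, 1]])   # column player's payoffs
print(pure_nash_equilibria(A, B))
```

In this toy game the only profile that survives the check is mutual defection, even though mutual cooperation pays both players more, which is the kind of gap between equilibrium and welfare that the best/worst equilibrium view exploits.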

1.2. Related Work

Allen and Gorton [7] addressed the idea of a manipulation equilibrium while considering stock price trends and welfare concerns. Stock price manipulation decreases the pre-bid stock price, as demonstrated by Bagnoli and Lipman [8]. They showed that manipulation drives takeover offers and blocks certain effective takeovers when there is limited takeover activity. Clempner [9] provided a game theory method for simulating manipulation that makes use of a reinforcement learning method to incorporate the idea of immorality. Cumming et al. [10] found evidence of the harmful consequences of market manipulation on innovation by using a sample of suspected stock price manipulation incidents based on intra-day data for equities from nine nations over eight years. They demonstrated that these consequences are more detrimental to innovation in economies with weak intellectual property protections and strong shareholder protections. Clempner and Trejo [11] presented a method based on Nash’s bargaining model for modeling manipulation games. Clempner [12] applied the previous approach to repeated Stackelberg security games. Clempner [13] suggested a manipulation game based on a class of ergodic Bayesian–Markov models.

The traditional Bayesian persuasion framework involves one sender and one receiver. The sender, who has access to certain private information, creates a signaling strategy in an attempt to influence the receiver’s decision. For models with many senders, see the works by Milgrom [14] and Krishna [15]. Private signal-based persuasion has been examined in a variety of contexts, including two-agent, two-action games [16], and unanimity elections [17]. For a single sender and receiver model, Kamenica and Gentzkow [18] introduced a Bayesian persuasion framework. They demonstrated the necessary and sufficient criteria for the existence of a signal that helps the sender and described the optimum signals. According to Bergemann and Morris [19], the designer of a fixed mechanism in Bayesian persuasion chooses an information structure for the participants in order to advance their objectives. Gentzkow and Kamenica [20] extended their earlier work [18] by incorporating multiple senders in scenarios where the set of potential signals was extensive, taking into account a lattice structure that enabled intuitive signal comparison and combining. In order to influence a decision-maker’s choice, Brocas et al. [21] proposed a framework where two adversaries invest resources in gathering information from the general public regarding an unknown condition of the world. They described the opponents’ sampling tactics in the game’s equilibrium and demonstrated that they change when one adversary’s cost of information acquisition rises. Gul and Pesendorfer [22] presented a model that describes political campaigns involving asymmetric information and signaling about a binary state of the world. The model includes two senders with conflicting objectives. Several authors, such as [23,24,25,26], considered frameworks in which the information revealed by the sender was only viewed by the recipients.

1.3. Main Contribution

This paper describes a manipulation game. Our contribution consists of conceptualizing the game by employing the best and the worst Nash equilibrium points. We constrained the problem to an assortment of homogeneous, finite, ergodic, and controllable Bayesian–Markov games, where:

Machiavellian players pretended to be in one state when they were actually in another.
Machiavellian players pretended to perform one action while actually playing another.
All Machiavellian individuals engaged in some form of interpersonal manipulation.
Manipulating players exhibited a higher preference compared to manipulated participants.
The Pareto frontier is characterized as the boundary where the manipulating players play the best Nash equilibrium and the manipulated players play the worst Nash equilibrium.
We consider a sequential Bayesian–Markov manipulation game involving multiple manipulating players and manipulated players.
A tractable characterization of the manipulation equilibrium is provided.
To guarantee that the game’s solution converges to a unique solution, we use Tikhonov’s penalty regularization method.

1.4. Organization of the Paper

The structure of the paper is as follows. Section 2 describes the manipulation game while Section 3 considers the manipulation equilibrium. In Section 4, considering the Tanaka function, the manipulation equilibrium is computed. Section 5 considers a Bayesian–Markov approach of the model, where the moral hazard is developed. A numerical example is presented in Section 6. Section 7 concludes with some remarks.

2. Manipulation Game

Let $G$ be a non-cooperative game denoted as $G = (I, S, u^{l}, u^{f})$. Here, $I = \{1, \dots, n + m\}$ is the set of all Machiavellian players, where $L = \{1, \dots, n\}$ is the set of manipulating players indexed by $l = \overline{1, n}$ and $F = \{1, \dots, m\}$ is the set of manipulated players indexed by $f = \overline{1, m}$. $S = S^{l} \times S^{f}$ is the strategy set, such that $S^{l}$ is the strategy set for the manipulating players ($l \in L$) and $S^{f}$ is the strategy set for the manipulated players ($f \in F$). An element $s^{l} \in S^{l}$ is a strategy for player $l \in L$, and an element $s^{f} \in S^{f}$ is a strategy for player $f \in F$. In general, for the index set $I$, $s^{-i} = (s^{j})_{j \in I \setminus \{i\}}$ denotes the joint strategy profile of all players except player $i$, i.e., $(s^{1}, \dots, s^{i - 1}, s^{i + 1}, \dots, s^{n + m})$.

A strategy profile $s = (s^{1}, \dots, s^{n}, s^{n + 1}, \dots, s^{n + m})$ is a vector of strategies, one for each player, and the set of all possible strategy profiles is denoted by $S = S^{1} \times \dots \times S^{n} \times S^{n + 1} \times \dots \times S^{n + m}$, i.e., $S = \times_{k = 1}^{n + m} S^{k}$. The admissible strategy set for the manipulating players, with $s^{l} \in S_{\mathrm{adm}}^{l}$, is

$$S_{\mathrm{adm}}^{l} := \Big\{ s^{l} : \sum_{l \in L} g^{l}(s^{l}) = 0, \ \sum_{l \in L} h^{l}(s^{l}) \leq 0 \Big\},$$

and the admissible strategy set for the manipulated players, with $s^{f} \in S_{\mathrm{adm}}^{f}$, is

$$S_{\mathrm{adm}}^{f} := \Big\{ s^{f} : \sum_{f \in F} g^{f}(s^{f}) = 0, \ \sum_{f \in F} h^{f}(s^{f}) \leq 0 \Big\},$$

such that $S_{\mathrm{adm}} = S_{\mathrm{adm}}^{l} \times S_{\mathrm{adm}}^{f}$. The reward of a manipulating player $l \in L$ for a strategy profile $s = (s^{1}, \dots, s^{n}, s^{n + 1}, \dots, s^{n + m})$ is determined by a utility function $u^{l} : S_{\mathrm{adm}} \to \mathbb{R}$, and the reward of a manipulated player $f \in F$ for the same strategy profile is determined by a utility function $u^{f} : S_{\mathrm{adm}} \to \mathbb{R}$, with the admissible strategy space of the manipulating players being $S_{\mathrm{adm}}^{l}$ and the admissible strategy space of the manipulated players being $S_{\mathrm{adm}}^{f}$.

The designations of the manipulating players and the manipulated players indicate the sequential order of play between two types of Machiavellian individuals, meaning that the manipulating players play first and the manipulated players play next. The manipulating and manipulated players are involved in a non-cooperative game, which we will refer to as a manipulation or Machiavellian game.

3. Manipulation Equilibrium

Each Machiavellian player must solve an optimization problem, where the admissible set is constrained by convex constraints based on the variables of the other Machiavellian players, as well as integrity constraints with the objective function. The objective function of the optimization problem depends on the variables of the other players.

We assume that the manipulating players obtain a reward $u^{l}$ and the manipulated players earn a reward $u^{f}$; moreover, we assume that these rewards are continuous and convex in all their arguments. The manipulating players attempt to solve the optimization problem

$$\max_{s^{l} \in S^{l}} \Big\{ u^{l}(s^{l}, s^{f}) \ \Big| \ s^{f} \in \arg\max_{s^{\prime f} \in S^{f}} u^{f}(s^{\prime f}, s^{l}) \Big\},$$

while the manipulated players attempt to solve the optimization problem

$$\max_{s^{f} \in S^{f}} u^{f}(s^{f}, s^{l}).$$

The Machiavellian players follow myopic strategies that move in the direction of improvement with regard to the two optimization problems mentioned above: the former for the manipulating players and the latter for the manipulated players.

Let us first describe the equilibrium concept investigated for simultaneous-play games and contrast it with that investigated in the sequential counterpart. A strategy profile $s^{*}(\lambda) \in S_{\mathrm{adm}}$ of the Machiavellian players that satisfies the following inequality is known as a Nash equilibrium for the Machiavellian players:

$$\left. \begin{matrix} F(s, \lambda) := \sum_{i \in I} \lambda^{i} \left[ u^{i}(s^{i}, s^{-i}) - u^{i}(s^{*}(\lambda)) \right] \leq 0, \\ \lambda \in \Lambda = \Big\{ \lambda \in \mathbb{R}^{n} : \lambda^{i} \geq 0, \ \sum_{i \in I} \lambda^{i} = 1 \Big\}. \end{matrix} \right\} \tag{1}$$

If each Machiavellian player has chosen a strategy and no player can benefit from changing strategies while the others keep theirs unchanged, the present set of strategic choices and payoffs is known as a Nash equilibrium. In other words, a joint strategy $s^{*} \in S_{\mathrm{adm}}$ is a Nash equilibrium point if, for any admissible $s^{i} \in S_{\mathrm{adm}}^{i}$, we have

$$u^{i}(s^{i}, s^{-i*}) \leq u^{i}(s^{i*}, s^{-i*}).$$

It is important to note that, considering the uniform distribution $\lambda^{i} = n^{-1}$, we obtain the original definition of the Nash equilibrium [5]. We assume that the functions $u^{i}(s)$ $(i \in I)$ are multilinear in all of their arguments [27,28].

The condition in Equation (1) implies the Nash property:

$$\lambda^{i} \left[ u^{i}(s^{i}, s^{-i}) - u^{i}(s^{*}(\lambda)) \right] \leq 0 \quad \text{for any } s \in S_{\mathrm{adm}} \text{ and } i \in I.$$

Any equilibrium point $s^{*}(\lambda) \in S_{\mathrm{adm}}$ belongs to the Pareto set [29,30]

$$P := \Big\{ u^{i}(s^{*}(\lambda)) : s^{*}(\lambda) \in \underset{s \in S_{\mathrm{adm}}}{\operatorname{Arg\,max}} \sum_{i \in I} \lambda^{i} u^{i}(s) \Big\}. \tag{2}$$

Following Tanaka [31,32], a Nash equilibrium $s^{*}(\lambda)$ that satisfies the condition in Equation (1) is

$$\begin{matrix} s^{*}(\lambda) \in \underset{s \in S_{\mathrm{adm}}}{\operatorname{Arg\,min}} \{ -F(s, \lambda) \} = \underset{s \in S_{\mathrm{adm}}}{\operatorname{Arg\,min}} \sum_{i \in I} \lambda^{i} \left[ u^{i}(s^{*}(\lambda)) - u^{i}(s) \right] = \\ \underset{s \in S_{\mathrm{adm}}}{\operatorname{Arg\,min}} \sum_{i \in I} \lambda^{i} \left[ -u^{i}(s) \right] = \underset{s \in S_{\mathrm{adm}}}{\operatorname{Arg\,max}} \sum_{i \in I} \lambda^{i} u^{i}(s), \end{matrix}$$

which determines the Pareto set given in Equation (2).

Remark 1.

It is important to note that by choosing different $\lambda \in \Lambda$, we obtain all feasible Nash equilibria in the proposed game.
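The role of the weights $\lambda$ in Equation (2) and Remark 1 can be illustrated by sweeping $\lambda$ over a grid and collecting the maximizers of the weighted sum of utilities. The sketch below assumes a finite set of joint strategy profiles with hypothetical utility values; the function name `weighted_sum_maximizers` and the data are illustrative assumptions, not part of the paper.

```python
import numpy as np

def weighted_sum_maximizers(payoffs, weights):
    """payoffs has shape (num_profiles, num_players).
    Returns the indices of profiles maximizing sum_i weights[i] * u_i(s)."""
    scores = payoffs @ weights
    return {int(k) for k in np.flatnonzero(np.isclose(scores, scores.max()))}

# Hypothetical utilities of four joint strategy profiles for two players.
U = np.array([[4.0, 1.0],
              [3.0, 3.0],
              [1.0, 4.0],
              [1.0, 1.0]])   # the last profile is dominated by profile 1

pareto = set()
for lam in np.linspace(0.0, 1.0, 11):
    # lam is the weight of player 0; the weights lie on the unit simplex.
    pareto |= weighted_sum_maximizers(U, np.array([lam, 1.0 - lam]))
print(sorted(pareto))
```

Different weights select different profiles, and the dominated profile is never selected, which mirrors the statement that varying $\lambda$ traces out the set of candidate equilibria.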

Let us introduce the individual reward for the manipulating players

$$\varphi^{l}(s^{l}, s^{f}, \lambda^{l}) = \hat{\lambda}^{l} u^{l}(\hat{s}^{l}, \hat{s}^{-l}, s^{f})$$

for any admissible $s^{l} = (\hat{s}^{l}, \hat{s}^{-l}) \in S_{\mathrm{adm}}^{l}$, such that

$$\varphi(s^{l}, s^{f}, \lambda^{l}) = \sum_{l \in L} \varphi^{l}(s^{l}, s^{f}, \lambda^{l}) = \sum_{l \in L} \hat{\lambda}^{l} u^{l}(\hat{s}^{l}, \hat{s}^{-l}, s^{f}) \to \max_{s^{l} \in S_{\mathrm{adm}}^{l}},$$

and the individual reward for the manipulated players

$$\psi^{f}(s^{f}, s^{l}, \lambda^{f}) = \hat{\lambda}^{f} u^{f}(\hat{s}^{f}, \hat{s}^{-f}, s^{l})$$

for any admissible $s^{f} = (\hat{s}^{f}, \hat{s}^{-f}) \in S_{\mathrm{adm}}^{f}$, where

$$\psi(s^{f}, s^{l}, \lambda^{f}) = \sum_{f \in F} \psi^{f}(s^{f}, s^{l}, \lambda^{f}) = \sum_{f \in F} \hat{\lambda}^{f} u^{f}(\hat{s}^{f}, \hat{s}^{-f}, s^{l}) \to \max_{s^{f} \in S_{\mathrm{adm}}^{f}}.$$

The welfare of a strategy profile $s$ for the manipulated players is defined as the minimum reward of a Machiavellian player, $\min_{\lambda^{f} \in \Lambda} \psi(s^{f}, s^{l}, \lambda^{f})$, or the worst case among all the Nash equilibria. On the other hand, the optimal welfare for the manipulating players is the maximum reward of a Machiavellian player, $\max_{\lambda^{l} \in \Lambda} \varphi(s^{l}, s^{f}, \lambda^{l})$, or the best case among all the Nash equilibria.

In a manipulation game, a strategy $s^{l*} \in S_{\mathrm{adm}}^{l}$ of the manipulating players is called a manipulation (Machiavellian) equilibrium if

$$\max_{\lambda^{l} \in \Lambda} \sup_{s^{f} \in N(s^{l})} \varphi(s^{l}, s^{f}, \lambda^{l}) \leq \max_{\lambda^{l} \in \Lambda} \sup_{s^{f} \in N(s^{l*})} \varphi(s^{l*}, s^{f}, \lambda^{l}),$$

where

$$N(s^{l}) = \Big\{ s^{f} \in S_{\mathrm{adm}}^{f} : \psi(s^{\prime f}, s^{l}, \lambda^{f}) \leq \psi(s^{f}, s^{l}, \lambda^{f}) \ \ \forall \, s^{\prime f} \in S_{\mathrm{adm}}^{f} \Big\}$$

is the set of Nash equilibria of the manipulated players given that the manipulating players play strategy $s^{l}$; that is, the manipulated players’ best reply is the set of Nash equilibria with $s^{l}$ fixed.
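The sup over the best-reply set $N(s^{l})$ can be made concrete in a finite setting. The following is a simplified sketch, assuming one manipulating player and one manipulated player with pure actions and no weights $\lambda$; the function names and payoff matrices are hypothetical and only illustrate the "leader evaluates the best element of the follower's reply set" logic.

```python
import numpy as np

def best_reply_set(u_follower, leader_action):
    """N(s^l): follower actions that maximize the follower's utility
    when the leader's action is held fixed."""
    row = u_follower[leader_action]
    return np.flatnonzero(np.isclose(row, row.max()))

def optimistic_leader_choice(u_leader, u_follower):
    """Leader action maximizing the sup over N(s^l) of the leader's utility."""
    values = []
    for a in range(u_leader.shape[0]):
        replies = best_reply_set(u_follower, a)
        values.append(u_leader[a, replies].max())  # sup over the best-reply set
    best = int(np.argmax(values))
    return best, float(values[best])

# Hypothetical payoffs: rows are the manipulating player's actions,
# columns are the manipulated player's actions.
U_l = np.array([[2.0, 4.0], [1.0, 3.0]])
U_f = np.array([[1.0, 1.0], [0.0, 2.0]])
print(optimistic_leader_choice(U_l, U_f))
```

When the follower is indifferent between replies (first row of `U_f`), the sup selects the reply most favorable to the manipulating player, which is why the definition is stated with a supremum rather than a single best reply.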

4. Computing the Manipulation Equilibrium

4.1. The Tanaka Function

We suggest the following representation for the manipulation game:

$$\begin{matrix} \max_{\lambda^{l} \in \Lambda} \sup_{s^{f} \in N(s^{l})} \varphi(s^{l}, s^{f}, \lambda^{l}) \\ \text{s.t.} \\ g^{l}(s^{l}) = 0, \ h^{l}(s^{l}) \leq 0 \\ \min_{\lambda^{f} \in \Lambda} \sup_{s^{l} \in N(s^{f})} \psi(s^{f}, s^{l}, \lambda^{f}) \\ \text{s.t.} \\ g^{f}(s^{f}) = 0, \ h^{f}(s^{f}) \leq 0 \end{matrix}$$

The complementary variables are held fixed; for instance, when computing $\max_{\lambda^{l} \in \Lambda} \sup_{s^{f} \in N(s^{l})} \varphi(s^{l}, s^{f}, \lambda^{l})$, the variable $s^{f}$ is fixed.

Let us define the regularized welfare functions as follows:

$$\varphi_{\delta}(s^{l}, s^{f}, \lambda^{l}) = \varphi(s^{l}, s^{f}, \lambda^{l}) - \frac{\delta}{2} \|s^{l}\|^{2} + \frac{\delta}{2} \|\lambda^{l}\|^{2}, \quad \delta > 0, \tag{3}$$

and

$$\psi_{\delta}(s^{f}, s^{l}, \lambda^{f}) = \psi(s^{f}, s^{l}, \lambda^{f}) - \frac{\delta}{2} \|s^{f}\|^{2} - \frac{\delta}{2} \|\lambda^{f}\|^{2}, \quad \delta > 0. \tag{4}$$

To avoid changing the form of the original functions, the regularization method also requires choosing the parameter $\delta$ so that the regularized functions stay close to the original objective functions $\varphi(s^{l}, s^{f}, \lambda^{l})$ and $\psi(s^{f}, s^{l}, \lambda^{f})$, while the terms weighted by $\frac{\delta}{2}$ ensure that the problem has a unique solution, given that functions (3) and (4) are strongly convex if $\delta > 0$.
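To see why a quadratic Tikhonov term yields a unique solution, consider a toy case in which a linear reward $c^{\top} s$ is maximized over the probability simplex. Without regularization every mixture of the tied best actions is optimal; with the term $-\frac{\delta}{2}\|s\|^{2}$, completing the square shows that the maximizer is the Euclidean projection of $c / \delta$ onto the simplex. The sketch below assumes this toy setting; the vector `c`, the function names, and the use of the simplex as the admissible set are illustrative assumptions rather than the paper's model.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {s : s >= 0, sum(s) = 1}."""
    u = np.sort(v)[::-1]                      # entries in decreasing order
    css = np.cumsum(u)
    ks = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - (css - 1.0) / ks > 0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def regularized_argmax(c, delta):
    """Unique maximizer of c @ s - (delta / 2) * ||s||^2 over the simplex.
    c @ s - (delta/2)||s||^2 = -(delta/2)||s - c/delta||^2 + const,
    so the maximizer is the projection of c / delta."""
    return project_to_simplex(np.asarray(c, dtype=float) / delta)

c = [1.0, 1.0, 0.0]   # hypothetical linear reward with a tie between two actions
print(regularized_argmax(c, delta=0.5))
```

The unregularized problem has a continuum of maximizers (any split of mass between the first two actions), whereas the regularized problem singles out the symmetric split.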

Applying the penalty approach [27,28] to Equations (3) and (4), we have

$$\Phi_{k, \delta}(s^{l}, s^{f}, \lambda^{l}) := \varphi_{\delta}(s^{l}, s^{f}, \lambda^{l}) + k \left[ -\frac{1}{2} \|g^{l}(s^{l})\|^{2} - \frac{1}{2} \|h^{l}(s^{l})\|^{2} - \frac{\delta}{2} \left( \|s^{l}\|^{2} - \|\lambda^{l}\|^{2} \right) \right]. \tag{5}$$

Let $\alpha = k^{-1}$; then we have

$$\Phi_{\alpha, \delta}(s^{l}, s^{f}, \lambda^{l}) := \alpha \varphi_{\delta}(s^{l}, s^{f}, \lambda^{l}) - \frac{1}{2} \|g^{l}(s^{l})\|^{2} - \frac{1}{2} \|h^{l}(s^{l})\|^{2} - \frac{\delta}{2} \left( \|s^{l}\|^{2} - \|\lambda^{l}\|^{2} \right), \tag{6}$$

where

$$s_{\Phi_{\alpha, \delta}}^{l**} = \arg\max_{\lambda^{l} \in \Lambda} \max_{s^{l} \in S_{\mathrm{adm}}^{l}} \Phi_{\alpha, \delta}(s^{l}, s^{f}, \lambda^{l}),$$

and

$$\Psi_{k, \delta}(s^{f}, s^{l}, \lambda^{f}) := \psi_{\delta}(s^{f}, s^{l}, \lambda^{f}) + k \left[ -\frac{1}{2} \|g^{f}(s^{f})\|^{2} - \frac{1}{2} \|h^{f}(s^{f})\|^{2} - \frac{\delta}{2} \left( \|s^{f}\|^{2} + \|\lambda^{f}\|^{2} \right) \right]. \tag{7}$$

Let $\alpha = k^{-1}$; then we have

$$\Psi_{\alpha, \delta}(s^{f}, s^{l}, \lambda^{f}) := \alpha \psi_{\delta}(s^{f}, s^{l}, \lambda^{f}) - \frac{1}{2} \|g^{f}(s^{f})\|^{2} - \frac{1}{2} \|h^{f}(s^{f})\|^{2} - \frac{\delta}{2} \left( \|s^{f}\|^{2} + \|\lambda^{f}\|^{2} \right), \tag{8}$$

where

$$s_{\Psi_{\alpha, \delta}}^{f**} = \arg\min_{\lambda^{f} \in \Lambda} \max_{s^{f} \in S_{\mathrm{adm}}^{f}} \Psi_{\alpha, \delta}(s^{f}, s^{l}, \lambda^{f}),$$

such that $s^{**} = (s_{\Phi_{\alpha, \delta}}^{l**}, s_{\Psi_{\alpha, \delta}}^{f**})$.
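The effect of the scaling $\alpha = k^{-1}$ can be seen on a one-dimensional toy problem: maximize $\varphi(x) = -(x - 2)^{2}$ subject to $g(x) = x - 1 = 0$. The penalized objective $\alpha \varphi(x) - \frac{1}{2} g(x)^{2}$ has a closed-form maximizer that approaches the feasible point $x = 1$ as $\alpha \to 0$. This sketch deliberately omits the $\delta$ terms and the inequality constraint, and the functions $\varphi$ and $g$ are hypothetical choices made only for illustration.

```python
def penalized_maximizer(alpha):
    """Maximizer of alpha * phi(x) - 0.5 * g(x)**2 for the toy problem
    phi(x) = -(x - 2)**2 (reward) and g(x) = x - 1 (equality constraint).
    Setting the derivative -2*alpha*(x - 2) - (x - 1) to zero gives
    x = (4*alpha + 1) / (2*alpha + 1)."""
    return (4.0 * alpha + 1.0) / (2.0 * alpha + 1.0)

# As alpha shrinks (k = 1/alpha grows), the constraint dominates the reward.
for alpha in (1.0, 0.1, 0.001):
    print(alpha, penalized_maximizer(alpha))
```

For large $\alpha$ the maximizer is pulled toward the unconstrained optimum $x = 2$; for small $\alpha$ the penalty enforces feasibility, which is the regime the penalty approach relies on.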

The players are attempting to reach one of the $\varepsilon$-Nash equilibria [31,32], which involves finding a joint strategy $s \in S_{\mathrm{adm}}$ satisfying, for any admissible $s^{l} \in S_{\mathrm{adm}}^{l}$ and $s^{f} \in S_{\mathrm{adm}}^{f}$, the system of inequalities (the $\varepsilon$-Nash equilibrium condition)

$$\begin{matrix} G_{\alpha, \delta}(s^{l}, \tilde{s}^{l}, s^{f}, \lambda^{l}) \leq \varepsilon \ \text{ for any } s^{l} \in S_{\mathrm{adm}}^{l} \text{ and } l = \overline{1, n}, \ \varepsilon \geq 0, \\ G_{\alpha, \delta}(s^{l}, \tilde{s}^{l}, s^{f}, \lambda^{l}) := \sum_{l \in L} \left[ \Phi_{\alpha, \delta}^{l}(\hat{s}^{l}, \hat{s}^{-l}, s^{f}, \hat{\lambda}^{l}) - \Phi_{\alpha, \delta}^{l}(\mathring{s}^{l}, \hat{s}^{-l}, s^{f}, \hat{\lambda}^{l}) \right] \leq \varepsilon, \end{matrix} \tag{9}$$

where $\tilde{s}^{l} \in S_{\mathrm{adm}}^{l}$ is the complement of $s^{l}$, $s^{f}$ is fixed, and

$$\mathring{s}^{l} := \underset{\hat{s}^{l} \in S_{\mathrm{adm}}^{l}}{\arg\max} \ \Phi_{\alpha, \delta}^{l}(\hat{s}^{l}, \hat{s}^{-l}, s^{f}, \lambda^{l}).$$

Similarly, for the manipulated players we have

$$\begin{matrix} F_{\alpha, \delta}(s^{f}, \tilde{s}^{f}, s^{l}, \lambda^{f}) \leq \varepsilon \ \text{ for any } s^{f} \in S_{\mathrm{adm}}^{f} \text{ and } f = \overline{1, m}, \ \varepsilon \geq 0, \\ F_{\alpha, \delta}(s^{f}, \tilde{s}^{f}, s^{l}, \lambda^{f}) := \sum_{f \in F} \left[ \Psi_{\alpha, \delta}^{f}(\hat{s}^{f}, \hat{s}^{-f}, s^{l}, \hat{\lambda}^{f}) - \Psi_{\alpha, \delta}^{f}(\mathring{s}^{f}, \hat{s}^{-f}, s^{l}, \hat{\lambda}^{f}) \right] \leq \varepsilon, \end{matrix} \tag{10}$$

where $\tilde{s}^{f} \in S_{\mathrm{adm}}^{f}$ is the complement of $s^{f}$, $s^{l}$ is fixed, and

$$\mathring{s}^{f} := \underset{\hat{s}^{f} \in S_{\mathrm{adm}}^{f}}{\arg\max} \ \Psi_{\alpha, \delta}^{f}(\hat{s}^{f}, \hat{s}^{-f}, s^{l}, \lambda^{f}).$$

Note that the conditions $G_{\alpha, \delta}(s^{l}, \tilde{s}^{l}, s^{f}, \lambda^{l}) \leq \varepsilon$ and $F_{\alpha, \delta}(s^{f}, \tilde{s}^{f}, s^{l}, \lambda^{f}) \leq \varepsilon$ are equivalent to

$$\max_{\lambda^{l} \in \Lambda} \max_{s^{l} \in S_{\mathrm{adm}}^{l}} G_{\alpha, \delta}(s^{l}, \tilde{s}^{l}, s^{f}, \lambda^{l}), \tag{11}$$

and

$$\min_{\lambda^{f} \in \Lambda} \max_{s^{f} \in S_{\mathrm{adm}}^{f}} F_{\alpha, \delta}(s^{f}, \tilde{s}^{f}, s^{l}, \lambda^{f}). \tag{12}$$

Then

$$s_{G_{\alpha, \delta}}^{l**} \in \operatorname{Arg\,max}_{\lambda^{l} \in \Lambda} \max_{s^{l} \in S_{\mathrm{adm}}^{l}} G_{\alpha, \delta}(s^{l}, \tilde{s}^{l}, s^{f}, \lambda^{l}),$$

and

$$s_{F_{\alpha, \delta}}^{f**} \in \operatorname{Arg\,min}_{\lambda^{f} \in \Lambda} \max_{s^{f} \in S_{\mathrm{adm}}^{f}} F_{\alpha, \delta}(s^{f}, \tilde{s}^{f}, s^{l}, \lambda^{f}).$$

Definition 1.

A strategy $s_{G_{\alpha, \delta}}^{l**}$ of the manipulating players, together with the strategy $s_{F_{\alpha, \delta}}^{f**}$ of the manipulated players, is said to be a manipulation equilibrium (for $\varepsilon = 0$) if they fulfill

$$\begin{matrix} (s_{G_{\alpha, \delta}}^{l**}, s_{F_{\alpha, \delta}}^{f**}) \in \arg\max_{\lambda^{l} \in \Lambda} \min_{\lambda^{f} \in \Lambda} \max_{s^{l} \in S_{\mathrm{adm}}^{l}} \max_{s^{f} \in S_{\mathrm{adm}}^{f}} \\ \left\{ \Phi_{\alpha, \delta}(s^{l}, s^{f}, \lambda^{l}) \ \middle| \ G_{\alpha, \delta}(s^{l}, \tilde{s}^{l}, s^{f}, \lambda^{l}) \leq 0, \ F_{\alpha, \delta}(s^{f}, \tilde{s}^{f}, s^{l}, \lambda^{f}) \leq 0 \right\}. \end{matrix} \tag{13}$$

Let us introduce the penalty approach to represent (13) as follows:

$$P_{\alpha, \delta}(s^{l}, \tilde{s}^{l}, s^{f}, \tilde{s}^{f}, \lambda^{l}, \lambda^{f}) \to \max_{\lambda^{l} \in \Lambda} \min_{\lambda^{f} \in \Lambda} \max_{s^{l} \in S_{\mathrm{adm}}^{l}} \max_{s^{f} \in S_{\mathrm{adm}}^{f}},$$

where

$$P_{\alpha, \delta}(s^{l}, \tilde{s}^{l}, s^{f}, \tilde{s}^{f}, \lambda^{l}, \lambda^{f}) = \Phi_{\alpha, \delta}(s^{l}, s^{f}, \lambda^{l}) - \frac{1}{2} \|G_{\alpha, \delta}(s^{l}, \tilde{s}^{l}, s^{f}, \lambda^{l})\|^{2} - \frac{1}{2} \|F_{\alpha, \delta}(s^{f}, \tilde{s}^{f}, s^{l}, \lambda^{f})\|^{2}. \tag{14}$$

4.2. Proximal Format

The problem (14) can be represented in the following proximal format:

$$\begin{aligned} \lambda_{\delta}^{l*} &= \arg\max_{\lambda^{l} \in \Lambda} \left\{ -\frac{1}{2} \|\lambda^{l} - \lambda_{\delta}^{l*}\|^{2} + \gamma P_{\alpha, \delta}(s_{\delta}^{l*}, \tilde{s}_{\delta}^{l*}, s_{\delta}^{f*}, \tilde{s}_{\delta}^{f*}, \lambda^{l}, \lambda_{\delta}^{f*}) \right\} \\ s_{\delta}^{l*} &= \arg\max_{s^{l} \in S_{\mathrm{adm}}^{l}} \left\{ -\frac{1}{2} \|s^{l} - s_{\delta}^{l*}\|^{2} + \gamma P_{\alpha, \delta}(s^{l}, \tilde{s}_{\delta}^{l*}, s_{\delta}^{f*}, \tilde{s}_{\delta}^{f*}, \lambda_{\delta}^{l*}, \lambda_{\delta}^{f*}) \right\} \\ \tilde{s}_{\delta}^{l*} &= \arg\max_{\tilde{s}^{l} \in S_{\mathrm{adm}}^{l}} \left\{ -\frac{1}{2} \|\tilde{s}^{l} - \tilde{s}_{\delta}^{l*}\|^{2} + \gamma P_{\alpha, \delta}(s_{\delta}^{l*}, \tilde{s}^{l}, s_{\delta}^{f*}, \tilde{s}_{\delta}^{f*}, \lambda_{\delta}^{l*}, \lambda_{\delta}^{f*}) \right\} \\ \lambda_{\delta}^{f*} &= \arg\min_{\lambda^{f} \in \Lambda} \left\{ \frac{1}{2} \|\lambda^{f} - \lambda_{\delta}^{f*}\|^{2} + \gamma P_{\alpha, \delta}(s_{\delta}^{l*}, \tilde{s}_{\delta}^{l*}, s_{\delta}^{f*}, \tilde{s}_{\delta}^{f*}, \lambda_{\delta}^{l*}, \lambda^{f}) \right\} \\ s_{\delta}^{f*} &= \arg\max_{s^{f} \in S_{\mathrm{adm}}^{f}} \left\{ -\frac{1}{2} \|s^{f} - s_{\delta}^{f*}\|^{2} + \gamma P_{\alpha, \delta}(s_{\delta}^{l*}, \tilde{s}_{\delta}^{l*}, s^{f}, \tilde{s}_{\delta}^{f*}, \lambda_{\delta}^{l*}, \lambda_{\delta}^{f*}) \right\} \\ \tilde{s}_{\delta}^{f*} &= \arg\max_{\tilde{s}^{f} \in S_{\mathrm{adm}}^{f}} \left\{ -\frac{1}{2} \|\tilde{s}^{f} - \tilde{s}_{\delta}^{f*}\|^{2} + \gamma P_{\alpha, \delta}(s_{\delta}^{l*}, \tilde{s}_{\delta}^{l*}, s_{\delta}^{f*}, \tilde{s}^{f}, \lambda_{\delta}^{l*}, \lambda_{\delta}^{f*}) \right\} \end{aligned}$$

where the solutions depend on the small parameters $\gamma > 0$, $\alpha > 0$ and $\delta > 0$.

The proximal method for calculating the manipulation equilibrium for initial variables $\lambda_{0}^{l}, s_{0}^{l}, \tilde{s}_{0}^{l}, \lambda_{0}^{f}, s_{0}^{f}, \tilde{s}_{0}^{f}$ is given by

$$\left. \begin{aligned} \lambda_{n + 1}^{l} &= \arg\max_{\lambda^{l} \in \Lambda} \left\{ -\frac{1}{2} \|\lambda^{l} - \lambda_{n}^{l}\|^{2} + \gamma P_{\alpha, \delta}(s_{n}^{l}, \tilde{s}_{n}^{l}, s_{n}^{f}, \tilde{s}_{n}^{f}, \lambda^{l}, \lambda_{n}^{f}) \right\} \\ s_{n + 1}^{l} &= \arg\max_{s^{l} \in S_{\mathrm{adm}}^{l}} \left\{ -\frac{1}{2} \|s^{l} - s_{n}^{l}\|^{2} + \gamma P_{\alpha, \delta}(s^{l}, \tilde{s}_{n}^{l}, s_{n}^{f}, \tilde{s}_{n}^{f}, \lambda_{n}^{l}, \lambda_{n}^{f}) \right\} \\ \tilde{s}_{n + 1}^{l} &= \arg\max_{\tilde{s}^{l} \in S_{\mathrm{adm}}^{l}} \left\{ -\frac{1}{2} \|\tilde{s}^{l} - \tilde{s}_{n}^{l}\|^{2} + \gamma P_{\alpha, \delta}(s_{n}^{l}, \tilde{s}^{l}, s_{n}^{f}, \tilde{s}_{n}^{f}, \lambda_{n}^{l}, \lambda_{n}^{f}) \right\} \\ \lambda_{n + 1}^{f} &= \arg\min_{\lambda^{f} \in \Lambda} \left\{ \frac{1}{2} \|\lambda^{f} - \lambda_{n}^{f}\|^{2} + \gamma P_{\alpha, \delta}(s_{n}^{l}, \tilde{s}_{n}^{l}, s_{n}^{f}, \tilde{s}_{n}^{f}, \lambda_{n}^{l}, \lambda^{f}) \right\} \\ s_{n + 1}^{f} &= \arg\max_{s^{f} \in S_{\mathrm{adm}}^{f}} \left\{ -\frac{1}{2} \|s^{f} - s_{n}^{f}\|^{2} + \gamma P_{\alpha, \delta}(s_{n}^{l}, \tilde{s}_{n}^{l}, s^{f}, \tilde{s}_{n}^{f}, \lambda_{n}^{l}, \lambda_{n}^{f}) \right\} \\ \tilde{s}_{n + 1}^{f} &= \arg\max_{\tilde{s}^{f} \in S_{\mathrm{adm}}^{f}} \left\{ -\frac{1}{2} \|\tilde{s}^{f} - \tilde{s}_{n}^{f}\|^{2} + \gamma P_{\alpha, \delta}(s_{n}^{l}, \tilde{s}_{n}^{l}, s_{n}^{f}, \tilde{s}^{f}, \lambda_{n}^{l}, \lambda_{n}^{f}) \right\} \end{aligned} \right\} \tag{15}$$
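Each line of the iteration (15) has the same shape: the new iterate maximizes the objective scaled by $\gamma$ minus half the squared distance to the previous iterate. A minimal one-variable sketch of this proximal step is given below, assuming the toy concave objective $f(x) = -\frac{1}{2}(x - 3)^{2}$, for which the inner maximization has a closed form. The objective, the target value 3, and the function names are illustrative assumptions and stand in for the much richer function $P_{\alpha, \delta}$.

```python
def proximal_step(x_prev, gamma, target=3.0):
    """argmax over x of -0.5*(x - x_prev)**2 + gamma * f(x),
    with the toy objective f(x) = -0.5*(x - target)**2.
    Setting the derivative -(x - x_prev) - gamma*(x - target) to zero
    gives x = (x_prev + gamma*target) / (1 + gamma)."""
    return (x_prev + gamma * target) / (1.0 + gamma)

def run_proximal(x0, gamma, steps):
    """Repeat the proximal step from the starting point x0."""
    x = x0
    for _ in range(steps):
        x = proximal_step(x, gamma)
    return x

print(run_proximal(x0=0.0, gamma=0.5, steps=50))
```

In this toy case the distance to the maximizer shrinks by the factor $1 / (1 + \gamma)$ at every step, so a larger $\gamma$ moves faster while a smaller $\gamma$ keeps each iterate closer to the previous one.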

To guarantee the convergence of the suggested procedure, let us select the parameters of the algorithm as follows:

$$\begin{matrix} \delta_{n} = \begin{cases} \delta_{0} & \text{if } n \leq n_{0} \\ \delta_{0} \dfrac{1 + \ln(n - n_{0})}{(1 + n - n_{0})^{\delta}} & \text{if } n > n_{0} \end{cases}, \quad \mu_{n} = \begin{cases} \mu_{0} & \text{if } n < n_{0} \\ \dfrac{\mu_{0}}{(1 + n - n_{0})^{\alpha}} & \text{if } n \geq n_{0} \end{cases} \\ \gamma_{n} = \begin{cases} \gamma_{0} & \text{if } n < n_{0} \\ \dfrac{\gamma_{0}}{(1 + n - n_{0})^{\gamma}} & \text{if } n \geq n_{0} \end{cases}, \quad \delta, \mu, \gamma > 0, \quad \delta_{0}, \mu_{0}, \gamma_{0} > 0, \end{matrix} \tag{16}$$

which satisfies

$$\frac{\mu_{n}}{\delta_{n}} \underset{n \to \infty}{\to} 0$$

and

$$\delta \leq \mu, \quad \gamma \geq \delta, \quad \gamma + \delta \leq 1.$$
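The piecewise schedules in Equation (16) are straightforward to evaluate numerically. The sketch below is one possible transcription; the argument names `d_exp`, `m_exp`, and `g_exp` for the three decay exponents are hypothetical, and the numerical values in the usage lines are arbitrary choices that satisfy the stated conditions (exponent of $\delta_{n}$ no larger than that of $\mu_{n}$, and so on), not values taken from the paper.

```python
import math

def schedule(n, n0, delta0, mu0, gamma0, d_exp, m_exp, g_exp):
    """Return (delta_n, mu_n, gamma_n) following the piecewise form of Equation (16).
    d_exp, m_exp, g_exp are the decay exponents (hypothetical argument names)."""
    if n <= n0:
        delta_n = delta0
    else:
        delta_n = delta0 * (1.0 + math.log(n - n0)) / (1.0 + n - n0) ** d_exp
    if n < n0:
        mu_n, gamma_n = mu0, gamma0
    else:
        mu_n = mu0 / (1.0 + n - n0) ** m_exp
        gamma_n = gamma0 / (1.0 + n - n0) ** g_exp
    return delta_n, mu_n, gamma_n

# Arbitrary illustrative exponents: 0.2 <= 0.5, 0.5 >= 0.2, 0.5 + 0.2 <= 1.
for n in (0, 110, 100010):
    d, m, g = schedule(n, n0=10, delta0=1.0, mu0=1.0, gamma0=1.0,
                       d_exp=0.2, m_exp=0.5, g_exp=0.5)
    print(n, d, m, g, m / d)
```

With these exponents the ratio $\mu_{n} / \delta_{n}$ decreases as $n$ grows, in line with the limit condition stated above.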

5. Markov Games and Moral Hazard

5.1. Markov Model

We consider a discrete-time Markov game played by a set $L = \{1, \dots, n\}$ of manipulating players indexed by $l = \overline{1, n}$ and a set $F = \{1, \dots, m\}$ of manipulated players indexed by $f = \overline{1, m}$, where $I = \{1, \dots, n + m\}$ is the total set of Machiavellian players indexed by $i = \overline{1, n + m}$.

The main elements of the model are as follows:

The time is discrete, taking values from $T = {0, 1, \dots}$ and the horizon is finite. The Machiavellian players $i \in I$ are in a sequential game.
At each time $t \geq 0$ , $t \in T$ , the type $θ_{t}^{l}$ of the manipulating players l is selected from the finite set $θ_{t}^{l} \in Θ^{l}$ and revealed to the manipulating players l. In the same manner, the manipulated players f are privately informed about their type $θ_{t}^{f} \in Θ^{f}$ .
For each Machiavellian player $i \in I$ , we use $Θ = \times_{i \in I} Θ^{i}$ to denote the set of type profiles and $Θ^{- i} = \times_{j \in I \setminus \{i\}} Θ^{j}$ to denote the set of type tuples of all players, except i.
Manipulating players take an action $a_{t}^{l}$ (make a decision) from a finite set $a_{t}^{l} \in A^{l}$ , $A^{l} = \times_{l \in L} A^{l}$ , and $A^{- l} = \times_{h \in L \setminus \{l\}} A^{h}$ , and manipulated players also take an action $a_{t}^{f} \in A^{f}$ , $A^{f} = \times_{f \in F} A^{f}$ , and $A^{- f} = \times_{h \in F \setminus \{f\}} A^{h}$ .
We use $Δ (A^{l})$ ( $Δ (A^{f})$ ) for the set of all probability distributions over $A^{l}$ $(A^{f})$ for all l (f).
For each Machiavellian player $i \in I$ , the game begins with a belief of an invariant measure $P^{i} (θ_{0}^{i}) \in Δ (Θ^{i})$ , such that $P^{i} (θ_{t}^{i}) \in Δ (Θ^{i})$ , where $Δ (Θ^{i})$ denotes the set of all probability distributions over $Θ^{i}$ . The beliefs may differ across types, and we do not assume that the Machiavellian players’ beliefs are derived from a common prior. However, we assume that all types of players agree on which types have positive or zero probabilities. From now on, we assume that each belief system satisfies $P^{i} (θ_{t}^{i}) > 0$ . We say that type $θ_{t}^{i}$ has $P^{i} (θ_{t}^{i})$ -positive probability if $θ_{t}^{i} \in Θ^{i}$ and $P^{i} (θ_{t}^{i})$ -zero probability otherwise.
The transition functions are denoted by

$p^{l} (θ_{t + 1}^{l} | θ_{t}^{l}, a_{t}^{l}, θ_{t - 1}^{l}, a_{t - 1}^{l}, \dots, θ_{0}^{l}, a_{0}^{l}) = p^{l} (θ_{t + 1}^{l} | θ_{t}^{l}, a_{t}^{l}),$

and

$p^{f} (θ_{t + 1}^{f} | θ_{t}^{f}, a_{t}^{f}, θ_{t - 1}^{f}, a_{t - 1}^{f}, \dots, θ_{0}^{f}, a_{0}^{f}) = p^{f} (θ_{t + 1}^{f} | θ_{t}^{f}, a_{t}^{f}),$

for the manipulating and manipulated players, respectively.
Each chain $(P^{l}, p^{l} (θ_{t + 1}^{l} | θ_{t}^{l}, a_{t}^{l}))$ and $(P^{f}, p^{f} (θ_{t + 1}^{f} | θ_{t}^{f}, a_{t}^{f}))$ is a controllable, time-homogeneous, irreducible, and aperiodic Markov chain. These chains take values in ( $Θ^{l}, A^{l}$ ) and ( $Θ^{f}, A^{f}$ ), respectively.
Manipulating players have a known utility function $u^{l} : A^{l} \times Θ^{l} \to R^{+}$ , which depends on the action $a_{t}^{l} \in A^{l}$ and the privately known type $θ_{t}^{l} \in Θ^{l}$ . In the same manner, we define the utility function $u^{f} : A^{f} \times Θ^{f} \to R^{+}$ for the manipulated players.
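For a controllable, irreducible, and aperiodic chain, fixing a stationary decision rule induces an ordinary Markov chain over types whose invariant measure can be approximated by power iteration. The sketch below assumes a single player with two types and two actions; the kernel `p`, the decision rule `pi` (which, for simplicity, conditions directly on the type), and the function name are hypothetical and serve only to illustrate the ergodicity assumption.

```python
import numpy as np

def stationary_distribution(P, iters=1000):
    """Power iteration for the invariant measure of an ergodic chain."""
    dist = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(iters):
        dist = dist @ P
    return dist

# Hypothetical controlled kernel p(theta' | theta, a): shape (types, actions, types).
p = np.array([[[0.9, 0.1], [0.5, 0.5]],
              [[0.2, 0.8], [0.6, 0.4]]])
# Hypothetical stationary decision rule pi(a | theta): one row per type.
pi = np.array([[1.0, 0.0],
               [0.5, 0.5]])

# Induced transition matrix: P[theta, theta'] = sum_a pi(a | theta) * p(theta' | theta, a).
P = np.einsum("ta,tas->ts", pi, p)
print(stationary_distribution(P))
```

Because the induced matrix is irreducible and aperiodic, the iteration converges to the same distribution from any starting point, which is what makes long-run average utilities well defined.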

Let us relate the message $m_{t}^{l}$ of each manipulating player $l \in L$ to its type, $m_{t}^{l} \in M^{l} \subseteq Θ^{l}$, and the message $m_{t}^{f}$ of each manipulated player $f \in F$ to its type, $m_{t}^{f} \in M^{f} \subseteq Θ^{f}$. Here, $M^{i} \subseteq Θ^{i}$ is the set of messages, such that at time $t$, a Machiavellian player $i$ sends a message $m_{t}^{i}$ to all players, who respond with $a_{t} = (a_{t}^{1}, a_{t}^{2}, \dots, a_{t}^{n + m})$. We assume that $M^{i} = Θ^{i}$.

5.2. Strategies and Moral Hazards

The relationship $z^{l} (α_{t}^{l} | a_{t}^{l})$ for the manipulating players is given by $z^{l} : A^{l} \to Δ (A^{l})$, and for the manipulated players $z^{f} (α_{t}^{f} | a_{t}^{f})$ is given by $z^{f} : A^{f} \to Δ (A^{f})$. The set of admissible actions is defined as follows:

$Z_{adm}^{l} = \{z^{l} (α_{t}^{l} | a_{t}^{l}) \mid \sum_{α_{t}^{l} \in A^{l}} z^{l} (α_{t}^{l} | a_{t}^{l}) = 1, \; a_{t}^{l} \in A^{l}\},$

and

$Z_{adm}^{f} = \{z^{f} (α_{t}^{f} | a_{t}^{f}) \mid \sum_{α_{t}^{f} \in A^{f}} z^{f} (α_{t}^{f} | a_{t}^{f}) = 1, \; a_{t}^{f} \in A^{f}\}.$

The relationship $z^{i} (α_{t}^{i} | a_{t}^{i})$ represents the likelihood with which a Machiavellian player $i$ believes that $α_{t}^{i}$ is an action $a_{t}^{i}$.

A strategy $σ^{l} (m_{t}^{l} | θ_{t}^{l})$ (behavioral strategy) for the manipulating players is a mapping $σ^{l} : Θ^{l} \to Δ (M^{l})$, which represents the likelihood with which a manipulating player $l$ believes that a message $m_{t}^{l}$ is of type $θ_{t}^{l}$. For the manipulated players, a strategy $σ^{f} (m_{t}^{f} | θ_{t}^{f})$ is represented by $σ^{f} : Θ^{f} \to Δ (M^{f})$. The admissible strategy set is given by

$S_{adm}^{l} = \{σ^{l} (m_{t}^{l} | θ_{t}^{l}) \mid \sum_{m_{t}^{l} \in M^{l}} σ^{l} (m_{t}^{l} | θ_{t}^{l}) = 1, \; θ_{t}^{l} \in Θ^{l}\},$

and

$S_{adm}^{f} = \{σ^{f} (m_{t}^{f} | θ_{t}^{f}) \mid \sum_{m_{t}^{f} \in M^{f}} σ^{f} (m_{t}^{f} | θ_{t}^{f}) = 1, \; θ_{t}^{f} \in Θ^{f}\}.$

A policy for the manipulating players is defined as a sequence $\{π^{l} (α_{t}^{l} | m_{t}^{l})\}$, such that, for each time $t$, $π^{l} (α_{t}^{l} | m_{t}^{l})$ is a stochastic kernel, and for the manipulated players, we have $π^{f} (α_{t}^{f} | m_{t}^{f})$. The set of all admissible policies is denoted as follows:

$Π_{adm}^{l} = \{π^{l} (α_{t}^{l} | m_{t}^{l}) \mid \sum_{α_{t}^{l} \in A^{l}} π^{l} (α_{t}^{l} | m_{t}^{l}) = 1\},$

and

$Π_{adm}^{f} = \{π^{f} (α_{t}^{f} | m_{t}^{f}) \mid \sum_{α_{t}^{f} \in A^{f}} π^{f} (α_{t}^{f} | m_{t}^{f}) = 1\}.$
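The admissibility conditions above all say the same thing: each conditional kernel must be a probability distribution in its first argument for every value of the conditioning variable. A minimal sketch of that check follows; the helper name `is_stochastic_kernel` and the example numbers are illustrative assumptions, not part of the paper.

```python
def is_stochastic_kernel(kernel, tol=1e-9):
    """True if every row is a probability distribution (nonnegative, sums to 1)."""
    return all(
        all(x >= 0.0 for x in row) and abs(sum(row) - 1.0) <= tol
        for row in kernel
    )


# Hypothetical kernels with two types/messages and two actions.
sigma = [[0.7, 0.3], [0.4, 0.6]]         # sigma[theta][m]: message given type
pi = [[0.5, 0.5], [0.1, 0.9]]            # pi[m][alpha]: action given message
z = [[0.8, 0.2], [0.25, 0.75]]           # z[a][alpha]: kernel relating a and alpha
not_a_kernel = [[0.6, 0.6], [0.5, 0.5]]  # first row sums to 1.2

print(is_stochastic_kernel(sigma), is_stochastic_kernel(pi), is_stochastic_kernel(z))
print(is_stochastic_kernel(not_a_kernel))
```

Rows are indexed by the conditioning variable, so the same helper applies to $z^{i}$, $σ^{i}$, and $π^{i}$ alike.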

We are interested in the average criterion, where the realized average reward function for the manipulating players is given by

$\begin{aligned} U^{l} (π, σ, z) &= \sum_{θ_{t}^{l} \in Θ^{l}} \sum_{m_{t}^{l} \in M^{l}} \sum_{a_{t}^{l} \in A^{l}} \sum_{α_{t}^{l} \in A^{l}} \Big(\sum_{θ_{t + 1}^{l} \in Θ^{l}} u^{l} (a_{t}^{l}, θ_{t}^{l}) p^{l} (θ_{t + 1}^{l} | θ_{t}^{l}, a_{t}^{l})\Big) \cdot \\ &\quad \prod_{l \in L} π^{l} (α_{t}^{l} | m_{t}^{l}) σ^{l} (m_{t}^{l} | θ_{t}^{l}) z^{l} (α_{t}^{l} | a_{t}^{l}) P^{l} (θ_{t}^{l}) \prod_{f \in F} π^{f} (α_{t}^{f} | m_{t}^{f}) σ^{f} (m_{t}^{f} | θ_{t}^{f}) z^{f} (α_{t}^{f} | a_{t}^{f}) P^{f} (θ_{t}^{f}) \\ &= \sum_{θ_{t}^{l} \in Θ^{l}} \sum_{m_{t}^{l} \in M^{l}} \sum_{a_{t}^{l} \in A^{l}} \sum_{α_{t}^{l} \in A^{l}} \Big(\sum_{θ_{t + 1}^{l} \in Θ^{l}} u^{l} (a_{t}^{l}, θ_{t}^{l}) p^{l} (θ_{t + 1}^{l} | θ_{t}^{l}, a_{t}^{l})\Big) \prod_{i \in I} π^{i} (α_{t}^{i} | m_{t}^{i}) σ^{i} (m_{t}^{i} | θ_{t}^{i}) z^{i} (α_{t}^{i} | a_{t}^{i}) P^{i} (θ_{t}^{i}), \end{aligned}$ (17)

and for the manipulated players

$\begin{aligned} U^{f} (π, σ, z) &= \sum_{θ_{t}^{f} \in Θ^{f}} \sum_{m_{t}^{f} \in M^{f}} \sum_{a_{t}^{f} \in A^{f}} \sum_{α_{t}^{f} \in A^{f}} \Big(\sum_{θ_{t + 1}^{f} \in Θ^{f}} u^{f} (a_{t}^{f}, θ_{t}^{f}) p^{f} (θ_{t + 1}^{f} | θ_{t}^{f}, a_{t}^{f})\Big) \cdot \\ &\quad \prod_{i \in I} π^{i} (α_{t}^{i} | m_{t}^{i}) σ^{i} (m_{t}^{i} | θ_{t}^{i}) z^{i} (α_{t}^{i} | a_{t}^{i}) P^{i} (θ_{t}^{i}). \end{aligned}$ (18)

Definition 2. The interaction between the Machiavellian players induces a manipulation game, given by

$G = (I, Θ^{i}, A^{i}, P^{i}, U^{i})_{i \in I}.$

5.3. Auxiliary Variable

Let us define for the manipulating players:

$c^{l} (θ_{t}^{l}, m_{t}^{l}, a_{t}^{l}, α_{t}^{l}) = π^{l} (α_{t}^{l} | m_{t}^{l}) σ^{l} (m_{t}^{l} | θ_{t}^{l}) z^{l} (α_{t}^{l} | a_{t}^{l}) P^{l} (θ_{t}^{l})$

and for the manipulated players:

$c^{f} (θ_{t}^{f}, m_{t}^{f}, a_{t}^{f}, α_{t}^{f}) = π^{f} (α_{t}^{f} | m_{t}^{f}) σ^{f} (m_{t}^{f} | θ_{t}^{f}) z^{f} (α_{t}^{f} | a_{t}^{f}) P^{f} (θ_{t}^{f}),$

with

$\begin{aligned} C_{adm}^{l} := \Big\{ & c^{l} (θ_{t}^{l}, m_{t}^{l}, a_{t}^{l}, α_{t}^{l}) \;\Big|\; \sum_{θ_{t}^{l} \in Θ^{l}} \sum_{m_{t}^{l} \in M^{l}} \sum_{a_{t}^{l} \in A^{l}} \sum_{α_{t}^{l} \in A^{l}} c^{l} (θ_{t}^{l}, m_{t}^{l}, a_{t}^{l}, α_{t}^{l}) = 1, \\ & \sum_{m_{t}^{l} \in M^{l}} \sum_{a_{t}^{l} \in A^{l}} \sum_{α_{t}^{l} \in A^{l}} c^{l} (θ_{t}^{l}, m_{t}^{l}, a_{t}^{l}, α_{t}^{l}) = P^{l} (θ_{t}^{l}) > 0, \\ & \sum_{θ_{t}^{l} \in Θ^{l}} \sum_{m_{t}^{l} \in M^{l}} \sum_{a_{t}^{l} \in A^{l}} \sum_{α_{t}^{l} \in A^{l}} \sum_{θ_{t + 1}^{l} \in Θ^{l}} [δ_{θ_{t}^{l} θ_{t + 1}^{l}} - p^{l} (θ_{t + 1}^{l} | θ_{t}^{l}, a_{t}^{l})] c^{l} (θ_{t}^{l}, m_{t}^{l}, a_{t}^{l}, α_{t}^{l}) = 0 \Big\}, \end{aligned}$

where $δ_{θ_{t}^{l} θ_{t + 1}^{l}}$ is the Kronecker symbol, and

$\begin{aligned} Δ^{l} := \Big\{ & c^{l} (θ_{t}^{l}, m_{t}^{l}, a_{t}^{l}, α_{t}^{l}) \;\Big|\; \sum_{θ_{t}^{l} \in Θ^{l}} \sum_{m_{t}^{l} \in M^{l}} \sum_{a_{t}^{l} \in A^{l}} \sum_{α_{t}^{l} \in A^{l}} c^{l} (θ_{t}^{l}, m_{t}^{l}, a_{t}^{l}, α_{t}^{l}) = 1, \\ & \sum_{m_{t}^{l} \in M^{l}} \sum_{a_{t}^{l} \in A^{l}} \sum_{α_{t}^{l} \in A^{l}} c^{l} (θ_{t}^{l}, m_{t}^{l}, a_{t}^{l}, α_{t}^{l}) = P^{l} (θ_{t}^{l}) > 0 \Big\}, \end{aligned}$

and

$\begin{aligned} C_{adm}^{f} := \Big\{ & c^{f} (θ_{t}^{f}, m_{t}^{f}, a_{t}^{f}, α_{t}^{f}) \;\Big|\; \sum_{θ_{t}^{f} \in Θ^{f}} \sum_{m_{t}^{f} \in M^{f}} \sum_{a_{t}^{f} \in A^{f}} \sum_{α_{t}^{f} \in A^{f}} c^{f} (θ_{t}^{f}, m_{t}^{f}, a_{t}^{f}, α_{t}^{f}) = 1, \\ & \sum_{m_{t}^{f} \in M^{f}} \sum_{a_{t}^{f} \in A^{f}} \sum_{α_{t}^{f} \in A^{f}} c^{f} (θ_{t}^{f}, m_{t}^{f}, a_{t}^{f}, α_{t}^{f}) = P^{f} (θ_{t}^{f}) > 0, \\ & \sum_{θ_{t}^{f} \in Θ^{f}} \sum_{m_{t}^{f} \in M^{f}} \sum_{a_{t}^{f} \in A^{f}} \sum_{α_{t}^{f} \in A^{f}} \sum_{θ_{t + 1}^{f} \in Θ^{f}} [δ_{θ_{t}^{f} θ_{t + 1}^{f}} - p^{f} (θ_{t + 1}^{f} | θ_{t}^{f}, a_{t}^{f})] c^{f} (θ_{t}^{f}, m_{t}^{f}, a_{t}^{f}, α_{t}^{f}) = 0 \Big\}, \end{aligned}$

where $δ_{θ_{t}^{f} θ_{t + 1}^{f}}$ is the Kronecker symbol, and

$\begin{aligned} Δ^{f} := \Big\{ & c^{f} (θ_{t}^{f}, m_{t}^{f}, a_{t}^{f}, α_{t}^{f}) \;\Big|\; \sum_{θ_{t}^{f} \in Θ^{f}} \sum_{m_{t}^{f} \in M^{f}} \sum_{a_{t}^{f} \in A^{f}} \sum_{α_{t}^{f} \in A^{f}} c^{f} (θ_{t}^{f}, m_{t}^{f}, a_{t}^{f}, α_{t}^{f}) = 1, \\ & \sum_{m_{t}^{f} \in M^{f}} \sum_{a_{t}^{f} \in A^{f}} \sum_{α_{t}^{f} \in A^{f}} c^{f} (θ_{t}^{f}, m_{t}^{f}, a_{t}^{f}, α_{t}^{f}) = P^{f} (θ_{t}^{f}) > 0 \Big\}. \end{aligned}$
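The three constraints that define the admissible set of c-variables can be tested numerically for a candidate array. The sketch below is one possible reading, not the authors' implementation: the function name `in_c_adm`, the dictionary layout, and the numbers are hypothetical, and the balance condition is checked separately for each next type, which is the usual ergodicity constraint and implies the summed form written above.

```python
from itertools import product


def in_c_adm(c, p, n_types, n_msgs, n_actions, tol=1e-9):
    """Check the three constraints on c[(theta, m, a, alpha)].

    Assumption: the balance condition is tested for each next type separately,
    i.e. sum_{theta,m,a,alpha} [delta(theta, nxt) - p[theta][a][nxt]] * c = 0.
    """
    keys = list(product(range(n_types), range(n_msgs), range(n_actions), range(n_actions)))
    # (i) c is a joint probability distribution.
    if any(c[k] < 0.0 for k in keys) or abs(sum(c[k] for k in keys) - 1.0) > tol:
        return False
    # (ii) every type has a strictly positive marginal probability.
    marginal = [sum(c[k] for k in keys if k[0] == th) for th in range(n_types)]
    if any(q <= 0.0 for q in marginal):
        return False
    # (iii) the type marginal is invariant under the controlled transitions.
    for nxt in range(n_types):
        balance = sum(
            ((1.0 if th == nxt else 0.0) - p[th][a][nxt]) * c[(th, m, a, al)]
            for th, m, a, al in keys
        )
        if abs(balance) > tol:
            return False
    return True


# Hypothetical data (not from the paper): two types, two messages, two actions.
p = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.3, 0.7]],
]
all_keys = list(product(range(2), repeat=4))
belief = [8 / 17, 9 / 17]  # invariant for p when both actions are equally likely
c_ok = {k: belief[k[0]] / 8 for k in all_keys}  # spreads each type's mass evenly
c_bad = {k: 1 / 16 for k in all_keys}           # uniform joint, marginal not invariant

print(in_c_adm(c_ok, p, 2, 2, 2))
print(in_c_adm(c_bad, p, 2, 2, 2))
```

The uniform array fails only the third constraint, which shows that normalization and positivity alone do not make a candidate admissible.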

The individual aim in the c-variables can be expressed as follows:

$U^{l} (π, σ, z) = \sum_{l \in L} \tilde{U}^{l} (c) \to \max_{c^{l} \in C_{adm}^{l}}$ (19)

where

$\tilde{U}^{l} (c) = \sum_{θ_{t}^{l} \in Θ^{l}} \sum_{m_{t}^{l} \in M^{l}} \sum_{a_{t}^{l} \in A^{l}} \sum_{α_{t}^{l} \in A^{l}} W (θ_{t}^{l}, a_{t}^{l}) \prod_{l \in L} c^{l} (θ_{t}^{l}, m_{t}^{l}, a_{t}^{l}, α_{t}^{l}) \underbrace{\prod_{f \in F} c^{f} (θ_{t}^{f}, m_{t}^{f}, a_{t}^{f}, α_{t}^{f})}_{\text{fixed}},$

$W (θ_{t}^{l}, a_{t}^{l}) = \sum_{θ_{t + 1}^{l} \in Θ^{l}} u^{l} (a_{t}^{l}, θ_{t}^{l}) p^{l} (θ_{t + 1}^{l} | θ_{t}^{l}, a_{t}^{l}),$

and

$U^{f} (π, σ, z) = \sum_{f \in F} \tilde{U}^{f} (c) \to \max_{c^{f} \in C_{adm}^{f}}$ (20)

where

$\tilde{U}^{f} (c) = \sum_{θ_{t}^{f} \in Θ^{f}} \sum_{m_{t}^{f} \in M^{f}} \sum_{a_{t}^{f} \in A^{f}} \sum_{α_{t}^{f} \in A^{f}} W (θ_{t}^{f}, a_{t}^{f}) \underbrace{\prod_{l \in L} c^{l} (θ_{t}^{l}, m_{t}^{l}, a_{t}^{l}, α_{t}^{l})}_{\text{fixed}} \prod_{f \in F} c^{f} (θ_{t}^{f}, m_{t}^{f}, a_{t}^{f}, α_{t}^{f}),$

$W (θ_{t}^{f}, a_{t}^{f}) = \sum_{θ_{t + 1}^{f} \in Θ^{f}} u^{f} (a_{t}^{f}, θ_{t}^{f}) p^{f} (θ_{t + 1}^{f} | θ_{t}^{f}, a_{t}^{f}).$
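To make the objective in the c-variables concrete, the sketch below evaluates $W$ and the resulting payoff for a single player. It rests on a simplifying assumption that is not in the paper: the opponents' fixed product of c-variables is collapsed into one scalar, `fixed_weight`, so that the payoff is linear in the player's own c. All names and numbers are hypothetical.

```python
from itertools import product


def w(u, p, theta, a):
    """W(theta, a): u(a, theta) weighted by the transition probabilities."""
    return sum(u[a][theta] * p[theta][a][nxt] for nxt in range(len(p)))


def payoff(u, p, c, fixed_weight=1.0):
    """Payoff of one player in c-variables.

    Assumption: the opponents' fixed product of c-variables is collapsed
    into the scalar fixed_weight, so the payoff is linear in c.
    """
    return fixed_weight * sum(
        w(u, p, th, a) * val for (th, m, a, al), val in c.items()
    )


# Hypothetical data (not from the paper).
p = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.3, 0.7]],
]
u = [[1.0, 2.0], [3.0, 0.5]]  # u[a][theta]
belief = [8 / 17, 9 / 17]
c = {k: belief[k[0]] / 8 for k in product(range(2), repeat=4)}

# W agrees with u up to rounding, because each transition row sums to one.
print(w(u, p, 0, 1), u[1][0])
print(payoff(u, p, c))
```

Since $u$ does not depend on the next type and the transition probabilities sum to one, $W(θ, a)$ reduces to $u(a, θ)$ here; the code keeps the sum only to mirror the formula.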

Definition 3. The payoff $\tilde{U}^{i} (c)$ is said to be individually rational for the Machiavellian player $i$ if it belongs to the set

$IR = \{\tilde{U}^{i} (c) \mid λ^{i} \tilde{U}^{i} (c) \geq \tilde{U}^{i} (c), \; λ \in Λ, \; i \in I\}.$

Remark 2. Rational conduct is the foundation of rational decision theory, an economic theory asserting that Machiavellian players always choose the options that maximize their own benefit. Given the available options, these choices offer the Machiavellian players the greatest benefit or satisfaction.

5.4. Machiavellian Equilibrium

The policies $π^{*}$, strategies $σ^{*}$, and action kernel $z^{*}$ that solve the nonlinear programming problem are given by

$(π^{*}, σ^{*}, z^{*}) = \arg\max_{π \in Π_{adm}^{i}} \sum_{i \in I} U^{i} (π, σ^{*}, z^{*}),$

where $π^{*}$, $σ^{*}$, and $z^{*}$ fulfill the Machiavellian equilibrium, satisfying

$U^{i} (π^{*}, σ^{*}, z^{*}) \geq U^{i} (π^{i}, π^{- i *}, σ^{i}, σ^{- i *}, z^{i}, z^{- i *}),$

such that $π^{- i *} = (π^{1 *}, \dots, π^{i - 1 *}, π^{i + 1 *}, \dots, π^{n + m *})$, $σ^{- i *} = (σ^{1 *}, \dots, σ^{i - 1 *}, σ^{i + 1 *}, \dots, σ^{n + m *})$, and $z^{- i *} = (z^{1 *}, \dots, z^{i - 1 *}, z^{i + 1 *}, \dots, z^{n + m *})$.

A strategy profile is a Machiavellian equilibrium in the manipulation game $G = (I, Θ^{i}, A^{i}, P^{i}, U^{i})_{i \in I}$ if the tuple $(π^{*}, σ^{*}, z^{*})$ represents the best reply to $(π^{i}, π^{- i *}, σ^{i}, σ^{- i *}, z^{i}, z^{- i *})$. Any reward in a Machiavellian equilibrium must belong to the set $IR$.

A strategy profile is a sequential Machiavellian equilibrium in the game $G = (I, Θ^{i}, A^{i}, P^{i}, U^{i})_{i \in I}$ if there exists a tuple $(π^{*}, σ^{*}, z^{*})$ and history $h_{t}^{i}$, such that, for each player $i$, the tuple $(π^{*}, σ^{*}, z^{*})$ is the best reply to $(π^{i}, π^{- i *}, σ^{i}, σ^{- i *}, z^{i}, z^{- i *})$.

Remark 3. The manipulation game $G = (I, Θ^{i}, A^{i}, P^{i}, U^{i})_{i \in I}$ is a sequential game, where the manipulating players play first and the manipulated players play second.

5.5. Variable Association

Associating these variables with the notations above, let us introduce the following vector:

$s = s (c) := (c^{1} (θ_{t}^{1}, m_{t}^{1}, a_{t}^{1}, α_{t}^{1}), \dots, c^{n + m} (θ_{t}^{n + m}, m_{t}^{n + m}, a_{t}^{n + m}, α_{t}^{n + m}))^{\top} \in S_{adm}, \quad c^{i} (θ_{t}^{i}, m_{t}^{i}, a_{t}^{i}, α_{t}^{i}) \in C_{adm}^{i}$

and the functions

$u^{i} (s (c)) = \tilde{U}^{i} (c), \quad i \in I.$

Then we have

$\left. \begin{aligned} s^{*} (λ) \in \underset{s \in S_{adm}}{\operatorname{Arg\,min}} \sum_{i \in I} λ^{i} u^{i} (s) &\Leftrightarrow c^{*} (λ) \in \underset{c \in C_{adm}^{i}}{\operatorname{Arg\,min}} \sum_{i \in I} λ^{i} \tilde{U}^{i} (c), \\ s^{*} (λ) &= s (c^{*} (λ)). \end{aligned} \right\}$

Hence,

$\begin{aligned} \max_{λ^{l} \in Λ} \max_{s^{l} \in S_{adm}^{l}} φ (s^{l}, s^{f}, λ^{l}) &= \max_{λ^{l} \in Λ} \max_{c^{l} \in C_{adm}^{l}} ϕ (c^{l}, c^{f}, λ^{l}), \\ \min_{λ^{f} \in Λ} \max_{s^{f} \in S_{adm}^{f}} ψ (s^{f}, s^{l}, λ^{f}) &= \min_{λ^{f} \in Λ} \max_{c^{f} \in C_{adm}^{f}} ϱ (c^{f}, c^{l}, λ^{f}), \end{aligned}$

where

$ϕ (c^{l}, c^{f}, λ^{l}) = \sum_{l \in L} λ^{l} \tilde{U}^{l} (c) - \frac{δ}{2} (\|c^{l}\|^{2} + \|λ^{l}\|^{2}), \quad δ > 0,$

$ϱ (c^{f}, c^{l}, λ^{f}) = \sum_{f \in F} λ^{f} \tilde{U}^{f} (c) - \frac{δ}{2} (\|c^{f}\|^{2} - \|λ^{f}\|^{2}), \quad δ > 0.$
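The two regularized functions differ only in the sign attached to the squared norm of the weights. A small sketch of how they could be evaluated follows. The weights and the value of $δ$ are taken from the numerical example in Section 6, whereas the payoffs, the flattened c-vector, and the function names `phi` and `rho` are hypothetical choices made for illustration.

```python
def phi(lam, payoffs, c_flat, delta):
    """phi = sum_l lam_l * U_l - (delta / 2) * (||c||^2 + ||lam||^2)."""
    weighted = sum(l * u for l, u in zip(lam, payoffs))
    penalty = 0.5 * delta * (sum(x * x for x in c_flat) + sum(l * l for l in lam))
    return weighted - penalty


def rho(lam, payoffs, c_flat, delta):
    """rho = sum_f lam_f * U_f - (delta / 2) * (||c||^2 - ||lam||^2)."""
    weighted = sum(l * u for l, u in zip(lam, payoffs))
    penalty = 0.5 * delta * (sum(x * x for x in c_flat) - sum(l * l for l in lam))
    return weighted - penalty


lam = [0.1802, 0.8198]             # weights reported for the manipulating players
payoffs = [1.0, 2.0]               # hypothetical payoffs
c_flat = [0.25, 0.25, 0.25, 0.25]  # hypothetical flattened c-variables
delta = 5.0e-4                     # initial regularization parameter from Section 6

print(phi(lam, payoffs, c_flat, delta))
print(rho(lam, payoffs, c_flat, delta))
```

The quadratic term in $c$ enters both functions with a negative sign, so both are strongly concave in $c$; the term in $λ$ is subtracted in `phi` and added in `rho`.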

5.6. Recover the Relationship

Lemma 1. Let us assume that Equations (19) and (20) are solved; then the variables $π^{i *} (α_{t}^{i} | m_{t}^{i})$, $i \in I$, can be recovered from $c^{i *} (θ_{t}^{i}, m_{t}^{i}, a_{t}^{i}, α_{t}^{i})$ as follows:

$π^{i *} (α_{t}^{i} | m_{t}^{i}) = \frac{1}{|Θ^{i}|} \sum_{θ_{t}^{i} \in Θ^{i}} \frac{1}{|A^{i}|} \frac{c^{i} (θ_{t}^{i}, m_{t}^{i}, a_{t}^{i}, α_{t}^{i})}{\sum_{ζ_{t}^{i} \in A^{i}} c^{i} (θ_{t}^{i}, m_{t}^{i}, a_{t}^{i}, ζ_{t}^{i})}.$ (21)

Proof. Let

$π^{i *} (α_{t}^{i} | θ_{t}^{i}, m_{t}^{i}) = \begin{cases} \dfrac{1}{|A^{i}|} \dfrac{c^{i} (θ_{t}^{i}, m_{t}^{i}, a_{t}^{i}, α_{t}^{i})}{\sum_{ζ_{t}^{i} \in A^{i}} c^{i} (θ_{t}^{i}, m_{t}^{i}, a_{t}^{i}, ζ_{t}^{i})} & \text{if } \sum_{α_{t}^{i} \in A^{i}} c^{i} (θ_{t}^{i}, m_{t}^{i}, a_{t}^{i}, α_{t}^{i}) > 0, \\ 0 & \text{if } \sum_{α_{t}^{i} \in A^{i}} c^{i} (θ_{t}^{i}, m_{t}^{i}, a_{t}^{i}, α_{t}^{i}) = 0. \end{cases}$

It then follows from Equation (21) that $\sum_{α_{t}^{i} \in A^{i}} π^{i *} (α_{t}^{i} | m_{t}^{i}) = 1$. □

To recover $σ^{i *} (m_{t}^{i} | θ_{t}^{i})$, the formula is given by

$σ^{i *} (m_{t}^{i} | θ_{t}^{i}) = \frac{\sum_{a_{t}^{i} \in A^{i}} \sum_{α_{t}^{i} \in A^{i}} c^{i} (θ_{t}^{i}, m_{t}^{i}, a_{t}^{i}, α_{t}^{i})}{\sum_{ω_{t}^{i} \in M^{i}} \sum_{a_{t}^{i} \in A^{i}} \sum_{α_{t}^{i} \in A^{i}} c^{i} (θ_{t}^{i}, ω_{t}^{i}, a_{t}^{i}, α_{t}^{i})}.$ (22)

The formula to recover the distribution $P^{i *} (θ_{t}^{i})$ is

$P^{i *} (θ_{t}^{i}) = \sum_{m_{t}^{i} \in M^{i}} \sum_{a_{t}^{i} \in A^{i}} \sum_{α_{t}^{i} \in A^{i}} c^{i} (θ_{t}^{i}, m_{t}^{i}, a_{t}^{i}, α_{t}^{i}) > 0.$ (23)

Finally,

$z^{i *} (α_{t}^{i} | a_{t}^{i}) = \frac{\sum_{θ_{t}^{i} \in Θ^{i}} \sum_{m_{t}^{i} \in M^{i}} c^{i *} (θ_{t}^{i}, m_{t}^{i}, a_{t}^{i}, α_{t}^{i})}{\sum_{θ_{t}^{i} \in Θ^{i}} \sum_{m_{t}^{i} \in M^{i}} \sum_{ζ_{t}^{i} \in A^{i}} c^{i *} (θ_{t}^{i}, m_{t}^{i}, a_{t}^{i}, ζ_{t}^{i})}.$ (24)
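Equations (22)–(24) are marginalizations of the joint c-variable followed by a normalization, which a few lines of code can make explicit. The sketch below is an illustration under stated assumptions rather than the authors' code: the function name `recover`, the dictionary layout `c[(theta, m, a, alpha)]`, and the test array are hypothetical, and every normalizing sum is assumed to be strictly positive.

```python
from itertools import product


def recover(c, n_types, n_msgs, n_actions):
    """Recover P, sigma and z from c[(theta, m, a, alpha)] via Equations (22)-(24)."""
    T, M, A = range(n_types), range(n_msgs), range(n_actions)
    # Equation (23): marginal over everything except the type.
    P = [sum(c[(th, m, a, al)] for m, a, al in product(M, A, A)) for th in T]
    # Equation (22): message distribution conditional on the type
    # (the denominator of (22) is exactly the marginal P[th]).
    sigma = [
        [sum(c[(th, m, a, al)] for a, al in product(A, A)) / P[th] for m in M]
        for th in T
    ]
    # Equation (24): distribution of alpha conditional on a.
    z = []
    for a in A:
        total = sum(c[(th, m, a, ze)] for th, m, ze in product(T, M, A))
        z.append([sum(c[(th, m, a, al)] for th, m in product(T, M)) / total for al in A])
    return P, sigma, z


# Hypothetical strictly positive joint array, normalized to sum to one.
keys = list(product(range(2), repeat=4))
raw = {k: 1.0 + k[0] + 2 * k[1] + k[2] * k[3] for k in keys}
norm = sum(raw.values())
c = {k: v / norm for k, v in raw.items()}

P, sigma, z = recover(c, 2, 2, 2)
print(P)
print(sigma)
print(z)
```

Equation (21) is left out on purpose, because its right-hand side still carries the index $a_{t}^{i}$ and would require a further assumption about how that index is aggregated.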

6. Numerical Example

We use the notion of manipulation to study non-cooperative games with uncertain information. Forward and futures markets clearly illustrate several key problems in the economics of uncertainty and information asymmetry. These markets facilitate trade among firms so that risks are distributed efficiently, taking each agent’s risk preferences into account, and their practical significance has long made them a subject of study in the literature. For example, the examination of squeezes or corners, as they are called in institutional terms, raises the issue of market manipulation. In both scenarios, it is implicitly assumed that certain corporations have access to strategic information and use this advantage to influence the market in their favor, thereby affecting prices.

In this section, we present an example of how oligopolistic firms behave under manipulation. The game consists of four firms, and the main characteristic of this market is mutual interdependence, i.e., one firm’s action influences the profits of its rivals. In the dynamics of the game, firms 1 and 2 play first, followed by firms 3 and 4. The firms are tempted to collude to fix a price (identical products) or a production level (different products) in order to maximize their profits. Colluding allows them to act as a monopoly. However, each firm pursues its own self-interest by producing greater quantities than the other firms, which leads to lower prices (e.g., petroleum). As collusive agreements are illegal, there is a threat that firms may defect by cheating on their associates and manipulating the market. Each oligopolist reasons that, if it is the only one cheating, its profits will increase; hence, each firm has an incentive to cheat.

Let $G$ be a non-cooperative game determined by $G = (I, S, u^{l}, u^{f})$. Here, $I = \{1, \dots, 4\}$ is the set of all Machiavellian players, where $L = \{1, 2\}$ is the set of manipulating players indexed by $l = \overline{1, 2}$, and $F = \{3, 4\}$ is the set of manipulated players indexed by $f = \overline{3, 4}$. In the dynamics of the game, the players take alternate turns: the manipulators commit first to a strategy, and then the manipulated players’ strategy is played. Let the number of states for each player be $θ = 4$ and let $a = 2$ be the number of actions for each player.

We intend to reflect the ideas that firms are committed to their manipulation actions for a finite length of time, and that they react to the current manipulation actions of other firms. A firm’s instantaneous price competition is given by Equations (17) and (18). The proximal method presented in Equation (15) was applied to calculate the manipulation equilibrium for the initial variables $δ_{0} = 5.0 \times 10^{- 4}$, $μ_{0} = \times 10^{- 6}$, and $γ_{0} = 4.85 \times 10^{- 1}$. The resulting values of interest are as follows:

\begin{matrix} \begin{matrix} π^{1 *} (a | m) = [\begin{matrix} 0.5293 & 0.4707 \\ 0.4979 & 0.5021 \\ 0.4798 & 0.5202 \\ 0.5353 & 0.4647 \end{matrix}], \end{matrix} \begin{matrix} z^{1 *} (α | a) = [\begin{matrix} 0.5038 & 0.4962 \\ 0.5634 & 0.4366 \end{matrix}], \end{matrix} \\ \begin{matrix} σ^{1 *} (m | θ) = [\begin{matrix} 0.3005 & 0.2316 & 0.2345 & 0.2334 \\ 0.2504 & 0.2458 & 0.2718 & 0.2320 \\ 0.2636 & 0.2637 & 0.2775 & 0.1951 \\ 0.2267 & 0.2242 & 0.2433 & 0.3058 \end{matrix}], \end{matrix} \begin{matrix} P^{1 *} (θ) = [\begin{matrix} 0.2860 \\ 0.2070 \\ 0.2740 \\ 0.2331 \end{matrix}] . \end{matrix} \end{matrix}

\begin{matrix} \begin{matrix} π^{2 *} (a | m) = [\begin{matrix} 0.5276 & 0.4724 \\ 0.4974 & 0.5026 \\ 0.4820 & 0.5180 \\ 0.6041 & 0.3959 \end{matrix}] . \end{matrix} \begin{matrix} z^{2 *} (α | a) = [\begin{matrix} 0.5097 & 0.5388 \\ 0.4903 & 0.4612 \end{matrix}], \end{matrix} \\ \begin{matrix} σ^{2 *} (m | θ) = [\begin{matrix} 0.3238 & 0.2240 & 0.2281 & 0.2241 \\ 0.2447 & 0.2451 & 0.2691 & 0.2411 \\ 0.2596 & 0.2668 & 0.2786 & 0.1950 \\ 0.2497 & 0.2491 & 0.2613 & 0.2399 \end{matrix}], \end{matrix} \begin{matrix} P^{2 *} (θ) = [\begin{matrix} 0.2260 \\ 0.1840 \\ 0.3122 \\ 0.2778 \end{matrix}] . \end{matrix} \end{matrix}

\begin{matrix} λ^{l *} = [\begin{matrix} 0.1802 \\ 0.8198 \end{matrix}] . \end{matrix}

\begin{matrix} \begin{matrix} π^{3 *} (a | m) = [\begin{matrix} 0.4375 & 0.5625 \\ 0.5000 & 0.5000 \\ 0.5003 & 0.4997 \\ 0.3487 & 0.6513 \end{matrix}], \end{matrix} \begin{matrix} z^{3 *} (α | a) = [\begin{matrix} 0.0124 & 0.9876 \\ 0.2182 & 0.7818 \end{matrix}], \end{matrix} \\ \begin{matrix} σ^{3 *} (m | θ) = [\begin{matrix} 0.9953 & 0.0016 & 0.0016 & 0.0016 \\ 0.1897 & 0.1891 & 0.1879 & 0.4333 \\ 0.0018 & 0.0018 & 0.0018 & 0.9946 \\ 0.0090 & 0.0090 & 0.0089 & 0.9731 \end{matrix}], \end{matrix} \begin{matrix} P^{3 *} (θ) = [\begin{matrix} 0.2534 \\ 0.3275 \\ 0.2239 \\ 0.1953 \end{matrix}] . \end{matrix} \end{matrix}

\begin{matrix} \begin{matrix} π^{4 *} (a | m) = [\begin{matrix} 0.4375 & 0.5625 \\ 0.5000 & 0.5000 \\ 0.5004 & 0.4996 \\ 0.3285 & 0.6715 \end{matrix}], \end{matrix} \begin{matrix} z^{4 *} (α | a) = [\begin{matrix} 0.0251 & 0.9749 \\ 0.1939 & 0.8061 \end{matrix}], \end{matrix} \\ \begin{matrix} σ^{4 *} (m | θ) = [\begin{matrix} 0.9960 & 0.0013 & 0.0013 & 0.0013 \\ 0.1495 & 0.1479 & 0.1472 & 0.5554 \\ 0.0013 & 0.0013 & 0.0013 & 0.9961 \\ 0.0022 & 0.0022 & 0.0022 & 0.9934 \end{matrix}], \end{matrix} \begin{matrix} P^{4 *} (θ) = [\begin{matrix} 0.2981 \\ 0.2092 \\ 0.3112 \\ 0.1815 \end{matrix}] . \end{matrix} \end{matrix}

\begin{matrix} λ^{f *} = [\begin{matrix} 0.6100 \\ 0.3900 \end{matrix}] . \end{matrix}
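A quick consistency check on the reported values is that every row of a recovered kernel, as well as each distribution and weight vector, sums to one up to rounding in the fourth decimal. The sketch below performs that check for firm 1 and for the manipulators' weights, using the numbers printed above; the dictionary keys and the helper name `max_row_deviation` are illustrative choices.

```python
# Values copied from the reported results for firm 1 and the manipulators' weights.
reported = {
    "pi_1": [[0.5293, 0.4707], [0.4979, 0.5021], [0.4798, 0.5202], [0.5353, 0.4647]],
    "z_1": [[0.5038, 0.4962], [0.5634, 0.4366]],
    "sigma_1": [
        [0.3005, 0.2316, 0.2345, 0.2334],
        [0.2504, 0.2458, 0.2718, 0.2320],
        [0.2636, 0.2637, 0.2775, 0.1951],
        [0.2267, 0.2242, 0.2433, 0.3058],
    ],
    "P_1": [[0.2860, 0.2070, 0.2740, 0.2331]],  # distribution stored as a single row
    "lambda_l": [[0.1802, 0.8198]],             # weights stored as a single row
}


def max_row_deviation(rows):
    """Largest absolute gap between a row sum and one."""
    return max(abs(sum(row) - 1.0) for row in rows)


deviations = {name: max_row_deviation(rows) for name, rows in reported.items()}
for name, dev in deviations.items():
    print(name, dev)
```

The same check can be repeated for firms 2 to 4 by adding their matrices to the dictionary.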

Depending on the interpretation of the oligopoly, action a can represent the choice of a price, quantity, location, etc. It can also represent a vector of choices. Firms act in discrete time, and the horizon is finite. The convergence of strategies for the manipulating players is illustrated in Figure 1 and Figure 2; the convergence of strategies for the manipulated players is shown in Figure 3 and Figure 4.

Based on the understanding of the strategies, the example consists of two key components. According to the manipulation model, firms 1 and 2 accumulate a significant number of positions without causing an increase in the price of these contracts. The manipulated firms 3 and 4 are compelled to maintain prices. A deeper examination of firms 1 and 2 reveals that they possess the necessary market sway to keep futures prices from rising while compelling firms 3 and 4 to maintain lower pricing and significantly strengthen their own positions.

7. Conclusions

This paper proposes a manipulation game. As part of our contribution, the game was modeled using the best and worst Nash equilibrium points. We constrained the problem to homogeneous, finite, ergodic, and controllable Bayesian–Markov games. Players who adhered to Machiavellian principles claimed to be in one state while they were actually in another, and pretended to perform one action while actually playing another. Each Machiavellian participant engaged in some form of interpersonal manipulation. Manipulating players exhibited a higher preference than manipulated players. The Pareto frontier is defined as the line where manipulating players play the best Nash equilibrium and manipulated players play the worst Nash equilibrium. With multiple manipulating and manipulated players, the game is also considered a sequential Bayesian–Markov manipulation game. Lastly, we provided a tractable characterization of the manipulation equilibrium. We employed Tikhonov’s penalty regularization method to ensure that the game’s solution converges to a unique solution. A numerical example illustrates the results of our model. Several issues remain for future studies. We are considering extending our method to handle Bayesian–Markov games and a hierarchical Stackelberg model; the framework might be expanded to support leaders and followers in a three-level model. Finding a practical economic application of the concept is an intriguing challenge for manipulation games.

Author Contributions

Conceptualization, J.S.-R. and J.B.C.; Investigation, J.S.-R. and J.B.C.; Writing—original draft, J.S.-R., J.M.R.-M. and J.B.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

Jones, D.N.; Paulhus, D.L. Handbook of Individual Differences in Social Behavior; Chapter Machiavellianism; The Guilford Press: New York, NY, USA, 2009; pp. 93–108.
Christie, R.; Geis, F. Studies in Machiavellianism; Academic Press: Cambridge, MA, USA, 2013.
Spielberger, C.D.; Butcher, J. Advances in Personality Assessment; Routledge: New York, NY, USA, 2013.
Geis, F. Dimensions of Personality; London, H., Exner, J.E., Jr., Eds.; Chapter Machiavellianism; Wiley: New York, NY, USA, 1978; pp. 305–363.
Nash, J.F. Non-cooperative games. Ann. Math. 1951, 54, 286–295.
Cournot, A. Recherches sur les Principes Mathématiques de la Théorie des Richesses; The Macmillan Company: New York, NY, USA, 1838.
Allen, F.; Gorton, G. Stock price manipulation, market microstructure and asymmetric information. Eur. Econ. Rev. 1992, 36, 624–630.
Bagnoli, M.; Lipman, B.L. Stock Price Manipulation through Takeover Bids. Rand J. Econ. 1996, 27, 124–147.
Clempner, J.B. A game theory model for manipulation based on Machiavellianism: Moral and ethical behavior. J. Artif. Soc. Soc. Simul. 2017, 20, 12.
Cumming, D.; Ji, S.; Peter, R.; Tarsalewska, M. Market manipulation and innovation. J. Bank. Financ. 2020, 120, 105957.
Clempner, J.B.; Trejo, K. Manipulation Power in Bargaining Games Using Machiavellianism. Econ. Comput. Econ. Cybern. Stud. Res. 2021, 55, 299–313.
Clempner, J.B. Learning machiavellian strategies for manipulation in Stackelberg security games. Ann. Math. Artif. Intell. 2022, 90, 373–395.
Clempner, J.B. A Manipulation Game Based on Machiavellian Strategies. Int. Game Theory Rev. 2022, 24, 2150015.
Milgrom, P.; Roberts, J. Relying on the information of interested parties. RAND J. Econ. 1986, 17, 18–32.
Krishna, V.; Morgan, J. A model of expertise. Q. J. Econ. 2001, 116, 747–775.
Taneva, I. Information Design. Am. Econ. J. Microecon. 2019, 11, 151–185.
Bardhi, A.; Guo, Y. Modes of persuasion toward unanimous consent. Theor. Econ. 2018, 13, 1111–1149.
Kamenica, E.; Gentzkow, M. Bayesian Persuasion. Am. Econ. Rev. 2011, 101, 2590–2615.
Bergemann, D.; Morris, S. Information design, Bayesian persuasion, and Bayes correlated equilibrium. Am. Econ. Rev. 2016, 106, 586–591.
Gentzkow, M.; Kamenica, E. Bayesian persuasion with multiple senders and rich signal spaces. Games Econ. Behav. 2017, 104, 411–429.
Brocas, I.; Carrillo, J.D.; Palfrey, T.R. Information gatekeepers: Theory and experimental evidence. Econ. Theory 2012, 51, 649–676.
Gul, F.; Pesendorfer, W. The war of information. Rev. Econ. Stud. 2012, 79, 707–734.
Eső, P.; Szentes, B. Optimal Information Disclosure in Auctions and the Handicap Auction. Rev. Econ. Stud. 2007, 74, 705–731.
Bergemann, D.; Pesendorfer, M. Information structures in optimal auctions. J. Econ. Theory 2007, 137, 580–609.
Rayo, L.; Segal, I. Optimal information disclosure. J. Political Econ. 2010, 118, 949–987.
Li, H.; Shi, X. Discriminatory Information Disclosure. Am. Econ. Rev. 2017, 107, 3363–3385.
Clempner, J.B.; Poznyak, A.S. A Tikhonov regularized penalty function approach for solving polylinear programming problems. J. Comput. Appl. Math. 2018, 328, 267–286.
Clempner, J.B.; Poznyak, A.S. A Tikhonov regularization parameter approach for solving Lagrange constrained optimization problems. Eng. Optim. 2018, 50, 1996–2012.
Clempner, J.B.; Poznyak, A.S. Multiobjective Markov chains optimization problem with strong Pareto frontier: Principles of decision making. Expert Syst. Appl. 2017, 68, 123–135.
Clempner, J.B.; Poznyak, A.S. Constructing the Pareto front for multi-objective Markov chains handling a strong Pareto policy approach. Comput. Appl. Math. 2018, 3, 567–591.
Tanaka, K. The Closest Solution To The Shadow Minimum of a Cooperative Dynamic Game. Comput. Math. Appl. 1989, 18, 181–188.
Tanaka, K.; Yokoyama, K. On ϵ-equilibrium point in a noncooperative n-person game. J. Math. Anal. Appl. 1991, 160, 413–423.

Figure 1. Convergence of strategies c for firm 1.

Figure 2. Convergence of strategies c for firm 2.

Figure 3. Convergence of strategies c for firm 3.

Figure 4. Convergence of strategies c for firm 4.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
