Article

Vulnerability and Defence: A Case for Stackelberg Game Dynamics

1 School of Computer and Mathematical Sciences, University of Adelaide, Adelaide, SA 5005, Australia
2 Defence Science and Technology Group, P.O. Box 1500, Edinburgh, SA 5111, Australia
* Author to whom correspondence should be addressed.
Games 2024, 15(5), 32; https://doi.org/10.3390/g15050032
Submission received: 27 July 2024 / Revised: 13 September 2024 / Accepted: 14 September 2024 / Published: 18 September 2024

Abstract

This paper examines the tactical interaction between drones and tanks in modern warfare through game theory, particularly focusing on Stackelberg equilibrium and backward induction. It describes a high-stakes conflict between two teams: one using advanced drones for attack, and the other defending using tanks. The paper conceptualizes this as a sequential game, illustrating the complex strategic dynamics similar to Stackelberg competition, where moves and countermoves are carefully analyzed and predicted.

1. Introduction

More than a century before John Nash formalized the concept of equilibrium in game theory [1,2,3], Antoine Cournot [4] had already introduced a similar idea through his duopoly model, which became a cornerstone in the study of industrial organization [5]. In economics, an oligopoly refers to a market structure in which a small number of firms ($n \ge 2$) supply a particular product. A duopoly, a specific case where $n = 2$, is the scenario to which Cournot's model applies. In this model, two firms simultaneously produce and sell a homogeneous product. Cournot identified an equilibrium quantity for each firm, where the optimal strategy for each participant is to follow a specific rule if the other firm adheres to it. This idea of equilibrium in a duopoly anticipated Nash's more general concept of equilibrium points in non-cooperative games.
In 1934, Heinrich von Stackelberg [6,7] introduced a dynamic extension to Cournot’s model by allowing for sequential moves rather than simultaneous ones. In the Stackelberg model, one firm, the leader, moves first, while the second, the follower, reacts accordingly. A well-known example of such strategic behavior is General Motors’ leadership in the early U.S. automobile industry, with Ford and Chrysler often acting as followers.
The Stackelberg equilibrium, derived through backward induction, represents the optimal outcome in these sequential-move games. This equilibrium is often considered more robust than Nash equilibrium (NE) in such settings, as sequential games can feature multiple NEs, but only one corresponds to the backward-induction outcome [1,2,3].

2. Related Work

Stackelberg games have been significantly influential in security and military research applications [8,9,10,11,12,13,14,15,16,17,18,19]. These games, based on the Stackelberg competition model, have been successfully applied in a wide range of real-world scenarios. They are particularly notable for their deployment in contexts where security decisions are critical, such as in protecting infrastructure and managing military operations.
The sequential setup of Stackelberg games is particularly relevant in military contexts where strategic decisions often involve anticipating and responding to an adversary’s actions. The applications of such games in military settings are diverse, ranging from optimizing resource allocation for defence to strategizing offensive maneuvers.
This paper considers the strategic interplay between drones and tanks through the lens of the Stackelberg equilibrium and the principles of backwards induction. In a military operation setting, we consider two types of agents, namely, the attacker (Red team) and the defender (Blue team). The attacker might utilize mobile threats to attack or reduce the number of static, immobile assets belonging to the defender. In response, the defender will employ countermeasures to reduce the number of enemy attackers. This complex pattern of strategic moves and countermoves is explored as a sequential game, drawing on the concept of Stackelberg competition to illuminate the dynamics at play.
While focusing on developing a game-theoretical analysis, we have presented a hypothetical strategic scenario involving tanks and drones to illustrate our point. Naturally, this scenario may only loosely reflect the realities of such encounters, which evolve rapidly and are subject to constant change.
This paper's contribution consists of obtaining an analytical solution to a Stackelberg competition in a military setting. To obtain such a solution, we limit the number of available strategic moves to a small set that is nevertheless sufficient to demonstrate the dynamics of a sequential strategic military operation.

3. The Game Definition

We consider a scenario in which two teams, Blue ( B ) and Red ( R ), are engaged in a military operation. The B team comprises ground units, specifically tanks, while the R team operates aerial units, namely drones. The strategy involves the R team’s drones targeting the B team’s tanks. Meanwhile, the B team not only has the capability to shoot down these drones but also provides defensive cover for their tanks, creating a complex interplay of offensive and defensive maneuvers in this combat scenario.
We assume that the B team consists of $n$ tanks, where $n \in \mathbb{N}$, and represent the set of tanks as $T = \{T_1, T_2, \ldots, T_n\}$. Let $S = \{S_1, S_2, \ldots, S_m\}$ be a set of resources that are at the disposal of the B team to protect the tanks. It is assumed that the R team's pure strategy is to attack one of the tanks from the set $T$. The R team's mixed strategy is then defined as a vector $A_T$, where $A_T$ is the probability of attacking the tank $T$ and $\sum_{T=1}^{n} A_T = 1$. The B team's mixed strategy is also a vector $D_T$, where $D_T$ is the marginal probability of protecting the tank $T$. Note that a marginal probability is obtained by summing (or integrating) over the distribution of the variables that are being disregarded, and these disregarded variables are said to have been marginalized out.

3.1. Marginal Probability of Protection for Tanks

We consider the case when there are five resources from the set $\{S_1, S_2, \ldots, S_5\}$ available to protect the four tanks from $T = \{T_1, T_2, \ldots, T_4\}$, where one or more of the resources can be used to protect a tank $T_i$ from the set $T$. For this case, the marginal probabilities $D_T$ to protect the tank $T$ are determined as follows:
$$\begin{array}{c|ccccc|c}
 & S_1 & S_2 & S_3 & S_4 & S_5 & D_T \\ \hline
T_1 & \Pr(T_1,S_1) & \Pr(T_1,S_2) & \Pr(T_1,S_3) & \Pr(T_1,S_4) & \Pr(T_1,S_5) & D_1 = \frac{1}{5}\sum_{i=1}^{5}\Pr(T_1,S_i) \\
T_2 & \Pr(T_2,S_1) & \Pr(T_2,S_2) & \Pr(T_2,S_3) & \Pr(T_2,S_4) & \Pr(T_2,S_5) & D_2 = \frac{1}{5}\sum_{i=1}^{5}\Pr(T_2,S_i) \\
T_3 & \Pr(T_3,S_1) & \Pr(T_3,S_2) & \Pr(T_3,S_3) & \Pr(T_3,S_4) & \Pr(T_3,S_5) & D_3 = \frac{1}{5}\sum_{i=1}^{5}\Pr(T_3,S_i) \\
T_4 & \Pr(T_4,S_1) & \Pr(T_4,S_2) & \Pr(T_4,S_3) & \Pr(T_4,S_4) & \Pr(T_4,S_5) & D_4 = \frac{1}{5}\sum_{i=1}^{5}\Pr(T_4,S_i)
\end{array} \quad (1)$$
with rows indexed by the tanks and columns by the resources,
where
$$\sum_{i=1}^{4}\Pr(T_i,S_1) = \sum_{i=1}^{4}\Pr(T_i,S_2) = \sum_{i=1}^{4}\Pr(T_i,S_3) = \sum_{i=1}^{4}\Pr(T_i,S_4) = \sum_{i=1}^{4}\Pr(T_i,S_5) = 1. \quad (2)$$
For instance, $\Pr(T_3, S_2)$ is the probability that the resource $S_2$ is used to give protection to the tank $T_3$. This means $D_i \le 1$ and $\sum_{i=1}^{4} D_i = 1$. In case the set of resources consists of $m$ elements, i.e., $S = \{S_1, S_2, \ldots, S_m\}$, and the number of tanks to be protected is $n$, the marginal probability $D_j$ to protect the $j$-th tank is obtained as $D_j = \frac{1}{m}\sum_{i=1}^{m}\Pr(T_j, S_i)$, where $1 \le j \le n$.
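As a concrete illustration of this averaging, the short sketch below builds a joint allocation matrix $\Pr(T_j, S_i)$ for four tanks and five resources and marginalizes over the resource columns; the matrix entries are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

# Illustrative joint allocation probabilities Pr(T_j, S_i): rows are tanks T_1..T_4,
# columns are resources S_1..S_5. Each column sums to 1 (every resource is allocated
# across the tanks), as required by Equation (2). The entries are assumed for this
# sketch only.
P = np.array([
    [0.4, 0.1, 0.3, 0.2, 0.1],
    [0.2, 0.4, 0.2, 0.3, 0.2],
    [0.3, 0.3, 0.1, 0.3, 0.3],
    [0.1, 0.2, 0.4, 0.2, 0.4],
])
assert np.allclose(P.sum(axis=0), 1.0)  # each resource column is a distribution over tanks

# Marginal protection probability of tank j: D_j = (1/m) * sum_i Pr(T_j, S_i).
m = P.shape[1]
D = P.sum(axis=1) / m
print(D, D.sum())  # e.g. [0.22 0.26 0.26 0.26] 1.0 -- the D_j sum to 1, as in the text
```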

3.2. Defining the Reward Functions

Let $R_B(T)$ be the reward to team B if the attacked tank $T$ is protected using resources from the set $S = \{S_1, S_2, \ldots, S_m\}$, $C_B(T)$ be the cost to team B if the attacked tank $T$ is unprotected, $R_R(T)$ be the reward to team R if the attacked tank $T$ is unprotected, and $C_R(T)$ be the cost to team R if the attacked tank $T$ is protected. Note that $D_T$ is the marginal probability of protecting the tank $T$ using the resources from the set $S$.
The quantity $D_T R_B(T) - (1 - D_T) C_B(T)$ then describes the payoff to the B team when tank $T$ is attacked. Similarly, the quantity $(1 - D_T) R_R(T) - D_T C_R(T)$ describes the payoff to the R team when the tank $T$ is attacked. However, the probability that the tank $T$ is attacked is $A_T$, and we can take this into consideration to define the quantities $A_T\{D_T R_B(T) - (1 - D_T) C_B(T)\}$ and $A_T\{(1 - D_T) R_R(T) - D_T C_R(T)\}$. These are the contributions to the payoffs to the B and R teams, respectively, when the tank $T$ is attacked with the probability $A_T$.
As the vector $A_T$ describes the R team's (mixed) attacking strategy whereas the vector $D_T$ describes the B team's (mixed) protection strategy, the players' strategy profiles are given as $\{D_T, A_T\}$. For a set of tanks $T$, the expected payoffs [16,17] to the B and R teams, respectively, can then be written as
$$\Pi_B\{D_T, A_T\} = \sum_{T \in T} A_T\{D_T R_B(T) - (1 - D_T) C_B(T)\}, \qquad \Pi_R\{D_T, A_T\} = \sum_{T \in T} A_T\{(1 - D_T) R_R(T) - D_T C_R(T)\}. \quad (3)$$
We note from these payoffs that if the attack probability for a tank $T$ is zero, the rewards to both the B and R teams for that tank are also zero; the payoff functions for either team depend only on the attacked tanks; and if the B and R teams move simultaneously, the solution is a Nash equilibrium.
Note that, with reference to the reward functions defined in Equation (3), for the imagined strategy profile (not equilibrium) where the defender protects the first tank by all elements from the set $\{S_1, S_2, \ldots, S_5\}$, i.e., $\Pr(T_1, S_i) = 1$ for $1 \le i \le 5$, we obtain $D_1 = \frac{1}{5}\sum_{i=1}^{5}\Pr(T_1, S_i) = 1$ and thus $D_{2,3,4} = 0$. If the attacker decides to attack the first tank, i.e., $A_1 = 1$, we obtain $\Pi_B\{D_T, A_T\} = R_B(T_1)$ and $\Pi_R\{D_T, A_T\} = -C_R(T_1)$.
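A direct transcription of the payoffs in Equation (3) is sketched below; the reward and cost numbers in the usage line are placeholders chosen only to exercise the formulas and to reproduce the special profile just discussed.

```python
import numpy as np

def payoffs(D, A, R_B, C_B, R_R, C_R):
    """Expected payoffs of Equation (3) for the B (defender) and R (attacker) teams.

    D, A     : marginal protection and attack probabilities over the tanks.
    R_B, C_B : defender's reward (attacked tank protected) and cost (unprotected).
    R_R, C_R : attacker's reward (attacked tank unprotected) and cost (protected).
    """
    D, A = np.asarray(D, float), np.asarray(A, float)
    pi_B = float(np.sum(A * (D * np.asarray(R_B) - (1 - D) * np.asarray(C_B))))
    pi_R = float(np.sum(A * ((1 - D) * np.asarray(R_R) - D * np.asarray(C_R))))
    return pi_B, pi_R

# Placeholder values (assumptions, not the paper's table): protect only tank 1 and
# attack only tank 1, which returns (R_B(T_1), -C_R(T_1)).
print(payoffs(D=[1, 0, 0, 0], A=[1, 0, 0, 0],
              R_B=[5, 4, 3, 2], C_B=[2, 2, 2, 2],
              R_R=[3, 3, 3, 3], C_R=[1, 1, 1, 1]))   # -> (5.0, -1.0)
```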

4. Leader-Follower Interaction and Stackelberg Equilibrium

We consider a three-step strategic game between the B and R teams, also called the leader-follower interaction. As the leader, the B team chooses an action consisting of a protection strategy $D_T$. The R team observes $D_T$ and then chooses an action consisting of its attack strategy given by the vector $A_T$. Knowing the rational response $A_T$ of the R team, the B team takes this into account and, as the leader, optimizes its own action. The payoffs to the two teams are $\Pi_B\{D_T, A_T\}$ and $\Pi_R\{D_T, A_T\}$.
This game is an example of the dynamic games of complete and perfect information [2]. Key features of this game are (a) the moves occur in sequence, (b) all previous moves are known before the next move is chosen, and (c) the players' payoffs are common knowledge. This framework allows for strategic decision-making based on the actions and expected reactions of the other players, typical of Stackelberg competition scenarios. In many real-world scenarios, especially in complex military environments, the assumption that players' payoffs are common knowledge does not hold, and complete information about the payoffs of other players is rarely available.
Given the action $D_T$ previously chosen by the B team, at the second stage of the game, when the R team gets the move, it faces the problem:
$$\max_{A_T} \Pi_R\{D_T, A_T\}. \quad (4)$$
Assume that for each $D_T$, the R team's optimization problem (4) has a unique solution $S_R(D_T)$, which is known as the best response of the R team. The B team can also solve the R team's optimization problem by anticipating the R team's response to each action $D_T$ that the B team might take, so that the B team faces the problem:
$$\max_{D_T} \Pi_B\{D_T, S_R(D_T)\}. \quad (5)$$
Suppose this optimization problem also has a unique solution for the B team, denoted by $D_T^*$. The solution $(D_T^*, S_R(D_T^*))$ is the backwards-induction outcome of this game.
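On a finite grid of defender strategies, this backward-induction outcome can be approximated numerically: for each candidate $D_T$, compute the follower's best response, then let the leader keep the $D_T$ whose induced response gives the leader the highest payoff. The sketch below does this with pure attacks as follower responses (the follower's problem (4) is linear in $A_T$, so some pure attack is always optimal); it is a coarse brute-force counterpart to the analytical treatment that follows, and all numeric inputs are assumptions.

```python
import itertools
import numpy as np

def follower_best_response(D, R_R, C_R):
    """Attacker's best pure response to a protection profile D (problem (4)).
    Pi_R is linear in A_T, so a pure attack on some tank is always optimal."""
    per_tank = (1 - D) * R_R - D * C_R          # attacker's payoff from each tank
    A = np.zeros_like(D)
    A[int(np.argmax(per_tank))] = 1.0
    return A

def leader_payoff(D, A, R_B, C_B):
    return float(np.sum(A * (D * R_B - (1 - D) * C_B)))

def stackelberg_by_grid(R_B, C_B, R_R, C_R, step=0.05):
    """Coarse grid search over defender marginals with sum(D) = 1 (problem (5))."""
    best_val, best_D, best_A = -np.inf, None, None
    grid = np.arange(0.0, 1.0 + 1e-9, step)
    for d1, d2, d3 in itertools.product(grid, repeat=3):
        d4 = 1.0 - d1 - d2 - d3
        if d4 < -1e-9:
            continue
        D = np.array([d1, d2, d3, max(d4, 0.0)])
        A = follower_best_response(D, R_R, C_R)
        val = leader_payoff(D, A, R_B, C_B)
        if val > best_val:
            best_val, best_D, best_A = val, D, A
    return best_val, best_D, best_A

# Assumed illustrative parameters (not the paper's):
print(stackelberg_by_grid(R_B=np.array([5., 4., 3., 2.]), C_B=np.array([2., 2., 2., 2.]),
                          R_R=np.array([3., 3., 3., 3.]), C_R=np.array([1., 1., 1., 1.])))
```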
To address this, we consider the above simplified case, i.e., when $T = \{T_1, T_2, \ldots, T_4\}$. Expanding Equation (3), we obtain:
$$\Pi_R\{D_T, A_T\} = \sum_{i=1}^{4} A_i\{(1 - D_i) R_R(T_i) - D_i C_R(T_i)\}. \quad (6)$$
Now, as $\sum_{T=1}^{4} A_T = 1$, we take as an arbitrary choice $A_3 = 1 - A_1 - A_2 - A_4$ in Equation (6) to obtain
$$\Pi_R\{D_T, A_T\} = \sum_{\substack{i=1 \\ i \ne 3}}^{4} A_i\{(1 - D_i) R_R(T_i) - D_i C_R(T_i)\} + (1 - A_1 - A_2 - A_4)\{(1 - D_3) R_R(T_3) - D_3 C_R(T_3)\}, \quad (7)$$
and this re-expresses the R team's reward function in terms of only three variables $A_1$, $A_2$, and $A_4$ defining its attack strategy $A_T$. When expanded, the above equation becomes
$$\begin{aligned}
\Pi_R\{D_T, A_T\} ={}& A_1\{(1 - D_1) R_R(T_1) - D_1 C_R(T_1) - (1 - D_3) R_R(T_3) + D_3 C_R(T_3)\} \\
&+ A_2\{(1 - D_2) R_R(T_2) - D_2 C_R(T_2) - (1 - D_3) R_R(T_3) + D_3 C_R(T_3)\} \\
&+ A_4\{(1 - D_4) R_R(T_4) - D_4 C_R(T_4) - (1 - D_3) R_R(T_3) + D_3 C_R(T_3)\} \\
&+ (1 - D_3) R_R(T_3) - D_3 C_R(T_3). \quad (8)
\end{aligned}$$
As a rational player, the B team knows that the R team would maximize its reward function with respect to its strategic variables and this is expressed as
$$\frac{\partial \Pi_R\{D_T, A_T\}}{\partial A_1} = \frac{\partial \Pi_R\{D_T, A_T\}}{\partial A_2} = \frac{\partial \Pi_R\{D_T, A_T\}}{\partial A_4} = 0, \quad (9)$$
where $A_1, A_2, A_4 \in [0, 1]$ and $\sum_{T=1}^{4} A_T = 1$. This results in obtaining
$$\begin{aligned}
D_3\{R_R(T_3) + C_R(T_3)\} &= D_1\{R_R(T_1) + C_R(T_1)\} - R_R(T_1) + R_R(T_3), \\
D_3\{R_R(T_3) + C_R(T_3)\} &= D_2\{R_R(T_2) + C_R(T_2)\} - R_R(T_2) + R_R(T_3), \\
D_3\{R_R(T_3) + C_R(T_3)\} &= D_4\{R_R(T_4) + C_R(T_4)\} - R_R(T_4) + R_R(T_3), \quad (10)
\end{aligned}$$
and this leads us to denote the sum of the reward and the cost to the B and R teams for protecting or attacking the tank T , respectively, by new symbols
$$\begin{aligned}
\Omega_1^R &= R_R(T_1) + C_R(T_1), & \Omega_2^R &= R_R(T_2) + C_R(T_2), & \Omega_3^R &= R_R(T_3) + C_R(T_3), & \Omega_4^R &= R_R(T_4) + C_R(T_4), \\
\Omega_1^B &= R_B(T_1) + C_B(T_1), & \Omega_2^B &= R_B(T_2) + C_B(T_2), & \Omega_3^B &= R_B(T_3) + C_B(T_3), & \Omega_4^B &= R_B(T_4) + C_B(T_4). \quad (11)
\end{aligned}$$
As $0 \le D_i \le 1$ and $\sum_{i=1}^{4} D_i = 1$, we substitute $D_3 = (1 - D_1 - D_2 - D_4)$ in Equation (10), along with the substitutions (11), to obtain
$$(1 - D_2 - D_4)\,\Omega_3^R = D_1(\Omega_1^R + \Omega_3^R) - R_R(T_1) + R_R(T_3), \quad (12)$$
$$(1 - D_1 - D_4)\,\Omega_3^R = D_2(\Omega_2^R + \Omega_3^R) - R_R(T_2) + R_R(T_3), \quad (13)$$
$$(1 - D_1 - D_2)\,\Omega_3^R = D_4(\Omega_4^R + \Omega_3^R) - R_R(T_4) + R_R(T_3). \quad (14)$$
Using Equations (12)–(14), we now express $D_2$ and $D_4$ in terms of $D_1$. For this, we subtract Equation (13) from Equation (12),
$$(D_1 - D_2)\,\Omega_3^R = D_1(\Omega_1^R + \Omega_3^R) - D_2(\Omega_2^R + \Omega_3^R) - R_R(T_1) + R_R(T_2),$$
which gives
$$D_2 = \frac{D_1 \Omega_1^R - R_R(T_1) + R_R(T_2)}{\Omega_2^R}. \quad (15)$$
Similarly, subtracting Equation (14) from (12) results in
$$D_4 = \frac{D_1 \Omega_1^R + R_R(T_4) - R_R(T_1)}{\Omega_4^R}. \quad (16)$$
Using Equations (15) and (16), the marginal $D_3$ can then be expressed in terms of the marginal $D_1$ as
$$D_3 = 1 - D_1\left[1 + \Omega_1^R\left(\frac{1}{\Omega_2^R} + \frac{1}{\Omega_4^R}\right)\right] - \frac{R_R(T_2) - R_R(T_1)}{\Omega_2^R} - \frac{R_R(T_4) - R_R(T_1)}{\Omega_4^R}. \quad (17)$$
Equations (15)–(17) represent the rational behaviour of the R team, which the B team can now exploit to optimize its defence strategy $D_T$.
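Transcribed directly, these follower-rationality relations give the remaining defender marginals as functions of $D_1$; a minimal sketch follows, with illustrative attacker rewards and costs that are assumptions rather than the paper's values.

```python
def induced_marginals(D1, R_R, C_R):
    """D_2, D_4 from Equations (15)-(16) and D_3 from the normalization
    D_3 = 1 - D_1 - D_2 - D_4 (Equation (17)). Lists are indexed 0..3 for T_1..T_4."""
    omega_R = [r + c for r, c in zip(R_R, C_R)]            # Omega_i^R of Equation (11)
    D2 = (D1 * omega_R[0] - R_R[0] + R_R[1]) / omega_R[1]  # Equation (15)
    D4 = (D1 * omega_R[0] + R_R[3] - R_R[0]) / omega_R[3]  # Equation (16)
    D3 = 1.0 - D1 - D2 - D4                                # Equation (17)
    return D2, D3, D4

# Assumed illustrative attacker rewards/costs:
print(induced_marginals(0.1, R_R=[2.0, 3.0, 4.0, 3.0], C_R=[2.0, 3.0, 2.0, 3.0]))
# -> (0.2333..., 0.4333..., 0.2333...): all marginals stay non-negative at D_1 = 0.1
```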
From Equation (3), the payoff function of the B team can be expressed as
$$\Pi_B\{D_T, A_T\} = \sum_{i=1}^{4}\{D_i \Omega_i^B - C_B(T_i)\} A_i, \quad (18)$$
and with the substitution $D_3 = (1 - D_1 - D_2 - D_4)$ this results in
$$\begin{aligned}
\Pi_B\{D_T, A_T\} ={}& D_1\{\Omega_1^B A_1 - \Omega_3^B A_3\} - C_B(T_1) A_1 + D_2\{\Omega_2^B A_2 - \Omega_3^B A_3\} - C_B(T_2) A_2 \\
&+ D_4\{\Omega_4^B A_4 - \Omega_3^B A_3\} - C_B(T_4) A_4 + \Omega_3^B A_3 - C_B(T_3) A_3. \quad (19)
\end{aligned}$$
Now, substituting from Equations (15) and (16) into Equation (19), along with the substitutions (11), we obtain
$$\begin{aligned}
\Pi_B\{D_T, A_T\} ={}& D_1\left\{\Omega_1^B A_1 - \Omega_3^B A_3 + \frac{\Omega_1^R[\Omega_2^B A_2 - \Omega_3^B A_3]}{\Omega_2^R} + \frac{\Omega_1^R[\Omega_4^B A_4 - \Omega_3^B A_3]}{\Omega_4^R}\right\} \\
&+ \frac{[R_R(T_2) - R_R(T_1)][\Omega_2^B A_2 - \Omega_3^B A_3]}{\Omega_2^R} + \frac{[R_R(T_4) - R_R(T_1)][\Omega_4^B A_4 - \Omega_3^B A_3]}{\Omega_4^R} \\
&+ \Omega_3^B A_3 - [C_B(T_1) A_1 + C_B(T_2) A_2 + C_B(T_3) A_3 + C_B(T_4) A_4] \\
={}& D_1 \Delta_1 + \Delta_2, \quad (20)
\end{aligned}$$
where
$$\Delta_1 = \Omega_1^B A_1 - \Omega_3^B A_3 + \frac{\Omega_1^R[\Omega_2^B A_2 - \Omega_3^B A_3]}{\Omega_2^R} + \frac{\Omega_1^R[\Omega_4^B A_4 - \Omega_3^B A_3]}{\Omega_4^R}, \quad (21)$$
$$\Delta_2 = \frac{[R_R(T_2) - R_R(T_1)][\Omega_2^B A_2 - \Omega_3^B A_3]}{\Omega_2^R} + \frac{[R_R(T_4) - R_R(T_1)][\Omega_4^B A_4 - \Omega_3^B A_3]}{\Omega_4^R} + \Omega_3^B A_3 - [C_B(T_1) A_1 + C_B(T_2) A_2 + C_B(T_3) A_3 + C_B(T_4) A_4], \quad (22)$$
appear as the new parameters of the considered sequential strategic interaction. This completes the backwards-induction process of obtaining the optimal response of the B team in view of its encounter with the rational behaviour of the R team.
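Equations (21) and (22) can be evaluated directly once the per-tank rewards, costs, and the attack probabilities are given; the following sketch mirrors the symbols of the text (the numbers in the usage line are illustrative assumptions).

```python
def deltas(A, R_B, C_B, R_R, C_R):
    """Delta_1 and Delta_2 of Equations (21)-(22). All lists are indexed 0..3
    for the tanks T_1..T_4; A holds the attack probabilities A_1..A_4."""
    oB = [r + c for r, c in zip(R_B, C_B)]   # Omega_i^B, Equation (11)
    oR = [r + c for r, c in zip(R_R, C_R)]   # Omega_i^R, Equation (11)
    delta1 = (oB[0] * A[0] - oB[2] * A[2]
              + oR[0] * (oB[1] * A[1] - oB[2] * A[2]) / oR[1]
              + oR[0] * (oB[3] * A[3] - oB[2] * A[2]) / oR[3])
    delta2 = ((R_R[1] - R_R[0]) * (oB[1] * A[1] - oB[2] * A[2]) / oR[1]
              + (R_R[3] - R_R[0]) * (oB[3] * A[3] - oB[2] * A[2]) / oR[3]
              + oB[2] * A[2]
              - sum(c * a for c, a in zip(C_B, A)))
    return delta1, delta2

# Illustrative (assumed) rewards, costs, and a uniform attack:
print(deltas(A=[0.25, 0.25, 0.25, 0.25],
             R_B=[5.0, 4.0, 3.0, 2.0], C_B=[2.0, 2.0, 2.0, 2.0],
             R_R=[2.0, 3.0, 4.0, 3.0], C_R=[2.0, 3.0, 2.0, 3.0]))
```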

5. Optimal Response of the B Team

From Equations (21) and (22) we note that $\Delta_{1,2}$ depend on the values assigned to the two teams' rewards and costs variables, i.e., $R_B(T)$, $C_B(T)$, $R_R(T)$, $C_R(T)$, as well as on the R team's attack probabilities $A_i$ ($1 \le i \le 4$). Three cases, therefore, emerge in view of Equation (20); these are described below.

5.1. Case $\Delta_1 > 0$

After observing the attack probabilities $A_i$ ($1 \le i \le 4$), the B team obtains $\Delta_1$ using Equation (21), along with the rewards and costs variables, i.e., $R_B(T)$, $C_B(T)$, $R_R(T)$, $C_R(T)$, and Equation (11). If the B team finds that $\Delta_1 > 0$, then its payoff $\Pi_B\{D_T, A_T\}$ is maximized at the maximum value of $D_1$, irrespective of the value of $\Delta_2$. Note that at this maximum value of $D_1$, the corresponding values of $D_2$, $D_3$, $D_4$, as expressed in terms of $D_1$ and given by Equations (15)–(17), must remain non-negative, and that the maximum value obtained for $D_1$ can still be less than $D_2$ or $D_3$ or $D_4$.
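In this case the leader's problem reduces to pushing $D_1$ as high as the non-negativity of $D_2$, $D_3$, $D_4$ allows; a simple scan over $D_1 \in [0, 1]$ (sketch below, with assumed illustrative parameters) finds that cutoff.

```python
def max_feasible_D1(R_R, C_R, steps=10_000):
    """Largest D_1 in [0, 1] for which the induced D_2, D_3, D_4 of
    Equations (15)-(17) all remain non-negative (case Delta_1 > 0)."""
    oR = [r + c for r, c in zip(R_R, C_R)]
    best = None
    for k in range(steps + 1):
        D1 = k / steps
        D2 = (D1 * oR[0] - R_R[0] + R_R[1]) / oR[1]   # Equation (15)
        D4 = (D1 * oR[0] + R_R[3] - R_R[0]) / oR[3]   # Equation (16)
        D3 = 1.0 - D1 - D2 - D4                        # Equation (17)
        if min(D2, D3, D4) >= 0.0:
            best = (D1, D2, D3, D4)                    # overwrites: keeps the largest feasible D_1
    return best

# Assumed illustrative attacker rewards/costs:
print(max_feasible_D1(R_R=[2.0, 3.0, 4.0, 3.0], C_R=[2.0, 3.0, 2.0, 3.0]))
# -> D_1 ~ 0.2857, at which D_3 has just reached zero
```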

5.2. Case $\Delta_1 < 0$

As $0 \le D_i \le 1$ and $\sum_{i=1}^{4} D_i = 1$, therefore, in view of the attack probabilities $A_i$ ($1 \le i \le 4$), if the B team finds that $\Delta_1 < 0$, then the reward is maximized to the value of $\Delta_2$ with $D_1 = 0$, and we then have
$$D_2 = \frac{R_R(T_2) - R_R(T_1)}{\Omega_2^R}, \qquad D_3 = 1 - \frac{R_R(T_2) - R_R(T_1)}{\Omega_2^R} - \frac{R_R(T_4) - R_R(T_1)}{\Omega_4^R}, \qquad D_4 = \frac{R_R(T_4) - R_R(T_1)}{\Omega_4^R}. \quad (23)$$
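The closed form of Equation (23) is equally direct to transcribe; the numbers below are again assumed for illustration only.

```python
def protection_when_delta1_negative(R_R, C_R):
    """Defender marginals (D_1, D_2, D_3, D_4) of Equation (23), the case Delta_1 < 0."""
    oR = [r + c for r, c in zip(R_R, C_R)]
    D2 = (R_R[1] - R_R[0]) / oR[1]
    D4 = (R_R[3] - R_R[0]) / oR[3]
    return 0.0, D2, 1.0 - D2 - D4, D4

# Assumed illustrative attacker rewards/costs:
print(protection_when_delta1_negative(R_R=[2.0, 3.0, 4.0, 3.0], C_R=[2.0, 3.0, 2.0, 3.0]))
# -> (0.0, 0.1666..., 0.6666..., 0.1666...)
```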

5.3. Case $\Delta_1 = 0$

If the B team finds that $\Delta_1 = 0$, then its reward becomes $\Delta_2$, as defined by Equation (22), and is independent of the value assigned to $D_1$ and, via Equations (15)–(17), also independent of $D_2$, $D_3$, $D_4$.

6. Example Instantiation

As an example, we consider the set of arbitrarily assigned values for the two teams' rewards and costs as in the table below.
[Table (24): the assigned rewards $R_B(T_i)$, $R_R(T_i)$ and costs $C_B(T_i)$, $C_R(T_i)$ for the tanks $T_1$–$T_4$; reproduced as an image in the original.]
for which we have
$$\Omega_1^B = 16, \quad \Omega_1^R = 6, \quad \Omega_2^B = 15, \quad \Omega_2^R = 11, \quad \Omega_3^B = 13, \quad \Omega_3^R = 10, \quad \Omega_4^B = 6, \quad \Omega_4^R = 6, \quad (25)$$
and also for which using Equations (21) and (22) we obtain
$$\Delta_1 = 16 A_1 + 8.181 A_2 - 33.090 A_3 + 6 A_4, \quad (26)$$
$$\Delta_2 = -7 A_1 - 4.272 A_2 + 2.469 A_3 - A_4. \quad (27)$$
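The sketch below evaluates Equations (21) and (22) for one assignment of rewards and costs that is consistent with the $\Omega$ values of Equation (25); the specific assignment is an assumption made for the sketch, but with it the coefficients quoted in Equations (26) and (27) are recovered.

```python
# One rewards/costs assignment consistent with Equation (25) (an assumption for this
# sketch; the paper's table is reproduced only as an image and other assignments with
# the same Omega values exist).
R_B = [9.0, 8.0, 7.0, 4.0]; C_B = [7.0, 7.0, 6.0, 2.0]
R_R = [4.0, 6.0, 7.0, 5.0]; C_R = [2.0, 5.0, 3.0, 1.0]

oB = [r + c for r, c in zip(R_B, C_B)]   # -> [16, 15, 13, 6], matching Equation (25)
oR = [r + c for r, c in zip(R_R, C_R)]   # -> [6, 11, 10, 6],  matching Equation (25)

def deltas(A):
    """Delta_1 and Delta_2 of Equations (21)-(22) for this assignment."""
    d1 = (oB[0] * A[0] - oB[2] * A[2]
          + oR[0] * (oB[1] * A[1] - oB[2] * A[2]) / oR[1]
          + oR[0] * (oB[3] * A[3] - oB[2] * A[2]) / oR[3])
    d2 = ((R_R[1] - R_R[0]) * (oB[1] * A[1] - oB[2] * A[2]) / oR[1]
          + (R_R[3] - R_R[0]) * (oB[3] * A[3] - oB[2] * A[2]) / oR[3]
          + oB[2] * A[2] - sum(c * a for c, a in zip(C_B, A)))
    return d1, d2

# Reading off the coefficients of A_1..A_4 at the unit vectors recovers
# Equations (26)-(27): ~ (16, -7), (8.18, -4.27), (-33.09, 2.47), (6, -1).
for k in range(4):
    A = [0.0] * 4
    A[k] = 1.0
    print(f"A_{k + 1}:", deltas(A))
```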

6.1. Case $\Delta_1 > 0$

Now, assume that, while knowing the attack probabilities $A_i$ ($1 \le i \le 4$), the B team uses Equation (26) to find that $\Delta_1 > 0$. As discussed above, its payoff is maximized at the maximum value of $D_1$, irrespective of the value of $\Delta_2$. Using Equations (15) and (16), along with the entries in the table (24), the B team now determines the maximum value for $D_1$ at which $D_2$, $D_3$, $D_4$, obtained from Equations (15)–(17), respectively, all have non-negative values. Table (24) gives
$$D_2 = \frac{6 D_1 + 2}{11}, \qquad D_3 = 1 - 10.272 D_1 - 0.3485, \qquad D_4 = \frac{6 D_1 + 1}{6}, \quad (28)$$
and a table of values is then obtained as
[Table of values of $D_2$, $D_3$, and $D_4$ for increasing $D_1$; reproduced as an image in the original.]
and $D_1 = 0.0634$ emerges as the maximum value at which $D_2$, $D_3$, $D_4$ remain non-negative. The plots of $D_2$, $D_3$, and $D_4$ vs. $D_1$ (Range: 0.06 to 0.0636) appear in Figure 1.
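The cutoff can also be read off directly from Equation (28): $D_2$ and $D_4$ are positive for every $D_1 \ge 0$, so the binding constraint is $D_3 \ge 0$; a short check (sketch) follows.

```python
# From Equation (28), D_2 and D_4 are already positive at D_1 = 0 and grow with D_1,
# so the binding constraint is D_3 = 1 - 10.272*D_1 - 0.3485 >= 0.
D1_max = (1.0 - 0.3485) / 10.272
D2, D4 = (6 * D1_max + 2) / 11, (6 * D1_max + 1) / 6
print(round(D1_max, 4), round(D2, 3), round(D4, 3))   # -> 0.0634 0.216 0.23
```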
The B team’s protection strategy is therefore obtained from Equation (28) as
$$D_T^* = (D_1^*, D_2^*, D_3^*, D_4^*) = (0.0634,\ 0.216,\ 2.24 \times 10^{-4},\ 0.23), \quad (30)$$
and from Equations (20), (26) and (27), along with the table (24), the B team's payoff then becomes
$$\Pi_B\{D_T^*, A_T\} = -5.986 A_1 - 3.753 A_2 + 0.371 A_3 - 0.62 A_4, \quad (31)$$
which, in view of the fact that $\sum_{i=1}^{4} A_i = 1$, can also be expressed as
$$\Pi_B\{D_T^*, A_T\} = -6.357 A_1 - 4.124 A_2 - 0.991 A_4 + 0.371. \quad (32)$$
A plot of the payoff $\Pi_B\{D_T^*, A_T\}$ for the values of $A_1$, $A_2$, $A_4$ that satisfy the constraints $0 \le A_1, A_2, A_4 \le 1$ and $0 \le A_1 + A_2 + A_4 \le 1$ is given in Figure 2.
From Equation (32), the payoff $\Pi_B\{D_T^*, A_T\}$ is maximized at the value of 0.371 for $A_3 = 1$, and therefore $A_{1,2,4} = 0$. The payoff to the R team is then obtained from Equation (8) as
$$\Pi_R\{D_T^*, A_T\} = (1 - D_3^*) R_R(T_3) - D_3^* C_R(T_3), \quad (33)$$
where from Equation (30) we have $D_3^* \approx 0$ and, using the table (24), we obtain $\Pi_R\{D_T^*, A_T\} = R_R(T_3) = 7$.
Now we consider the reaction of the R team after the B team has determined its protection strategy $D_T^*$ while following the backwards induction in the case above. For the case when the attack probabilities are such that $\Delta_1 > 0$ in Equation (26), we re-express the R team's payoff given by Equation (8) by substituting the B team's protection strategy described by Equation (30). Using $\sum_{i=1}^{4} A_i = 1$, the R team's payoff is then expressed in terms of the attack probabilities $A_1$, $A_2$, $A_4$ as
$$\Pi_R\{D_T^*, A_T\} = -3.38(A_1 + A_2 + A_4) + 7. \quad (34)$$
Now, in Figure 3 below, a plot is obtained comparing the B and R teams' payoffs given by Equations (32) and (34), respectively, when these are considered implicit functions of $A_1$, $A_2$, $A_4$ and with the constraints that $0 \le A_1, A_2, A_4 \le 1$ and $0 \le A_1 + A_2 + A_4 \le 1$. For most of the allowed values of the attack probabilities, as represented by the blue shade, and for $\Delta_1 > 0$, the R team remains significantly better off than the B team.
In view of the reward table (24), the R team's payoffs attain the maximum value of 7 when $A_1 + A_2 + A_4 = 0$, or when $A_3 = 1$. However, when this is the case, using Equation (32) the payoff to the B team then becomes 0.371.

6.2. Case $\Delta_1 \le 0$

Consider the case when, by using Equation (26), the B team finds that $\Delta_1 \le 0$. As $0 \le A_1 + A_2 + A_4 \le 1$ in Equation (26), the condition $\Delta_1 \le 0$ can be realized for some values of the attack probabilities.
A plot of the payoff $\Pi_B\{D_T^*, A_T\}$ for the values of $A_1$, $A_2$, $A_4$ that satisfy the constraints $0 \le A_1, A_2, A_4 \le 1$ and $0 \le A_1 + A_2 + A_4 \le 1$ is given in Figure 4.
Now, in view of Equation (20), the B team's reward is maximized to the value of $\Delta_2$ when $D_1 = 0$. In this case, using Equation (23) and the table (24), the B team's protection strategy is therefore obtained as
$$D_T^* = (D_1^*, D_2^*, D_3^*, D_4^*) = (0,\ 0.181,\ 0.651,\ 0.167), \quad (35)$$
and, as before, using Equations (20), (26) and (27), along with the table (24), the B team's payoff then becomes $\Pi_B\{D_T^*, A_T\} = \Delta_2$, i.e.,
$$\Pi_B\{D_T^*, A_T\} = -7 A_1 - 4.272 A_2 + 2.47 A_3 - A_4, \quad (36)$$
which, in view of the fact that $\sum_{i=1}^{4} A_i = 1$, can also be expressed as
$$\Pi_B\{D_T^*, A_T\} = -9.47 A_1 - 6.742 A_2 - 3.47 A_4 + 2.47, \quad (37)$$
and similarly for the R team
$$\Pi_R\{D_T^*, A_T\} = 3.51(A_1 + A_2 + A_4) + 0.49. \quad (38)$$

7. Discussion

We consider the case that the B team moves first and commits to a protection strategy $D_T$. The R team observes the protection strategy $D_T$ and decides its attack strategy given by the vector $A_T$. The B team knows that the R team is a rational decision maker and how it will react to its protection strategy $D_T$. The leader-follower interaction, resulting in the consideration of the Stackelberg equilibrium, looks into finding the B team's best protection strategy $D_T^*$ while knowing that the R team is going to act rationally in view of a protection strategy $D_T$ committed to by the B team. The R team's mixed strategy is given by the vector $A_T$ of the attack probabilities $A_1$, $A_2$, $A_3$, $A_4$ on the four tanks.
The vector $D_T$ describing the B team's allocation of its resources depends crucially on the parameter $\Delta_1$ as defined in Equation (21), which is obtained from the values assigned in the table (24) to the rewards and the costs of the two teams. If $\Delta_1 > 0$, then the reward to the B team is maximized at the maximum value of $D_1$ for which $D_2$, $D_3$, $D_4$, as expressed in terms of $D_1$ by Equations (15)–(17), remain non-negative. That is, the maximum value obtained for $D_1$ can still be less than $D_2$ or $D_3$ or $D_4$.
We note that for the case $\Delta_1 > 0$ and for most situations encountered by the two teams, represented by the area covered by the blue shade in Figure 3, the reward for the B team remains between $-6$ and 0.5, whereas the reward for the R team remains between 0.5 and 7.
However, in the case $\Delta_1 \le 0$ and for most situations encountered by the two teams, represented now by the area covered by the blue shade in Figure 5, the reward for the B team remains between $-6.5$ and 2.5, whereas the reward for the R team remains between 0.5 and 4.
That is, for most of the allowed values of the attack probabilities, the B team can receive a higher reward when $\Delta_1 \le 0$ relative to the case when $\Delta_1 > 0$. However, for most of the allowed values of the attack probabilities, the R team can receive less reward when $\Delta_1 \le 0$ relative to the case when $\Delta_1 > 0$. Therefore, the situation $\Delta_1 \le 0$ is more favourable to the B team than it is to the R team. Similarly, the situation $\Delta_1 > 0$ turns out to be more favourable to the R team than it is to the B team. Note that these results are specific to the particular values assigned in the considered example to the parameters $R_B(T)$, $C_B(T)$, $R_R(T)$, $C_R(T)$ for the four tanks.

8. Conclusions

The Stackelberg equilibrium in this scenario is reached when the drones have optimized their attack patterns in view of the protections provided to the tanks, and the tanks have subsequently optimized their protections in light of the drones' best responses. The dynamic interplay of strategic decision-making, under the principles of Stackelberg equilibrium and backwards induction, highlights the intricate nature of modern warfare involving drones and tanks, where brains and brawn are equally pivotal. A natural extension of this work is the case when the set of resources consists of $m$ elements, i.e., $S = \{S_1, S_2, \ldots, S_m\}$, and the set of tanks is given as $T = \{T_1, T_2, \ldots, T_n\}$.

Author Contributions

Conceptualization, A.I.; Methodology, A.I.; Software, I.H.; Validation, E.T. and R.B.; Formal analysis, A.I.; Investigation, E.T., A.P. and R.B.; Writing—original draft, A.I.; Writing—review & editing, A.I. and C.S.; Visualization, A.P. and G.P.; Supervision, C.S.; Project administration, C.S. All authors have read and agreed to the published version of the manuscript.

Funding

The work in this paper was carried out under a Research Agreement between the Defence Science and Technology Group, Department of Defence, Australia, and the University of Adelaide, Contract No. UA216424-S27.

Data Availability Statement

The original contributions presented in the study are included in the article, further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Binmore, K. Game Theory: A Very Short Introduction; Oxford University Press: Oxford, UK, 2007.
2. Rasmusen, E. Games and Information: An Introduction to Game Theory, 3rd ed.; Blackwell Publishers Ltd.: Oxford, UK, 2001.
3. Osborne, M.J. An Introduction to Game Theory; Oxford University Press: Oxford, UK, 2003.
4. Cournot, A. Researches into the Mathematical Principles of the Theory of Wealth; Bacon, N., Ed.; Macmillan: New York, NY, USA, 1897.
5. Tirole, J. The Theory of Industrial Organization; MIT: Cambridge, MA, USA, 1988.
6. von Stackelberg, H. Marktform und Gleichgewicht; Julius Springer: Vienna, Austria, 1934.
7. Gibbons, R. Game Theory for Applied Economists; Princeton University Press: Princeton, NJ, USA, 1992.
8. Korzhyk, D.; Yin, Z.; Kiekintveld, C.; Conitzer, V.; Tambe, M. Stackelberg vs. Nash in Security Games: An Extended Investigation of Interchangeability, Equivalence, and Uniqueness. J. AI Res. (JAIR) 2011, 41, 297–327.
9. Bustamante-Faúndez, P.; Bucarey, L.V.; Labbé, M.; Marianov, V.; Ordoñez, F. Playing Stackelberg Security Games in perfect formulations. Omega 2024, 126, 103068.
10. Hunt, K.; Zhuang, J. A review of attacker-defender games: Current state and paths forward. Eur. J. Oper. Res. 2024, 313, 401–417.
11. Chen, X.; Xiao, L.; Feng, W.; Ge, N.; Wang, X. DDoS Defense for IoT: A Stackelberg Game Model-Enabled Collaborative Framework. IEEE Int. Things J. 2022, 9, 9659–9674.
12. Bansal, G.; Sikdar, B. Security Service Pricing Model for UAV Swarms: A Stackelberg Game Approach. In Proceedings of the IEEE INFOCOM 2021—IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Vancouver, BC, Canada, 10–13 May 2021; pp. 1–6.
13. Li, H.; Zheng, Z. Optimal Timing of Moving Target Defense: A Stackelberg Game Model. In Proceedings of the MILCOM 2019—2019 IEEE Military Communications Conference (MILCOM), Norfolk, VA, USA, 12–14 November 2019; pp. 1–6.
14. Feng, Z.; Ren, G.; Chen, J.; Zhang, X.; Luo, Y.; Wang, M.; Xu, Y. Power Control in Relay-Assisted Anti-Jamming Systems: A Bayesian Three-Layer Stackelberg Game Approach. IEEE Access 2019, 7, 14623–14636.
15. Kar, D.; Nguyen, T.H.; Fang, F.; Brown, M.; Sinha, A.; Tambe, M.; Jiang, A.X. Trends and Applications in Stackelberg Security Games. In Handbook of Dynamic Game Theory; Basar, T., Zaccour, G., Eds.; Springer: Cham, Switzerland, 2016.
16. Tambe, M. Security and Game Theory: Algorithms, Deployed Systems, Lessons Learned; Cambridge University Press: Cambridge, MA, USA, 2011.
17. Paruchuri, P.; Pearce, J.; Marecki, J.; Tambe, M.; Ordonez, F.; Kraus, S. Playing Games for Security: An Efficient Exact Algorithm for Solving Bayesian Stackelberg Games. In Proceedings of the International Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), Estoril, Portugal, 12–16 May 2008; pp. 895–902.
18. Hohzaki, R.; Nagashima, S. A Stackelberg equilibrium for a missile procurement problem. Eur. J. Oper. Res. 2009, 193, 238–249.
19. Sinha, A.; Fang, F.; An, B.; Kiekintveld, C.; Tambe, M. Stackelberg Security Games: Looking Beyond a Decade of Success. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18), Stockholm, Sweden, 13–19 July 2018; pp. 5494–5501.
Figure 1. Plots of $D_2$, $D_3$, and $D_4$ vs. $D_1$ (Range: 0.06 to 0.0636).
Figure 2. The B team's payoff as given by Equation (32) when $\Delta_1 > 0$ and for the values of $A_1$, $A_2$, $A_4$ that satisfy the constraints $0 \le A_1, A_2, A_4 \le 1$ and $0 \le A_1 + A_2 + A_4 \le 1$. The payoff is maximized at the value of 0.371 for $A_{1,2,4} = 0$.
Figure 3. The plot between $\Pi_B$ and $\Pi_R$ when $\Delta_1 > 0$ and for values of $A_1$, $A_2$, $A_4$ that satisfy the constraints $0 \le A_1, A_2, A_4 \le 1$ and $0 \le A_1 + A_2 + A_4 \le 1$.
Figure 4. The B team's payoff as given by Equation (37) when $\Delta_1 \le 0$ and for the values of $A_1$, $A_2$, $A_4$ that satisfy the constraints $0 \le A_1, A_2, A_4 \le 1$ and $0 \le A_1 + A_2 + A_4 \le 1$. The payoff is maximized at the value of 2.47 for $A_{1,2,4} = 0$.
Figure 5. The plot between $\Pi_B$ and $\Pi_R$ when $\Delta_1 \le 0$ and for values of $A_1$, $A_2$, $A_4$ that satisfy the constraints $0 \le A_1, A_2, A_4 \le 1$ and $0 \le A_1 + A_2 + A_4 \le 1$.

