Article

A Strength Allocation Bayesian Game Method for Swarming Unmanned Systems

1 School of Automation, Northwestern Polytechnical University, Xi’an 710072, China
2 Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, China
* Author to whom correspondence should be addressed.
Drones 2025, 9(9), 626; https://doi.org/10.3390/drones9090626
Submission received: 18 July 2025 / Revised: 7 August 2025 / Accepted: 11 August 2025 / Published: 5 September 2025
(This article belongs to the Collection Drones for Security and Defense Applications)


Highlights

What are the main findings?
  • A swarming strength allocation Bayesian game model under incomplete information is established, addressing the limitations of prior complete information game models and enabling optimal strength allocation for protecting high-value targets.
  • An improved Lanchester equation-based benefit quantification method is proposed to predict swarming strength attrition without being restricted by the number of agents, and a Bayesian Nash equilibrium solving algorithm with defense effectiveness is designed to improve the efficiency and operability of strategy selection.
What is the implication of the main finding?
  • Provides a practical solution for high-value targets protection by swarming unmanned systems under incomplete information, optimizing resource utilization, and reducing attrition.
  • Proposes an optimal strategy for strength allocation using Bayesian game theory, enabling decision makers to select specific and executable strategies, and provides theoretical support.

Abstract

This paper investigates a swarming strength allocation Bayesian game approach under incomplete information to address the high-value targets protection problem of swarming unmanned systems. The swarming strength allocation Bayesian game model is established by analyzing the non-zero sum incomplete information game mechanism during the protection process, considering high-tech and low-tech interception players. The model incorporates a game benefit quantification method based on an improved Lanchester equation. The method regards massive swarm individuals as a collective unit for overall cost calculation, thus avoiding the curse of dimensionality from increasing numbers of individuals. Based on it, a Bayesian Nash equilibrium solving approach is presented to determine the optimal swarming strength allocation for the protection player. Finally, compared with random allocation, greedy heuristic, rule-based assignment, and Colonel Blotto game, the simulations demonstrate the proposed method’s robustness in large-scale strength allocation.

1. Introduction

The high-value targets protection process is diversified because of dynamic environments and the multiple resources involved [1,2,3]. With their strong autonomy, low cost, high flexibility, and fast upgrading, swarming unmanned systems are becoming a new operational mode [4,5]. For example, swarming systems of unmanned aerial vehicles (UAVs) were demonstrated in [6]. The Perdix and Low-Cost UAV Swarming Technology projects reported in [7] were carried out with swarms of 103 and 30 UAVs, respectively. Note that strength attrition prediction and strength allocation are crucial for mission accomplishment. However, since the number of swarming unmanned systems is quite large, conventional agent-based technologies cannot satisfy the requirements of swarming operations [8]. Hence, the design of swarming strength attrition prediction and strength allocation systems has received increasing attention from both the theoretical and engineering communities.
Regarding the strength attrition prediction model of swarming unmanned systems, agent-based approaches have been widely used. These methods predict the strength attrition of swarming unmanned systems from a microscopic perspective, such as offense–defense confrontation decision making (ODCDM) [9] and the multi-agent deep deterministic policy gradient (MADDPG) [10]. The ODCDM method models the dynamic dogfighting process of UAVs: it assigns values such as speed, position, weapon count, and survival probability to each UAV to build a UAV attrition model, and an optimization technique is then used to assign attack targets to each UAV. The MADDPG method establishes a Markov decision process by defining a reward–punishment mechanism for each agent, and the multi-agent system is trained to adapt to more complex and dynamic environments. Other methods include probability analysis [11], Monte Carlo simulation [12], and grey system theory [13]. As the number of agents increases, the state–action dimensions grow exponentially, causing a sharp rise in the data and computation required to solve the strength allocation problem and making it difficult to obtain an effective strategy [14]. To address this problem, the Lanchester equation was introduced. From a macroscopic perspective, it treats swarming groups with the same function as a whole unit. It has been applied to melee operations [15], rifle operations [16], and artillery operations [17].
Turning from attrition modeling to strength allocation, many researchers have investigated how to allocate strength reasonably. In [18], a novel abstraction method was proposed to solve large weapon-target assignment (WTA) problems. In [19], the proposed target-based static WTA algorithm incorporates battle probabilities to address scenarios ranging from one-to-one to many-to-many operation situations. A bi-objective algorithm was developed in [20] by considering both the maximization of the no-leaker probability and stability in engagement orders. Although the above methods achieve the desired effect on the defense side, they neglect the interaction between the offensive and defensive sides. The essence of swarming protection can be regarded as a game process in which the bilateral strategies influence each other [21,22,23]. Game theory provides a strategic framework for addressing the strength allocation problem in swarming protection. It enables optimal resource decisions despite incomplete information by modeling the interactions between the interception and protection players, helping decision makers select more rational and effective approaches [24,25].
Many scholars have researched the strength allocation game. In [26], multi-UAV prevention and control was addressed under a complete information game, with compound interception strategies proposed via genetic fuzzy trees and reinforcement learning for target assignment. In [27], a target assignment problem of UAV swarm attack–defense was formulated as a complete information game and solved with a multi-agent deep deterministic policy gradient. In [28], a swarm confrontation method under complete information was developed, integrating the Lanchester equation and the Nash equilibrium to select the optimal strength allocation. However, the above literature discusses UAV swarm strength allocation only under complete information. In practice, information is often imperfect [29,30]. Both the interception player and the protection player might have incomplete information about each other, making the solution more complex and uncertain [31]. Therefore, it is essential to establish the incomplete information game model mathematically. In [32], a target assignment problem of unmanned underwater vehicles (UUVs) under incomplete information was studied, establishing a multi-objective model solved by a swarm optimization algorithm to allocate targets among the UUVs.
On the basis of summarizing previous research achievements and inspired by [32], this study proposes a swarming strength allocation Bayesian game method based on incomplete information theory. This method effectively deals with the problem of strength attrition prediction for swarming unmanned systems (SUS) in swarming protection, and the swarming strength allocation Bayesian game (SSABG) model is established on this basis. Under incomplete information, a swarming strength allocation method is presented to protect the high-value targets of the protection player. The main contributions are as follows.
  • This study explores the mechanism of swarming strength allocation under incomplete information by introducing the Harsanyi transformation. Based on this, the SSABG model is presented to allocate strength during the swarming protection process. In contrast with prior research in [33], which assumes complete information, this study solves the problem of the opposing players being unable to acquire complete information.
  • The benefit quantification method is proposed using the Lanchester equation. It can predict swarming strength attrition by regarding swarming groups with the same function as a whole unit. Compared with the previous work in [32] based on the agent perspective, this method can predict swarming strength without limits on the number of individuals. Additionally, the fuel budget and time cost are taken into account in the benefit quantification.
  • A swarming strength allocation algorithm is designed by introducing the mixed-strategy Bayesian Nash equilibrium (BNE). This algorithm comprises three components: dominant-strategy pre-judgment, pure-strategy Nash equilibrium (NE) calculation, and mixed-strategy BNE calculation. By pre-judging the dominant strategy, this approach reduces the algorithm’s computational complexity and enhances computational efficiency. Additionally, a strategy’s effectiveness calculation method is proposed, which can convert the probability distribution of the mixed-strategy BNE into executable specific strategies, thus avoiding the problem of low operability where strategies are merely expressed in probabilistic terms and practically not selectable.
The remainder of this paper is organized as follows. Section 2 describes the mechanism of SSABG and the scenario of high-value targets protection. In Section 3, the swarming strength allocation Bayesian game model is established, incorporating the benefit quantification method; a BNE solution approach is then introduced to obtain the pure/mixed strategies for D. Section 4 gives three case simulations to validate the effectiveness of the proposed approach. Finally, Section 5 concludes this study.

2. SSABG Mechanism Analysis and Scenario Illustration

Let R denote the set of real numbers. R^{1×n} represents the set of 1 × n real row vectors, and R^{n×n} represents the set of n × n real square matrices. A superscript asterisk ( · )^* indicates the optimal value of the corresponding expression. For any vector a ∈ R^{1×n}, the i-th element is denoted by a[i]. For any matrix C ∈ R^{m×n}, the element in the i-th row and j-th column is represented by C[i, j], where i = 1, 2, …, m and j = 1, 2, …, n. ε(t − τ) represents the shifted unit step function, with ε = 0 when t < τ and ε = 1 when t ≥ τ. The symbol ∘ represents the Hadamard product [15], and ⊤ represents matrix transposition.

2.1. SSABG Mechanism Analysis

The SSABG model in this study is a game with incomplete information and simultaneous actions. In the game, the players are categorized as the interception player (denoted by A) and the protection player (denoted by D). “Incomplete information” means that A and D do not fully know each other’s information (e.g., the player’s type). “Simultaneous actions” means that A and D choose strategies at the same time, without any sequence. Note that A and D only know each other’s strategy spaces but not which specific strategy the opponent has chosen. Thus, a BNE is a strategy profile in which each player’s strategy maximizes its expected benefit given its own type and the probability distribution over the other player’s types. The objectives of this game are, for D, to eliminate A’s strength through swarming strength allocation and protect its high-value targets, and, for A, to neutralize D’s high-value targets and eliminate its strength.
Handling static Bayesian games hinges on the Harsanyi transformation [34]. This approach treats “Nature” as a virtual player who first determines the players’ types. Then, each player selects strategies based on its knowledge of its own type. In this subsection, the Bayesian game mechanism for selecting the swarming unmanned systems’ strength allocation strategies is analyzed, as shown in Figure 1. Table 1 describes the proposed model’s corresponding game stages in detail.
Applying the Harsanyi transformation, this study converts the uncertainty about A’s and D’s types into a virtual selection by “Nature,” creating a game of complete but imperfect information to analyze optimal strategy selection. Note that the game comprises four stages, but A and D select their strategies simultaneously. A minimal sketch of the Nature step is given below.
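As an illustration only, the following Python snippet sketches the “Nature” step of the Harsanyi transformation: A’s type is drawn from D’s prior belief, while D observes only the prior, never the realized type. The function name and the prior values are assumptions for demonstration.

```python
import numpy as np

def nature_draws_type(prior=(0.45, 0.55), rng=None):
    """'Nature' draws A's type from the prior belief; D never observes the draw."""
    rng = rng or np.random.default_rng()
    return rng.choice(["high-tech", "low-tech"], p=prior)

print(nature_draws_type())  # e.g., 'low-tech'
```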

2.2. High-Value Targets Protection Scenario Illustration

To verify the feasibility and effectiveness of SSABG and the strength allocation strategy selection method, numerical simulations are performed on a typical high-value targets protection scenario [35], whose structure is depicted in Figure 2. Note that this scenario is an example to illustrate our approach. It can be extended to other SUS security scenarios (such as energy facility protection, transportation hub protection, or no-fly zone protection). A intends to neutralize the high-value targets, while D protects them by allocating their swarming strength. Before A neutralizes the high-value targets, it must eliminate D’s SUS. Otherwise, D’s behavior will lead to the failure of A’s operation.
In this scenario, both D and A are equipped with hard-intercept and soft-intercept systems. The hard-intercept systems consist of direct-aimed equipment SUS (e.g., equipment using laser and microwave technologies, hereinafter referred to as DA-SUS) and kinetic-energy equipment SUS (e.g., kinetic interceptors, hereinafter referred to as KE-SUS). The soft-intercept systems consist of reconnaissance information support SUS, electronic jamming SUS, and strength supplement SUS. As shown in Figure 2, when the first SUS of A enters the reconnaissance radius of D (point a), D’s reconnaissance SUS starts to detect A’s information and provides information support for D’s DA-SUS and KE-SUS so that D can grasp the protection situation. D’s electronic jamming SUS breaks away from its previous position to point b, quickly approaching A’s SUS and keeping a certain distance from it for electronic jamming. Upon detecting A’s actions, D allocates its strength to reconfigure its setup and uses protection situational awareness to identify A’s type. D’s DA-SUS and KE-SUS move to point c and coordinate with D’s supplement SUS to defend against A’s offense. A also chooses a strength allocation strategy to neutralize the high-value target in response to D’s moves.
Under incomplete information, D’s decision maker may not be able to ascertain A’s information accurately and, relying only on experience, knows that A falls into two types: (1) high-tech and (2) low-tech, whose operational effectiveness differs. High-tech interception players possess more advanced intercept technologies and pose a stronger threat to D. In contrast, low-tech interception players, lacking such technologies, are less destructive to D. For the two types of A, the optimal pure/mixed strategy for D is obtained by establishing a Bayesian game model.

3. Main Results

3.1. Swarming Strength Allocation Bayesian Game Model

A and D engage in a game to neutralize or protect high-value targets. Specifically, A aims to breach the defenses to neutralize the targets, while D seeks to protect them. This game is a non-cooperative, zero-sum game under incomplete information, as D cannot determine whether A is high-tech or low-tech. The game’s validity relies on three key assumptions and two constraints, listed in Table 2 and Table 3, respectively.
Definition 1.
The swarming strength allocation problem in unmanned systems can be analyzed using the SSABG model. In practical scenarios, A and D have incomplete information about each other. Based on their equipment levels, they can be divided into different types. The SSABG model is defined by the five-tuple G SSABG = { N , T , S , P , U } . More specifically, the model can be expressed in detail as follows:
  • N = { N A , N D } represents the set of players, with  N A as A and N D as D.
  • T = { T A , T D } represents the set of players’ types, including T A = { T 1 A , T 2 A , …, T n A } for A and T D = { T 1 D , T 2 D , …, T m D } for D.
  • S = { S A , S D } represents the set of strategies, with  S A = { S 1 A , S 2 A , …, S n A } as A’s strength allocation strategies and S D = { S 1 D , S 2 D , …, S m D } as D’s strength allocation strategies.
  • P = { P A , P D } represents the set of prior beliefs. For all T j A ∈ T A and T i D ∈ T D , P A = { P A ( T i D | T j A ) | 1 ≤ i ≤ m , 1 ≤ j ≤ n } with ∑_{i=1}^{m} P A ( T i D | T j A ) = 1 (A’s prior beliefs), and P D = { P D ( T j A | T i D ) | 1 ≤ j ≤ n , 1 ≤ i ≤ m } with ∑_{j=1}^{n} P D ( T j A | T i D ) = 1 (D’s prior beliefs).
  • U = { U A , U D } represents the set of benefit function. For all S k A S A , S g D S D , T j A T A , and  T i D T D , U A ( T j A , S k A , S g D ) is A’s benefit when j-th A T j A uses strategies S k A against D’s strategies S g D , and  U D ( T i D , S k A , S g D ) is D’s benefit when the i-th T i D uses strategies S g D against A’s strategies S k A .
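To make the five-tuple concrete, the following minimal Python sketch shows one possible in-memory representation of G_SSABG; the field names, array shapes, and the two-type example are illustrative assumptions rather than part of the formal model.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class SSABG:
    """Illustrative container for the five-tuple G_SSABG = {N, T, S, P, U}."""
    players: tuple            # N = (N_A, N_D)
    types_A: list             # T_A = [T_1^A, ..., T_n^A]
    types_D: list             # T_D = [T_1^D, ..., T_m^D]
    strategies_A: list        # S_A: A's strength allocation strategies
    strategies_D: list        # S_D: D's strength allocation strategies
    prior_D: np.ndarray       # prior_D[i, j] = P^D(T_j^A | T_i^D); each row sums to 1
    payoff_A: np.ndarray      # payoff_A[j, k, g] = U^A(T_j^A, S_k^A, S_g^D)
    payoff_D: np.ndarray      # payoff_D[j, k, g] = U^D(T^D, S_k^A, S_g^D) when A is type j

# Hypothetical instance: two A types, one D type, four strategies per side
game = SSABG(
    players=("A", "D"),
    types_A=["high-tech", "low-tech"],
    types_D=["defender"],
    strategies_A=[f"S{k}_A" for k in range(1, 5)],
    strategies_D=[f"S{g}_D" for g in range(1, 5)],
    prior_D=np.array([[0.45, 0.55]]),
    payoff_A=np.zeros((2, 4, 4)),   # to be filled by the benefit quantification model
    payoff_D=np.zeros((2, 4, 4)),
)
```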

3.2. Swarming Strength Allocation Benefit Quantification Model

A’s benefit comprises remaining strength attrition y s a and interception cost u a c , whereas D’s benefit includes remaining strength attrition x s a and protection cost u d c . The benefits are formulated as
$$U^A\big(T_j^A, S_k^A, S_g^D\big) = y^{sa}\big(S_k^A\big) - u_{ac}\big(S_k^A\big),$$
and
$$U^D\big(T_i^D, S_k^A, S_g^D\big) = x^{sa}\big(S_g^D\big) - u_{dc}\big(S_g^D\big).$$
Inspired by the classic Lanchester equation in [15] and the modern operation mode in [16], an improved Lanchester equation is presented to predict the attritions y^{sa} and x^{sa} of the swarming unmanned systems. It aggregates the individuals’ response capability within a swarm into an overall strength, so the prediction is not influenced by the number of individuals in the swarm. The equation is given by
$$
\begin{aligned}
\dot{y}^{sa} ={}& -\big[\varepsilon(t-t_r)\,\mu_r \circ A_d \circ \Phi_d\, x_d^{sa} + (1-\mu_r) \circ A_d' \circ \Phi_d\, x_d^{sa}\big] \circ y^{sa} \circ \lambda_r \\
& -\big[\varepsilon(t-t_r)\,\mu_r \circ A_k \circ \Phi_k + (1-\mu_r) \circ A_k' \circ \Phi_k\big]\, x_k^{sa} \circ y^{sa} \circ \lambda_r + s_b, \\
\dot{x}^{sa} ={}& -\big[\varepsilon(t-t_b)\,\mu_b \circ B_d \circ \Psi_d\, y_d^{sa} + (1-\mu_b) \circ B_d' \circ \Psi_d\, y_d^{sa}\big] \circ x^{sa} \circ \lambda_b \\
& -\big[\varepsilon(t-t_b)\,\mu_b \circ B_k \circ \Psi_k + (1-\mu_b) \circ B_k' \circ \Psi_k\big]\, y_k^{sa} \circ x^{sa} \circ \lambda_b + s_r,
\end{aligned}
$$
where x^{sa} = [x_d^{sa}; x_k^{sa}] and y^{sa} = [y_d^{sa}; y_k^{sa}]. Specifically, x_d^{sa} ∈ R^{m×1} and y_d^{sa} ∈ R^{n×1} represent D’s and A’s DA-SUS strength, respectively, while x_k^{sa} ∈ R^{m×1} and y_k^{sa} ∈ R^{n×1} represent D’s and A’s KE-SUS strength, respectively. s_r = [s_r^d; s_r^k] and s_b = [s_b^d; s_b^k] (s_r ∈ R^{2m×1} and s_b ∈ R^{2n×1}) represent D’s and A’s supplement strength, respectively. A_d ∈ R^{n×m}, A_k ∈ R^{n×m}, B_d ∈ R^{m×n}, and B_k ∈ R^{m×n} are the damage coefficient matrices of D and A with situation awareness, respectively. A_d′ ∈ R^{n×m}, A_k′ ∈ R^{n×m}, B_d′ ∈ R^{m×n}, and B_k′ ∈ R^{m×n} are the damage coefficient matrices of D and A without situation awareness, respectively. Φ_d ∈ R^{n×m} and Φ_k ∈ R^{n×m} are the strength allocation strategy matrices of D’s DA-SUS and KE-SUS, respectively. Ψ_d ∈ R^{m×n} and Ψ_k ∈ R^{m×n} are the strength allocation strategy matrices of A’s DA-SUS and KE-SUS, respectively. The matrices Φ_d, Φ_k, Ψ_d, and Ψ_k are non-negative, with each column sum not exceeding 1. t_r ∈ R^{2m×1} and t_b ∈ R^{2n×1} represent the decision-making delay times of D and A, respectively. μ_r ∈ R^{m×1} and μ_b ∈ R^{n×1} denote the proportions of D’s and A’s strength that can receive reconnaissance information, respectively. λ_r ∈ R^{m×1} and λ_b ∈ R^{n×1} denote the operational effectiveness of D’s and A’s electronic jamming systems, respectively.
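To illustrate how the improved Lanchester dynamics can be propagated in time, the following Python sketch integrates a deliberately simplified scalar version of the equation above with forward Euler steps (one aggregate unit per side, delayed engagement, aware/unaware damage fractions, jamming factors, and supplements). All coefficient values and the scalar reduction are assumptions for illustration; the paper’s full matrix form distinguishes DA-SUS and KE-SUS and uses Hadamard products.

```python
import numpy as np

def improved_lanchester_scalar(x0, y0, a, a_prime, b, b_prime,
                               mu_r=0.9, mu_b=0.85, lam_r=0.9, lam_b=0.85,
                               t_r=0.5, t_b=0.5, s_r=1.0, s_b=1.0,
                               dt=0.01, T=20.0):
    """Forward-Euler integration of a scalar, simplified improved Lanchester model.

    x, y: D's and A's aggregate strength. a/b: damage coefficients with situation
    awareness; a_prime/b_prime: without. The shifted step functions gate the
    engagement until the decision-making delays t_r, t_b have elapsed.
    """
    x, y = float(x0), float(y0)
    for step in range(int(T / dt)):
        t = step * dt
        eps_r = 1.0 if t >= t_r else 0.0   # D can inflict attrition only after t_r
        eps_b = 1.0 if t >= t_b else 0.0   # A can inflict attrition only after t_b
        dy = -eps_r * (mu_r * a + (1.0 - mu_r) * a_prime) * x * y * lam_r + s_b
        dx = -eps_b * (mu_b * b + (1.0 - mu_b) * b_prime) * y * x * lam_b + s_r
        x = max(x + dx * dt, 0.0)
        y = max(y + dy * dt, 0.0)
        if x == 0.0 or y == 0.0:           # one side eliminated
            break
    return x, y

# Placeholder coefficients (hypothetical, not the paper's calibrated values)
x_T, y_T = improved_lanchester_scalar(x0=330.0, y0=380.0,
                                      a=2.4e-4, a_prime=2.4e-6,
                                      b=2.6e-4, b_prime=2.6e-6)
print(f"terminal strengths: D = {x_T:.1f}, A = {y_T:.1f}")
```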
The interception cost u a c consists of fuel budget and time cost, which is given by
u a c = Δ v a + u t c a
where Δ v a represents A’s fuel budget, and  u t c a represents A’s time cost. Protection cost u d c also consists of fuel budget and time cost, which is given by
u d c = Δ v d + u t c d
where Δ v d represents D’s fuel budget, and  u t c d represents D’s time cost.
The attrition strength rate δ s a serves as a metric to evaluate the performance of the strength allocation strategy. A smaller δ s a indicates superior operation effectiveness, where δ s a is defined as
$$\delta_{sa} = \frac{s_a^0 - s_a^T}{s_a^0},$$
where s a T represents the terminal-time strength, and  s a 0 denotes the initial strength.
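As a small worked illustration of the attrition strength rate, the helper below computes δ_sa from an initial and a terminal strength; the numerical values are hypothetical.

```python
def attrition_rate(s_initial, s_terminal):
    """delta_sa = (s_a^0 - s_a^T) / s_a^0: fraction of the initial strength lost."""
    return (s_initial - s_terminal) / s_initial

print(f"D attrition rate: {attrition_rate(330.0, 271.0):.2f}")  # -> 0.18
print(f"A attrition rate: {attrition_rate(380.0, 152.0):.2f}")  # -> 0.60
```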
Remark 1.
In unmanned swarming system protections, fuel budget ( Δ v a and Δ v d ) and time cost ( u t c a and u t c d ) are of vital importance (see [38]). These protections are subject to orbital dynamics, unlike their ground-based counterparts. The mass of the SUS has a decisive effect on its maneuverability, which requires propulsion systems to have optimal weight ratios and limited fuel. Furthermore, the SUS’s equipment needs significant time to reach its operational positions, highlighting the importance of preparation time in profit calculations.
Remark 2.
The decision-making delay time, denoted by t r for D and t b for A, represents an interval during which one side cannot mount effective actions to inflict more attrition on the opponent (i.e., the interruption of communication systems leads to the failure of command transmission). As another part of the benefit quantification model, fuel budget and time cost (or flight time) do not act directly on the Lanchester equation but affect the level of game benefits. The smaller the fuel budget and time cost, the lower the cost incurred by A or D, which increases the game benefit.

3.3. The Swarming Strength Allocation Algorithm Based on Nash Equilibrium

In the high-value protection process, A and D have different information processing capabilities. When D implements a defensive strategy, A can implement a set of offensive strategies. Both aim to maximize benefits based on the given players’ type and responding prior belief sets. This leads to an equilibrium where unilateral strategy changes by either party reduce their benefits. The definitions of pure-strategy NE and mixed-strategy BNE are given below.
Definition 2.
Pure-strategy NE. For the j-th type of A, if its strategy S_k^*(T_j^A) is the best response to D’s strategy S_g^*(T_i^D), and, for the i-th type of D, its defense strategy S_g^*(T_i^D) is the best response to A’s strategy S_k^*(T_j^A), then (S_k^*(T_j^A), S_g^*(T_i^D)) is a pure-strategy NE of the SSABG model when
$$\sum_{i=1}^{m} P^A\big(T_i^D \mid T_j^A\big)\, U^A\big(T_j^A, S_k^{*}(T_j^A), S_g^{*}(T_i^D)\big) \;\ge\; \sum_{i=1}^{m} P^A\big(T_i^D \mid T_j^A\big)\, U^A\big(T_j^A, S_k(T_j^A), S_g^{*}(T_i^D)\big),$$
and
$$\sum_{j=1}^{n} P^D\big(T_j^A \mid T_i^D\big)\, U^D\big(T_i^D, S_k^{*}(T_j^A), S_g^{*}(T_i^D)\big) \;\ge\; \sum_{j=1}^{n} P^D\big(T_j^A \mid T_i^D\big)\, U^D\big(T_i^D, S_k^{*}(T_j^A), S_g(T_i^D)\big).$$
Definition 3.
Mixed-strategy BNE. For the j-th type of A, if it selects strategy S k A ( T j A ) with probability f k A ( T j A ) , then F A ( T j A ) = { f 1 A ( T j A ) , f 2 A ( T j A ) , …, f K A ( T j A ) } is the mixed strategy for the j-th type of A T j A . Similarly, for the i-th type of D, if it selects strategy S g D ( T i D ) with probability f g D ( T i D ) , then F D ( T i D ) = { f 1 D ( T i D ) , f 2 D ( T i D ) , …, f G D ( T i D ) } is the mixed strategy for the i-th type of D T i D . When the mixed strategy F A * ( T j A ) of the j-th type of A is the best response to D’s mixed strategy F D * ( T i D ) , and the mixed strategy F D * ( T i D ) of the i-th type of D is the best response to A’s mixed strategy F A * ( T j A ) , then ( F A * ( T j A ) , F D * ( T i D ) ) is a mixed-strategy BNE of the SSABG model if it satisfies
$$\sum_{i=1}^{m} P^A\big(T_i^D \mid T_j^A\big)\, U^A\big(T_j^A, F^{A*}(T_j^A), F^{D*}(T_i^D)\big) \;\ge\; \sum_{i=1}^{m} P^A\big(T_i^D \mid T_j^A\big)\, U^A\big(T_j^A, F^{A}(T_j^A), F^{D*}(T_i^D)\big),$$
and
$$\sum_{j=1}^{n} P^D\big(T_j^A \mid T_i^D\big)\, U^D\big(T_i^D, F^{A*}(T_j^A), F^{D*}(T_i^D)\big) \;\ge\; \sum_{j=1}^{n} P^D\big(T_j^A \mid T_i^D\big)\, U^D\big(T_i^D, F^{A*}(T_j^A), F^{D}(T_i^D)\big).$$
Theorem 1.
For the static Bayesian game model G SSABG = { N , T , S , P , U } , if the strategy sets S k ( T j A ) of A and S g ( T i D ) of D are finite, then this game model must have a pure-strategy NE or mixed-strategy BNE. SSABG may yield multiple equilibria rather than a unique one.
Proof. 
Given the static Bayesian game G SSABG = { N , T , S , P , U } , it can be found that the players are A and D. The player type sets ( T A , T D ) and strategy sets ( S A , S D ) are finite. The corresponding prior belief sets ( P A , P D ) and benefit sets U ( U A , U D ) are also finite. Thus, the given static Bayesian game model is a finite game. By the fixed-point theorem [16], every finite-strategy game has at least one pure-strategy NE or mixed-strategy BNE. By the definitions of pure-strategy NE and mixed-strategy BNE, at this equilibrium, there is a set of pure-strategies or mixed-strategies that maximize the benefits for both A and D compared with other strategies. Without knowing the opponent’s strategy, both sides choose this set to maximize their benefits. Solving the pure-strategy NE or mixed-strategy BNE yields the optimal pure-strategy or mixed-strategy defense strategy.
Different parameter values mean different strategy probabilities. These strategy probability sets that meet the constraints are all NE. Therefore, the Nash equilibrium solution is not unique [39]. There are many different solutions in the area of the parameter space where equilibrium exists.    □
On the basis of the above Definitions 2 and 3, the swarming strength allocation algorithm based on Bayesian game is proposed. This algorithm consists of three parts: dominant strategy pre-judging, pure-strategy NE calculation, and mixed-strategy BNE calculation. First, pre-judging is conducted to determine whether there is a dominant strategy. If a dominant strategy exists, the algorithm proceeds to the pure strategy NE calculation sub-algorithm; otherwise, it moves to the BNE calculation sub-algorithm. The details of the algorithms can be seen in Algorithms 1–3.
Algorithm 1 Swarming strength allocation algorithm based on Bayesian games
Input:  G SSABG model
Output: Swarming strength allocation strategies.
   1: Initialize M SSABG = { ( N A , N D ) , ( T A , T D ) , ( S A , S D ) , ( P A , P D ) , ( U A , U D ) } .
   2: for Each player type in [A, D] do
   3:     Establish the type set T type
   4:     Establish the strategy set S type
   5:     Determine the type of the player T type , k H ( N type , S type , k ) via Harsanyi transformation
   6: end for
   7: for each strategy pair ( s A , j ( T A , j ) S A ( T A ) , s D , i ( T D , i ) S D ( T D ) )  do
   8:     for each player type in [A, D] do
   9:         Calculate the player’s cost and benefit for the current strategy pair
  10:     end for
  11: end for
  12: Generate game benefit value matrix U.
  13: Identify dominant strategies from U.
  14: if U has dominant strategies then
  15:     Call the dominant strategy solving sub-algorithm Solve 1 ( S S A B G ) .
  16: else
  17:     Call the mixed-strategy solving sub-algorithm Solve 2 ( S S A B G ) .
  18: end if
  19: for Each player type in [A, D] do
  20:     Evaluate and rank the player’s strategies
  21: end for
  22: return Swarming strength allocation strategies results.
The defense effectiveness is quantified by the defensive strategy’s impact on offensive behavior at the equilibrium point of the strategies chosen by A and D, combining the SSABG model and the mixed-strategy BNE. A traditional static Bayesian game provides D’s strategy selection probability distribution after solving the mixed-strategy BNE, which is hard to use to guide actual defense actions, since a specific strategy must be chosen during protection. The concept of defense effectiveness addresses this by selecting the optimal strategy after calculating each defense strategy’s effectiveness once the equilibrium has been solved. After determining the equilibrium point between A and D, D can predict A’s strategy selection probability, denoted as F^{A*}(T_j^A) = { f_1^{A*}(T_j^A), f_2^{A*}(T_j^A), …, f_K^{A*}(T_j^A) }. Combined with the prior probabilities of each A type P^D(T_j^A | T_i^D) and the benefit function U^D(T_i^D, S_k^A(T_j^A), S_g^D), the defense effectiveness of strategy S_g^D is calculated as follows [40]:
$$E\big(S_g^D\big) = \sum_{j=1}^{n} P^D\big(T_j^A \mid T_i^D\big) \sum_{k=1}^{K} U^D\big(T_i^D, S_k^A(T_j^A), S_g^D\big)\, f_k^{A*}\big(T_j^A\big).$$
Once all defense strategies’ effectiveness is determined, they are ranked in descending order. Given D’s limited resources, the strategy with the highest defense effectiveness from the BNE or NE is selected as the optimal strategy. Notably, if the most effective strategy is not in the BNE or NE, the one with the highest effectiveness within these equilibria is selected.
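A compact sketch of this effectiveness calculation and the subsequent ranking is given below; the payoff values, prior, and equilibrium mixed strategy of A are hypothetical placeholders.

```python
import numpy as np

def defense_effectiveness(payoff_D, prior_D, mix_A):
    """E(S_g^D) for every defensive strategy g.

    payoff_D[j, k, g]: D's benefit against type-j A playing S_k^A when D plays S_g^D
    prior_D[j]       : D's prior belief P^D(T_j^A | T_i^D) for the considered D type
    mix_A[j, k]      : type-j A's equilibrium probability of selecting S_k^A
    """
    # Weight by the prior over A's types and by A's equilibrium strategy probabilities
    return np.einsum('j,jk,jkg->g', prior_D, mix_A, payoff_D)

# Hypothetical example: 2 A types, 4 strategies per side
rng = np.random.default_rng(0)
payoff_D = rng.normal(size=(2, 4, 4))
prior_D = np.array([0.45, 0.55])
mix_A = np.array([[0.10, 0.40, 0.30, 0.20],
                  [0.25, 0.25, 0.25, 0.25]])
E = defense_effectiveness(payoff_D, prior_D, mix_A)
ranking = np.argsort(E)[::-1]                  # descending order of effectiveness
print(E, "-> preferred defense strategy index:", int(ranking[0]))
```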
Algorithm 2 Pure-strategy NE solving sub-algorithm Solve 1 ( S S A B G )
Input:  G SSABG model
Output: Swarming strength allocation strategies.
   1: for i ∈ S^A do
   2:     Assume the i-th strategy is dominant.
   3:     for j ∈ S^A do
   4:         if i ≠ j then
   5:             for k ∈ S^D do
   6:                 if U(i, k) ≥ U(j, k) then
   7:                     The j-th strategy is not dominant; break the inner loop;
   8:                 end if
   9:             end for
  10:             if the i-th strategy is not dominant then
  11:                 Break the middle loop.
  12:             end if
  13:         end if
  14:     end for
  15:     Record the dominant strategy index.
  16: end for
  17: for i ∈ S^D do
  18:     Assume the i-th strategy is dominant.
  19:     for j ∈ S^D do
  20:         if i ≠ j then
  21:             for k ∈ S^A do
  22:                 if U(i, k) ≥ U(j, k) then
  23:                     The j-th strategy is not dominant; break the inner loop;
  24:                 end if
  25:             end for
  26:             if the i-th strategy is not dominant then
  27:                 Break the middle loop.
  28:             end if
  29:         end if
  30:     end for
  31:     Record the dominant strategy index.
  32: end for
  33: return Pure-strategy NE (S^A, S^D).
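The pre-judgment step of Algorithms 1 and 2 can be prototyped as a simple scan over the benefit matrix. The sketch below checks for a weakly dominant row strategy and is an illustrative assumption about how the check could be coded, with placeholder payoff values.

```python
import numpy as np

def find_dominant_strategy(U):
    """Return the index of a weakly dominant row strategy of U, or None.

    U[i, k] is the row player's benefit when it selects strategy i and the
    opponent selects strategy k; row i is dominant if it is at least as good
    as every other row for every opponent strategy k.
    """
    for i in range(U.shape[0]):
        if all(np.all(U[i, :] >= U[j, :]) for j in range(U.shape[0]) if j != i):
            return i
    return None

# Hypothetical 3 x 3 benefit matrix in which the second strategy dominates
U = np.array([[3.0, 1.0, 2.0],
              [4.0, 2.0, 2.5],
              [1.0, 0.5, 2.0]])
print(find_dominant_strategy(U))  # -> 1
```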
Algorithm 3 Mixed-strategy BNE solving sub-algorithm Solve2(SSABG)
Input:  G_SSABG model
Output: Swarming strength allocation strategies.
   1: Obtain P^A and P^D from historical data.
   2: while ‖F^A − F^A_prev‖ > tol and ‖F^D − F^D_prev‖ > tol, and the number of iterations is less than the maximum allowed, do
   3:     for s_{A,j}(T_{A,j}) ∈ S_A(T_A) do
   4:         for s_{D,i}(T_{D,i}) ∈ S_D(T_D) do
   5:             ∑_{i=1}^{m} P^A(T_i^D | T_j^A) U^A(T_j^A, F^{A*}(T_j^A), F^{D*}(T_i^D)) ≥ ∑_{i=1}^{m} P^A(T_i^D | T_j^A) U^A(T_j^A, F^A(T_j^A), F^{D*}(T_i^D));
   6:             ∑_{j=1}^{n} P^D(T_j^A | T_i^D) U^D(T_i^D, F^{A*}(T_j^A), F^{D*}(T_i^D)) ≥ ∑_{j=1}^{n} P^D(T_j^A | T_i^D) U^D(T_i^D, F^{A*}(T_j^A), F^D(T_i^D));
   7:         end for
   8:     end for
   9: end while
  10: return Mixed-strategy BNE (F^{A*}, F^{D*}).
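One possible way to realize the iterative loop of Algorithm 3 is fictitious play, in which each A type repeatedly best-responds to D’s empirical mixed strategy and D best-responds to the prior-weighted empirical strategies of A. The sketch below is an illustrative assumption (the paper’s implementation may instead use linear programming), and all payoff values are placeholders.

```python
import numpy as np

def fictitious_play_bayesian(payoff_A, payoff_D, prior_D, iters=5000):
    """Illustrative fictitious-play iteration for a Bayesian game with one D type.

    payoff_A[j, k, g], payoff_D[j, k, g]: benefits of type-j A and of D when A
    plays S_k^A and D plays S_g^D. prior_D[j]: D's belief that A is of type j.
    Returns empirical mixed strategies (mix_A[j, k], mix_D[g]).
    """
    n_types, K, G = payoff_A.shape
    count_A = np.ones((n_types, K))     # action counts per A type
    count_D = np.ones(G)                # action counts for D
    for _ in range(iters):
        mix_A = count_A / count_A.sum(axis=1, keepdims=True)
        mix_D = count_D / count_D.sum()
        for j in range(n_types):        # each A type best-responds to D's mix
            count_A[j, np.argmax(payoff_A[j] @ mix_D)] += 1
        # D best-responds to the prior-weighted empirical strategies of A
        expected_D = np.einsum('j,jk,jkg->g', prior_D, mix_A, payoff_D)
        count_D[np.argmax(expected_D)] += 1
    return (count_A / count_A.sum(axis=1, keepdims=True),
            count_D / count_D.sum())

# Hypothetical zero-sum payoffs: 2 A types, 3 strategies per side
rng = np.random.default_rng(1)
payoff_A = rng.normal(size=(2, 3, 3))
mix_A, mix_D = fictitious_play_bayesian(payoff_A, -payoff_A,
                                        prior_D=np.array([0.45, 0.55]))
print(np.round(mix_A, 3), np.round(mix_D, 3))
```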

3.4. Methods for Comparison

To validate the robustness and novelty of the proposed SSABG model, four alternative allocation methods are selected for comparison with SSABG. The descriptions of the four methods are as follows:
  • Random allocation (RA) [41]: a random strength allocation method, making A and D randomly select an allocation strategy for interception or protection.
  • Greedy heuristic (GH) [41]: a greedy heuristics strength allocation method, making A and D always select an allocation strategy with the highest benefit.
  • Rule-based assignment (RBA) [42]: a rule-based strength allocation method, making A and D select an allocation strategy with certain rules.
  • Colonel Blotto game (CBG) [28]: a game theory-based allocation method, making A and D allocate strength at NE or BNE across targets.
In the RA method, pure strategies for A and D are numerous and randomly sampled. The expected benefit of each sampled strategy is computed, and the strategy yielding the highest benefit is selected as optimal. The number of samples is fixed at 1000 to balance solution efficiency and accuracy. Since this method does not optimize the game benefit, it is a lower-bound benchmark for an allocation approach’s performance. In the GH method, a classic greedy heuristic algorithm is employed at each step; the strategy with the highest local benefit is chosen. In the RBA method, the predefined rule is that A and D select the strategy with the highest expected benefit. In the CBG method, the protection process is decomposed into multiple battlefields. Victory on a battlefield is awarded to the side that allocates more resources, irrespective of cost. The criterion for strategy selection is whether the opponent’s strength is eliminated. The model assumes perfect information and thus serves as an upper-bound benchmark. To apply this method experimentally, the Harsanyi transformation is required to allocate strength.
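For concreteness, the selection rules of the RA and GH baselines can be sketched as below; the expected-benefit values are placeholders and the function names are assumptions for illustration only.

```python
import numpy as np

def random_allocation(expected_benefit, n_samples=1000, rng=None):
    """RA baseline: randomly sample strategy indices and keep the best one found."""
    rng = rng or np.random.default_rng()
    idx = rng.integers(0, expected_benefit.shape[0], size=n_samples)
    return int(idx[np.argmax(expected_benefit[idx])])

def greedy_allocation(expected_benefit):
    """GH baseline: always pick the strategy with the highest expected benefit."""
    return int(np.argmax(expected_benefit))

# Hypothetical expected benefits of D's four allocation strategies
eb = np.array([-41.7, -45.2, -307.0, -381.2])
print(random_allocation(eb), greedy_allocation(eb))  # both typically -> 0
```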
The above methods provide approaches for strength allocation, but they lack a game benefit quantification method and assumptions for modeling the game players; thus, on their own, they cannot address the high-value targets protection problem considered here. In the comparative simulations, this study uniformly handles the strength allocation strategy selection of the aforementioned methods within the proposed game framework and compares the advantages and disadvantages of the various methods under the same experimental conditions.

4. Numerical Simulations and Result Analysis

In this section, three experiments are conducted to verify the effectiveness of SSABG compared with other methods. The performances of the different allocation methods, including RA, GH, RBA, CBG, and SSABG, are evaluated by the average defense effectiveness and the execution time. RA serves as a baseline for the other methods, and CBG serves as an upper-bound benchmark. Then, simulations with different prior probabilities of A’s types are conducted to explore the effect on strategy selection. Finally, the comparison between RA, GH, RBA, CBG, and SSABG is analyzed.

4.1. Simulation Parameter Settings

A’s types are denoted as T^A = { T_1^A , T_2^A }, while D’s type is denoted as T^D = T_1^D. Specifically, T_1^A denotes the high-tech interception player, while T_2^A signifies the low-tech interception player. D’s prior probabilities of A’s types are ( P_1^A , P_2^A ) = ( 0.45 , 0.55 ). The strength allocation strategies embody the game mechanism by selecting different units to engage the opponent’s corresponding unmanned system strength. A’s (both high-tech and low-tech) initial strength allocation strategy probability distribution over ( f_1^A , f_2^A , f_3^A , f_4^A ) is ( 0.25 , 0.25 , 0.25 , 0.25 ), and D’s initial strength allocation strategy probability distribution over ( f_1^D , f_2^D , f_3^D , f_4^D ) is also ( 0.25 , 0.25 , 0.25 , 0.25 ).
The key variables and parameter settings are given by
s_r = [10, 4], s_b = [10, 4], t_r = [0.5, 0.7], t_b = [0.5, 1.1], μ_r = 0.9, λ_r = 0.9, A_d = [2.4, 1.2], A_k = [2, 2], A_d′ = A_d × 10^{−5}, A_k′ = A_k × 10^{−5}.
When A is a high-tech one, the damage coefficient matrices are given by
μ_b = 0.85, λ_b = 0.85, B_d = [2.6, 1.8], B_k = [0.5, 0.6], B_d′ = B_d × 10^{−5}, B_k′ = B_k × 10^{−5},
while when A is a low-tech one, the parameters are given by
μ_b = 0.8, λ_b = 0.8, B_d = [1.73, 1.2], B_k = [0.33, 0.4], B_d′ = B_d × 10^{−5}, B_k′ = B_k × 10^{−5}.
According to [43], the optimal operational principle for allocating strength is to concentrate strength to neutralize the opponent. In this study, the strength allocation strategies are designed to centralize all swarming strength to engage a single SUS of the opponent, rather than dispersing the strength to engage multiple SUS concurrently. Once one of the opponent’s SUS is eliminated, the remaining strength is reallocated to intercept another SUS of the opponent. Figure 3 and Figure 4 provide a comprehensive summary of the strength allocation strategies for A and D, respectively. Taking S_1^A in Figure 3 as an example, this strategy represents A allocating all of its DA-SUS and KE-SUS to intercept D’s DA-SUS; after D’s DA-SUS are neutralized, all DA-SUS and KE-SUS are allocated to intercept D’s KE-SUS. The other strategies in Figure 3 and Figure 4 are defined according to the arrow directions specified in the figures.
A’s and D’s fuel budgets Δv and time costs u_tc are detailed in Table 4. For A, the values [71, 45, 78, 101] correspond to Δv when A selects strategies S_1^A, S_2^A, S_3^A, and S_4^A, respectively. Similarly, the values [110, 159, 121, 195] represent u_tc for the same sequence of strategies. The remaining entries in Table 4 are read analogously. Note that the initial strengths (330 for D and 380 for A), the fuel budgets, and the time costs are taken from [28,34,38].

4.2. Simulation Results

To ensure robustness, all numerical simulations adopted 200 independent Monte Carlo runs for each method, with all simulations conducted under identical computational settings.

4.2.1. Comparison of Increasing Strength Scale

Four cases with increasing initial strength are simulated in this subsection. The illustration of Cases #1–#4 is given below.
  • Case #1: Symmetric and small-scale protection scenario. A’s and D’s initial strength is equal, that is, x = [ 50 , 50 ] and y = [ 50 , 50 ] .
  • Case #2: Defender-disadvantage scenario. A’s initial strength exceeds D’s, that is, x = [ 150 , 180 ] and y = [ 200 , 180 ].
  • Case #3: Defender-advantage scenario. D’s initial strength exceeds A’s, that is, x = [ 290 , 260 ] and y = [ 280 , 250 ].
  • Case #4: Large-scale protection scenario. A’s and D’s initial strengths are large and comparable, that is, x = [ 560 , 520 ] and y = [ 530 , 520 ].
Figure 5 shows the defense effectiveness of five methods under four cases. When the strength scale is small (Case #1), the defense effectiveness of SSABG does not markedly exceed that of other methods. In Cases #2–#4, SSABG obtains higher defense effectiveness than the other methods. RA always yields the lowest effectiveness because it selects strategies uniformly at random and cannot adapt to scenario changes. GH and RBA rank next but remain limited: GH maximizes only local best benefit, while RBA relies on fixed rules, so that neither method can cope with dynamic settings. CBG is grounded in game theory, yet ignores all costs and assumes complete information. In practice, it must be reformulated via a Harsanyi transformation, so its effectiveness remains below that of SSABG. These results confirm that SSABG protects high-value targets better than the alternatives under all cases by managing incomplete information and dynamic interactions.
Table 5 and Table 6 show the strategy selection probability of each method. RA selects a fixed pure strategy regardless of the strength scale. GH and RBA select a pure strategy because if no dominant strategy exists, GH and RBA select the strategy that maximizes their benefit without considering the opponent’s strategy. CBG selects the strategy with uniform distribution (except in Case #2) due to its cost omission and complete information assumption. Selecting either S 1 D ( T D ) or S 2 D ( T D ) allows D to eliminate A’s strength at a high attrition rate, so these two strategies are dominated strategies. Based on it, A selects its allocation strategies randomly (attrition rate data are provided in Appendix A, Table A9, Table A10, Table A11, Table A12, Table A13, Table A14, Table A15 and Table A16). SSABG tends to obtain a pure-strategy NE when the strength scale is small. As the strength scale grows, SSABG obtains a mixed-strategy BNE to balance interception threat and protection costs.

4.2.2. Comparison of Different Prior Probabilities

Figure 6 shows the defense effectiveness of the five methods as the prior probability P_1^A changes. The initial strengths of D and A in this simulation are x = [ 560 , 520 ] and y = [ 530 , 520 ], respectively. In Figure 6, SSABG obtains the highest defense effectiveness across all P_1^A, which means that the strategies selected by SSABG are superior. RA remains the least effective because it chooses strategies uniformly at random, ignores prior information, and cannot handle the uncertainty in A’s type. GH selects the strategy with the highest immediate benefit but neglects the uncertainty introduced by the prior, so its effectiveness is only slightly above that of RA and below that of SSABG. RBA follows fixed rules; although stable, the rules cannot adapt to changes in the prior, so its effectiveness stays at a medium level. CBG assumes complete information and therefore cannot update its beliefs as the prior varies; hence, its effectiveness is still lower than that of SSABG.
SSABG’s defense effectiveness fluctuates slightly with P 1 A but always stays at its highest. The method converts incomplete information into a complete yet imperfect information game via the Harsanyi transformation, obtains a pure-strategy NE or a mixed-strategy BNE, and updates its belief about A’s type according to the prior. It can be indicated that SSABG can obtain its pure-strategy NE or mixed-strategy BNE as the prior probability changes with high robustness.

4.2.3. Comparison of Execution Time

Figure 7 shows the average execution time for Cases #1–#4 with different initial strengths. RBA is the fastest, followed by GH. SSABG and CBG show similar execution times, and RA is the slowest. GH and RBA need no iteration; they directly select the strategy with the highest benefit. RA randomly enumerates N sampled strategy pairs of A and D and keeps the best, so it is the slowest. SSABG and CBG use linear programming to obtain the pure-strategy NE or mixed-strategy BNE, so they take longer than GH and RBA.
Furthermore, Figure 8 presents the execution time of five methods under different strength scales. The result in Figure 8 shows that the execution times of RA, GH, RBA, CBG, and SSABG are all stable as the strength scale expands, a phenomenon attributed to all five methods utilizing the improved Lanchester equation for benefit quantification. This method treats SUS with the same function as an integrated unit for analysis, avoiding the curse of dimensionality caused by the increasing number of agents, thus ensuring stable execution efficiency. Specifically, RA, GH, RBA, and SSABG rely on this method to calculate game benefits for selecting allocation strategies. CBG uses it to determine whether strength is fully eliminated for strategy selection. This shared framework ensures that the execution times of all five methods do not fluctuate significantly across scenarios of increasing scales.

4.3. Results Discussion

This subsection benchmarks the proposed SSABG against the RA, GH, RBA, and CBG methods based on the results presented in Section 4.2, focusing on model complexity, scalability, computational requirements, and general applicability. The structured comparison is presented in Table 7. In comparing model complexity, scalability, and general applicability, low represents the worst performance, while high represents the best performance. In contrast, in comparing computational requirements, high represents the worst performance, while low represents the best performance.
Model complexity. RA, GH, and RBA show low model complexity. RA requires random sampling to obtain a strategy. GH determines its strategy through greedy heuristic search. RBA depends on predefined fixed rules. CBG presents medium model complexity because it solves pure-strategy NE or mixed-strategy BNE under the assumption of complete information. Moreover, it ignores the cost in benefit quantification. SSABG presents the highest model complexity. To address two issues (large-scale strength and incomplete information assumption), SSABG applies the improved Lanchester equation at the macro level. It abstracts many agents as overall strength if they have the same function. Meanwhile, SSABG uses Bayesian game theory at the micro level to describe the uncertainty and interactions between A and D. It enhances the expressive capability of the model while increasing the model complexity.
Scalability. SSABG shows the highest scalability. All five methods benefit from the game benefit analysis framework based on the improved Lanchester equation and the Harsanyi transformation, and their execution times remain stable as the strength scale expands. SSABG quantifies game benefits through this framework and combines it with Bayesian equilibrium solving, which can address the strength allocation problem under incomplete information; furthermore, if the information is fully known, the method can also be used to obtain allocation strategies. CBG comes next: although it relies on this framework to measure the state of strength elimination, the assumption of complete information limits its adaptability to situations with information asymmetry. GH and RBA show medium performance; they are based on local optimality and predefined rules, respectively, and although their execution times are stable, they lack the ability to adjust flexibly. RA performs worst: although its benefit calculation relies on the same framework, its random sampling strategy fails to optimize the game benefit.
Computational requirements. RA imposes high computational requirements because it evaluates strategies’ benefits with a large number of random samples, resulting in the longest execution time. GH and RBA incur low computational costs as they select strategies based on local game benefits or fixed rules with the shortest execution times. CBG presents medium requirements. It applies linear programming to obtain pure-strategy NE or mixed-strategy BNE. SSABG also shows medium computational requirements, since it must solve a pure-strategy NE or a mixed-strategy BNE while accounting for operational costs. Its execution time is slightly lower than that of CBG, yet markedly lower than RA’s, and it does not increase noticeably with the strength scale.
General applicability. RA has low applicability. It serves as a baseline, accepting any allocation strategy but delivering consistently weak performance. GH also has low applicability: it performs well only when a dominant strategy exists, since it always selects the locally optimal game benefit; when no dominant strategy exists, its performance is relatively low. RBA has medium applicability; it is restricted to static scenarios covered by its predefined rules and cannot accommodate dynamic adjustments, resulting in migration costs. CBG has medium applicability within the complete information assumption; the Harsanyi transformation is required if the information is incomplete, yet it still cannot handle an incomplete information game. SSABG is highly applicable. It addresses cases with or without dominant strategies and accommodates both pure-strategy NE and mixed-strategy BNE. SSABG is well suited to large-scale heterogeneous SUS (e.g., UAVs, UUVs, and satellites) whose attrition dynamics are predicted by the Lanchester equation. Game benefits are derived directly from this equation, and equilibrium strategies are computed accordingly.
Additionally, a comparison of model assumptions, constraints, and limitations between RA, GH, RBA, CBG, and SSABG is given in Table 8.
The assumptions and constraints of SSABG make it more suitable for scenarios with incomplete information in high-value targets protection. Its limitation is that its advantages are not significant in small-scale scenarios, because stochastic effects dominate for small-scale SUS, making attrition prediction vulnerable to random factors [15]. RA assumes that strategy selection is random, making it unable to adapt to scenario changes. GH assumes that only the strategy with the maximum benefit is selected, ignoring the interaction with the opponent’s response strategies. RBA depends on a predefined rule, resulting in insufficient flexibility if the protection task changes. CBG assumes complete information and cannot handle uncertainty. By integrating the incomplete information game mechanism, SSABG makes up for the defects of the other methods in information processing. However, its efficiency is slightly inferior in small-scale scenarios due to its higher model complexity.
From the above discussion, SSABG has advantages in handling large-scale strength allocation scenarios, but does not show significant advantages in small-scale strength allocation scenarios. Specifically, for the small-scale strength allocation problem, fast methods such as GH or RBA are sufficient. For large-scale strength allocation problems, SSABG should be used to balance optimality of effectiveness and execution time.

5. Conclusions

A Bayesian game model was applied for SUS to accomplish the strength allocation with incomplete information. The Harsanyi transformation converts the game with incomplete information into a game of complete but imperfect information. It simplifies the analysis and makes the model more analyzable. In addition, a quantification method of the game benefits was proposed by introducing an improved Lanchester equation to predict the strength attrition. It can effectively solve the curse of dimensionality. Then, the swarming strength allocation algorithm based on the Bayesian game was developed to attain a solution. Simulation results verify the effectiveness of the proposed method. This study offers a solution to protecting high-value targets by large-scale SUS under incomplete information. The proposed framework optimizes resource utilization by allocating SUS’s strength to protect these assets. It reduces individual-agent attrition, providing decision makers with a theoretically grounded guide when information is incomplete.

Author Contributions

Methodology, software, writing—original draft preparation: L.L.; writing—review and editing: B.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article; the supporting code (Octave 10.2) is available on GitHub at https://github.com/leilie1221/Drones-manuscript-code (accessed on 7 August 2025).

DURC Statement

The current research is limited to the field of aerospace engineering, which is beneficial for protecting high-value targets and does not pose a threat to public health or national security. The authors acknowledge the dual-use potential of the research involving strength allocation decision making in unmanned swarming systems protection and confirm that all necessary precautions have been taken to prevent its misuse. In line with their ethical responsibilities, the authors strictly adhere to all relevant national and international regulations regarding DURC. They also advocate for responsible implementation, ethical conduct, regulatory compliance, and transparent reporting to mitigate misuse risks and promote beneficial outcomes.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Game Benefits and Attrition Rate in Cases #1–#4

Taking [−41.74, −18.26] in Table A1 as an example, the former represents D’s game benefit, while the latter represents A’s game benefit. The other game benefits in Table A1, Table A2, Table A3, Table A4, Table A5, Table A6, Table A7 and Table A8 are defined similarly. Furthermore, taking [0.17, 1] in Table A9 as an example, the former represents D’s attrition rate, while the latter represents A’s attrition rate. The other attrition rates in Table A9, Table A10, Table A11, Table A12, Table A13, Table A14, Table A15 and Table A16 are defined similarly.
Table A1. Game benefit in Case #1 (high-tech interception player).
Table A1. Game benefit in Case #1 (high-tech interception player).
S 1 A ( T 1 A ) S 2 A ( T 1 A ) S 3 A ( T 1 A ) S 4 A ( T 1 A )
S 1 D ( T D ) [−41.74, −18.26][−31.74, 60.74][−46.52, −14.48][−25.55, 61.55]
S 2 D ( T D ) [−45.15, 38.15][−50.16, 34.16][−41.83, −36.17][−40.85, 158.85]
S 3 D ( T D ) [−307.02, 231.02][−292.98, 243.98][−134.78, 97.78][−128.17, 135.17]
S 4 D ( T D ) [−381.16, 207.16][−380.11, 216.11][−199.55, 116.55][−99.95, 203.95]
Table A2. Game benefit in Case #1 (low-tech interception player).
Table A2. Game benefit in Case #1 (low-tech interception player).
S 1 A ( T 2 A ) S 2 A ( T 2 A ) S 3 A ( T 2 A ) S 4 A ( T 2 A )
S 1 D ( T D ) [−60.12, 32.12][−47.14, 42.14][−41.11, 11.11][−63.14, 91.14]
S 2 D ( T D ) [−55.42, −16.58][−66.45, 68.45][−58.29, 60.29][−70.31, 106.31]
S 3 D ( T D ) [−332.31, 264.31][−326.01, 221.01][−115.12, 63.12][−114.1, 164.1]
S 4 D ( T D ) [−406.95, 271.95][−416.67, 227.67][−203.16, 73.16][−94.13, 196.13]
Table A3. Game benefit in Case #2 (high-tech interception player).
Table A3. Game benefit in Case #2 (high-tech interception player).
S 1 A ( T 1 A ) S 2 A ( T 1 A ) S 3 A ( T 1 A ) S 4 A ( T 1 A )
S 1 D ( T D ) [−477.98, 417.98][−467.78, 496.78][132.38, −193.38][152.2, −116.2]
S 2 D ( T D ) [−482.52, 475.52][−487.3, 471.3][135.72, −213.72][136.68, −18.68]
S 3 D ( T D ) [−561.58, 485.58][−547.57, 498.57][−311.48, 274.48][−304.54, 311.54]
S 4 D ( T D ) [−635.66, 461.66][−634.65, 470.65][−377.38, 294.38][−277.46, 381.46]
Table A4. Game benefit in Case #2 (low-tech interception player).
Table A4. Game benefit in Case #2 (low-tech interception player).
S 1 A ( T 2 A ) S 2 A ( T 2 A ) S 3 A ( T 2 A ) S 4 A ( T 2 A )
S 1 D ( T D ) [136.3, −164.3][149.26, −154.26][166.25, −196.25][144.22, −116.22]
S 2 D ( T D ) [140.46, −212.46][129.44, −127.44][148.98, −146.98][136.96, −100.96]
S 3 D ( T D ) [−564.88, 496.88][−558.75, 453.75][−15.65, −36.35][−15.62, 65.62]
S 4 D ( T D ) [−638.31, 503.31][−648.17, 459.17][−105.21, −24.79][3.94, 98.06]
Table A5. Game benefit in Case #3 (high-tech interception player).
Table A5. Game benefit in Case #3 (high-tech interception player).
S 1 A ( T 1 A ) S 2 A ( T 1 A ) S 3 A ( T 1 A ) S 4 A ( T 1 A )
S 1 D ( T D ) [304.64, −364.64][314.59, −285.59][342.4, −403.4][363.37, −327.37]
S 2 D ( T D ) [299.65, −306.65][294.6, −310.6][347.1, −425.1][348.08, −230.08]
S 3 D ( T D ) [−645.52, 569.52][−631.42, 582.42][53.19, −90.19][59.88, −52.88]
S 4 D ( T D ) [−719.78, 545.78][−718.68, 554.68][−11.27, −71.73][88.35, 15.65]
Table A6. Game benefit in Case #3 (low-tech interception player).
Table A6. Game benefit in Case #3 (low-tech interception player).
S 1 A ( T 2 A ) S 2 A ( T 2 A ) S 3 A ( T 2 A ) S 4 A ( T 2 A )
S 1 D ( T D ) [344.95, −372.95][357.93, −362.93][378.93, −408.93][356.92, −328.92]
S 2 D ( T D ) [349.65, −421.65][338.62, −336.62][361.78, −359.78][349.76, −313.76]
S 3 D ( T D ) [−577.88, 509.88][−571.5, 466.5][193.87, −245.87][194.88, −144.88]
S 4 D ( T D ) [−652.21, 517.21][−661.84, 472.84][105.58, −235.58][214.6, −112.6]
Table A7. Game benefit in Case #4 (high-tech interception player).
Table A7. Game benefit in Case #4 (high-tech interception player).
S 1 A ( T 1 A ) S 2 A ( T 1 A ) S 3 A ( T 1 A ) S 4 A ( T 1 A )
S 1 D ( T D ) [731.08, −791.08][740.97, −711.97][813.41, −874.41][834.41, −798.41]
S 2 D ( T D ) [726.32, −733.32][721.21, −737.21][818.1, −896.1][819.09, −701.09]
S 3 D ( T D ) [−1052.69, 976.69][−1038.58, 989.58][272.33, −309.33][279.03, −272.03]
S 4 D ( T D ) [−1127.33, 953.33][−1126.21, 962.21][208.44, −291.44][307.36, −203.36]
Table A8. Game benefit in Case #4 (low-tech interception player).
Table A8. Game benefit in Case #4 (low-tech interception player).
S 1 A ( T 2 A ) S 2 A ( T 2 A ) S 3 A ( T 2 A ) S 4 A ( T 2 A )
S 1 D ( T D ) [834.83, −862.83][847.79, −852.79][882.7, −912.7][860.69, −832.69]
S 2 D ( T D ) [839.53, −911.53][828.48, −826.48][865.54, −863.54][853.53, −817.53]
S 3 D ( T D ) [−852.98, 784.98][−846.39, 741.39][568.74, −620.74][569.76, −519.76]
S 4 D ( T D ) [−928.63, 793.63][−938.05, 749.05][480.18, −610.18][589.2, −487.2]
Table A9. Attrition rate in Case #1 (high-tech interception player).
Table A9. Attrition rate in Case #1 (high-tech interception player).
S 1 A ( T 1 A ) S 2 A ( T 1 A ) S 3 A ( T 1 A ) S 4 A ( T 1 A )
S 1 D ( T D ) [0.17, 1][0.17, 1][0.17, 1][0.17, 1]
S 2 D ( T D ) [0.18, 1][0.18, 1][0.17, 1][0.17, 1]
S 3 D ( T D ) [1, 0.14][1, 0.14][0.42, 1][0.41, 1]
S 4 D ( T D ) [1, 0.14][1, 0.14][0.44, 1][0.43, 1]
Table A10. Attrition rate in Case #1 (low-tech interception player).
S 1 A ( T 2 A ) S 2 A ( T 2 A ) S 3 A ( T 2 A ) S 4 A ( T 2 A )
S 1 D ( T D ) [0.27, 1][0.27, 1][0.26, 1][0.26, 1]
S 2 D ( T D ) [0.27, 1][0.27, 1][0.26, 1][0.26, 1]
S 3 D ( T D ) [1, 0.14][1, 0.14][0.12, 1][0.12, 1]
S 4 D ( T D ) [1, 0.15][1, 0.15][0.13, 1][0.13, 1]
Table A11. Attrition rate in Case #2 (high-tech interception player).
S 1 A ( T 1 A ) S 2 A ( T 1 A ) S 3 A ( T 1 A ) S 4 A ( T 1 A )
S 1 D ( T D ) [1, 0.16][1, 0.16][0.1, 1][0.11, 1]
S 2 D ( T D ) [1, 0.16][1, 0.16][0.11, 1][0.11, 1]
S 3 D ( T D ) [1, 0.03][1, 0.03][0.83, 0.54][0.83, 0.54]
S 4 D ( T D ) [1, 0.03][1, 0.03][0.83, 0.53][0.83, 0.54]
Table A12. Attrition rate in Case #2 (low-tech interception player).
S 1 A ( T 2 A ) S 2 A ( T 2 A ) S 3 A ( T 2 A ) S 4 A ( T 2 A )
S 1 D ( T D ) [0.02, 1][0.02, 1][0.01, 1][0.01, 1]
S 2 D ( T D ) [0.02, 1][0.02, 1][0.01, 1][0.01, 1]
S 3 D ( T D ) [1, 0.09][1, 0.09][0.43, 1][0.44, 1]
S 4 D ( T D ) [1, 0.09][1, 0.09][0.44, 1][0.44, 1]
Table A13. Attrition rate in Case #3 (high-tech interception player).
S 1 A ( T 1 A ) S 2 A ( T 1 A ) S 3 A ( T 1 A ) S 4 A ( T 1 A )
S 1 D ( T D ) [0.16, 1][0.16, 1][0.08, 1][0.08, 1]
S 2 D ( T D ) [0.16, 1][0.16, 1][0.08, 1][0.08, 1]
S 3 D ( T D ) [1, 0.15][1, 0.15][0.55, 1][0.55, 1]
S 4 D ( T D ) [1, 0.15][1, 0.15][0.56, 1][0.55, 1]
Table A14. Attrition rate in Case #3 (low-tech interception player).
S 1 A ( T 2 A ) S 2 A ( T 2 A ) S 3 A ( T 2 A ) S 4 A ( T 2 A )
S 1 D ( T D ) [0.03, 1][0.03, 1][0.01, 1][0.01, 1]
S 2 D ( T D ) [0.03, 1][0.03, 1][0.01, 1][0.01, 1]
S 3 D ( T D ) [1, 0.32][1, 0.32][0.28, 1][0.28, 1]
S 4 D ( T D ) [1, 0.32][1, 0.32][0.28, 1][0.28, 1]
Table A15. Attrition rate in Case #4 (high-tech interception player).
S 1 A ( T 1 A ) S 2 A ( T 1 A ) S 3 A ( T 1 A ) S 4 A ( T 1 A )
S 1 D ( T D ) [0.18, 1][0.18, 1][0.09, 1][0.09, 1]
S 2 D ( T D ) [0.18, 1][0.18, 1][0.1, 1][0.1, 1]
S 3 D ( T D ) [1, 0.18][1, 0.18][0.57, 1][0.57, 1]
S 4 D ( T D ) [1, 0.18][1, 0.18][0.57, 1][0.57, 1]
Table A16. Attrition rate in Case #4 (low-tech interception player).
S 1 A ( T 2 A ) S 2 A ( T 2 A ) S 3 A ( T 2 A ) S 4 A ( T 2 A )
S 1 D ( T D ) [0.05, 1][0.05, 1][0.03, 1][0.03, 1]
S 2 D ( T D ) [0.05, 1][0.05, 1][0.03, 1][0.03, 1]
S 3 D ( T D ) [1, 0.4][1, 0.4][0.29, 1][0.29, 1]
S 4 D ( T D ) [1, 0.39][1, 0.39][0.29, 1][0.29, 1]
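The attrition rates above come from the improved Lanchester-equation benefit quantification described in the main text. As a rough, reproducible illustration of how such rates can be obtained when each swarm is treated as a single collective unit, the sketch below integrates the classical Lanchester square law with forward Euler steps. The function name, initial strengths, kill-rate coefficients, and engagement time are assumptions for illustration only; they are not the paper's improved equation or its parameters.

```python
import numpy as np

def lanchester_attrition(x0, y0, alpha, beta, t_end, dt=0.01):
    """Classical Lanchester square-law attrition, integrated with forward Euler.

    x0, y0      : initial strengths of the two swarms (treated as collective units)
    alpha, beta : effective kill-rate coefficients (y against x, x against y)
    Returns the fraction of each side's initial strength that is lost.
    Note: this is the classical law only; the paper's improved equation and its
    coefficients are defined in the main text and are not reproduced here.
    """
    x, y = float(x0), float(y0)
    t = 0.0
    while t < t_end and x > 0.0 and y > 0.0:
        dx = -alpha * y          # attrition of x caused by y
        dy = -beta * x           # attrition of y caused by x
        x = max(x + dx * dt, 0.0)
        y = max(y + dy * dt, 0.0)
        t += dt
    return (x0 - x) / x0, (y0 - y) / y0

# Illustrative run with assumed strengths and kill rates (x taken as D, y as A).
loss_D, loss_A = lanchester_attrition(x0=100, y0=120, alpha=0.06, beta=0.05, t_end=10)
print(f"Predicted attrition rates: D = {loss_D:.2f}, A = {loss_A:.2f}")
```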

References

  1. Zheng, Z.; Wei, C.; Duan, H. UAV swarm air combat maneuver decision-making method based on multi-agent reinforcement learning and transferring. Sci. China Inf. Sci. 2024, 67, 180204. [Google Scholar] [CrossRef]
  2. Zhao, P.; Wang, J.; Kong, L. Decentralized Algorithms for Weapon-Target Assignment in Swarming Combat System. Math. Probl. Eng. 2019, 2019, 8425403. [Google Scholar] [CrossRef]
  3. Zhang, J.; Han, K.; Zhang, P.; Hou, Z.; Ye, L. A survey on joint-operation application for unmanned swarm formations under a complex confrontation environment. J. Syst. Eng. Electron. 2023, 34, 1432–1446. [Google Scholar] [CrossRef]
  4. Hayat, S.; Yanmaz, E.; Muzaffar, R. Survey on unmanned aerial vehicle networks for civil applications: A communications viewpoint. IEEE Commun. Surv. Tutorials 2016, 18, 2624–2661. [Google Scholar] [CrossRef]
  5. Rosalie, M.; Danoy, G.; Chaumette, S.; Bouvry, P. Chaos-enhanced mobility models for multilevel swarms of UAVs. Swarm Evol. Comput. 2018, 41, 36–48. [Google Scholar] [CrossRef]
  6. Zhang, B.; Liao, J.; Kuang, Y.; Zhang, M.; Zhou, S.; Kang, Y. Research status and development of the United States UAV swarm battlefield. Aero Weapon. 2020, 27, 7–12. [Google Scholar]
  7. Wang, K.; Wang, D.; Meng, Q. Overview on the development of United States unmanned aerial vehicle cluster combat system. In Proceedings of the 2024 International Conference on Unmanned Aircraft Systems, Chania, Greece, 4–7 June 2024; pp. 92–95. [Google Scholar]
  8. Zhang, C.; Liu, T.; Bai, G.; Tao, J.; Zhu, W. A dynamic resilience evaluation method for cross-domain swarms in confrontation. Reliab. Eng. Syst. Saf. 2024, 244, 109904. [Google Scholar] [CrossRef]
  9. Xing, D.; Zhen, Z.; Gong, H. Offense-defense confrontation decision making for dynamic UAV swarm versus UAV swarm. Proc. Inst. Mech. Eng. Part G J. Aerosp. Eng. 2019, 233, 5689–5702. [Google Scholar] [CrossRef]
  10. Li, Y.; Liu, Z.; Hong, Y.; Wang, J.; Wang, J.; Li, Y.; Tang, Y. Multi-agent reinforcement learning based game: A survey. Acta Autom. Sin. 2025, 51, 540–558. [Google Scholar]
  11. Lima Filho, G.M.d.; Medeiros, F.L.L.; Passaro, A. Decision support system for unmanned combat air vehicle in beyond visual range air combat based on artificial neural networks. J. Aerosp. Technol. Manag. 2021, 13, e3721. [Google Scholar] [CrossRef]
  12. Jian, J.; Chen, Y.; Li, Q.; Li, H.; Zheng, X.; Han, C. Decision-Making Method of Multi-UAV Cooperate Air Combat Under Uncertain Environment. IEEE J. Miniaturization Air Space Syst. 2024, 5, 138–148. [Google Scholar] [CrossRef]
  13. Hao, X.; Fang, Z.; Zhang, J.; Deng, F.; Jiang, A.; Xiao, S. Reinforcement Model for Unmanned Combat System of Systems Based on Multi-Layer Grey Target. J. Grey Syst. 2024, 36, 54–66. [Google Scholar]
  14. Luo, B.; Hu, T.; Zhou, Y.; Huang, T.; Yang, C.; Gui, W. Survey on multi-agent reinforcement learning for control and decision-making. Acta Autom. Sin. 2025, 51, 510–539. [Google Scholar]
  15. Sha, J. Mathematic Tactics; Science Press: Beijing, China, 2003. [Google Scholar]
  16. Chen, X.; Cao, J.; Zhao, F.; Jiang, X. Nash equilibrium analysis of hybrid dynamic games system based on event-triggered control. Control Theory Appl. 2021, 38, 1801–1808. [Google Scholar]
  17. Chen, X.; Wang, D.; Zhao, F.; Guo, M.; Qiu, J. A Viewpoint on Construction of Networked Model of Event-triggered Hybrid Dynamic Games. In Proceedings of the 2022 IEEE Conference on Games (CoG), Beijing, China, 21–24 August 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 473–477. [Google Scholar]
  18. Elliott, D.S.; Vatsan, M. Efficient Computation of Weapon-Target Assignments Using Abstraction. IEEE Control Syst. Lett. 2023, 7, 3717–3722. [Google Scholar] [CrossRef]
  19. Nam Eung, H.; Hyung Jun, K. Static Weapon-Target Assignment Based on Battle Probabilities and Time-Discounted Reward. Def. Sci. J. 2024, 74, 662–670. [Google Scholar] [CrossRef]
  20. Silav, A.; Karasakal, E.; Karasakal, O. Bi-objective dynamic weapon-target assignment problem with stability measure. Ann. Oper. Res. 2022, 311, 1229–1247. [Google Scholar] [CrossRef]
  21. Wang, Z.; Lu, Y.; Li, X.; Li, Z. Optimal defense strategy selection based on the static Bayesian game. J. Xidian Univ. 2019, 46, 55–61. [Google Scholar]
  22. Liu, X.; Zhang, H.; Ma, J.; Tan, J. Research review of network defense decision-making methods based on attack and defense game. Chin. J. Netw. Inf. Secur. 2022, 8, 1–14. [Google Scholar]
  23. Liu, X.; Zhang, H.; Zhang, Y.; Hu, H.; Cheng, J. Modeling of network attack and defense behavior and analysis of situation evolution based on game theory. J. Electron. Inf. Technol. 2021, 43, 3629–3638. [Google Scholar]
  24. Kline, A.; Ahner, D.; Hill, R. The weapon-target assignment problem. Comput. Oper. Res. 2019, 105, 226–236. [Google Scholar] [CrossRef]
  25. He, W.; Tan, J.; Guo, Y.; Shang, K.; Zhang, H. A Deep Reinforcement Learning-Based Deception Asset Selection Algorithm in Differential Games. IEEE Trans. Inf. Forensics Secur. 2024, 19, 8353–8368. [Google Scholar] [CrossRef]
  26. Deng, L.; Wu, J.; Shi, J.; Xia, J.; Liu, Y.; Yu, X. Research on intelligent decision technology for multi-UAVs prevention and control. In Proceedings of the 2020 Chinese Automation Congress (CAC), Shanghai, China, 6–8 November 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 5362–5367. [Google Scholar]
  27. Xuan, S.; Ke, L. UAV swarm attack-defense confrontation based on multi-agent reinforcement learning. In Advances in Guidance, Navigation and Control: Proceedings of the 2020 International Conference on Guidance, Navigation and Control, ICGNC 2020, Tianjin, China, 23–25 October 2020; Springer: Berlin/Heidelberg, Germany, 2021; pp. 5599–5608. [Google Scholar]
  28. Ji, X.; Zhang, W.; Xiang, F.; Yuan, W.; Chen, J. A swarm confrontation method based on Lanchester law and Nash equilibrium. Electronics 2022, 11, 896. [Google Scholar] [CrossRef]
  29. Chi, S.; Li, S.; Wang, C.; Xie, G. A review of research on pursuit-evasion games. Acta Electron. Sin. 2025, 51, 705–726. [Google Scholar]
  30. Majumder, R.; Ghose, D. A strategic decision support system using multiplayer non-cooperative games for resource allocation after natural disasters. IEEE Trans. Autom. Sci. Eng. 2022, 20, 2227–2240. [Google Scholar] [CrossRef]
  31. Yi, N.; Wang, Q.; Yan, L.; Tang, Y.; Xu, J. A multi-stage game model for the false data injection attack from attacker’s perspective. Sustain. Energy Grids Netw. 2021, 28, 100541. [Google Scholar] [CrossRef]
  32. Wei, N.; Liu, M.; Cheng, W. Decision-making of underwater cooperative confrontation based on MODPSO. Sensors 2019, 19, 2211. [Google Scholar] [CrossRef]
  33. Li, L.; Xiao, B.; Wu, X. Optimal control of strength allocation strategies generation with complex constraints. Trans. Inst. Meas. Control 2025, 47, 634–646. [Google Scholar] [CrossRef]
  34. Jiang, L.; Zhang, H.; Wang, J. Optimal strategy selection method for moving target defense based on signaling game. J. Commun. 2019, 40, 128–137. [Google Scholar]
  35. Liu, L.; Zhang, Q.; Zhao, Z. Operational conception and key technologies of unmanned aerial vehicle swarm interception system. Command Control Simul. 2021, 43, 48–54. [Google Scholar]
  36. Lei, C.; Zhang, H.; Wan, L.; Liu, L.; Ma, D. Incomplete information Markov game theoretic approach to strategy generation for moving target defense. Comput. Commun. 2018, 116, 184–199. [Google Scholar] [CrossRef]
  37. Kanakia, A.; Touri, B.; Correll, N. Modeling multi-robot task allocation with limited information as global game. Swarm Intell. 2016, 10, 147–160. [Google Scholar] [CrossRef]
  38. Reesman, R.; Wilson, J.R. The Physics of Space War: How Orbital Dynamics Constrain Space-to-Space Engagements; Aerospace Corporation: Singapore, 2020. [Google Scholar]
  39. Klinkova, G.; Grabinski, M. A Statistical Analysis of Games with No Certain Nash Equilibrium Make Many Results Doubtful. Appl. Math. 2022, 13, 120–130. [Google Scholar] [CrossRef]
  40. Wang, J.; Yu, D.; Zhang, H.; Wang, N. Active defense strategy selection based on the static Bayesian game. J. Xidian Univ. 2016, 43, 144–150. [Google Scholar]
  41. Yang, Z.; Chen, Y.P.; Gu, F.; Wang, J.; Zhao, L. A Two-Way Update Resource Allocation Strategy in Mobile Edge Computing. Wirel. Commun. Mob. Comput. 2022, 2022, 1117597. [Google Scholar] [CrossRef]
  42. Yu, X.; Khani, A.; Chen, J.; Xu, H.; Mao, H. Real-time holding control for transfer synchronization via robust multiagent reinforcement learning. IEEE Trans. Intell. Transp. Syst. 2022, 23, 23993–24007. [Google Scholar] [CrossRef]
  43. Clausewitz, C. von. On War; Oxford University Press: Oxford, UK, 2007; Chapter 2; pp. 147–148. [Google Scholar]
Figure 1. Illustration of the SSABG model.
Figure 2. High-value targets protection scenario illustration.
Figure 3. Strength allocation strategies of A.
Figure 4. Strength allocation strategies of D.
Figure 5. Comparison of defense effectiveness with increasing strength scale.
Figure 6. Comparison of defense effectiveness with different prior probabilities.
Figure 7. Comparison of average execution time with increasing strength scales.
Figure 8. Comparison of execution time with increasing strength scales (the lowest values belong to RBA).
Table 1. Four stages of the SSABG model.
Stage #1 (Introduce “Nature” as a virtual player):
  • Introduce “Nature” as a virtual player responsible for initializing type in the game.
Stage #2 (Determine players’ types):
  • Through Harsanyi transformation, convert the type uncertainty of A and D into a virtual selection by “Nature”, constructing a game framework with complete but imperfect information.
  • A and D only know their own types and the possible strategies the opponent may adopt, but not the opponent’s type.
Stage #3 (Select strategies simultaneously):
  • A and D simultaneously choose deterministic allocation strategies from their respective strategy spaces, without knowledge of the strategy selected by the other.
Stage #4 (Solve game equilibrium):
  • Solve the NE for A and D, yielding either a pure-strategy NE or a mixed-strategy BNE, and compute the defense effectiveness of each strategy.
  • Select the allocation strategy with the highest defense effectiveness.
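As a concrete reading of Stage #4, the minimal sketch below evaluates D's best response under the Harsanyi transformation: D weights its type-conditional benefit matrices (here the D-side entries of Tables A3 and A4, Case #2) by the prior type probabilities and by A's type-conditional strategies (the SSABG row of Table 5), then keeps the pure allocation strategy with the highest expected benefit. The equal priors, the variable names, and the reduction to a pure best response are assumptions for illustration; the full BNE solving procedure is described in the main text.

```python
import numpy as np

# D's benefit entries from Table A3 (A is the high-tech interception player, Case #2);
# rows are S_1^D..S_4^D, columns are S_1^A..S_4^A.
U_D_high = np.array([[-477.98, -467.78,  132.38,  152.20],
                     [-482.52, -487.30,  135.72,  136.68],
                     [-561.58, -547.57, -311.48, -304.54],
                     [-635.66, -634.65, -377.38, -277.46]])

# D's benefit entries from Table A4 (A is the low-tech interception player, Case #2).
U_D_low = np.array([[ 136.30,  149.26,  166.25,  144.22],
                    [ 140.46,  129.44,  148.98,  136.96],
                    [-564.88, -558.75,  -15.65,  -15.62],
                    [-638.31, -648.17, -105.21,    3.94]])

# A's type-conditional strategies: the SSABG row of Table 5 for Case #2.
f_A_high = np.array([0.0, 0.0, 1.0, 0.0])
f_A_low  = np.array([0.0, 0.0, 0.75, 0.25])

# Prior probabilities of A's type (assumed equal here, purely for illustration).
p_high, p_low = 0.5, 0.5

# Expected benefit of each pure allocation strategy of D, averaged over A's type.
expected_D = p_high * U_D_high @ f_A_high + p_low * U_D_low @ f_A_low
best = int(np.argmax(expected_D))
print("Expected benefits of S_1^D..S_4^D:", np.round(expected_D, 2))
print(f"Best response for D: S_{best + 1}^D")
```

Under these assumed priors the best response is S_1^D, which is consistent with the F^D(T^D) = (1,0,0,0) reported for SSABG in Table 5; the equilibrium in the paper itself is computed with the priors specified in the main text.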
Table 2. Assumptions of the SSABG model.
Rational decision making [36]: Both A and D are fully rational, with no cognitive limitations, and make decisions to maximize their benefits.
Information asymmetry [36]: Neither side can precisely know the other’s strategy benefits, but each can infer the opponent’s type through probability distributions, converting strategy uncertainty into players’ type uncertainty.
Perfect communication [37]: Within the swarming unmanned system, inter-agent communication is perfect. Every unit has complete knowledge of its state. Each unit’s type remains invariant throughout the interaction, while its strength evolves. Moreover, no random factors are involved in the SUS.
Table 3. Constraints of the SSABG model.
Strength constraints: The strength of A and D during the protection process is finite, limited by fuel and time.
Time constraints: A and D must complete their tasks within a certain time limit.
Table 4. Δv and u_tc.
Player | Δv | u_tc
A | [71, 45, 78, 101] | [110, 159, 121, 195]
D | [65, 15, 60, 86] | [75, 142, 114, 169]
Table 5. Selection probability of strategy in Cases #1 and #2.
Methods | Strategy | Case #1 | Case #2
RA | F^A(T_1^A) | (1,0,0,0) | (1,0,0,0)
RA | F^A(T_2^A) | (0,1,0,0) | (0,1,0,0)
RA | F^D(T^D) | (1,0,0,0) | (1,0,0,0)
GH | F^A(T_1^A) | (0,0,0,1) | (0,0,0,1)
GH | F^A(T_2^A) | (0,0,0,1) | (0,1,0,0)
GH | F^D(T^D) | (1,0,0,0) | (1,0,0,0)
RBA | F^A(T_1^A) | (0,1,0,0) | (0,1,0,0)
RBA | F^A(T_2^A) | (0,0,0,1) | (0,1,0,0)
RBA | F^D(T^D) | (1,0,0,0) | (1,0,0,0)
CBG | F^A(T_1^A) | (0.25,0.25,0.25,0.25) | (0.45,0.45,0.1,0)
CBG | F^A(T_2^A) | (0.25,0.25,0.25,0.25) | (0.25,0.25,0.25,0.25)
CBG | F^D(T^D) | (0,1,0,0) | (1,0,0,0)
SSABG | F^A(T_1^A) | (0,0,1,0) | (0,0,1,0)
SSABG | F^A(T_2^A) | (0,0,1,0) | (0,0,0.75,0.25)
SSABG | F^D(T^D) | (1,0,0,0) | (1,0,0,0)
* represents the Nash equilibrium.
Table 6. Selection probability of strategy in Cases #3 and #4.
Methods | Strategy | Case #3 | Case #4
RA | F^A(T_1^A) | (1,0,0,0) | (1,0,0,0)
RA | F^A(T_2^A) | (0,1,0,0) | (0,1,0,0)
RA | F^D(T^D) | (1,0,0,0) | (1,0,0,0)
GH | F^A(T_1^A) | (0,0,0,1) | (0,0,0,1)
GH | F^A(T_2^A) | (0,1,0,0) | (0,1,0,0)
GH | F^D(T^D) | (1,0,0,0) | (1,0,0,0)
RBA | F^A(T_1^A) | (0,1,0,0) | (0,1,0,0)
RBA | F^A(T_2^A) | (0,1,0,0) | (0,1,0,0)
RBA | F^D(T^D) | (1,0,0,0) | (1,0,0,0)
CBG | F^A(T_1^A) | (0.25,0.25,0.25,0.25) | (0.25,0.25,0.25,0.25)
CBG | F^A(T_2^A) | (0.25,0.25,0.25,0.25) | (0.25,0.25,0.25,0.25)
CBG | F^D(T^D) | (0,1,0,0) | (0,1,0,0)
SSABG | F^A(T_1^A) | (0.06,0.47,0.47,0) | (0.2,0,0.4,0.4)
SSABG | F^A(T_2^A) | (0,0.28,0.36,0.36) | (0.13,0.29,0.29,0.29)
SSABG | F^D(T^D) | (0.42,0.58,0,0) | (0.42,0.58,0,0)
* represents the Nash equilibrium.
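The profiles in Tables 5 and 6 can be compared on a common footing by inserting each method's type-conditional strategies into the same expected-benefit computation. The helper below is a generic sketch of that evaluation for mixed strategies of both players; the function name, the argument layout, and the small synthetic matrices in the usage example are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def expected_benefits(U_D, U_A, priors, f_A_by_type, f_D):
    """Expected benefits of D and A for a type-conditional mixed strategy profile.

    U_D, U_A     : dicts mapping A's type -> benefit matrices of D and A
                   (rows: D's strategies, columns: A's strategies).
    priors       : dict mapping A's type -> prior probability (sums to 1).
    f_A_by_type  : dict mapping A's type -> A's mixed allocation strategy.
    f_D          : D's mixed allocation strategy (D has a single type here).
    """
    ben_D = sum(priors[t] * f_D @ U_D[t] @ f_A_by_type[t] for t in priors)
    ben_A = sum(priors[t] * f_D @ U_A[t] @ f_A_by_type[t] for t in priors)
    return ben_D, ben_A

# Synthetic 2x2 example, purely to show the call pattern (values are made up).
U_D = {"high": np.array([[1.0, -2.0], [0.5, 0.0]]),
       "low":  np.array([[0.5,  1.0], [1.5, -0.5]])}
U_A = {t: -m for t, m in U_D.items()}   # zero-sum only in this toy example
priors = {"high": 0.5, "low": 0.5}
f_A = {"high": np.array([0.4, 0.6]), "low": np.array([1.0, 0.0])}
f_D = np.array([0.7, 0.3])

print(expected_benefits(U_D, U_A, priors, f_A, f_D))
```

With the benefit matrices from the appendix tables and the priors specified in the main text, the same call can be used to score the allocations listed above.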
Table 7. Comparison of five methods on model complexity, scalability, computational requirements, and general applicability.
Methods | Model Complexity | Scalability | Computational Requirements | General Applicability
RA | Low | Low | High | Low
GH | Low | Medium | Low | Low
RBA | Medium | Medium | Low | Medium
CBG | Medium | Medium | Medium | Medium
SSABG | High | High | Medium | High
Table 8. Comparison of five methods on assumptions, constraints, and limitations.
Methods | Assumptions | Constraints | Limitations
RA | Random strategy selection | None | Unable to adapt to scenario changes
GH | Strategy selection with the highest benefit | None | Disregards the opponent’s strategies
RBA | Strategy selection based on predefined fixed rules | None | Inflexible for dynamic scenarios due to high migration cost
CBG | Complete information elimination-based selection | None | Unable to handle incomplete-information scenarios
SSABG | Incomplete information, rational decision making, perfect communication | Total strength, engagement time | Advantage is less pronounced in small-scale scenarios