Cumulative Prospect Theory-Driven Pigeon-Inspired Optimization for UAV Swarm Dynamic Decision-Making

Peng, Yalan; Huo, Mengzhen

doi:10.3390/drones9070478


by Yalan Peng and Mengzhen Huo *

National Key Laboratory of Aircraft Integrated Flight Control, School of Automation Science and Electrical Engineering, Beihang University, Beijing 100083, China

* Author to whom correspondence should be addressed.

Drones 2025, 9(7), 478; https://doi.org/10.3390/drones9070478 (registering DOI)

Submission received: 31 May 2025 / Revised: 25 June 2025 / Accepted: 2 July 2025 / Published: 6 July 2025

(This article belongs to the Special Issue Biological UAV Swarm Control)


Abstract

To address the dynamic decision-making and control problem in unmanned aerial vehicle (UAV) swarms, this paper proposes a cumulative prospect theory-driven pigeon-inspired optimization (CPT-PIO) algorithm. Gray relational analysis and information entropy theory are integrated into cumulative prospect theory (CPT), constructing a prospect value model for Pareto solutions by setting reference points, defining value functions, and determining attribute weights. This prospect value is used to evaluate the quality of each Pareto solution and serves as the fitness function in the pigeon-inspired optimization (PIO) algorithm to guide its evolutionary process. Furthermore, incorporating individual and swarm situation assessment methods, the situation assessment model is constructed and the information entropy theory is employed to ascertain the weight of each assessment index. Finally, the reverse search mechanism and competitive learning mechanism are introduced into the standard PIO to prevent premature convergence and enhance the population’s exploration capability. Simulation results demonstrate that the proposed CPT-PIO algorithm significantly outperforms two novel multi-objective optimization algorithms in terms of search performance and solution quality, yielding higher-quality Pareto solutions for dynamic UAV swarm decision-making.

Keywords:

unmanned aerial vehicle; situation assessment; dynamic decision-making; cumulative prospect theory; pigeon-inspired optimization; information entropy

1. Introduction

With the rapid development of science and technology, unmanned aerial vehicles (UAVs) are now used in many fields, including commerce, industry, agriculture, and scientific research. This has promoted the rapid growth of related industries, and UAVs have become essential to the development of the low-altitude economy. In response to market demand, many researchers are exploring applications of UAVs in areas such as logistics and distribution, urban governance, low-altitude access, fire rescue, address identification, agricultural plant protection, environmental protection, and cultural and tourism development [1,2,3]. A central priority of current UAV research is to improve the applicability and performance of UAV swarms across these fields, thereby supporting the high-quality development of various industries. In dynamic and open environments, UAV swarms must respond intelligently and cooperatively to a wide range of real-time tasks [4]. Achieving effective decision-making and control in UAV swarms under such conditions poses a significant challenge, especially when multiple conflicting tasks must be balanced simultaneously [5].

In real-world application scenarios, UAV swarm decision-making often involves the optimization of multiple task objectives—such as minimizing mission completion time, communication delays, participation costs, and attrition risks. As the number of tasks increases, traditional multi-objective optimization (MOO) algorithms struggle to maintain performance. Specifically, Pareto-based dominance relationships become less discriminative in high-dimensional objective spaces, resulting in reduced search efficiency and weaker convergence. This makes it difficult to identify meaningful trade-offs among solutions and poses a considerable challenge for high-dimensional dynamic decision-making in UAV swarms [6,7].

To address these issues, various swarm intelligence algorithms have been explored, including particle swarm optimization (PSO), Artificial Bee Colony (ABC), genetic algorithms (GAs), etc. [8,9,10,11]. These methods have shown success in solving complex optimization problems in UAV control, such as task allocation and path planning.

A co-evolutionary multi-group particle swarm optimization algorithm was proposed in [12]; it enhances multi-mission UAV path planning by enabling cooperative evolution among subgroups, improving solution diversity, convergence speed, and task adaptability in complex environments. An efficient grid-based path planning method for UAVs using an improved Artificial Bee Colony algorithm was presented in [13]; it improves convergence speed and solution quality by incorporating adaptive strategies and local search mechanisms to better navigate complex environments. A vibrational genetic algorithm enhanced with Voronoi diagrams for autonomous UAV path planning was introduced in [14], effectively improving obstacle avoidance, path smoothness, and convergence efficiency in complex environments. A dynamic parameter genetic algorithm for collaborative strike task allocation in UAV swarms was proposed in [15]. It adaptively adjusts genetic parameters to effectively handle heterogeneous targets, enhancing task efficiency, adaptability, and coordination in dynamic combat scenarios.

The pigeon-inspired optimization (PIO) algorithm has gained attention for its inspiration from the homing behavior of pigeons, offering fast convergence and strong global search capabilities.

A hierarchical control strategy for multi-UAV obstacle avoidance based on an enhanced pigeon-inspired optimization algorithm was developed in [16], introducing layered decision-making to improve real-time responsiveness, path safety, and coordination in complex environments.

However, most of these algorithms operate under deterministic assumptions and do not consider the behavioral characteristics or risk preferences of decision-makers in uncertain environments.

Meanwhile, cumulative prospect theory (CPT) provides a psychologically grounded framework for modeling decision-making under risk and uncertainty. Unlike traditional utility-based models, CPT accounts for how people perceive gains and losses differently and how they subjectively distort probabilities. These characteristics have made CPT widely applicable in areas such as behavioral economics, finance, and risk analysis [17,18].

Integrating the cumulative prospect theory into a value of information framework to account for human-like risk perception was achieved in [19], providing a more realistic and behaviorally-informed approach to decision-making under uncertainty. A multi-criteria fuzzy portfolio selection method that combined three-way decision theory with cumulative prospect theory was presented in [20], offering a more nuanced and psychologically consistent framework for investment decision-making under uncertainty and ambiguity. A method based on the removal effects of criteria-multi-attributive border approximation area comparison decision-making was introduced in [21]. It integrates cumulative prospect theory with picture fuzzy sets, enhancing the evaluation of wearable health technology devices by capturing decision-makers’ psychological behavior and handling uncertainty more effectively.

Despite its effectiveness in modeling human behavior, CPT has not yet been systematically applied to UAV swarm optimization tasks, particularly those involving dynamics and uncertain outcomes.

To fill this gap, this paper proposes a novel algorithm named cumulative prospect theory-driven pigeon-inspired optimization (CPT-PIO) for solving the multi-objective dynamic decision-making problem in UAV swarms. The core idea is to incorporate decision-makers’ psychological preferences into the optimization process. First, gray relational analysis and information entropy are used to normalize and weight the objectives, reducing the influence of dimensional heterogeneity. Then, based on CPT, a comprehensive prospect value model is constructed by defining reference points, applying value functions, and assigning probability weights. This value serves as a fitness function to guide the search process of the PIO algorithm. To further improve global exploration and avoid premature convergence, a reverse search mechanism and a competitive learning mechanism are introduced into the standard PIO.

The primary contributions and novelties are discussed as follows:

A novel decision-making optimization framework is proposed by integrating cumulative prospect theory into the evaluation of UAV swarm Pareto solutions, allowing the algorithm to reflect psychological risk preferences through the construction of a prospect value model;
An entropy-weighted grey relational analysis method is introduced for swarm situation assessment, enabling objective and adaptive determination of assessment index weights in dynamic and uncertain environments;
The traditional pigeon-inspired optimization algorithm is enhanced by incorporating a reverse search mechanism and competitive learning strategy, which effectively avoids premature convergence and improves solution diversity and convergence speed.

These innovations jointly contribute to a robust and psychologically-informed optimization strategy, as demonstrated by superior performance in simulation experiments against state-of-the-art methods.

The rest of this paper is organized as follows: Section 2 formulates the UAV swarm mission problem. Section 3 presents the methods for UAV swarm control. Section 4 discusses the situation assessment model, and Section 5 describes the algorithm design and process. Section 6 reports the numerical simulation results, and Section 7 concludes the paper.

2. Problem Formulation

The dynamic task allocation of UAVs in a swarm is a complex, nonlinear stochastic process. Each individual UAV both produces and executes decision-making results. Through continuous local interaction, UAVs exchange information with their neighbors, including position and velocity data, which drives coordinated behaviors such as cohesion and alignment. Each UAV in the swarm is treated as a point-mass model, so the effect of attitude changes on its motion is neglected. The motion state variables of the UAVs are position, velocity, and acceleration [22]. This motion model can be described as

\begin{cases} {\dot{p}}_{i}^{c} = v_{i}^{c} \\ {\dot{v}}_{i}^{c} = q_{i}^{c} \end{cases}

(1)

where c denotes the type of the individual, c \in \{uav, target\}, and both types contain the same number of individuals, indexed by i = 1, 2, \dots, Num. The position of individual i of type c is represented by p_{i}^{c}, while its velocity and acceleration are represented by v_{i}^{c} and q_{i}^{c}, respectively.
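As an illustration of how the point-mass model in (1) might be advanced in a simulation, the following minimal sketch applies one explicit Euler step. The function name `step_point_mass`, the step size `dt`, and the array layout are assumptions made for this example and are not specified in the paper.

```python
import numpy as np

def step_point_mass(p, v, q, dt):
    """Advance Eq. (1) by one explicit Euler step (illustrative only).

    p, v, q: arrays of shape (Num, 3) with positions, velocities and
    commanded accelerations; dt is a hypothetical integration step.
    """
    p_next = p + dt * v  # p_dot = v
    v_next = v + dt * q  # v_dot = q
    return p_next, v_next

# Two UAVs starting at the origin with different velocities.
p0 = np.zeros((2, 3))
v0 = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
q0 = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 0.0]])
p1, v1 = step_point_mass(p0, v0, q0, dt=0.1)
```

Explicit Euler is chosen here only for brevity; any other integrator could be substituted without changing the model.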

To set up the UAV swarm mission scenario as shown in Figure 1, the UAV swarm and the target swarm should be established. It is assumed that the number of UAVs and targets is the same, and there is an airfield as a destination for the UAV swarm after all target missions are allocated. Both the UAVs and the targets are fixed-wing UAVs of the same type, with the same size, specifications, and flight capabilities. When one target has been continuously locked on to by a UAV for a certain period of time, the mission of this target is considered to be accomplished.

In this framework, each UAV autonomously executes the decision-making process in a fully decentralized manner. Each UAV executes the decision-making algorithm locally via onboard embedded systems, enabling autonomous operation without relying on a central server. Information, such as position, velocity, and posture indicators, is exchanged through short-range periodic broadcasts within each UAV’s communication radius, enabling resilience and scalability across the swarm.

3. Method for UAV Swarm Control

In the initial phase, the UAV swarm cannot perceive the presence of targets. The objective of the UAVs is to form an orderly whole within the swarm, moving at high speed towards the airfield.

In 2008, Ballerini et al. studied the flight data of starling flocks and found that the individuals in the flock were only speed-correlated with the 6–7 nearest neighbors, and the number of interacting neighbors remained stable, unaffected by the distance between individuals [23]. This phenomenon, referred to as “topological interactions,” challenges the traditional principle of defining neighbors based on relative distance [24,25]. When topological interactions are considered, the number of interacting neighbors is nearly constant and much smaller than the number of neighbors that can be sensed. Drawing on the strong spatial consistency shown by the starlings in nature, the unique information interaction mechanism is applied to the control of UAV swarm. The seven individuals closest are selected to interact with each other [26]. UAVs perceive neighboring positions and velocities using onboard non-visual sensors such as GPSs, inertial measurement units (IMUs), or ultra-wideband (UWB) modules, enabling reliable relative positioning.

For type c, the distance between UAV i and every other UAV j, denoted {dis}_{i, j}^{c}, is calculated as

{dis}_{i, j}^{c} = |p_{i}^{c} - p_{j}^{c}|, (j = 1, 2, \dots, Num, j \neq i)

(2)

Then, {dis}_{i, j}^{c} is sorted in ascending order. With {SN}_{i, j}^{c} denoting the rank of UAV j in this ordering, the neighbor set of UAV i can be expressed as

{nei}_{i}^{c} = \{ j \mid {SN}_{i, j}^{c} \leq 7 \}

(3)

where {nei}_{i}^{c} represents the neighbor set of UAV i of type c.

To search for targets more efficiently, the UAV swarm should achieve spatial and speed consistency and avoid collisions between individuals within the swarm. Speed consistency is driven by the acceleration

q_{v_{i}^{c}} = - k_{v} (v_{i}^{c} - \frac{1}{| {nei}_{i}^{c} |} \sum_{j \in {nei}_{i}^{c}} v_{j}^{c})

(4)

where q_{v_{i}^{c}} represents the speed consistency acceleration of UAV i of type c, k_{v} represents the correlation coefficient, k_{v} > 0, and | {nei}_{i}^{c} | stands for the number of neighbors, whose value is 7.
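The neighbor selection in (2) and (3) and the speed consistency term in (4) can be sketched as follows. This is a minimal illustration that assumes the swarm contains more than k UAVs; the function names and the default gain `k_v=1.0` are hypothetical.

```python
import numpy as np

def topological_neighbors(p, k=7):
    """Indices of the k nearest peers of every UAV, following Eqs. (2)-(3).

    p: array of shape (Num, 3); assumes Num > k.
    """
    diff = p[:, None, :] - p[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)      # dis_{i,j}
    np.fill_diagonal(dist, np.inf)            # exclude j == i
    return np.argsort(dist, axis=1)[:, :k]    # ranks 1..k

def alignment_acceleration(v, nei, k_v=1.0):
    """Speed consistency acceleration of Eq. (4)."""
    mean_neighbor_v = v[nei].mean(axis=1)     # average velocity over nei_i
    return -k_v * (v - mean_neighbor_v)
```

Because the neighbor set is defined by rank rather than by a distance threshold, every UAV always interacts with exactly k peers, which is the topological interaction property described above.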

For the entire UAV swarm to form a compact whole, a neighbor j exerts an aggregation force on UAV i when the distance between them exceeds a set threshold.

The set of individuals in the neighbor set of UAV i that generate aggregation forces on UAV i, {nei_ag}_{i}^{c}, is determined by

{nei_ag}_{i}^{c} = \{ j \mid r_{re} \leq ‖p_{i}^{c} - p_{j}^{c}‖ \leq r_{ag}, j \in {nei}_{i}^{c} \}

(5)

where p_{j}^{c} indicates the position of UAV j of type c, and r_{re} and r_{ag} denote the repulsion and aggregation radii, respectively.

The spatial aggregation acceleration can be expressed as

q_{{ag}_{i}^{c}} = k_{ag} (\frac{1}{r_{ag} - r_{re}} \cdot \frac{p_{i}^{c} - p_{j}^{c}}{{‖p_{i}^{c} - p_{j}^{c}‖}_{2}})

(6)

where k_{ag} represents the correlation coefficient, k_{ag} > 0, and j \in {nei_ag}_{i}^{c}.

Collision avoidance between UAVs is also considered: if the distance between UAV i and a neighbor is less than the preset safety threshold, that neighbor repels UAV i.

The set of individuals in the neighbor set that generate repulsive forces on UAV i, {nei_re}_{i}^{c}, is determined by

{nei_re}_{i}^{c} = \{ j \mid ‖p_{i}^{c} - p_{j}^{c}‖ \leq r_{re}, j \in {nei}_{i}^{c} \}

(7)

q_{{re}_{i}^{c}} = k_{re} (\frac{1}{r_{re}} \cdot \frac{p_{i}^{c} - p_{j}^{c}}{{‖p_{i}^{c} - p_{j}^{c}‖}_{2}})

(8)

where q_{{re}_{i}^{c}} represents the repulsive acceleration of UAV i of type c, k_{re} represents the correlation coefficient, k_{re} > 0, and j \in {nei_re}_{i}^{c}.

The UAV swarm is attracted toward the airfield. The acceleration generated by this attraction can be expressed as

q_{airfield} = k_{airfield} \cdot \frac{p_{airfield} - p_{UAV}}{‖p_{airfield} - p_{UAV}‖}

(9)

p_{UAV} = \frac{1}{Num} \cdot \sum_{i = 1}^{Num} p_{i}

(10)

where q_{airfield} represents the attraction acceleration toward the airfield, k_{airfield} indicates the correlation coefficient, k_{airfield} > 0, p_{airfield} is the position of the airfield, and p_{UAV} represents the position of the UAV swarm center.

The combined acceleration of the c type UAV i can be expressed as follows:

q_{i}^{c} = \begin{cases} q_{v_{i}^{c}} + q_{{ag}_{i}^{c}} + q_{{re}_{i}^{c}} + q_{airfield}, & q_{i}^{c} \leq q_{\max} \\ q_{\max}, & q_{i}^{c} \geq q_{\max} \end{cases}

(11)

where

q_{\max}

represents the acceleration limit of the UAV to avoid excessive motion fluctuations caused by the simultaneous action of multiple interaction forces. This constraint ensures the resulting control input remains within feasible dynamic bounds and helps maintain stable formation.
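One possible way to apply the limit in (11) to vector-valued accelerations is sketched below. The paper states the limit on q_{i}^{c} without specifying how a vector is saturated, so comparing the Euclidean norm with q_{\max} and rescaling while keeping the direction is an assumption of this example, as is the function name `combine_and_limit`.

```python
import numpy as np

def combine_and_limit(q_v, q_ag, q_re, q_airfield, q_max):
    """Sum the four acceleration terms and saturate the result at q_max.

    Assumption: the limit is applied to the vector norm and the direction
    of the combined acceleration is preserved.
    All inputs are arrays of shape (Num, 3) or broadcastable to it.
    """
    q = q_v + q_ag + q_re + q_airfield
    norm = np.linalg.norm(q, axis=-1, keepdims=True)
    # Scale factor is 1 when the norm is within the limit, q_max / norm otherwise.
    scale = np.minimum(1.0, q_max / np.maximum(norm, 1e-12))
    return q * scale
```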

The distance between the UAV swarm centers can be expressed as

dis = ‖ p_{UAV}^{c} - p_{UAV}^{c^{'}} ‖

(12)

where p_{UAV}^{c^{'}} stands for the position of the target swarm. When dis is greater than r_{s}, the detection radius of the UAV, the two swarms cannot perceive each other and the UAV swarm heads for the airfield. When dis \leq r_{s}, the objectives of the UAV swarm change, with some individuals aiming at the target swarm.

To improve the efficiency of task allocation, the UAV swarm is divided into groups, so that identifying a target for each UAV reduces to determining an objective for each group.

The swarm of type c is divided into {Num}_{group} groups as follows. First, {Num}_{group} individuals in the swarm are selected as initial group centers. Then, the distance from each remaining UAV to each of the {Num}_{group} centers is calculated, and each UAV joins the group of its nearest center. Once all UAVs are grouped, the center coordinates of each group are recalculated and become the new group centers. These steps are repeated until the end of the cycle.
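The grouping procedure described above resembles k-means clustering and can be sketched as follows. The fixed iteration count `n_iter`, the random choice of initial centers, and the function name `group_swarm` are assumptions of this example; the paper only states that the steps repeat until the end of the cycle.

```python
import numpy as np

def group_swarm(p, num_group, n_iter=20, seed=0):
    """Group UAV positions p (shape (Num, 3)) around num_group centers.

    Assumption: initial centers are drawn at random from the swarm members
    and the loop runs for a fixed number of iterations.
    """
    rng = np.random.default_rng(seed)
    centers = p[rng.choice(len(p), size=num_group, replace=False)].copy()
    for _ in range(n_iter):
        # Assign every UAV to its nearest center.
        dist = np.linalg.norm(p[:, None, :] - centers[None, :, :], axis=-1)
        labels = dist.argmin(axis=1)
        # Recompute each center as the mean position of its members.
        for g in range(num_group):
            members = p[labels == g]
            if len(members) > 0:
                centers[g] = members.mean(axis=0)
    return labels, centers
```

Empty groups keep their previous center in this sketch; a different policy, such as re-seeding the center, could be used instead.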

4. Situation Assessment Model

In information technology applications, accurate assessment of the task situation is an important part of data fusion and correct decision-making. The situation of a UAV swarm is not simply the combined state of the environment and the targets in the mission area at a given time. Rather, a comprehensive posture is obtained by building a view of the targets and the UAV swarm in terms of activity, time, location, and status, and by relating the target distribution to activity, environment, intentions, and maneuverability.

The situation assessment indicator is based on the state changes of all individuals in the confrontation environment. The situation changes over time but is relatively stable: it can be treated as constant over a short interval and changes only after sufficient time has elapsed.

The situation assessment mainly considers the overall motion state of the targets, the difficulty of targeting under different target selection strategies, and the difficulty of executing different behavior strategies [27].

In this paper, two main types of indicators are considered: the relative advantage indicator and swarm synergy indicator. The relative advantage indicator includes the angle advantage indicator, position advantage indicator, speed advantage indicator, distance advantage indicator, and height advantage indicator.

The angular advantage indicator is a function of α_{nm}, the angle between the mean motion vector of the n group and the vector pointing from the geometric center of the n group to the geometric center of the m group. It characterizes how difficult it is, in terms of maneuvering, for the n group to carry out its task against the m group.

ζ_{nm}^{a} = {(k_{α} \cdot \exp (α_{nm}) + 1)}^{- 1}

(13)

where ζ_{nm}^{a} indicates the angular advantage indicator of the n group against the m group and k_{α} represents the correlation coefficient.

The position advantage indicator is a function of β_{nm}, the angle between the mean motion vector of the m group and the vector pointing from the geometric center of the n group to the geometric center of the m group. It characterizes the position advantage of the n group in carrying out its task against the m group.

ζ_{nm}^{β} = \sin (\frac{π}{2} \cdot \frac{β_{nm}}{k_{β}})

(14)

where ζ_{nm}^{β} indicates the position advantage indicator of the n group against the m group and k_{β} represents the correlation coefficient.

The speed advantage indicator is a function of the speeds of the n group and the m group. When the speed difference between the two sides is large, the indicator saturates at a constant value.

ζ_{nm}^{v} = \begin{cases} 0.1, & ‖v_{n}‖ < 0.6 ‖v_{m}‖ \\ - 0.5 + \frac{‖v_{n}‖}{‖v_{m}‖}, & 0.6 ‖v_{m}‖ \leq ‖v_{n}‖ \leq 1.5 ‖v_{m}‖ \\ 1.0, & 1.5 ‖v_{m}‖ < ‖v_{n}‖ \end{cases}

(15)

where ζ_{nm}^{v} indicates the speed advantage indicator of the n group against the m group, and v_{n} and v_{m} represent the mean velocities of the n group and the m group, respectively.

The distance advantage indicator is a function of the distance between the geometric centers of the two groups. The closer the two groups, the less time the task takes and the greater the distance advantage indicator.

ζ_{nm}^{d} = \begin{cases} 1, & d < r_{a} \\ 1 - \frac{d - r_{a}}{r_{s} - r_{a}}, & r_{a} \leq d \leq r_{s} \\ 0, & r_{s} < d \end{cases}

(16)

where ζ_{nm}^{d} indicates the distance advantage indicator of the n group against the m group, d represents the distance between the geometric centers of the n group and the m group, and r_{s} and r_{a} indicate the perception radius and the task execution radius of the UAV, respectively.
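The two piecewise indicators in (15) and (16) can be written as small functions. In this sketch the middle branch of (15) is read as -0.5 + ‖v_{n}‖ / ‖v_{m}‖, an assumption that makes the indicator continuous at both breakpoints (0.1 at a ratio of 0.6 and 1.0 at a ratio of 1.5). The function names are hypothetical.

```python
import numpy as np

def speed_advantage(v_n, v_m):
    """Speed advantage indicator, Eq. (15).

    Assumption: the middle branch is -0.5 + |v_n| / |v_m|.
    """
    ratio = np.linalg.norm(v_n) / np.linalg.norm(v_m)
    if ratio < 0.6:
        return 0.1
    if ratio <= 1.5:
        return float(ratio - 0.5)
    return 1.0

def distance_advantage(d, r_a, r_s):
    """Distance advantage indicator, Eq. (16); assumes r_a < r_s."""
    if d < r_a:
        return 1.0
    if d <= r_s:
        return 1.0 - (d - r_a) / (r_s - r_a)
    return 0.0
```

Both functions return values in [0, 1], in line with the normalization applied to all situation assessment terms.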

The height advantage indicator is a function of the difference in mean height between the m group and the n group. It characterizes the height advantage of the n group in carrying out its task against the m group.

ζ_{nm}^{h} = \frac{k_{h} \cdot \exp (∆ h (p_{n}, p_{m}))}{k_{h} \cdot \exp (∆ h (p_{n}, p_{m})) + 1}

(17)

where ζ_{nm}^{h} indicates the height advantage indicator of the n group against the m group, k_{h} represents the correlation coefficient, and ∆ h expresses the difference in mean height between the m group and the n group.

The swarm synergy indicator characterizes how difficult it is for the individuals in a swarm to coordinate while executing tasks. It consists of the synergy indicator ζ_{nm}^{c 1}, measured by the order parameter in the horizontal plane, and the synergy indicator ζ_{nm}^{c 2}, measured by aggregation in the vertical plane.

ζ_{nm}^{c 1} = \frac{1}{Num} \cdot ‖\sum_{i = 1}^{Num} \frac{v_{i} - \bar{v}}{‖v_{i}‖}‖

(18)

ζ_{nm}^{c 2} = \sqrt{{(\frac{1}{Num} \cdot \sum_{j = 1}^{Num} {∆ h (p_{n}, p_{m})}^{2} + ε)}^{- 1}}

(19)

where ε is a small constant, ε \neq 0.

To ensure that each indicator contributes fairly and consistently, all situation assessment terms are normalized to the [0,1] interval. This guarantees that the final situation assessment score also lies within [0,1]. In summary, the formula for calculating the indicators for the comprehensive situation assessment of the UAV swarm is as follows:

ζ_{nm} = ω_{1} \cdot ζ_{nm}^{c 1} + ω_{2} \cdot ζ_{nm}^{a} + ω_{3} \cdot ζ_{nm}^{β} + ω_{4} \cdot ζ_{nm}^{c 2} + ω_{5} \cdot ζ_{nm}^{v} + ω_{6} \cdot ζ_{nm}^{d} + ω_{7} \cdot ζ_{nm}^{h}

(20)

where ω_{j} (j = 1, 2, \dots, 7) represents the weight coefficient of each indicator, with \sum_{j = 1}^{7} ω_{j} = 1 and ω_{j} > 0.

To ensure reliable and robust situation assessment, a group-level information fusion strategy is adopted, where UAV posture reports (e.g., position, velocity, and orientation) are aggregated using weights derived from entropy-based credibility scores. These scores quantify the consistency and discriminability of each UAV’s observation within its group, thereby reducing the influence of contradictory or unreliable data caused by sensor noise or communication delays. At the same time, information entropy theory is introduced to objectively determine the weights of each assessment indicator, effectively avoiding bias from subjective preferences [28]. The detailed calculation procedure for entropy-based weight assignment is described as follows.

Step 1: For one side, each UAV group has {Num}_{group} + 1 targets to choose from. That is, each UAV group has {Num}_{group} + 1 operational options to be evaluated, each of which is described by the seven indicators mentioned above. For UAV group n, build the evaluation matrix

X^{n} = {(x_{ij}^{n})}_{({Num}_{group} + 1) \times 7}, n = 1, 2, \dots, {Num}_{group}; i = 1, 2, \dots, {Num}_{group} + 1; j = 1, 2, \dots, 7

(21)

Step 2: Normalize the indicator data according to (22) to eliminate scale differences between indicators.

p_{ij}^{n} = \frac{x_{ij}^{n}}{\sum_{i = 1}^{{Num}_{group} + 1} x_{ij}^{n}}

(22)

Step 3: Calculate the entropy E_{j}^{n} of each assessment indicator according to (23). When p_{ij}^{n} = 0, the term p_{ij}^{n} \ln p_{ij}^{n} is taken as 0.

E_{j}^{n} = - \frac{\sum_{i = 1}^{{Num}_{group} + 1} p_{ij}^{n} \ln p_{ij}^{n}}{\ln ({Num}_{group} + 1)}

(23)

Step 4: Calculate the deviation degree d_{j}^{n} of indicator j from the entropy E_{j}^{n} according to (24).

d_{j}^{n} = 1 - E_{j}^{n}

(24)

Step 5: Determine the weight coefficient ω_{j}^{n} from the deviation degrees d_{j}^{n} of the assessment indicators according to (25).

ω_{j}^{n} = d_{j}^{n} / \sum_{j = 1}^{7} d_{j}^{n}

(25)

Step 6: Each UAV group has its own evaluation index weight matrix W^{n} = {[ω_{1}^{n}, ω_{2}^{n}, \dots, ω_{7}^{n}]}^{T}, n = 1, 2, \dots, {Num}_{group}. Calculate the average weight matrix for the entire UAV swarm, ϑ = {[{\bar{ϑ}}_{j}]}_{1 \times 7}. The average weight of each evaluation indicator can be calculated by

{\bar{ϑ}}_{j} = \frac{\sum_{n = 1}^{{Num}_{group}} ω_{j}^{n}}{{Num}_{group}}

(26)

Step 7: Calculate the positive distance {PD}_{j}^{n} and the negative distance {ND}_{j}^{n} between the evaluation index weight ω_{j}^{n} of UAV group n and the average weight {\bar{ϑ}}_{j}.

When the evaluation indicator j is benefit-based,

\begin{cases} {PD}_{j}^{n} = \frac{\max (0, (ω_{j}^{n} - {\bar{ϑ}}_{j}))}{{\bar{ϑ}}_{j}} \\ {ND}_{j}^{n} = \frac{\max (0, ({\bar{ϑ}}_{j} - ω_{j}^{n}))}{{\bar{ϑ}}_{j}} \end{cases}, n = 1, 2, \dots, {Num}_{group}, j = 1, 2, \dots, 7

(27)

When the evaluation indicator j is cost-based,

\begin{cases} {PD}_{j}^{n} = \frac{\max (0, ({\bar{ϑ}}_{j} - ω_{j}^{n}))}{{\bar{ϑ}}_{j}} \\ {ND}_{j}^{n} = \frac{\max (0, (ω_{j}^{n} - {\bar{ϑ}}_{j}))}{{\bar{ϑ}}_{j}} \end{cases}, n = 1, 2, \dots, {Num}_{group}, j = 1, 2, \dots, 7

(28)

Step 8: Based on the evaluation index weight matrix W^{n} = {[ω_{1}^{n}, ω_{2}^{n}, \dots, ω_{7}^{n}]}^{T}, calculate the weighted positive distance {SP}^{n} and the weighted negative distance {SN}^{n} for each UAV group.

\begin{cases} {SP}^{n} = \sum_{j = 1}^{7} ω_{j}^{n} \cdot {PD}_{j}^{n} \\ {SN}^{n} = \sum_{j = 1}^{7} ω_{j}^{n} \cdot {ND}_{j}^{n} \end{cases}, n = 1, 2, \dots, {Num}_{group}

(29)

Step 9: By normalizing the {SP}^{n} and {SN}^{n} obtained in Step 8, the normalized weighted positive distance {NSP}^{n} and the normalized weighted negative distance {NSN}^{n} are obtained.

\begin{cases} {NSP}^{n} = \frac{{SP}^{n}}{\max_{n} ({SP}^{n})} \\ {NSN}^{n} = \frac{{SN}^{n}}{\max_{n} ({SN}^{n})} \end{cases}, n = 1, 2, \dots, {Num}_{group}

(30)

Step 10: The overall evaluation score S^{n} of each UAV group can be obtained by (31).

S^{n} = \frac{1}{2} ({NSP}^{n} + {NSN}^{n}), n = 1, 2, \dots, {Num}_{group}

(31)

Step 11: Sort S^{n} in descending order and select the weight vector W = {[ω_{1}, ω_{2}, \dots, ω_{7}]}^{T} corresponding to the maximum S^{n}. Thus, the optimal weight of the situation assessment for the UAV swarm is obtained.

The weights ω_{j} (j = 1, \dots, 7) are not fixed constants but are dynamically computed via the entropy-based multi-step procedure detailed in (21) to (31). This adaptive mechanism ensures context-sensitive assessment while avoiding subjective bias.
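Steps 2 to 5 of the procedure, corresponding to (22) through (25), can be sketched for a single UAV group as follows. The sketch uses the conventional entropy definition with a leading minus sign, so that the entropy lies in [0, 1]; this sign convention, the function name `entropy_weights`, and the requirement that at least one indicator column be non-uniform are assumptions of the example.

```python
import numpy as np

def entropy_weights(X):
    """Entropy-based indicator weights for one group.

    X: non-negative evaluation matrix of shape (options, indicators)
       with strictly positive column sums.
    Assumption: E_j = -sum(p ln p) / ln(options), the usual entropy form.
    """
    P = X / X.sum(axis=0, keepdims=True)                 # Eq. (22)
    m = X.shape[0]
    # p * ln p is taken as 0 wherever p == 0.
    safe_P = np.where(P > 0, P, 1.0)
    plogp = np.where(P > 0, P * np.log(safe_P), 0.0)
    E = -plogp.sum(axis=0) / np.log(m)                   # Eq. (23)
    d = 1.0 - E                                          # Eq. (24)
    return d / d.sum()                                   # Eq. (25)
```

An indicator whose values are identical across all options has entropy 1 and therefore receives zero weight, which reflects the idea that it does not help discriminate between options.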

Based on this situation evaluation function, the cost function defined by the UAV swarm mission benefits is designed.

The loss in the UAV swarm is caused by the target swarm, so the loss value is determined by the target swarm’s strategy, which is not related to the UAV swarm’s task allocation strategy. Under the targets’ policy, the valuation of the losses of the UAV swarm can be calculated as

L_{n}^{c} = A_{nm}^{{cc}^{'}} \cdot B_{nm}^{{cc}^{'}} \cdot {value}_{n}^{c}

(32)

where L_{n}^{c} expresses the valuation of the losses of the n group of type c, n = 1, 2, \dots, {Num}_{group}, m = 1, 2, \dots, {Num}_{group}. A_{nm}^{{cc}^{'}} stands for the probability that the m group operates against the n group, B_{nm}^{{cc}^{'}} stands for the probability that the n group suffers a loss under the strategy of the m group, and {value}_{n}^{c} stands for the value of the n group.

A_{nm}^{{cc}^{'}} = μ_{nm}^{{cc}^{'}} \cdot (1 - \exp (- \frac{{value_res}_{m}^{c^{'}}}{{value_res}_{n}^{c}}))

(33)

μ_{nm}^{{cc}^{'}} = \begin{cases} 1, & ζ_{mn} > ζ^{*} \\ \exp (ζ_{mn} - ζ^{*}), & ζ_{mn} \leq ζ^{*} \end{cases}

(34)

where μ_{nm}^{{cc}^{'}} indicates the effect of the groups’ posture on the probability of operating, ζ_{mn} denotes the posture of the m group relative to the n group, and ζ^{*} indicates the posture threshold for operating. {value_res}_{n}^{c} and {value_res}_{m}^{c^{'}} indicate the value of resources in the n group and the m group, respectively. The factor (1 - \exp (- \frac{{value_res}_{m}^{c^{'}}}{{value_res}_{n}^{c}})) indicates the effect of the gap in the value of resources on the probability of operating.

B_{nm}^{{cc}^{'}} = 1 - {(1 - τ \cdot p_{res})}^{{value_res}_{nm}^{c c^{'}}}

(35)

{value_res}_{nm}^{c c^{'}} = {value_res}_{m}^{c^{'}} \cdot \exp (- ζ_{mn})

(36)

where τ expresses the influence of environmental factors such as geography, weather, or electromagnetic interference, p_{res} expresses the capability of the resource, and {value_res}_{nm}^{c c^{'}} expresses the resource value that the m group commits against the n group.
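The chain of formulas (32) through (36) can be traced with a short numerical sketch. All function and argument names below are hypothetical, and scalar inputs are assumed for a single pair of groups n and m.

```python
import math

def posture_factor(zeta_mn, zeta_star):
    """Eq. (34): effect of posture on the probability of operating."""
    return 1.0 if zeta_mn > zeta_star else math.exp(zeta_mn - zeta_star)

def operating_probability(zeta_mn, zeta_star, res_m, res_n):
    """Eq. (33): probability that group m operates against group n."""
    return posture_factor(zeta_mn, zeta_star) * (1.0 - math.exp(-res_m / res_n))

def loss_probability(zeta_mn, res_m, tau, p_res):
    """Eqs. (35)-(36): probability that group n suffers a loss."""
    committed = res_m * math.exp(-zeta_mn)          # Eq. (36)
    return 1.0 - (1.0 - tau * p_res) ** committed   # Eq. (35)

def loss_valuation(zeta_mn, zeta_star, res_m, res_n, tau, p_res, value_n):
    """Eq. (32): valuation of the losses of group n."""
    A = operating_probability(zeta_mn, zeta_star, res_m, res_n)
    B = loss_probability(zeta_mn, res_m, tau, p_res)
    return A * B * value_n
```

Since A and B are both probabilities, the valuation is bounded by the value of the n group whenever τ · p_{res} lies in [0, 1].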

X = \{x_{1}, x_{2}, \dots, x_{{Num}_{group} + 1}\} is a vector of {Num}_{group} + 1 decision variables, which denotes the feasible task strategies. F (X) = \{f_{1} (X), f_{2} (X), \dots, f_{M} (X)\} is the vector of objective functions, and f_{n} (X) is the value of the n-th objective. The high-dimensional multi-objective model can be expressed as follows:

F (X) = [f_{1} (X), f_{2} (X), f_{3} (X)]

(37)

f_{1} (X) = \min (\sum_{i = 1}^{{Num}_{group} + 1} L_{i}^{c} + L_{bass}^{c})

(38)

f_{2} (X) = \max (\sum_{i = 1}^{{Num}_{group} + 1} L_{i}^{c^{'}} + L_{bass}^{c^{'}})

(39)

f_{3} (X) = \min (\sum_{i = 1}^{{Num}_{group} + 1} t_{i})

(40)

where f_{1} (X) and f_{2} (X) denote the task loss values of the UAV swarm and the target swarm under strategy X, respectively, f_{3} (X) denotes the mission completion time of the UAV swarm under strategy X, and t_{i} is the mission completion time of the i group.

5. Pigeon-Inspired Optimization Based on Cumulative Prospect Theory

Pigeon-inspired optimization based on cumulative prospect theory integrates cumulative prospect theory with the grey correlation analysis method to establish a comprehensive prospect value model of Pareto solutions. The prospect value is adopted as the fitness of the pigeon-inspired optimization algorithm to guide the evolution, and its magnitude is used to judge the quality of each Pareto solution.

5.1. Improved Pigeon-Inspired Optimization

The pigeon-inspired optimization algorithm emulates the unique navigation behavior exhibited by pigeon swarms during homing. This process involves two key operators, map–compass operators and landmark operators, which guide pigeons at different stages of navigation. The iterative update of individual positions within the population facilitates the exploration of the target space [29,30]. In the initial navigation phase, the traditional PIO algorithm updates solely based on the global optimum position, leading to rapid convergence and a tendency to become trapped in local optima, resulting in poor population diversity. To address these limitations, a reverse search mechanism has been introduced to enhance the traditional PIO algorithm, thereby improving search accuracy and preventing premature convergence.

The basic idea is to take the population P after each iteration, generate its reverse population OP, and, ranking the individuals of the two populations

{P, OP}

by fitness value, select the N best individuals to form the new population NP, which is used for the next iterative search. The mathematical expression of the reverse search mechanism is as follows:

X_{op} = L_{a} + U_{a} - X

(41)

where

L_{a}

and

U_{a}

are the lower and upper boundaries of X, respectively, and

X_{op}

is the reverse individual of X.
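The reverse search mechanism of Eq. (41) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: positions are lists of floats, the bounds are given per dimension, larger fitness is better, and the names reverse_individual and reverse_search_select are hypothetical.

```python
def reverse_individual(x, lower, upper):
    """Eq. (41): X_op = L_a + U_a - X, applied in every dimension."""
    return [lo + hi - xi for xi, lo, hi in zip(x, lower, upper)]


def reverse_search_select(population, lower, upper, fitness, n_keep):
    """Merge P with its reverse population OP and keep the n_keep fittest."""
    reverse_pop = [reverse_individual(x, lower, upper) for x in population]
    merged = population + reverse_pop
    # Assumption: a larger fitness value is better.
    merged.sort(key=fitness, reverse=True)
    return merged[:n_keep]


# Usage with a hypothetical fitness that peaks at (8, 8).
lower, upper = [0.0, 0.0], [10.0, 10.0]
population = [[1.0, 2.0], [9.0, 9.0]]
fitness = lambda x: -sum((xi - 8.0) ** 2 for xi in x)
new_population = reverse_search_select(population, lower, upper, fitness, n_keep=2)
```

Here the reverse of (1, 2) is (9, 8), which lies closer to the peak than its original and therefore replaces it in the new population.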

The pigeon-inspired optimization algorithm based on the reverse search mechanism is as follows.

Step 1: Initialize the algorithm parameters.

Place

N_{p}

pigeons in a

Dim

-dimensional search space. The position of the ith pigeon is denoted as

{loc}_{i} = [{loc}_{i 1}, {loc}_{i 2}, \dots, {loc}_{iDim}]

and its velocity as

{vel}_{i} = [{vel}_{i 1}, {vel}_{i 2}, \dots, {vel}_{iDim}], i = 1, 2, \dots, N_{p}

. The improved pigeon-inspired optimization algorithm has two independent operators: the maximum number of iterations is ncmax₁ for the first and ncmax₂ for the second.

Step 2: Map–compass operators.

At this stage, pigeons rely primarily on the sun and Earth’s magnetic field for navigation. Their position and speed are updated in each iteration as follows:

\{\begin{matrix} {vel}_{i}^{t} = {vel}_{i}^{t - 1} \cdot e^{- R_{1} t} + rand \cdot ({loc}_{g} - {loc}_{i}^{t - 1}) \\ {loc}_{i}^{t} = {loc}_{i}^{t - 1} + {vel}_{i}^{t} \end{matrix}

(42)

{loc}_{oi}^{t} = L_{a} + U_{a} - {loc}_{i}^{t}

(43)

where R₁ is the map and compass factor, rand is a random number, and loc_g is the global best position. All of the

{loc}_{i}^{t}

and

{loc}_{oi}^{t}

are sorted by their fitness values, and the

N_{p}

pigeons with the highest fitness values are selected as the new population. The iteration continues until the number of iterations reaches ncmax₁.
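The velocity and position update of Eq. (42) can be sketched as follows. The sketch assumes that one uniform random number in [0, 1) is drawn per pigeon and that positions and velocities are lists of floats; the function name map_compass_step and the rng argument are hypothetical. The reverse positions of Eq. (43) and the fitness-based selection would then be applied to the returned positions.

```python
import math
import random


def map_compass_step(locs, vels, loc_g, t, r1, rng):
    """Apply Eq. (42) once to every pigeon and return the new (locs, vels)."""
    decay = math.exp(-r1 * t)  # the velocity memory fades as t grows
    new_locs, new_vels = [], []
    for loc, vel in zip(locs, vels):
        rand = rng.random()  # assumption: one uniform draw per pigeon
        v = [vd * decay + rand * (g - x) for vd, x, g in zip(vel, loc, loc_g)]
        new_vels.append(v)
        new_locs.append([x + vi for x, vi in zip(loc, v)])
    return new_locs, new_vels


# Usage: a single pigeon at rest is pulled toward the global best position.
rng = random.Random(0)
locs, vels = map_compass_step([[0.0, 0.0]], [[0.0, 0.0]], [10.0, 10.0], t=1, r1=0.3, rng=rng)
```

Because the decay factor shrinks with t, later iterations are dominated by the attraction toward the global best position, which is the source of the fast convergence noted above.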

Step 3: Landmark operators.

At this stage, pigeons rely on landmarks near the nest to navigate, and pigeons in the group that are far from their destination are abandoned in turn.

{loc}_{center}^{t - 1} = \frac{\sum_{i = 1}^{N_{p}^{t - 1}} {loc}_{i}^{t - 1} FIT ({loc}_{i}^{t - 1})}{N_{p}^{t - 1} \cdot \sum_{i = 1}^{N_{p}^{t - 1}} FIT ({loc}_{i}^{t - 1})}

(44)

where

{loc}_{center}^{t - 1}

indicates the central position of the pigeon swarm at time

t - 1

, and the fitness is defined as

FIT ({loc}_{i}^{t - 1}) = \frac{1}{f ({loc}_{i}^{t - 1}) + ε}

(45)

where

ε

represents a small number to ensure that the denominator is not zero.

N_{p}^{t - 1}

indicates the number of pigeons at time t − 1.

{\bar{F}}_{global}^{t - 1} = \frac{\sum_{i = 1}^{N_{p}^{t - 1}} FIT ({loc}_{i}^{t - 1})}{N_{p}^{t - 1}}

(46)

where

{\bar{F}}_{global}^{t - 1}

indicates the average fitness of the entire pigeon swarm at time t − 1. The pigeons in the swarm are randomly arranged into a ring-shaped topology, with each pigeon and the pigeons on its left and right sides forming a small group.

{\bar{F}}_{local}^{t - 1} = \frac{FIT ({loc}_{i}^{t - 1}) + FIT ({loc}_{left}^{t - 1}) + FIT ({loc}_{right}^{t - 1})}{3}

(47)

where

{\bar{F}}_{local}^{t - 1}

indicates the average fitness of the small group at time t − 1.

FIT ({loc}_{left}^{t - 1})

and

FIT ({loc}_{right}^{t - 1})

are the fitness values of the left and right neighbors of the ith pigeon in the topology, respectively.

{loc}_{local_center}^{t - 1} = \frac{{loc}_{i}^{t - 1} + {loc}_{left}^{t - 1} + {loc}_{right}^{t - 1}}{3}

(48)

where

{loc}_{local_center}^{t - 1}

indicates the central position of the small group at time t − 1.

{loc}_{left}^{t - 1}

and

{loc}_{right}^{t - 1}

are the positions of the left and right neighbors of the ith pigeon in the topology, respectively.

\{\begin{matrix} {loc}_{i}^{t} = {loc}_{i}^{t - 1} + cauchy \cdot ({loc}_{center}^{t - 1} - {loc}_{i}^{t - 1}), {\bar{F}}_{local}^{t - 1} < {\bar{F}}_{global}^{t - 1} \\ {loc}_{i}^{t} = {loc}_{i}^{t - 1} + gaussian \cdot ({loc}_{local_center}^{t - 1} - {loc}_{i}^{t - 1}), {\bar{F}}_{local}^{t - 1} \geq {\bar{F}}_{global}^{t - 1} \end{matrix}

(49)

where

cauchy

represents a random number drawn from the Cauchy distribution and

gaussian

represents a random number drawn from the Gaussian distribution.

N_{p}^{t} = \frac{N_{p}^{t - 1}}{2}

(50)

The pigeons are sorted by their cost function values, and the half of the pigeons farthest from the nest is discarded.
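One landmark iteration, Eqs. (44)–(50), can be sketched as below. Several choices here are assumptions rather than details confirmed by the paper: the cost function is non-negative and is minimized, the ring topology is a random permutation, standard Cauchy and standard Gaussian draws are used, and the positions are updated before the population is halved. The name landmark_step is hypothetical.

```python
import math
import random


def landmark_step(locs, cost, rng, eps=1e-12):
    """One landmark iteration; returns the better half of the updated swarm."""
    n, dim = len(locs), len(locs[0])
    fit = [1.0 / (cost(x) + eps) for x in locs]  # Eq. (45)
    fit_sum = sum(fit)
    # Eq. (44) as printed, including the N_p factor in the denominator.
    center = [sum(x[d] * f for x, f in zip(locs, fit)) / (n * fit_sum)
              for d in range(dim)]
    f_global = fit_sum / n  # Eq. (46)
    order = list(range(n))
    rng.shuffle(order)  # assumption: random ring topology
    updated = [None] * n
    for k, i in enumerate(order):
        left, right = order[k - 1], order[(k + 1) % n]
        f_local = (fit[i] + fit[left] + fit[right]) / 3.0  # Eq. (47)
        local_center = [(locs[i][d] + locs[left][d] + locs[right][d]) / 3.0
                        for d in range(dim)]  # Eq. (48)
        if f_local < f_global:
            # Weak neighborhood: heavy-tailed Cauchy step toward the swarm center.
            step = math.tan(math.pi * (rng.random() - 0.5))
            target = center
        else:
            # Strong neighborhood: Gaussian step toward the local center.
            step = rng.gauss(0.0, 1.0)
            target = local_center
        updated[i] = [x + step * (c - x) for x, c in zip(locs[i], target)]  # Eq. (49)
    updated.sort(key=cost)
    return updated[: max(1, n // 2)]  # Eq. (50)


# Usage on a hypothetical sphere cost with eight pigeons in two dimensions.
rng = random.Random(1)
swarm = [[rng.uniform(-5, 5), rng.uniform(-5, 5)] for _ in range(8)]
sphere = lambda x: sum(v * v for v in x)
survivors = landmark_step(swarm, sphere, rng)
```

The heavy tails of the Cauchy draw give weak neighborhoods a chance of long jumps, whereas the Gaussian draw keeps strong neighborhoods searching locally.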

Step 4: When

t > {ncmax}_{1} + {ncmax}_{2}

, terminate the iteration and output the optimal solution

{loc}_{gbest}

.

In summary, the improved pigeon-inspired optimization algorithm incorporates two key improvements into the standard framework. First, a reverse search mechanism is introduced to generate reverse solutions and select optimal candidates, thereby enhancing global exploration capability and solution diversity. Second, a competitive learning strategy based on ring-topology neighborhoods and combined Cauchy–Gaussian perturbations is employed to improve adaptability and help the algorithm escape local optima. These modifications collectively improve the robustness and accuracy of the algorithm in dynamic UAV swarm decision-making scenarios.

5.2. Cumulative Prospect Theory

Cumulative prospect theory fully accounts for the risk attitude of decision-makers toward gains and losses. A comprehensive prospect value model is established by setting reference points and determining value functions and attribute weights in order to evaluate decision problems under risk.

An uncertain prospect is defined as a mapping from the event set Q to the result set R,

f : Q \to R

is established. The occurrence of any event

q_{m} \in Q

results in a result

r_{m} \in R

. The results of all events are sorted in ascending order,

r_{1} \leq r_{2} \leq \dots \leq r_{c} \leq \dots \leq r_{n - 1} \leq r_{n}

. The midpoint

r_{c}

of the sequence is set as a reference point. If

r_{m} > r_{c}

, the result

r_{m}

is a gain, and the decision-maker obtains a positive prospect value

V_{a} (f^{+})

. If

r_{m} < r_{c}

, the result

r_{m}

is a loss, and the decision-maker obtains a negative prospect value

V_{a} (f^{-})

. The comprehensive prospect value of f is

V_{a} (f) = V_{a} (f^{+}) + V_{a} (f^{-})

[31].

\{\begin{matrix} V_{a} (f^{+}) = \sum_{m = c}^{n} π_{m}^{+} v_{a} (∆ r_{m}) \\ V_{a} (f^{-}) = \sum_{m = 1}^{c} π_{m}^{-} v_{a} (∆ r_{m}) \end{matrix}

(51)

where

π_{m}^{+}

and

π_{m}^{-}

represent the decision weight functions for gains and losses, respectively.

∆ r_{m} = r_{m} - r_{c}

represents the gain or loss of the decision-maker relative to the reference result, and v_{a} is the corresponding value function.

The value function can be described by the following piecewise function:

v_{a} (∆ r_{m}) = \{\begin{matrix} {(r_{m} - r_{c})}^{α}, r_{m} \geq r_{c} \\ - ρ {(r_{c} - r_{m})}^{β}, r_{m} < r_{c} \end{matrix}

(52)

where

0 < α, β < 1

are the risk-aversion coefficient and the risk-preference coefficient of the decision-maker when faced with risk, respectively. A commonly used value is

α = β = 0.85

[32].

ρ

denotes the loss aversion coefficient. In general, decision-makers are more sensitive to losses, but in real life there are also scenarios in which decision-makers are more sensitive to gains. Therefore, a gain sensitivity coefficient γ is introduced to express the decision-maker's sensitivity to gains. The improved value function is

v_{a} (∆ r_{m}) = \{\begin{matrix} γ {(r_{m} - r_{c})}^{α}, r_{m} \geq r_{c} \\ - ρ {(r_{c} - r_{m})}^{β}, r_{m} < r_{c} \end{matrix}

(53)

where, if

ρ = 1, γ > 1

, decision-makers are more sensitive to gains than losses. Conversely, if

ρ > 1, γ = 1

, decision-makers are more sensitive to losses than gains.
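The improved value function of Eq. (53) can be written as a short Python function. This is a minimal sketch: the name value_function is hypothetical, the defaults for α and β follow the value 0.85 quoted above, and ρ and γ are left as required arguments because their settings depend on the scenario.

```python
def value_function(delta_r, rho, gamma, alpha=0.85, beta=0.85):
    """Eq. (53): concave for gains, convex and scaled by rho for losses."""
    if delta_r >= 0:
        return gamma * delta_r ** alpha
    return -rho * (-delta_r) ** beta


# Usage: with rho > 1 and gamma = 1, a loss outweighs an equal-sized gain.
gain = value_function(0.5, rho=2.0, gamma=1.0)
loss = value_function(-0.5, rho=2.0, gamma=1.0)
```

With ρ = 2 and γ = 1, the magnitude of the loss value is exactly twice the gain value for the same deviation, since α = β in this setting.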

To compute the cumulative decision weights, the probability weighting function of a single event must be computed first. Since people tend to overweight small-probability events, the probability weighting function is

\{\begin{matrix} w^{+} (p_{h}) = \frac{p_{h}^{φ}}{{[p_{h}^{φ} + {(1 - p_{h})}^{φ}]}^{1 / φ}} \\ w^{-} (p_{h}) = \frac{p_{h}^{δ}}{{[p_{h}^{δ} + {(1 - p_{h})}^{δ}]}^{1 / δ}} \end{matrix}

(54)

where

w^{+} (p_{h})

and

w^{-} (p_{h})

are the probability weight function for decision-makers facing gains and losses, respectively.

p_{h}

is the probability of

q_{h}

.

φ

and

δ

are the risk attitude coefficients for gains and losses, respectively, with classic values of

φ = 0.61, δ = 0.69

[33]. For each parameter, a range of candidate values is tested, and the combination yielding the most stable performance across scenarios is adopted.
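The probability weighting functions of Eq. (54) share one functional form and differ only in the exponent, which the following Python sketch makes explicit. The function names are hypothetical, and the default exponents are the classic values 0.61 and 0.69 quoted above.

```python
def probability_weight(p, exponent):
    """Common form of Eq. (54) for a probability p in [0, 1]."""
    numerator = p ** exponent
    return numerator / (numerator + (1.0 - p) ** exponent) ** (1.0 / exponent)


def w_plus(p, phi=0.61):
    """Weighting function for gains."""
    return probability_weight(p, phi)


def w_minus(p, delta=0.69):
    """Weighting function for losses."""
    return probability_weight(p, delta)
```

With these exponents the function has an inverse-S shape: a small probability such as 0.01 receives a weight above 0.01, whereas a large probability such as 0.9 receives a weight below 0.9.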

Following the formulas proposed by Tversky and Kahneman [32], the decision weight functions

π_{m}^{+}

and

π_{m}^{-}

under different results can be obtained by

\{\begin{matrix} π_{i}^{+} = w^{+} (p_{i} + \dots + p_{n}) - w^{+} (p_{i + 1} + \dots + p_{n}), 0 \leq i \leq n - 1 \\ π_{i}^{-} = w^{-} (p_{- m} + \dots + p_{i}) - w^{-} (p_{- m} + \dots + p_{i - 1}), 1 - m \leq i \leq 0 \end{matrix}

(55)

where

p_{i}

expresses the ideal probability for arbitrary events

q_{i}

to occur. In the scenarios set in this article,

q_{i}

can specifically indicate a target selected by one of the UAVs and

p_{i}

is the ideal probability value for that event to occur. The general value range is [0,1].
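The cumulative decision weights of Eq. (55) are differences of the weighting function evaluated on cumulative probabilities, as sketched below in Python. The sketch assumes that the weighting function w is passed in as a callable with w(0) = 0, that gains are listed from the smallest to the largest gain, and that losses are listed from the largest to the smallest loss; the function names are hypothetical.

```python
def gain_weights(probs, w):
    """Decision weights for gains; probs ordered from smallest to largest gain."""
    weights, tail, previous = [], 0.0, 0.0
    for p in reversed(probs):  # accumulate from the best outcome downward
        tail += p
        current = w(tail)
        weights.append(current - previous)
        previous = current
    return weights[::-1]


def loss_weights(probs, w):
    """Decision weights for losses; probs ordered from largest to smallest loss."""
    weights, head, previous = [], 0.0, 0.0
    for p in probs:  # accumulate from the worst outcome upward
        head += p
        current = w(head)
        weights.append(current - previous)
        previous = current
    return weights


# Sanity check: with the identity weighting, the weights equal the probabilities.
identity = lambda p: p
gains = gain_weights([0.2, 0.3, 0.5], identity)
losses = loss_weights([0.1, 0.4], identity)
```

Because the sums telescope, the weights on each side always add up to w applied to the total probability of that side, whatever weighting function is supplied, for example the functions of Eq. (54).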

5.3. Multi-Objective Grey Relation Evaluation Strategy Based on Cumulative Prospect Theory

When applying cumulative prospect theory to the multi-objective optimization problem, grey correlation analysis and information entropy theory are introduced to build the grey correlation evaluation strategy. The resulting model establishes the connection between the Pareto solution and the prospect value and extracts the effective information between the Pareto solution and each objective.

F_{i} = \{f_{i 1}, f_{i 2}, \dots, f_{ij}, \dots, f_{iM}\}

is the objective vector of

X_{i}

, and

f_{iM}

denotes the Mth objective function value of

X_{i}

. The comprehensive prospect value model based on the grey correlation evaluation strategy is constructed as follows.

Step 1: Standardized processing.

In order to effectively eliminate the effects of different target scales and orders of magnitude, each target value is normalized. The processing method is as follows:

u_{ij} = \frac{f_{\max (j)} - f_{ij}}{f_{\max (j)} - f_{\min (j)}}

(56)

where

u_{ij}

is the normalized value of the objective value

f_{ij}

, and

f_{\max (j)}

and

f_{\min (j)}

are the maximum and minimum values of the jth objective, respectively.

Step 2: Determination of reference points.

When making decisions, people usually measure the degree of gain or loss of a decision against certain reference points. Borrowing the idea of TOPSIS (technique for order preference by similarity to an ideal solution), the positive and negative ideal solutions are selected as the reference points of the evaluation indexes and are used to measure the quality of each Pareto solution, namely

u_{j}^{+} = \max \{u_{ij} | i = 1, \dots, N\}

(57)

u_{j}^{-} = \min \{u_{ij} | i = 1, \dots, N\}

(58)

where

j = 1, 2, \dots, M

, and

U^{+} = \{u_{1}^{+}, u_{2}^{+}, \dots, u_{M}^{+}\}

and

U^{-} = \{u_{1}^{-}, u_{2}^{-}, \dots, u_{M}^{-}\}

are the positive and negative ideal schemes, respectively.
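Steps 1 and 2 can be illustrated with the following Python sketch. It assumes that the objective values are stored as a list of rows (one row per Pareto solution), that the negative ideal point is the column-wise minimum of the normalized values, and that a constant objective column is mapped to 0.0 to avoid a zero denominator; the function names are hypothetical.

```python
def normalize(objectives):
    """Eq. (56): u_ij = (f_max(j) - f_ij) / (f_max(j) - f_min(j))."""
    n, m = len(objectives), len(objectives[0])
    normalized = [[0.0] * m for _ in range(n)]
    for j in range(m):
        column = [row[j] for row in objectives]
        f_max, f_min = max(column), min(column)
        span = f_max - f_min
        for i in range(n):
            # Assumption: a constant column carries no information and maps to 0.0.
            normalized[i][j] = (f_max - objectives[i][j]) / span if span > 0 else 0.0
    return normalized


def ideal_points(normalized):
    """Column-wise maxima (positive ideal) and minima (negative ideal)."""
    columns = list(zip(*normalized))
    return [max(c) for c in columns], [min(c) for c in columns]


# Usage with three hypothetical solutions and two objectives.
U = normalize([[1, 10], [3, 30], [2, 20]])
u_plus, u_minus = ideal_points(U)
```

Under Eq. (56) the smallest objective value is mapped to 1 and the largest to 0, so for any non-constant objective the positive and negative ideal points are 1 and 0, respectively.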

Step 3: Determination of positive and negative correlation coefficients.

Using grey correlation analysis with the positive and negative ideal schemes as the reference series, the positive and negative correlation coefficients between

X_{i}

and the positive and negative ideal schemes

U^{+}

and

U^{-}

with respect to the

j

goal can be determined as

r_{ij}^{+} = \frac{\min_{1 \leq i \leq N} \min_{1 \leq j \leq M} |u_{ij} - u_{j}^{+}| + θ \max_{1 \leq i \leq N} \max_{1 \leq j \leq M} |u_{ij} - u_{j}^{+}|}{|u_{ij} - u_{j}^{+}| + θ \max_{1 \leq i \leq N} \max_{1 \leq j \leq M} |u_{ij} - u_{j}^{+}|}

(59)

r_{ij}^{-} = \frac{\min_{1 \leq i \leq N} \min_{1 \leq j \leq M} |u_{ij} - u_{j}^{-}| + θ \max_{1 \leq i \leq N} \max_{1 \leq j \leq M} |u_{ij} - u_{j}^{-}|}{|u_{ij} - u_{j}^{-}| + θ \max_{1 \leq i \leq N} \max_{1 \leq j \leq M} |u_{ij} - u_{j}^{-}|}

(60)

where

θ

is the resolution factor, which generally takes the value of 0.5.

r_{ij}^{+}

and

r_{ij}^{-}

are the positive and negative correlation coefficients, respectively.
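Equations (59) and (60) have the same form and differ only in the reference series, so a single Python function suffices for both. In this sketch the normalized values are a list of rows, the reference is the positive or the negative ideal scheme, and a guard returns 1.0 everywhere when all deviations are zero; the guard and the name grey_coefficients are assumptions.

```python
def grey_coefficients(normalized, reference, theta=0.5):
    """Eqs. (59)-(60): grey correlation coefficients against a reference series."""
    deviations = [[abs(u - r) for u, r in zip(row, reference)] for row in normalized]
    d_min = min(min(row) for row in deviations)
    d_max = max(max(row) for row in deviations)
    if d_max == 0.0:
        # Assumption: every solution coincides with the reference.
        return [[1.0 for _ in row] for row in deviations]
    return [[(d_min + theta * d_max) / (d + theta * d_max) for d in row]
            for row in deviations]


# Usage: coefficients against the positive ideal scheme (1, 1).
r_plus = grey_coefficients([[1.0, 1.0], [0.0, 0.0], [0.5, 0.5]], [1.0, 1.0])
```

A coefficient of 1 means that the solution coincides with the reference in that objective, and the coefficient decreases as the deviation grows.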

Step 4: Construction of positive and negative prospect value functions.

On the basis of the value function of the cumulative prospect theory, the prospect value function of the Pareto solution can be constructed as follows:

\{\begin{matrix} v_{ij}^{+} = γ {(1 - r_{ij}^{-})}^{α}, u_{ij} > u_{j}^{-} \\ v_{ij}^{-} = - ρ {(1 - r_{ij}^{+})}^{β}, u_{ij} < u_{j}^{+} \end{matrix}

(61)

where

v_{ij}^{+}

and

v_{ij}^{-}

are the positive and negative prospect value, respectively. If the negative ideal solution is used as the reference point, the Pareto solution

F_{i}

is superior to the negative ideal solution and has positive prospect utility value; if the positive ideal solution is used as the reference point, the Pareto solution

F_{i}

is inferior to the positive ideal solution and has negative prospect utility value.
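A possible implementation of the prospect value functions of Eq. (61) is sketched below in Python. Two assumptions are made: the loss branch is written with the non-negative base 1 − r⁺ so that the fractional power is well defined for coefficients in (0, 1], and both values are computed for every solution and objective. The name prospect_values is hypothetical.

```python
def prospect_values(r_plus, r_minus, rho, gamma, alpha=0.85, beta=0.85):
    """Positive and negative prospect values from grey correlation coefficients."""
    # Gains are measured against the negative ideal scheme (coefficients r_minus).
    v_plus = [[gamma * (1.0 - r) ** alpha for r in row] for row in r_minus]
    # Losses are measured against the positive ideal scheme (coefficients r_plus).
    v_minus = [[-rho * (1.0 - r) ** beta for r in row] for row in r_plus]
    return v_plus, v_minus


# Usage with one solution and two objectives (hypothetical coefficients).
v_plus, v_minus = prospect_values([[1.0, 0.5]], [[0.5, 1.0]], rho=2.0, gamma=1.0)
```

A solution that coincides with the positive ideal scheme in an objective has a zero negative prospect value there, and one that coincides with the negative ideal scheme has a zero positive prospect value.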

Step 5: Modeling the value of integrated prospects.

Let the attribute weight functions of the Pareto solution for positive and negative prospects be

π^{+} (ω_{j})

and

π^{-} (ω_{j})

, respectively; then, the composite prospect value

V_{i}

of individual

X_{i}

can be expressed as

V_{i} = \sum_{j = 1}^{M} v_{ij}^{+} π^{+} (ω_{j}) + \sum_{j = 1}^{M} v_{ij}^{-} π^{-} (ω_{j})

(62)

where

ω_{j}

is the evaluation weight of the jth objective, and

π^{+} (ω_{j})

and

π^{-} (ω_{j})

can be calculated according to Equation (55). The integrated prospect value

V_{i}

is the sum of positive and negative prospect values, and the larger the prospect value, the better the quality of the Pareto solution.

In order to determine the evaluation weight

ω_{j}

, information entropy theory is introduced, which avoids the bias of subjectively assigned weights. The specific calculation process is as follows.

First, the proportion of each solution under each objective is calculated from the normalized objective function values.

P_{ij} = \frac{u_{ij}}{\sum_{i = 1}^{N} u_{ij}}

(63)

Then, the information entropy of each target is calculated.

e_{j} = - \frac{1}{\ln (N)} \sum_{i = 1}^{N} P_{ij} \ln (P_{ij})

(64)

Finally, evaluation weights are calculated for individual objectives.

ω_{j} = \frac{1 - e_{j}}{\sum_{j = 1}^{M} (1 - e_{j})}

(65)

where

i = 1, 2, \dots, N, j = 1, 2, \dots, M

.
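The entropy weighting of Eqs. (63)–(65) can be sketched in Python as follows. The sketch assumes at least two solutions, a positive sum in every objective column, and at least one non-uniform column, and it uses the usual convention that 0 · ln 0 = 0; the name entropy_weights is hypothetical.

```python
import math


def entropy_weights(normalized):
    """Eqs. (63)-(65): objective weights from the entropy of normalized values."""
    n, m = len(normalized), len(normalized[0])
    divergences = []
    for j in range(m):
        column = [row[j] for row in normalized]
        total = sum(column)  # assumption: total > 0
        entropy = 0.0
        for u in column:
            p = u / total  # Eq. (63)
            if p > 0:  # convention: 0 * ln(0) = 0
                entropy -= p * math.log(p)
        entropy /= math.log(n)  # Eq. (64)
        divergences.append(1.0 - entropy)
    divergence_sum = sum(divergences)
    return [d / divergence_sum for d in divergences]  # Eq. (65)


# Usage: the first objective separates the solutions, the second does not.
weights = entropy_weights([[1.0, 0.5], [0.0, 0.5]])
```

An objective whose normalized values are identical across solutions has maximal entropy and receives zero weight, so the weighting emphasizes the objectives that discriminate between Pareto solutions.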

As the construction process shows, the comprehensive prospect value model is based on cumulative prospect theory. Grey correlation analysis establishes the link between a Pareto solution and its prospect value, information entropy determines the evaluation weight of each objective, and the magnitude of the comprehensive prospect value measures the quality of the Pareto solution: the greater the value, the better the solution. Therefore, the prospect value can be used as the fitness of the evolutionary algorithm to guide it in solving the UAV swarm dynamic task allocation problem. Figure 2 shows a flow chart of the CPT-PIO method.

The situation assessment mechanism provides a strategic evaluation of UAV posture and task suitability. The CPT-PIO component, in contrast, performs dynamic optimization to refine assignments and resolve conflicts. This layered structure ensures that assessment and optimization work in a complementary fashion.

6. Discussion

To verify the feasibility and effectiveness of the proposed method, a series of comparative simulation experiments is conducted. The scenario contains one airfield, a UAV swarm, and a target swarm, with 40 UAVs in each swarm. The initial positions of the UAV swarm are randomly generated near (0,0,0), obeying a uniform distribution on [0,1000,500] m. The airfield is located at (10000,10000,0) m. The initial positions of the target swarm are randomly generated near the airfield, obeying a uniform distribution on [9000,10000,500] m. Each UAV is assigned five units of abstract operational resources, representing its limited capacity in energy, payload, or computational bandwidth; these constrain the cumulative cost of assigned tasks. The speeds of the UAV swarm are randomly generated from a uniform distribution over [50, 100] m/s, those of the target swarm from a uniform distribution over [−100, −50] m/s, and the simulation time interval is Δt = 0.1 s.

The parameter settings of each component of the CPT-PIO algorithm are shown in Table 1.

To validate the hybrid strategy, we added comparative experiments with CPT only, PIO only, and CPT-PIO. As shown in Table 2, the proposed CPT-PIO hybrid method outperforms both the CPT-only and PIO-only baselines across multiple evaluation metrics. Specifically, CPT-PIO achieves the highest assessment score of 0.88, indicating superior situational awareness and strategic positioning among UAV groups. It also obtains the highest task completion rate at 93.1%, demonstrating better allocation decisions under uncertainty. Furthermore, CPT-PIO converges in only 96 steps, faster than the 120 steps required by the PIO-only approach. These results validate the effectiveness of combining risk preference modeling (via CPT) with global optimization (via PIO), providing a more adaptive and efficient solution for UAV swarm coordination.

The CPT-PIO algorithm is compared with the algorithms in [34], [35], and [36]. A classic response threshold allocation (RTA) algorithm was presented in [34]. A classic decentralized auction mechanism (DAM) for distributed task allocation was presented in [35]. A distributed dynamic task allocation method for UAV swarms based on a networked evolutionary game-theoretic (NEG) framework was proposed in [36]; it ensures convergence to optimal strategies through a payoff-based learning algorithm under dynamic task and agent conditions.

The grouping of the UAV swarm is plotted in Figure 3, translating the identification of individual UAV targets into group-level objectives, which lays the foundation for subsequent task allocation. In this distribution, UAVs are spatially organized into distinct teams based on proximity and communication range, as illustrated by shaded circular regions. UAVs of the same color are represented in the same group. The communication links shown between UAVs represent the information-sharing paths established within each group, ensuring coordinated actions and enhancing mission effectiveness. This grouping mechanism ensures efficient resource utilization and stable swarm operations by minimizing intra-group interference and optimizing local interactions.

The dynamic evolution of the situation assessment indicator weights (w₁–w₇) over ten discrete time steps is illustrated in Figure 4. It is evident that the weights of different indicators fluctuate within the range [0,1], highlighting their temporal adaptivity. For instance, the weight of the speed advantage indicator (w₅) remains consistently high, indicating its persistent importance in swarm dynamics. In contrast, weights such as w₂ (angular advantage) and w₆ (distance advantage) exhibit more moderate variations, reflecting changes in mission context or UAV state. This dynamic weighting strategy, determined by entropy-based analysis, ensures that the evaluation process remains responsive to the evolving confrontation scenario and avoids the subjective bias inherent in fixed-weight methods.

The time-varying velocity profiles of all UAVs in the swarm along the x, y, and z spatial directions are illustrated in Figure 5. Each curve corresponds to the velocity evolution of a single UAV, enabling the observation of group-level motion coordination and dynamic behavior in three-dimensional space, which shows that the velocity of the UAV swarm converges after a period of time.

The Gantt chart results are plotted in Figure 6, and the labels in the figure represent the assigned target numbers. In the case of group 5 in the UAV swarm, its target mission sequence is UAVs #35, #9, #14, #1, #33, #19, #8, and #29 (i.e., in the target swarm, UAVs #35, #9, #14, #1, #33, #19, #8, and #29 are in one group).

As demonstrated in Figure 6, the Gantt chart clearly illustrates the task execution timelines across different UAVs. Notably, several intervals show minimal idle states for the UAVs, indicating a high scheduling density and effective time utilization. Such compact and continuous task scheduling highlights the temporal efficiency of the proposed CPT-PIO method. Additionally, the distribution of tasks across UAVs ensures complete coverage without evident redundancy or mission neglect, further reinforcing the robustness and comprehensive effectiveness of the task allocation strategy.

It follows that the CPT-PIO uses a formation group as a unit for comprehensive consideration, rather than one UAV as a unit for allocation, which provides more options and a relatively better chance of effective allocation and performance. This shows the usability of CPT-PIO in the dynamic task allocation problem.

In terms of swarm situation assessment, Figure 7 compares CPT-PIO with the other three methods, where all situation assessment values are computed using the method in this paper. Figure 7 shows that the CPT-PIO algorithm consistently achieves higher situation assessment scores than RTA, DAM, and NEG. This improvement is primarily attributed to three factors. First, the integration of cumulative prospect theory enables the algorithm to incorporate psychological preferences and risk tendencies. Second, information entropy and grey relational analysis ensure objective weighting of the assessment indicators. Third, the enhanced PIO search mechanism promotes broader exploration of the solution space and more effective task distribution. Together, these factors contribute to improved decision quality under dynamic and uncertain conditions.

7. Conclusions

This paper investigates the dynamic decision-making challenges faced by UAV swarms, with the CPT-PIO algorithm proposed as a solution. Key findings emerge from this study:

(1): Grey relational analysis and information entropy are integrated into cumulative prospect theory. A comprehensive prospect value model based on a multi-objective grey relational evaluation strategy is established specifically for UAV swarm dynamic decision-making problems, extending the applicability of CPT to multi-objective optimization scenarios.
(2): An enhanced pigeon-inspired optimization (PIO) algorithm incorporating reverse search and competitive learning mechanisms is developed. These modifications significantly improve the algorithm's ability to escape local optima.
(3): Simulation experiments on UAV swarm dynamic decision-making scenarios are conducted, in which the CPT-PIO algorithm exhibits clear advantages over existing methods and achieves superior Pareto solutions, particularly in large-scale problem-solving contexts.

Our future work will focus on prioritizing the refinement of situation assessment metrics for heterogeneous, cross-domain unmanned swarms. We will also further explore the algorithm’s implementation in dynamic decision-making scenarios spanning maritime, terrestrial, and aerial domains.

Author Contributions

Conceptualization, Y.P. and M.H.; methodology, Y.P. and M.H.; software, Y.P.; validation, Y.P.; formal analysis, Y.P. and M.H.; investigation, Y.P. and M.H.; resources, Y.P.; data curation, Y.P.; writing—original draft preparation, Y.P.; writing—review and editing, M.H.; visualization, Y.P.; supervision, M.H.; project administration, M.H.; funding acquisition, M.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the National Natural Science Foundation of China, grant number 62403025.

Data Availability Statement

The data used to support the findings of this study are available from the corresponding author upon request.

DURC Statement

Current research is limited to the field of multi-agent swarm intelligence and UAV dynamic decision-making optimization, which is beneficial for enhancing autonomous coordination, adaptive task allocation, and real-time decision-making efficiency in civilian UAV applications such as environmental monitoring, disaster response, and infrastructure inspection, and does not pose a threat to public health or national security. The authors acknowledge the dual-use potential of the research involving cumulative prospect theory-driven pigeon-inspired optimization algorithms for UAV swarm systems and confirm that all necessary precautions have been taken to prevent potential misuse. As an ethical responsibility, the authors strictly adhere to relevant national and international laws about DURC. The authors advocate for responsible deployment, ethical considerations, regulatory compliance, and transparent reporting to mitigate misuse risks and foster beneficial outcomes.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Jiao, L.; Liu, H. An Anchor-Free Refining Feature Pyramid Network for Dense and Multioriented Wheat Spikes Detection Under UAV. IEEE Trans. Instrum. Meas. 2025, 74, 1–14.
2. Ma, Y.; Luo, R. Path Planning for Searching Submarine with Cooperative Coverage of Fixed-Wing UAVs Cluster in Complex Boundary Sea Area. IEEE Sens. J. 2023, 23, 30070–30083.
3. Real, F.; Castano, A. Autonomous Fire-Fighting with Heterogeneous Team of Unmanned Aerial Vehicles. Field Robot. 2021, 1, 1–28.
4. Nesmachnow, S. Multi-Objective Evolutionary Approach for Flight Planning of an Autonomous Fleet of Unmanned Aerial Vehicles in Exploration and Surveillance Missions; Springer: Singapore, 2025.
5. Zuo, Z.; Liu, C. Unmanned Aerial Vehicles: Control Methods and Future Challenges. IEEE/CAA J. Autom. Sin. 2022, 9, 601–614.
6. Horyna, J.; Baca, T. Decentralized swarms of unmanned aerial vehicles for search and rescue operations without explicit communication. Auton. Robot. 2022, 47, 77–93.
7. Zhou, W.; Liu, Z. Multi-target tracking for unmanned aerial vehicle swarms using deep reinforcement learning. Neurocomputing 2021, 466, 285–297.
8. Chen, L.; Liu, W. An efficient multi-objective ant colony optimization for task allocation of heterogeneous unmanned aerial vehicles. J. Comput. Sci. 2022, 58, 101545.
9. Shi, J.; Tan, L. A multi-unmanned aerial vehicle dynamic task assignment method based on bionic algorithms. Comput. Electr. Eng. 2022, 99, 107820.
10. Peng, Q.; Wu, H. A Dynamic Task Allocation Method for Unmanned Aerial Vehicle Swarm Based on Wolf Pack Labor Division Model. IEEE Trans. Emerg. Top. Comput. Intell. 2024, 8, 4075–4089.
11. Ni, X.; Li, Z.; Zhang, Y. A Path Planning Algorithm for Unmanned Aerial Vehicles Based on Artificial Bee Colony. In Proceedings of 4th 2024 International Conference on Autonomous Unmanned Systems (4th ICAUS 2024); Springer: Singapore, 2025.
12. Hu, G.; Mao, C. CMPSO: A novel co-evolutionary multigroup particle swarm optimization for multi-mission UAVs path planning. Adv. Eng. Inform. 2025, 63, 102923.
13. Yildirim, M.Y.; Akay, R. An efficient grid-based path planning approach using improved artificial bee colony algorithm. Knowl.-Based Syst. 2025, 318, 113528.
14. Pehlivanoglu, Y.V. A new vibrational genetic algorithm enhanced with a Voronoi diagram for path planning of autonomous UAV. Aerosp. Sci. Technol. 2012, 16, 47–55.
15. Zhang, C.; Guo, J. A dynamic parameters genetic algorithm for collaborative strike task allocation of unmanned aerial vehicle clusters towards heterogeneous targets. Appl. Soft Comput. 2025, 175, 113075.
16. Wu, H.; Duan, H. Hierarchical pigeon inspired optimization based multi-UAV obstacle avoidance control. Aerosp. Sci. Technol. 2025, 159, 109963.
17. Curtis, E.T.; Curtis, J.L. Shallow Value Weighting Predicts Problem Gambling: A Parameter Estimation Analysis Using Cumulative Prospect Theory. J. Gambl. Stud. 2023, 40, 333–348.
18. Dalhaus, T.; Barnett, B.J. Behavioral weather insurance: Applying cumulative prospect theory to agricultural insurance design under narrow framing. PLoS ONE 2020, 15, e0232267.
19. Rangrez, Z.Y.M.; Ghosh, J. Integrating risk perceptions in a value of information framework using cumulative prospect theory. Struct. Saf. 2025, 115, 102573.
20. Wang, X.; Wang, B. Multi-criteria fuzzy portfolio selection based on three-way decisions and cumulative prospect theory. Appl. Soft Comput. 2023, 134, 110033.
21. Fan, J.; Lei, T. MEREC-MABAC method based on cumulative prospect theory for picture fuzzy sets: Applications to wearable health technology devices. Expert Syst. Appl. 2024, 255, 124749.
22. Zhang, Z.; Yuan, Y.; Duan, H. Finite-Time Formation Control for Clustered UAVs with Obstacle Avoidance Inspired by Pigeon Hierarchical Behavior. Drones 2025, 9, 276.
23. Ballerini, M.; Cabibbo, N. Interaction ruling animal collective behavior depends on topological rather than metric distance: Evidence from a field study. Proc. Natl. Acad. Sci. USA 2008, 105, 1232–1237.
24. Romanczuk, P.; Erdmann, U. Collective motion of active Brownian particles in one dimension. Eur. Phys. J. Spec. Top. 2010, 187, 127–134.
25. Couzin, I.; Krause, J. Collective Memory and Spatial Sorting in Animal Groups. J. Theor. Biol. 2002, 218, 1–11.
26. Wu, J.; Luo, C. Distributed UAV swarm Formation and Collision Avoidance Strategies Over Fixed and Switching Topologies. IEEE Trans. Cybern. 2022, 52, 10969–10979.
27. Lei, X.; Shangqin, T.; Zhenglei, W.; Yongbo, X.; Xiaofei, W. UCAV situation assessment method based on C-LSHADE-Means and SAE-LVQ. J. Syst. Eng. Electron. 2023, 34, 1235–1251.
28. Xu, W.; Pan, Y.; Chen, X.; Ding, W.; Qian, Y. A Novel Dynamic Fusion Approach Using Information Entropy for Interval-Valued Ordered Datasets. IEEE Trans. Big Data 2023, 9, 845–859.
29. Duan, H.; Huo, M. Limit-Cycle-Based Mutant Multiobjective Pigeon-Inspired Optimization. IEEE Trans. Evol. Comput. 2020, 24, 948–959.
30. Yuan, G.; Duan, H. Robust Control for UAV Close Formation Using LADRC via Sine-Powered Pigeon-Inspired Optimization. Drones 2023, 7, 238.
31. Zhang, P.; Qiao, F. Novel Multi-Criteria Sustainable Evaluation for Production Scheduling Based on Fuzzy Analytic Network Process and Cumulative Prospect Theory-Enhanced VIKOR. IEEE Robot. Autom. Lett. 2022, 7, 9969–9976.
32. Tversky, A.; Kahneman, D. Advances in prospect theory: Cumulative representation of uncertainty. J. Risk Uncertain. 1992, 5, 297–323.
33. Xu, H.; Zhou, J. A decision-making rule for modeling travelers’ route choice behavior based on cumulative prospect theory. Transp. Res. Part C Emerg. Technol. 2011, 19, 218–228.
34. Kim, M.; Baik, H. Response threshold model based UAV search planning and task allocation. J. Intell. Robot. Syst. 2014, 75, 625–640.
35. Choi, H.L.; Brunet, L. Consensus-based decentralized auctions for robust task allocation. IEEE Trans. Robot. 2009, 25, 912–926.
36. Zhang, Z.; Jiang, J. Distributed dynamic task allocation for unmanned aerial vehicle swarm systems: A networked evolutionary game-theoretic approach. Chin. J. Aeronaut. 2024, 37, 182–204.

Figure 1. Illustration of the UAV swarm mission area.

Figure 2. Flow chart of the UAV swarm dynamic decision-making using the CPT-PIO algorithm.

Figure 3. Distribution of UAV teams.

Figure 4. Dynamic evolution of the assessment indicator weights over time steps.

Figure 5. UAV swarm velocity evolution in the x, y, and z directions.

Figure 6. Task allocation Gantt chart of the UAV swarm.

Figure 7. Change curves of the situation assessment indicators.

Table 1. Parameter values of the CPT-PIO.

Component	Parameter	Value
UAV swarm control model	$k_{v}$	1
	$k_{ag}$	5
	$k_{re}$	5
	$k_{airfield}$	5
	$r_{re}$	200 m
	$r_{ag}$	500 m
	$q_{max}$	20 m/s²
Situation assessment model	$k_{α}$	0.4
	$k_{β}$	0.4
	$k_{h}$	0.2
	$ε$	0.0001
	$r_{s}$	5000 m
	$r_{a}$	1000 m
PIO algorithm	$N_{p}$	200
	$ncmax_{1}$	40
	$ncmax_{2}$	20
	$R_{1}$	0.02
Cumulative prospect theory	$α$	0.85
	$β$	0.85
	$φ$	0.61
	$δ$	0.69
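
As a rough illustration of how the cumulative prospect theory parameters in Table 1 might be used, the sketch below plugs them into the power value function and the inverse-S probability weighting function of Tversky and Kahneman (1992). It is not the authors' implementation. Three things are assumptions made only for this example: that $φ$ is the weighting exponent for gains and $δ$ the one for losses, that the reference point is a plain scalar, and the loss-aversion coefficient, which Table 1 does not list (the value 2.25 is the estimate reported by Tversky and Kahneman). The function names are hypothetical.

```python
# Parameter values taken from Table 1.
ALPHA = 0.85  # curvature of the value function for gains
BETA = 0.85   # curvature of the value function for losses
PHI = 0.61    # weighting exponent, ASSUMED here to apply to gains
DELTA = 0.69  # weighting exponent, ASSUMED here to apply to losses

# ASSUMPTION: not listed in Table 1; 2.25 is the Tversky-Kahneman (1992)
# estimate and is used only for illustration.
LOSS_AVERSION = 2.25


def value(outcome, reference=0.0):
    """Power value function: concave for gains, steeper for losses."""
    deviation = outcome - reference
    if deviation >= 0:
        return deviation ** ALPHA
    return -LOSS_AVERSION * (-deviation) ** BETA


def weight(probability, exponent):
    """Inverse-S weighting: w(p) = p^g / (p^g + (1 - p)^g)^(1 / g)."""
    numerator = probability ** exponent
    denominator = (numerator + (1.0 - probability) ** exponent) ** (1.0 / exponent)
    return numerator / denominator


if __name__ == "__main__":
    # Losses loom larger than gains of the same size.
    print("v(+1) =", value(1.0), " v(-1) =", value(-1.0))
    # Small probabilities are overweighted, moderate ones underweighted.
    for p in (0.05, 0.5, 0.95):
        print(p, round(weight(p, PHI), 3), round(weight(p, DELTA), 3))
```

With these exponents the weighting function overweights small probabilities and underweights moderate and large ones, which is the behavior the prospect value model relies on when ranking candidate solutions against a reference point.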

Table 2. Performance comparison of the CPT-only, PIO-only, and CPT-PIO methods.

Method	Assessment Score	Task Completion Rate (%)	Convergence Time (Steps)
CPT-only	0.68	84.2	—
PIO-only	—	88.9	120
CPT-PIO	0.88	93.1	96

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
