1. Introduction
With the rapid development of science and technology, unmanned aerial vehicles (UAVs) are now used in many fields, including commerce, industry, agriculture, and scientific research. This has promoted the rapid growth of related industries and has become essential for supporting the development of the low-altitude economy. In response to market demand, many researchers are exploring the potential applications of UAVs in areas such as logistics and distribution, urban governance, low-altitude access, fire rescue, address identification, agricultural plant protection, environmental protection, and cultural and tourism development [1,2,3]. A top priority of current UAV research is to make UAV swarms applicable across different fields while accounting for their performance, advancing technological innovation, and supporting the high-quality development of various industries. In dynamic and open environments, UAV swarms must respond intelligently and cooperatively to a wide range of real-time tasks [4]. Achieving effective decision-making and control in UAV swarms under such conditions poses a significant challenge, especially when multiple conflicting tasks must be balanced simultaneously [5].
In real-world application scenarios, UAV swarm decision-making often involves the optimization of multiple task objectives, such as minimizing mission completion time, communication delays, participation costs, and attrition risks. As the number of tasks increases, traditional multi-objective optimization (MOO) algorithms struggle to maintain performance. Specifically, Pareto-based dominance relationships become less discriminative in high-dimensional objective spaces, resulting in reduced search efficiency and weaker convergence. This makes it difficult to identify meaningful trade-offs among solutions and poses a considerable challenge for high-dimensional dynamic decision-making in UAV swarms [6,7].
To address these issues, various swarm intelligence algorithms have been explored, including particle swarm optimization (PSO), artificial bee colony (ABC), and genetic algorithms (GAs) [8,9,10,11]. These methods have been applied successfully to complex optimization problems in UAV control, such as task allocation and path planning.
A co-evolutionary multi-group particle swarm optimization algorithm was proposed in [12]; it enhances multi-mission UAV path planning by enabling cooperative evolution among subgroups, improving solution diversity, convergence speed, and task adaptability in complex environments. An efficient grid-based path planning method for UAVs using an improved artificial bee colony algorithm was presented in [13]; it improves convergence speed and solution quality by incorporating adaptive strategies and local search mechanisms to better navigate complex environments. A vibrational genetic algorithm enhanced with Voronoi diagrams for autonomous UAV path planning was introduced in [14], improving obstacle avoidance, path smoothness, and convergence efficiency in complex environments. A dynamic parameter genetic algorithm for collaborative strike task allocation in UAV swarms was proposed in [15]. It adaptively adjusts genetic parameters to handle heterogeneous targets, enhancing task efficiency, adaptability, and coordination in dynamic combat scenarios.
The pigeon-inspired optimization (PIO) algorithm has gained attention for its inspiration from the homing behavior of pigeons, offering fast convergence and strong global search capabilities.
A hierarchical control strategy for multi-UAV obstacle avoidance based on an enhanced pigeon-inspired optimization algorithm was developed in [16], introducing layered decision-making to improve real-time responsiveness, path safety, and coordination in complex environments.
However, most of these algorithms operate under deterministic assumptions and do not consider the behavioral characteristics or risk preferences of decision-makers in uncertain environments.
Meanwhile, cumulative prospect theory (CPT) provides a psychologically grounded framework for modeling decision-making under risk and uncertainty. Unlike traditional utility-based models, CPT accounts for how people perceive gains and losses differently and how they subjectively distort probabilities. These characteristics have made CPT widely applicable in areas such as behavioral economics, finance, and risk analysis [17,18].
Cumulative prospect theory was integrated into a value-of-information framework in [19] to account for human-like risk perception, providing a more realistic and behaviorally informed approach to decision-making under uncertainty. A multi-criteria fuzzy portfolio selection method that combines three-way decision theory with cumulative prospect theory was presented in [20], offering a more nuanced and psychologically consistent framework for investment decision-making under uncertainty and ambiguity. A decision-making method based on the removal effects of criteria and multi-attributive border approximation area comparison was introduced in [21]. It integrates cumulative prospect theory with picture fuzzy sets, enhancing the evaluation of wearable health technology devices by capturing decision-makers’ psychological behavior and handling uncertainty more effectively.
Despite its effectiveness in modeling human behavior, CPT has not yet been systematically applied to UAV swarm optimization tasks, particularly those involving dynamic environments and uncertain outcomes.
To fill this gap, this paper proposes a novel algorithm named cumulative prospect theory-driven pigeon-inspired optimization (CPT-PIO) for solving the multi-objective dynamic decision-making problem in UAV swarms. The core idea is to incorporate decision-makers’ psychological preferences into the optimization process. First, grey relational analysis and information entropy are used to normalize and weight the objectives, reducing the influence of dimensional heterogeneity. Then, based on CPT, a comprehensive prospect value model is constructed by defining reference points, applying value functions, and assigning probability weights. This value serves as a fitness function to guide the search process of the PIO algorithm. To further improve global exploration and avoid premature convergence, a reverse search mechanism is introduced into the standard PIO.
The primary contributions and novelties of this work are summarized as follows:
A novel decision-making optimization framework is proposed by integrating cumulative prospect theory into the evaluation of UAV swarm Pareto solutions, allowing the algorithm to reflect psychological risk preferences through the construction of a prospect value model;
An entropy-weighted grey relational analysis method is introduced for swarm situation assessment, enabling objective and adaptive determination of assessment index weights in dynamic and uncertain environments;
The traditional pigeon-inspired optimization algorithm is enhanced by incorporating a reverse search mechanism and competitive learning strategy, which effectively avoids premature convergence and improves solution diversity and convergence speed.
These innovations jointly contribute to a robust and psychologically informed optimization strategy, as demonstrated by superior performance in simulation experiments against state-of-the-art methods.
The rest of this paper is organized as follows: Section 2 provides the problem formulation of the UAV swarm mission process. The methods for UAV swarm control are presented in Section 3. Section 4 discusses the situation assessment model and Section 5 provides the algorithm design and process. Section 6 displays the numerical simulation results and Section 7 summarizes the whole work.
3. Method for UAV Swarm Control
In the initial phase, the UAV swarm cannot perceive the presence of targets. The objective of the UAVs is to form an orderly whole within the swarm, moving at high speed towards the airfield.
In 2008, Ballerini et al. studied the flight data of starling flocks and found that the velocity of each individual was correlated only with its 6–7 nearest neighbors, and that the number of interacting neighbors remained stable regardless of the distance between individuals [23]. This phenomenon, referred to as “topological interaction,” challenges the traditional principle of defining neighbors based on relative distance [24,25]. When topological interactions are considered, the number of interacting neighbors is nearly constant and much smaller than the number of neighbors that can be sensed. Drawing on the strong spatial consistency shown by starlings in nature, this information interaction mechanism is applied to the control of the UAV swarm: each UAV interacts with its seven closest individuals [26]. UAVs perceive neighboring positions and velocities using onboard non-visual sensors such as GPS receivers, inertial measurement units (IMUs), or ultra-wideband (UWB) modules, enabling reliable relative positioning.
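For example, for the swarm of type $c$, the distance between UAV $i$ and every other UAV is calculated. The distance between UAVs $i$ and $j$ is denoted $d_{ij}^{c}$. Then, the distances $d_{ij}^{c}$ are sorted in ascending order; using $r_{ij}^{c}$ to denote the rank of UAV $j$ in this ordering, the neighbor set of UAV $i$ can be expressed as

$$N_{i}^{c} = \{\, j \mid r_{ij}^{c} \le 7,\ j \ne i \,\}$$

where $N_{i}^{c}$ represents the neighbor set of UAV $i$ of type $c$.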
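The neighbor selection described above can be sketched in a few lines. This is a minimal illustration, assuming positions are stored as an (n, 3) NumPy array; the function name `topological_neighbors` and the array layout are hypothetical and not taken from the paper.

```python
import numpy as np

def topological_neighbors(positions, k=7):
    """Return, for each UAV, the indices of its k nearest swarm mates.

    positions: (n, 3) array of UAV positions (assumed layout).
    k: number of interacting neighbors (7 in the text).
    """
    positions = np.asarray(positions, dtype=float)
    n = positions.shape[0]
    k = min(k, n - 1)
    # Pairwise Euclidean distances between UAV i and UAV j.
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # A UAV is never its own neighbor.
    np.fill_diagonal(dist, np.inf)
    # Sort each row in ascending order and keep the first k ranks.
    return np.argsort(dist, axis=1)[:, :k]
```

Because the neighbor count is fixed at k, the interaction set stays the same size whether the swarm is tightly packed or widely spread, which is the property the topological rule is meant to capture.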
In order to more efficiently complete the search for targets, the UAV swarm should achieve spatial and speed consistency and avoid collisions between individuals within the swarm.
where
represents the speed consistency acceleration of
c camp UAV
i.
represents correlation coefficients,
.
stands for the number of neighbors and its value is 7.
In order for the entire UAV swarm to form a compact whole, within the neighbor collection, the neighbor j will have a gathering force on the UAV i when the distance between the UAV and its neighbor j exceeds the set threshold.
The collection of individuals,
, that generate aggregation forces on UAV
i in the neighbor collection of UAVs
i is determined:
where
indicates the position of UAV
j of
c type.
and
respectively denote the radius of rejection and aggregation.
The spatial aggregation acceleration can be expressed as
where
represents the correlation coefficient,
,
.
Collision avoidance between UAVs is considered; that is, if the distance between the UAV i and its neighbors is less than the pre-set security threshold, the neighbor will repel them.
The collection of individuals,
, that generate repulsive forces on UAV
i in the neighbor collection is also determined:
where
represents the repulsive acceleration of UAV
i of the
c type.
represents the correlation coefficient,
,
.
The UAV swarm is also attracted toward the airfield. The acceleration generated by this attraction can be expressed as
where
represents the gravity acceleration of the airfield.
indicates the correlation coefficient,
.
determines the position of airfield,
represents the position of the UAV swarm.
The combined acceleration of the
c type UAV
i can be expressed as follows:
where
represents the acceleration limit of the UAV to avoid excessive motion fluctuations caused by the simultaneous action of multiple interaction forces. This constraint ensures the resulting control input remains within feasible dynamic bounds and helps maintain stable formation.
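As an illustration of the acceleration limit, the sketch below sums the individual interaction accelerations and rescales the result when its magnitude exceeds the limit. The norm-based saturation is an assumption; the paper's exact limiting rule may differ (for example, it may clip each axis separately). The names `combined_acceleration` and `a_max` are hypothetical.

```python
import numpy as np

def combined_acceleration(components, a_max):
    """Sum interaction accelerations and cap the magnitude at a_max.

    components: iterable of 3-vectors (e.g. speed consistency, aggregation,
                repulsion, and airfield attraction terms).
    a_max: acceleration limit (assumed to bound the Euclidean norm).
    """
    total = np.sum(np.asarray(components, dtype=float), axis=0)
    norm = np.linalg.norm(total)
    if norm > a_max:
        # Keep the direction, shrink the magnitude to the limit.
        total = total * (a_max / norm)
    return total
```

Rescaling the vector rather than clipping each axis preserves the direction of the combined command, so the limit changes how fast the UAV reacts but not where it is steered.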
The distance between the UAV swarm center and the target swarm center can be expressed as

$$dis = \lVert P^{c} - P^{t} \rVert$$

where $P^{c}$ is the position of the UAV swarm and $P^{t}$ stands for the position of the target swarm. When $dis$ is greater than $R_{d}$, the detection perception radius of the UAV, the two swarms cannot perceive each other and the UAV swarm’s target is the airfield. When $dis \le R_{d}$, the objectives of the UAV swarm change, with some individuals aiming at the target swarm.
To improve the efficiency of task allocation, the UAV swarm is divided into groups, so that identifying a target for each UAV is reduced to determining an objective for each group.
For the c type UAV swarm, the swarm is first divided into groups by selecting individuals as the initial group centers. Then, the distance from each remaining UAV to every center is calculated, and each UAV joins the group of its nearest center. Once all UAVs are grouped, the center coordinates of each group are recalculated and become the new group centers. These steps are repeated until the cycle ends.
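The grouping procedure above has the structure of k-means clustering. The following sketch makes that explicit under stated assumptions: the number of groups, the fixed iteration count, and the random choice of initial centers are illustrative, and `group_swarm` is a hypothetical name.

```python
import numpy as np

def group_swarm(positions, n_groups, n_iter=20, seed=0):
    """Group UAVs around iteratively updated centers (k-means style)."""
    pts = np.asarray(positions, dtype=float)
    rng = np.random.default_rng(seed)
    # Select n_groups distinct UAVs as the initial group centers.
    centers = pts[rng.choice(len(pts), size=n_groups, replace=False)].copy()
    labels = np.zeros(len(pts), dtype=int)
    for _ in range(n_iter):
        # Assign every UAV to its nearest center.
        dist = np.linalg.norm(pts[:, None, :] - centers[None, :, :], axis=-1)
        labels = np.argmin(dist, axis=1)
        # Recompute each center as the mean position of its members.
        for g in range(n_groups):
            members = pts[labels == g]
            if len(members) > 0:
                centers[g] = members.mean(axis=0)
    return labels, centers
```

A fixed iteration count stands in for "until the cycle ends"; a convergence check on the center displacement would be an equally reasonable stopping rule.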
4. Situation Assessment Model
In the context of information technology applications, the accurate assessment of a task situation is an important part of data fusion and correct decision-making. The situation of a UAV swarm is not simply the combined state of the environment and the targets in the mission area at a given time. Instead, a comprehensive posture is obtained by building a view of the targets and the UAV swarm across activity, time, location, and status, and by linking the target distribution to activity, environment, intentions, and maneuverability.
The situation assessment indicator is based on the state changes of all individuals in the confrontation environment. The situation changes over time but is relatively stable: it can be regarded as constant over a short period and changes only after a certain amount of time has elapsed.
The situation assessment mainly considers the overall motion state of the targets, the difficulty of targeting under different target selection strategies, and the difficulty of executing different behavior strategies [27].
In this paper, two main types of indicators are considered: the relative advantage indicator and swarm synergy indicator. The relative advantage indicator includes the angle advantage indicator, position advantage indicator, speed advantage indicator, distance advantage indicator, and height advantage indicator.
The angular advantage indicator is a function of the difference in angle between the mean motion vector of the n group and the vector pointing from the geometric center of the n group to the geometric center of the m group. It characterizes how difficult it is, in terms of maneuvering, for the n group to execute its task against the m group.
where
indicates the angular advantage indicator of the
n group against the
m group.
represents the correlation coefficient.
The position advantage indicator is a function of the difference in angle between the mean motion vector of the m group and the vector pointing from the geometric center of the n group to the geometric center of the m group. It characterizes the position advantage of the n group in executing its task against the m group.
where
indicates the position advantage indicator of the
n group against the
m group.
represents the correlation coefficient.
The speed advantage indicator is a function of the speeds of the n group and the m group. When the speed difference between the two sides is large, the speed advantage indicator takes a fixed value.
where
indicates the speed advantage indicator of the
n group against the
m group.
and
represent the mean speed of the
n group and the
m group, respectively.
The distance advantage indicator is a function of the distance between the geometric centers of the two groups. The closer the two groups, the shorter the time needed to complete the task and the greater the distance advantage indicator.
where
indicates the distance advantage indicator of the
n group against the
m group.
d represents the distance between the geometric center of the
n group and the geometric center of the
m group.
and
indicate the perceived radius and the implementing mandate radius of the UAV, respectively.
The height advantage indicator is a function of the difference in mean height between the m group and the n group. It characterizes the height advantage of the n group in executing its task against the m group.
where
indicates the height advantage indicator of the
n group against the
m group.
represents the correlation coefficient.
expresses the difference in mean height between the
m group and the
n group.
The swarm synergy indicator characterizes the difficulty of coordination among individuals in a swarm during task execution. It mainly includes a horizontal-plane synergy indicator, measured by an order parameter, and a vertical-plane synergy indicator, measured by the degree of aggregation.
where
is a small constant
.
To ensure that each indicator contributes fairly and consistently, all situation assessment terms are normalized to the [0,1] interval. This guarantees that the final situation assessment score also lies within [0,1]. In summary, the formula for calculating the indicators for the comprehensive situation assessment of the UAV swarm is as follows:
where
represents the coefficient,
,
.
To ensure reliable and robust situation assessment, a group-level information fusion strategy is adopted, where UAV posture reports (e.g., position, velocity, and orientation) are aggregated using weights derived from entropy-based credibility scores. These scores quantify the consistency and discriminability of each UAV’s observation within its group, thereby reducing the influence of contradictory or unreliable data caused by sensor noise or communication delays. At the same time, information entropy theory is introduced to objectively determine the weights of each assessment indicator, effectively avoiding bias from subjective preferences [28]. The detailed calculation procedure for entropy-based weight assignment is described as follows.
Step 1: For one type side, each UAV group has
targets to choose from. That is, each UAV group has
operational options to be evaluated, each of which contains the seven pending indicators mentioned above. For the UAV group
n, build the evaluation matrix
Step 2: Standardize data on indicators according to (22) to eliminate non-conformity between indicators.
Step 3: Calculate entropy
for each assessment indicator according to (23). When
,
.
Step 4: Calculate the deviation degree
of indicator
j by using the entropy
according to (24).
Step 5: Determine the weight coefficient
by using the deviation degree of the assessment indicators
according to (25).
Step 6: Each UAV group has its own evaluation index weight matrix
. Calculate the average weight matrix for the entire UAV swarm
. The average solution of the weight of each evaluation indicator can be calculated by
Step 7: Calculate the positive distance and the negative distance between the evaluation index weights of UAV group n and the average weight solution.
When the evaluation indicator j is benefit-based,
When the evaluation indicator j is cost-based,
Step 8: Based on the evaluation index weight matrix
, separately calculate the weighted positive distance
and the weighted negative distance
for each UAV group.
Step 9: By standardizing the
and
obtained in Step 8, the standard weighted positive distance
and the standard weighted negative distance
are obtained.
Step 10: The overall evaluation score of each UAV group
can be obtained separately by (31).
Step 11: Sort the overall evaluation scores in descending order and select the evaluation index weight matrix corresponding to the maximum score. Thus, the optimal weight of the situation assessment for the UAV swarm is obtained.
The weights wj (j = 1, …, 7) are not fixed constants but are dynamically computed via the entropy-based multi-step procedure detailed in (21) to (31). This adaptive mechanism ensures context-sensitive assessment while avoiding subjective bias.
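Steps 2 to 5 follow the general pattern of the entropy weight method. The sketch below shows that pattern under stated assumptions: min-max standardization, all indicators treated as benefit-type, and the convention 0·ln 0 = 0. It does not reproduce the group-level comparison of Steps 6 to 11, and `entropy_weights` is a hypothetical name.

```python
import numpy as np

def entropy_weights(matrix):
    """Entropy-based indicator weights for an (options x indicators) matrix."""
    x = np.asarray(matrix, dtype=float)
    m = x.shape[0]
    span = x.max(axis=0) - x.min(axis=0)
    safe_span = np.where(span > 0, span, 1.0)
    # Standardize each indicator to [0, 1]; constant columns map to ones.
    r = np.where(span > 0, (x - x.min(axis=0)) / safe_span, 1.0)
    # Share of each option within an indicator.
    p = r / r.sum(axis=0)
    # Entropy per indicator, using the convention 0 * ln(0) = 0.
    plogp = np.where(p > 0, p * np.log(np.where(p > 0, p, 1.0)), 0.0)
    e = -plogp.sum(axis=0) / np.log(m)
    # Deviation degree: indicators with lower entropy discriminate more.
    d = 1.0 - e
    if np.isclose(d.sum(), 0.0):
        # No indicator discriminates; fall back to equal weights.
        return np.full(x.shape[1], 1.0 / x.shape[1])
    return d / d.sum()
```

An indicator that takes the same value for every option has maximal entropy and receives a weight close to zero, which is how the method avoids rewarding indicators that carry no information.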
Based on this situation evaluation function, the cost function defined by the UAV swarm mission benefits is designed.
The loss in the UAV swarm is caused by the target swarm, so the loss value is determined by the target swarm’s strategy, which is not related to the UAV swarm’s task allocation strategy. Under the targets’ policy, the valuation of the losses of the UAV swarm can be calculated as
where
expresses the valuation of the losses of the
n group of the
c type,
The
stands for the probability of the
group against the
n group. The
stands for the probability of the
n group loss under the strategy of
m group. The
stands for the value of the
n group.
where
indicates the effect of the posture of groups on operate probability.
denotes the
m group’s posture for the
n group.
indicates the operate posture threshold.
and
indicate the value of resources in the
n group and the
m group, respectively.
indicates the effect of the gap in the value of resources on the probability of operating.
where
expresses the influential factors of environment, such as geography, weather, or electromagnetic interference. The
expresses the resource’s ability. The
expresses the resource value for the
m group to launch against the
n group.
is a vector of
decision variables, which denotes the feasible task strategies.
is the Pareto frontier and
is the function value of the
n target. The high-dimensional multi-objective model can be expressed as follows:
where
and
denote the task loss value of the UAV swarm and the target swarm under the
X strategy, respectively.
denotes the mission completion time of the UAV swarm under the
X strategy,
is the mission completion time of the
i group.
5. Pigeon-Inspired Optimization Based on Cumulative Prospect Theory
Pigeon-inspired optimization based on cumulative prospect theory integrates cumulative prospect theory and the grey correlation analysis method, establishes the comprehensive prospect value model of Pareto solutions, adopts prospect value as the fitness of the pigeon-inspired optimization algorithm to guide the evolution, and evaluates the advantages and disadvantages of the Pareto solution in terms of the magnitude of the value.
5.1. Improved Pigeon-Inspired Optimization
The pigeon-inspired optimization algorithm emulates the unique navigation behavior exhibited by pigeon swarms during homing. This process involves two key operators, map–compass operators and landmark operators, which guide pigeons at different stages of navigation. The iterative update of individual positions within the population facilitates the exploration of the target space [29,30]. In the initial navigation phase, the traditional PIO algorithm updates solely based on the global optimum position, leading to rapid convergence and a tendency to become trapped in local optima, resulting in poor population diversity. To address these limitations, a reverse search mechanism has been introduced to enhance the traditional PIO algorithm, thereby improving search accuracy and preventing premature convergence.
The basic idea is to take the population $P$ after each iteration, generate its reverse population $OP$, and, based on the fitness values of the merged population $P \cup OP$, select the $N$ best individuals to form the new population $NP$, which is used for the next iterative search. The mathematical expression of the reverse search mechanism is as follows:

$$\bar{X} = X_{\max} + X_{\min} - X$$

where $X_{\max}$ and $X_{\min}$ are the upper and lower boundaries of $X$ and $\bar{X}$ is the reverse individual of $X$.
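The reverse search step can be sketched as follows. The sketch assumes the reverse individual is the reflection of X within its bounds (upper + lower − X), as in opposition-based learning, and that a larger fitness value is better; `reverse_search` and `fitness_fn` are hypothetical names.

```python
import numpy as np

def reverse_search(population, fitness_fn, lower, upper):
    """Merge a population with its reverse population and keep the best N."""
    pop = np.asarray(population, dtype=float)
    # Reverse individual: reflect each coordinate within its bounds.
    opposite = lower + upper - pop
    merged = np.vstack([pop, opposite])
    fitness = np.array([fitness_fn(x) for x in merged])
    # Larger fitness is assumed better; keep as many as the original size.
    best = np.argsort(-fitness, kind="stable")[: len(pop)]
    return merged[best]
```

Evaluating both an individual and its reflection doubles the number of fitness evaluations per iteration, which is the price paid for the wider coverage of the search space.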
The pigeon-inspired optimization algorithm based on the reverse search mechanism is as follows.
Step 1: Initialize the optimization algorithm parameters.
Set $N$ pigeons in a $D$-dimensional search space. The position of the $i$th pigeon is denoted $X_i$ and its velocity is denoted $V_i$. The improved pigeon-inspired optimization algorithm has two independent operators: the maximum number of iterations of the first is ncmax1 and that of the second is ncmax2.
Step 2: Map–compass operators.
At this stage, pigeons rely primarily on the sun and Earth’s magnetic field for navigation. Their position and speed are updated in each iteration as follows:
where
R1 is the map and compass factor. Sort all of the
and
by their fitness values. Select the top
pigeons with the highest fitness values as the new population. Proceed with the next iteration until the number of iterations reaches
ncmax1.
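For orientation, the sketch below implements the map–compass update in the form commonly given for the basic PIO algorithm: the previous velocity decays by a factor exp(−R1·t), and each pigeon is pulled toward the global best position by a random fraction of its offset. This is the standard form and is an assumption here; the improved variant in this paper may differ in detail. The name `map_compass_step` is hypothetical.

```python
import numpy as np

def map_compass_step(positions, velocities, global_best, t, r1, rng):
    """One map-compass update of the basic PIO algorithm (standard form).

    positions, velocities: (N, D) arrays.
    global_best: (D,) best position found so far.
    t: current iteration index; r1: map and compass factor.
    rng: numpy random Generator.
    """
    decay = np.exp(-r1 * t)
    # One uniform random number in [0, 1) per pigeon.
    rand = rng.random((positions.shape[0], 1))
    new_v = velocities * decay + rand * (global_best - positions)
    new_x = positions + new_v
    return new_x, new_v
```

Because the decay factor shrinks with t, early iterations are dominated by momentum and later ones by the pull toward the global best, which explains the fast convergence and the loss of diversity that the reverse search mechanism is meant to offset.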
Step 3: Landmark operators.
At this stage, pigeons rely on landmarks near the nest to navigate, and pigeons in the group that are far from their destination are abandoned in turn.
where
indicates the central position of the pigeon swarm at the moment of
. On the matter,
where
represents a small number to ensure that the denominator is not zero.
indicates the number of pigeons in the
t1 moment.
where
indicates the average fitness of the entire pigeon swarm at the
t1 moment. The pigeons in the swarm are randomly arranged into ring-shaped topologies, with each pigeon and the pigeons on its left and right sides forming a small group.
where
indicates the fitness value of the small group at the
t1 moment.
FIT and
FIT represent respectively the adaptability values for the left and right individuals of the
pigeon in the topology.
where
indicates the central position of the small group at the
t1 moment.
and
represent, respectively, the coordinates of the left and right individuals of the
i pigeon in the topology.
where
represents the random number satisfying the Cauchy distribution and
represents the random number satisfying the Gaussian distribution.
The pigeons are sorted by their cost function values, and the half of the pigeons farthest from the nest is discarded.
Step 4: When the number of iterations reaches ncmax2, end the loop and obtain the optimal solution.
The improved pigeon-inspired optimization algorithm is proposed, incorporating two key improvements into the standard framework. First, a reverse search mechanism is introduced to generate inverse solutions and select optimal candidates, thereby enhancing global exploration capability and solution diversity. Second, a competitive learning strategy based on ring-topology neighborhoods and combined Cauchy–Gaussian perturbations is employed to improve adaptability and help the algorithm escape local optima. These modifications collectively improve the robustness and accuracy of the algorithm in dynamic UAV swarm decision-making scenarios.
5.2. Cumulative Prospect Theory
Cumulative prospect theory fully accounts for the risk attitude of decision-makers in the face of gains and losses. A comprehensive prospect value model is established by setting reference points and determining value functions and attribute weights in order to evaluate decision problems under risk.
The mapping relationship between the event set
Q and the result set
R based on the uncertain prospect
is established. The occurrence of any event
results in a result
. The results of all events are sorted in small to large order,
. The midpoint
of the sequence is set as a reference point. If
, the result
is profitable; at this point, the decision-maker’s gains and losses are a positive prospect value of
. If
, the result
is a cost and the decision-maker’s gains and losses are a negative prospect value of
.
f’s comprehensive prospect value is
[
31].
where
and
represent the decision weight function of the profitable and lossy results, respectively.
represents the gains and losses of decision-makers relative to the reference results and
va is its corresponding value function.
The value function can be described by the following piecewise function,
where
and
determine the coefficient of aversion and coefficient of preference for decision-makers when faced with risk, respectively. The general value is
[
32].
denotes the loss aversion factor. In general, decision-makers are more sensitive to losses, but in real life there are also scenarios where decision-makers are more sensitive to gains. Therefore, a gain sensitivity coefficient is introduced to express decision-makers’ sensitivity to returns. The improved value function is
where, if
, decision-makers are more sensitive to gains than losses. Conversely, if
, decision-makers are more sensitive to losses than gains.
When calculating a cumulative weight function, the single probability weight function should be calculated first. Since people usually overweight small-probability events, the probability weight function is
where
and
are the probability weight function for decision-makers facing gains and losses, respectively.
is the probability of
.
and
are the risk return attitude coefficients and risk loss attitude coefficients, respectively, with a classic value of
[
33]. For each parameter, a range of candidate values is tested, and the combination yielding the most stable performance across scenarios is adopted.
According to the formulas proposed by scholars such as Tversky, probability decision weight functions
and
under different results can be obtained by
where
expresses the ideal probability for arbitrary events
to occur. In the scenarios set in this article,
can specifically indicate a target selected by one of the UAVs and
is the ideal probability value for that event to occur. The general value range is [0,1].
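To make the pieces of this subsection concrete, the sketch below computes a cumulative prospect value in the classic Tversky and Kahneman form. It is an illustration under stated assumptions: outcomes are already expressed relative to the reference point, the parameter values (0.88, 0.88, 2.25, 0.61, 0.69) are the commonly quoted 1992 estimates rather than values confirmed for this paper, and the gain sensitivity coefficient of the improved value function is not included. All function names are hypothetical.

```python
import numpy as np

# Commonly quoted Tversky-Kahneman (1992) estimates; assumed, not from this paper.
ALPHA, BETA, LAMBDA = 0.88, 0.88, 2.25
GAMMA, DELTA = 0.61, 0.69

def value(x, alpha=ALPHA, beta=BETA, lam=LAMBDA):
    """Classic piecewise value function: concave for gains, steeper for losses."""
    x = np.asarray(x, dtype=float)
    gains = np.abs(x) ** alpha
    losses = -lam * np.abs(x) ** beta
    return np.where(x >= 0, gains, losses)

def weight(p, c):
    """Inverse-S probability weighting function with curvature c."""
    p = np.clip(np.asarray(p, dtype=float), 0.0, 1.0)
    return p ** c / (p ** c + (1.0 - p) ** c) ** (1.0 / c)

def cumulative_prospect_value(outcomes, probs):
    """Sum of decision weight times value over rank-ordered outcomes."""
    order = np.argsort(outcomes)
    x = np.asarray(outcomes, dtype=float)[order]
    p = np.asarray(probs, dtype=float)[order]
    total = 0.0
    for i in range(len(x)):
        if x[i] >= 0:
            # Gains: weight of "this good or better" minus "strictly better".
            pi = weight(p[i:].sum(), GAMMA) - weight(p[i + 1:].sum(), GAMMA)
        else:
            # Losses: weight of "this bad or worse" minus "strictly worse".
            pi = weight(p[: i + 1].sum(), DELTA) - weight(p[:i].sum(), DELTA)
        total += float(pi * value(x[i]))
    return total
```

With these parameters a symmetric gamble, such as an equal chance of gaining or losing the same amount, receives a negative prospect value, which reflects loss aversion.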
5.3. Multi-Objective Grey Relation Evaluation Strategy Based on Cumulative Prospect Theory
When applying the cumulative prospect theory to the multi-objective optimization problem, grey correlation analysis and information entropy theory are introduced to build upon the grey correlation evaluation strategy. The model effectively establishes the connection between the Pareto solution and the prospect value and explores the effective information between the Pareto solution and each objective.
is the Pareto frontier of . denotes the Mth objective function value of . The following is the comprehensive prospect based on the gray correlation evaluation strategy value modeling process.
Step 1: Standardized processing.
In order to effectively eliminate the effects of different target scales and orders of magnitude, each target value is normalized. The processing method is as follows:
where
is the normalized processing value of the sub-target value
,
, and
are the maximum and minimum values of the
target respectively.
Step 2: Determination of reference points.
When making decisions, people usually measure the degree of gain or loss of a decision relative to certain reference points. Borrowing the idea of the technique for order preference by similarity to an ideal solution (TOPSIS), the positive and negative ideal solutions are selected as the reference points of the evaluation indexes, which are used to measure the advantages and disadvantages of the Pareto solution, namely
where
and
are the positive and negative ideal solutions, respectively.
Step 3: Determination of positive and negative correlation coefficients.
Using grey correlation analysis with the positive and negative ideal solutions as the reference series, the positive and negative correlation coefficients between
and the positive and negative ideal solutions
and
with respect to the
goal can be determined as
where
is the resolution factor, which generally takes the value of 0.5.
and
are the positive and negative correlation coefficients, respectively.
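Step 3 can be illustrated with the standard grey relational coefficient. The sketch assumes the objective values are already normalized so that larger is better, takes the column-wise maximum and minimum as the positive and negative ideal solutions, and uses global minimum and maximum deviations; these choices and the name `grey_relational_coefficients` are assumptions, not the paper's confirmed formulation.

```python
import numpy as np

def grey_relational_coefficients(norm_values, rho=0.5):
    """Positive and negative grey relational coefficients.

    norm_values: (solutions x objectives) matrix, normalized, larger is better.
    rho: resolution factor (0.5 in the text).
    """
    y = np.asarray(norm_values, dtype=float)
    pos_ideal = y.max(axis=0)  # best value of each objective
    neg_ideal = y.min(axis=0)  # worst value of each objective

    def coeff(reference):
        delta = np.abs(y - reference)
        d_min, d_max = delta.min(), delta.max()
        if d_max == 0:
            # All solutions coincide with the reference series.
            return np.ones_like(y)
        return (d_min + rho * d_max) / (delta + rho * d_max)

    return coeff(pos_ideal), coeff(neg_ideal)
```

A coefficient of 1 means the solution matches the reference on that objective; smaller values indicate a larger deviation, with the resolution factor controlling how strongly large deviations are compressed.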
Step 4: Construction of positive and negative prospective value functions.
On the basis of the value function of the cumulative prospect theory, the prospect value function of the Pareto solution can be constructed as follows:
where
and
are the positive and negative prospect value, respectively. If the negative ideal solution is used as the reference point, the Pareto solution
is superior to the negative ideal solution and has positive prospect utility value; if the positive ideal solution is used as the reference point, the Pareto solution
is inferior to the positive ideal solution and has negative prospect utility value.
Step 5: Modeling the value of integrated prospects.
Let the attribute weight functions of the Pareto solution for positive and negative prospects be
and
, respectively; then, the composite prospect value
of individual
can be expressed as
where
is the evaluation weight of the
objective,
and
can be calculated according to Equation (55). The integrated prospect value
is the sum of positive and negative prospect values, and the larger the prospect value, the better the quality of the Pareto solution.
To determine the evaluation weights, information entropy theory is introduced, which avoids the bias of subjectively assigned weights. The calculation proceeds as follows.
First, the proportion of each objective value is calculated from the normalized objective function values.
Then, the information entropy of each objective is calculated.
Finally, the evaluation weight of each objective is calculated.
where
.
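The three steps above correspond to the usual entropy weight method, which can be sketched as follows. The sketch assumes non-negative normalized objective values and at least two Pareto solutions; the function name `entropy_weights` and the sample data are illustrative and not taken from the paper.

```python
import math


def entropy_weights(norm_front):
    """Entropy-based objective weights for a normalized Pareto set (rows: solutions)."""
    n = len(norm_front)  # number of solutions, assumed >= 2
    divergences = []
    for col in zip(*norm_front):  # iterate over objectives
        total = sum(col)
        # Step 1: proportion of each solution within this objective.
        props = [x / total for x in col] if total > 0 else [1.0 / n] * n
        # Step 2: information entropy, scaled to [0, 1] by 1 / ln(n).
        entropy = -sum(p * math.log(p) for p in props if p > 0) / math.log(n)
        divergences.append(max(0.0, 1.0 - entropy))
    # Step 3: normalize the divergence degrees into weights.
    s = sum(divergences)
    if s == 0.0:  # all objectives equally uninformative
        return [1.0 / len(divergences)] * len(divergences)
    return [d / s for d in divergences]


# The second objective is identical across solutions, so it carries no information.
w = entropy_weights([[0.2, 0.5], [0.3, 0.5], [0.5, 0.5]])
print(w)
```

An objective whose values barely differ across the Pareto set has entropy close to 1 and therefore receives a weight close to 0, since it does little to distinguish the solutions.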
As the construction process shows, the comprehensive prospect value model is built on cumulative prospect theory. Grey relational analysis establishes the link between each Pareto solution and its prospect values, information entropy determines the evaluation weights of the individual objectives, and the magnitude of the comprehensive prospect value measures the quality of the Pareto solution: the greater the value, the better the solution. The comprehensive prospect value can therefore be used as the fitness of the evolutionary algorithm to guide the solution of the UAV swarm dynamic task allocation problem.
Figure 2 shows a flow chart of the CPT-PIO method.
The situation assessment mechanism provides a strategic evaluation of UAV posture and task suitability. The CPT-PIO component, in contrast, performs dynamic optimization to refine assignments and resolve conflicts. This layered structure ensures that assessment and optimization work in a complementary fashion.
6. Discussion
To verify the feasibility and effectiveness of the proposed method, a series of comparative simulation experiments is conducted. The scenario consists of one airfield, a UAV swarm, and a target swarm, with 40 UAVs in each swarm. The initial positions of the UAV swarm are randomly generated near (0,0,0), obeying a uniform distribution on [0,1000,500] m. The airfield is located at (10000,10000,0) m. The initial positions of the target swarm are randomly generated near the airfield, obeying a uniform distribution on [9000,10000,500] m. Each UAV is assigned five units of abstract operational resources, representing its limited capacity in energy, payload, or computational bandwidth; these units constrain the cumulative cost of its assigned tasks. The speeds of the UAV swarm are randomly generated from a uniform distribution over [50, 100] m/s, and those of the target swarm from a uniform distribution over [−100, −50] m/s. The simulation time step is Δt = 0.1 s.
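For readers who wish to reproduce a comparable scenario, the sketch below draws the initial states under one possible reading of the setup. It assumes that the horizontal coordinates of the UAV swarm are uniform on [0, 1000] m and those of the target swarm on [9000, 10000] m, that altitude is uniform on [0, 500] m for both, and that speed is a signed scalar. The helper `sample_swarm`, the state layout, and the random seed are hypothetical.

```python
import random

N_UAVS = 40                         # size of each swarm
AIRFIELD = (10000.0, 10000.0, 0.0)  # airfield position in metres
RESOURCES_PER_UAV = 5               # abstract operational resource units
DT = 0.1                            # simulation time step in seconds


def sample_swarm(n, xy_range, z_range, speed_range, rng):
    """Draw n states: an (x, y, z) position in metres and a scalar speed in m/s."""
    swarm = []
    for _ in range(n):
        position = (rng.uniform(*xy_range),
                    rng.uniform(*xy_range),
                    rng.uniform(*z_range))
        swarm.append({"position": position, "speed": rng.uniform(*speed_range)})
    return swarm


rng = random.Random(0)  # fixed seed (an assumption) so that runs are repeatable
uavs = sample_swarm(N_UAVS, (0.0, 1000.0), (0.0, 500.0), (50.0, 100.0), rng)
targets = sample_swarm(N_UAVS, (9000.0, 10000.0), (0.0, 500.0), (-100.0, -50.0), rng)
print(len(uavs), len(targets))  # 40 40
```

Passing an explicit `random.Random` instance keeps the scenario generation independent of any other random draws in the optimizer.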
The parameter settings of each component of the CPT-PIO algorithm are shown in
Table 1.
To validate the hybrid strategy, we conducted comparative experiments with CPT only, PIO only, and CPT-PIO. As shown in
Table 2, the proposed CPT-PIO hybrid method outperforms both the CPT-only and PIO-only baselines across multiple evaluation metrics. Specifically, CPT-PIO achieves the highest assessment score of 0.88, indicating superior situational awareness and strategic positioning among UAV groups. It also obtains the highest task completion rate at 93.1%, demonstrating better allocation decisions under uncertainty. Furthermore, CPT-PIO converges in only 96 steps, faster than the 120 steps required by the PIO-only approach. These results validate the effectiveness of combining risk preference modeling (via CPT) with global optimization (via PIO), providing a more adaptive and efficient solution for UAV swarm coordination.
The CPT-PIO algorithm is compared with the algorithms in [34], [35], and [36]. A classic decentralized auction mechanism (DAM) for distributed task allocation was presented in [34]. A classic response threshold allocation (RTA) algorithm was presented in [35]. A distributed dynamic task allocation method for UAV swarms based on a networked evolutionary game-theoretic (NEG) framework was proposed in [36]; it ensures convergence to optimal strategies through a payoff-based learning algorithm under dynamic task and agent conditions.
The grouping of the UAV swarm is plotted in
Figure 3. Grouping translates the identification of individual UAV targets into group-level objectives, which lays the foundation for subsequent task allocation. UAVs are spatially organized into distinct teams based on proximity and communication range, as illustrated by the shaded circular regions, and UAVs of the same color belong to the same group. The communication links shown between UAVs represent the information-sharing paths established within each group, ensuring coordinated actions and enhancing mission effectiveness. This grouping mechanism supports efficient resource utilization and stable swarm operations by minimizing intra-group interference and optimizing local interactions.
The dynamic evolution of the situation assessment indicator weights (
w1–
w7) over ten discrete time steps is illustrated in
Figure 4. It is evident that the weights of different indicators fluctuate within the range [0,1], highlighting their temporal adaptivity. For instance, the weight of the speed advantage indicator (
w5) remains consistently high, indicating its persistent importance in swarm dynamics. In contrast, weights such as
w2 (angular advantage) and
w6 (distance advantage) exhibit more moderate variations, reflecting changes in mission context or UAV state. This dynamic weighting strategy, determined by entropy-based analysis, ensures that the evaluation process remains responsive to the evolving confrontation scenario and avoids the subjective bias inherent in fixed-weight methods.
The time-varying velocity profiles of all UAVs in the swarm along the
x,
y, and
z spatial directions are illustrated in
Figure 5. Each curve corresponds to the velocity evolution of a single UAV, enabling observation of group-level motion coordination and dynamic behavior in three-dimensional space. The curves show that the velocity of the UAV swarm converges after a period of time.
The Gantt chart results are plotted in
Figure 6, where the labels represent the assigned target numbers. For example, the target sequence of group 5 in the UAV swarm is target UAVs #35, #9, #14, #1, #33, #19, #8, and #29; that is, these eight UAVs form one group in the target swarm.
As demonstrated in
Figure 6, the Gantt chart clearly illustrates the task execution timelines across different UAVs. Notably, several intervals show minimal idle states for the UAVs, indicating a high scheduling density and effective time utilization. Such compact and continuous task scheduling highlights the temporal efficiency of the proposed CPT-PIO method. Additionally, the distribution of tasks across UAVs ensures complete coverage without evident redundancy or mission neglect, further reinforcing the robustness and comprehensive effectiveness of the task allocation strategy.
CPT-PIO thus allocates tasks at the level of formation groups rather than individual UAVs, which provides more options and a better chance of effective allocation. This demonstrates the usability of CPT-PIO for the dynamic task allocation problem.
In terms of swarm situation assessment,
Figure 7 compares CPT-PIO with the other three methods, where all situation assessment values are computed using the method proposed in this paper.
Figure 7 reveals that the CPT-PIO algorithm consistently achieves higher situation assessment scores than RTA, DAM, and NEG. This improvement is attributed to three factors: the integration of cumulative prospect theory, which enables the algorithm to incorporate psychological preferences and risk tendencies; the use of information entropy and grey relational analysis, which ensures objective weighting of the assessment indicators; and the enhanced PIO search mechanism, which promotes broader exploration of the solution space and more effective task distribution. Together, these contribute to improved decision quality under dynamic and uncertain conditions.