Task Assignment of UAV Swarm Based on Wolf Pack Algorithm

Lu, Yingtong; Ma, Yaofei; Wang, Jiangyun; Han, Liang

doi:10.3390/app10238335

Open AccessArticle

Task Assignment of UAV Swarm Based on Wolf Pack Algorithm

School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China

^*

Author to whom correspondence should be addressed.

Appl. Sci. 2020, 10(23), 8335; https://doi.org/10.3390/app10238335

Submission received: 5 November 2020 / Revised: 20 November 2020 / Accepted: 21 November 2020 / Published: 24 November 2020

(This article belongs to the Section Aerospace Science and Engineering)

Download

Browse Figures

Versions Notes

Abstract

:

To perform air missions with an unmanned aerial vehicle (UAV) swarm is a significant trend in warfare. The task assignment among the UAV swarm is one of the key issues in such missions. This paper proposes PSO-GA-DWPA (discrete wolf pack algorithm with the principles of particle swarm optimization and genetic algorithm) to solve the task assignment of a UAV swarm with fast convergence speed. The PSO-GA-DWPA is confirmed with three different ground-attack scenarios by experiments. The comparative results show that the improved algorithm not only converges faster than the original WPA and PSO, but it also exhibits excellent search quality in high-dimensional space.

Keywords:

UAV swarm; task assignment; Wolf Pack Algorithm (WPA); PSO-GA-DWPA; ground-attack

1. Introduction

In the last two decades, how to use unmanned aerial vehicles (UAVs) for various military missions has received growing attention. Over the years, UAVs have been used in military missions, such as ground attack missions (GAM), wide area search missions (WASM), suppression of enemy air defense (SEAD), and combat intelligence, surveillance, and reconnaissance (ISR).

At present, research on UAVs has changed from focus on a large full-function UAV to a swarm of small single-function stealth UAVs, because the cost of a small single-function UAV is much lower than that of a large full-function UAV. The basic performance of UAV swarm operations is to meet the requirement of fast reliable assignment and execution, which is very complex to implement because of the large amount of computation.

The task assignment of a UAV swarm is usually viewed as an optimization problem, in which the optimal solution is found by maximizing or minimizing an objective function with constraints. The classical optimization models include a mixed integer linear programming (MILP) [1] model, the traveling salesman problem (TSP) [2] model, the generalized assignment problem (GAP) [3] model, the vehicle routing problem (VRP) [4] model and the cooperative multiple task assignment problem (CMTAP) [5] model.

The methods to solve optimization models are divided into two categories. One category is to get an exact solution by mathematical programming, such as the Hungarian algorithm [6], the branch and bound search algorithm [7], dynamic programming [8], the exhaustive method [9], Newton’s method [10] and the gradient method [11]. These methods can guarantee an optimal solution of a convex problem if there is a solution, but it is difficult to solve an NP-hard non-convex problem. Moreover, this kind of method is normally suitable for low-dimensional space.

Another category is evolutionary algorithms, which are inspired by the mechanism of nature’s evolution of a biological population. These methods usually do not have polynomial time complexity, but can approach a global optimal or suboptimal solution with finite calculation cost. In the task assignment of the UAV swarm, it is difficult to establish an accurate mathematical model and guarantee that the model is convex. Therefore, the evolutionary algorithms are the most common methods to solve this kind of problem. The classical evolutionary algorithms include the simulated annealing algorithm (SAA) [12], the tabu search algorithm (TS) [13], the genetic algorithm (GA) [14], the ant colony optimization algorithm (ACO) [15], and particle swarm optimization algorithm (PSO) [16].

Considering the UAV swarm operation in the future, task assignment is challenging due to its computational complexity caused by two reasons. The first is the complexity of the task assignment model, including determining the order of tasks, the constraints and the coupling between task assignment and trajectory optimization. The second is the large amount of computation and long solution time caused by the dimension of the variables. For example, in a scenario where 80 aircraft attack 100 targets, assuming that only a single attack task is performed and each target is attacked only once, the variables describing the UAVs assigned to all targets, reach 100 dimensions, which greatly increases the computation. Jia et al. [17] and Su [18] improved the variable’s coding method where the constraint information was added so as to overcome the model complexity caused by constraints. The co-evolutionary ant colony algorithm was presented in [19] to decompose a high-dimensional variable into several low-dimensional variables. This method can effectively improve the speed of solving the algorithm itself, but it is necessary to fuse all UAVs’ decisions to meet the constraints of task assignment, such as the order of tasks. In [20], by integrating PSO with the beetle antennae search (BAS) algorithm, which can overcome the defect that PSO is prone to fall into local convergence, the convergence speed and solution accuracy of the algorithm are improved. Both in [18,20], the fixed parameters of PSO are improved to be linear or dynamic so as to effectively accelerate the convergence speed. Braun [21] developed the island model, where several populations isolated on an island are optimized by GA until they degenerate. Then the degeneration is removed by refreshing the population on each island through individuals of other islands. This method effectively solves the problem of population degradation. To mitigate the parameter tuning on high-dimensional problems, which is a computationally expensive procedure, Tuani et al. [22] introduced an adaptive approach to a heterogeneous ant colony population that evolves the alpha and beta controlling parameters for ACO.

In addition, distributed task assignment methods were proposed to improve the computation efficiency to overcome the problem of large computation and long solution time. Matin et al. [23] proposed minimum distance greedy search (MDGS) to make the decision for a UAV to select the targets, when the arrivals are dynamic and appear uniformly in a known rectangular region. This approach is currently very effective in the field of dynamic task assignment for individual UAVs. Di et al. [24] used the distributed auction algorithm, in which each UAV decides their own task goals according to their own local information, to reduce the computational load of the system to a certain extent. Similarly, a market-based decentralized algorithm was proposed by Oh et al. [25] to realize online task assignment. Zhang et al. [26] used a contract network to rapidly update the task assignment scheme after pre-assignment based on PSO. Such methods did not necessarily find the optimal solution based on the global information, so the optimal assignment of resources cannot be realized.

Despite the great progress in improving efficiency, dimension explosion is still the primary concern of these methods. High-dimension combinatorial optimization is still a difficult task assignment problem. One possible direction is to establish a hybrid method to overcome the defects of each individual method, such as PSO-SA [27], PSO-GA [28,29], and DPSO-GT [30,31]. Simulation experiments have demonstrated the effectiveness of these methods, when there are under 30 dimensions. The other direction is to design a more efficient algorithm, especially for high- dimension problems. Wu et al. [32] proposed a new heuristic swarm intelligent algorithm named the wolf pack algorithm (WPA), based on a swarm of intelligent wolves. Simulation results showed that the WPA is especially suitable for solving high-dimension and multimodal function optimization problems. Subsequently, the algorithm was applied to some classical optimization problems and achieved good results [33,34]. However, whether the WPA with excellent performance in solving high-dimensional continuous problems is suitable for discrete optimization problems such as task assignment needs further verification.

The main contribution of this paper is the application of WPA in high-dimension task assignment and the improvement on the WPA with the ideas of PSO and a GA to accelerate convergence speed. The remainder of this paper is organized as follows. Section 2 formalizes the UAV swarm task assignment problem with respect to performance requirements. The principles of the WPA and the improvement of four aspects of the WPA are described in Section 3. In Section 4, the PSO-GA-DWPA is compared with the WPA and PSO by simulation experiments in three task scenarios. Finally, Section 5 concludes the paper and briefly explores future work.

2. Task Assignment Model

In this section, a background incorporating a UAV swarm that executes an attack mission in a large field is presented. Then, a mathematical formulation is established.

There are

N_{T}

static targets whose initial positions have been revealed, and

N_{V}

attacking UAVs planning to attack these targets with the minimum cost.

In actual task assignment for ground attack, the task assignment model is very complex; it includes many factors, such as fuel constraints, ammunition quantity, timing constraints between multiple tasks, UAV mobility, threats, and redundancy tolerance. The factors considered in different combat scenarios are different, so the task assignment model varies with the actual situation. This paper considers a more general task assignment model which is not limited to a specific scenario and a specific UAV type. The point is to verify whether the proposed PSO-GA-DWPA is suitable for solving high-dimensional discrete optimization problems.

To avoid the influence of uncertain factors on the analysis of simulation results, the following ideal assumptions are set up to simplify the formalization of the problem.

There are no obstacles and no-fly zones in the task scenario. A flight trajectory can be described as connections of straight lines.
It does not take into account the consumption of time spent preparing and firing the weapon. In other words, only the time the UAV swarm takes to reach the target positions is considered.
The UAV swarm maintains the same constant velocity and hence the flight time can be represented with the flight distance.
Each target can be attacked only once. There are more targets than UAVs in the swarm, which means each UAV will probably be assigned multiple targets.

The targets can be expressed as

T = {T_{1}, T_{2}, \dots, T_{N_{T}}}

, and the attacking UAV swarm can be expressed as

V = {V_{1}, V_{2}, \dots, V_{N_{V}}}

.

The task assignment plan of UAV

V_{i} \in V

is

\begin{array}{l} p l a n_{i} = {s t a g e_{1}^{(i)}, s t a g e_{2}^{(i)}, \dots, s t a g e_{N_{m i}}^{(i)}} \\ s t a g e_{k}^{(i)} = (T_{j}, L_{k}^{(i)}), T_{j} \in T \end{array},

(1)

where

s t a g e_{k}^{(i)}

is the stage

k

of UAV

V_{i} \in V

,

N_{m i}

is the task number assigned to UAV

V_{i} \in V

,

T_{j}

is the target of the UAV

V_{i} \in V

at stage

k

, and

L_{k}^{(i)}

is the distance to

T_{j}

of UAV

V_{i} \in V

at stage

k

.

Without loss of generality, two cost indicators that must be considered in task assignment are adopted: the shortest total range of all UAVs (i.e., the lowest fuel cost) and the shortest time to complete all tasks.

Total range of all UAVs

$J_{1} = \sum_{i = 1}^{N_{V}} \sum_{k = 1}^{N_{m i}} L_{k}^{(i)} .$

(2)

For all UAVs in the swarm, the average range is

$J_{1}^{*} = \frac{\sum_{i = 1}^{N_{V}} \sum_{k = 1}^{N_{m i}} L_{k}^{(i)}}{N_{V}} .$

(3)
Time to complete all tasks

$J_{2} = \max_{i \in V} T i m e_{i},$

(4)

where $T i m e_{i}$ is the time of UAV $V_{i} \in V$ to finish all of its tasks. With consideration of the same constant velocity for all UAVs, the time of UAV $V_{i} \in V$ to finish all of its tasks can be expressed as its total range. Therefore, Equation (4) can be represented as Equation (5).

$J_{2}^{*} = \max_{i \in V} \sum_{k = 1}^{N_{m i}} L_{k}^{(i)} .$

(5)

The optimization goal is to minimize both the total range of all UAVs and the maximum range among all UAVs.

Thus, the cost function of task assignment is

\min J = ω_{1} J_{1}^{*} + ω_{2} J_{2}^{*} = ω_{1} \frac{\sum_{i = 1}^{N_{V}} \sum_{k = 1}^{N_{m i}} L_{k}^{(i)}}{N_{V}} + ω_{2} \max_{i \in V} \sum_{k = 1}^{N_{m i}} L_{k}^{(i)},

(6)

where

ω_{1}

and

ω_{2}

are weighting factors reflecting the importance of each performance criterion, decided by the commander, and

ω_{1} + ω_{2} = 1

3. The Wolf Pack Algorithm

3.1. The Basics of the Wolf Pack Algorithm

The WPA adopts the bottom-up design principle and simulates wolves hunting cooperatively according to their division system of responsibilities as shown in Figure 1. Wolves are divided into a leader wolf, exploring wolves and fierce wolves, and they share information with each other. The leader wolf is closest to the prey because it is most sensitive to smell, so it is the guidance of the wolf pack. Exploring wolves are responsible for detecting the environment to find the position of the prey, due to their sensitivity to smell. Fierce wolves are responsible for getting close to catch the prey quickly.

The problem space is defined as an

N \times D

Euclidean space, where

N

is the number of wolves in the wolf pack, and

D

is the number of each wolf’s information dimension. The position of the wolf

i \in N

is expressed as

X_{i} = {x_{i 1}, \dots x_{i d}, \dots, x_{i D}}

, where

x_{i d}

is the value in

d^{t h} (d \in D)

dimension of wolf

i \in N

, the smell concentration of prey perceived by wolf is described as the cost function value. The distance between the wolf

p

and

q

is defined as the Manhattan distance.

L (p, q) = \sum_{d = 1}^{D} | x_{p d} - x_{q d} | .

(7)

WPA is shown in Algorithm 1. The whole process of wolf pack hunting can be abstracted into 3 features:

the selection of the leader wolf based on the winner-take-all rule;
three cooperative behaviors including walking, calling and sieging;
an update mechanism based on the strongest-survives law.

Alorithm 1: WPA

3.1.1. The Selection of Leader Wolf Based on Winner-Take-All Rule

The wolf with the minimum cost can be the leader wolf at any time. Initially, the wolf with the minimum cost is chosen as the leader wolf. In the subsequent stages, if the cost of some wolf is less than that of the current leader wolf, that wolf will be the new leader. If multiple wolves are the best at the same time, pick one at random to be the leader wolf. The leader wolf goes directly to the next iteration without executing the following cooperative behaviors, until it is replaced by a new leader wolf.

3.1.2. Three Cooperative Behaviors

Walking behavior. $S$ suboptimal wolves are selected as exploring wolves to perform the walking behavior. $S$ is a random integer picked from $[\frac{N}{α + 1}, \frac{N}{α}]$ , where $α$ is the scale factor of exploring wolves. Starting from the current position, exploring wolf $i \in N$ makes one step forward towards $h$ directions. On account of individual differences, $h$ generally takes a random integer within a limited range. The new position to the $b^{t h} (b \in h)$ direction is obtained by

$X_{i}^{b} = X_{i} + \sin (2 π \times \frac{b}{h}) \times s t e p_{a},$

(8)

where $s t e p_{a}$ is the length of walking step. Exploring wolf $i \in N$ will return to its initial position after detecting each direction, then choose the best one to update its position. All exploring wolves will keep walking until the satisfy one of the following conditions.

As long as one exploring wolf’s new position is better than that of the leader wolf, this exploring wolf will be the new leader wolf and the wolf pack will move to the calling behavior.
When walking times reach the maximum $T_{\max}$ , the wolf pack move to the calling behavior.

2.: Calling behavior. The leader wolf calls for $M (M = N - S - 1)$ nearest wolves as fierce wolves to get close to its position rapidly with a large step. The new position of fierce wolf $i \in N$ is obtained by

$X_{i}^{k + 1} = X_{i}^{k} + s t e p_{b} \times \frac{X_{l e a d e r}^{k} - X_{i}^{k}}{| X_{l e a d e r}^{k} - X_{i}^{k} |},$

(9)

where $s t e p_{b}$ is the length of raid step. Fierce wolves will continue to get close to the leader wolf until one of the following conditions is satisfied.

As long as one fierce wolf’s new position is better than that of the leader wolf, this fierce wolf will be the new leader and the wolf pack moves to the sieging behavior.
The Manhattan distance between a fierce wolf and the leader wolf is less than the threshold distance $d_{n e a r}$ . $d_{n e a r}$ is estimated according to

$d_{n e a r} = \frac{1}{D \times ω} \times \sum_{d = 1}^{D} | \max_{d} - \min_{d} |,$

(10)

where $ω$ is the factor representing the threshold distance and $[\min_{d}, \max_{d}]$ is the variable’s domain in the $d^{t h}$ dimension.

3.: Sieging behavior. The leader wolf, whose position is treated as the prey’s position, guides all other wolves to siege the prey with a small step. For iteration $k$ , the position of the prey is $X_{l e a d e r}^{k}$ ; then the new position of wolf $i \in N$ is updated according to

$X_{i}^{k + 1} = X_{i}^{k} + λ \cdot s t e p_{c} \cdot | X_{l e a d e r}^{k} - X_{i}^{k} |,$

(11)

where $λ$ is a random real number in $[- 1, 1]$ , and $s t e p_{c}$ is the length of the siege step. For any wolf, if its new position is better than the current position, its position will be updated, otherwise its current position will be kept. The wolf whose position is the best will be chosen as the leader wolf.

3.1.3. Update Mechanism Based on Strong-Survive Law

Prey is distributed according to the principle of strong to weak, which will cause the weakest wolves to starve without food, that is to say, the wolves too far away from the prey will be eliminated. In the algorithm, the

R

weakest wolves will be eliminated from the population and extra new

R

wolves will be generated randomly. The larger

R

is, the more wolves will be generated, which is conducive to maintaining the diversity of individuals in the population. However, if

R

is too large, the algorithm tends to perform a random search. On the contrary, if

R

is too small, it is not conducive to maintaining the diversity of individuals in the population, and the ability of the algorithm to open up a new solution space is weakened. Since the size and number of prey are different with each capture, the number of wolves starving to death varies.

R

is a random integer in

[\frac{N}{2 β}, \frac{N}{β}]

, where

β

is the scale factor of the wolf population update.

3.2. The Proposed PSO-GA-DWPA

The WPA is suitable for continuous problems where the variable in each dimension changes continuously. However, the task assignment problem is a discrete problem in which the variable in each dimension is an integer belonging to a set. As such, based on the rules of the WPA, it is necessary to improve the WPA to match the integer discrete character of the task assignment. In addition, in order to improve the convergence speed and solution accuracy, PSO and a GA are introduced into WPA. The details of the PSO-GA-DWPA are as follows and the algorithm is shown in Algorithm 2.

3.2.1. Integer Matrix Coding

Integer matrix coding is adopted to express the assignment schema

X_{i} = [\begin{array}{l} x_{i 1} & x_{i 2} & \dots & x_{i j} & \dots & x_{i N_{T}} \\ y_{i 1} & y_{i 2} & \dots & y_{i j} & \dots & y_{i N_{T}} \end{array}],

(12)

where

X_{i}

is the position of the wolf

i \in N

,

N

is the population number of wolves, the dimensions of variable is

N_{T}

because there are more targets than UAVs and each target can only be attacked once. The 1st row of Equation (12) is the index of the UAVs which perform attacking tasks where

x_{i j} \in V

, and elements in this row can be repeated because a UAV can attack multiple targets. The 2ed row of Equation (12) is the index of targets, where

y_{i j} \in T

. For the same UAV, the order in which the target number appears is the attack order. For example, if there are 5 UAVs attacking 8 targets and the coding matrix is shown as

X_{i} = [\begin{array}{l} 1 & 3 & 2 & 2 & 1 & 1 & 3 & 3 \\ 3 & 5 & 1 & 4 & 2 & 8 & 7 & 6 \end{array}]

, then the task assignment scheme is as follows:

p l a n_{1} = {(3, L_{1}^{(1)}), (2, L_{2}^{(1)}), (8, L_{2}^{(1)})}

,

p l a n_{2} = {(1, L_{1}^{(2)}), (4, L_{2}^{(2)})}

, and

p l a n_{3} = {(5, L_{1}^{(3)}), (7, L_{2}^{(3)}), (6, L_{2}^{(3)})}

.

3.2.2. Improvement on Walking Behavior

The walking behavior of wolves is essentially an active exploration of the unknown environment. This process determines the extent of the current best solution approaches to the global optimal solution. In the WPA, each exploring wolf scouts the prey from h directions divided equally by 360 degrees as Equation (8), and adjusts the coverage of the scout by changing the size of

h

. The bigger

h

is, the larger the coverage of the scout will be, but the speed of the scout will be relatively slower. To scout the prey more effectively, PSO is introduced in this paper to make the walking process completed from as few directions as possible, under the guidance of the global extremum and the individual extremum.

Each exploring wolf is represented as one particle in the PSO. The implementation of PSO is divided into three steps [20]: tracking individual extremum, tracking global extremum and individual variation.

Tracking individual extremum

This part of the formula is expressed in the discrete domain as Equation (13). This represents a copy operation with probability

c_{1}

.

X_{i}^{k}

represents the position of wolf

i \in S

in iteration

k

.

P_{i}^{k}

is the optimal solution of wolf

i \in S

according to the current scouted directions, representing the individual extremum of wolf

i \in S

in iteration

k

. Note that

φ_{i}^{k}

is a temporary variable.

φ_{i}^{k} = c_{1} \otimes F_{1} (X_{i}^{k}, P_{i}^{k}) = {\begin{matrix} F_{1} (X_{i}^{k}, P_{i}^{k}), & ε = r a n d () < c_{1} \\ X_{i}^{k}, & o t h e r w i s e \end{matrix} .

(13)

During the operation in Figure 2, a random number

ε \in [0, 1]

is generated firstly. If

ε < c_{1}

, where

c_{1}

is the threshold to determine crossover, two random integers

a, b \in [1, N_{T}]

are generated. The columns between the index range

[a, b]

of

X_{i}^{k}

will be exchanged with the counterpart in the individual extremum

P_{i}^{k}

. The result of this operation is

φ_{i}^{k}

. On the other hand, if

ε \geq c_{1}

,

φ_{i}^{k} = X_{i}^{k}

.

2.: Tracking global extremum

Equation (14) represents a copy operation with probability

c_{2}

.

P_{g}^{k}

is the position of the leader wolf, representing the global extremum in iteration

k

.

λ_{i}^{k}

is a temporary variable.

λ_{i}^{k} = c_{2} \otimes F_{1} (φ_{i}^{k}, P_{g}^{k}) = {\begin{matrix} F_{1} (φ_{i}^{k}, P_{g}^{k}), & ε = r a n d () < c_{2} \\ φ_{i}^{k}, & o t h e r w i s e \end{matrix} .

(14)

During the operation in Figure 3, a random number

ε \in [0, 1]

is generated firstly. If

ε < c_{2}

, where

c_{2}

is the threshold to determine crossover, two random integers

a, b \in [1, N_{T}]

are generated. The columns between the index range

[a, b]

of

φ_{i}^{k}

will be exchanged with the counterpart in the global extremum

P_{g}^{k}

. The result of this operation is

λ_{i}^{k}

. On the other hand, if

ε \geq c_{2}

,

λ_{i}^{k} = φ_{i}^{k}

.

3.: Individual variation

This part of the formula is described in the discrete domain as Equation (15), where

s t e p_{a}

is the walking step, which determines how many columns of

λ_{i}^{k}

will mutate.

W_{i}^{k}

is a temporary variable representing the result of the walking behavior.

W_{i}^{k} = F_{2} (λ_{i}^{k}, Γ (s t e p_{a})) .

(15)

During the operation in Figure 4, function

Γ (s t e p_{a})

randomly selects

s t e p_{a}

columns in the

λ_{i}^{k}

. The values of the

1^{s t}

row in these columns are randomly selected from

[1, N_{V}]

, while the values of the second row are randomly exchanged.

3.2.3. Improvement on Calling Behavior

The calling behavior is an essential process for fierce wolves to converge to the current optimal solution which is the position of the leader wolf. This behavior determines the convergence rate of the algorithm. Inspired by gene segment duplication in the GA, each fierce wolf duplicates a piece of the leader’s position to replace its own to realize the fast approach of the leader wolf.

This part of the formula is expressed in the discrete domain as Equation (16).

P_{g}^{k}

is the position of the leader wolf, representing the global extremum in iteration

k

,

s t e p_{b}

is the raid step, and

C_{i}^{k}

(

i \in M

) is a temporary variable representing the result of the calling behavior.

C_{i}^{k} = F_{1} (X_{i}^{k}, P_{g}^{k}, s t e p_{b}) .

(16)

During the operation in Figure 5, a continuous segment of length

s t e p_{b}

is randomly intercepted from the wolf

i \in M

, then, is replaced by the corresponding segment of the leader wolf.

3.2.4. Improvement on Sieging Behavior

This part of the formula is expressed in the discrete domain as Equation (17), where

s t e p_{c}

is siege step, and

X_{i}^{k + 1}

is the result of the siege behavior.

X_{i}^{k + 1} = F_{2} (X, Γ (s t e p_{c})) = {\begin{matrix} F_{2} (W_{i}^{k}, Γ (s t e p_{c})), & i \in S \\ F_{2} (C_{i}^{k}, Γ (s t e p_{c})), & i \in M \end{matrix} .

(17)

During the operation in Figure 6, Function

Γ (s t e p_{c})

randomly selects

s t e p_{c}

columns in

X

. The values of the first row in these columns are randomly selected from

[1, N_{V}]

, while the values of the second row are randomly exchanged.

3.2.5. Improvement in Wolf Population Update

In WPA,

R

weakest wolves are eliminated and replaced by new wolves generated randomly just as the initialization. Experiments show that new wolves generated by this method do not have competitiveness, and they are finally eliminated and meaningless in the algorithm. In the PSO-GA-DWPA, a new wolf is generated by using the method of small variation of the leader wolf. The process is similar to individual variation in walking behavior.

This is expressed in the discrete domain as Equation (18), where

s t e p_{d}

is variation step, and

X_{i}^{k + 1}

is the position of new wolf.

X_{i}^{k + 1} = F_{2} (P_{g}^{k}, Γ (s t e p_{d})) .

(18)

During the operation in Figure 7, Function

Γ (s t e p_{d})

randomly selects

s t e p_{d}

columns in

P_{g}^{k}

. The values of the first row in these columns are randomly selected from

[1, N_{V}]

, while the values of the second row are randomly exchanged.

In this way, the new wolf can not only be competitive, but can also conduct local optimization at the position of the leader wolf to avoid local convergence.

Alorithm 2: POS-GA-DWPA

4. Experiments of Task Assignment for UAV Swarm Using PSO-GA-DWPA

The performance of the PSO-GA-DWPA is analyzed in this section using simulation according to the assumptions in Section 2. By setting three scenarios with different numbers of UAVs and targets, we verified the applicable scope and efficiency of the algorithm. The parameters used for the simulation and the PSO-GA-DWPA are summarized in Table 1.

4.1. Monte Carlo Simulation in Different Scenarios

A Monte Carlo study, consisting of 50 runs, is used in this section to compare the performance of the PSO-GA-DWPA, WPA and PSO algorithms with respect to the cost function of Equation (6) where

ω_{1} = ω_{2} = 0.5

. Three scenarios are examined: 5 UAVs on 8 targets, 20 UAVs on 30 targets and 100 UAVs on 150 targets. The initial locations of the UAVs and targets were generated randomly as shown in Figure 8.

The algorithms are coded in Python, and all simulations are run on a computer with a 2.50-GHz Intel Core i5 CPU and 4 GB of RAM.

First, a small size scenario of 5 UAVs on 8 targets is analyzed. In Figure 9 and Figure 10, the cost curve and the running time curve (the intersection of the solid line and dotted line indicates the convergence time) are plotted to compare the convergence performances of the three methods. Figure 9 shows that the optimization capabilities of the three algorithms are similar, but the convergence speed of the WPA and the PSO-GA-DWPA is better than that of PSO. Also, PSO is time-consuming and WPA and PSO-GA-DWPA is time-saving. Since the calculation process of the PSO-GA-DWPA is more complicated than that of the WPA, running time for each iteration of the PSO-GA-DWPA is slightly longer than that of the WPA. However, the PSO-GA-DWPA needs less number of iterations to converge, so the convergence time is shorter than that of the WPA.

Compared with the recognized fast and accurate PSO algorithm, the advantage of the WPA and the proposed PSO-GA-WPA in a low-dimensional solution space (such as the 8-dimensional solution space in scenario 1), is not obvious. The experiments in Scenario 1 prove the availability of the WPA and the PSO-GA-WPA for task assignment problem. The higher the dimension of the solution space is, the advantage of the proposed algorithm is more obvious. Compared with the WPA, the PSO-GA-WPA has the advantage of a faster convergence speed. Although their convergence costs are similar, the PSO-GA-WPA can obtain a convergent solution in a shorter time.

Second, we set the size as 20 UAVs on 30 targets, in other words, the dimension of the optimization problem is 30. As shown in Figure 10, it is obvious that the optimization performances of the WPA and the PSO-GA-DWPA are much better than that of the PSO. It can be seen that the WPA and the PSO-GA-DWPA can obtain a better solution in a shorter time. PSO-GA-DWPA has the same optimization performance as WPA, however, its convergence speed is better than that of the WPA.

Finally, the dimension of the optimization problem is set to be 150, that is, the scenario is 100 UAVs on 150 targets. The convergence performance and running time of different measures are shown in Figure 11. With the increase of dimension, PSO has become ineffective, (i) the optimal solution cannot be obtained; and (ii) the calculation speed is too slow, and the time required for 1000 iterations is nearly 2.5 times that of the other two algorithms. Running time for each iteration of the PSO-GA-DWPA is slightly longer than that of the WPA, but its convergence speed in the early stage is obviously faster, so its convergence time is shorter than that of WPA.

The comparison of the three methods in 3 scenarios is shown in Table 2, which indicates visually that the overall performance of the PSO-GA-DWPA is better than that of the other two methods. According to Figure 9, Figure 10 and Figure 11 and Table 2, compared with the PSO, which is recognized as fast and accurate, the advantage of the WPA and the proposed PSO-GA-WPA in the low-dimensional solution space (such as the 8-dimensional solution space in Scenario 1) is not obvious. The experiments in Scenario 1 prove the availability of the WPA and the PSO-GA-WPA for task assignment problem. The higher the dimension of the solution space is, the advantage of the proposed algorithm is more obvious. Compared with the WPA, the PSO-GA-WPA has the advantage of a faster convergence speed. Although their convergence costs are similar, the PSO-GA-WPA can obtain a convergent solution in a shorter time.

4.2. The Real-Time Analysis of the PSO-GA-DWPA

In practical application, the situation of the battlefield is rapidly changing; as a result, the time of task assignment is limited. When the optimization problem scale is small, the global optimal solution or sub-optimal solution can be obtained within the specified time. However, if the dimension of the optimization problem is high, it is impossible to get the optimal solution in the specified time. In this case, an algorithm with a faster convergence speed is required in the early stage to obtain a relatively reasonable solution.

As shown in Figure 9b, to get the convergent solution of the optimization problem with 8 dimensions, the WPA needs 5 s and the PSO-GA-DWPA needs 2 s, which meets the requirement of real time. While as shown in Figure 10b and Figure 11b, the WPA needs 70 s and the PSO-GA-DWPA needs 30 s when the dimension of the problem is 30, even worse, the WPA needs 2500 s and the PSO-GA-DWPA needs 1800 s when the dimension of the problem goes up to 150. Hence in practice, a compromise must be made between the optimal solution and the convergence time.

In the 150-dimensional scenario, it is required to make the decision of task assignment within a specified time. Fifty experiments with three methods were conducted to compare the adaptability of the three methods. The experimental results are shown in Figure 12. As can be seen from the results in the figure, the PSO-GA-DWPA obtains a better solution within the specified time. If we need to obtain a better but not optimal solution in finite time, the PSO-GA-DWPA is a better choice.

5. Conclusions

Aiming at the problem of the UAV swarm task assignment, this paper introduces a novel swarm intelligence algorithm, the wolf pack algorithm (WPA), which is particularly suitable for high-dimensional continuous optimization problems. In order to apply this method to the discrete optimization problem and improve the convergence speed, this paper proposes the PSO-GA-DWPA based on matrix coding and the principles of particle swarm optimization and genetic algorithm. According to a general task assignment model and simulation experiments, this method proved to be applicable for high-dimensional task assignment.

The simulation results show that in the case of a small dimension such as Scenario 1, the PSO-GA-DWPA has almost the same optimization ability as the PSO. However, when the dimension increases to 30 in Scenario 2, the PSO-GA-DWPA is obviously superior to PSO both in terms of convergence speed and solution accuracy. When the dimension increases to 150 in Scenario 3, the PSO fails, while the PSO-GA-DWPA can still obtain an effective solution. Therefore, this method can serve as a useful reference for the UAV swarm task assignment.

Author Contributions

Conceptualization, Y.L.; methodology, Y.L.; software, Y.L.; validation, Y.L., Y.M. and J.W.; formal analysis, Y.L.; investigation, Y.L.; resources, Y.L., Y.M., J.W. and L.H.; data curation, Y.L.; writing—original draft preparation, Y.L.; writing—review and editing, Y.L.; visualization, Y.L.; supervision, Y.M., J.W. and L.H.; project administration, Y.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Acknowledgments

The authors would like to thank the anonymous reviewers for their constructive comments and suggestions, which improved the quality of the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

Peters, J.R.; Surana, A.; Bullo, F. Robust Scheduling and Routing for Collaborative Human/Unmanned Aerial Vehicle Surveillance Missions. J. Aerosp. Inf. Syst. 2018, 15, 1–19. [Google Scholar] [CrossRef]
Ran, H.; Zhou, R.; Dong, Z. Cooperative guidance of multi aircraft in beyond-visual-range air combat. J. Beijing Univ. Aeronaut. Astronaut. 2014, 40, 1457–1462. [Google Scholar]
Cattrysse, D.; van Wassenhove, L. A survey of algorithms for the generalized assignment problem. Eur. J. Oper. Res. 1992, 60, 260–272. [Google Scholar] [CrossRef] [Green Version]
Drexl, M. Synchronization in vehicle routing-a survey of VRPs with multiple synchronization constraints. Transp. Sci. 2012, 46, 297–316. [Google Scholar] [CrossRef] [Green Version]
Shima, T.; Rasmussen, S.J.; Sparks, A.G.; Passino, K.M. Multiple task assignments for cooperating uninhabited aerial vehicles using genetic algorithms. Comput. Oper. Res. 2006, 33, 3252–3269. [Google Scholar] [CrossRef]
Kuhn, H.W. The Hungarian method for the assignment problem. Nav. Res. Logist. Q. 1955, 2, 83–97. [Google Scholar] [CrossRef] [Green Version]
Shima, T.; Rasmussen, S.J. Tree search algorithm for assigning cooperating UAVs to multiple tasks. Int. J. Robust Nonlinear Control 2008, 18, 135–153. [Google Scholar]
Alighanbari, M.; How, J. Cooperative task assignment of unmanned aerial vehicles in adversarial environments. In Proceedings of the American Control Conference, Portland, OR, USA, 8–10 June 2005; pp. 4661–4666. [Google Scholar]
Ling, X. The approximate optimal solution of the traveling salesman problem is obtained by the optimal exhaustive method. Comput. Appl. Res. 1998, 15, 82–83. [Google Scholar]
Lipson, J.D. Newton’s method: A great algebraic algorithm. In Proceedings of the Third ACM Symposium on Symbolic & Algebraic Computation, Yorktown Heights, NY, USA, 10 August 1976; pp. 260–270. [Google Scholar]
Ji, S.; Ye, J. An accelerated gradient method for trace norm minimization. In Proceedings of the 26th Annual International Conference on Machine Learning, Montreal, QC, Canada, 14–18 June 2009; pp. 457–464. [Google Scholar]
Wang, C.; Mu, D.; Zhao, F.; Sutherland, J.W. A parallel simulated annealing method for the vehicle routing problem with simultaneous pickup-delivery and time windows. Comput. Ind. Eng. 2015, 83, 111–122. [Google Scholar] [CrossRef]
Diaz, J.A.; Fernandez, E. A Tabu search heuristic for the generalized assignment problem. Eur. J. Oper. Res. 2011, 132, 22–38. [Google Scholar] [CrossRef]
Shima, T.; Rasmussen, S.J.; Sparks, A.G. UAV cooperative multiple task assignments using genetic algorithms. In Proceedings of the American Control Conference, Portland, OR, USA, 8–10 June 2005; pp. 2989–2994. [Google Scholar]
Boveiri, H.R. An incremental ant colony optimization based approach to task assignment to processors for multiprocessor scheduling. Front. Inf. Technol. Electron. Eng. 2017, 18, 498–510. [Google Scholar] [CrossRef]
Sujit, P.B.; George, J.M.; Beard, R. Multiple UAV task allocation using particle swarm optimization. In Proceedings of the AIAA Guidance, Navigation and Control Conference and Exhibit, Honolulu, HI, USA, 18–21 August 2008; pp. 1–9. [Google Scholar]
Jia, Z.; Yu, J.; Ai, X.; Xu, X.; Yang, D. Cooperative multiple task assignment problem with stochastic velocities and time windows for heterogeneous unmanned aerial vehicles using a genetic algorithm. Aerosp. Sci. Technol. 2018, 76, 112–125. [Google Scholar] [CrossRef]
Liang, G.; Kang, Y.; Xing, Z.; Yin, G. UAV Cooperative Multi-task Assignment Based on Discrete Particle Swarm Optimization Algorithm. Comput. Simul. 2018, 35, 22–28. [Google Scholar]
Su, F. Research on Distributed Online Cooperative Mission Planning for Multiple Unmanned Combat Aerial Vehicles in Dynamic Environment. PhD Thesis, National University of Defense Technology, Changsha, China, 2013. [Google Scholar]
Wu, X. Coordination Tasks Pre-Allocation and Redistribution Studies in UAVs. Master’s Thesis, Nanchang Hangkong University, Nanchang, China, 2018. [Google Scholar]
Braun, H. On solving travelling salesman problems by genetic algorithms. In International Conference on Parallel Problem Solving from Nature; Springer: Berlin/Heidelberg, Germany, 1990; pp. 129–133. [Google Scholar]
Tuani, A.F.; Keedwell, E.; Collett, M. Heterogenous Adaptive Ant Colony Optimization with 3-opt local search for the Travelling Salesman Problem. Appl. Soft Comput. 2020, 106720, 1–14. [Google Scholar] [CrossRef]
Matin, H.N.; Yekkehkhany, A.; Nagi, R. Probabilistic Analysis of UAV Routing with Dynamically Arriving Targets. In Proceedings of the 22th International Conference on Information Fusion, Ottawa, ON, Canada, 2–5 July 2019; pp. 1–8. [Google Scholar]
Di, B.; Rui, Z.; Jiang, W. Distributed coordinated heterogeneous task allocation for unmanned aerial vehicles. Control Decis. 2013, 28, 274–278. [Google Scholar]
Oh, G.; Kim, Y.; Ahn, J.; Choi, H.-L. Market-Based Task Assignment for Cooperative Timing Missions in Dynamic Environments. J. Intell. Robot. Syst. 2017, 87, 97–123. [Google Scholar] [CrossRef]
Zhang, J.; Wang, Y.; Li, F.; Zhou, T. Dynamic Task Assignment Problem of Multi-agent. Electron. Technol. Softw. Eng. 2018, 18, 255–258. [Google Scholar]
Li, Y.; Dong, Y. Weapon-target assignment based on simulated annealing and discrete particle swarm optimization in cooperative air combat. Acta Aeronaut. Et Astronaut. Sin. 2010, 31, 626–631. [Google Scholar]
Kumar, N.; Vidyarthi, D.P. A novel hybrid PSO-GA metaheuristic for scheduling of DAG with communication on multiprocessor systems. Eng. Comput. 2016, 32, 35–47. [Google Scholar] [CrossRef]
López LF, M.; Blas, N.G.; Albert, A.A. Multidimensional knapsack problem optimization using a binary particle swarm model with genetic operations. Soft Comput. 2018, 22, 2567–2582. [Google Scholar] [CrossRef]
Guo, T.; Michalewicz, Z. Inver-over operator for the TSP. In International Conference on Parallel Problem Solving from Nature; Springer: Heidelberg, Germany, 1998; pp. 803–812. [Google Scholar]
Yan, J.; Li, X.; Liu, B. Cooperative task allocation of multi-UAVs with mixed DPSO-GT algorithm. J. Natl. Univ. Def. Technol. 2015, 37, 165–171. [Google Scholar]
Wu, H.; Zhang, F.; Wu, L. New Swarm Intelligence Algorithm-Wolf Pack Algorithm. Syst. Eng. Electron. 2013, 35, 2430–2437. [Google Scholar]
Liu, Y.; Li, W.; Wu, H.; Song, W. Track Planning for Unmanned Aerial Vehicles Based on Wolf Pack Algorithm. J. Syst. Simul. 2015, 27, 1838–1843. [Google Scholar]
Zhang, L.; Lei, Z.; Liu, S.; Zhou, J. Three-Dimensional Underwater Path Planning Based on Modified Wolf Pack Algorithm. IEEE Access 2017, 5, 22783–22795. [Google Scholar] [CrossRef]

Figure 1. Hunting process of the wolf pack.

Figure 2. Tracking individual extremum.

Figure 3. Tracking global extremum.

Figure 4. Individual variation.

Figure 5. Duplicating the leader wolf’s position segment in the calling behavior.

Figure 6. Individual variation in siege behavior.

Figure 7. Individual variation in the update stage.

Figure 8. Initial positions of the UAV swarm and targets. (a) Scenario 1: 5 UAVs and 8 targets are randomly distributed in a space of 100 × 100 m. (b) Scenario 2: 20 UAVs and 30 targets are randomly distributed in a space of 1000 × 1000 m. (c) Scenario 3: 100 UAVs and 150 targets are randomly distributed in a space of 10,000 × 10,000 m.

Figure 9. Simulation results in scenario 1: (a) the mean cost of the Monte Carlo runs, and (b) the mean convergence time of the Monte Carlo runs.

Figure 10. Simulation results in Scenario 2: (a) mean cost of the Monte Carlo runs, and (b) the mean convergence time of the Monte Carlo runs.

Figure 11. Simulation results in scenario 3: (a) the mean cost of the Monte Carlo runs, and (b) the mean convergence time of the Monte Carlo runs.

Figure 12. Mean of the Monte Carlo runs: 100 vs. 150 scenario, minimum cost within specified time.

Table 1. Simulation parameters.

Parameters	Scenario 1	Scenario 2	Scenario 3
iterations	200	400	1000
population size	100	100	100
$s t e p_{a}$	2	2	2
$s t e p_{b}$	4	14	70
$s t e p_{c}$	1	2	2
$s t e p_{d}$	2	2	2
$T_{\max}$	10	10	10
$d_{n e a r}$	2	14	70
$α$	4	4	4
$β$	5	5	5

Table 2. Comparison of simulation results.

Scenario	Algorithm	Cost Function				Convergence Time
Scenario	Algorithm	Maximum	Minimum	Mean	Standard Deviation	Maximum	Minimum	Mean	Standard Deviation
5 vs. 8	PSO	31.69	28.72	29.49	0.78	46.77	16.23	36.31	7.8
	WPA	29.7	28.72	28.92	0.37	7.47	2.64	4.22	1.43
	PSO-GA-WPA	30.98	28.72	29.23	0.52	4.36	1.45	2.53	0.17
20 vs. 30	PSO	582.86	412.4	511.84	37	-	-	-	-
	WPA	279.15	239.23	261.95	12.65	100.55	64.89	70.29	10.17
	PSO-GA-WPA	269.84	235.28	253.6	10.2	51	23.67	29.56	4.77
100 vs. 150	PSO	8586	7929	8212	225	-	-	-	-
	WPA	1908	1650	1783	80	2508	2160	2255	102
	PSO-GA-WPA	1880	1480	1606	78	1894	1784	1844	27

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Lu, Y.; Ma, Y.; Wang, J.; Han, L. Task Assignment of UAV Swarm Based on Wolf Pack Algorithm. Appl. Sci. 2020, 10, 8335. https://doi.org/10.3390/app10238335

AMA Style

Lu Y, Ma Y, Wang J, Han L. Task Assignment of UAV Swarm Based on Wolf Pack Algorithm. Applied Sciences. 2020; 10(23):8335. https://doi.org/10.3390/app10238335

Chicago/Turabian Style

Lu, Yingtong, Yaofei Ma, Jiangyun Wang, and Liang Han. 2020. "Task Assignment of UAV Swarm Based on Wolf Pack Algorithm" Applied Sciences 10, no. 23: 8335. https://doi.org/10.3390/app10238335

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Task Assignment of UAV Swarm Based on Wolf Pack Algorithm

Abstract

1. Introduction

2. Task Assignment Model

3. The Wolf Pack Algorithm

3.1. The Basics of the Wolf Pack Algorithm

3.1.1. The Selection of Leader Wolf Based on Winner-Take-All Rule

3.1.2. Three Cooperative Behaviors

3.1.3. Update Mechanism Based on Strong-Survive Law

3.2. The Proposed PSO-GA-DWPA

3.2.1. Integer Matrix Coding

3.2.2. Improvement on Walking Behavior

3.2.3. Improvement on Calling Behavior

3.2.4. Improvement on Sieging Behavior

3.2.5. Improvement in Wolf Population Update

4. Experiments of Task Assignment for UAV Swarm Using PSO-GA-DWPA

4.1. Monte Carlo Simulation in Different Scenarios

4.2. The Real-Time Analysis of the PSO-GA-DWPA

5. Conclusions

Author Contributions

Funding

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI