ERRT-GA: Expert Genetic Algorithm with Rapidly Exploring Random Tree Initialization for Multi-UAV Path Planning

Xu, Hong; Niu, Zijing; Jiang, Bo; Zhang, Yuhang; Chen, Siji; Li, Zhiqiang; Gao, Mingke; Zhu, Miankuan

doi:10.3390/drones8080367

Open AccessArticle

ERRT-GA: Expert Genetic Algorithm with Rapidly Exploring Random Tree Initialization for Multi-UAV Path Planning

by

Hong Xu

^1,2,†

,

Zijing Niu

^2,†,

Bo Jiang

^1,2,

Yuhang Zhang

²

,

Siji Chen

^2,3,

Zhiqiang Li

²,

Mingke Gao

² and

Miankuan Zhu

^2,*

¹

School of Computer Science, Hangzhou Dianzi University, Hangzhou 310018, China

²

The 32nd Research Institute of China Electronics Technology Group Corporation, Shanghai 201808, China

³

School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China

^*

Author to whom correspondence should be addressed.

^†

These authors contributed equally to this work.

Drones 2024, 8(8), 367; https://doi.org/10.3390/drones8080367

Submission received: 14 June 2024 / Revised: 12 July 2024 / Accepted: 30 July 2024 / Published: 1 August 2024

Download

Browse Figures

Versions Notes

Abstract

In unmanned aerial vehicle (UAV) path planning, evolutionary algorithms are commonly used due to their ability to handle high-dimensional spaces and wide generality. However, traditional evolutionary algorithms have difficulty with population initialization and may fall into local optima. This paper proposes an improved genetic algorithm (GA) based on expert strategies, including a novel rapidly exploring random tree (RRT) initialization algorithm and a cross-variation process based on expert guidance and the wolf pack search algorithm. Experimental results on baseline functions in different scenarios show that the proposed RRT initialization algorithm improves convergence speed and computing time for most evolutionary algorithms. The expert guidance strategy helps algorithms jump out of local optima and achieve suboptimal solutions that should have converged. The ERRT-GA is tested for task assignment, path planning, and multi-UAV conflict detection, and it shows faster convergence, better scalability to high-dimensional spaces, and a significant reduction in task computing time compared to other evolutionary algorithms. The proposed algorithm outperforms most other methods and shows great potential for UAV path planning problems.

Keywords:

multi-UAV; task assignment; path planning; RRT initialization; expert genetic

1. Introduction

Unmanned aerial vehicles (UAVs) have found widespread applications in various fields due to their high mobility and reliability, including agriculture, forestry, environmental monitoring, urban planning, disaster management, and security monitoring [1,2,3,4,5]. With the emergence of swarm intelligence, UAVs can collaborate in swarm to perform tasks beyond the capabilities of individual UAVs, constrained by functional and resource limitations. This collaboration presents benefits to military, civilian, and commercial domains, as illustrated in Figure 1.

However, the effective deployment of UAV swarm in complex scenarios with multiple missions and obstacles presents a challenge [6]. UAV swarm encounter difficulties in meeting all mission requirements, finding safe and suitable path areas, and avoiding conflicts between multiple-UAV paths. Therefore, path planning has become one of the most active research areas. UAVs are required to adhere to predefined paths and operate under ground control. The most critical aspect of UAV swarm is their cooperative potential and the mechanism to plan and follow the best path to accomplish the mission [7]. UAV path planning in complex environments is constrained by factors such as path length, optimal selection, obstacle boundaries, and UAV kinematics, which limit the possibility of obtaining the best path for the UAV. To overcome these constraints, path planning methods with various constraints have been proposed to enable UAVs to discover the best and collision-free paths while reducing energy consumption [8].

This study aims to devise strategies for search routes, task assignments, and path planning for multiple UAVs in a simulated environment, targeting a pre-determined search location. In the realm of path planning, search-based and sampling-based algorithms have been extensively utilized in the industry, while other algorithms continue to evolve, each presenting distinct merits and limitations. However, the complexity of handling numerous UAVs, coupled with the non-convexity of obstacles in the scenarios presented in this paper, renders many existing algorithms ineffective in achieving satisfactory outcomes. Therefore, the challenges of path planning can be summarized in three points: (1) dimensional catastrophes for high-dimensional spaces or an increased level of intelligence; (2) searching in complex environments, such as dynamic environments or environments with non-convex obstacles; and (3) guaranteeing the solution optimality of the found path.

Considering the effectiveness of evolutionary algorithms in addressing non-convex environments and non-deterministic polynomial-time hardness (NP-hard) problems, this study employs genetic algorithms (GA) toward making advancements in this field, specifically targeting the resolution of path planning challenges presented in the given scenario. Traditional genetic algorithms commence by initializing populations through the selection of feasible points across each row of a grid map, ensuring connectivity among all nodes via a linking function, and ultimately deriving a viable population-based solution. However, this method has three drawbacks: (1) For the extreme case of the map, as shown in Figure 2, initialized populations cannot be obtained using the traditional genetic algorithm; (2) The speed of initialization is not efficient as the number of initial populations increases; and (3) The randomness of initialization affects the convergence results of the algorithm and leads to unstable performance.

To solve these problems and challenges, we propose a multi-UAV path planning algorithm based on an improved genetic algorithm with an expert policy and consisting of three main steps. First, capacitated vehicle routing problem (CVRP) is solved using a particle swarm optimization-based algorithm to plan the task-assignment decision for multiple UAVs. The path cost is calculated by determining the best trajectory to be solved by optimal boundary value problem (OBVP). Second, for each individual UAV obtained from the above steps, an improved genetic algorithm based on an expert strategy and a new population initialization algorithm, rapidly exploring random tree (RRT) initialization, is used to optimize the smoothed trajectories. Third, to solve the path conflict problem of multiple UAVs, hierarchical path planning is used to reprogram the paths of conflicting UAVs using temporal path points as conflict nodes. Finally, the final path trajectories of multiple UAVs in multi-task assignment are obtained. The experimental results show that the proposed algorithm converges faster and provides the required path trajectory for the task faster than the comparison algorithms.

The main contributions of this study are as follows:

(1) Utilizing UAV kinetics trajectory as the cost of task assignment can more accurately reflect actual flight costs and energy consumption, improving back-end path planning and reducing computational complexity in subsequent path planning.

(2) To address the issue of population initialization difficulties in complex environments, a new population initialization strategy is incorporated into the genetic algorithm. Moreover, an expert strategy is introduced to guide the evolution direction, promoting faster and more accurate convergence while ensuring the optimality of the solution and the escape from local optimal solutions.

(3) Hierarchical path planning is employed to avoid timing conflicts among multiple UAVs, thereby preventing collisions and ensuring the feasibility of the task solution.

The remainder of this paper is organized as follows: The related work is shown in Section 2. Some preparatory knowledge is described in Section 3. The improved genetic algorithm based on expert strategies is described in Section 4. Section 5 presents the experiments and analysis of results. Finally, the conclusion is described in Section 6.

2. Related Work

Recently, several investigations have studied UAVs and proposed different path planning algorithms, which can be classified into four main types: search-based, sampling-based, evolutionary algorithm-based, and reinforcement learning-based [9].

Gridding a map is the most common form of structured map data, and algorithms based on graph search have emerged as a result. Traditional algorithms include classical depth-first search, breadth-first search (BFS), and Dijkstra, which is the extension of the BFS in weighted graphs. However, due to the low search efficiency of traditional algorithms, Hart et al. [10] proposed the A* algorithm with heuristic functions. A* balances the completeness of the algorithm and the improvement of the search efficiency and has become the basic and most dominant algorithm for search-based path optimization. Stentz [11] proposed the D* algorithm to handle obstacle changes or path cost changes in an unknown environment, and Koenig [12] proposed D* lite as an improvement to the D* algorithm. These search methods are widely used in unmanned vehicle technology. Fransen et al. [13] proposed an improved A* heuristic algorithm that can find the least-cost path in geometric maps with turning costs. Fan et al. [14] combined the improved A* algorithm with the trajectory point expansion method to improve the real-time performance of the algorithm and reduce the amount of information stored in the algorithm.

Although heuristic functions are incorporated into search methods, the search remains inefficient for high-dimensional spaces. Rapidly exploring random tree based on the probabilistic road map algorithm proposed by Lavalle [15] is the basic algorithm of sampling-based path planning. Subsequent research has been focused on how to efficiently find the optimal path due to the probabilistic completeness of the RRT algorithm. Karaman and Frazzoli [16] proposed RRT*, and Gammell et al. [17] proposed Informed RRT*, both of which have greatly improved the speed of the RRT algorithm in finding the optimal path. Lindqvist et al. [18] proposed Exploration-RRT for 3D exploration tasks in partially or completely unknown and unstructured environments. Ge et al. [19] proposed a method for trajectory planning of fixed-wing UAVs based on the Dynamic RRT* algorithm to quickly obtain smooth trajectories of the aircraft.

Due to the increasing popularity of UAV swarm research, various evolutionary algorithms such as particle swarm optimization, the genetic algorithm, and the wolf pack algorithm have been widely utilized in UAV swarm for path planning, mission assignment, and formation control. In order to solve the problems that traditional evolutionary algorithms have in terms of quickly finding feasible paths and slow algorithm conversion, Meng et al. [20] proposed a novel algorithm named Priority Planning and Hybrid Particle Swarm Optimization (PPSwarm), which shows significant advantages in path quality, convergence speed, and running time. Wang et al. [21] developed an Energy Balance Multi-Objective Trajectory Planning Algorithm (EMTSPA) for multiple-swing UAVs to solve the energy balance path planning problem and achieved a multi-objective optimization solution. Zhang et al. [22] improved the wolf pack algorithm (WPA) by integrating the Dubins path planning method with an improved local fine-grained search capability to better suit the requirements of multiple-UAV path planning. In order to solve the problem of dispersing sensor data, Wang et al. [23] introduced a hybrid path planning method combining improved particle swarm optimization and a genetic algorithm (IPSO-GA) to enhance data collection efficiency of UAVs in forest environment monitoring. In order to address the issue of traditional genetic algorithms in UAV path planning under emergency scenarios, where local searches fail to complete the task, Wang and Meng [24] proposed the use of an enhanced genetic algorithm for real-time path planning.

Path planning for UAVs can be modeled as Markov decision processes, and reinforcement learning can be utilized for path planning. In order to solve the problem of UAVs being prone to collisions in environments with densely distributed and high-speed obstacles, Yang et al. [25] proposed a new method based on high-speed and high-density multi-strategy obstacle avoidance (MSAO2H) which guides the learning of angle parameters through a deep Q-network. Rahim et al. [26] proposed a reinforcement learning-based cooperative UAV scheme for search and rescue tasks in unknown environments. Huang et al. [27] used the proximal policy optimization framework based on representation enhancement for reinforcement learning-based UAV path planning, and the method can effectively improve the convergence speed during reinforcement learning training, increase the success rate of UAV missions, and reduce timeout and collision rates.

Multi-UAV path planning can be simplified as Multi-Agent Path Finding (MAPF) when conducting comprehensive planar planning. One of the most mainstream and effective MAPF algorithms is Conflict-Based Search (CBS) [28]. This method adopts a two-layered approach which involves computing path conflicts and performing single-agent path planning. To expedite CBS’s computation speed, Li et al. [29] proposed a novel algorithm named Explicit Estimation Conflict-Based Search (EECBS), which leverages heuristic search to enhance CBS efficiency. Moreover, Li et al. [30] addressed symmetry and K- robustness concerns and introduced K-CBS with symmetry-breaking constraints. To address real-time dynamic systems where agents receive new tasks immediately after reaching their destinations, Lifelong MAPF was introduced by Li et al. [31]. Chen et al. [32] proposed a scalable and safe multi-agent motion planning method to avoid collisions with obstacles and other agents. This method can identify a piecewise linear path for each agent, ensuring that the actual trajectories following these paths are guaranteed to meet the reach-and-avoid requirement.

Both search-based and sampling-based algorithms have been widely used in various industries, while the other two approaches are still undergoing continuous development. Each method has its own advantages and disadvantages. The search-based approach can guarantee an optimal path search, but the number of search spaces can exponentially increase in large and complex maps or UAV swarm, resulting in an explosion in search time. The sampling-based approach can quickly find solutions in higher dimensions and spaces, but it is probabilistically complete and not guaranteed to find optimal solutions within a limited time frame. Evolution-based algorithms can quickly cross-evolve in high-dimensional spaces and converge quickly to ensure optimal solutions. However, they may fall into local optimal solutions in complex environments, and population initialization may be difficult to complete. Reinforcement learning-based algorithms are still in their infancy, and their action and state space characteristics require a significant number of computational resources when the state space is large. This makes it challenging to deploy them in edge-end devices with limited computational resources, and further development is necessary.

3. Preparatory Knowledge

3.1. Optimal Boundary Value Problem

The optimal boundary value problem is an extended optimization problem of the boundary value problem, i.e., a smooth trajectory is constructed by restricting the initial and target points of the trajectory. The model of OBVP is as follows:

\min J_{\sum} = \sum_{k = 1}^{3} J_{k} J_{k} = \frac{1}{T} \int_{0}^{T} {j_{k} (t)}^{2} d t

(1)

s . t . s_{k} = (p_{k}, v_{k}, a_{k})

(2)

u_{k} = j_{k}

(3)

{\dot{s}}_{k} = f_{s} (s_{k}, u_{k}) = (v_{k}, a_{k}, j_{k})

(4)

The state of the UAV is represented by

s_{k} = (p_{k}, v_{k}, a_{k})

, which is divided into three dimensions, namely

(x, y, z)

;

k

represents the three dimensions of the UAV,

p

denotes the scalar position of the UAV,

v

represents the scalar velocity, and

a

indicates the scalar acceleration. The input control of the UAV is denoted by

u

, and

j

is jerk, which is the derivative of acceleration. Given the initial state

(p_{0}, v_{0}, a_{0})

and the final state

(p_{T}, v_{T}, a_{T})

, there exists a boundary condition constraint as shown in Table 1.

Then, at least a fifth-order polynomial (containing six coefficients) needs to be constructed to make the connection in order to satisfy the constraints on the starting and ending points:

p (t) = c_{5} t^{5} + c_{4} t^{4} + c_{3} t^{3} + c_{2} t^{2} + c_{1} t^{1} + c_{0}

(5)

where

p (t)

represents the position of the UAV at time t, and the polynomial of Equation (5) represents this position using coefficients. To solve for this coefficient, we can input the data from Table 1 for

t = 0

and

t = T

into Equation (5). Then, take the derivative of Equation (5) to obtain the velocity data and take the second derivative of Equation (5) to obtain the acceleration data. Combined with the constraints in Table 1, the coefficients in Equation (5) can be expressed as:

[\begin{matrix} p_{0} \\ p_{T} \\ 0 \\ 0 \\ 0 \\ 0 \end{matrix}] = [\begin{matrix} 0 & 0 & 0 & 0 & 0 & 1 \\ T^{5} & T^{4} & T^{3} & T^{2} & T & 1 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ {5 T}^{4} & 4 T^{3} & 3 T^{2} & 2 T & 1 & 0 \\ 0 & 0 & 0 & 2 & 0 & 0 \\ 20 T^{3} & 12 T^{2} & 6 T & 2 & 0 & 0 \end{matrix}] [\begin{matrix} c_{5} \\ c_{4} \\ c_{3} \\ c_{2} \\ c_{1} \\ c_{0} \end{matrix}]

(6)

We use the Pontryagin minimum principle [33] to solve the problem, and eventually, it can be deduced as:

\begin{matrix} J = T + (\frac{1}{3} {α_{1}}^{2} T^{3} + α_{1} β_{1} T^{2} + {β_{1}}^{2} T) \\ + (\frac{1}{3} {α_{2}}^{2} T^{3} + α_{2} β_{2} T^{2} + {β_{2}}^{2} T) \\ + (\frac{1}{3} {α_{3}}^{2} T^{3} + α_{3} β_{3} T^{2} + {β_{3}}^{2} T) \end{matrix}

(7)

[\begin{matrix} α_{1} \\ α_{2} \\ α_{3} \\ β_{1} \\ β_{2} \\ β_{3} \end{matrix}] = [\begin{matrix} - \frac{12}{T^{3}} & 0 & 0 & \frac{6}{T^{2}} & 0 & 0 \\ 0 & - \frac{12}{T^{3}} & 0 & 0 & \frac{6}{T^{2}} & 0 \\ 0 & 0 & - \frac{12}{T^{3}} & 0 & 0 & \frac{6}{T^{2}} \\ \frac{6}{T^{2}} & 0 & 0 & - \frac{2}{T} & 0 & 0 \\ 0 & \frac{6}{T^{2}} & 0 & 0 & - \frac{2}{T} & 0 \\ 0 & 0 & \frac{6}{T^{2}} & 0 & 0 & - \frac{2}{T} \end{matrix}] [\begin{matrix} Δ p_{x} \\ Δ p_{y} \\ Δ p_{z} \\ Δ v_{x} \\ Δ v_{y} \\ Δ v_{z} \end{matrix}]

(8)

where

(Δ p_{x}, Δ p_{y}, Δ p_{z}, Δ v_{x}, Δ v_{y}, Δ v_{z})

are constants related only to the initial and end states. Then, it can be concluded that the cost function is only related to time

T

. By finding the extreme value of this polynomial, the minimum value can be obtained. Then, bringing back the original polynomial, the equation of the trajectory can be obtained. Figure 3 shows the UAV smooth flight trajectory computed from OBVP when the velocity and acceleration at the starting and ending points of the UAV are both unconstrained and free. The blue lines represent the trajectories that the UAV can fly in various directions, while the green line represents the optimal flight trajectory for the UAV.

3.2. Multi-Segment Trajectory Generation

After obtaining multiple waypoints from the path planning front-end (the details are in Section 4.2), we need to plan a trajectory that can smoothly pass through all the waypoints to reach the final destination. This problem is an extension of the OBVP, referred to as the Multi-Segment Trajectory Generation (MSG). To construct a multi-segment trajectory optimization problem, each segment can be viewed as a BVP, and a polynomial is used to connect them. A continuity constraint is then applied between each segment. In this paper, we utilize the minimum snap algorithm to solve the problem.

The Minimum Snap algorithm minimizes the snap value in the motion state, which is the derivative of jerk and represents differential thrust in UAVs. Minimizing snap can result in optimal energy efficiency and save energy [34,35]. As the state quantity of the aircraft consists of four values,

s = (p, v, a, j)

, each trajectory segment contains eight unknowns, requiring a seventh-order polynomial for connection.

In this algorithm, first, we construct polynomial parametric equations for each segment of the trajectory:

f (t) = \{\begin{matrix} f_{1} (t) \dot{=} \sum_{i = 0}^{N} p_{1, i} t^{i} T_{0} \leq t \leq T_{1} \\ f_{2} (t) \dot{=} \sum_{i = 0}^{N} p_{2, i} t^{i} T_{1} \leq t \leq T_{2} \\ ⋮ \\ f_{M} (t) \dot{=} \sum_{i = 0}^{N} p_{M, i} t^{i} T_{M - 1} \leq t \leq T_{M} \end{matrix}

(9)

The target function can be expressed as follows:

\begin{matrix} J (T) = \int_{T_{j - 1}}^{T_{j}} {(f^{4} (t))}^{2} d t \\ = \sum_{i \geq 4, l \geq 4} \frac{i (i - 1) (i - 2) (i - 3) (l - 1) (l - 2) (l - 3)}{i + l - 7} ({T_{j}}^{i + l - 7} - {T_{j - 1}}^{i + l - 7}) p_{i} p_{l} \\ = {[\begin{matrix} ⋮ \\ p_{i} \\ ⋮ \end{matrix}]}^{T} [\begin{matrix} ⋮ \\ \dots & \frac{i (i - 1) (i - 2) (i - 3) (l - 1) (l - 2) (l - 3)}{i + l - 7} T^{i + l - 7} & \dots \\ ⋮ \end{matrix}] [\begin{matrix} ⋮ \\ p_{l} \\ ⋮ \end{matrix}] \end{matrix}

(10)

State constraints for the first and last endpoints and position constraints for the intermediate nodes:

\begin{matrix} {f_{j}}^{(k)} (T_{j}) = {x_{j}}^{(k)} \\ ⟹ [\begin{matrix} \dots & \frac{i!}{(i - k)!} T_{j - 1}^{i - k} & \dots \\ \dots & \frac{i!}{(i - k)!} T_{j}^{i - k} & \dots \end{matrix}] [\begin{matrix} ⋮ \\ p_{j, i} \\ ⋮ \end{matrix}] = [\begin{matrix} x_{0, j}^{(k)} \\ x_{T, j}^{(k)} \end{matrix}] \\ ⟹ A_{j} P_{j} = d_{j} \end{matrix}

(11)

Continuity constraint between two adjacent polynomials:

\begin{matrix} {f_{j}}^{(k)} (T_{j}) = {f_{j + 1}}^{(k)} (T_{j}) \\ ⟹ [A_{j} - A_{j + 1}] [\begin{matrix} P_{j} \\ P_{j + 1} \end{matrix}] = 0 \end{matrix}

(12)

Finally, the target function is concatenated with the constraints to obtain a Quadratic Programming (QP) problem:

\min {[\begin{matrix} P_{1} \\ ⋮ \\ P_{M} \end{matrix}]}^{T} [\begin{matrix} Q_{1} & 0 & 0 \\ 0 & ⋱ & 0 \\ 0 & 0 & Q_{M} \end{matrix}] [\begin{matrix} P_{1} \\ ⋮ \\ P_{M} \end{matrix}]

(13)

s . t . A_{e q} [\begin{matrix} P_{1} \\ ⋮ \\ P_{M} \end{matrix}] = d_{e q}

(14)

In this paper, we use the OSQP solver [36] to solve the QP problem for obtaining a global optimal solution. Figure 4 is the MSG trajectory of the UAV solved using the OSQP solver. Each point in Figure 4 is a way point assigned by the task sent by the front-end. The UAV starts from the starting point and passes through each way point, and finally reaches the ending point to form a smooth trajectory.

4. Algorithm Model

This paper addresses the problem of completing a specific task assignment with multiple UAVs in a map containing obstacles in a complex environment, and devising a smooth path planning trajectory that avoids temporal conflicts among the paths of multiple UAVs, to ensure the completion of the task in its entirety. To achieve this, the problem is divided into three stages: (1) task assignment for multiple UAVs; (2) path planning for the task points assigned to each UAV; (3) and hierarchical path planning to prevent path conflicts among multiple UAVs, followed by replanning. Finally, a path planning diagram for multiple UAVs is obtained. The flow path is shown in Figure 5.

4.1. Problem Model Paradigm

The task scenario described in this study involves a complex map environment with multiple obstacles, where

N

tasks,

T a s k = [T_{1}, \dots, T_{N}]

, are marked with their respective location coordinates set as

S_T a s k = [S_{1}, S_{2}, \dots, S_{N}]

. The UAV swarm is required to pass through all task locations to provide the required resources, such as photographing disaster sites or delivering supplies. Each mission requires different resources

M = [m_{1}, \dots, m_{N}]

, and multiple homogeneous UAVs form a quadcopter UAV swarm set

U = [U_{1}, U_{2}, \dots]

. All UAVs start from a specified starting point

S_{0}

and can perform any mission, returning to the starting point upon completion of all missions. The objective of this paper is to assign

N

tasks to the multiple-UAV formations with the least use of UAVs, the least energy consumption, no collision of multiple UAVs, and in accordance with the laws of UAV kinematics. To achieve this, this paper proposes a task assignment and path planning model:

\min J = C_{0} K + C_{1} \sum_{i = 0}^{N} \sum_{j = 0}^{N} \sum_{k = 1}^{K} d_{i j} x_{i j k}

(15)

s . t . \sum_{k = 1}^{K} \sum_{j = 1}^{N} x_{0 j k} = \sum_{k = 1}^{K} \sum_{j = 1}^{N} x_{j 0 k} = K

(16)

\sum_{i = 1}^{N} x_{i j k} = \sum_{i = 1}^{N} x_{j i k} (k \in U, \forall j = 1, 2, \dots, N)

(17)

\sum_{i = 0}^{N} \sum_{j = 0}^{N} x_{i j k} m_{j} \leq C A P A C I T Y_M A X (k \in U)

(18)

\sum_{k = 1}^{K} \sum_{i = 0}^{N} x_{i j k} = 1 (j = 1, 2, \dots, N)

(19)

\sum_{i = 0}^{N} \sum_{j = 0}^{N} x_{i j k} d_{i j} \leq D I S T A N C E_M A X (k \in U)

(20)

K \geq ⌈ \frac{\sum_{j = 1}^{N} m_{j}}{M} ⌉ (K \in N^{*})

(21)

d_{i j} = f_{O B V P} (s_{i}, s_{j}) (s_{i}, s_{j} \in S_{T a s k}, i \neq j)

(22)

where

C_{0}

denotes the fixed cost of the starting use of the UAV and

C_{1}

denotes the cost of the traveling per-unit distance of the UAV.

N

denotes the number of mission points, and

n

denotes the node (

n = 0

denotes the starting point).

K

denotes the number of UAVs to be used.

D I S T A N C E_M A X

denotes the upper limit of the distance to be flown due to power limitation.

d_{i j}

denotes the distance from

s_{i}

to

s_{j}

solved using OBVP.

C A P A C I T Y_M A X

indicates that each UAV has a fixed upper limit of resources.

x_{i j k}

indicates the decision variable:

x_{i j k} = \{\begin{array}{l} 1, U A V k f l i e s f r o m i t o j \\ 0, U A V k d o e s n^{'} t f l y f r o m i t o j \end{array}

(23)

The objective of this problem is to optimize the above-mentioned problem and obtain the minimum required resources to assign different tasks to each UAV. Subsequently, these assignments are passed to the back-end path planning module for finding an optimal path planning solution.

4.2. Task Assignment Based on Improved PSO-GA

As the task assignment problem can be formulated as an optimization problem, metaheuristics algorithms such as Particle Swarm Optimization (PSO) [37], GA [38], Artificial Bee Colonies (ABC) [39], and Ant Colony Optimization (ACO) [40] can be applied to solve it. These algorithms use a random search strategy to quickly find excellent candidate solutions in a large-scale search space, avoiding the huge amount of computation required for exhaustive search [41]. However, according to the free lunch theorem, for any optimization problem, the average performance of any two optimization algorithms is equal, and no optimization algorithm can perform well in terms of computational efficiency, versatility, global search capability, etc. [42]. This has led to the mix of algorithms becoming a research hotspot and trend in the field of algorithm optimization. Compared with artificial bee colonies and ant colony optimization, the convergence speed and running speed of the PSO algorithm are relatively fast. However, PSO also has its drawbacks. It is easy to fall into a local optimal solution, which may eventually lead to slow convergence or no convergence. Therefore, this paper adopts an effective improved PSO-GA algorithm, which combines the particle swarm algorithm and the genetic algorithm to take into account the quality and diversity of the solution and has good convergence and search efficiency.

In PSO-GA, each particle represents a list of task assignments. The algorithm is initialized with a set of random particles

p o p u = {X_{1}, X_{2}, \dots, X_{N}}

, and then iteratively searches for the optimal solution. To facilitate fast initialization of particles, we first reduce the problem to a Traveling Salesman Problem (TSP) and then use a greedy algorithm for initialization. In each iteration, the particles update themselves by selecting the genetic information from three values

(X_{i}, p b e s t, g b e s t)

and cross-updating themselves. After finding these values, the particle updates its values using Equation (24):

\{\begin{array}{l} {p a r e n t}_{1} = X_{i}^{(j - 1)} \\ {p a r e n t}_{2} = \tilde{ω X_{i}^{(j - 1)}} |c_{1} {p b e s t}_{i}^{(j - 1)}| c_{2} {g b e s t}^{(j - 1)} \\ X_{i}^{(j)} = α {p a r e n t}_{2} + (1 - α) {p a r e n t}_{1} \end{array}

(24)

Equation (24) represents the population iteration equation in PSO-GA, where each child generated in an iteration has two parents for cross-genetic inheritance. One parent is the result of the previous iteration of the child, while the other is selected through a weighted probability from three particle values, namely the opposite direction path of the previous iteration of the child, the maximum fitness value found in the particle’s history search space, and the maximum fitness value in the population’s history search space. The weighting coefficients

ω

,

c_{1}

, and

c_{2}

determine the balance between the particle’s reliance on its own highest fitness value and the strategic guidance of the population fitness. Finally, cross-variant inheritance is performed using these two parents. The inverse direction path

\tilde{X_{i}^{(j - 1)}}

is taken to ensure that cross-variant inheritance leads to a free space search operation induced by the particle’s inertia.

To balance the global and local search capabilities of the algorithm and avoid falling into a local optimum, this paper adaptively adjusts the three parameters

ω

,

c_{1}

, and

c_{2}

:

\{\begin{array}{l} ω = ω_{m a x} + \frac{(ω_{m i n} - ω_{m a x}) {i t e r}^{2}}{{I T E R_M A X}^{2}} \\ c_{1} = c_{1 m a x} + \frac{(c_{1 m i n} - c_{1 m a x}) i t e r}{I T E R_M A X} \\ c_{2} = c_{2 m a x} + \frac{(c_{2 m i n} - c_{2 m a x}) i t e r}{I T E R_M A X} \end{array}

(25)

The value of

ω

changes dynamically according to the current iteration number

i t e r

and the upper limit of total iterations

I T E R_{M A X}

, allowing the algorithm to have a strong search ability at the early stage and gradually converge to better regions for finer search in the later stage. On the other hand,

c_{1}

represents the particle’s self-learning ability, while

c_{2}

represents its ability to learn from better-performing particles. Therefore, appropriate values of

c_{1}

and

c_{2}

are necessary to facilitate the particle search. In the initial stage, the algorithm should focus on the individual’s self-knowledge, while in the later stage, it should focus on the individual’s ability to acquire social information. The maximum and minimum values of the weights are denoted as

{(ω}_{m a x}, c_{1 m a x}, c_{2 m a x})

and

{(ω}_{m i n}, c_{1 m i n}, c_{2 m i n})

, respectively.

The algorithm’s flow is illustrated in Algorithm 1, utilizing the analytical solution method of OBVP to emphasize the actual dynamical path length required by the aircraft. The final output comprises the solved

N

mission point paths for

N

UAVs,

P O P U_U A V = \{u_{1}, \dots, u_{n}\}

by using find_path, which is a function to obtain the way point by

g b e s t

.

Algorithm 1: PSO-genetic algorithm

Input: iteration time

I T E R_M A X

, population number

N P

Output: List of all waypoints for each drone

P O P U_U A V

1: iter ← 1;
2: POPU ← initial(NP);
3: drone_popu ← greed_assign(popu);
4: fit ← OBVP(drone_popu);
5: pbest,gbest ← min(fit);
6: while iter < $I T E R_M A X$ do
7: iter ← iter + 1;
8: popu(iter) ← crossover(popu(iter-1),pbest,gbest);
9: drone_popu ← greed_assign(popu(iter));
10: fit ← OBVP(drone_popu);
11: pbest ← min(fit);
12: if pbest < gbest then
13: gbest ← pbest;
14: end
15: POPU_UAV ← find_path(gbest);
16: end

4.3. ERRT-GA for Path Planning

After acquiring the mission path points

u : \{s_{1}, s_{2}, \dots, s_{k}\}

for each UAV from the mission assignment module (where

s_{0}

is the starting point), the path planning module can be utilized for planning the path. The path planning module is segregated into the front-end and back-end, wherein the front-end is primarily utilized to resolve the path planning under discrete actions, while the back-end is employed for refining the path trajectory to render it more consistent with the kinematic laws of the UAV. The flow chart is illustrated in Figure 6.

4.3.1. Population Initialization

To address the drawbacks of the traditional genetic algorithm’s population initialization, some researchers have introduced greedy algorithms for population initialization [43]. This type of method can improve the results, but the paths it finds may be similar, which is not conducive to population variation. This paper presents a novel algorithm based on RRT for population initialization in a genetic algorithm. In this algorithm, after map rasterization (a two-dimensional example is used later on), all the points are mapped into a one-dimensional set

M A P = {s_{1}, s_{2}, \dots, s_{M A P S I Z E}}

using Equation (26). Here,

i n d e x

denotes the one-dimensional lead,

c o l

represents the number of columns on the map, and

r o w

represents the number of rows on the map.

\{\begin{array}{l} i n d e x = i + j \times c o l (i \in [1, c o l], i \in [1, r o w]) \\ i = i n d e x m o d c o l \\ j = i n d e x d i v c o l \end{array}

(26)

Each UAV can move to neighboring grids, and the available movements are discretized into a set of actions,

A C T I O N = {a_{1}, a_{2}, \dots}

, where, for example, a 2D map has 8 actions and a 3D map has 26 actions. Nodes and their neighboring nodes are numbered on the map, which allows actions to be viewed as edges between two nodes. All edges of all nodes are mapped into a one-dimensional set

A C T I O N_A L L = {a_{1}, a_{2}, \dots, a_{N \times 8}}

using Equation (27), where

i n d e x

is the one-dimensional index,

s_{i}

is the starting point of the edge, and

s_{j}

is the ending point of the edge.

\begin{array}{l} i n d e x = 8 (i - 1) + j (i \in [1, N] \in N^{+}, j \in [1, 8] \in N^{+}) \\ s_{i} = i n d e x d i v 8 \\ s_{j} = i n d e x m o d 8 \end{array}

(27)

A random weight

β \in [0, 1]

is assigned to all actions, and the weight set

β = {β_{1}, β_{2}, \dots, β_{N \times 8}}

of all edges is generated. Subsequent iterative path findings and fitness calculations only need to be evaluated based on this weight

β

. By constantly changing

β

to generate different populations, the initialization process is completed, and the need for judging path continuity for each path is eliminated.

Additionally, we propose using RRT to optimize the initialized populations. For rasterized maps, a sampling tree Tree is used for RRT. Firstly, a point

s_{r a n d}

is randomly sampled in the map to find the nearest point

s_{n e a r}

in the tree to the sampled point. Then, a new point

s_{n e w}

is generated by concatenating

s_{n e a r}

and

s_{r a n d}

using a fixed number of steps and a parameter

∆

, such that

s_{n e w} = {∆ s}_{n e a r} + (1 - ∆) s_{r a n d}

. Collision detection is performed between

s_{n e a r}

and

s_{n e w}

, and if no obstacle is encountered,

s_{n e w}

is added to the Tree. This process is repeated until the distance between the new point and the end point is less than the present value

γ

. Due to the existence of the sampling tree, the initialized RRT generation for multiple groups can share a common sampling tree, which greatly improves the speed of auxiliary initialization, as shown in Figure 7.

After obtaining the RRT initialization path

R R T_P A T H = {s_{1}, s_{2}, \dots, s_{l}}

, the initialization can be completed by increasing the corresponding point edge weights. The final output is the population’s

P A T H_I N I T I A L = {β_{1}, β_{2}, \dots, β_{N P}}

with multiple path weights. The RRT initialization is shown in Algorithm 2.

Algorithm 2: RRT-initial Algorithm

Input: population number NP
Output: path of population

P A T H_I N I T I A L

1: for i ← 1 to mapsize do
2: $A C T I O N_A L L$ ← (si, $A C T I O N$ );
3: end
4: $β$ ← zero(size( $A C T I O N_A L L$ ));
5: $P A T H_I N I T I A L$ ← $\{β\}$ × NP
6: for i ← 1 to NP do
7: $β (i) =$ rand(size( $β (i)$ ));
8: path ← RRT;
9: for j ← 1 to size(path) do
10: $β (i, j) =$ $β (i, j) +$ bouns;
11: end
12: end

4.3.2. Expert Genetic Iteration

After obtaining the initialized population, the fit can be performed. To obtain the paths from the population, the selection probabilities

{t e m p}_{i j}

for all succeeding action nodes of the node must be defined, starting from the starting point

s_{0}

.

{t e m p}_{i j} = h (j) + β_{i n d e x (i, j)} + μ \times d r i v e (j) i, j \in M A P

(28)

h (j) = c o s t (i, j) + O B V P (j, e n d) i, j \in M A P

(29)

where

c o s t (i, j)

represents the actual path cost from node

i

to node

j

;

O B V P (j, e n d)

represents the expected smooth trajectory cost from node

j

to the end point;

h (j)

is the heuristic function of a successor node;

β_{i n d e x (i, j)}

denotes the selection probability weight of the edge from node

i

to node

j

;

d r i v e (j)

represents the aircraft steering drive loss if node

j

is chosen as the direction of aircraft movement (zero if the succeeding node is in the same direction as the present motion); and

μ

is the weight parameter. Therefore, we can derive an optimal path

p a t h_p o p u

in this individual case by

{t e m p}_{i j}

. The adaptation fit can be defined as follows:

f i t = c o s t (i, j) + d r i v e (j) i, j \in p a t h_p o p u

(30)

f i t

determines the degree of dominance of each individual in the population, and the resulting set

F I T

is generated. The genetic parent set

P A R E N T

is then selected using the bid-race method, which involves evaluating each individual’s fitness and selecting the fittest individuals as parents. This method ensures that the next generation of individuals inherits the most desirable traits from their parents.

P A R E N T = \{{a r g m i n}_{i} (f i t (i))\} i \in r a n d (N P) i n F I T

(31)

The parents are then subjected to cross-genetic and mutation to produce the new population,

P O P U_M E D

. An expert strategy is employed to guide the new population, which consists of three steps: firstly, the best-adapted individuals are rewarded; secondly, the RRT algorithm is used to expertly guide the worst-adapted individuals; and thirdly, the wolf pack algorithm is used to expertly guide the remaining worst-adapted individuals by utilizing the top three individuals ranked by adaptation.

\{\begin{matrix} β_{i} = β_{i} + b o u n s i \in p a t h (i n d e x (f i t_f i r s t)) \\ β_{i} = β_{i} + g u i d a n c e i \in p a t h (R R T) \\ β_{i} = {\frac{1}{6} β}_{f i r s t} + {\frac{1}{3} β}_{s e c o n d} + {\frac{1}{2} β}_{t h i r d} i \in p a t h (i n d e x (f i t_w o r s t)) \end{matrix}

(32)

Finally, a new iterative population

P O P U_F I N

is obtained. After the completion of the iteration, individuals with the best fitness are selected, and the best path is obtained to be passed into the back-end. The genetic iteration algorithm is shown in Algorithm 3.

Algorithm 3: Expert Genetic Algorithm

Input: population number NP, path of population

P A T H_I N I T I A L

, iteration time

I T E R_M A X

Output: best path of the population

p a t h

1: $for i \leftarrow 1$ to ITER_MAX do
2: $for j \leftarrow 1$ to size ( $P A T H_I N I T I A L$ ) do
3: $p \leftarrow j$ ;
4: while p != end do
5: for l ← 1 to size(action) do
6: if iscollosion(p,l) then
7: continue;
8: end
9: h(l) ← cost(p,l) + OBVP(l,end);
10: $temp (p, l) \leftarrow h (l) + β (p, l)$ + drive(l);
11: end
12: index ← argmin(temp)
13: fit(j) ← fit(j) + cost(p,index) + drive(p,index);
14: p = index;
15: end
16: end
17: end
18: $parent \leftarrow Tournament (P A T H_I N I T I A L$ ,fit);
19: $P O P U_M E D$ ← crossover(parent);
20: $P O P U_F I N$ $\leftarrow bestbouns (P O P U_M E D$ $) + RRT (P O P U_M E D$ $) + wolf (P O P U_M E D$ );
21: Path ← find_path(argmin(fit(POPU_FIN)));

4.3.3. Multi-UAV Path Conflict Detection

In the path planning module, a kinetics-compliant path plan is generated for each UAV. However, during actual flight, their paths may overlap, which can result in collisions. To prevent collisions, the paths of multiple UAVs must be planned hierarchically. The lower layer involves the basic path planning module, which generates a set PATH of paths, i.e.,

{{p a t h}_{1}, {p a t h}_{2}, \dots, {p a t h}_{K}}

, as discussed in the previous section. The upper layer is the path conflict detection module. Since each path is obtained by rasterizing the map and the action is obtained by the adjacent grid, it can be decomposed by temporalizing the path,

p a t h_t i m e = {0 : s_{0}, 1 : s_{1}, \dots, T : s_{e n d}}

. Since all UAVs start to take off simultaneously, comparing the temporalized paths for different drones can identify the conflict point. If a conflicting path is identified, it is returned to the lower-level path planning module for re-planning. Finally, the non-conflicting paths of all UAVs are obtained, and the trajectory is smoothed using MSG to obtain the final trajectory.

The proposed algorithm’s general flow is presented in Algorithm 4. Firstly, the mission and map are initialized, followed by mission assignment. Then, by passing the assigned mission points to the path planning module, the path for each UAV is obtained. Finally, the smooth path trajectory is obtained through multi-UAV path conflict detection.

Algorithm 4: Multi-UAV evolutionary path planning Algorithm

Input: iteration time

I T E R_M A X

, population number NP
Output: Smooth trajectory for all drone Trajectory

1: Initial_map();
2: Conflict←true;
3: While conflict do
4: $P O P U_D R O N E$ ← PSO—Genetic Algorithm(ITER_MAX.NP);
5: $for i \leftarrow 1$ to size (POPU_DRONE) do
6: $P A T H_I N I T I A L$ ← RRT_INITIAL(POPU_DRONE(i),NP);
7: Path(i) ← Expert Genetic Algorithm(PATH_INITIAL);
8: end
9: Conflict ← isconflict(PATH);
10: end
11: Trajectory ← smooth(PATH);

5. Experiment and Result Analysis

5.1. Experiment Settings

The experiments presented in this paper were conducted in a Windows 11 environment with a R7 5800 CPU, and the algorithm was numerically simulated using MATLAB 2021a simulation software. In this section, we present a set of experiments with the map settings, number of tasks, and other parameters specified in Table 2. The map includes randomly generated obstacles with varying numbers, locations, and shapes, as well as randomly generated task points and UAV starting points. To fairly evaluate the feasibility and effectiveness of the proposed algorithm, we conducted comparison experiments with different baseline algorithms and performed ablation experiments on the algorithm. Figure 8 shows the experimental map generated once.

5.2. Experiment and Analysis

5.2.1. Task Assignment

We first use traditional Euclidean distance to verify. The solution to the simplified TSP with unlimited aircraft resources is shown in Figure 9a, while Figure 9b,c show the paths of different starting points for multi-task assignment. The paths of different colors in the figures represent the paths of different UAVs. Then, when the aircraft dynamics model is added to the task assignment algorithm, as seen in Figure 9d, the actual flight trajectory still differs significantly from the original straight path.

In order to verify the necessity of aircraft dynamics model in task allocation, we conducted more in-depth experiments. We assigned tasks to the five UAVs and then performed preliminary path planning to obtain the trajectories. The results are shown in Figure 10a,b. Figure 10a shows the task allocation result without using OBVP to solve and without considering the aircraft dynamics prediction, and shows the trajectory obtained by preliminary path planning. It can be observed that the trajectory of UAV-4 is affected by the actual smooth power trajectory of the aircraft, resulting in a much farther actual flight distance than the expected flight distance at the time of its own mission assignment. This may cause the aircraft to run out of energy during the actual flight, leading to the mission being unable to be completed, which is unacceptable. Figure 10b displays the predicted trajectory after the assignment with OBVP as a cost function and with the inclusion of aircraft dynamics, which shows that one of the UAV-4 mission points was assigned to UAV-5. Eventually, the actual planning of UAV-4 did not consume much more energy than originally expected, and the mission was completed successfully. It can be seen that if the OBVP is not used to solve, the final resulting path will fly a long extra distance in the actual flight.

5.2.2. Path Planning

After obtaining relevant path points from task assignment, the path planning experiments were conducted. In order to verify the effect of population initialization, we used the algorithms without RRT-initial as the comparison algorithms, such as GA, IGA [44], and Expert-GA. Then, the proposed RRT-initial population initialization modules were added to these algorithms, and the results are shown in Figure 11. From the results in Figure 11, we can see that in the same task, adding RRT-initial population initialization to the baseline algorithm can effectively reduce the fitness value of the algorithm. The ERRT-GA proposed in this paper has the smallest fitness value, proving its superiority over several other algorithms.

Based on the ERRT-GA, this paper adds path trajectory smoothing processing, and the two trajectories in Figure 12a show the original and smoothed trajectories. It was observed that during the smoothing process, the aircraft was susceptible to collisions with obstacles. Therefore, obstacle expansion volume was added to avoid aircraft collisions with obstacles during real movement, as shown in Figure 12b. Finally, we conducted a complete path planning experiment. Figure 13a shows the trajectory of a single UAV for the complete mission, and Figure 13b shows the trajectory of a complete mission for five UAVs.

Since the population initialization time is also an important factor affecting the algorithm, we conducted a comparative experiment on the population initialization time of three different population-size types: small, medium, and large. The comparative experimental results are shown in Table 3. As can be seen from the data in the table, our proposed new initialization algorithm without RRT optimization is faster than the initialization of the traditional genetic algorithm. One interesting outcome of the experiments is that the traditional genetic algorithm fails to identify an adequate quantity of populations in extreme maps, leading to errors in the calculations. Consequently, the initialization time for the traditional genetic algorithm is infinite for medium and large population sizes. On the other hand, there is a small increase in the initialization time of RRT-initial, but this is not exponential as the number of populations increases. This is because the RRT is shareable, as shown in Figure 14. Thus, the more population paths in the shared tree, the faster the search will be.

In order to verify the convergence, ability to escape from the optimal solution, and ability to find the optimal solution of the algorithm proposed in this paper, we set up three groups of experiments with different numbers of iterations and population sizes in the same simulation scenario and compared them with several other algorithms. Table 4, Table 5 and Table 6 show the results of the comparison experiments for the upper limits of population size and number of iterations for (10, 50), (10, 200), and (50, 200), respectively. The fit change graphs are shown in Figure 15, Figure 16 and Figure 17.

The RRT-initial population initialization algorithm proposed in this paper is evaluated by comparing the runtimes of GA and GA-RRT and of IGA and IGA-RRT in Table 4, Table 5 and Table 6. Although RRT-initial in the experiment of Table 3 increases the population initialization time, the overall task runtime is found to be reduced. This is because the convergence speed of the evolutionary algorithm is closely related to the quality of the initialized population. The RRT-initial population initialization algorithm provides the evolutionary algorithm with a high-quality initialized population in a short time, resulting in faster convergence in subsequent iterations. As shown in Table 4, Table 5 and Table 6, most algorithms using RRT-initial greatly reduce the number of iterations required to reach convergence, leading to a reduction in the overall program running time. These results fully demonstrate the shared nature of the search tree and the probabilistic completeness of the optimal solution of the sampling-based algorithm.

By comparing the baseline algorithm GA and IGA with Expert-GA, we were able to demonstrate the superiority of the improved strategy in the evolutionary algorithm iteration step. To achieve path finding, we adopted a discrete action strategy along with heuristic functions for path searching. The Markovian nature of the path planning allowed us to extend the path search to high-dimensional planes, leveraging discrete UAV actions. The inclusion of a heuristic search guaranteed the completeness of the optimal solution in the path search. Additionally, the inclusion of OBVP prediction helped to align the final path with the real aircraft trajectory. These combined efforts resulted in a more effective and efficient path-planning approach.

Secondly, the expert strategy optimizes the algorithm in two ways: fitting to the optimal path and avoiding local optima. Individuals with high fitness are rewarded, while those with low fitness are guided by the wolf pack algorithm. This drives the population toward the best-explored individuals. These improvements enable the population to converge faster, and the mean fit quickly approaches the best fit. However, this may lead to a population that can only improve with a certain probability of mutation in subsequent iterations, which can easily result in a local optimal solution. To address this, we added the expert guidance strategy of RRT, which has two advantages. Firstly, our RRT tree is always shared, so each RRT expert guide results in a better path than the previous one. The probabilistic completeness of RRT guarantees that the optimal solution can be obtained with enough iterations. Furthermore, the sharing of RRTs ensures that each RRT expert guide does not require a new sampling search, which minimizes the search time and does not affect the algorithm’s iteration speed. Secondly, the randomness of the RRT expert guidance produces paths that are different from those known to the population, providing an exploratory role. Since the obtained paths are excellent, they are also available with high probability, making it possible to jump out of the local optimal solution.

Finally, a comparison between the baseline algorithm and the final ERRT-GA proposed in this paper shows that the overall running time of the task has been significantly reduced. The ERRT-GA can solve the global optimal solution for different numbers of populations and iterations. Furthermore, as the ERRT-GA only requires fewer iterations to obtain the global optimal solution, we have added a truncation mechanism to ERRT-GA. By analyzing multiple experimental data, when the difference between the best fit and the mean fit of the population,

ε = |b e s t f i t - m e a n f i t|

, is less than 0.01, and the difference between

ε_{t}

at iter times

t

and

ε_{t - 1}

at iter times

t - 1

,

δ = |ε_{t} - ε_{t - 1}|

, is less than 0.01, the iteration is stopped. Figure 18 shows the actual time taken by various algorithms to obtain the optimal solution in different scenarios. The results show that the proposed ERRT-GA has a good running speed compared with other baseline algorithms. Especially when the number of iterations is large, the ERRT-GA with truncation mechanism can greatly improve the calculation speed of the program and save the calculation cost.

In order to demonstrate the feasibility and superiority of our proposed algorithm, we conducted comparative analyses with several contemporary path planning algorithms under identical experimental conditions and parameters. We compared our proposed method with Reinforcement Learning (RL) [45], Non-dominated Sorting Genetic Algorithm II (NSGA-II) [46], and A*-GA [47], and each algorithm was tested multiple times. While ensuring that the experimental results of each algorithm did not fluctuate too drastically, we selected the best results of each algorithm for comparison. The result is shown in Figure 19. Notably, RL aligns with the characteristics previously mentioned, where its initial training phase is relatively slow, yet it approaches an optimal solution upon reaching a certain level of training. The other two GA-based algorithms, however, tend to exhibit issues with larger population bases. NSGA-II algorithms can converge to the near-optimal value as the number of iterations increases, but it takes more iterations and time. The convergence speed of the A*-GA algorithm is relatively fast, but it can never converge to the near-optimal value. In contrast, our ERRT-GA demonstrates not only fewer iterations and a reduced iteration time but also the capacity to achieve near-optimal solutions. These findings underscore the practicality and superiority of our algorithm.

6. Conclusions

This paper presents an analysis of various path planning algorithms, highlighting their strengths and weaknesses, as well as the challenges that arise in path planning. To address issues such as difficult population initialization, local optimum constraints, and slow convergence of evolutionary algorithms, we propose an improved genetic algorithm utilizing expert strategies. The task is segmented into three modules, namely task assignment, path planning, and multi-UAV conflict detection. The task assignment algorithm incorporates the prediction of UAV kinetics trajectories, while the path planning algorithm employs a new population initialization approach based on RRT and integrates a heuristic search function based on the Markov property in the iterative process of the evolutionary algorithm. An expert guidance strategy is also included for population crossover variation. In the multi-UAV path module, a hierarchical path planning framework is adopted for temporal path discretization, followed by a path smoothing operation. This paper not only describes these algorithms theoretically but also includes numerous comparison experiments and ablation experiments with baseline algorithms. The results from these experiments demonstrate that the algorithms proposed in this paper outperform other methods and significantly improve the optimality and convergence of paths, allowing for a shorter time to obtain the desired path trajectory. The algorithm’s discrete and Markovian nature also makes it applicable to high-dimensional space and more complex maps. In future work, we will focus on extending the algorithm to higher-dimensional spaces and further enhancing the path planning capabilities, conduct in-depth research on the feasibility of parallel computing of PSO and GA in task allocation, and use a semi-physical simulation method to verify the algorithm, that is, build a simulation scene under a semi-physical simulation platform, connect some hardware of the drone to the simulation environment, and then implant the algorithm into the intelligent computing board for experimental verification.

Author Contributions

Conceptualization, H.X. and Z.N.; methodology, H.X., S.C., Z.L. and M.G.; software, Z.N. and Y.Z.; validation, H.X., Z.N. and M.Z.; formal analysis, Z.N., S.C. and M.Z.; investigation, H.X. and Z.L.; resources, H.X. and Z.N.; data curation, Z.N. and Z.L.; writing—original draft preparation, H.X. and Z.N.; writing—review and editing, M.Z. and B.J.; visualization, M.Z.; supervision, M.Z. and B.J.; project administration, M.Z. and B.J.; funding acquisition, B.J. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by National Key Research and Development Program of China (No. 2022YFB4500900), the National Natural Science Foundation of China (No. U21A20484) and the Zhejiang Provincial Natural Science Foundation of China (No. LGG22F020020).

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Tanaka, T.S.T.; Wang, S.; Jørgensen, J.R.; Gentili, M.; Vidal, A.Z.; Mortensen, A.K.; Acharya, B.S.; Beck, B.D.; Gislum, R. Review of Crop Phenotyping in Field Plot Experiments Using UAV-Mounted Sensors and Algorithms. Drones 2024, 8, 212. [Google Scholar] [CrossRef]
Asadzadeh, S.; de Oliveira, W.J.; de Souza Filho, C.R. UAV-based remote sensing for the petroleum industry and environmental monitoring: State-of-the-art and perspectives. J. Pet. Sci. Eng. 2022, 208, 109633. [Google Scholar] [CrossRef]
Mohd Noor, N.; Abdullah, A.; Hashim, M. Remote sensing UAV/drones and its applications for urban areas: A review. IOP Conf. Ser. Earth Environ. Sci. 2018, 169, 012003. [Google Scholar]
Erdelj, M.; Natalizio, E. UAV-assisted disaster management: Applications and open issues. In Proceedings of the 2016 International Conference on Computing, Networking and Communications (ICNC), Kauai, HI, USA, 15–18 February 2016; pp. 1–5. [Google Scholar]
Kakiuchi, R.; Tran, D.T.; Lee, J.H. Evaluation of human behaviour detection and interaction with information projection for drone-based night-time security. Drones 2023, 7, 307. [Google Scholar] [CrossRef]
Wu, Y.; Wu, S.; Hu, X. Multi-constrained cooperative path planning of multiple drones for persistent surveillance in urban environments. Complex Intell. Syst. 2021, 7, 1633–1647. [Google Scholar] [CrossRef]
Saeed, R.A.; Omri, M.; Abdel-Khalek, S.; Ali, E.S.; Alotaibi, M.F. Optimal path planning for drones based on swarm intelligence algorithm. Neural Comput. Appl. 2022, 34, 10133–10155. [Google Scholar] [CrossRef]
Yosuf, R.H.; Mokhtar, R.A.; Saeed, R.A.; Alhumyani, H.; Abdel-Khalek, S. Scheduling Algorithm for Grid Computing Using Shortest Job First with Time Quantum. Intell. Autom. Soft Comput. 2022, 31, 581–590. [Google Scholar] [CrossRef]
Aggarwal, S.; Kumar, N. Path planning techniques for unmanned aerial vehicles: A review, solutions, and challenges. Comput. Commun. 2020, 149, 270–299. [Google Scholar] [CrossRef]
Hart, P.E.; Nilsson, N.J.; Raphael, B. A formal basis for the heuristic determination of minimum cost paths. IEEE Trans. Syst. Sci. Cybern. 1968, 4, 100–107. [Google Scholar] [CrossRef]
Stentz, A. Optimal and efficient path planning for partially-known environments. In Proceedings of the 1994 IEEE International Conference on Robotics and Automation, San Diego, CA, USA, 8–13 May 1994; pp. 3310–3317. [Google Scholar]
Koenig, S.; Likhachev, M. D*lite. In Proceedings of the Eighteenth National Conference on Artificial Intelligence and Fourteenth Conference on Innovative Applications of Artificial Intelligence, Edmonton, AB, Canada, 28 July–1 August 2002; pp. 476–483. [Google Scholar]
Fransen, K.; van Eekelen, J. Efficient path planning for automated guided vehicles using A*(Astar) algorithm incorporating turning costs in search heuristic. Int. J. Prod. Res. 2023, 61, 707–725. [Google Scholar] [CrossRef]
Fan, G.; Xing, X.; Han, Y.; Chen, M.; Gui, H. Path planning for ground target reconnaissance based on improved Astar algorithm. In Proceedings of the 2021 China Automation Congress (CAC), Beijing, China, 22–24 October 2021; pp. 2092–2097. [Google Scholar]
LaValle, S. Rapidly-Exploring Random Trees: A New Tool for Path Planning; Research Report 9811; Department of Computer Science, Iowa State University: Ames, IA, USA, 1998. [Google Scholar]
Karaman, S.; Frazzoli, E. Sampling-based algorithms for optimal motion planning. Int. J. Robot. Res. 2011, 30, 846–894. [Google Scholar] [CrossRef]
Gammell, J.D.; Srinivasa, S.S.; Barfoot, T.D. Informed RRT*: Optimal sampling-based path planning focused via direct sampling of an admissible ellipsoidal heuristic. In Proceedings of the 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, Chicago, IL, USA, 14–18 September 2014; pp. 2997–3004. [Google Scholar]
Lindqvist, B.; Agha-Mohammadi, A.A.; Nikolakopoulos, G. Exploration-RRT: A multi-objective path planning and exploration framework for unknown and unstructured environments. In Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic, 27 September–1 October 2021; pp. 3429–3435. [Google Scholar]
Ge, J.; Liu, L.; Dong, X.; Tian, W. Trajectory planning of fixed-wing UAV using kinodynamic RRT algorithm. In Proceedings of the 2020 10th International Conference on Information Science and Technology (ICIST), Bath, London, Plymouth, UK, 9–15 September 2020; pp. 44–49. [Google Scholar]
Meng, Q.; Chen, K.; Qu, Q. PPSwarm: Multi-UAV Path Planning Based on Hybrid PSO in Complex Scenarios. Drones 2024, 8, 192. [Google Scholar] [CrossRef]
Wang, L.; Zhang, X.; Deng, P.; Kang, J.; Gao, Z.; Liu, L. An Energy-Balanced Path Planning Algorithm for Multiple Ferrying UAVs Based on GA. Int. J. Aerosp. Eng. 2020, 2020, 3516149. [Google Scholar] [CrossRef]
Zhang, L.; Zhang, L.; Liu, S.; Zhou, J.; Papavassiliou, C. Three-dimensional underwater path planning based on modified wolf pack algorithm. IEEE Access 2017, 5, 22783–22795. [Google Scholar] [CrossRef]
Wang, Y.; Li, W.; Jiang, R. A novel hybrid algorithm based on improved particle swarm optimization algorithm and genetic algorithm for multi-UAV path planning with time windows. In Proceedings of the 2022 IEEE 5th Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC), Chongqing, China, 16–18 December 2022; Volume 5, pp. 1005–1009. [Google Scholar]
Wang, X.; Meng, X. UAV online path planning based on improved genetic algorithm. In Proceedings of the 2019 Chinese Control Conference (CCC), Guangzhou, China, 27–30 July 2019; pp. 4101–4106. [Google Scholar]
Yang, T.; Yang, F.; Li, D. A New Autonomous Method of Drone Path Planning Based on Multiple Strategies for Avoiding Obstacles with High Speed and High Density. Drones 2024, 8, 205. [Google Scholar] [CrossRef]
Rahim, S.; Razaq, M.M.; Chang, S.Y.; Peng, L. A reinforcement learning-based path planning for collaborative UAVs. In Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing, New York, NY, USA, 25–29 April 2022; pp. 1938–1943. [Google Scholar]
Huang, X.; Wang, W.; Ji, Z.; Cheng, B. Representation Enhancement-Based Proximal Policy Optimization for UAV Path Planning and Obstacle Avoidance. Int. J. Aerosp. Eng. 2023, 2023, 6654130. [Google Scholar] [CrossRef]
Sharon, G.; Stern, R.; Felner, A.; Sturtevant, N.R. Conflict-based search for optimal multi-agent pathfinding. Artif. Intell. 2015, 219, 40–66. [Google Scholar] [CrossRef]
Li, J.; Ruml, W.; Koenig, S. Eecbs: A bounded-suboptimal search for multi-agent path finding. Proc. AAAI Conf. Artif. Intell. 2021, 35, 12353–12362. [Google Scholar] [CrossRef]
Li, J.; Harabor, D.; Stuckey, P.J.; Ma, H.; Gange, G.; Koenig, S. Pairwise symmetry reasoning for multi-agent path finding search. Artif. Intell. 2021, 301, 103574. [Google Scholar] [CrossRef]
Li, J.; Tinka, A.; Kiesel, S.; Durham, J.W.; Kumar, T.S.; Koenig, S. Lifelong multi-agent path finding in large-scale warehouses. Proc. AAAI Conf. Artif. Intell. 2021, 35, 11272–11281. [Google Scholar] [CrossRef]
Chen, J.; Li, J.; Fan, C.; Williams, B.C. Scalable and safe multi-agent motion planning with nonlinear dynamics and bounded disturbances. Proc. AAAI Conf. Artif. Intell. 2021, 35, 11237–11245. [Google Scholar] [CrossRef]
Mueller, M.W.; Hehn, M.; D’Andrea, R. A computationally efficient motion primitive for quadrocopter trajectory generation. IEEE Trans. Robot. 2015, 31, 1294–1310. [Google Scholar] [CrossRef]
Lian, L.; Zong, X.; He, K.; Yang, Z. Trajectory optimization of unmanned surface vehicle based on improved minimum snap. Ocean. Eng. 2024, 302, 117719. [Google Scholar] [CrossRef]
Mellinger, D.; Kumar, V. Minimum snap trajectory generation and control for quadrotors. In Proceedings of the 2011 IEEE International Conference on Robotics and Automation, Shanghai, China, 9–13 May 2011; pp. 2520–2525. [Google Scholar]
Stellato, B.; Banjac, G.; Goulart, P.; Bemporad, A.; Boyd, S. OSQP: An operator splitting solver for quadratic programs. Math. Program. Comput. 2020, 12, 637–672. [Google Scholar] [CrossRef]
Kennedy, J.; Eberhart, R. Particle swarm optimization. In Proceedings of the ICNN’95-International Conference on Neural Networks, Perth, WA, Australia, 27 November–1 December 1995; Volume 4, pp. 1942–1948. [Google Scholar]
Holland, J.H. Genetic algorithms. Sci. Am. 1992, 267, 66–73. [Google Scholar] [CrossRef]
Karaboga, D. An Idea Based on Honey Bee Swarm for Numerical Optimization; Technical report-tr06; Erciyes University, Engineering Faculty, Computer Engineering Department: Kayseri, Türkiye, 2005. [Google Scholar]
Dorigo, M.; Maniezzo, V.; Colorni, A. Ant system: Optimization by a colony of cooperating agents. IEEE Trans. Syst. Man Cybern. Part B 1996, 26, 29–41. [Google Scholar] [CrossRef]
Loscos, D.; Martí-Oliet, N.; Rodríguez, I. Generalization and completeness of stochastic local search algorithms. Swarm Evol. Comput. 2022, 68, 100982. [Google Scholar] [CrossRef]
Wolpert, D.H.; Macready, W.G. No free lunch theorems for optimization. IEEE Trans. Evol. Comput. 1997, 1, 67–82. [Google Scholar] [CrossRef]
Muñoz, A.; Rubio, F. Evaluating genetic algorithms through the approximability hierarchy. J. Comput. Sci. 2021, 53, 101388. [Google Scholar] [CrossRef]
Cheng, Z.; Sun, Y.; Liu, Y. Path planning based on immune genetic algorithm for UAV. In Proceedings of the 2011 International Conference on Electric Information and Control Engineering, Wuhan, China, 15–17 April 2011; pp. 590–593. [Google Scholar]
Tu, G.T.; Juang, J.G. Path planning and obstacle avoidance based on reinforcement learning for UAV application. In Proceedings of the 2021 International Conference on System Science and Engineering (ICSSE), Ho Chi Minh City, Vietnam, 26–28 August 2021; pp. 352–355. [Google Scholar]
Deb, K.; Pratap, A.; Agarwal, S.; Meyarivan, T. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 2002, 6, 182–197. [Google Scholar] [CrossRef]
Li, Y.; Dong, D.; Guo, X. Mobile robot path planning based on improved genetic algorithm with A-star heuristic method. In Proceedings of the 2020 IEEE 9th Joint International Information Technology and Artificial Intelligence Conference (ITAIC), Chongqing, China, 11–13 December 2020; Volume 9, pp. 1306–1311. [Google Scholar]

Figure 1. UAV swarm already has a wide range of applications in different fields, and this chart lists some example applications of military, civilian, and commercial UAV swarm.

Figure 2. In extreme environments, it is difficult for UAVs using original GA to find initialization paths, because of the disadvantages of traditional GA using path continuity functions.

Figure 3. Smooth trajectory computed from OBVP. By discretizing the actions and then solving for the trajectory of each action, the shortest trajectory (green line) is obtained.

Figure 4. The MSG trajectory of the UAV solved using the OSQP solver. The UAV starts from the starting point and passes through each way point, finally reaching the ending point to form a smooth trajectory.

Figure 5. This flow chart shows the general flow of the ERRT-GA proposed in this paper. The input is represented by Map. It passes through the task assignment module, the path planning module, and the multi-UAV path conflict detection module in turn, and finally obtains a smooth trajectory that meets the requirements.

Figure 6. Flowchart of the path planning module. The RRT module is a separate module that can be used for path initialization and also for expert policies. A feasible unsmoothed path is finally output by the path planning module.

Figure 7. With an RRT search tree, several non-optimal paths that can reach the target point can be searched in a short time.

Figure 8. A randomly generated map in one of the experiments. Green dots are starting points, red dots are task points, and black are obstacles. All are randomly generated.

Figure 9. Task assignment experiment. (a) TSP; (b) CVRP with the starting point in the center; (c) CVRP with the starting point in random position; (d) the path derived from CVRP using OBVP as a cost function.

Figure 10. For the comparison experiment of whether to use OBVP as a cost function for solving CVRP. (a) Trajectory derived from CVRP without using OBVP as a cost function and without considering the aircraft dynamics prediction; (b) Trajectory derived from CVRP with OBVP as a cost function and considering the aircraft dynamics prediction.

Figure 11. Path planning experiments with only a single objective to demonstrate the effect of population initialization. The green dot indicates the starting point, and the red dot indicates the ending point. (a) GA; (b) IGA; (c) Expert-GA; (d) GA-RRT; (e) IGA-RRT; (f) ERRT-GA.

Figure 12. Path trajectory smoothing and obstacle expansion. (a) Single-pass path planning with a smooth path; (b) Path planning with added obstacle expansion volume.

Figure 13. A complete path planning experiment. (a) Path planning of one UAV using ERRT-GA in complex map; (b) Path planning for five different UAVs using ERRT-GA.

Figure 14. Results of an experiment using the RRT algorithm for population initialization.

Figure 15. The variation in fitness with the number of iterations for different algorithms in a complete task using GA, IGA, PSO, GA-RRT, IGA-RRT, Expert-GA, and ERRT-GA, where NP = 10 and ITER = 50 (a solid line represents the best fit, and a dotted line represents the mean fit).

Figure 16. The variation in fitness with the number of iterations for different algorithms in a complete task using GA, IGA, PSO, GA-RRT, IGA-RRT, Expert-GA, and ERRT-GA, where NP = 10 and ITER = 200 (a solid line represents the best fit, and a dotted line represents the mean fit).

Figure 17. The variation in fitness with the number of iterations for different algorithms in a complete task using GA, IGA, PSO, GA-RRT, IGA-RRT, Expert-GA, and ERRT-GA, where NP = 50 and ITER = 200 (a solid line represents the best fit, and a dotted line represents the mean fit).

Figure 18. Comparison of ERRT-GA calculation time with other algorithms after applying the truncation mechanism.

Figure 19. Results diagram of total path length for different algorithms in the same experimental environment.

Table 1. The state of UAV in the beginning and end.

Time	Position	Velocity	Acceleration
$t = 0$	$p_{0}$	0	0
$t = T$	$p_{T}$	0	0

Table 2. Parameters in the experiment.

Map Size	Task Number	UAV Number	ITERMAX (Task)	ITERMAX (Path)
[43, 36] and [86, 72]	31	10	3000	200

Table 3. Population initialization time comparison between three different algorithms for small, medium, and large population sizes.

Algorithm	NP = 10	NP = 50	NP = 100
GA	0.3132 s	∞	∞
Expert-GA	0.2955 s	1.4241 s	2.8302 s
Ours (ERRT-GA)	0.3365 s	1.4803 s	2.8922 s

Table 4. Result from different algorithms with NP = 10, ITER_MAX = 50.

Algorithm	Mean Fit	Best Fit	Convergence Times	Time
GA	57.07	54.52	36	21.88 s
PSO	63.49	59.01	8	54.11 s
IGA	58.01	52.52	48	22.11 s
GA-RRT	56.32	55.11	6	19.08 s
IGA-RRT	56.23	52.76	7	19.95 s
Expert-GA	50.76	50.76	9	15.86 s
ERRT-GA	50.18	50.18	27	16.74 s

Table 5. Results from different algorithms with NP = 10, ITER_MAX = 200.

Algorithm	Mean Fit	Best Fit	Convergence Times	Time
GA	55.43	53.94	174	87.11 s
PSO	63.49	59.01	8	256.58 s
IGA	56.11	52.52	48	88.11 s
GA-RRT	55.78	53.11	114	74.02 s
IGA-RRT	55.42	52.70	7	79.74 s
Expert-GA	50.18	50.18	62	61.61 s
ERRT-GA	50.18	50.18	27	63.80 s

Table 6. Results from different algorithms with NP = 50, ITER_MAX = 200.

Algorithm	Mean Fit	Best Fit	Convergence Times	Time
GA	54.38	51.35	68	345.82 s
PSO	60.51	53.94	10	446.22 s
IGA	55.11	51.35	53	348.96 s
GA-RRT	55.93	53.11	44	359.15 s
IGA-RRT	55.81	53.11	61	362.69 s
Expert-GA	50.76	50.76	6	324.81 s
ERRT-GA	50.18	50.18	27	296.85 s

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Xu, H.; Niu, Z.; Jiang, B.; Zhang, Y.; Chen, S.; Li, Z.; Gao, M.; Zhu, M. ERRT-GA: Expert Genetic Algorithm with Rapidly Exploring Random Tree Initialization for Multi-UAV Path Planning. Drones 2024, 8, 367. https://doi.org/10.3390/drones8080367

AMA Style

Xu H, Niu Z, Jiang B, Zhang Y, Chen S, Li Z, Gao M, Zhu M. ERRT-GA: Expert Genetic Algorithm with Rapidly Exploring Random Tree Initialization for Multi-UAV Path Planning. Drones. 2024; 8(8):367. https://doi.org/10.3390/drones8080367

Chicago/Turabian Style

Xu, Hong, Zijing Niu, Bo Jiang, Yuhang Zhang, Siji Chen, Zhiqiang Li, Mingke Gao, and Miankuan Zhu. 2024. "ERRT-GA: Expert Genetic Algorithm with Rapidly Exploring Random Tree Initialization for Multi-UAV Path Planning" Drones 8, no. 8: 367. https://doi.org/10.3390/drones8080367

APA Style

Xu, H., Niu, Z., Jiang, B., Zhang, Y., Chen, S., Li, Z., Gao, M., & Zhu, M. (2024). ERRT-GA: Expert Genetic Algorithm with Rapidly Exploring Random Tree Initialization for Multi-UAV Path Planning. Drones, 8(8), 367. https://doi.org/10.3390/drones8080367

Article Menu

ERRT-GA: Expert Genetic Algorithm with Rapidly Exploring Random Tree Initialization for Multi-UAV Path Planning

Abstract

1. Introduction

2. Related Work

3. Preparatory Knowledge

3.1. Optimal Boundary Value Problem

3.2. Multi-Segment Trajectory Generation

4. Algorithm Model

4.1. Problem Model Paradigm

4.2. Task Assignment Based on Improved PSO-GA

4.3. ERRT-GA for Path Planning

4.3.1. Population Initialization

4.3.2. Expert Genetic Iteration

4.3.3. Multi-UAV Path Conflict Detection

5. Experiment and Result Analysis

5.1. Experiment Settings

5.2. Experiment and Analysis

5.2.1. Task Assignment

5.2.2. Path Planning

6. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI