Online Unmanned Aerial Vehicles Search Planning in an Unknown Search Environment

Duan, Haopeng; Xiao, Kaiming; Liu, Lihua; Chen, Haiwen; Huang, Hongbin

doi:10.3390/drones8070336


Online Unmanned Aerial Vehicles Search Planning in an Unknown Search Environment^†

by Haopeng Duan, Kaiming Xiao *, Lihua Liu, Haiwen Chen and Hongbin Huang

Laboratory for Big Data and Decision, National University of Defense Technology, Changsha 410073, China

* Author to whom correspondence should be addressed.

† This paper is an extended version of our conference paper: Huang, H.; Duan, H.; Liu, L.; Xiao, K. (2022). A Two-Stage Framework for Online Unmanned Aerial Vehicles Search Planning. In Proceedings of the 5th International Conference on Algorithms, Computing and Artificial Intelligence (ACAI), New York, NY, USA, 23–25 December 2022.

Drones 2024, 8(7), 336; https://doi.org/10.3390/drones8070336

Submission received: 12 June 2024 / Revised: 9 July 2024 / Accepted: 18 July 2024 / Published: 19 July 2024


Abstract

Unmanned Aerial Vehicles (UAVs) have been widely used in localized data collection and information search. However, real-world UAV search operations still face many practical challenges, such as unknown search environments. Specifically, the payoff and cost at each search point are not known to the planner in advance, which poses a great challenge to decision making. UAV search decisions therefore have to be made sequentially in an online manner, adapting to the unknown search environment. To this end, this paper initiates the problem of online decision making in UAV search planning, where the drone has a limited energy supply as a constraint and has to make an irrevocable online decision either to search the current area or to route to the next one. To overcome the challenge of the unknown search environment, a joint-planning approach is proposed, where both route selection and search decision are made in an integrated online manner. The integrated online decision is made through online linear programming, which is proved to be near-optimal and results in high information search revenue. Furthermore, this joint-planning approach can be applied to multi-round online UAV search planning scenarios, where it shows a clear advantage in first-mover dominance of gathering information. The effectiveness of the proposed approach is validated on a widely used dataset, and experimental results show the superior performance of online search decision making.

Keywords: unmanned aerial vehicles; information search; unknown environment; online search planning; online linear programming

1. Introduction

With the rapid development of unmanned technology, Unmanned Aerial Vehicles (UAVs) have been widely applied in information search [1], environmental monitoring [2], border reconnaissance [3], supply delivery [4], etc. This is mainly because of their advantages in maneuverability, portability and flexibility [5,6]. Generally, a UAV operation includes two aspects, i.e., routing and searching, which are both technical foundations for UAV applications and directly affect the efficiency of operation execution [7,8,9].

UAV search planning, an important means of information collection, has attracted attention because of the advantages of UAVs over the traditional manned aircraft approach. Recent studies have tried to address the challenge of joint optimization over both the routing and searching processes in the UAV search planning problem [10,11]. Traditional approaches follow an offline two-stage paradigm that considers path selection and searching time allocation sequentially, with all decisions made before action. In the first stage, only the routing problem is solved, using methods such as greedy search [12] and Ant Colony Optimization (ACO) [13]; information such as the searching payoff and cost at each search point is not considered. The second stage then focuses on the allocation of time or cost, where both the difficulty and the potential value of searching a specific point are taken into account. For this second stage, a series of approaches have been proposed, including the quick score method based on the significance of information revenue [14], a heuristic method based on a decentralized partially observable Markov process [10], and time allocation based on the Newton method [15,16]. These methods all presume that the searching payoff and cost parameters are known before decisions are made.

Unfortunately, in many real-world UAV search operations, the searching payoff and cost are revealed in an online manner rather than being known before action. For instance, when UAVs are used for search and rescue in the aftermath of an earthquake, the cost and benefit associated with each search location are unknown and become available only upon reaching that specific search point. However, few studies have investigated this challenge of online revealed information in UAV search planning in an unknown environment. To this end, this paper first addresses the problem of online decision making in UAV search planning, where the payoff and cost at each search point are unknown in advance. Furthermore, because many real-world UAV search missions require multiple trips with battery replacement/charging and data offloading in between, owing to the limited energy budget of UAVs [17], the issue of multi-round online UAV search planning is also addressed in this paper. The main contributions of this work are as follows:

A novel online UAV search planning problem and the multi-round planning problem are proposed to address the challenge of online revealed payoff and cost of search actions in an unknown environment.
A joint-planning approach is proposed to solve this problem, where both route selection and search decision are made in an integrated online manner that is near-optimal relative to the omniscient offline solution.
The proposed approach can help planners achieve an online policy for the multi-round UAV search planning problem, which shows great superiority in the first-mover dominance of gathering information.

The effectiveness of the proposed approach is validated on a widely used dataset [18], and experimental results show the superior performance of online search decision making.

2. Related Works

In recent years, a large number of scholars have been attracted to studying UAV planning problems. We here give a brief review of UAV planning problems from the following three aspects: UAV path planning, UAV search planning, and online planning in unknown environments.

2.1. UAV Route Planning

UAV path planning is one of the core technologies of UAV planning, in which the planner tries to find the best path for the UAV to reach a specified location under the constraints of the hardware of a drone, flight environment, and performance. Essentially, UAV path planning is an optimization problem under various constraints [19].

Traditional path planning algorithms include the Dijkstra, A*, and artificial potential field algorithms. The Dijkstra algorithm [20] takes a greedy approach and has been used to realize path planning for UAVs [21]. Building upon the Dijkstra algorithm, the A* algorithm was proposed in [21] to introduce global information for estimating and evaluating the shortest path, which makes the search more efficient. Reference [22] put forward an artificial potential field algorithm composed of attractive and repulsive forces. This algorithm produces smooth and safe paths; however, it is susceptible to local minima and may fail to attain the optimal solution.

Moreover, the problem addressed in this study can also be regarded as an optimization problem with stochastic rewards. Reference [23] presented an exact algorithm (referred to as ESA) for solving the orienteering problem with stochastic rewards, as well as a bi-objective genetic algorithm (GA). ESA determines the convex Pareto frontier by solving multiple deterministic instances of the Orienteering Problem with Stochastic Profits (OPSP), while GA is employed to generate approximate Pareto frontiers. For the team version of the optimization problem with stochastic rewards, reference [24] addressed a problem known as Team Orienteering Coverage Planning with Uncertain Reward (TOCPUR). This problem considers uncertain and increasing costs at each location and permits multiple visits to the same vertex. An iterative approach is proposed, which estimates the cumulative cost of each vertex and iteratively solves a variant of the Team Orienteering Problem (TOP). This method surpasses exact TOP solutions and greedy algorithms in terms of cost optimization.

2.2. UAV Search Planning

UAV search planning can be divided into dynamic and static types according to the probability modeling method used in the search process [25]. In static search, the probability at each search point is known, so search planning can be transformed into optimal path planning in the search area. In dynamic search, by contrast, the probabilities at the other search points change with the search time and search order as the search proceeds, so the UAV has to perceive environmental changes and make decisions online; choosing a reasonable environment model is therefore the key. For instance, reference [26] proposed an online replanning approach for using autonomous UAVs to map agricultural fields. This method employs the Markov random field framework to represent the perception of uncertain maps and computes optimized pest sampling strategies. In addition, according to the number of search UAVs, search planning can be divided into two types, i.e., single-UAV and multi-UAV search planning. Reference [11] proposed a multi-UAV-based search time allocation and cooperative search algorithm to obtain the optimal solution for time allocation under a given path. Studies on UAV search planning have also been extended from single-target to multi-target missions. Reference [17] proposed the Gen-Path algorithm based on the multi-objective problem to improve the efficiency of UAVs searching for multiple targets.

2.3. Online Planning in Unknown Environments

Online decision making in unknown environments is critical to improving the effectiveness of UAV applications. Most UAV planning nowadays targets globally known or locally known environments and pre-plans search trajectories offline. However, when facing an unknown environment, these methods search less efficiently, lack dynamic resilience, and mostly reduce the target search problem to a simple area coverage problem [27]. Reference [28] proposed a real-time motion planning framework for mobile robots based on random sampling for dynamically uncertain environments. The robust asymptotically optimal RRT* algorithm is proposed for bounded uncertain linear systems. The CC-RRT algorithm was proposed for unbounded strongly uncertain systems, which can obtain probabilistically feasible path solutions in linear systems satisfying Gaussian distribution assumptions. References [29,30] proposed a series of PSO algorithms based on path planning and objective search. An improved particle swarm optimization method was proposed that applies the collaboration rules of the robot system to the potential field function and uses it as the PSO fitness function for robot collaborative search in an unknown environment [31]. Reference [32] proposed a distributed PSO-based algorithm to overcome the limitation of the robot workspace so that the PSO-based algorithm can reach the global optimum.

3. Problem Description and Notations

In the problem of UAV search planning, the planner aims to maximize the information collection, i.e., the accumulated searching payoff over a set of search points (areas), under a limited energy supply. Both routing and searching consume UAV energy during the operation, which is referred to as cost in this paper. Traditionally, both the cost of traveling an edge and the cost of searching a point are assumed to be prior knowledge; hence, plans can be made offline before taking actions. However, this is not the case in many real-world applications. In searching operations, the cost of searching a point to collect information is usually not known before the UAV reaches and evaluates the search point, although the cost of traveling a path of a given distance is generally prior knowledge. Therefore, this paper focuses on the problem of UAV search planning where the searching cost is revealed sequentially online.

Specifically, $G = (V, E)$ denotes a directed network with a set of vertices $V = \{v_{1}, v_{2}, \dots, v_{n}\}$ and a set of edges $E = \{(i, j) \mid i, j \in V\}$. At each search point $i \in V$, the searching cost $a_{i}$ represents the energy consumption of searching (vector form $\mathbf{a}$), while $c_{ij}$ (vector form $\mathbf{c}$) denotes the energy cost of traveling through edge $(i, j) \in E$. The energy budget of a UAV, denoted by $b$, is limited and cannot support searching all points in $G$. The UAV aims to plan a traveling route considering all search points, decide which points to reach, and conduct the search operations. Let $x_{i} \in \{0, 1\}$ be the binary decision variables of the search selection (vector form $\mathbf{x} \in \{0, 1\}^{n}$), with $x_{i} = 1$ if the UAV chooses to spend energy $a_{i}$ to search vertex $i$. The corresponding payoff of searching the vertices is $\mathbf{r} = (r_{i})_{i \in V} \in \mathbb{R}^{n}$. We denote by $y_{ij} \in \{0, 1\}$ the binary decision variables of the path selection (vector form $\mathbf{y} \in \{0, 1\}^{|E|}$), with $y_{ij} = 1$ if the planner selects edge $(i, j)$ to traverse. The UAV should select a route that visits each vertex only once and starts and ends at the same vertex, i.e., the UAV base. The objective of the planner is to maximize the accumulated searching payoff over $V$, i.e., $\sum_{i \in V} r_{i} x_{i}$, while keeping the total cost of operations within the budget, i.e.,

$$\sum_{(i, j) \in E} c_{ij} y_{ij} + \sum_{i \in V} a_{i} x_{i} \leq b, \tag{1}$$

where $\sum_{(i, j) \in E} c_{ij} y_{ij}$ denotes the accumulated cost of traveling, and $\sum_{i \in V} a_{i} x_{i}$ represents the accumulated cost of searching.
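As a concrete reading of constraint (1), the following minimal Python sketch evaluates the total energy cost and the accumulated payoff of a given plan. The function names, the dictionary-based data layout, and the toy numbers are all assumptions made for illustration; they do not come from the paper.

```python
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[int, int]


def plan_cost_and_payoff(
    route: List[int],
    searched: Iterable[int],
    travel_cost: Dict[Edge, float],
    search_cost: Dict[int, float],
    payoff: Dict[int, float],
) -> Tuple[float, float]:
    """Return (total energy cost, accumulated payoff) of a plan.

    `route` is a closed tour such as [1, 2, 3, 1]; its consecutive pairs are
    the edges with y_ij = 1. `searched` holds the vertices with x_i = 1.
    """
    searched = list(searched)
    travel = sum(travel_cost[(i, j)] for i, j in zip(route, route[1:]))
    search = sum(search_cost[i] for i in searched)
    reward = sum(payoff[i] for i in searched)
    return travel + search, reward


def within_budget(total_cost: float, budget: float) -> bool:
    """Constraint (1): travel cost plus search cost must not exceed b."""
    return total_cost <= budget


# Toy instance; every number here is made up for illustration.
travel_cost = {(1, 2): 2.0, (2, 3): 3.0, (3, 1): 4.0}
search_cost = {1: 0.0, 2: 5.0, 3: 6.0}
payoff = {1: 0.0, 2: 10.0, 3: 7.0}
route = [1, 2, 3, 1]

cost_a, reward_a = plan_cost_and_payoff(route, [2], travel_cost, search_cost, payoff)
cost_b, reward_b = plan_cost_and_payoff(route, [2, 3], travel_cost, search_cost, payoff)
print(cost_a, reward_a, within_budget(cost_a, 15.0))
print(cost_b, reward_b, within_budget(cost_b, 15.0))
```

In this toy instance the same tour is feasible when only vertex 2 is searched and infeasible when vertices 2 and 3 are both searched, which is exactly the trade-off between routing and searching energy that the planner has to resolve.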

A descriptive schematic of the UAV search planning problem is shown in Figure 1. The start point of the UAV is $v_{1}$, where the pair of searching cost and payoff is $(a_{1}, r_{1}) = (0, 0)$ since the UAV does not need to search for information at its base. By traversing the edge $(1, 2) \in E$ and consuming the corresponding energy cost $c_{12}$ for traveling, the UAV can reach vertex $v_{2}$, where the pair of searching cost and payoff $(a_{2}, r_{2})$ is first revealed to the planner in the online version. The UAV should choose the route and decide online whether to search each vertex it encounters.

To formulate the Online UAV Search Planning problem (OnUAVSP) formally, we define the history of search planning up to stage $t - 1$ as

$$H_{t - 1} = \{a_{j}, r_{j}, \mathbf{c}, x_{j}, y^{t - 1}\}_{j = 1}^{t - 1}, \tag{2}$$

where $y^{t}$ denotes the route accomplished by the UAV at stage $t$. Hence, solving OnUAVSP is equivalent to obtaining policy functions of the history and of the coefficients observed at the current stage, i.e.,

$$(x_{t}, y^{t}) = \psi_{t} (a_{t}, r_{t}, H_{t - 1}). \tag{3}$$

Denote by $\boldsymbol{\psi} = (\psi_{t})_{t \in \{1, 2, \dots, |V|\}}$ the vector of policy functions, which can be time dependent.

Due to the energy budget of UAVs, many real-world UAV search missions require multiple trips with battery replacement/charging and data offloading in between [17]. As shown in Figure 2, since the energy budget is limited and can only support searching some of the vertices, the UAV needs to return to the base vertex for battery replacement/charging. The UAV search mission is divided into two trips (i.e., trip 1: $v_{1} \to v_{2} \to v_{4} \to v_{5} \to v_{3} \to v_{1}$ and trip 2: $v_{1} \to v_{6} \to v_{8} \to v_{7} \to v_{1}$) with battery replacement/charging at $v_{1}$ in between.

Therefore, we consider a Multi-round Online UAV Search Planning problem (MOUAVSP) in which $k$ rounds of sub-missions are assigned to a UAV for searching the specified region. At each round $\tau \in K = \{1, 2, \dots, k\}$, the accumulated searching payoff $\pi_{\tau}$ can be obtained by solving the OnUAVSP problem at this round.

Accordingly, we can formulate problem MOUAVSP as follows:

$$[\text{P-MOUAVSP}] \quad \Lambda_{k}^{*} = \max_{P} \sum_{\tau \in K} \pi_{\tau},$$

where $V_{\tau}$ denotes the vertex set selected by the UAV to traverse at round $\tau$, and $P = \{P_{1}, P_{2}, \dots, P_{k}\}$ represents a partition of $V \setminus \{v_{1}\}$ into $k$ sets of vertices, which are chosen by the UAV to traverse in different rounds of the search mission. That is, solving MOUAVSP amounts to finding an optimal partition $P^{*}$ that maximizes $\sum_{\tau \in K} \pi_{\tau}$.

4. An Online Joint-Planning Approach

In the traditional two-stage approach, the UAV route planning is performed offline, and energy is first spent on traversing the planned vertices. Although the UAV search planning process can then be performed online, only the remaining energy budget is available for optimizing the decisions of data collection and information search. This is likely to lead to a locally optimal solution. Hence, a joint-planning approach is proposed, where both the route selection and the search decision are made in an integrated online manner so as to improve the overall revenue of the information search.

4.1. Online UAV Greedy Route Planning

When the energy budget is limited, exploring more valuable vertices and reducing the energy consumption on routing are two fundamental options to improve the revenue of the information search. We first model the route selection at each stage $t$ as an online decision making problem in which the UAV decides online which vertex is to be the next search point. We denote by $V$ the final selected set of vertices in this mission round. Let $S_{t} \in V$ be the location state of the UAV, and $bl_{t}$ be the energy budget left at stage $t$. We denote by $\varphi$ the online route planning function at each stage as follows:

$$S_{t} = \varphi (S_{t - 1}, bl_{t - 1}, \mathbf{c}), \tag{4}$$

where

$$bl_{t} = b - \sum_{i = 1}^{t} a_{i} x_{i} - L (\{S_{i}\}_{i = 1}^{t}), \tag{5}$$

and $L (\{S_{i}\}_{i = 1}^{t})$ denotes the energy consumed by traversing the vertex sequence $\{S_{i}\}_{i = 1}^{t}$.

The final aim of the online route planning is to reduce the energy consumption on routing as much as possible, thereby maximizing the accumulated searching payoff under the limited UAV energy budget. Meanwhile, since a UAV needs to return to its base before its energy is exhausted, the following constraint should be considered during the route planning:

$$bl_{t - 1} - L (\{S_{t - 1}, S_{t}\}) \geq L (\{S_{t}, S_{1}\}). \tag{6}$$

As shown in Figure 3, the UAV is now located at vertex $v_{4}$. According to constraint (6), vertex $v_{6}$ is feasible for search planning only when the total energy consumed by traveling from $v_{4}$ to $v_{6}$ and then from $v_{6}$ to $v_{1}$ does not exceed the current remaining energy budget.

Different from the offline approach of solving a traveling salesman problem, an online greedy route selection method is proposed, in which the UAV greedily selects the nearest vertex that has not been traversed as the next search point at each stage $t$, i.e.,

$$S_{t} = \arg\min_{v_{i}} L (\{S_{t - 1}, v_{i}\}), \tag{7}$$

where $v_{i} \notin \{S_{j}\}_{j = 1}^{t - 1}$, and the cost of traversing from $S_{t - 1}$ to $S_{t}$ is denoted by $\bar{c}_{t}$.
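The greedy selection in (7), combined with the return-to-base check in (6), can be sketched in a few lines of Python. This is only an illustration: it assumes that the travel energy between two vertices equals their Euclidean distance, and the function name `greedy_next_vertex` and the coordinate dictionary are hypothetical.

```python
import math
from typing import Dict, Optional, Set, Tuple

Point = Tuple[float, float]


def greedy_next_vertex(
    current: int,
    visited: Set[int],
    remaining_budget: float,
    coords: Dict[int, Point],
    base: int,
) -> Optional[Tuple[int, float]]:
    """Pick the nearest untraversed vertex (Eq. 7) if the UAV can still
    fly home from it (constraint 6); otherwise return None.

    Assumption: energy cost of an edge = Euclidean distance of its endpoints.
    """
    candidates = [v for v in coords if v not in visited]
    if not candidates:
        return None
    nxt = min(candidates, key=lambda v: math.dist(coords[current], coords[v]))
    step = math.dist(coords[current], coords[nxt])   # c-bar_t
    back = math.dist(coords[nxt], coords[base])      # L({S_t, S_1})
    if remaining_budget - step >= back:
        return nxt, step
    return None


# Toy instance; coordinates are made up for illustration.
coords = {1: (0.0, 0.0), 2: (3.0, 4.0), 3: (6.0, 8.0)}
print(greedy_next_vertex(1, {1}, 10.0, coords, base=1))
print(greedy_next_vertex(1, {1}, 9.0, coords, base=1))
```

With a remaining budget of 10 the nearest vertex is reachable and the UAV can still return, whereas with a budget of 9 the same vertex fails the return check and the round would end.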

4.2. Integrated Online UAV Search Planning

To solve OnUAVSP, the UAV planner should make irrevocable online decisions on which vertex to access and whether to search for it or not at each stage. After obtaining the online greedy UAV route planning strategy in (7), we next integrate it with the online UAV search planning process.

Similarly, the objective is to maximize the accumulated searching payoff over the greedily selected vertices $V$, i.e., $\sum_{i \in V} r_{i} x_{i}$. When assuming that all online data are known in advance, we can formulate the integrated offline UAV search planning problem (InOffSP) as follows:

$$[\text{P-InOffSP}] \quad R_{n}^{*} = \max_{\mathbf{x} \in \{0, 1\}^{|V|}} \sum_{i \in V} r_{i} x_{i} \quad \text{s.t.} \quad \sum_{i \in V} a_{i} x_{i} \leq b - \sum_{i \in V} \bar{c}_{i}, \tag{8}$$

where $V$ and $\bar{c}_{i}$ are determined by the greedy route planning above.

P-InOffSP is an integer linear program. Since $\mathbf{a}$, $\bar{\mathbf{c}}$ and $\mathbf{r}$ are revealed in an online manner, we cannot obtain the optimal solution $R_{n}^{*}$ and $\mathbf{x}^{*}$ of problem InOffSP. In order to achieve a near-optimal online policy $\psi_{t}$ for OnUAVSP, we can first relax the integer constraint on $\mathbf{x}$ in P-InOffSP and take its linear dual, thereby obtaining the following program:

$$[\text{DLP-InOffSP}] \quad \min_{p \geq 0, \mathbf{s} \geq 0} \; p \Big(b - \sum_{i \in V} \bar{c}_{i}\Big) + \mathbf{1}^{\top} \mathbf{s} \quad \text{s.t.} \quad p \cdot \mathbf{a} + \mathbf{s} \geq \mathbf{r}, \tag{9}$$

by introducing the dual decision variables $p \in \mathbb{R}$ and $\mathbf{s} \in \mathbb{R}^{n}$.
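For readers who want the intermediate step, the relaxed primal and its dual can be written side by side. The shorthand $B$ for the energy left after routing is introduced here only for brevity and is not notation from the paper; the dual variable $p$ prices the budget constraint and each $s_{i}$ prices the bound $x_{i} \leq 1$.

```latex
% Assumption: B := b - \sum_{i \in V} \bar{c}_i is shorthand used only in this sketch.
\begin{aligned}
\text{(relaxed primal)}\quad
  & \max_{\mathbf{x}} \ \mathbf{r}^{\top}\mathbf{x}
  && \text{s.t.}\ \ \mathbf{a}^{\top}\mathbf{x} \le B,\ \ \mathbf{0} \le \mathbf{x} \le \mathbf{1}, \\
\text{(dual)}\quad
  & \min_{p,\,\mathbf{s}} \ pB + \mathbf{1}^{\top}\mathbf{s}
  && \text{s.t.}\ \ p\,\mathbf{a} + \mathbf{s} \ge \mathbf{r},\ \ p \ge 0,\ \ \mathbf{s} \ge \mathbf{0}.
\end{aligned}
```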

We denote by $\mathbf{x}^{*}$, $p_{n}^{*}$ and $\mathbf{s}^{*}$ the optimal solutions of the linear relaxation of InOffSP and of DLP-InOffSP. According to the complementary slackness condition, the following is valid:

$$x_{j}^{*} = \begin{cases} 1, & r_{j} > a_{j} p_{n}^{*} \\ 0, & r_{j} < a_{j} p_{n}^{*} \end{cases} \quad \forall j \in \{1, 2, \dots, n\}, \tag{10}$$

and when $r_{j} = a_{j} p_{n}^{*}$, the decision variable $x_{j}^{*}$ may be a non-integer. Based on Equation (10), we can design an online algorithm (as shown in Algorithm 1) to solve the integrated online UAV search planning problem [33].
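The threshold structure in (10) can be checked numerically on a small offline instance. Because the relaxed problem has a single budget constraint, it is a fractional knapsack, which the standard ratio-greedy rule solves; the ratio of the item that is cut by the budget is then a valid dual price. The sketch below relies on that standard result rather than on anything specific to the paper, assumes $a_{j} > 0$ and $r_{j} \geq 0$, and uses a hypothetical function name and made-up numbers.

```python
from typing import List, Tuple


def lp_relaxation_threshold(
    r: List[float], a: List[float], budget: float
) -> Tuple[List[float], float]:
    """Solve max r.x s.t. a.x <= budget, 0 <= x <= 1 by ratio-greedy.

    Returns (x, p_star), where p_star is the payoff/cost ratio of the item
    that the budget cuts (0.0 if every item fits). Assumes a[j] > 0.
    """
    order = sorted(range(len(r)), key=lambda j: r[j] / a[j], reverse=True)
    x = [0.0] * len(r)
    left = budget
    p_star = 0.0
    for j in order:
        if a[j] <= left:
            x[j] = 1.0              # item fits entirely
            left -= a[j]
        else:
            x[j] = left / a[j]      # critical item, possibly fractional
            p_star = r[j] / a[j]
            break
    return x, p_star


# Toy instance; numbers are made up for illustration.
r = [10.0, 6.0, 4.0]
a = [2.0, 3.0, 4.0]
x, p_star = lp_relaxation_threshold(r, a, budget=4.0)
for j in range(len(r)):
    print(j, x[j], r[j] > a[j] * p_star, r[j] < a[j] * p_star)
```

Items whose payoff exceeds $a_{j} p^{*}$ are fully selected, items below it are rejected, and only the item sitting exactly on the threshold is fractional, which is the behavior that the online rule imitates with an estimated price.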

To evaluate the quality of this algorithm, the expected optimality gap, i.e., the average regret, between the offline solution and the online solution can be calculated as

$$\Delta_{n} = \mathbb{E} [R_{n}^{*} - R_{n}], \tag{11}$$

where $R_{n}^{*} = \mathbf{r}^{\top} \mathbf{x}^{*}$ and $R_{n} = \mathbf{r}^{\top} \mathbf{x}$ denote the accumulated payoff under the optimal offline solution and under online Algorithm 1, respectively. When we assume that the online parameters $\mathbf{r}$, $\bar{\mathbf{c}}$ and $\mathbf{a}$ are sampled independently and identically from an unknown distribution, which is usually reasonable in real-world applications, Algorithm 1 is near-optimal and achieves $O (\frac{1}{\sqrt{n}})$ average regret [33] with respect to problem InOffSP.

In this online joint-planning framework, the UAV planner makes an online decision at each stage on which vertex to access (step 4) and, once the return-to-base check in step 5 passes, on whether to search it (step 6).

Algorithm 1 Online joint-planning algorithm for InOnSP

Input: $n$, online revealed $(\mathbf{r}, \mathbf{a})$ and $\mathbf{c}$
Parameter: learning rate $\gamma_{t} = \frac{1}{\sqrt{n}}$
Output: an online searching sequence $\mathbf{x}$

1: Initialize $p_{1} = 0$ and $t = 1$
2: while $t \leq n$ do
3:   Let $e = \frac{b - \sum_{i = 1}^{t} \bar{c}_{i}}{n}$
4:   Select $S_{t}$ and obtain $\bar{c}_{t}$ through $S_{t} = \arg\min_{v_{i}} L (\{S_{t - 1}, v_{i}\})$
5:   if $bl_{t - 1} - L (\{S_{t - 1}, S_{t}\}) \geq L (\{S_{t}, S_{1}\})$ then
6:     Set $x_{t} = \begin{cases} 1, & r_{t} > a_{t} p_{t} \\ 0, & r_{t} \leq a_{t} p_{t} \end{cases}$
7:     Compute $p_{t + 1} = \max \{p_{t} + \gamma_{t} (a_{t} x_{t} - e), 0\}$
8:     Update $t = t + 1$
9:   else
10:    break
11:  end if
12: end while
13: return $\mathbf{x} = (x_{i})_{i \in \{1, 2, \dots, t\}}$
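To make the control flow of Algorithm 1 concrete, here is a minimal Python sketch of one mission round. It is an illustration under several assumptions that are not part of the paper: travel energy is taken to be Euclidean distance, $n$ is taken to be the number of non-base vertices, $e$ is computed right after $\bar{c}_{t}$ is known, and an extra `affordable` guard (not in Algorithm 1) refuses a search that would leave too little energy to fly home. All names and numbers are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def online_joint_plan(
    coords: Dict[int, Point],
    r: Dict[int, float],
    a: Dict[int, float],
    budget: float,
    base: int,
) -> Tuple[List[int], Dict[int, int], float, float]:
    """One round of greedy routing plus dual-price search decisions.

    Returns (route, x, payoff, remaining energy before flying home).
    r[v] and a[v] are read only when the UAV arrives at v (online reveal).
    """
    n = len(coords) - 1                 # assumed: all non-base vertices
    gamma = 1.0 / math.sqrt(n)          # learning rate of Algorithm 1
    p = 0.0                             # dual price, p_1 = 0
    remaining = budget                  # bl_t
    travel_total = 0.0                  # running sum of c-bar_i
    route = [base]
    x: Dict[int, int] = {}
    for _ in range(n):
        current = route[-1]
        candidates = [v for v in coords if v not in route]
        nxt = min(candidates, key=lambda v: math.dist(coords[current], coords[v]))
        step = math.dist(coords[current], coords[nxt])
        back = math.dist(coords[nxt], coords[base])
        if remaining - step < back:     # constraint (6) fails: stop the round
            break
        travel_total += step
        e = (budget - travel_total) / n
        # Added guard (assumption, not in Algorithm 1): keep energy to return.
        affordable = remaining - step - a[nxt] >= back
        x_t = 1 if (r[nxt] > a[nxt] * p and affordable) else 0
        p = max(p + gamma * (a[nxt] * x_t - e), 0.0)
        remaining -= step + a[nxt] * x_t
        route.append(nxt)
        x[nxt] = x_t
    payoff = sum(r[v] * x[v] for v in x)
    return route, x, payoff, remaining


# Toy instance; coordinates, payoffs and costs are made up for illustration.
coords = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0), 3: (3.0, 0.0)}
r = {1: 5.0, 2: 0.5, 3: 4.0}
a = {1: 4.0, 2: 4.0, 3: 1.0}
route, x, payoff, remaining = online_joint_plan(coords, r, a, budget=12.0, base=0)
print(route, x, payoff, remaining)
```

In the toy run, searching the expensive first vertex pushes the price $p$ above zero, so the low-value second vertex is skipped; skipping it lets the price decay again before the third vertex is evaluated.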

4.3. Multi-Round Online UAV Search Planning

After obtaining an online search planning algorithm for a single round of the search mission in an unknown environment, we then apply it to the multi-round scenarios.

As mentioned above, the route selection and the search decision are made together in an online manner in the joint-planning framework. That is, the planner does not need to determine in advance the subset of vertices that can be covered with the limited energy budget $b$ in each trip. Hence, this framework makes it possible to achieve a policy for the problem MOUAVSP. At each round $\tau$, the UAV receives a searching payoff $\pi_{\tau}$ and a set $P_{\tau}$ of vertices that have been traversed in an online manner. By removing the vertices in $P_{\tau}$ from the entire set $V$ after each round, the online UAV search planning can be applied directly to multi-round scenarios.
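The multi-round procedure amounts to a loop around any single-round planner. The sketch below keeps the planner abstract: `plan_round` is a hypothetical callable that receives the vertices still available and returns the round payoff $\pi_{\tau}$ together with the traversed set $P_{\tau}$. The toy planner used in the example is a stand-in for illustration only, not the algorithm of this paper.

```python
from typing import Callable, Iterable, List, Set, Tuple

RoundPlanner = Callable[[Set[int]], Tuple[float, Iterable[int]]]


def multi_round_search(
    vertices: Iterable[int], k: int, plan_round: RoundPlanner
) -> List[float]:
    """Run up to k rounds and return the payoff sequence (pi_1, ..., pi_k).

    After each round, the traversed vertices P_tau are removed from the set
    of vertices that later rounds may visit.
    """
    remaining = set(vertices)
    payoffs: List[float] = []
    for _ in range(k):
        if not remaining:
            break
        payoff, traversed = plan_round(set(remaining))
        payoffs.append(payoff)
        remaining -= set(traversed)
    return payoffs


def toy_planner(available: Set[int]) -> Tuple[float, List[int]]:
    """Stand-in planner: visit the two smallest ids, payoff = sum of ids."""
    chosen = sorted(available)[:2]
    return float(sum(chosen)), chosen


payoffs = multi_round_search([2, 3, 4, 5, 6], k=3, plan_round=toy_planner)
print(payoffs, sum(payoffs))
```

Passing the planner in as an argument keeps the loop independent of how a single round is solved, so the same driver could wrap either an online or an offline single-round method.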

To evaluate the performance of the planning approach on MOUAVSP, the summation of the searching payoff over all rounds, i.e., $\Lambda = \sum_{\tau \in K} \pi_{\tau}$, is an intuitive criterion. However, this summation criterion cannot reflect the process characteristics of information collection in multi-round scenarios. For example, the payoff summations of two planning approaches may be the same, but mission planners prefer the one that collects more information in the early stage, especially in emergency scenarios such as search-and-rescue operations [1,5]. Therefore, we design another criterion to evaluate the ability of first-mover information collection. We denote by $\Pi_{i} = \{\pi_{1}^{(i)}, \pi_{2}^{(i)}, \dots, \pi_{k}^{(i)}\}$ the sequence of searching payoffs under plan $i$ during the multi-round trips.

Definition 1. (First-mover Dominance) Plan $i$ dominates plan $j$ at ratio $\alpha$ as a first-mover if

$$\begin{cases} \pi_{\tau}^{(i)} > \pi_{\tau}^{(j)}, & \forall \tau \leq \alpha k \\ \pi_{\tau}^{(i)} \leq \pi_{\tau}^{(j)}, & \forall \tau > \alpha k \end{cases}, \tag{12}$$

where $\alpha \in [0, 1]$ denotes the magnitude of first-mover dominance over plan $j$.
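Definition 1 translates directly into a small check over two payoff sequences. The following Python sketch is one possible reading; the function name and the payoff numbers are made up, and the small tolerance added before flooring $\alpha k$ is an implementation assumption to avoid floating-point surprises.

```python
import math
from typing import Sequence


def dominates_as_first_mover(
    pi_i: Sequence[float], pi_j: Sequence[float], alpha: float
) -> bool:
    """Check condition (12): plan i is strictly better in every round
    tau <= alpha*k and no better in every round tau > alpha*k."""
    if len(pi_i) != len(pi_j):
        raise ValueError("both plans must cover the same number of rounds")
    k = len(pi_i)
    cut = math.floor(alpha * k + 1e-9)   # last round index with tau <= alpha*k
    for tau in range(1, k + 1):
        better = pi_i[tau - 1] > pi_j[tau - 1]
        if tau <= cut and not better:
            return False
        if tau > cut and better:
            return False
    return True


# Toy payoff sequences over k = 4 rounds; numbers are made up.
plan_i = [9.0, 8.0, 3.0, 2.0]
plan_j = [5.0, 5.0, 4.0, 4.0]
print(dominates_as_first_mover(plan_i, plan_j, alpha=0.5))
print(dominates_as_first_mover(plan_i, plan_j, alpha=0.75))
```

Note that the definition is two-sided: a plan that is better in every round does not dominate at any ratio below 1, because the second condition requires it to fall behind (or tie) after round $\alpha k$.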

5. Experimental Results and Discussion

In this section, we evaluate the performance of the proposed joint-planning approach on the online UAV search planning problem and the multi-round problem using a commonly used dataset named TSPLIB [18]. The proposed programs and algorithm are tested on a Windows 10 (64-bit) computer with an Intel Core i7 CPU and 16.0 GB RAM using the Gurobi 9.0.1 solver.

5.1. Performance Analysis on Single-Round Online Search Planning

In this experiment, we compare the performance of different approaches to solving problem OnUAVSP. We first analyze the common situation in which the energy budget of the UAV is limited and not sufficient to traverse all vertices. As shown in Table 1, we test both the joint-planning and the offline two-stage approaches on the same cases from TSPLIB with the same energy budget $b = 75$. In the offline two-stage approach, the routing plan over all vertices is made first, and the remaining energy of the UAV is then used optimally to select vertices to search in an omniscient way. In this setting, the offline two-stage approach is more likely to converge to a local optimum, where energy is wasted on routing even though the search costs are fully known, while the online joint-planning approach suffers from the unknown environment. Nevertheless, the proposed online joint-planning approach still outperforms the offline one in the cases ‘att48’ and ‘pr1002’, achieving a higher online revenue as shown in Table 1. Meanwhile, the online joint-planning approach attains a high competitive ratio in all cases, ranging from 88.99% to 94.52%.

Secondly, the time efficiency of both approaches is tested. We denote by $T$ and $T^{*}$ the time consumed by the online policy and the offline policy, respectively. Note that the time consumption of the two-stage approach includes two parts, i.e., the time for offline route planning (solving a TSP through the ACO algorithm [13] in this paper) and the time for offline search planning (solving an integer linear program). As shown in Table 1, the proposed online joint-planning approach is far more time-efficient than the offline two-stage approach, reporting 97.79 versus 63,702.50 CPU seconds in the case ‘pr1002’ (i.e., roughly 650 times faster, or nearly three orders of magnitude). These results illustrate that the proposed online joint-planning approach in Algorithm 1 is superior in time efficiency to the offline two-stage one when the energy budget of the UAV is limited.

Finally, we analyze the impact of the energy budget on the performance of both approaches. We record the accumulated search payoffs under both approaches as the energy budget $b$ increases from 25 to 350 in the case ‘ch130’, as shown in Figure 4. When the energy budget is relatively small (lower than 125), the payoff under the online joint-planning approach is clearly larger than that under the offline two-stage one. To illustrate this phenomenon intuitively, we randomly select two specific UAV search planning solutions for the situation in which the energy budget can only support searching some of the vertices and the UAV needs to return to the base vertex for battery replacement/charging. The results for the case ‘ch130’ are shown in Figure 5. The star symbol represents the base of the UAV, and a square denotes a vertex that is chosen to be searched. The straight lines with arrows represent the flight route of the UAV under the different policies. With the same energy budget $b = 75$, the solution under the offline two-stage approach first pays for traversing all vertices and then assigns the remaining energy budget to valuable vertices so as to accomplish the search tasks (cf. Figure 5a), reporting a revenue of 94.45. In Figure 5b, the total energy budget is allocated to routing and search tasks as a whole in an integrated online manner, and the online joint-planning approach reports a revenue of 141.80. Owing to this integrated allocation, more of the energy budget is assigned to search tasks, and less energy is wasted on unproductive routing.

5.2. Performance Analysis on Multi-Round Online Search Planning

This experiment focuses on the performance of different approaches on multi-round online search planning. The summation of the searching payoff over multiple rounds, $\Lambda$, and the criterion of first-mover dominance are both discussed for the case ‘ch130’ with an energy budget of $b = 75$ for each round. Because the energy for each round is limited and the number of points to be searched is relatively large, we set the number of rounds to $k = 10$. As shown in Figure 6, the summations of the searching payoff over 10 rounds are 968.98 under the online joint-planning approach and 866.16 under the offline two-stage approach. Moreover, the two approaches progress differently through the multi-round mission. The online joint-planning approach tends to receive a higher payoff in the early rounds of the mission, while the payoff under the offline two-stage approach fluctuates randomly. Specifically, the online joint-planning approach achieves a higher information collection payoff than the offline two-stage method in each of the first six rounds. This observation suggests that, under the same energy constraints, the online joint-planning approach acquires information faster than the offline two-stage method. Quantitatively, the plan derived from the online joint-planning approach dominates the plan under the offline two-stage approach at a ratio of 0.6 as a first-mover. In conclusion, the online joint-planning approach surpasses the offline two-stage method in terms of both the overall information collection payoff and the speed of acquiring it.

6. Conclusions

In this paper, we address the problem of online decision making in UAV search planning as well as the problem in multi-round scenarios, where the payoff and cost at each search point are unknown for the planner in advance. UAV search decisions should be made sequentially in an online manner, thereby adapting to the unknown search environment.

We propose a novel online UAV search planning problem and the multi-round planning problem to address the challenge of dynamically revealed payoffs and costs associated with search actions in an unknown environment.
To overcome the uncertainty of the search environment, we introduce an online joint-planning approach. This approach integrates route selection and search decision making in a near-optimal manner, closely approximating the omniscient offline solution.
Our proposed approach enables planners to develop an online policy for the multi-round UAV search planning problem, demonstrating significant advantages in terms of first-mover dominance and information gathering capabilities.

The effectiveness of the proposed approach is validated on a widely used dataset. Experimental results show superior performance in both the time efficiency and the solution quality of online search decision making, reporting several orders of magnitude improvement in time efficiency and superiority in terms of first-mover information collection. In future work, we will study more complex environments, taking into account factors such as wind direction and obstacles. Our objective is to deploy the algorithm in practical scenarios such as post-earthquake UAV rescue missions and UAV data acquisition within wireless sensor networks.

Author Contributions

Conceptualization, K.X. and H.D.; methodology, K.X.; software, H.D.; validation, K.X., L.L. and H.H.; formal analysis, K.X. and H.D.; investigation, K.X.; resources, K.X.; data curation, L.L.; writing—original draft preparation, H.D.; writing—review and editing, K.X.; visualization, H.D.; supervision, H.C. and H.H.; project administration, L.L., H.C. and H.H.; funding acquisition, H.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Datasets used in this study are available at http://comopt.ifi.uni-heidelberg.de/software/TSPLIB95/ (accessed on 15 July 2023).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

UAV	Unmanned Aerial Vehicle
ACO	Ant Colony Optimization
TSP	Traveling Salesman Problem
OnUAVSP	Online UAV Search Planning problem
ILP	Integer Linear Programming
MTZ	Miller–Tucker–Zemlin
OffSP	Offline UAV Search Planning problem
OnSP	Online Search Planning problem
InOffSP	Integrated Offline UAV Search Planning problem

Figure 1. Routing a UAV to search for and collect information.

Figure 2. A schematic diagram of the multi-round UAV search mission.

Figure 3. A schematic diagram of the return-back constraint for UAV routing.

Figure 4. The impact of the energy budget on the accumulated online search payoff.

Figure 5. A comparison of solutions under two approaches.

Figure 6. The performance of different approaches on multi-round online search planning.

Table 1. Performance comparison of two approaches.

Case Name	Offline $R_{n}^{*}$	Offline $\rho$ (%)	Offline $T^{*}$	Online $R_{n}$	Online $R_{n}^{*}$	Online $\rho$ (%)	Online $T$	Online $T^{*}$
att48	46.98	45.16	8.61	58.16	65.26	88.99	1.97	2.53
ch130	210.66	75.66	72.70	181.25	196.66	92.21	7.06	7.83
tsp225	373.13	86.57	309.93	324.19	345.51	93.81	14.82	15.61
gr431	590.08	83.90	1707.88	513.56	549.24	93.49	36.54	37.30
pr1002	1068.86	84.89	63702.50	1074.30	1136.48	94.52	97.79	98.63
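The gap in the $T^{*}$ columns of Table 1 can be made concrete with a small calculation. The sketch below assumes that both $T^{*}$ columns report running time in the same unit, and simply divides the offline value by the online value for each case; the values are copied from the table, while the variable and function names are hypothetical.

```python
# (case name, offline T*, online T*) copied from Table 1.
# Assumption: both columns are running times in the same unit.
TABLE_1_TIMES = [
    ("att48", 8.61, 2.53),
    ("ch130", 72.70, 7.83),
    ("tsp225", 309.93, 15.61),
    ("gr431", 1707.88, 37.30),
    ("pr1002", 63702.50, 98.63),
]


def speedups(rows):
    """Map each case to the ratio offline T* / online T*."""
    return {case: offline_t / online_t for case, offline_t, online_t in rows}


for case, ratio in speedups(TABLE_1_TIMES).items():
    print(f"{case}: {ratio:.1f}x")
```

Under this assumption, the ratio grows with instance size, from roughly 3 for att48 to roughly 646 for pr1002.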

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
