Article

Multi-UAVs Tracking Non-Cooperative Target Using Constrained Iterative Linear Quadratic Gaussian

College of Aerospace Science and Engineering, National University of Defense Technology, Changsha 410000, China
*
Author to whom correspondence should be addressed.
Drones 2024, 8(7), 326; https://doi.org/10.3390/drones8070326
Submission received: 1 May 2024 / Revised: 24 June 2024 / Accepted: 9 July 2024 / Published: 15 July 2024

Abstract

This study considers the problem of controlling multiple unmanned aerial vehicles (UAVs) to consistently track a non-cooperative ground target with uncertain motion in a hostile environment with obstacles. An active information acquisition (AIA) problem is formulated to minimize the uncertainty of the target tracking task. The uncertain motion of the target is represented as a Wiener process. First, we optimize the configuration of the UAV swarm considering collision avoidance, the horizontal field of view (HFOV), and the communication radius to calculate the reference trajectories of the UAVs. Next, a novel algorithm called Constrained Iterative Linear Quadratic Gaussian (CILQG) is introduced to track the reference trajectory. The target's state with uncertainty and the UAV state are described as beliefs. The CILQG algorithm utilizes the Unscented Transform to propagate the belief regarding the UAVs' motions, while also accounting for the impact of navigation errors on the target tracking process. The target position estimation error of the proposed method is under 4 m, and the error of tracking the reference trajectories is under 3 m. The estimation error remains unchanged even in the presence of obstacles. Therefore, this approach effectively deals with the uncertainties involved and ensures accurate tracking of the target.

1. Introduction

Unmanned aerial vehicle (UAV) swarms have been extensively applied for surveillance in a hostile complex environment. A crucial surveillance scenario involves persistent tracking of suspicious ground targets [1] using a swarm of UAVs. Typically, the tracking process consists of two steps. (1) UAVs in the swarm collaboratively estimate the state of the target, i.e., position and velocity, based on the measurements from onboard devices, and (2) UAVs adjust their trajectories to ensure the motion of the target can be persistently and accurately estimated. These two steps have been intensively and separately investigated [2,3]. Specifically, the studies on the first step always assume precise knowledge of UAV trajectories [4], and the studies on the second step typically assume prior knowledge of the target’s state [5].
When the trajectories of the UAVs are precisely known, the first step can be recast as an optimal estimation problem. Various filters, such as the original Kalman filter for linear systems, the Extended Kalman Filter (EKF) [6], the Unscented Kalman Filter (UKF) [7], and the particle filter for nonlinear systems, have been widely employed. For uncertain target motion, the interacting multiple model (IMM) filtering algorithm offers a solution by employing multiple Kalman-type filters to estimate the state of the target synchronously and fusing the solutions from the filters. When the motion of the target is nonlinear, the original IMM filter is modified into the IMM-UKF [8] or IMM-CKF [3]. However, IMM filters rely on accurate model assumptions, and their performance can degrade if the true target motion deviates significantly from these models. In practical tracking tasks, the trajectories of the UAVs are not perfectly accurate due to errors in the positions provided by navigation units and deviations from the planned trajectory. These factors should not be overlooked.
Under the assumption that the motion of the target of interest has already been obtained, the second step focuses on planning an optimal trajectory to accurately track the target. Various approaches consider constraints like obstacle avoidance, dynamics, communication, and energy consumption to plan optimal trajectories [5,9]. Control strategies like the Iterative Linear Quadratic Regulator (ILQR) [10,11] then guide the UAVs along these planned trajectories. However, the assumption that the states of non-cooperative targets are already known is impractical since their motion is unpredictable.
Belief space planning is a methodology that integrates estimation with control, commonly applied to robot navigation [12] and autonomous driving [13]. In this case, the robots estimate their own motion to form the belief state. The maximum a posteriori (MAP) estimate is used in [12] to minimize the uncertainty of the robots. However, in dynamic target tracking scenarios, the robots are supposed to estimate the target motion and their own motion. The challenge of bilateral uncertainties triggers the need for a refined control strategy. To tackle this issue, we propose a two-stage optimization process for tracking. The first stage focuses on minimizing the target uncertainty under the frame of an active information acquisition (AIA) problem [14]. Subsequently, the second stage entails tracking reference trajectories using Constrained Iterative Linear Quadratic Gaussian (CILQG), which addresses the UAV’s uncertainty reduction.
AIA is specifically used to describe the target tracking process via a swarm of robots, which attempts to jointly estimate the motion of the target and optimize the actions of robots in the swarm, recognizing their interdependence. In other words, the two aforementioned steps are combined. The trajectories of robots are dynamically planned and optimized to achieve accurate target tracking. Approaches to solving the AIA problem can be categorized as greedy or non-myopic. Greedy approaches [15] focus on myopic decision-making over non-Gaussian processes. Although myopic approaches offer computational efficiency, they are susceptible to local optima. Non-myopic robot-motion algorithms [14,16] provide optimal solutions under linear-Gaussian assumptions but suffer from high computational demands, especially in complex scenarios. Distributed non-myopic algorithms [17,18] alleviate this burden but necessitate all-to-all communication. Despite progress, a critical gap remains in developing non-myopic AIA algorithms that are both computationally efficient and communication-thrifty. Moreover, practical AIA tasks involve complex environments with potential obstacles, sudden failures, and adversarial attacks. While Resilient Active Information gathering [19,20] addresses some of these challenges, obstacle avoidance often remains overlooked.
The objective functions commonly employed in AIA problems typically revolve around maximizing information gain about the target’s state. This can be achieved by either maximizing the mutual information between the target states and measurements [16] or minimizing the conditional entropy of the target states given the measurements [14]. Based on the Gaussian assumption, the AIA problem simplifies to minimizing the covariance of the target estimation [14]. This approach aligns with the goal of reducing uncertainty in the target’s state over a finite time horizon [21]. Alternatively, maximizing the Fisher Information Matrix (FIM) [22,23,24,25,26] offers another avenue for uncertainty minimization. By optimizing control actions to maximize the FIM, we enhance the system’s sensitivity to changes in the target’s state, leading to improved observability and reduced uncertainty.
The searching and sampling processes used to solve the AIA problem via non-myopic optimization are time-consuming. A promising alternative lies in determining optimal reference trajectories that adhere to the system constraints and employing efficient tracking controllers. The Iterative Linear Quadratic Regulator (ILQR) [11] emerges as a strong candidate due to its computational advantages over methods such as the deep deterministic policy gradient (DDPG) [27]. ILQR leverages iterative linear-quadratic approximations of the cost function and system dynamics, efficiently solving an LQR subproblem at each step. This method is free of extensive offline computation and pre-processing. The Constrained Iterative Linear Quadratic Regulator (CILQR) [28,29] extends the ILQR method to handle constraints on control inputs and states. Recognizing the inherent uncertainties in UAV navigation and target state estimation, CILQR can be further enhanced by incorporating belief states, leading to CILQG [13,30].
This work studies the problem of persistently tracking a ground target using UAVs in the frame of AIA and optimizes the motions of UAVs over a horizon to minimize the uncertainty of the target states. The contributions of this work are as follows:
  • Bilateral uncertainty from the motion of the target and UAVs poses significant challenges to control, particularly in dynamic target tracking scenarios. To address this challenge, we propose a two-stage optimization process for tracking. The first stage focuses on minimizing the target uncertainty using the Fisher Information Matrix (FIM) under the frame of an AIA problem. The second stage is tracking the reference trajectories using CILQG, which addresses the UAV’s uncertainty reduction.
  • The interdependence between target motion estimation and UAV trajectory tracking also forms a crucial aspect of our approach. Due to the intertwined nature of these two processes, the impact of the bilateral uncertainty on the accuracy of the target estimation and the precision of UAVs’ trajectories will be intensified. Furthermore, the uncertainty inherent in target motion necessitates a control method capable of rapidly adapting to its movements. This challenge is amplified by limited computational resources and the critical need for real-time performance. The reasons above necessitate a control method that balances computational speed and accuracy. While rarely applied to target tracking, the Constrained Iterative Linear Quadratic Gaussian (CILQG) method demonstrates superior computational efficiency. Moreover, CILQG exhibits enhanced robustness to noise, effectively handling system uncertainties. This stems from its use of belief states and trajectory profile tracking to minimize uncertainty propagation. Conversely, conventional solvers typically treat estimated target states as deterministic values within the optimization problem, neglecting both target and robot uncertainties. This can lead to divergence and suboptimal control performance.
This work is organized as follows. In Section 2, we describe the target tracking task executed by a swarm of UAVs. In Section 3, we introduce the observation model, the belief dynamics of the UAVs, and the target deduced by the Unscented Transform. In Section 4, the AIA problem is formulated to solve the optimal configuration according to the visibility constraints and collision avoidance. In addition, the trajectory tracking problem is proposed. In Section 5, we use CILQG to track the reference trajectory. Simulations are conducted in Section 6. The conclusions are provided in Section 7.

2. Description of the Investigated Task

This work focuses on the task of tracking a non-cooperative ground target using a swarm of UAVs, comprising I individual UAVs, as illustrated in Figure 1. The tracking scenario unfolds in a complex environment cluttered with obstacles, demanding robust and efficient strategies. The objective of the UAV swarm is to estimate the target’s uncertain state based on the measurements and select the optimal control inputs to consistently track the target. To achieve this, we reframe the target tracking task as an AIA problem, in which the UAVs optimize their actions to maximize the information gathered from the target over a finite time horizon.
Recognizing the inherent uncertainties in both UAV positioning (estimated by onboard navigation units) and target motion, we represent the states of both the target and UAVs using belief states. The control block diagram of the system is depicted in Figure 2. The belief dynamics are based on the measurements, propagating the beliefs with uncertainty. To maximize the observability while adhering to system constraints arising from HFOV of UAVs, communication radius, and obstacle avoidance, we formulate the AIA problem to determine an optimal configuration for the UAV swarm. The reference trajectory for each UAV is calculated based on the optimal configuration and the belief of the target. Finally, the CILQG is introduced to track the reference trajectories of the UAVs, ensuring an accurate tracking.

3. Model for Target Tracking Problem

3.1. Measurement Acquisition Model

The UAVs are supposed to gather information on the target, and every UAV is equipped with a radar that monitors the ground target within its sensing range. Specifically, the ith UAV receives a set of noisy measurements from the target, and the measurement model is as follows:

$z_{i,k} = h(s^v_{i,k}, s^t_k) + n_k$  (1)

$h(s^v_{i,k}, s^t_k) = \begin{bmatrix} \sqrt{(x^v_{i,k}-x^t_k)^2 + (y^v_{i,k}-y^t_k)^2 + (z^v_{i,k})^2} \\[4pt] \arctan\dfrac{y^v_{i,k}-y^t_k}{x^v_{i,k}-x^t_k} \\[4pt] \arctan\dfrac{z^v_{i,k}}{\sqrt{(x^v_{i,k}-x^t_k)^2 + (y^v_{i,k}-y^t_k)^2}} \end{bmatrix}$  (2)

where $s^v_{i,k} = [x^v_{i,k}, y^v_{i,k}, z^v_{i,k}, \varphi^v_{i,k}, \beta^v_{i,k}]^T$ is the state of the ith UAV at $t_k$, including the position $x^v_{i,k}, y^v_{i,k}, z^v_{i,k}$ in the East–North–Up (ENU) coordinate system, the pitch angle $\varphi^v_{i,k}$, and the yaw angle $\beta^v_{i,k}$. $s^t_k = [x^t_k, y^t_k, \theta^t_k]^T$ denotes the target's state at $t_k$, including the position $x^t_k, y^t_k$ in the ENU coordinate system and the yaw angle $\theta^t_k$. $n_k$ denotes the measurement noise, which is modeled as zero-mean Gaussian white noise with covariance $\Sigma_n$. Therefore, the distribution of $z_{i,k}$ is $\mathcal{N}(h(s^v_{i,k}, s^t_k), \Sigma_n)$.
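Below is a minimal sketch of this measurement model in Python (the language used for the simulations in Section 6). The function names and the noise covariance SIGMA_N are illustrative assumptions, and arctan2 is used in place of arctan for numerical robustness.

import numpy as np

SIGMA_N = np.diag([0.01, 0.0003, 0.0003])  # assumed measurement noise covariance (range, azimuth, elevation)

def h(uav_state, target_state):
    """Range, azimuth and elevation of the target seen from one UAV (Equation (2))."""
    x_v, y_v, z_v = uav_state[:3]
    x_t, y_t = target_state[:2]
    dx, dy = x_v - x_t, y_v - y_t
    ground_range = np.hypot(dx, dy)
    slant_range = np.sqrt(dx**2 + dy**2 + z_v**2)
    azimuth = np.arctan2(dy, dx)
    elevation = np.arctan2(z_v, ground_range)
    return np.array([slant_range, azimuth, elevation])

def measure(uav_state, target_state, rng=np.random.default_rng(0)):
    """Noisy measurement z = h(s_v, s_t) + n_k with n_k ~ N(0, Sigma_n) (Equation (1))."""
    return h(uav_state, target_state) + rng.multivariate_normal(np.zeros(3), SIGMA_N)

# Example: UAV at (0, -100, 300) m observing the target at the origin.
z = measure(np.array([0.0, -100.0, 300.0, 0.0, 0.0]), np.array([0.0, 0.0, 0.0]))
print(z)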

3.2. Belief Dynamics of UAV

The dynamics of an autonomous UAV are described by a discrete-time nonlinear Wiener process:

$s^v_{i,k+1} = s^v_{i,k} + g(s^v_{i,k})(u^v_{i,k} + m_k)\Delta T$  (3)

$g(s^v_{i,k}) = \begin{bmatrix} \cos\varphi^v_{i,k}\cos\beta^v_{i,k} & 0 & 0 \\ \cos\varphi^v_{i,k}\sin\beta^v_{i,k} & 0 & 0 \\ \sin\varphi^v_{i,k} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$  (4)

where $u^v_{i,k} = [V^v_{i,k}\ \dot\varphi^v_{i,k}\ \dot\beta^v_{i,k}]^T$ is the control input of the ith UAV at $t_k$. $m_k$ is the process noise, which is modeled as a zero-mean Gaussian noise with a variance of $\Sigma_m$.

The belief of the ith UAV at $t_k$, $b^v_{i,k}$, is updated according to Equations (5) and (6):

$b^v_{i,k+1} = \int G(s^v_{i,k+1} \mid s^v_{i,k}, u^v_{i,k})\, \hat b^v_{i,k}\, \mathrm{d}s^v_{i,k}, \quad i = 1, 2, 3$  (5)

$\hat b^v_{i,k+1} = \dfrac{\Phi(z^v_{i,k+1} \mid s^v_{i,k+1})\, b^v_{i,k+1}}{\int \Phi(z^v_{i,k+1} \mid s^v_{i,k+1})\, b^v_{i,k+1}\, \mathrm{d}s^v_{i,k+1}}$  (6)

where $G(s^v_{i,k+1} \mid s^v_{i,k}, u^v_{i,k})$ is the UAV state transition density at $t_k$, which can be described by

$G(s^v_{i,k+1} \mid s^v_{i,k}, u^v_{i,k}) = \dfrac{1}{\sqrt{2\pi\, |g(s^v_{i,k})\Sigma_m g(s^v_{i,k})^T|\, \Delta T}} \exp\!\big[ -\big(s^v_{i,k+1} - s^v_{i,k} - g(s^v_{i,k})u^v_{i,k}\Delta T\big) \big(2\Delta T^2 g(s^v_{i,k})\Sigma_m g(s^v_{i,k})^T\big)^{-1} \big(s^v_{i,k+1} - s^v_{i,k} - g(s^v_{i,k})u^v_{i,k}\Delta T\big)^T \big]$  (7)

Since the dynamics of the UAV shown in Equation (3) are nonlinear, the integral in Equation (5) is calculated via the Unscented Transform. Ref. [31] elaborates on the details of the belief propagation via the Unscented Transform.

In Equation (6), $\hat b^v_{i,k}$ is the posterior belief of the UAV's state when the measurement from the onboard device is available, and $\Phi(z^v_{i,k+1} \mid s^v_{i,k+1})$ is the measurement likelihood function according to Section 3.1, which determines the probability density that the navigation unit of the UAV, such as the Inertial Measurement Unit, will receive the measurement $z^v_{i,k+1}$ from the ith UAV with state $s^v_{i,k+1}$ at $t_{k+1}$.
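The following sketch illustrates how the prediction step of Equation (5) can be carried out with the Unscented Transform, in the spirit of Ref. [31]. The function ut_predict, the UT scaling parameter kappa, and the noise values are assumptions made for illustration; the process noise is injected through g(s) evaluated at the propagated mean for simplicity.

import numpy as np

def g(s):
    """Input matrix g(s) of Equation (4); s = [x, y, z, phi, beta]."""
    phi, beta = s[3], s[4]
    G = np.zeros((5, 3))
    G[0, 0] = np.cos(phi) * np.cos(beta)
    G[1, 0] = np.cos(phi) * np.sin(beta)
    G[2, 0] = np.sin(phi)
    G[3, 1] = 1.0
    G[4, 2] = 1.0
    return G

def ut_predict(mean, cov, u, sigma_m, dt, kappa=0.0):
    """Propagate the belief (mean, cov) through s' = s + g(s) u dt, then add process noise."""
    n = mean.size
    S = np.linalg.cholesky((n + kappa) * cov)
    sigma_pts = [mean] + [mean + S[:, i] for i in range(n)] + [mean - S[:, i] for i in range(n)]
    w = np.full(2 * n + 1, 1.0 / (2 * (n + kappa)))
    w[0] = kappa / (n + kappa)
    propagated = np.array([s + g(s) @ u * dt for s in sigma_pts])
    new_mean = w @ propagated
    diffs = propagated - new_mean
    new_cov = diffs.T @ (diffs * w[:, None])
    G_mean = g(new_mean)                                 # process noise enters through g(s) m_k dt;
    new_cov += dt**2 * G_mean @ sigma_m @ G_mean.T       # evaluated at the mean for simplicity
    return new_mean, new_cov

mean0 = np.array([0.0, -100.0, 300.0, 0.0, 0.0])   # [x, y, z, phi, beta]
cov0 = 0.1 * np.eye(5)
u0 = np.array([20.0, 0.0, 0.0])                    # [V, phi_dot, beta_dot]
Sigma_m = np.diag([0.1, 1e-4, 1e-4])               # assumed process noise covariance
mean1, cov1 = ut_predict(mean0, cov0, u0, Sigma_m, dt=0.1)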

3.3. Belief Dynamics of the Target

In this work, the target is a ground vehicle whose motion is assumed to be a discrete-time Wiener process:

$s^t_{k+1} = s^t_k + f(s^t_k)(u^t_k + m_k)\Delta T$  (8)

$f(s^t_k) = \begin{bmatrix} \cos\theta^t_k & 0 \\ \sin\theta^t_k & 0 \\ 0 & 1 \end{bmatrix}$  (9)

where $u^t_k = [V^t_k\ \dot\theta^t_k]^T$ is the control input of the target, including the velocity $V^t_k$ and the angular velocity $\dot\theta^t_k$, allowing the target to change its orientation and speed.

The belief of the target at $t_k$, $b^t_k$, is updated according to Equations (10) and (11):

$b^t_{k+1} = \int F(s^t_{k+1} \mid s^t_k, u^t_k)\, \hat b^t_k\, \mathrm{d}s^t_k$  (10)

$\hat b^t_{k+1} = \dfrac{\Psi(z^t_{\upsilon,k+1} \mid s^v_{\upsilon,k+1}, s^t_{k+1})\, b^t_{k+1}}{\int \Psi(z^t_{\upsilon,k+1} \mid s^v_{\upsilon,k+1}, s^t_{k+1})\, b^t_{k+1}\, \mathrm{d}s^t_{k+1}}$  (11)

where $F(s^t_{k+1} \mid s^t_k, u^t_k)$ is the target state transition density at $t_k$, which can be described by

$F(s^t_{k+1} \mid s^t_k, u^t_k) = \dfrac{1}{\sqrt{2\pi\, |f(s^t_k)\Sigma_m f(s^t_k)^T|\, \Delta T}} \exp\!\big[ -\big(s^t_{k+1} - s^t_k - f(s^t_k)u^t_k\Delta T\big) \big(2\Delta T^2 f(s^t_k)\Sigma_m f(s^t_k)^T\big)^{-1} \big(s^t_{k+1} - s^t_k - f(s^t_k)u^t_k\Delta T\big)^T \big]$  (12)

Since the dynamics of the target shown in Equation (8) are nonlinear, the integral in Equation (10) is also calculated via the Unscented Transform.

In Equation (11), $\hat b^t_k$ is the posterior belief on the target's state when the measurement from the onboard device is available. $\upsilon = \{1, 2, \dots, I\}$ represents the set of UAVs, $s^v_{\upsilon,k+1} = \{s^v_{1,k+1}, s^v_{2,k+1}, \dots, s^v_{I,k+1}\}$ includes the states of all the UAVs of the swarm, and $z^t_{\upsilon,k+1} = \{z^t_{1,k+1}, z^t_{2,k+1}, \dots, z^t_{I,k+1}\}$ contains the measurements received by the UAVs from the target. The measurement likelihood function $\Psi(z^t_{\upsilon,k+1} \mid s^v_{\upsilon,k+1}, s^t_{k+1})$ gives the probability that the UAVs with states $s^v_{\upsilon,k+1}$ will receive a set of measurements $z^t_{\upsilon,k+1}$ from the target with state $s^t_{k+1}$ at $t_{k+1}$.
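A minimal sketch of the measurement update in Equation (11): the joint likelihood Ψ is the product of the per-UAV Gaussian likelihoods, and a sampled representation of the prior target belief is reweighted and normalized. The sample-based representation, the helper names, and the noise values are illustrative assumptions; the paper itself propagates Gaussian beliefs with the Unscented Transform.

import numpy as np

SIGMA_N = np.diag([0.01, 0.0003, 0.0003])

def h(uav_pos, target_xy):
    """Range/azimuth/elevation of the target from one UAV position (Equation (2))."""
    dx, dy = uav_pos[0] - target_xy[0], uav_pos[1] - target_xy[1]
    ground = np.hypot(dx, dy)
    return np.array([np.sqrt(ground**2 + uav_pos[2]**2),
                     np.arctan2(dy, dx),
                     np.arctan2(uav_pos[2], ground)])

def update_target_belief(samples, weights, uav_positions, measurements):
    """Reweight target-state samples by the product of the per-UAV Gaussian likelihoods."""
    log_w = np.log(weights)
    for i, s in enumerate(samples):
        for uav_pos, z in zip(uav_positions, measurements):
            e = z - h(uav_pos, s[:2])
            log_w[i] += -0.5 * e @ np.linalg.solve(SIGMA_N, e)   # log of Psi up to a constant
    log_w -= log_w.max()                                         # normalize in log space for stability
    w = np.exp(log_w)
    return w / w.sum()

# Three UAVs (Table 1 initial positions) observing a target near the origin.
rng = np.random.default_rng(1)
samples = rng.normal(0.0, 2.0, size=(50, 3))      # sampled [x, y, theta] hypotheses of the target
weights = np.full(50, 1.0 / 50)
uavs = [np.array([-50.0, -86.6, 300.0]), np.array([50.0, -86.6, 301.0]), np.array([0.0, -100.0, 302.0])]
zs = [h(p, np.zeros(2)) for p in uavs]            # noise-free measurements of the true target
posterior_w = update_target_belief(samples, weights, uavs, zs)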

3.4. Obstacle Model

The pth obstacle is modeled as a cylinder $O_p[(x^{obs}_p, y^{obs}_p), h^{obs}_p, r^{obs}_p]$, where $(x^{obs}_p, y^{obs}_p)$ is the position of the center of the obstacle, $h^{obs}_p$ is the height, and $r^{obs}_p$ is the radius. Assuming that $h^{obs}_p$ is greater than the flight altitudes of the UAVs, the UAVs cannot fly over the obstacles.

4. Optimization Problem behind the Active Information Acquisition

In this paper, the AIA problem is composed of two cascaded optimization problems. The first optimization problem is to determine an optimal configuration of the swarm relative to the target, and the configuration is transferred to the reference trajectories of the UAVs. The second problem entails tracking these reference trajectories, which are derived from the solution to the first problem.

4.1. Optimization Problem about the Configuration of the Swarm

4.1.1. Pre-Designed Configuration

To ensure continuous target tracking within the UAVs' HFOV, it is crucial to determine the angle between each UAV and the target. Assuming that the radar is fixed on the UAV with an orientation identical to that of the UAV itself, and denoting the radar's HFOV as $\alpha^+$, we need to maintain the target within this angular range. Additionally, effective communication requires that the relative distance between any two UAVs stay within the maximum communication radius, denoted as $R^+$, which depends on the power consumption of the communication devices. Assuming that the UAVs maintain fixed altitudes, we propose a symmetrical configuration of the UAV swarm relative to the target, as illustrated in Figure 3. The blue sector represents the sensor detection range. We aim to optimize both the angle $\alpha_{i,k}$ between the y-axis and the UAV–target line of sight, and the distance $R_{i,k}$ between the UAV and the target. This optimization considers the aforementioned constraints to achieve an optimal configuration for continuous tracking and communication.

4.1.2. Optimization Problem for $\alpha_{i,k}$ and $R_{i,k}$

The optimal trajectories are not directly obtained. Instead, only the configuration of the swarm at the current time step can be determined. According to Section 4.1.1, $\alpha_{i,k}$ and $R_{i,k}$ are the variables that need to be optimized when determining the optimal configuration. When $b^t_k$ is available, the optimal configuration of the swarm at $t_k$ can be determined by solving the following receding horizon optimization problem over an N-length planning horizon:

$\max_{R_{i,k},\, \alpha_{i,k}} J(R_{1:I,k:k+N}, \alpha_{1:I,k:k+N}, b^t_{k:k+N}) = \sum_{i=1}^{I} \sum_{j=k}^{k+N} \mathbb{E}\big[ \big(\nabla_{s^t_j} \ln p(z^t_{\upsilon,1:j} \mid s^t_j)\big) \big(\nabla_{s^t_j} \ln p(z^t_{\upsilon,1:j} \mid s^t_j)\big)^T \big]$  (13)

s.t. $\quad b^t_{j+1} = \lambda^{-1} \int \Psi(z^t_{\upsilon,j+1} \mid s^v_{\upsilon,j+1}, s^t_{j+1})\, F(s^t_{j+1} \mid s^t_j, u^t_j)\, b^t_j\, \mathrm{d}s^t_j$  (14)

$p(z^t_{\upsilon,1:j} \mid s^t_j) = p(z^t_{\upsilon,1}, z^t_{\upsilon,2}, \dots, z^t_{\upsilon,j} \mid s^t_j) = \prod_{j=1}^{k} \Psi(z^t_{\upsilon,j} \mid s^v_{\upsilon,j}, s^t_j) = \prod_{j=1}^{k} \dfrac{1}{\sqrt{2\pi |\Sigma_n|}} \exp\!\big[ -\tfrac{1}{2}\big(z_j - h(\bar s^v_{i,j}, b^t_j)\big)^T \Sigma_n^{-1} \big(z_j - h(\bar s^v_{i,j}, b^t_j)\big) \big]$  (15)

$\big\| [\bar x^v_{i,j}, \bar y^v_{i,j}] - [x^{obs}_p, y^{obs}_p] \big\|_2 > r^{obs}_p, \quad p = 1, 2, \dots, P$  (16)

$\alpha_{i,k} \in [-\alpha^+/2,\ \alpha^+/2]$  (17)

$R_{i,k} \in (0,\ R^+)$  (18)

where $\nabla_{s^t_j}$ is the gradient operator with respect to $s^t_j$, and $p(z^t_{\upsilon,1:j} \mid s^t_j)$ is the batch measurement likelihood [32,33,34,35]. I is the number of UAVs, and P is the number of obstacles. $\bar s^v_{i,j}$ is the reference state of the ith UAV at $t_j$, which is related to the target belief, $\alpha_{i,k}$, and $R_{i,k}$. $\bar s^v_{i,j}$ includes the positions $\bar x^v_{i,j}, \bar y^v_{i,j}, \bar z^v_{i,j}$, the pitch $\bar\varphi^v_{i,j}$, and the yaw $\bar\beta^v_{i,j}$.

The objective function shown in Equation (13) is essentially the FIM accumulated over the horizon from $t_k$ to $t_{k+N}$. As depicted in Figure 4, receding horizon control optimizes the actions to maximize the FIM over a finite prediction horizon and applies the first action of the optimized sequence at the next time step. Equation (14) is the belief dynamics of the target, and Equation (16) is the obstacle avoidance constraint, which keeps the horizontal distance between the UAV and the obstacle center greater than the radius of the obstacle. Equations (17) and (18) restrict the maximum angle and distance of the UAV relative to the target.
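To make the objective of Equation (13) concrete, the sketch below scores a candidate swarm configuration by the Fisher information about the target position accumulated from the range/azimuth/elevation measurements of Equation (2); for Gaussian noise this is the sum of H^T Σ_n^{-1} H over the UAVs, with H the measurement Jacobian. The noise covariance and the example UAV placements are assumptions taken from the simulation setup, and the log-determinant scoring is one possible scalarization of the FIM.

import numpy as np

SIGMA_N = np.diag([0.01, 0.0003, 0.0003])

def measurement_jacobian(uav_pos, target_xy):
    """H = d[range, azimuth, elevation] / d[x_t, y_t] for the model in Equation (2)."""
    dx, dy = uav_pos[0] - target_xy[0], uav_pos[1] - target_xy[1]
    g2 = dx**2 + dy**2
    g = np.sqrt(g2)
    r = np.sqrt(g2 + uav_pos[2]**2)
    return np.array([[-dx / r, -dy / r],
                     [dy / g2, -dx / g2],
                     [uav_pos[2] * dx / (r**2 * g), uav_pos[2] * dy / (r**2 * g)]])

def configuration_fim(uav_positions, target_xy):
    """Sum of the per-UAV Fisher information about the target position."""
    fim = np.zeros((2, 2))
    for p in uav_positions:
        H = measurement_jacobian(p, target_xy)
        fim += H.T @ np.linalg.solve(SIGMA_N, H)
    return fim

# Well-spread configuration (Table 1 initial positions) vs. two UAVs collinear with the target.
target = np.zeros(2)
spread = [np.array([-50.0, -86.6, 300.0]), np.array([50.0, -86.6, 301.0]), np.array([0.0, -100.0, 302.0])]
collinear = [np.array([-100.0, 0.0, 300.0]), np.array([100.0, 0.0, 301.0]), np.array([0.0, -100.0, 302.0])]
print(np.linalg.slogdet(configuration_fim(spread, target))[1],
      np.linalg.slogdet(configuration_fim(collinear, target))[1])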

4.2. Optimization Problem for Trajectory Tracking

The reference trajectory is not directly obtained; it is determined by the optimal configuration and the target belief $b^t_k$. When the optimal $R_{i,k}$ and $\alpha_{i,k}$ are found, the configuration can be converted into the position of the ith UAV relative to the target, $D_{i,k}$. When there are only three UAVs, the reference state of the ith UAV, $\bar s^v_{i,k}$, can be expressed as follows:

$\bar s^v_{i,k} = [\bar x^v_{i,k}, \bar y^v_{i,k}, \bar z^v_{i,k}, \bar\varphi^v_{i,k}, \bar\beta^v_{i,k}]^T = [\tilde x^t_k, \tilde y^t_k, H, 0, \tilde\theta^t_k]^T + [D^T_{i,k}, 0, 0, 0]^T$  (19)

$D_{i,k} = \begin{cases} C_k\,[\,-R_{i,k}\sin\alpha_{i,k},\ -R_{i,k}\cos\alpha_{i,k}\,]^T, & i = 1 \\ C_k\,[\,0,\ -R_{i,k}\,]^T, & i = 2 \\ C_k\,[\,R_{i,k}\sin\alpha_{i,k},\ -R_{i,k}\cos\alpha_{i,k}\,]^T, & i = 3 \end{cases}$  (20)

$C_k = \begin{bmatrix} \cos\tilde\theta^t_k & -\sin\tilde\theta^t_k \\ \sin\tilde\theta^t_k & \cos\tilde\theta^t_k \end{bmatrix}$  (21)

where $C_k$ is the rotation matrix determined by the target heading, H is the reference altitude, and $\tilde x^t_k$, $\tilde y^t_k$, and $\tilde\theta^t_k$ are the belief means of the target's position and yaw angle. To avoid collisions between the UAVs, each UAV maintains its own fixed altitude.
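A sketch of Equations (19)–(21) that turns the optimized configuration and the target belief mean into per-UAV reference states is given below. The sign convention of the offsets (UAVs placed behind the target) and the altitude value are assumptions chosen so that the example reproduces the initial positions listed in Table 1.

import numpy as np

H_ALT = 300.0  # assumed common reference altitude [m]

def reference_states(target_belief_mean, alphas, Rs, altitude=H_ALT):
    """target_belief_mean = [x_t, y_t, theta_t]; returns one 5-dim reference state per UAV."""
    x_t, y_t, theta_t = target_belief_mean
    C = np.array([[np.cos(theta_t), -np.sin(theta_t)],
                  [np.sin(theta_t),  np.cos(theta_t)]])              # Equation (21)
    refs = []
    for a, R in zip(alphas, Rs):
        D = C @ np.array([R * np.sin(a), -R * np.cos(a)])            # offset D_i relative to the target
        refs.append(np.array([x_t + D[0], y_t + D[1], altitude, 0.0, theta_t]))  # Equation (19)
    return refs

# Target belief mean at the origin, configuration of Section 6.2: alpha = [-pi/6, pi/6, 0], R = 100 m.
for s_ref in reference_states([0.0, 0.0, 0.0], [-np.pi / 6, np.pi / 6, 0.0], [100.0] * 3):
    print(np.round(s_ref, 1))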
The reference states $\bar s^v_{i,j}$ compose a reference trajectory $\{\bar s^v_{i,k}, \bar s^v_{i,k+1}, \dots, \bar s^v_{i,k+N}\}$ over an N-length planning horizon; this is how the reference trajectories are derived. The optimization problem of tracking the reference trajectories of the UAVs is described by the following:

$\min_{U_{1:I,k}} J(b^v_{1:I,k:k+N}, U_{1:I,k}) = \sum_{i=1}^{I} \sum_{j=k}^{k+N} \big[ (b^v_{i,j} - \bar s^v_{i,j})^T (b^v_{i,j} - \bar s^v_{i,j}) + (u^v_{i,j})^T W_u\, u^v_{i,j} \big]$  (22)

s.t. $\quad b^v_{i,j+1} = \eta^{-1} \int \Phi(z^v_{i,j+1} \mid s^v_{i,j+1})\, G(s^v_{i,j+1} \mid s^v_{i,j}, u^v_{i,j})\, b^v_{i,j}\, \mathrm{d}s^v_{i,j}$  (23)

$\big\| [\tilde x^v_{i,j}, \tilde y^v_{i,j}] - [x^{obs}_p, y^{obs}_p] \big\|_2 > r^{obs}_p, \quad p = 1, 2, \dots, P$  (24)

$u^v_{i,k} \in [u_{\min},\ u_{\max}]$  (25)

where $U_{i,k} = \{u^v_{i,k}, u^v_{i,k+1}, \dots, u^v_{i,k+N-1}\}$ is the control input sequence of the ith UAV from $t_k$ to $t_{k+N-1}$, $W_u$ is the weight matrix of the control input, and $\tilde x^v_{i,j}$, $\tilde y^v_{i,j}$ are the believed positions of the UAV. $b^v_{i,k}$ is the belief state vector of the UAV at $t_k$. Additionally,

$b^v_{i,k} = [x^v_k, y^v_k, z^v_k, \varphi^v_k, \beta^v_k, \Sigma^{xx}_k, \Sigma^{xy}_k, \dots, \Sigma^{\beta\beta}_k]^T$  (26)

where $x^v_k, y^v_k, z^v_k, \varphi^v_k$, and $\beta^v_k$ are the means of the UAV state, and $\Sigma^{xx}_k, \dots, \Sigma^{\beta\beta}_k$ are the elements of the covariance matrix. $\bar s^v_{i,k}$ includes the reference mean and the reference variance. Therefore, the profile of the trajectory, rather than a specific trajectory, is tracked.
While reference trajectories for the UAVs are generated based on the estimated target belief and optimized configuration, the target belief itself is updated using measurements obtained by the UAVs along their trajectories. This highlights the intertwined nature of these two processes. The accuracy of tracking is affected by the uncertainty in the motion of the target and UAVs, which is represented by the process noise. Additionally, the measurement process is influenced by the measurement noise. Therefore, an effective tracking controller is needed.

5. CILQG for Active Information Acquisition Problem

CILQG provides an elegant solution by incorporating belief states directly into the control framework. This makes it particularly well suited for scenarios like ours, where both the target motion and the UAV states are subject to uncertainty. By constructing a state vector comprising the mean and variance of the UAV belief states, CILQG effectively tracks the entire trajectory profile, encompassing both the mean (blue line in Figure 5) and the variance (blue circle in Figure 5). CILQG mainly consists of a backward pass and a forward pass. In the backward pass, the objective function of tracking the reference trajectory is approximated in quadratic form to calculate the optimal control input. In the forward pass, the belief dynamics of the UAVs are used to update the trajectory and the cost for the next iteration. The constraints in Equations (24) and (25) are converted into costs via barrier functions.

5.1. Backward Pass

The Bellman equation typically defines the cost-to-go from the current time to the terminal time. In the target tracking problem, however, a specific terminal time does not exist, since the tracking process is continuous and lacks a defined endpoint; we therefore recursively optimize the control input sequence over an N-length planning horizon instead of up to a fixed terminal time. Accordingly, Equation (22) is defined as the cost incurred from the current time $t_k$ until the end of the planning horizon, $t_{k+N}$. This approach allows us to continuously adapt the UAV trajectories based on the latest target information.
The optimal cost-to-go V at $t_k$ is as follows:

$V(b^v_{i,k}, k) = \min_{U_{i,k}} J_{i,k}(b^v_{i,k}, U_{i,k})$  (27)

The optimal cost-to-go is separated into two parts, i.e., the terminal cost at $t_{k+N}$ in Equation (28) and the cost-to-go at $t_k$ in Equation (29):

$V(b^v_{i,k}, k+N) = \min_{U_{i,k}} (b^v_{i,k+N} - \bar s^v_{i,k+N})^T W_N (b^v_{i,k+N} - \bar s^v_{i,k+N})$  (28)

$V(b^v_{i,k}, k) = \min_{U_{i,k}} \big[ (b^v_{i,k} - \bar s^v_{i,k})^T W_k (b^v_{i,k} - \bar s^v_{i,k}) + (u^v_{i,k})^T R_k\, u^v_{i,k} + V_{k+1}(b^v_{i,k+1}, k+1) \big]$  (29)

The variation of $V(b^v_{i,k}, k)$ around the belief $b^v_{i,k}$ and the control input $u^v_{i,k}$ can be expressed as a quadratic function of small perturbations, denoted as $Q(\delta b, \delta u)$. It is expanded to second order:

$Q(\delta b, \delta u) = (b^v_{i,k} + \delta b - \bar s^v_{i,k})^T W_k (b^v_{i,k} + \delta b - \bar s^v_{i,k}) - (b^v_{i,k} - \bar s^v_{i,k})^T W_k (b^v_{i,k} - \bar s^v_{i,k}) + (u^v_{i,k} + \delta u)^T R_k (u^v_{i,k} + \delta u) - (u^v_{i,k})^T R_k\, u^v_{i,k} + V_{k+1}(b^v_{i,k+1} + \delta b, k+1) - V_{k+1}(b^v_{i,k+1}, k+1) \approx \dfrac{1}{2} \begin{bmatrix} 1 \\ \delta b_k \\ \delta u_k \end{bmatrix}^T \begin{bmatrix} 0 & Q_x^T & Q_u^T \\ Q_x & Q_{xx} & Q_{xu} \\ Q_u & Q_{ux} & Q_{uu} \end{bmatrix} \begin{bmatrix} 1 \\ \delta b_k \\ \delta u_k \end{bmatrix}$  (30)

in which

$Q_x = c_x + f_x^T V_x$  (31)

$Q_u = c_u + f_u^T V_x$  (32)

$Q_{xx} = c_{xx} + f_x^T V_{xx} f_x + V_x f_{xx}$  (33)

$Q_{uu} = c_{uu} + f_u^T V_{xx} f_u + V_x f_{uu}$  (34)

$Q_{ux} = c_{ux} + f_u^T V_{xx} f_x + V_x f_{ux}$  (35)

where $c_x, c_u, c_{xx}, c_{uu}$, and $c_{ux}$ are the gradients and Hessians of the cost function described in Equation (22), $f_x, f_u, f_{xx}, f_{uu}$, and $f_{ux}$ are the gradients and Hessians of the nonlinear dynamics shown in Equation (3), and $V_x$, $V_{xx}$ are the gradient and Hessian of the optimal cost-to-go defined in Equation (27). The local optimum of $\delta u$ can be found by setting the derivative of $Q(\delta b, \delta u)$ with respect to $\delta u$ to zero:

$\delta u = \arg\min_{\delta u} Q(\delta b, \delta u) = -Q_{uu}^{-1}(Q_u + Q_{ux}\delta b) = K_C + K_A\,\delta b$  (36)

where $K_A = -Q_{uu}^{-1} Q_{ux}$ and $K_C = -Q_{uu}^{-1} Q_u$. By substituting Equation (36) back into Equation (30), we get the following:

$\Delta V(b^v_{i,k}, k) = -\dfrac{1}{2} K_C^T Q_{uu} K_C$  (37)

$V_x(b^v_{i,k}, k) = Q_x - K_A^T Q_{uu} K_C$  (38)

$V_{xx}(b^v_{i,k}, k) = Q_{xx} - K_A^T Q_{uu} K_A$  (39)

By checking $\Delta V$, we can assess whether the current local optimal controller is the global optimal controller. Equations (38) and (39) form the quadratic approximation of $V(b^v_{i,k}, k)$. To approach the local minimum of $V(b^v_{i,k}, k)$, the quadratic approximation of $V(b^v_{i,k}, k)$ and the control feedback coefficients $\{K_A, K_C\}$ are recursively computed backward from $t_{k+N}$.
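The backward recursion of Equations (31)–(39) can be sketched as follows. The toy double-integrator problem, the weights, and the regularization term reg are illustrative assumptions, and the second-order dynamics terms f_xx, f_uu, f_ux of Equations (33)–(35) are omitted, as is common in iLQR-style implementations.

import numpy as np

def backward_pass(fx_list, fu_list, cx_list, cu_list, cxx_list, cuu_list, cux_list,
                  VxN, VxxN, reg=1e-6):
    """Recursively compute gains K_A, K_C and the value expansion, cf. Equations (31)-(39)."""
    N = len(fx_list)
    Vx, Vxx = VxN, VxxN
    K_A, K_C, dV = [None] * N, [None] * N, 0.0
    for k in reversed(range(N)):
        fx, fu = fx_list[k], fu_list[k]
        Qx = cx_list[k] + fx.T @ Vx
        Qu = cu_list[k] + fu.T @ Vx
        Qxx = cxx_list[k] + fx.T @ Vxx @ fx                      # f_xx, f_uu, f_ux terms omitted
        Quu = cuu_list[k] + fu.T @ Vxx @ fu + reg * np.eye(fu.shape[1])
        Qux = cux_list[k] + fu.T @ Vxx @ fx
        Quu_inv = np.linalg.inv(Quu)
        K_C[k] = -Quu_inv @ Qu                                   # feedforward term
        K_A[k] = -Quu_inv @ Qux                                  # feedback gain
        dV += -0.5 * K_C[k] @ Quu @ K_C[k]                       # expected cost change, Equation (37)
        Vx = Qx - K_A[k].T @ Quu @ K_C[k]                        # Equation (38)
        Vxx = Qxx - K_A[k].T @ Quu @ K_A[k]                      # Equation (39)
        Vxx = 0.5 * (Vxx + Vxx.T)
    return K_A, K_C, dV

# Toy example: 4 steps of a 2-state / 1-input double integrator tracked to the origin.
dt, N = 0.1, 4
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
W, R = np.eye(2), 0.1 * np.eye(1)
x_nom = [np.array([1.0, 0.0])] * N
K_A, K_C, dV = backward_pass([A] * N, [B] * N,
                             [2 * W @ x for x in x_nom], [np.zeros(1)] * N,
                             [2 * W] * N, [2 * R] * N, [np.zeros((1, 2))] * N,
                             2 * W @ x_nom[0], 2 * W)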

5.2. Forward Pass

The optimal control input $u_{i,k}$ is the sum of the current control input $\bar u_{i,k}$ and the local optimum of $\delta u$ described in Equation (36). $u_{i,k}$ is fed into the belief dynamics of the UAV to form a new trajectory. The nonlinear nature of our system introduces a challenge: when a newly computed trajectory deviates significantly from the model's region of validity, the expected cost reduction may not materialize. To mitigate this issue, we introduce a backtracking line-search parameter, $0 < \lambda \le 1$, to search for a valid trajectory. This parameter scales the control input update, effectively controlling the step size taken in the direction determined by the optimization process. Different values of $\lambda$ are tested until the cost decreases, so that an appropriate step size is found to improve the control policy in the given search direction.

$\delta b^v_{i,k} = b^v_{i,k} - \bar s^v_{i,k}$  (40)

$u_{i,k} = \bar u_{i,k} + \delta u_{i,k} = \bar u_{i,k} + \lambda K_C + K_A\,\delta b^v_{i,k}$  (41)
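A sketch of the forward pass with the backtracking line search of Equations (40) and (41): the control update is applied with a step-size parameter λ that is shrunk until the rolled-out cost decreases. The placeholder dynamics, cost, and gains are assumptions; in CILQG they would come from the belief dynamics and the backward pass.

import numpy as np

def rollout(b0, refs, u_bar, K_A, K_C, lam, step, cost):
    """Roll the belief forward with u = u_bar + lam*K_C + K_A*(b - b_ref) (Equation (41))."""
    b, traj, inputs, total = b0, [b0], [], 0.0
    for k in range(len(u_bar)):
        u = u_bar[k] + lam * K_C[k] + K_A[k] @ (b - refs[k])
        b = step(b, u)
        traj.append(b)
        inputs.append(u)
        total += cost(b, refs[k + 1], u)
    return traj, inputs, total

def forward_pass(b0, refs, u_bar, K_A, K_C, step, cost, current_cost,
                 lam=1.0, shrink=0.5, max_tries=8):
    """Backtracking line search on lam: accept the first trajectory whose cost decreases."""
    for _ in range(max_tries):
        traj, inputs, total = rollout(b0, refs, u_bar, K_A, K_C, lam, step, cost)
        if total < current_cost:
            return traj, inputs, total, lam
        lam *= shrink                       # otherwise take a smaller step
    return None, u_bar, current_cost, lam   # keep the previous trajectory

# Tiny double-integrator demo with hand-picked gains standing in for the backward pass.
dt, N = 0.1, 20
step = lambda b, u: np.array([b[0] + dt * b[1], b[1] + dt * u[0]])
cost = lambda b, r, u: (b - r) @ (b - r) + 0.1 * u @ u
refs = [np.zeros(2)] * (N + 1)
u_bar = [np.zeros(1)] * N
K_A = [np.array([[-2.0, -1.5]])] * N
K_C = [np.zeros(1)] * N
traj, inputs, total, lam = forward_pass(np.array([1.0, 0.0]), refs, u_bar, K_A, K_C,
                                        step, cost, current_cost=np.inf)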

5.3. Constraints Handling by CILQG

In the backward pass, CILQG transforms the constraints into quadratic cost terms using barrier functions. The input constraints of the ith UAV at $t_k$ are $P_k(u^v_{i,k})$ and $L_k(u^v_{i,k})$:

$P_k(u^v_{i,k}) = u^v_{i,k} - u_{\max} \le 0, \quad L_k(u^v_{i,k}) = u_{\min} - u^v_{i,k} \le 0$  (42)

The collision avoidance constraint in Equation (24) is nonlinear, and we first linearize it as follows:

$F_k(b^v_{i,k} + \delta b^v_{i,k}) \approx (\delta b^v_{i,k})^T\, \nabla F_k(b^v_{i,k}) + F_k(b^v_{i,k})$  (43)

where $\nabla F_k(b^v_{i,k})$ is the first-order derivative of $F_k$ with respect to $b^v_{i,k}$. Taking $F_k$ as an example, the barrier function of $F_k$ is as follows:

$B_k(b^v_{i,k}) = q_1 \exp\!\big(q_2 F_k(b^v_{i,k})\big)$  (44)

in which $q_1$ and $q_2$ are coefficients that tune the shape of the barrier function. The exponential barrier function in Equation (44) is second-order differentiable and can be expressed in quadratic form as follows:

$B_k(b_k + \delta b_k) \approx \delta b_k^T B_{ss}\, \delta b_k + \delta b_k^T B_s + B_k(b_k)$  (45)

where $B_s$ and $B_{ss}$ are the gradient and Hessian matrix of $B_k(b_k)$:

$B_s = q_1 q_2 \exp\!\big(q_2 F_k(b_k)\big)\, \nabla F_k(b_k)$  (46)

$B_{ss} = q_1 q_2^2 \exp\!\big(q_2 F_k(b_k)\big)\, \nabla F_k(b_k) \big(\nabla F_k(b_k)\big)^T$  (47)
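The barrier terms of Equations (44), (46), and (47) can be sketched for the obstacle-avoidance constraint as follows, with the constraint written as F = r_obs − distance ≤ 0. The coefficients q1, q2 and the obstacle values are illustrative; Obstacle 1 of Table 1 is used in the example.

import numpy as np

Q1, Q2 = 1.0, 2.0   # assumed barrier coefficients q1, q2

def obstacle_constraint(p_xy, obs_xy, r_obs):
    """F_k <= 0 when the UAV's horizontal position is outside the obstacle radius."""
    d = p_xy - obs_xy
    dist = np.linalg.norm(d)
    F = r_obs - dist
    gradF = -d / dist                 # dF/dp
    return F, gradF

def barrier_terms(p_xy, obs_xy, r_obs):
    """Barrier value B and its derivatives B_s, B_ss (Equations (44), (46), (47))."""
    F, gradF = obstacle_constraint(p_xy, obs_xy, r_obs)
    B = Q1 * np.exp(Q2 * F)
    B_s = Q1 * Q2 * np.exp(Q2 * F) * gradF
    B_ss = Q1 * Q2**2 * np.exp(Q2 * F) * np.outer(gradF, gradF)
    return B, B_s, B_ss

# The barrier grows sharply as a UAV approaches Obstacle 1 of Table 1 (radius 400 m).
obs, r = np.array([3600.0, 800.0]), 400.0
for p in [np.array([3000.0, 200.0]), np.array([3300.0, 500.0]), np.array([3200.0, 750.0])]:
    B, _, _ = barrier_terms(p, obs, r)
    print(round(float(B), 6))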
The quadratic cost of collision avoidance in Equation (45) and the cost of the input constraints are added to the total cost, and the corresponding gradients and Hessians are used in the backward pass. The pseudo-code of the method is shown in Algorithm 1.
Algorithm 1 CILQG for trajectory tracking
Input: $b^v_{i,0}$, $b^t_0$, $U_{\upsilon,0}$, $\lambda$
Output: $b^v_{\upsilon,k+1}$
1: $b^v_{i,k} \leftarrow b^v_{i,0}$, $b^t_k \leftarrow b^t_0$
2: for k = 0 to Termination
3:          $z^t_{\upsilon,k+1} \leftarrow h(s^v_{\upsilon,k+1}, s^t_k)$
4:          $b^t_{k+1} \leftarrow$ BeliefDynamics($z^t_{\upsilon,k+1}$, $b^t_k$) via Equations (10) and (11)
5:          for i = 1 to 3 (i indexes the UAVs)
6:                   $s^t_{k+1} = s^t_k + f(s^t_k)(u^t_k + m_k)\Delta T$
7:                   Calculate the reference trajectory according to Equations (19)–(21)
8:                   $\{b^v_{i,k}, U_k\} \leftarrow$ ForwardPass(0) according to Equations (40) and (41)
9:                   cost $\leftarrow$ ComputeCost($\bar s^v_{i,k}$, $b^v_{i,k}$, $U_k$) using Equation (22)
10:                  for ii = 1 to iteration (ii indexes the CILQG iterations)
11:                           $\{K_A, K_C, \text{cost\_redu}\} \leftarrow$ BackwardPass using Equations (30)–(39)
12:                           if cost_redu < expected_cost_redu (computed from the gradient and Hessian of the cost)
13:                                    break
14:                           while (flag = 0)
15:                                    $\{b^v_{i,k+1}, U_k\} \leftarrow$ ForwardPass($\lambda$)
16:                                    cost $\leftarrow$ ComputeCost($\bar b^v_{i,k}$, $b^v_{i,k+1}$, $U_k$)
17:                                    flag = (cost − currentcost)/cost_redu > threshold
18:                                    if flag = 1
19:                                             cost $\leftarrow$ currentcost, input $\leftarrow$ $U_k$, state $\leftarrow$ $b^v_{i,k+1}$
20:                                    else
21:                                             $\lambda \leftarrow \lambda \cdot$ learnspeed
22:                 currentcost $\leftarrow$ cost, $U_{i,k} \leftarrow$ input
23: Update $b^v_{i,k+1}$

6. Simulation

6.1. Simulation Setup and ROS Scenario

In the simulation, the speed and heading angle of the ground maneuvering target are provided in Figure 6. The maximum communication radius is 100 m, and the maximum detection angle (HFOV) is π/3 rad. The parameters of the UAV swarm are shown in Table 1. During the 500 s flight, the three UAVs maintain their altitudes at 300 m, 301 m, and 302 m, respectively. The simulation step is 0.1 s.
The comparison is made in two respects. First, we analyze the impact of the optimized configuration by examining the estimation errors across different values of α and R. Second, we compare CILQG against the Interior Point Optimizer (IPOPT) in terms of the reference trajectory tracking error, the calculation time, and the position estimation error of the target. IPOPT transforms the trajectory tracking problem into a nonlinear programming (NLP) problem and solves it using the interior point method [36]. IPOPT requires fewer iterations to solve the problems than the Sequential Nonlinear Programming Technique (SNOPT), the Modular In-core Nonlinear Optimization System (MINOS), and Artelys Knitro (KNITRO) [37]. Moreover, IPOPT can reach higher accuracy and a better convergence rate than Finite Minimization with Constraints (FMINCON), nonlinear optimization using the augmented Lagrange method (SOLNP), and SNOPT [38]. The simulation environment is established in both Python and the Robot Operating System (ROS), and the CPU is an AMD Ryzen 9 7845HX at 3.00 GHz. The target tracking task is executed in an urban environment simulated in ROS, featuring a radio tower as an obstacle, as shown in Figure 7. The top view confirms the UAV swarm's ability to maintain the desired configuration throughout the tracking process.

6.2. Comparison of Different Configurations

The reference trajectories to be tracked are calculated based on the configurations and the target belief. To assess the impact of configuration optimization, we vary α1,k, α2,k (angles of the first and second UAVs relative to the target, assumed equal due to symmetry), and Ri,k, resulting in four distinct cases outlined in Table 2 and visualized in Figure 8. This analysis highlights the benefits of the optimal configuration for minimizing the estimation errors.
The angle of the UAVs relative to the target in Case 1 is optimal under the detection angle constraint and communication radius constraint. In Case 2, the UAVs are close to each other. In Case 3, two UAVs are collinear with the target, causing the target to be outside the HFOV of UAV1 and UAV2. Moreover, the excessive distance between UAV1 and UAV2 (200 m) in Case 3 violates the maximum communication radius, as illustrated in Figure 8. Analyzing the target position estimation error in Figure 9, we observe a significant peak for Case 3 (yellow line) at 270 s. This spike coincides with the target’s acceleration and turning maneuver, highlighting the configuration’s inability to cope with such target dynamics. Conversely, the other three cases exhibit significantly lower estimation errors. Case 4, characterized by a more evenly distributed configuration and closer proximity to the target, consistently achieves the lowest error, while Case 2 shows the highest error among the successful configurations. These results underscore the importance of a well-distributed configuration that satisfies both communication and visibility constraints while maintaining proximity to the target for accurate state estimation.
In the following simulations, α1,k, α2,k, and α3,k are set to −π/6 rad, π/6 rad, and 0 rad, respectively, and Ri,k = 100 m.

6.3. Comparison of IPOPT and CILQG

This section compares the performance of IPOPT and CILQG in tracking the reference trajectory while simultaneously estimating the state of a maneuvering target.
The time complexity of IPOPT and CILQG in each iteration is O(n3) and O(n2), respectively, where n represents the dimension of the state vector, so CILQG is asymptotically faster than IPOPT. Quantitatively, Figure 10 presents the average calculation time of the three UAVs: on average, CILQG takes 0.0097 s, whereas IPOPT consumes 0.71 s. This demonstrates that the computing speed of CILQG is much higher than that of IPOPT.
Figure 6 illustrates a challenging tracking scenario where the target undergoes turns and changes in speed. Notably, the target executes a turn coupled with acceleration between 250 s and 350 s, followed by another turn with deceleration from 350 s to 450 s. Figure 11 provides the side and top views of the resulting trajectories for both CILQG and IPOPT. The actual trajectory of the target is shown with a black line, and the estimated trajectories of CILQG and IPOPT are shown with a magenta dotted line and a yellow dotted line, respectively. While CILQG demonstrates robust tracking and maintains the desired formation even during the target’s maneuvers, IPOPT struggles to maintain the reference configuration. As evident in the enlarged view of Figure 11b, the UAVs controlled by IPOPT lose track of the target within their HFOV once the target initiates its maneuver, leading to the disappearance of the estimated trajectory (yellow dotted line).
This difference in performance is further emphasized in Figure 12, which presents the tracking error for both controllers. CILQG consistently maintains a tracking error below 3 m, with spikes only occurring during the target’s turns and speed changes. In stark contrast, IPOPT exhibits oscillatory behavior from the outset, and the tracking error rapidly diverges when the target performs its first turn at 150 s.
The time during which the target is within the HFOV of the UAVs is shown in Figure 13, where the value of 1 indicates the target is in the HFOV. IPOPT’s inability to maintain the desired configuration results in the target being lost from the HFOV, aligning with the increasing tracking error observed in Figure 12. While CILQG generally keeps the target within the HFOV during periods of constant target velocity and orientation, temporary losses occur during the target’s turns and shifts. During these instances, the UAVs rely on trajectory extrapolation due to the lack of measurements, leading to an increased estimation error. As depicted in Figure 14, the estimation error of CILQG is stable under 4 m. IPOPT’s estimation error exhibits significant oscillations after 150 s, coinciding with the loss of the target from the HFOV, highlighting the detrimental impact of poor tracking performance on estimation accuracy.
This disparity in performance stems from the fundamental difference in how each controller handles uncertainty. IPOPT treats target states as deterministic values at each time step, neglecting their inherent uncertainty, whereas CILQG regards the state as belief, characterized by its mean and variance. This allows CILQG to track the entire trajectory profile, effectively minimizing the variance of the belief and achieving robust performance even under uncertainty. Consequently, while IPOPT suffers from diverging tracking errors, potentially violating visibility constraints and losing track of the target, CILQG enables accurate trajectory tracking and timely target motion estimation, even in the presence of random target maneuvers.

6.4. Tracking with Obstacle

In this section, we compare the collision avoidance ability of CILQG and IPOPT. The obstacles are strategically positioned near the trajectory where the target makes turns. This setup requires the UAV to maneuver by altering its speed and orientation to avoid the obstacles while maintaining persistent tracking of the target.
Figure 15 showcases the effectiveness of CILQG in navigating a complex environment with obstacles (represented as blue cylinders in the side view and black hollow circles in the top-down view). CILQG adeptly guides the UAVs to adjust their trajectories and circumvent the three obstacles, deviating from the reference path only when necessary. In contrast, IPOPT falters in this challenging scenario, with UAV trajectories deviating significantly from the target’s path and failing to avoid obstacles. In Figure 15a, the z-axis positions of the UAVs even drop below zero. This indicates that the obstacles can cause the trajectories to deviate further apart, and highlights IPOPT’s vulnerability to environmental complexities and its limitations in handling constraints effectively. This conclusion is further supported by Figure 16, which displays the trajectory tracking error for the UAVs. The tracking errors of CILQG exhibit three peaks at 210 s, 340 s, and 430 s only during obstacle avoidance maneuvers, and the errors are under 3 m when the UAVs are away from the obstacles. IPOPT’s tracking errors exhibit an overall increasing trend and eventual divergence. Furthermore, Figure 17 illustrates that the target estimation error of CILQG remains below 4 m, consistent with its performance in the obstacle-free scenario, whereas the estimation error of IPOPT rises rapidly when one of the UAVs hits the obstacle and the target is lost in the HFOV. Therefore, the obstacles do not significantly affect the estimation performance of the proposed method. However, IPOPT is unable to guide the UAVs to track the target accurately and safely due to the randomness of the target states and UAV states.

7. Discussion

In addressing the challenge of target tracking with obstacles and uncertainty in both the target state and UAV navigation unit, this study sequentially tackled two joint optimization problems. Firstly, an optimal configuration for the UAVs was obtained by solving the AIA problem. This involved an online estimation of the target state, which subsequently informed the generation of reference trajectories for each UAV. Subsequently, a fast and efficient CILQG algorithm was developed to tackle the trajectory tracking problem. Recognizing the inherent uncertainties, we represented both the target and UAV states as belief states, employing the Unscented Transform for accurate belief propagation.
Our simulation results validated the effectiveness of this approach. The optimized configuration significantly reduced target estimation errors. Furthermore, CILQG demonstrated superior performance compared to IPOPT in terms of tracking accuracy, speed, and robustness. While IPOPT struggled to maintain visibility of the target, leading to constraint violations, CILQG ensured accurate estimation of target states even when the target maneuvered randomly. Importantly, CILQG ensured reliable obstacle avoidance throughout the tracking process, highlighting its suitability for complex, dynamic environments.

Author Contributions

Conceptualization, Y.W. and W.Z.; methodology, C.Z.; software, C.Z.; validation, C.Z. and Y.W.; formal analysis, Y.W.; investigation, C.Z.; resources, W.Z.; data curation, C.Z.; writing—original draft preparation, C.Z.; writing—review and editing, Y.W.; visualization, C.Z.; supervision, W.Z.; project administration, Y.W.; funding acquisition, W.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant numbers 62373366 and 92371207; the Science and Technology Innovation Program of Hunan Province, grant number 2021RC3078; and the Provincial Science Fund for Distinguished Young Scholars of Hunan, grant number 2024JJ2064.

Data Availability Statement

The data is contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Hsu, D.; Lee, W.S.; Rong, N. A Point-Based POMDP Planner for Target Tracking. In Proceedings of the 2008 IEEE International Conference on Robotics and Automation, Pasadena, CA, USA, 19–23 May 2008; pp. 2644–2650.
  2. Qi, S.; Yao, P. Persistent Tracking of Maneuvering Target Using IMM Filter and DMPC by Initialization-Guided Game Approach. IEEE Syst. J. 2019, 13, 4442–4453.
  3. Yang, H.; Wang, Y. Formation Optimization and Control for Maneuvering Target Tracking by Mobile Sensing Agents. IEEE Access 2019, 7, 32305–32314.
  4. Chen, J.; Hou, X.; Qin, Z.; Guo, R. A Novel Adaptive Estimator for Maneuvering Target Tracking. In Proceedings of the 2007 International Conference on Mechatronics and Automation, Harbin, China, 5–8 August 2007; pp. 3756–3760.
  5. Ruangwiset, A. Path Generation for Ground Target Tracking of Airplane-Typed UAV. In Proceedings of the 2008 IEEE International Conference on Robotics and Biomimetics, Bangkok, Thailand, 22–25 February 2009; pp. 1354–1358.
  6. Zhou, L.; Leng, S.; Liu, Q.; Wang, Q. Intelligent UAV Swarm Cooperation for Multiple Targets Tracking. IEEE Internet Things J. 2022, 9, 743–754.
  7. Yali, W. Bearings-Only Maneuvering Target Tracking Based on STF and UKF. In Proceedings of the 2008 International Conference on Advanced Computer Theory and Engineering, Phuket, Thailand, 20–22 December 2008; pp. 295–299.
  8. Zhou, H.; Zhao, H.; Huang, H.; Zhao, X. A Cubature-Principle-Assisted IMM-Adaptive UKF Algorithm for Maneuvering Target Tracking Caused by Sensor Faults. Appl. Sci. 2017, 7, 1003.
  9. Liao, S.; Zhu, R.; Wu, N.; Shaikh, T.A.; Sharaf, M.; Mostafa, A.M. Path Planning for Moving Target Tracking by Fixed-Wing UAV. Def. Technol. 2020, 16, 811–824.
  10. Kong, N.J.; Li, C.; Johnson, A.M. Hybrid iLQR Model Predictive Control for Contact Implicit Stabilization on Legged Robots. IEEE Trans. Robot. 2023, 39, 4712–4727.
  11. Lembono, T.S.; Calinon, S. Probabilistic Iterative LQR for Short Time Horizon MPC. arXiv 2021, arXiv:2012.06349.
  12. Indelman, V. Towards Multi-Robot Active Collaborative State Estimation via Belief Space Planning. In Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany, 28 September–2 October 2015; pp. 4620–4626.
  13. Qiu, D.; Zhao, Y.; Baker, C. Latent Belief Space Motion Planning under Cost, Dynamics, and Intent Uncertainty. In Robotics: Science and Systems XVI; Robotics: Science and Systems Foundation, 12–16 July 2020. Available online: https://www.roboticsproceedings.org/rss16/index.html (accessed on 30 April 2024).
  14. Atanasov, N.; Le Ny, J.; Daniilidis, K.; Pappas, G.J. Decentralized Active Information Acquisition: Theory and Application to Multi-Robot SLAM. In Proceedings of the 2015 IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA, 26–30 May 2015; pp. 4775–4782.
  15. Chung, T.H.; Burdick, J.W.; Murray, R.M. A Decentralized Motion Coordination Strategy for Dynamic Target Tracking. In Proceedings of the 2006 IEEE International Conference on Robotics and Automation (ICRA 2006), Orlando, FL, USA, 15–19 May 2006; pp. 2416–2422.
  16. Atanasov, N.; Ny, J.L.; Daniilidis, K.; Pappas, G.J. Information Acquisition with Sensing Robots: Algorithms and Error Bounds. arXiv 2013, arXiv:1309.5390.
  17. Tzes, M.; Kantaros, Y.; Pappas, G.J. Distributed Sampling-Based Planning for Non-Myopic Active Information Gathering. In Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic, 27 September–1 October 2021; pp. 5872–5877.
  18. Schlotfeldt, B.; Thakur, D.; Atanasov, N.; Kumar, V.; Pappas, G.J. Anytime Planning for Decentralized Multirobot Active Information Gathering. IEEE Robot. Autom. Lett. 2018, 3, 1025–1032.
  19. Schlotfeldt, B.; Tzoumas, V.; Pappas, G.J. Resilient Active Information Acquisition With Teams of Robots. IEEE Trans. Robot. 2022, 38, 244–261.
  20. Schlotfeldt, B.; Tzoumas, V.; Thakur, D.; Pappas, G.J. Resilient Active Information Gathering with Mobile Robots. arXiv 2018, arXiv:1803.09730.
  21. Papaioannou, S.; Laoudias, C.; Kolios, P.; Theocharides, T.; Panayiotou, C.G. Joint Estimation and Control for Multi-Target Passive Monitoring with an Autonomous UAV Agent. In Proceedings of the 2023 31st Mediterranean Conference on Control and Automation (MED), Limassol, Cyprus, 26–29 June 2023; pp. 176–181.
  22. Zhao, Y.; Wang, X.; Kong, W.; Shen, L.; Jia, S. Decision-Making of UAV for Tracking Moving Target via Information Geometry. In Proceedings of the 2016 35th Chinese Control Conference (CCC), Chengdu, China, 27–29 July 2016; pp. 5611–5617.
  23. Yang, X.; He, S.; Shin, H.-S.; Tsourdos, A. Trajectory Optimization for Target Localization and Sensor Bias Calibration with Bearing-Only Information. Guid. Navigat. Control 2022, 2, 2250015.
  24. He, R.; Chen, S.; Wu, H.; Liu, Z.; Chen, J. Optimal Maneuver Strategy of Observer for Bearing-Only Tracking in Threat Environment. Int. J. Aerosp. Eng. 2018, 2018, 1–9.
  25. Das, S.; Bhaumik, S. Observer Recommended Maneuver for Bearing Only Tracking of an Underwater Target. IEEE Sens. Lett. 2023, 7, 7006504.
  26. Oshman, Y.; Davidson, P. Optimization of Observer Trajectories for Bearings-Only Target Localization. IEEE Trans. Aerosp. Electron. Syst. 1999, 35, 892–902.
  27. Li, B.; Yang, Z.; Chen, D.; Liang, S.; Ma, H. Maneuvering Target Tracking of UAV Based on MN-DDPG and Transfer Learning. Def. Technol. 2021, 17, 457–466.
  28. You, C. Real Time Motion Planning Using Constrained Iterative Linear Quadratic Regulator for On-Road Self-Driving. arXiv 2022, arXiv:2202.08400.
  29. Lee, A.; Duan, Y.; Patil, S.; Schulman, J.; McCarthy, Z.; Van Den Berg, J.; Goldberg, K.; Abbeel, P. Sigma Hulls for Gaussian Belief Space Planning for Imprecise Articulated Robots amid Obstacles. In Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, Tokyo, Japan, 3–7 November 2013; pp. 5660–5667.
  30. Chen, J.; Shimizu, Y.; Sun, L.; Tomizuka, M.; Zhan, W. Constrained Iterative LQG for Real-Time Chance-Constrained Gaussian Belief Space Planning. arXiv 2021, arXiv:2108.06533.
  31. Guo, Q.; Zeng, C.; Jiang, Z.; Hu, X.; Deng, X. Application of Unscented Kalman Filter in Tracking of Video Moving Target. In Biometric Recognition; Springer International Publishing: Cham, Switzerland, 2019; Volume 11818, pp. 483–492.
  32. Wächter, A.; Biegler, L.T. On the Implementation of an Interior-Point Filter Line-Search Algorithm for Large-Scale Nonlinear Programming. Math. Program. 2006, 106, 25–57.
  33. Poku, M.Y.B.; Biegler, L.T.; Kelly, J.D. Nonlinear Optimization with Many Degrees of Freedom in Process Engineering. Ind. Eng. Chem. Res. 2004, 43, 6803–6812.
  34. Lavezzi, G.; Guye, K.; Cichella, V.; Ciarcià, M. Comparative Analysis of Nonlinear Programming Solvers: Performance Evaluation, Benchmarking, and Multi-UAV Optimal Path Planning. Drones 2023, 7, 487.
  35. Wang, Y.; Zhang, S.; Ge, M.; Zheng, W.; Chen, X.; Zheng, S.; Lu, F. Fast On-Orbit Pulse Phase Estimation of X-Ray Crab Pulsar for XNAV Flight Experiments. IEEE Trans. Aerosp. Electron. Syst. 2023, 59, 3395–3404.
  36. Wang, Y.; Zheng, W.; Zhang, S.; Ge, M.; Li, L.; Jiang, K.; Chen, X.; Zhang, X.; Zheng, S.; Lu, F. Review of X-Ray Pulsar Spacecraft Autonomous Navigation. Chin. J. Aeronaut. 2023, 36, 44–63.
  37. Song, M.; Wang, Y.; Zheng, W.; Li, L.; Wang, Y.; Hu, X.; Wu, Y. Fast Period Estimation of X-Ray Pulsar Signals Using an Improved Fast Folding Algorithm. Chin. J. Aeronaut. 2023, 36, 309–316.
  38. Wang, Y.; Zheng, W.; Ge, M.; Zheng, S.; Zhang, S.; Lu, F. Use of Statistical Linearization for Nonlinear Least-Squares Problems in Pulsar Navigation. J. Guid. Control. Dyn. 2023, 46, 1850–1856.
Figure 1. Scenario of the investigated task.
Figure 2. System control block diagram.
Figure 3. Configuration of the swarm relative to the target.
Figure 4. Receding horizon control.
Figure 5. Trajectory tracking of belief states.
Figure 6. Target speed and heading angle.
Figure 7. ROS simulation scenario. (a) Side view. (b) Top view.
Figure 8. Configurations of the four cases. (a) Case 1; (b) Case 2; (c) Case 3; (d) Case 4.
Figure 9. Two-dimensional position estimation error of the target.
Figure 10. Calculation time.
Figure 11. Trajectories of the target and UAV. (a) Side view. (b) Top view.
Figure 12. Tracking errors of CILQG and IPOPT.
Figure 13. Time of the target in the HFOV of the UAVs.
Figure 14. Two-dimensional position estimation error of the target.
Figure 15. Tracking trajectories with obstacles using CILQG and IPOPT. (a) Side view. (b) Top view.
Figure 16. Trajectory tracking errors of CILQG and IPOPT with obstacles.
Figure 17. Estimation error of CILQG and IPOPT with obstacles.
Table 1. Simulation parameters.
Parameter | Value
Number of the UAVs I | 3
Number of obstacles P | 3
Obstacle 1 | O1 [(3600 m, 800 m), 320 m, 400 m]
Obstacle 2 | O2 [(5800 m, 2810 m), 321 m, 400 m]
Obstacle 3 | O3 [(7850 m, 4900 m), 322 m, 400 m]
UAV initial positions | [−50 m, −86.6 m, 300 m], [50 m, −86.6 m, 301 m], [0 m, −100 m, 302 m]
Target initial position | [0 m, 0 m, 0 m]
Relative angle αi,0 | [−π/6 rad, 0 rad, π/6 rad]
Relative distance Ri,0 | [100 m, 100 m, 100 m]
Initial linear velocity and angular velocity | [20 m/s, 0 rad/s, 0 rad/s]
Initial heading and pitch angle | [0 rad, 0 rad]
UAV angle range | [−π/20 rad, π/20 rad]
Distance measurement noise $n^d_k$ | $\mathcal{N}(0, 0.01)$ m
Angle measurement noise $n^a_k$ | $\mathcal{N}(0, 0.0003)$ rad
Length of horizon | 4
Table 2. Parameters for different configurations.
Case | α1,k (rad) | α2,k (rad) | α3,k (rad) | R1,k, R2,k, R3,k (m)
Case 1 | −π/6 | π/6 | 0 | 100
Case 2 | −π/36 | π/36 | 0 | 100
Case 3 | −π/2 | π/2 | 0 | 100
Case 4 | −π/6 | π/6 | 0 | 50
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
