**Optimization and Communication in UAV Networks**

Editors

**Christelle Caillouet Nathalie Mitton**

MDPI • Basel • Beijing • Wuhan • Barcelona • Belgrade • Manchester • Tokyo • Cluj • Tianjin

*Editors* Christelle Caillouet Université Côte d'Azur France Inria France

Nathalie Mitton

*Editorial Office* MDPI St. Alban-Anlage 66 4052 Basel, Switzerland

This is a reprint of articles from the Special Issue published online in the open access journal *Sensors* (ISSN 1424-8220) (available at: https://www.mdpi.com/journal/sensors/special_issues/UAV_net).

For citation purposes, cite each article independently as indicated on the article page online and as indicated below:

LastName, A.A.; LastName, B.B.; LastName, C.C. Article Title. *Journal Name* **Year**, *Article Number*, Page Range.

**ISBN 978-3-03943-310-0 (Hbk) ISBN 978-3-03943-311-7 (PDF)**

© 2020 by the authors. Articles in this book are Open Access and distributed under the Creative Commons Attribution (CC BY) license, which allows users to download, copy and build upon published articles, as long as the author and publisher are properly credited, which ensures maximum dissemination and a wider impact of our publications.

The book as a whole is distributed by MDPI under the terms and conditions of the Creative Commons license CC BY-NC-ND.



## **About the Editors**

**Christelle Caillouet** has served as Associate Professor at Université Côte d'Azur since September 2011 and is a member of the joint team Coati between the I3S (CNRS, University of Nice-Sophia Antipolis) laboratory and Inria. She completed her M.Sc. in Optimization and Game Theory at University Paris VI in 2006, and her Ph.D. in Computer Science at University of Nice Sophia Antipolis in 2009. Her research interests focus on optimization, linear programming, and algorithmics as applied to telecommunication networks (such as wireless mesh networks and backhaul networks) and biological networks.

**Nathalie Mitton** received her M.Sc. and Ph.D. degrees in Computer Science from INSA Lyon in 2003 and 2006, respectively. She received her Habilitation à diriger des recherches (HDR) in 2011 from Université Lille 1. She has been an Inria Full Researcher since 2006 and has served since 2012 as Scientific Head of the Inria FUN team, which focuses on small computing devices such as electronic tags and sensor networks. Her research interests focus on self-organization from PHY to routing for wireless constrained networks. She has published her research in more than 30 international journals and more than 100 international conferences. She is involved in the setup of the FIT IoT LAB platform (http://fit-equipex.fr/, https://www.iot-lab.info), the H2020 CyberSANE and VESSEDIA projects, and in several program and organization committees such as Infocom 2019 and 2020, PerCom 2019 and 2020, DCOSS 2019, Adhocnow 2015–2019, ICC (since 2015), Globecom (since 2017), Pe-Wasun 2017, and VTC (since 2016). She also supervises numerous Ph.D. students and engineers.

## **Preface to "Optimization and Communication in UAV Networks"**

Unmanned Aerial Vehicles (UAVs) have become increasingly popular in the Internet of Things (IoT), which often deploys many sensors over a relatively wide region. Current trends focus on deploying a single UAV or a swarm of UAVs to map an area, perform surveillance, monitoring or rescue operations, collect data from ground sensors or other communicating devices, provide additional computing services close to data producers, etc. Applications are very diverse and call for different features and requirements. However, UAVs remain low-power, battery-powered devices that must fly and communicate in addition to carrying out their mission. Thanks to wireless communications, they participate in mobile dynamic networks composed of UAVs and ground sensors, and many challenges have to be addressed to make UAVs efficient. Behind every UAV application lies an optimization problem: there is always one criterion or several to optimize, such as flying time, energy consumption, number of UAVs, quantity of data to send or receive, etc.

This book, which reprints a Special Issue of the journal *Sensors*, deals with the marriage of optimization and communication in UAV networks. We hope you will enjoy reading it.

> **Christelle Caillouet, Nathalie Mitton** *Editors*

### *Editorial* **Optimization and Communication in UAV Networks**

**Christelle Caillouet <sup>1,2,\*,†</sup> and Nathalie Mitton <sup>1,\*,†</sup>**


Received: 2 September 2020; Accepted: 3 September 2020; Published: 4 September 2020

**Abstract:** Unmanned Aerial Vehicles (UAVs) have become increasingly popular in the Internet of Things (IoT), which often deploys many sensors over a relatively wide region. Current trends focus on deploying a single UAV or a swarm of UAVs to map an area, perform surveillance, monitoring or rescue operations, collect data from ground sensors or other communicating devices, provide additional computing services close to data producers, etc. Applications are very diverse and call for different features and requirements. However, UAVs remain low-power, battery-powered devices that must fly and communicate in addition to carrying out their mission. Thanks to wireless communications, they participate in mobile dynamic networks composed of UAVs and ground sensors, and many challenges have to be addressed to make UAVs efficient. Behind every UAV application lies an optimization problem: there is always one criterion or several to optimize, such as flying time, energy consumption, number of UAVs, quantity of data to send or receive, etc.

**Keywords:** UAV; drones; wireless; self-organization; optimization; swarm; communication; algorithms

#### **1. Introduction**

With new technological advances, UAVs are becoming a reality and are attracting more and more attention. UAVs, or drones, are flying devices that can be remotely controlled or, more recently, completely autonomous. They can be used alone or as a fleet in a large set of applications, from rescue operations to event coverage to servicing other networks, such as sensor networks, for replacement, recharging, or data offloading. They are hardware-constrained since they cannot be too heavy and rely on batteries. Depending on their use (alone or in a swarm) and the targeted application, they must operate differently and meet different requirements (energy preservation, delay in covering an area, coverage, limited number of devices, etc.) with limited resources (energy, speed, etc.). Their use still raises a large set of exciting new challenges: trajectory optimization and positioning when they are used alone or in cooperation, and coordination and communication when they operate in a swarm, to name just a few. This Special Issue called for original submissions dealing with optimization or communication aspects of UAVs or UAV swarms. Among the numerous submissions, only twelve were selected after a rigorous selection process. The main themes that arise from them are: (i) ground data collection from the air, (ii) control of UAV swarms, (iii) UAV-based Mobile Edge Computing, and (iv) application-driven UAV-based measurements.

In the following, we summarize the contributions of the papers published in this Special Issue for each category, and then conclude by outlining future challenges and open issues.

#### **2. Ground Data Collection from the Air and Path Planning**

It is becoming more and more common to have data sensed by ground wireless sensors collected by UAVs, in order to alleviate wireless peer-to-peer communications between ground sensors and reduce their energy consumption. However, such a paradigm raises a set of new challenges, such as how to prioritize the sensors to visit and how to minimize the time needed to collect all data by visiting all devices. This is an exciting optimization problem. Works [1–4] address it from different perspectives: they consider different criteria to plan the trajectory of the UAV and optimize different functions.

Reference [1] proposes to visit the nodes in a given order and for a variable time that depends on a node priority, while in [2], the authors aim to maximize the data collection utility by jointly optimizing the communication scheduling and the trajectory of each UAV. The data collection utility is determined by the amount and value of the collected data, and a novel trajectory planning algorithm is designed to maximize it. Reference [3] focuses on minimizing the mission completion time (flying time and hovering time) for a multi-UAV system in a monitoring scenario while ensuring that the information of each sensor is collected. As for [4], the authors aim to improve the secrecy performance of cellular-enabled UAV communication networks through an aerial cooperative jamming scheme.

#### **3. Control of UAV Swarm**

When more than one drone is required, a swarm of UAVs can be deployed. Although a swarm brings better performance in terms of coverage and connectivity, new optimization challenges arise because such swarms are difficult to control and scale, whether in a distributed or a centralized way. References [5–8] tackle these numerous challenges, which range from connectivity maintenance to swarm control.

Reference [5] studies the different factors that may impact the accuracy and efficiency of an unmanned aerial vehicle (UAV) swarm coordination. The authors propose a mathematical data model to demonstrate the fundamental properties of antenna arrays and study the performance of the data collection system framework. Numerical examples and practical measurements are provided to demonstrate the feasibility of the proposed data collection system framework using an iterative-MUSIC algorithm and benchmark theoretical expectations.

Reference [6] deals with multi-UAV systems in which the UAV autonomy is much shorter than the time needed to complete the mission. The authors thus introduce a UAV replacement procedure as a way to guarantee ground users' connectivity over time. They formulate the practical UAV replacement problem in moderately large multi-UAV swarms and prove it to be an NP-hard problem in which an optimal solution has exponential complexity. Reference [7] focuses on maintaining the formation of a swarm with a time-varying shape and proposes a virtual leader approach, while [8] investigates a stochastic model of the UAV swarm system with multiplicative noises.

#### **4. UAV Enabled Mobile Edge Computing**

The potential offered by the abundance of sensors, actuators, and communications in the Internet of Things (IoT) era is hindered by the limited computational capacity of local nodes. However, the latter do not necessarily always have the capacity to offload data to an edge server. In such a case, mobile edge servers can go to them thanks to the deployment of UAV-assisted Multi-access Edge Computing systems, which raises new challenging optimization and networking issues as addressed in [9,10].

Reference [9] proposes an Unmanned Aerial Vehicle (UAV)-assisted Multi-access Edge Computing (MEC) system based on a usage-based pricing policy that allows the servers' computing resources to be exploited, while the authors of [10] introduce the DRUID-NET perspective, which aims to adapt to a rapidly varying demand by combining tools from Automata and Graph Theory, Machine Learning, Modern Control Theory, and Network Theory.

#### **5. Application-Driven UAV Based Measurements**

In some cases, the application that calls for a UAV deployment comes with very specific constraints and requirements and needs a specific optimization model. References [11,12] give two such examples, dedicated to three-dimensional measurements and surveillance, respectively.

For instance, Reference [11] provides a comparative analysis of the precision of ground geodetic data versus three-dimensional measurements from unmanned aerial vehicles (UAVs), while establishing the impact of herbaceous vegetation on the UAV 3D model. A constraint to take into account in this application is that herbaceous vegetation can impede the determination of the anthropogenic roughness of the surface and degrade the identification of minor surfaces.

Reference [12] focuses on UAV cooperative surveillance networks and introduces the use of complex field network coding (CFNC) for this application. According to whether there is a direct communication link between any source drone and the destination, the information transfer mechanism at the downlink is set to one of two modes, either mixed or relay transmission, and two corresponding irregular topology structures for CFNC-based networks are proposed. Theoretical analysis and simulation results with an additive white Gaussian noise (AWGN) channel show that the CFNC obtains a throughput as high as 1/2 symbol per source per channel use. Results show that CFNC applied to the proposed irregular structures under the two transmission modes can achieve better reliability due to full diversity gain as compared to that based on the regular structure.

#### **6. Conclusions**

As can be seen, the challenges in UAV networks are huge, numerous and heterogeneous. They concern different aspects of drone deployment, from trajectory planning to connectivity maintenance to energy management. Many of them have been addressed with optimization tools, but many open issues and research directions remain. The contributions presented in this Special Issue are only a first step that paves the way towards even more exciting investigations.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Priority-Based Data Collection for UAV-Aided Mobile Sensor Network**

#### **Xiaoyan Ma <sup>1</sup>, Tianyi Liu <sup>2,\*</sup>, Song Liu <sup>1</sup>, Rahim Kacimi <sup>3</sup> and Riadh Dhaou <sup>4</sup>**


Received: 31 March 2020; Accepted: 25 May 2020; Published: 27 May 2020

**Abstract:** In this work, we study data collection in multiple unmanned aerial vehicle (UAV)-aided mobile wireless sensor networks (WSNs). The network topology changes due to the mobility of the UAVs and the sensor nodes, so the design of efficient data collection protocols is a major concern. We address such highly dynamic networks and propose two mechanisms, the prioritized-based contact-duration frame selection mechanism (PCdFS) and the prioritized-based multiple contact-duration frame selection mechanism (PMCdFS), to build collision-free scheduling and to balance the nodes between the multiple UAVs, respectively. Based on these two mechanisms, we propose a Balance algorithm to conduct collision-free communication between the mobile nodes and the multiple UAVs. Two key design ideas of the Balance algorithm are: (a) nodes that have a lower transmission rate to the UAV do not need a higher priority, and (b) nodes that have a shorter contact duration with the UAVs should get more communication opportunities. We demonstrate the performance of the proposed algorithms through extensive simulations and real experiments. These experiments, using 15 mobile nodes on a path with 10 intersections and 1 island, show that network fairness is efficiently enhanced. We also confirm the applicability of the proposed algorithms in a challenging and realistic scenario through numerous experiments on a path on the Tongji campus in Shanghai, China.

**Keywords:** wireless sensor networks; multiple unmanned aerial vehicles; mobile nodes; data collection; collision-free

#### **1. Introduction**

Unmanned Aerial Vehicle-aided wireless sensor networks (UAV-aided WSNs) have attracted growing interest due to their many applications in monitoring, surveillance, and exploration in healthcare, agriculture, industry, and the military [1–5]. Among the applications of UAVs, one of the key functions is data collection [6–11]. These works focus on deterministic topologies in which the nodes are deployed statically and the locations of the sensors are known. Data collection in dynamic topologies is seldom covered, although such topologies are common in applications such as maritime detection, traffic surveillance, and wilderness rescue, where the targets are moving and no static sensors are deployed in advance.

The main difference between a static network and a mobile network is that the transmission opportunities of the nodes within the coverage of the UAV differ. In the static case, all covered nodes are static, so the relative velocity ($v_r$) between each node and the UAV is the same. Thus, the contact duration (CD) between a node and the UAV depends only on the relative distance ($d_r$) between them ($CD = d_r / v_r$, see [12,13] for more details). The relative distances hardly differ if the UAV flies at a higher altitude. In the mobile case, however, when the nodes move at different velocities, the CDs differ greatly even when the relative distance is the same. Intuitively, the shorter the CD, the fewer the opportunities for the mobile node to communicate with the UAV. When the CD is very short, the mobile node may have no opportunity to communicate with the UAV at all if its CD is not taken into account. Thus, a contact-duration-based data collection algorithm should be designed for this context, despite the large array of existing data collection algorithms for UAV-aided static WSNs (see Section 2, Related Works).
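The relation $CD = d_r / v_r$ can be illustrated with a minimal sketch. The function name, the units, and the numeric values below are illustrative assumptions and are not taken from the paper; the sketch only shows that two nodes at the same relative distance can have very different contact durations.

```python
import math


def contact_duration(relative_distance_m: float, relative_speed_mps: float) -> float:
    """Contact duration CD = d_r / v_r, in seconds.

    Both arguments are magnitudes (assumed units: metres and metres/second).
    A relative speed of zero means the node never leaves coverage.
    """
    if relative_distance_m < 0 or relative_speed_mps < 0:
        raise ValueError("distance and speed must be non-negative magnitudes")
    if relative_speed_mps == 0:
        return math.inf
    return relative_distance_m / relative_speed_mps


# Hypothetical values: same relative distance, different relative speeds.
d_r = 200.0
cd_slow = contact_duration(d_r, 5.0)   # node with low speed relative to the UAV
cd_fast = contact_duration(d_r, 20.0)  # node with high speed relative to the UAV
print(cd_slow, cd_fast)
```

With these assumed numbers, the second node has a quarter of the contact duration of the first, which is the situation a contact-duration-aware scheduler would need to handle.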

Two factors determine the CD between a mobile node and the UAV: (a) the relative distance between the sensor and the UAV, and (b) the relative velocity between them. Priority-based Frame Selection (PFS) [14,15] is a one-hop mechanism based on the relative distance, according to which the nodes are divided into different priority groups. Communications are conducted from higher to lower priorities. A multi-hop highest-velocity opportunistic algorithm based on the relative velocity between the mobile nodes and the UAV is proposed in [16]. The nodes that have a higher velocity have a longer CD with the UAV and are therefore selected as forwarding nodes. In our previous work [12,13], we studied data collection maximization in a single-UAV-enabled mobile WSN where the pre-defined path is straight, without comparison with existing works and without real experiments. Curved paths and multiple UAVs were not covered in that work either. Thus, there is still considerable room for enhancing network performance.

In this work, we focus on a multi-UAV aided mobile WSN (Figure 1) in which the nodes are mounted on bicycles and move along a pre-defined curved path. Because the nodes move along a path, two UAVs are enough to cover all mobile nodes when, as in Figure 1, $UAV_1$ takes off from the starting point of the path and $UAV_2$ takes off from its end point, each flying along the path. End-to-end data collection in such a context is a very complex problem that involves two links: the access link between the sensors and the UAVs, and the backhaul link between the UAVs and the gateways. In this paper, we focus on the access link, because the literature on this kind of link [6,7] still does not propose efficient solutions. The access link suffers from synchronization problems caused by the highly dynamic network and by the coordination required between the mobile nodes and the multiple UAVs. Giving a communication opportunity to the nodes that have a very short contact duration with the UAVs reduces the congestion risk. The second link, the backhaul link between the UAVs and the gateways, is covered by extensive literature [17]. It is also challenging on several levels, such as data security, the security of the UAVs, and the dimensioning of the backhaul. In our previous work [18,19], we focused on the backhaul link with a satellite system; the algorithms proposed there for mobile mules are applicable to UAV-aided sensor networks. Moreover, the collected data (taking into account the value of the data and distinguishing the data collected from each sensor) can be stored on SD cards embedded in the UAV, which is a further reason to focus on the access link in this work. The data collection optimization objectives in this context are twofold: (i) maximizing the number of collected packets, and (ii) maximizing the number of nodes that successfully send at least one packet during the collection period.

Our main purpose is to jointly maximize these two objectives by formulating the dynamic parameters. Our main contributions are summarized as follows:


one-hop and slotted mechanism which is used to allocate the time slots for the nodes that are covered by only one of the UAVs.


**Figure 1.** An illustration of the unmanned aerial vehicle (UAV)-aided data collection for a mobile wireless sensor network. The exemplar trajectory of $UAV_1$ is shown as: Waypoint $P^1_S$ → Waypoint $P^1_1$ → Waypoint $P^1_2$ → Waypoint $P^1_3$ → Waypoint $P^1_4$ → Waypoint $P^1_5$ → Waypoint $P^1_6$ → Waypoint $P^1_E$.

The remainder of this paper is organized as follows. In the next section, we discuss previous related work. Section 3 presents the system model and the problem formulation. Section 4 presents the proposed algorithms. Section 5 evaluates the proposed algorithms through extensive simulations and real experiments. Section 6 concludes this paper and gives some suggestions for future work.

#### **2. Related Works**

There exists an extensive array of research on data collection in UAV-aided WSNs, with objectives ranging from completion time minimization [20], power control [21], and trajectory distance minimization [22] to energy consumption minimization [23,24]. We classify these existing data collection algorithms by two criteria: (i) whether the nodes are static or mobile, and (ii) whether the sensors are deployed along a path or within an area of interest. In (i), algorithms are differentiated by whether the sensors are mobile because the dynamic parameters introduced into the network structure by the movement of the nodes have a much greater impact on system performance. In (ii), algorithms are differentiated by whether the nodes are deployed along a path: when they are [12,13,25,26], UAV trajectory planning has very little impact on network performance.

(i) Data collection algorithms for mobile nodes. Many works study how to collect data from WSNs, and the authors of [4,9,27–29] review them. According to [4,9,27–29], most of these algorithms rely only on a mobile sink or focus only on mobile sensors. In our previous works [12,13,16], we studied how to use a UAV to collect data from mobile nodes under the assumption that both the nodes and the UAV move along a straight path at constant speeds. The case in which both the UAV and the nodes move along a curved path was not considered. Numerous studies have been carried out on statically deployed networks [6,7,11,14,15,20,24,25,30–40].

(ii) Most of the aforementioned data collection algorithms can also be classified according to how the nodes are deployed. The authors of [12,13,16,25,34] studied how to use a UAV to collect data from nodes deployed along a straight path. In [25] in particular, the nodes are deployed on a straight line and the UAV flies over this line to collect data from them. In such a context, the trajectory of the UAV is determined by the path (or line) and has only a slight impact on performance if the path is long enough. For instance, in [25], the authors aim to minimize the flight time by jointly optimizing the transmit power of the nodes, the UAV speed, and the transmission intervals. When nodes are deployed within an area of interest, one of the main issues is to plan the UAV's trajectory so as to enhance network performance. Numerous studies address UAV trajectory planning [6,7,20,24,30–40]. These works differ in optimization method and objective function because they target different scenarios. They fall mainly into two types: single-UAV trajectory planning [6,7,24,30–34] and multi-UAV trajectory planning [20,35–40].

The first type is single-UAV trajectory planning. The authors of [33] use a UAV in a mobile edge computing system. They minimize the maximum delay of all ground users by jointly optimizing the offloading ratio, the users' scheduling variables, and the UAV's trajectory. In [24], the authors aim to minimize the maximum energy consumption by optimizing the trajectory of a rotary-wing UAV. In [6], the authors use a UAV to collect data from IoT devices, each of which has a limited buffer size and a target data upload deadline. In that study, the data must be transmitted before it loses its meaning or becomes irrelevant. To maximize the number of served IoT devices, they jointly optimize the radio resource allocation and the UAV's trajectory.

The second type is multi-UAV trajectory planning. In [38], multiple UAVs are used as mobile base stations to serve ground users, and the authors aim to maximize the minimum throughput of the ground users by optimizing the trajectory of each UAV. The authors of [20] employ multiple UAVs to collect data from nodes. By jointly optimizing the trajectories of the UAVs and the wake-up association and scheduling of the sensors, they minimize the maximum mission completion time of all UAVs. In [37], the authors study a multicasting network in which the UAV sends files to all ground users. They aim to minimize the mission completion time of the UAVs by designing the UAV's trajectory, while guaranteeing that each ground user can successfully recover the file. For urban applications, a risk-aware trajectory planning algorithm for multiple UAVs is proposed in [36]. Under the same test scenarios, the authors of [39] aim to minimize the mission time by planning the trajectory of each UAV. In [40], the authors use nested Markov chains to analyze the probability of successful data transmission. They propose a sense-and-send mechanism for real-time sensing missions and a multi-UAV-enabled Q-learning algorithm for decentralized UAV trajectory planning.

In other cases, the authors of [11] use a single UAV to collect data in harsh terrain. Due to the large scale of the detection area, the network has a high demand for power, so they adopt a recharging mechanism to extend the lifetime of the UAV and thus the collection period. The PFS mechanism in [14,15] is based on the nodes' positions and targets data collection in single-UAV aided static sensor networks. The nodes are divided into priority groups in two steps. (i) The nodes are split into an increasing group and a decreasing group (Figure 2); nodes in the decreasing group are given higher priority than those in the increasing group. (ii) Within each group from (i), the nodes are divided into sub-groups according to the power level they belong to. The sets of nodes within "power level 1" in the increasing group and in the decreasing group are denoted by $\mathbb{S}^1_{a,I}$ and $\mathbb{S}^1_{a,D}$, respectively, and the priority values of the nodes within them are denoted by $P^1_{a,I}$ and $P^1_{a,D}$, respectively. The authors give high priority to the nodes within a high power level (Figure 2) and apply opposite actions to the two groups: (a) in the increasing group, nodes within a higher power level are given a higher priority value; (b) in the decreasing group, nodes within a lower power level are given high priority. After these actions, almost all nodes with the best channel conditions have been considered.
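One possible reading of the two-step PFS ordering described above can be sketched as a sort key. This is only an illustration of the textual description, not the implementation from [14,15]: the `Node` class, the string group labels, and the convention that a larger integer means a higher power level are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_id: str
    group: str        # "increasing" or "decreasing" (hypothetical labels)
    power_level: int  # assumption: a larger integer means a higher power level


def pfs_sort_key(node: Node) -> tuple:
    """Smaller key = served earlier (higher priority)."""
    if node.group == "decreasing":
        # Step (i): the decreasing group comes first.
        # Step (ii)(b): within it, lower power levels are served first.
        return (0, node.power_level)
    # Step (ii)(a): in the increasing group, higher power levels are served first.
    return (1, -node.power_level)


def pfs_order(nodes):
    """Return the nodes sorted from highest to lowest priority."""
    return sorted(nodes, key=pfs_sort_key)


nodes = [
    Node("a", "increasing", 1),
    Node("b", "increasing", 3),
    Node("c", "decreasing", 2),
    Node("d", "decreasing", 1),
]
print([n.node_id for n in pfs_order(nodes)])
```

Encoding the group as the first tuple element keeps step (i) dominant, and flipping the sign of the power level in one branch expresses the "opposite actions" applied to the two groups.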

Table 1 summarizes the key focus of existing algorithms and the key differences between them and our proposed algorithms. Although a lot of research has been done on data collection, there is still room to enhance network performance by balancing the dynamic parameters on the first link (the access link) in mobile sensor networks.


**Table 1.** Summary of related works.

**Figure 2.** The Priority Frame Selection (PFS) mechanism.

#### **3. System Model and Problem Formulation**

#### *3.1. System Model*

This paper considers a UAV-assisted mobile sensor network consisting of $N$ mobile bicycles, each equipped with a sensor, and $M$ UAVs, each equipped with a sensor (as illustrated in Figure 1, where $M = 2$). $\mathbb{S} = \{S_1, S_2, \cdots, S_N\}$ is the set of mobile sensors. The $N$ nodes move along a pre-defined path of length $L$, with node $S_i$ moving at speed $v_i$. The UAV $U_i$ is dispatched to collect data from the mobile sensors at a given height $h_i$ and speed $v_u^i$ along a predefined trajectory (Figure 1).

The trajectory consists of line segments joining a start waypoint and an end waypoint (e.g., $P^i_S$ and $P^i_E$ in the trajectory of $UAV_i$, $i = 1, 2$, in Figure 1) and $k$ intermediate waypoints (e.g., $P^i_1$, $P^i_2$, $P^i_3$, $P^i_4$, $P^i_5$ and $P^i_6$ in the trajectory of $UAV_i$, $i = 1, 2$, in Figure 1). Let $\mathbb{P}_i = \{P^i_S, P^i_1, P^i_2, \cdots, P^i_k, P^i_E\}$ denote the set of all waypoints of $UAV_i$. The coordinates of each waypoint $P^i_m$ are denoted by $P^i_m(x^i_m, y^i_m, h_i)$. The UAV's flight time between any two waypoints $P^i_m$ and $P^i_n$ is given by

$$\lambda_{m,n}^i = \frac{\left\| P_m^i - P_n^i \right\|}{v_u^i}, \quad P_m^i, P_n^i \in \mathbb{P}_i. \tag{1}$$

The collection period of $UAV_i$ is the duration of the flight from waypoint $P^i_1$ to waypoint $P^i_E$. It is denoted by $T_i$,

$$T_i = \sum_{m=1}^{k-1} \lambda_{m,m+1}^i + \lambda_{k,E}^i. \tag{2}$$

The trajectory length for $UAV_i$ is

$$L_i = \sum_{m=1}^{k-1} \left\| P_{m+1}^i - P_m^i \right\| + \left\| P_E^i - P_k^i \right\|. \tag{3}$$
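Equations (1)–(3) can be checked numerically with a short sketch. The function names and the waypoint coordinates below are illustrative assumptions; the waypoint list is taken to be $[P_1, \cdots, P_k, P_E]$, matching the summation range of Equations (2) and (3), and a single constant UAV speed is assumed.

```python
import math


def flight_time(p_m, p_n, uav_speed):
    """Equation (1): Euclidean distance between two waypoints over UAV speed."""
    return math.dist(p_m, p_n) / uav_speed


def collection_period(waypoints, uav_speed):
    """Equation (2): sum of flight times over consecutive waypoints.

    `waypoints` is assumed to be the ordered list [P_1, ..., P_k, P_E].
    """
    return sum(
        flight_time(a, b, uav_speed) for a, b in zip(waypoints, waypoints[1:])
    )


def trajectory_length(waypoints):
    """Equation (3): sum of segment lengths over consecutive waypoints."""
    return sum(math.dist(a, b) for a, b in zip(waypoints, waypoints[1:]))


# Hypothetical waypoints (x, y, h) in metres at a constant height of 50 m.
waypoints = [(0.0, 0.0, 50.0), (30.0, 40.0, 50.0), (30.0, 100.0, 50.0)]
speed = 10.0  # assumed UAV speed in m/s

L_i = trajectory_length(waypoints)         # 50 m + 60 m
T_i = collection_period(waypoints, speed)  # L_i / speed for constant speed
print(L_i, T_i)
```

Because the speed is constant in this sketch, the collection period equals the trajectory length divided by the speed, which gives a quick consistency check between Equations (2) and (3).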

Generally, for a given path, the waypoints of the different UAVs share the same x- and y-coordinates and differ only in height (z-axis). For instance, if $P^i_m(x^i_m, y^i_m, h_i) \in \mathbb{P}_i$, then the point $(x^i_m, y^i_m, h_j)$ is one of the waypoints of $UAV_j$, i.e., $P^j_m(x^i_m, y^i_m, h_j) \in \mathbb{P}_j$. Thus, we have $L_i = L_j$. Intuitively, the straighter the pre-defined path, the smaller $\Delta L$ ($\Delta L = |L - L_i|$), and the larger the number of waypoints, the smaller $\Delta L$. Major notations used in this work are defined in Table 2.


**Table 2.** Major notations used in this article.

To present the impact of the dynamic parameters on the system clearly, we use homogeneous UAVs (*v<sup>i</sup><sub>u</sub>* = *v*) to reduce the influence of the UAVs' speeds. Accordingly, the collection period is denoted by *T*, with *T* = *T<sub>i</sub>*.

#### *3.2. Discrete Time Mechanism*

Considering the waypoint selection and beacon sending, we introduce a discrete-time mechanism in which the collection period *T* is divided into *N<sub>ts</sub>* time slots, each lasting *α* time units, so that *N<sub>ts</sub>* = ⌊*T*/*α*⌋, where ⌊·⌋ is the floor (rounding-down) function. The time slots are indexed as 1, 2, ··· , *N<sub>ts</sub>*, and 𝕋 = {*t*<sub>1</sub>, *t*<sub>2</sub>, ··· , *t*<sub>*N<sub>ts</sub>*</sub>} (Figure 3). It is worth noting that, in each time slot, a sensor can communicate with only one UAV. For example, in *t<sub>k</sub>*, *S<sub>i</sub>* communicates with *UAV<sub>m</sub>*, and *S<sub>j</sub>* communicates with *UAV<sub>n</sub>* (*i* ≠ *j* and *m* ≠ *n*).
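The slot arithmetic above can be sketched in a few lines of Python. This is only an illustration: the helper names and the slot length of 0.5 s are assumptions, not values taken from this work.

```python
import math

def num_time_slots(T, alpha):
    """N_ts = floor(T / alpha): number of whole slots in the collection period."""
    return math.floor(T / alpha)

def slot_index(t, alpha, n_ts):
    """1-based index of the slot that contains time t, for 0 <= t < n_ts * alpha."""
    k = int(t // alpha) + 1
    if not 1 <= k <= n_ts:
        raise ValueError("time lies outside the slotted collection period")
    return k

# A 5 min collection period with an assumed slot length of 0.5 s.
n_ts = num_time_slots(T=300.0, alpha=0.5)
first_slot = slot_index(0.0, 0.5, n_ts)
third_slot = slot_index(1.2, 0.5, n_ts)
```

Any remainder of the collection period shorter than one slot is discarded by the floor operation, so the last partial slot is never scheduled.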

**Figure 3.** An illustration of studied scenario.

As shown in Figure 1, nodes that are covered by *UAV<sub>i</sub>* and deployed close to one another compete to communicate with *UAV<sub>i</sub>*. For instance, *S<sub>m</sub>*, *S<sub>n</sub>* and *S<sub>k</sub>* in Figure 1 compete to communicate with *UAV*<sub>1</sub>. Meanwhile, more than one UAV may be within the range of a single node. For example, *S<sub>k</sub>* in Figure 1 is within the range of both *UAV*<sub>1</sub> and *UAV*<sub>2</sub>, and it must choose one of them to send packets to. Hence, balancing the communication between nodes and UAVs so as to maximize the data collection is a challenging task.

#### *3.3. Data Collection Protocols Using UAV*

In this paper, we present a distributed method for data collection in UAV-aided mobile sensor networks, as follows. The collection period *T* is divided into *N<sub>ts</sub>* time slots. At the beginning of every time slot (Figure 4), each UAV sends a beacon message to tell the mobile nodes that it is approaching. The beacon includes the UAV's information, e.g., its flight height and speed. Newly arrived nodes reply with a JOIN message, which includes the sensor's information, so that the UAV can update the network topology. From these messages, the UAV determines whether each node is within its range. Then, it calculates the contact duration, the relative distance, and the potential time slots for each node that successfully sent a JOIN message. Using the time slot allocation algorithms proposed in Section 4, the UAV schedules the covered sensors and broadcasts a scheduling message that contains the assignment of the time slots. Having received the scheduling message, every sensor transmits its data in its own time slots.
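The beacon, JOIN and scheduling exchange described above can be sketched as follows. This is a simplified illustration under stated assumptions: the message fields, the spherical communication range and the round-robin slot assignment are all hypothetical placeholders, and the actual assignment rule is the subject of Section 4.

```python
from dataclasses import dataclass
import math

@dataclass
class Beacon:
    uav_id: int
    position: tuple   # (x, y, h) in metres
    speed: float      # m/s

@dataclass
class Join:
    node_id: int
    position: tuple   # (x, y, 0) in metres
    speed: float      # m/s

def covered_nodes(beacon, joins, comm_range):
    """Keep the JOIN senders whose 3-D distance to the UAV is within comm_range."""
    return [j for j in joins if math.dist(beacon.position, j.position) <= comm_range]

def build_schedule(beacon, joins, comm_range, slots):
    """Toy scheduling message mapping slot ID -> node ID.

    Covered nodes are served in round-robin order, which is only a placeholder
    for the priority-based rules introduced later.
    """
    nodes = covered_nodes(beacon, joins, comm_range)
    if not nodes:
        return {}
    return {slot: nodes[i % len(nodes)].node_id for i, slot in enumerate(slots)}

beacon = Beacon(uav_id=1, position=(0.0, 0.0, 15.0), speed=10.0)
joins = [
    Join(1, (10.0, 0.0, 0.0), 2.0),
    Join(2, (500.0, 0.0, 0.0), 3.0),   # too far away to be covered
    Join(3, (0.0, 20.0, 0.0), 1.0),
]
schedule = build_schedule(beacon, joins, comm_range=100.0, slots=[1, 2, 3])
```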

**Figure 4.** The procedure of allocating.

#### 3.3.1. Collecting Packets

Allocating the *Nts* time slots to individual mobile sensors under the proposed mechanism is equivalent to maximizing the usage of time slots. Let

$$N\_{ts,a}(i,j,k) = \begin{cases} 1 & \text{if } UAV\_i \text{ communicates with } S\_j \text{ in } t\_k, \\ 0 & \text{otherwise.} \end{cases}$$

The data collection maximization problem is to maximize the number of collected packets, *Np*,

$$N\_p = \sum\_{i=1}^{M} \sum\_{j=1}^{N} \sum\_{k=1}^{N\_{ts}} N\_{ts,a}(i, j, k) \cdot \sigma\_{ijk} \cdot D\_r \cdot \alpha, \tag{4}$$

where *Dr* is the transmission rate, and

$$\sigma\_{ijk} = \begin{cases} 1 & \text{successful transmission,} \\ 0 & \text{otherwise.} \end{cases}$$

Our objective is to balance the communication between the two UAVs and *N* mobile nodes to maximize the overall data collection utility. Therefore, the optimization problem can be formulated as,

$$\mathcal{P}\_1: \max\_{S\_j \in \mathbb{S},\, t\_k \in \mathbb{T}} \{N\_p\}, \tag{5}$$

$$\text{s.t.} \qquad \sum\_{k=1}^{N\_{ts}} N\_{ts,a}(i, j, k) \le N\_{ts}, \ \forall i, j, \tag{6}$$

$$\sum\_{j=1}^{N} N\_{ts,a}(i, j, k) \le N, \ \forall i, k, \tag{7}$$

$$\sum\_{i=1}^{M} N\_{ts,a}(i, j, k) \le M, \ \forall j, k, \tag{8}$$

$$\sum\_{i=1}^{M} \sum\_{j=1}^{N} \sum\_{k=1}^{N\_{ts}} N\_{ts,a}(i, j, k) \le M \cdot N\_{ts}, \ \forall i, j. \tag{9}$$

Constraints (6)–(8) imply that, in a given time-slot, a UAV chooses only one node to collect data, and one node selects only one UAV to send data. Constraint (9) ensures that, in a given time-slot, no more than two communications happen between UAVs and mobile nodes.
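To make Equation (4) and the one-link-per-slot rule concrete, the following Python sketch evaluates both on a small allocation. It is an illustration only: the nested-list layout `alloc[i][j][k]`, the function names and the numerical values are assumptions. The feasibility check implements the rule as stated in the text (at most one node per UAV and one UAV per node in each slot), which is tighter than the bounds printed in Constraints (7) and (8).

```python
def collected_packets(alloc, sigma, rate, alpha):
    """Eq. (4): sum over UAVs i, nodes j and slots k of alloc * sigma * D_r * alpha."""
    return sum(
        a * s * rate * alpha
        for alloc_i, sigma_i in zip(alloc, sigma)
        for alloc_ij, sigma_ij in zip(alloc_i, sigma_i)
        for a, s in zip(alloc_ij, sigma_ij)
    )

def one_link_per_slot(alloc):
    """True if, in every slot, each UAV serves at most one node and each node
    talks to at most one UAV."""
    M, N, K = len(alloc), len(alloc[0]), len(alloc[0][0])
    uav_ok = all(sum(alloc[i][j][k] for j in range(N)) <= 1
                 for i in range(M) for k in range(K))
    node_ok = all(sum(alloc[i][j][k] for i in range(M)) <= 1
                  for j in range(N) for k in range(K))
    return uav_ok and node_ok

# Two UAVs, two nodes, two slots (hypothetical values).
alloc = [[[1, 0], [0, 1]],    # UAV 1: node 1 in slot 1, node 2 in slot 2
         [[0, 1], [1, 0]]]    # UAV 2: node 1 in slot 2, node 2 in slot 1
sigma = [[[1, 1], [1, 1]],
         [[1, 0], [1, 1]]]    # UAV 2 fails to receive from node 1 in slot 2
N_p = collected_packets(alloc, sigma, rate=4.0, alpha=0.5)
feasible = one_link_per_slot(alloc)
```

Three of the four scheduled transmissions succeed in this example, each contributing *D<sub>r</sub>* · *α* packets.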

#### 3.3.2. The Number of Nodes that Successfully Send Packets to the UAV

During the communication between the UAVs and mobile nodes, a sensor is in one of three transmission states: it has no opportunity to send packets, it has an opportunity to send but fails to transmit, or it successfully sends data to the UAVs. The larger the number of nodes (*N<sub>node</sub>*) that successfully transmit packets, the higher the system performance. Thus, increasing the number of nodes that successfully send data to the UAVs is one of the key points in designing data collection algorithms.

Let the matrix *I* of size *M* × *N* × *N<sub>ts</sub>* be given by,

$$I\_{ijk} = N\_{ts,a}(i,j,k) \cdot \sigma\_{ijk} \cdot j, \qquad UAV\_i \in \mathbb{U},\ S\_j \in \mathbb{S} \text{ and } t\_k \in \mathbb{T}.$$

The elements of matrix *I* are node IDs. Then, we can obtain the number of nodes that successfully transmit at least one packet,

$$N\_{node} \triangleq Hist(I), \tag{10}$$

where "*Hist*" calculates the number of distinct elements in the matrix *I*. The *N<sub>node</sub>* maximization problem can be formulated as,

$$\mathcal{P}\_2: \max\_{S\_j \in \mathbb{S},\, t\_k \in \mathbb{T}} \{N\_{node}\}, \tag{11}$$

$$\text{s.t.} \qquad \sum\_{k=1}^{N\_{ts}} N\_{ts,a}(i, j, k) \le N\_{ts}, \ \forall i, j, \tag{12}$$

$$\sum\_{j=1}^{N} N\_{ts,a}(i, j, k) \le N, \ \forall i, k, \tag{13}$$

$$\sum\_{i=1}^{M} N\_{ts,a}(i, j, k) \le M, \ \forall j, k, \tag{14}$$

$$\sum\_{i=1}^{M} \sum\_{j=1}^{N} \sum\_{k=1}^{N\_{ts}} N\_{ts,a}(i, j, k) \le M \cdot N\_{ts}, \ \forall i, j. \tag{15}$$

When *M* = 1 (single-UAV enabled sensor network), it is a classical NP-hard problem that we have studied in [12,13]. When *M* = 2 (multi-UAV enabled sensor network), the problem is also an NP-hard combinatorial maximization problem [41]: under the given conditions, its objective is to select items, each with its own weight and value, so as to maximize the total value.
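Returning to Equation (10), the role of *Hist* can be illustrated with a short Python sketch that counts the distinct nodes with at least one successful slot. The nested-list layout `alloc[i][j][k]` and the function name are assumptions made for this example; entries without a successful transmission are simply skipped, which is one plausible reading of how zero entries of *I* are treated.

```python
def successful_nodes(alloc, sigma):
    """Number of distinct nodes that transmit successfully at least once."""
    ids = set()
    for alloc_i, sigma_i in zip(alloc, sigma):
        # Node IDs are 1-based here, so a zero entry never looks like a node.
        for j, (alloc_ij, sigma_ij) in enumerate(zip(alloc_i, sigma_i), start=1):
            if any(a and s for a, s in zip(alloc_ij, sigma_ij)):
                ids.add(j)
    return len(ids)

# Two UAVs, three nodes, two slots (hypothetical values).
alloc = [[[1, 0], [0, 1], [0, 0]],
         [[0, 0], [1, 0], [0, 1]]]
sigma = [[[1, 0], [0, 0], [0, 0]],    # node 2 fails with UAV 1
         [[0, 0], [1, 0], [0, 0]]]    # node 2 succeeds with UAV 2, node 3 fails
N_node = successful_nodes(alloc, sigma)
```

A node that reaches both UAVs is counted once, which is what distinguishes *N<sub>node</sub>* from the packet count *N<sub>p</sub>*.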

#### **4. Proposed Algorithms**

In this section, we study how to balance the communication between multiple UAVs and mobile nodes, and we propose a balance mechanism. We distinguish two cases and propose one algorithm for each: PCdFS (Section 4.2) for nodes within the range of only one UAV, and PMCdFS (Section 4.3) for nodes within the range of both UAVs.

#### *4.1. Balance Algorithm between UAVs and Mobile Nodes*

In a given time slot *t<sub>k</sub>* (*t<sub>k</sub>* ∈ 𝕋), there are multiple nodes within the range of the UAVs. The sets of candidate nodes for *UAV*<sub>1</sub> and *UAV*<sub>2</sub> are denoted by 𝕊<sup>1</sup><sub>*kB*</sub> and 𝕊<sup>2</sup><sub>*kB*</sub>, respectively. When 𝕊<sup>1</sup><sub>*kB*</sub> ∩ 𝕊<sup>2</sup><sub>*kB*</sub> = ∅, no node is within the range of *UAV*<sub>1</sub> and *UAV*<sub>2</sub> at the same time. In this case, we apply the PCdFS mechanism (see Section 4.2 for more details) to balance the communications between 𝕊<sup>1</sup><sub>*kB*</sub> and *UAV*<sub>1</sub>, and between 𝕊<sup>2</sup><sub>*kB*</sub> and *UAV*<sub>2</sub>. When 𝕊<sup>1</sup><sub>*kB*</sub> ∩ 𝕊<sup>2</sup><sub>*kB*</sub> ≠ ∅, let 𝕊<sup>1,2</sup><sub>*kB*</sub> ≜ 𝕊<sup>1</sup><sub>*kB*</sub> ∩ 𝕊<sup>2</sup><sub>*kB*</sub>. Then,

$$\mathbb{S}^{1,o}\_{kB} \triangleq \mathbb{S}^{1}\_{kB} - \mathbb{S}^{1,2}\_{kB}, \tag{16}$$

$$\mathbb{S}^{2,o}\_{kB} \triangleq \mathbb{S}^{2}\_{kB} - \mathbb{S}^{1,2}\_{kB}. \tag{17}$$

𝕊<sup>1,*o*</sup><sub>*kB*</sub> and 𝕊<sup>2,*o*</sup><sub>*kB*</sub> denote the sets of sensors within the range of only *UAV*<sub>1</sub> and only *UAV*<sub>2</sub>, respectively. We use the PCdFS mechanism to balance the communications between 𝕊<sup>1,*o*</sup><sub>*kB*</sub> and *UAV*<sub>1</sub>, and between 𝕊<sup>2,*o*</sup><sub>*kB*</sub> and *UAV*<sub>2</sub>. For the nodes within 𝕊<sup>1,2</sup><sub>*kB*</sub>, we propose the PMCdFS algorithm to balance between the |𝕊<sup>1,2</sup><sub>*kB*</sub>| mobile nodes and the multiple UAVs. The Balance algorithm is detailed in Algorithm 1.

**Algorithm 1** Balance Algorithm.

**Input:** Initial deployed information of nodes and UAVs

**Output:** *N<sub>p</sub>*, *N<sub>node</sub>*

1: *N<sub>p</sub>* = *N<sub>node</sub>* = 0, *k* = 1, *T<sub>now</sub>* = 0;

2: **Step 1. Synchronization;**

3: UAV sends *k*-th 'Beacon' message;

4: Network update, obtain 𝕊<sup>1</sup><sub>*kB*</sub> and 𝕊<sup>2</sup><sub>*kB*</sub>;

5: **Step 2. Data Communication;**

6: **while** *T<sub>now</sub>* < *T* **do**

7: Let 𝕊<sup>1,2</sup><sub>*kB*</sub> ≜ 𝕊<sup>1</sup><sub>*kB*</sub> ∩ 𝕊<sup>2</sup><sub>*kB*</sub>, 𝕊<sup>1,*o*</sup><sub>*kB*</sub> ≜ 𝕊<sup>1</sup><sub>*kB*</sub> − 𝕊<sup>1,2</sup><sub>*kB*</sub>, and 𝕊<sup>2,*o*</sup><sub>*kB*</sub> ≜ 𝕊<sup>2</sup><sub>*kB*</sub> − 𝕊<sup>1,2</sup><sub>*kB*</sub>;

8: **if** 𝕊<sup>1,2</sup><sub>*kB*</sub> = ∅ **then**

9: Apply PCdFS mechanism (Algorithm 2) to balance the communication between 𝕊<sup>1,*o*</sup><sub>*kB*</sub> and *UAV*<sub>1</sub>, 𝕊<sup>2,*o*</sup><sub>*kB*</sub> and *UAV*<sub>2</sub> respectively;

10: **else**

11: Apply PMCdFS algorithm (Algorithm 3) for the balancing between mobile nodes in 𝕊<sup>1,2</sup><sub>*kB*</sub> and multi-UAVs, and obtain 𝕊<sup>1</sup><sub>*kB*</sub> and 𝕊<sup>2</sup><sub>*kB*</sub> through PMCdFS algorithm;

12: Apply PCdFS mechanism (Algorithm 2) to balance the communication between 𝕊<sup>1,*o*</sup><sub>*kB*</sub> and *UAV*<sub>1</sub>, 𝕊<sup>2,*o*</sup><sub>*kB*</sub> and *UAV*<sub>2</sub> respectively;

13: **end if**

14: Update *T<sub>now</sub>*, *k*, *N<sub>p</sub>* and *N<sub>node</sub>*;

15: **end while**

16: **return** *N<sub>p</sub>* and *N<sub>node</sub>*;
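The set partition in Equations (16) and (17) and the branch taken inside the loop of Algorithm 1 can be sketched with Python sets. The function names are hypothetical, node IDs are plain integers chosen for illustration, and the returned list of step names only records which mechanisms would be invoked.

```python
def partition_candidates(s1, s2):
    """Split the per-slot candidate sets into shared and exclusive parts."""
    shared = s1 & s2                 # nodes within range of both UAVs
    return shared, s1 - shared, s2 - shared

def balance_step(s1, s2):
    """One iteration of the Balance loop, reduced to its control flow."""
    shared, only1, only2 = partition_candidates(s1, s2)
    steps = []
    if shared:
        steps.append("PMCdFS")       # first resolve the nodes seen by both UAVs
    steps.append("PCdFS")            # then schedule each UAV's own candidates
    return shared, only1, only2, steps

shared, only1, only2, steps = balance_step({1, 2, 3}, {3, 4})
```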

#### *4.2. Priority-Based Contact-Duration Frame Selection Mechanism*

In the PCdFS mechanism, the division into priority areas includes two steps. (i) The nodes are divided into groups according to their power level. For example, the nodes are divided into two groups in Figure 3 and three groups in Figure 5. In Figure 3, *S<sub>i1</sub>* and *S<sub>i2</sub>* are within the same priority area (level 1), while *S<sub>i3</sub>*, *S<sub>i4</sub>* and *S<sub>i5</sub>* are within level 2. If we take more priority levels into account, e.g., three levels as in Figure 5, *S<sub>i1</sub>* and *S<sub>i2</sub>* belong to level 1, *S<sub>i4</sub>* is in level 2, and *S<sub>i3</sub>* and *S<sub>i5</sub>* are in level 3. The more levels, the finer the grouping. (ii) Within each group, the nodes are given different priorities according to their contact duration (CD) with the UAV. Nodes with a short CD are given higher priority values. In PCdFS, different nodes receive different priority values unless several nodes have the same CD with the UAV. In this way, PCdFS favors the nodes that are about to lose their connection with the UAV. In addition, PCdFS lets the nodes within a higher power level send data at the moment when their channel condition is good, so as to reduce packet loss. The PCdFS algorithm is detailed in Algorithm 2.

**Figure 5.** Priority areas.

**Algorithm 2** Prioritized-based contact-duration frame selection mechanism (PCdFS) Algorithm.

**Input:** Initial deployed information of nodes and UAVs, 𝕊<sup>1</sup><sub>*kB*</sub>, 𝕊<sup>2</sup><sub>*kB*</sub>, *N<sub>p</sub>*, *N<sub>node</sub>*.

**Output:** *Np, Nnode*


priority area, *tk* allocated to the one (e.g., *Sik* for *UAV*1, and *Sjl* for *UAV*2) which has the shorter contact duration with the UAV.
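The selection rule described in this subsection (group by priority area, then prefer the shorter contact duration) can be sketched as follows. This is a minimal illustration, not the full Algorithm 2: it assumes that level 1 is the highest-priority area, breaks remaining ties by node ID for determinism, and uses hypothetical names and values.

```python
def pcdfs_select(candidates):
    """Pick the node that receives the current time slot.

    `candidates` maps node ID -> (priority_level, contact_duration_s).
    Assumption: a lower level number means a higher-priority area; within a
    level, the node with the shorter contact duration is preferred.
    """
    if not candidates:
        return None
    return min(candidates, key=lambda n: (candidates[n][0], candidates[n][1], n))

candidates = {
    1: (2, 3.0),   # level 2, about to leave coverage
    2: (1, 8.0),   # level 1, long contact duration
    3: (1, 5.0),   # level 1, shorter contact duration
}
chosen = pcdfs_select(candidates)
```

Node 3 is chosen here because the priority area is compared first, and the contact duration only decides among nodes of the same area.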


#### *4.3. Priority-Based Multiple-Contact-Duration Frame Selection Mechanism*

The PMCdFS algorithm is used to balance the communications between the UAVs and the nodes that are within the range of multiple UAVs at the same time. Intuitively, the longer the CD between a node and a UAV, the higher the node's opportunity to send packets to that UAV. Thus, assigning a node to the UAV with which it has the longer CD increases its transmission opportunity. The PMCdFS algorithm is detailed in Algorithm 3. Through the PMCdFS algorithm, we obtain sensor sets in which all nodes compete to communicate with a single UAV only (*UAV*<sub>1</sub> or *UAV*<sub>2</sub>). Then, we apply the PCdFS algorithm to schedule the communication among them. The proposed algorithms are summarized in Table 3.

**Algorithm 3** Prioritized-based multiple contact-duration frame selection mechanisms (PMCdFS) algorithm.

**Input:** Initial deployed information of nodes and UAVs, 𝕊<sup>1,2</sup><sub>*kB*</sub>, 𝕊<sup>1</sup><sub>*kB*</sub>, 𝕊<sup>2</sup><sub>*kB*</sub>.

**Output:** 𝕊<sup>1</sup><sub>*kB*</sub> and 𝕊<sup>2</sup><sub>*kB*</sub>

```
1: for all Si ∈ S1,2_kB do
2:     Calculate the contact duration between Si and UAV1 (denoted as Ti,1)
       and UAV2 (denoted as Ti,2), respectively;
3:     if Ti,1 < Ti,2 then
4:         S2_kB = S2_kB ∪ {Si};
5:     else
6:         S1_kB = S1_kB ∪ {Si};
7:     end if
8: end for
9: return S1_kB and S2_kB;
```
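Algorithm 3 translates almost directly into Python. The sketch below is an illustration under assumptions: the function name and the dictionaries holding the contact durations are hypothetical, and the sets passed in are taken to contain only the nodes exclusive to each UAV, so that the returned sets are disjoint.

```python
def pmcdfs_assign(shared, cd_uav1, cd_uav2, s1, s2):
    """Assign each shared node to the UAV with the longer contact duration.

    Mirrors Algorithm 3: if T_i1 < T_i2 the node joins UAV 2's set, otherwise
    (including ties) it joins UAV 1's set. New sets are returned and the
    inputs are left unmodified.
    """
    s1, s2 = set(s1), set(s2)
    for node in shared:
        if cd_uav1[node] < cd_uav2[node]:
            s2.add(node)
        else:
            s1.add(node)
    return s1, s2

# Nodes 3 and 5 are within range of both UAVs (hypothetical durations in seconds).
new_s1, new_s2 = pmcdfs_assign(
    shared={3, 5},
    cd_uav1={3: 4.0, 5: 9.0},
    cd_uav2={3: 6.0, 5: 2.0},
    s1={1},
    s2={4},
)
```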



In the following, we will evaluate the proposed algorithms through different configurations, and compare our proposed algorithms with the existing algorithm (PFS).

#### **5. Implementation and Evaluation**

We evaluate the algorithms in both simulations and real experiments, as follows.

#### *5.1. Simulations*

We conduct the simulations in MATLAB/Simulink, where the UAV flies for 5 min along a 10 m wide path. The simulated numbers of priority groups are {2, 3, 4, 5}. The other simulation parameters are presented in Table 4, and the final results are the mean of 30 simulation runs. Because the PFS mechanism was proposed and examined for a single-UAV sensor network, we use *M* = 1 in the simulations in Sections 5.1.1–5.1.4 to compare it with the proposed algorithm. In Section 5.1.5, we compare our proposed algorithms when using a single UAV and multiple UAVs. All the simulations are summarized in Table 5.


**Table 4.** Simulation parameters.


**Table 5.** Summary of simulations.

#### 5.1.1. Impact of Priority Level Changes

Figure 6 presents the impact of varying the number of priority groups. The more priority groups, the smaller the number of collected packets. Considerably more packets are collected with two priority groups than with five. That is because the nodes in lower priority groups may have changed their state by the time it was their turn to send packets. The introduction of contact duration gives such nodes a high priority and thus mitigates part of this issue, so more packets were collected with the PCdFS algorithm.

**Figure 6.** Impact of priority area change. In these simulations, the proposed algorithm is the combination of proposed Balance and prioritized-based contact-duration frame selection mechanism (PCdFS) algorithms.

It can also be concluded that, with a larger number of priority groups, fewer nodes fall within the highest priority group. Then, fewer nodes have opportunities to send packets, which is unfair for the network. The number of nodes that successfully sent at least one packet in the proposed Priority-based Contact-duration Frame Selection mechanism was 16.2 times larger than in the PFS mechanism, because the proposed algorithm takes the dynamic parameters into account.

In the following, in both simulations and real experiments, the number of priority groups is fixed at 2.

#### 5.1.2. Varying Beacon Intervals

Figure 7 shows that both *N<sub>p</sub>* and *N<sub>node</sub>* were much improved when the inter-beacon duration was 2 s. Indeed, the longer the beacon interval, the fewer beacons are sent. Thus, the number of network synchronizations is reduced, so nodes are seldom detected during collection. No node is detected if no beacon is sent.

**Figure 7.** The impact of inter-beacon duration on network performance. In these simulations, the proposed algorithm is the combination of proposed Balance and PCdFS algorithms.

#### 5.1.3. Impact of UAV's Parameters Changes

Figure 8 shows how the total number of collected packets and the number of successful nodes vary with the UAV speed and flight height. The network achieves its best performance (*N<sub>node</sub>* = 46.5, averaged over 30 simulation runs) when the flight height is 15 m (Figure 8a). In this simulation, the UAV speed is 10 m s<sup>−1</sup> and the network size is 200, with node speeds varying from 1 m s<sup>−1</sup> to 10 m s<sup>−1</sup>. Because a fixed *D<sub>r</sub>* is used, the flight height had only a slight impact on both *N<sub>p</sub>* and *N<sub>node</sub>*. The contact duration, which is determined by the relative distance between the nodes and the UAV, was highly affected by the flight height. Hence, the PCdFS algorithm differs from the PFS mechanism when the flight height is 95 m. Compared to *N<sub>p</sub>*, *N<sub>node</sub>* was affected more strongly when the flight height was larger than 75 m. There is a clear difference between the two mechanisms when the gap between flight heights exceeds 50 m.

The UAV speed has a large impact on both the total number of collected packets and the number of nodes that successfully send packets to the UAV (Figure 8b). When the gap between the UAV speed and the maximum speed of all nodes is very small, the network performance is optimal. In the studied scenario, the maximum speed of all nodes is 10 m s<sup>−1</sup>; thus, the performance is optimal when the UAV speed is 10 m s<sup>−1</sup>. When *V<sub>uav</sub>* > 10 m s<sup>−1</sup>, the higher *V<sub>uav</sub>*, the bigger the gap between the UAV speed and the nodes' speeds, the shorter the contact duration between them, and the fewer the opportunities for nodes to communicate with the UAV. As fewer packets are sent to the UAV, the network also becomes less fair.
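The relationship between relative speed, flight height and contact duration can be illustrated with a simple geometric model. The sketch below is not the model used in the simulations; it assumes that the UAV and the node move along the same straight line at constant speeds, that the UAV has a spherical communication range, and that all names and values are hypothetical.

```python
import math

def contact_duration(v_uav, v_node, height, comm_range, offset=0.0):
    """Time (s) until a node leaves the UAV's ground footprint.

    Assumptions: 1-D motion along the path, constant speeds (m/s), a spherical
    range `comm_range` (m) around the UAV at `height` (m), and `offset` (m)
    being the node position minus the UAV position along the path at t = 0.
    """
    r_sq = comm_range ** 2 - height ** 2
    if r_sq <= 0:
        return 0.0                      # the footprint does not reach the ground
    r = math.sqrt(r_sq)                 # footprint radius on the ground
    if abs(offset) > r:
        return 0.0                      # node starts outside the footprint
    dv = v_node - v_uav                 # relative speed along the path
    if dv == 0:
        return math.inf                 # equal speeds: the node never leaves
    return (r - offset) / dv if dv > 0 else (r + offset) / -dv

slow_uav = contact_duration(v_uav=10, v_node=5, height=15, comm_range=25)
fast_uav = contact_duration(v_uav=20, v_node=5, height=15, comm_range=25)
```

Under these assumptions, doubling the UAV speed from 10 m s<sup>−1</sup> to 20 m s<sup>−1</sup> triples the relative speed with respect to a 5 m s<sup>−1</sup> node and shortens the contact duration accordingly, which matches the qualitative trend described above.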

**Figure 8.** Network performance for varying UAV parameters: flight height and speed. In these simulations, the proposed algorithm is the combination of the proposed Balance and PCdFS algorithms. (**a**) The number of collected packets and the number of nodes that successfully send packets to the UAV for varying UAV flight height. (**b**) The number of collected packets and the number of nodes that successfully send packets to the UAV for varying UAV speed.

#### 5.1.4. Scalability

Figure 9 shows the impact of the network size on system performance. In this study, the flight height is 15 m, the UAV's speed is 10 m s<sup>−1</sup>, and the network size varies from 5 to 200, with node speeds varying from 1 m s<sup>−1</sup> to 10 m s<sup>−1</sup>.

The larger the network size, the more nodes have the opportunity to communicate with the UAV and, thus, the more packets are sent to the UAV. When the size is larger than 30, every time slot carries a successful communication, so the number of collected packets in the PFS mechanism remains steady. In the PCdFS algorithm, it keeps increasing until it reaches the transmission upper bound of the collection time. *N<sub>node</sub>* increased steadily in the proposed algorithm: at *N* = 200 it is 11.34 times larger than at *N* = 5, while it is almost unchanged in the PFS mechanism. Hence, the proposed algorithm shows high scalability in terms of sensor density.

**Figure 9.** Evaluation of proposed algorithm (the combination of proposed Balance and PCdFS algorithms) on network size.

#### 5.1.5. Comparison between Multi-UAVs and Single-UAV

Figure 10 compares the proposed algorithms with a single UAV and with multiple UAVs for varying network size. "Alg1/*UAV*<sub>1</sub>" simulates the combination of the proposed Balance and PCdFS algorithms on *UAV*<sub>1</sub>, which takes off from the origin of the path, while "Alg1/*UAV*<sub>2</sub>" simulates the same combination on *UAV*<sub>2</sub>, which takes off from the endpoint (the midline of the path) of the path. *UAV*<sub>1</sub> flies in the same direction as the nodes, while *UAV*<sub>2</sub> flies in the opposite direction. Intuitively, the average contact duration between *UAV*<sub>1</sub> and the nodes is longer than that between *UAV*<sub>2</sub> and the nodes. Thus, communication works better in the *UAV*<sub>1</sub> case than in the *UAV*<sub>2</sub> case. As expected, multiple UAVs work better than a single UAV in data collection because they provide more transmission opportunities for mobile nodes.

**Figure 10.** The impact of *UAV*1, *UAV*<sup>2</sup> and multi-UAVs of proposed algorithm on network size. In these simulations, "Alg1" is the combination of proposed Balance and PCdFS algorithms, "Alg2" is the combination of proposed prioritized-based multiple contact-duration frame selection mechanisms (PMCdFS), PCdFS, and Balance algorithms.

#### *5.2. Real Experiment*

#### 5.2.1. Set Up

We study a path on the Jiading Campus of Tongji University, shown in Figure 11a. It is 5 m wide and 1200 m long, with several intersections and one island (Figure 11a). In these experiments, the UAV is equipped with a Pixhawk autopilot system [42,43] (as shown in Figure 11b) so as to fly along a predefined path at a given height. The UAV is controlled through a ground station (Figure 12), where the flight height, speed and packet transmission are controlled. We deploy 15 bicycles that move along the path, each equipped with a Pixhawk, to emulate the communications based on the proposed algorithms (Figure 11c). These nodes start at a random distance from the origin (point A in Figure 11a). Their locations and speeds are expressed in the NED coordinate system, as presented in Figure 11a.

**Figure 11.** Presentation of the studied path and hardware in experiments. (**a**) Experiments path in Tongji University-Jiading Campus. (**b**) The UAV employed with a Pixhawk autopilot system. (**c**) The Pixhawk autopilot system deployed on a bicycle.

**Figure 12.** A screen shot from ground control station.

Pixhawk has built-in support for the MAVLink protocol [44]. Message No. 24 (GPS\_RAW\_INT) [44] is used as the "beacon" packet of the UAV (including the speed and location of the UAV), whose interval can be configured (e.g., in the following experiments, the beacon interval is set to 2 s). For mobile nodes, message No. 24 (GPS\_RAW\_INT) is used as the "update" packet (including the speed and location of the mobile node). We modified and reused message No. 36 (SERVO\_OUTPUT\_RAW) [44] as the "scheduling" packet of the UAV, which stores the sensor ID and time-slot ID for collision-free communication between the nodes and the UAV. Each MAVLink packet contains a system ID field, which we use to identify the sender. The Pixhawk also has a log system, so the GPS information, as well as the received packet number and time, is stored on the on-board SD card.
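The roles of the two MAVLink messages, and one way a scheduling payload could be carried, can be sketched as follows. The message IDs and names come from the text above; everything else is an assumption made for illustration. In particular, the sketch supposes that the eight 16-bit servo fields of SERVO\_OUTPUT\_RAW are reused as payload and that each field holds one (sensor ID, time-slot ID) pair with one byte per value. The text does not describe the actual encoding, and no MAVLink library is used here.

```python
# Message roles as described in the text: ID -> (MAVLink name, role in the protocol).
MSG_ROLES = {
    24: ("GPS_RAW_INT", "beacon (UAV) / update (mobile node)"),
    36: ("SERVO_OUTPUT_RAW", "scheduling (UAV)"),
}

def pack_schedule(assignments):
    """Pack up to eight (sensor_id, slot_id) pairs into eight 16-bit words.

    Hypothetical layout: high byte = sensor ID, low byte = slot ID, both in
    1..255; unused words are zero.
    """
    if len(assignments) > 8:
        raise ValueError("at most eight assignments fit into one packet")
    words = []
    for sensor_id, slot_id in assignments:
        if not (1 <= sensor_id <= 255 and 1 <= slot_id <= 255):
            raise ValueError("IDs must fit into one byte and start at 1")
        words.append((sensor_id << 8) | slot_id)
    return words + [0] * (8 - len(words))

def unpack_schedule(words):
    """Inverse of pack_schedule; zero words are treated as unused."""
    return [(w >> 8, w & 0xFF) for w in words if w]

words = pack_schedule([(3, 1), (7, 2)])
decoded = unpack_schedule(words)
```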

Figure 13 presents the movements of *UAV*<sub>1</sub> (only one UAV is used in the real experiments), where the flight height is 15 m, with commanded speeds of 5 m s<sup>−1</sup> (Figure 13a) and 3 m s<sup>−1</sup> (Figure 13b), as logged by the Pixhawk. In the experiments, the UAV flew at 15 m and 30 m. Figure 14 gives an example of the instantaneous speeds and trajectories of five nodes (Node 1 to Node 5), as logged by the Pixhawk.

To make the UAV fly along this path, we set four waypoints along the path, as shown in Figure 12. In the experiments, the UAV starts from *Point*<sub>1</sub>, reaches its given speed (5 m s<sup>−1</sup> in Figure 12), and flies to *Point*<sub>2</sub>, *Point*<sub>3</sub> and the ending point (*Point*<sub>4</sub>). In the Pixhawk autopilot system, the UAV hovers at each waypoint and at the ending point for 2 s. That is why, in Figure 13a, the UAV speed is lower than 5 m s<sup>−1</sup> at *P*<sub>2</sub> and *P*<sub>3</sub>. In Figure 13b, both the height and the instantaneous speed of the UAV fluctuate between *Point*<sub>3</sub> and *Point*<sub>4</sub> because of wind. Wind affects the dynamic parameters and thus the relative velocity between the mobile node and the UAV, and the network performance is affected accordingly. However, wind cannot be controlled during experiments.

**Figure 13.** Movements of the UAV flying at 15 m with speeds of 3 m s<sup>−1</sup> and 5 m s<sup>−1</sup>, as shown in the ground control station. (**a**) Movements of the UAV flying at 15 m with a speed of 5 m s<sup>−1</sup>. (**b**) Movements of the UAV flying at 15 m with a speed of 3 m s<sup>−1</sup>.

**Figure 14.** The movements for five nodes. (**a**) Instantaneous speeds of five nodes over time. (**b**) Trajectories of five nodes.

#### 5.2.2. Results

Figures 15 and 16 show the experimental results for the proposed algorithms, i.e., the combination of the Balance algorithm and the PCdFS algorithm. From Figure 15, the number of collected packets in simulation is almost two times larger than in the experiments, because the impacts of hardware and environment are not considered in the simulations. The flight height has a significant impact on the number of collected packets in the experiments, especially when *N<sub>node</sub>* is steady between different heights. The greater the height, the larger the number of nodes in both the PFS and the proposed algorithms. For a network size of 15, the proposed algorithm collects more than twice as many packets at *h* = 30 m as at *h* = 15 m. The system performance increases as the network size increases: the larger the network, the more nodes have opportunities to send packets and the more packets are collected. The number of collected packets in the proposed algorithm (when *h* is 15 m) is 1.2 times larger than in the PFS algorithm.

From Figure 16, it can be seen that the UAV's speed has little impact on data collection in the real experiments. This is because the UAV's speed is limited to 3 m s<sup>−1</sup> and 5 m s<sup>−1</sup> by battery and campus constraints. The nodes' speeds are also between 2 m s<sup>−1</sup> and 5 m s<sup>−1</sup> (Figure 14). Thus, the relative velocity between the UAV and the mobile nodes is very small. The number of collected packets presented in Figure 16 is consistent with the simulation results in Section 5.1.3, where the UAV flies at 5 m s<sup>−1</sup> and 10 m s<sup>−1</sup>.

The flight height has almost no influence on the number of nodes that successfully transmit packets to the UAV, as presented in both Figures 15 and 16, which is consistent with Section 5.1.3.

**Figure 15.** The impact of network size, and flying height over the system performance. In these experiments, the beacon interval is fixed at 2 s according to the simulation results in Figure 7.

**Figure 16.** The impact of network size, and UAV's speed over the system performance. In these experiments, the beacon interval is fixed at 2 s. All the results are based on the combination of Balance and PCdFS algorithms.

#### *5.3. Discussions*

According to the aforementioned simulations, the beacon interval and the UAV speed have a large impact on network performance. The shorter the beacon interval, the better the system performance. The suitable UAV speed is constrained by the node speed: the smaller the relative velocity between them, the higher the network performance. The real experiments lead to the same conclusions. In the real experiments, data collection works well when the UAV speed is set to 5 m s<sup>−1</sup>, which is very close to the average speed of the mobile nodes. Compared to the other dynamic parameters, the number of priority levels has a steady impact on data collection in the simulations. From the movements of the nodes in Figure 14, it can be seen that the difference between the trajectories of the nodes is very small, because the road is 5 m wide and 1200 m long.

Comparing Figure 13a,b, the flight time at 3 m s<sup>−1</sup> is 1.56 times that at 5 m s<sup>−1</sup>, while the speed increases by 66.67% (from 3 m s<sup>−1</sup> to 5 m s<sup>−1</sup>). In the studied scenario, the trajectories at 3 m s<sup>−1</sup> and 5 m s<sup>−1</sup> differ very little, because the UAV follows the same path, whose width is very small compared to its length. Thus, the flight time depends mainly on the speed of the UAV. In other words, the slower the UAV flies, the longer it stays airborne and the more battery energy it consumes. From Figure 16, we notice that data collection differs very little between 3 m s<sup>−1</sup> and 5 m s<sup>−1</sup>. Therefore, under the given constraints, the higher the flight speed of the UAV, the more battery energy is saved.
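The two figures quoted above can be checked with a short calculation. Assuming the same path length *L* at both speeds and ignoring hovering and acceleration (an idealization, not a measured result), with *t*<sub>3</sub> and *t*<sub>5</sub> denoting the flight times at 3 m s<sup>−1</sup> and 5 m s<sup>−1</sup>:

$$\frac{5 - 3}{3} \approx 66.67\%, \qquad \frac{t\_{3}}{t\_{5}} = \frac{L/3}{L/5} = \frac{5}{3} \approx 1.67.$$

The measured ratio of 1.56 is slightly below this ideal value. One possible explanation is a fixed overhead that is the same at both speeds, such as the 2 s hover at each waypoint, since adding the same constant to both flight times always lowers their ratio.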

The flight height has very little impact on data collection in the simulations because the same transmission rate is adopted at all heights. However, the flight height has a large impact on data collection in the experiments, because a real and complex antenna system is involved in the transmissions between the node and the UAV. The greater the flight height, the less interference from external factors (e.g., buildings). Thus, the better the transmission, the higher the network performance.

#### **6. Conclusions**

In this paper, we developed two mechanisms: PCdFS and PMCdFS. The PCdFS mechanism is used to schedule communications when the nodes are covered by only one of the UAVs. PMCdFS is used to balance the communication between the nodes and multiple UAVs when these nodes are within the range of multiple UAVs at the same time. Based on the two mechanisms, we proposed the Balance algorithm, which greatly enhances network fairness in applications where both the nodes and the collectors are mobile. The two key mechanisms in the design of the Balance algorithm are: (i) dividing the area of interest into different priority areas and (ii) providing an independent priority value for each node in the same priority group according to its contact duration with the UAVs. We examined the performance of the proposed algorithms through extensive simulations and real experiments. In the experiments, we used 15 mobile nodes on a path with several intersections and one island on the Tongji campus in Shanghai, China. These experiments also confirm the applicability of the proposed algorithm in a challenging and realistic scenario. Both simulation and experimental results show that the proposed PCdFS algorithm enhances the network performance efficiently. Backhaul dimensioning is an interesting problem that we will address in our future work. It depends on the backhaul type used (either satellite or terrestrial) and on the allocation reserved for the network slice dedicated to Machine Type Communication (MTC) traffic.

**Author Contributions:** Conceptualization, X.M., T.L., R.K., and R.D.; methodology, X.M. and T.L.; software, X.M. and T.L.; validation, X.M. and T.L.; formal analysis, X.M. and T.L.; resources, T.L.; data curation, X.M. and T.L.; writing–original draft preparation, X.M. and T.L.; writing–review and editing, X.M., T.L., R.K. and R.D.; funding acquisition, S.L. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Conflicts of Interest:** Authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Trajectory Planning for Data Collection of Energy-Constrained Heterogeneous UAVs**

**Zhen Qin <sup>1</sup>, Chao Dong <sup>2,</sup>\*, Hai Wang <sup>1</sup>, Aijing Li <sup>1</sup>, Haipeng Dai <sup>3</sup>, Weihao Sun <sup>1</sup> and Zhengqin Xu <sup>1</sup>**


Received: 29 September 2019; Accepted: 7 November 2019; Published: 8 November 2019

**Abstract:** Unmanned Aerial Vehicles (UAVs) are gaining popularity in the Internet-of-Things (IoT), which often deploys many sensors in a relatively wide region. Since their battery capacity is limited, sensors cannot transmit over long distances. It is therefore necessary to design efficient sensor data collection mechanisms that prolong the lifetime of the IoT and enhance data collection efficiency. In this paper, we consider a UAV-enabled data collection scenario, where multiple heterogeneous UAVs with different energy constraints are employed to collect data from sensors. The value of data depends on the importance of the monitoring area of the sensor and the freshness of the collected data. Our objective is to maximize the data collection utility by jointly optimizing the communication scheduling and trajectory of each UAV. The data collection utility is determined by the amount and value of the collected data. This problem is a variant of the multiple knapsack problem, which is a classical NP-hard problem. First, we transform the initial problem into a submodular function maximization problem under energy constraints, and then we design a novel trajectory planning algorithm to maximize the data collection utility, while accounting for different values of data and different energy constraints of heterogeneous UAVs. Finally, under different network settings, the performance of the proposed trajectory planning algorithm is evaluated via extensive simulations. The results show that the proposed algorithm can obtain maximum data collection utility.

**Keywords:** unmanned aerial vehicles; trajectory planning; sensors; data collection utility

#### **1. Introduction**

Thanks to its tremendous application potential in civilian, commercial and military fields, the Internet of Things (IoT) has attracted increasing attention in many applications, e.g., natural disaster prediction, smart cities, environmental monitoring, and reconnaissance [1–5]. The IoT often deploys many energy-constrained sensors over a relatively wide region. Each sensor collects data from its monitoring area and then uses multi-hop transmission to send the data to the base station or sink node. Since their battery capacity is limited, sensors cannot transmit over long distances. Efficient sensor data collection mechanisms are therefore needed to prolong the lifetime of the IoT and enhance data collection efficiency [6].

To achieve efficient data collection, Unmanned Aerial Vehicles (UAVs) are increasingly used to collect data from sensors, which is a promising technology for realizing the future IoT [6,7]. The heterogeneity and multi-domain nature of UAVs are indispensable in the IoT environment [8–10]. Thus, it is necessary for the IoT environment to use multiple heterogeneous UAVs with different capabilities. Different from traditional Wireless Sensor Networks (WSNs), the UAV-enabled data collection system uses mobile data collection devices installed on UAVs to communicate with sensors directly through the UAV-sensor channels, which are dominated by line-of-sight (LoS) links. Thanks to their flexible deployment and high mobility, UAVs can move towards the sensors and establish reliable connections with them [11]. A UAV-enabled data collection system can reduce the energy consumption of sensors and improve throughput and coverage.

Many studies investigate how to use UAVs to collect reliable data from sensors. They mainly focus on two aspects of optimization. On the one hand, some works focus on solving the energy limitation problem of sensors in WSNs [6,12–14]. They aim to optimize the wake-up schedule of sensor nodes, reduce the transmission power of sensors, or improve the energy efficiency of data collection. However, these works rarely consider the value of data or distinguish the data collected from each sensor. For example, in a reconnaissance application, the monitoring data of an enemy command center is more important than that of a living area. The value of data depends on the importance of the monitoring area of the sensor and the time elapsed since the previous collection (i.e., the freshness of the collected data). On the other hand, some works focus on improving the energy efficiency of the UAV system [15–17]. They mainly aim to optimize the deployment, trajectory, and velocity of the UAV. These works use a single UAV or homogeneous UAVs. Few studies consider multiple heterogeneous UAVs with different energy constraints and power efficiency. Multiple heterogeneous UAVs can not only solve the energy limitation problem of a single UAV, but also fully utilize the capability characteristics of heterogeneous UAVs to achieve complementary performance.

In this paper, we study a UAV-enabled data collection scenario, where multiple heterogeneous UAVs with different energy constraints are employed to collect data from sensors. The UAVs are responsible for transmitting data from sensors to the base station or sink node. We aim to maximize the data collection utility by planning trajectories of UAVs. The data collection utility is calculated by the value of data and amount of data. The value of data collected from the sensor depends on the importance of the monitoring area of the sensor and the freshness of collected data. Our problem contains three main technical challenges:


To solve this problem, we transform the initial problem into a submodular function maximization problem under energy constraints, and then, to maximize the data collection utility, we propose a novel trajectory planning algorithm. The main contributions of this paper are summarized as follows:


• Extensive simulations are performed to demonstrate the validity and applicability of the proposed algorithm. The proposed algorithm improves the data collection utility by up to 134%, and it is the closest to the optimal scheme among the compared algorithms.

The rest of this paper is organized as follows. In Section 2, we introduce the related work about the UAV-enabled data collection system and trajectory planning. In Section 3, we present the system model and problem formulation. Then we propose a solution for the formulated problem in Section 4. Simulation results are provided and analyzed in Section 5. Discussion is provided in Section 6. Finally, we conclude the paper in Section 7.

#### **2. Related Work**

#### *2.1. UAV-Enabled Data Collection*

Many works study how to use UAVs to collect data from sensors. In [20], the authors considered UAVs that collected imagery information from nodes and then transmitted it to the ground station. They proposed a predictive compression policy to maximize the end-to-end image quality. Gong et al. utilized a UAV to collect data from sensors deployed on a straight line [16]. The authors minimized the flight time of the UAV from a starting location to an ending location, and they jointly optimized the transmit power of sensors, the speed of the UAV and the data collection intervals. Ebrahimi et al. considered a scenario where UAVs collected data in dense WSNs [6]. The authors used a novel solution methodology called projection-based Compressive Data Gathering (CDG). CDG aggregated data from sensors to selected projection nodes which acted as cluster heads. Next, the UAV transferred the aggregated data from the selected nodes to the sink node. In [21], the authors proposed a novel UAV-assisted backscatter communication scheme. The UAV collected data from terrestrial backscattering tags, and then uploaded the collected data when it flew into the coverage area of the base station. Liu et al. proposed a UAV trajectory design for data collection to reduce redundant data and improve energy efficiency [22]. In [23], the authors deployed multiple UAVs to serve vehicles on a highway. They utilized UAVs to deliver critical data to the vehicles crossing a given highway segment. By planning the trajectory of each UAV and optimizing the radio resource allocation, they aimed to minimize the number of UAVs needed to serve all vehicles. Sanaa et al. deployed UAVs as base stations to provide instant recovery via temporary wireless coverage [24]. They minimized the number of UAVs and optimized their positions in selected locations to enhance performance. Yang et al. studied a UAV-enabled data collection system, in which the UAV was employed to gather data from ground users. The sensors have limited battery and low power. To prolong the lifetime of sensors, UAVs can move close to sensors to collect their information with minimum transmit power [11]. However, these works rarely distinguish the data collected from each sensor. The value of data collected from each sensor is different; it depends on the importance of the monitoring area of the sensor and the freshness of the collected data.

#### *2.2. Trajectory Planning*

Although there is strong interest in UAVs, studies on the location optimization and trajectory planning of UAVs are still in progress. These studies differ in their optimization methods and objective functions because they assume different environments. They are mainly divided into two types: single-UAV trajectory planning and multi-UAV trajectory planning. Hu et al. considered a UAV used for a mobile edge computing system, where the mobile UAV equipped with computing resources provided service for many ground users [13]. By jointly optimizing the ratio of offloading tasks, the trajectory of the UAV, and the user scheduling variables, the authors minimized the maximum delay of all users. In the IoT system, Zhan et al. used a rotary-wing UAV to collect data from IoT devices [14]. Under the energy constraint of the UAV, the authors minimized the maximum energy consumption of all IoT devices. Moataz et al. utilized a UAV to collect data from time-constrained IoT devices [25]. These devices, which had limited buffer sizes, had their own target data upload deadlines, and thus data needed to be collected before it lost its value. Their goal was to maximize the number of served IoT devices by jointly optimizing the radio resource allocation and the trajectory of the UAV. That work took into account the change in the value of data, which provided a basis for us to consider the value of a sensor's data. Hu et al. studied a UAV-enabled wireless power system, where the UAV provided wireless energy supply for ground users with a linear topology. The authors maximized the minimum received energy of ground users by optimizing the trajectory of the UAV [26]. They first presented the globally optimal one-dimensional (1D) trajectory solution to the minimum received energy maximization problem. Zeng et al. studied a multicasting system which utilized a UAV to transmit a file to all ground users [27]. By designing the UAV's trajectory, the authors minimized the mission completion time of the UAV while guaranteeing that each ground user could successfully recover the file.

However, in some applications, a single UAV cannot meet the demands of the mission, and many works study how to design the trajectories of multiple UAVs. To minimize the risk to the population in urban environments, the authors of [28] proposed a risk-aware trajectory planning algorithm for multiple UAVs. Islam et al. proposed a task-oriented trajectory planning scheme for multiple UAVs [29]. The UAVs made autonomous decisions to find their trajectories to the mission area while avoiding collisions with obstacles. In [30], the authors aimed to minimize the mission time by planning the trajectory of each UAV, while satisfying the time requirements. Under the same test scenarios, Christian et al. presented advancements over the A\* and the smoothing algorithms [31]. Hu et al. exploited nested Markov chains to analyze the probability of successful data transmission, and then proposed a sense-and-send protocol for real-time sensing missions [32]. To solve the decentralized UAV trajectory planning problem, they proposed a multi-UAV Q-learning algorithm. Wu et al. used multiple UAVs as mobile base stations which provided service to ground users [33]. The authors optimized the trajectory of each UAV to maximize the minimum throughput of ground users. Zhan et al. employed multiple UAVs to collect data from sensors in a WSN [17]. By jointly optimizing the trajectories of the UAVs and the wake-up association and scheduling of the sensors, the authors minimized the maximum mission completion time of all UAVs. However, few studies consider multiple heterogeneous UAVs with different energy constraints and power efficiency. Multiple heterogeneous UAVs can not only solve the energy limitation problem of a traditional single UAV, but also make use of the capability characteristics of heterogeneous UAVs to achieve complementary performance.

#### **3. System Model and Problem Formulation**

#### *3.1. System Model*

#### 3.1.1. Network Model

We consider a UAV-enabled data collection scenario, where *k* heterogeneous UAVs with different energy constraints are used to carry data from sensors to a remote base station or sink node, as shown in Figure 1. In the UAV-enabled data collection system, since sensors are deployed over a large area, it is inconvenient for the UAVs to fly over each sensor to collect data. To achieve efficient and scalable performance, clustering approaches are increasingly adopted in WSNs. In this paper, an overlapping clustering method is used to divide the sensors on the ground [34,35]. Sensors transmit data to cluster heads, and then UAVs move towards cluster heads to collect data. The characteristic of overlapping clustering is that a sensor may belong to multiple clusters at the same time, which is different from traditional clustering algorithms. A cluster head can receive data from all sensors in its coverage. In other words, a sensor transmits its data to every cluster head to which it belongs. For example, if a sensor lies in two overlapping clusters, it transmits its data to both cluster heads. Establishing overlapping clusters can improve the success rate and robustness of data collection. For convenience, Table 1 lists the major notations used in this paper.
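The overlapping membership described above can be illustrated with a short sketch. The clustering method of [34,35] is not detailed here, so the fixed-radius rule, the function name `overlapping_clusters`, and the coordinates below are assumptions chosen only for illustration.

```python
import math

def overlapping_clusters(sensors, heads, radius):
    """Return {head index: [sensor indices within `radius` of that head]}.

    A sensor inside the range of several heads appears in several lists,
    which is what makes the clustering overlapping.
    """
    clusters = {a: [] for a in range(len(heads))}
    for j, (sx, sy) in enumerate(sensors):
        for a, (hx, hy) in enumerate(heads):
            if math.hypot(sx - hx, sy - hy) <= radius:
                clusters[a].append(j)
    return clusters

# Hypothetical layout: three sensors on a line, two cluster heads.
sensors = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
heads = [(2.0, 0.0), (8.0, 0.0)]
clusters = overlapping_clusters(sensors, heads, radius=4.0)
print(clusters)  # sensor 1 is listed under both heads
```

In this toy layout the middle sensor is within range of both heads, so it would send its data to both of them.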


**Table 1.** Major notations.

The UAV, sensor and cluster head sets are denoted as *U* = {*u*<sub>1</sub>, ..., *u<sub>i</sub>*, ..., *u<sub>k</sub>*}, *S* = {*s*<sub>1</sub>, ..., *s<sub>j</sub>*, ..., *s<sub>n</sub>*} and *C* = {*c*<sub>1</sub>, ..., *c<sub>a</sub>*, ..., *c<sub>m</sub>*}, respectively. In addition, ground sensors can be grouped into *m* (possibly overlapping) sets, *S*<sub>1</sub>, *S*<sub>2</sub>, ..., *S<sub>m</sub>*. Each UAV *u<sub>i</sub>* is constrained by an energy budget *E*<sub>max,*i*</sub>. In this paper, the UAV mainly consumes communication-related energy and propulsion energy [36–38]. The communication-related energy is used for transmitting the collected data. The propulsion energy includes motion energy and hovering energy. The UAV consumes motion energy for flying between clusters, and hovering energy for hovering at cluster heads to collect data.

**Figure 1.** Monitoring scenario.

#### 3.1.2. Propulsion Energy Consumption Model

The motion energy is spent to overcome gravity and the drag forces caused by forward motion and wind. The motion energy consumption is calculated from the minimum motion power *p*<sub>min,*m*</sub> and the length of a UAV's trajectory [38,39]. It can be expressed as

$$E\_{m,i} = \frac{p\_{m,i} \cdot b\_i}{\upsilon\_i} = \frac{p\_{\min,m} \cdot b\_i}{\eta\_i \cdot \upsilon\_i}, \tag{1}$$

where *p<sub>m,i</sub>* is the actual motion power consumption of UAV *u<sub>i</sub>*, *η<sub>i</sub>* is the UAV's power efficiency, *v<sub>i</sub>* is the velocity of UAV *u<sub>i</sub>*, and *b<sub>i</sub>* is the length of trajectory *L<sub>i</sub>*.

The hovering energy consumption depends on the hovering time and the actual hovering power *p<sub>h,i</sub>*. The actual hovering power is determined by the power efficiency and the minimum hovering power *p*<sub>min,*h*</sub>. The minimum hovering power depends on the air density and on the diameter, thrust, and number of rotors [39,40]. The hovering time is calculated from the amount of data *N<sub>a,i</sub>* that UAV *u<sub>i</sub>* collects from cluster head *c<sub>a</sub>* and the data transmission rate *R<sub>a,i</sub>* between cluster head *c<sub>a</sub>* and UAV *u<sub>i</sub>*. Therefore, the hovering energy consumption of UAV *u<sub>i</sub>* for collecting data from cluster head *c<sub>a</sub>* can be calculated by

$$E\_{h,i}^{a} = \frac{p\_{h,i} \cdot N\_{a,i}}{R\_{a,i}} = \frac{p\_{\min,h} \cdot N\_{a,i}}{\eta\_i \cdot R\_{a,i}}.\tag{2}$$

In this paper, we mainly consider UAVs used for data collection. Such applications commonly use small rotary-wing UAVs, for example a UAV with a mass of 2.07 kg, 4 rotors, and a rotor diameter of 0.254 m [38]. According to [39,40], the minimum motion power is set to 388.32 J/s, and the minimum hovering power is set to 308 J/s.
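As a worked illustration of Equations (1) and (2), the sketch below evaluates both propulsion terms with the minimum powers quoted above. The function names, the 10 m/s speed, the efficiency of 0.8, and the 2 Mbit payload are assumptions made for the example; the 2 Mbps rate is the value used later in the simulations.

```python
P_MIN_MOTION = 388.32  # J/s, minimum motion power quoted in the text
P_MIN_HOVER = 308.0    # J/s, minimum hovering power quoted in the text

def motion_energy(path_length_m, speed_mps, efficiency):
    """Equation (1): p_min,m * b_i / (eta_i * v_i), in joules."""
    return P_MIN_MOTION * path_length_m / (efficiency * speed_mps)

def hovering_energy(data_bits, rate_bps, efficiency):
    """Equation (2): p_min,h * N_a,i / (eta_i * R_a,i), in joules."""
    return P_MIN_HOVER * data_bits / (efficiency * rate_bps)

# Assumed values: 1 km flown at 10 m/s, efficiency 0.8, 2 Mbit read at 2 Mbps.
e_move = motion_energy(1000.0, 10.0, 0.8)
e_hover = hovering_energy(2e6, 2e6, 0.8)
print(e_move, e_hover)
```

Under these assumed values, flying dominates hovering by two orders of magnitude, which suggests why the order in which cluster heads are visited matters for the energy budget.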

#### 3.1.3. Communication-Related Energy Consumption Model

The communication-related energy consumption for transmitting data cannot be ignored when the transmission distance or the amount of data is large. The energy consumed to successfully transmit data wirelessly is affected by the channel between the source and destination nodes, the transmission distance, and other factors such as interference, fading and noise. The communication energy consumption for transmitting *N<sub>a,i</sub>* bits over distance *d* can be calculated by [41]

$$E\_{c,i}^{a} = N\_{a,i} \cdot d^{\alpha} \cdot e\_{x}, \tag{3}$$

where *e<sub>x</sub>* and *α* are constants which depend on the characteristics of the communication channel. *e<sub>x</sub>* is the unit energy consumption, i.e., the energy consumed for transmitting one bit, measured in J/(m<sup>α</sup> · bit), and *α* is the path loss exponent, which depends on the data transmission environment.
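Equation (3) can be evaluated directly, reading the exponent on the distance as the path loss exponent *α*. In the sketch below, the function name and the numerical values of the unit energy and the path loss exponent are placeholders, not the parameters used in the simulations.

```python
def communication_energy(data_bits, distance_m, unit_energy, path_loss_exp):
    """Equation (3): N_a,i * d**alpha * e_x, in joules."""
    return data_bits * distance_m ** path_loss_exp * unit_energy

# Placeholder values: 1 Mbit over 100 m, e_x = 1e-9 J/(m^2 * bit), alpha = 2.
e_comm = communication_energy(1e6, 100.0, 1e-9, 2.0)
print(e_comm)
```

Because the distance enters with exponent *α*, doubling the distance with *α* = 2 quadruples the energy, whereas doubling the payload only doubles it.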

#### 3.1.4. Utility Model

The data collection utility is calculated by the value of data and the amount of data. The value of data depends on the importance of the monitoring area of the sensor and the freshness of collected data. In fact, the importance of the monitoring area has different performance metrics in different applications and scenarios. For example, the importance of the monitoring area can be defined by traffic [42].

To calculate the data collection utility, we first define the value of data. On the one hand, the value of data collected from a sensor depends on the importance of the monitoring area of the sensor. In this paper, the initial value of data from *s<sub>j</sub>* is defined as *V<sub>j</sub>*<sup>0</sup> = *V<sub>j</sub>*<sup>max</sup>. Once the data of sensor *s<sub>j</sub>* is collected, the value of its data is set to *V<sub>j</sub>*<sup>min</sup>. For any two sensors *s<sub>j</sub>* and *s<sub>j′</sub>*, if the monitoring area of *s<sub>j</sub>* is more important than that of *s<sub>j′</sub>*, the relations can be expressed by

$$V\_j^{\max} \geqslant V\_{j'}^{\max} \geqslant V\_j^{\min} \geqslant V\_{j'}^{\min}.\tag{4}$$

On the other hand, the value of data collected from sensor *sj* depends on elapsed time after the previous collecting time (i.e., the freshness of collected data). For each sensor *sj*, recovery interval *Tj* is different which depends on the importance of the monitoring area and the required monitoring interval of the sensor. At time *t*, the value of data collected from sensor *sj* can be denoted as [43]

$$V\_j(t) = \begin{cases} A \times \exp(t - t'\_j) + B, & \text{if } |t - t'\_j| \le T\_j \\ V\_j^{\max}, & \text{otherwise} \end{cases},\tag{5}$$

$$\begin{cases} A = \dfrac{V\_j^{\max} - V\_j^{\min}}{e^{T\_j} - 1}, \\ B = V\_j^{\min} - A, \end{cases} \tag{6}$$

where *t′<sub>j</sub>* is the time of the previous data collection from sensor *s<sub>j</sub>*. As Equations (5) and (6) show, when the sensor's data is collected by a UAV, the value of data drops to the minimum value. As time elapses, the value of data increases exponentially until it reaches its maximum value, where it remains until the sensor's data is collected by a UAV again.
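The recovery behaviour of Equations (5) and (6) can be checked numerically. The sketch below assumes that the denominator of *A* is exp(*T<sub>j</sub>*) − 1, which is what the two boundary conditions require (minimum value at the collection time, maximum value one recovery interval later); the function name and the numbers are illustrative only.

```python
import math

def data_value(t, t_prev, recovery, v_min, v_max):
    """Value of a sensor's data at time t, following Equations (5) and (6).

    t_prev is the previous collection time and recovery is T_j. Time units
    should be chosen so that math.exp(recovery) does not overflow.
    """
    elapsed = t - t_prev
    if abs(elapsed) > recovery:
        return v_max                      # fully recovered
    a = (v_max - v_min) / (math.exp(recovery) - 1.0)
    b = v_min - a
    return a * math.exp(elapsed) + b

# Assumed sensor: value range [1, 10], recovery interval of 5 time units.
just_collected = data_value(0.0, 0.0, 5.0, 1.0, 10.0)   # minimum value
half_way = data_value(2.5, 0.0, 5.0, 1.0, 10.0)         # still recovering
recovered = data_value(7.0, 0.0, 5.0, 1.0, 10.0)        # back at maximum
```

Because the growth is exponential, most of the value is regained near the end of the recovery interval, so revisiting a sensor shortly after a collection yields little additional utility.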

In this paper, sensors transmit data to cluster heads, and then UAVs move towards cluster heads to collect data. The data collection utility mainly depends on the amount of data and its value. The data collection utility of the selected cluster head *ca* which is served by a UAV *ui* can be given by

$$q\_{a,i} = \sum\_{s\_j \in S\_a} N\_{a,i}(s\_j) \cdot V\_j(t\_{a,i}), \tag{7}$$

where *S<sub>a</sub>* is the set of all sensors in cluster *c<sub>a</sub>*, *N<sub>a,i</sub>*(*s<sub>j</sub>*) is the amount of data of sensor *s<sub>j</sub>* held by cluster head *c<sub>a</sub>*, which is served by UAV *u<sub>i</sub>*, and *t<sub>a,i</sub>* is the time when UAV *u<sub>i</sub>* starts to collect data from cluster head *c<sub>a</sub>*. Since the time of data collection is relatively short, we do not consider changes in the amount and value of the data during data collection. The data collection utility of UAV *u<sub>i</sub>* can be calculated by

$$Q\_i = \sum\_{c\_a \in P\_i} q\_{a,i}, \tag{8}$$

where *P<sub>i</sub>* is the set of cluster heads served by UAV *u<sub>i</sub>*. Therefore, the overall utility of the data collection mission can be calculated by

$$Q(P) = \sum\_{i=1}^{k} Q\_i = \sum\_{i=1}^{k} \sum\_{c\_a \in P\_i} q\_{a,i} \,. \tag{9}$$
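Equations (7) to (9) amount to a nested sum over UAVs, the cluster heads each UAV serves, and the sensors of each cluster. A minimal sketch follows, in which the data layout (`plan`, `clusters`, `values`) and all numbers are hypothetical.

```python
def overall_utility(plan, clusters, data_bits, values):
    """Equation (9): sum of q_{a,i} over every cluster head served by every UAV.

    plan      : {UAV id: [cluster head ids]}            (the sets P_i)
    clusters  : {cluster head id: [sensor ids]}         (the sets S_a)
    data_bits : amount of data per sensor
    values    : values[a][j] = value of sensor j's data when head a is served
    """
    total = 0.0
    for heads in plan.values():
        for a in heads:
            # Equation (7): utility of one served cluster head
            total += sum(data_bits[j] * values[a][j] for j in clusters[a])
    return total

# Hypothetical instance: sensor 1 lies in both clusters and has already been
# collected via head 0, so its data is only worth the minimum value at head 1.
clusters = {0: [0, 1], 1: [1, 2]}
data_bits = [100.0, 200.0, 300.0]
values = {0: {0: 10.0, 1: 10.0}, 1: {1: 1.0, 2: 8.0}}
plan = {"u1": [0], "u2": [1]}
q_total = overall_utility(plan, clusters, data_bits, values)
```

In this toy instance the shared sensor contributes 2000 through head 0 but only 200 through head 1, which is the overlap effect that the submodularity argument later relies on.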

#### *3.2. Problem Formulation*

In this paper, the flying altitude of the UAVs is assumed to be a constant altitude *H*. We assume that *r* = (*xr*, *yr*, *H*) is the initial location of all UAVs. The total energy consumption *Ei* includes the hovering energy consumption, motion energy consumption and communication energy consumption, which can be expressed by

$$E\_i = E\_{m,i} + E\_{h,i} + E\_{c,i}. \tag{10}$$

Denote the trajectory of UAV *u<sub>i</sub>* projected on the ground as *l<sub>i</sub>*(*t*) = [*x<sub>i</sub>*(*t*), *y<sub>i</sub>*(*t*)]<sup>T</sup> ∈ ℝ<sup>2×1</sup>, where 0 ≤ *t* ≤ *T*. The trajectory of each UAV is subject to the velocity constraint

$$\left\| \dot{l}\_i(t) \right\| \le \upsilon\_{\max}, \forall i, t \in [0, T], \tag{11}$$

where *l̇<sub>i</sub>*(*t*) is the time derivative of *l<sub>i</sub>*(*t*) and *v*<sub>max</sub> is the maximum velocity of the UAVs.

Our goal is to plan the trajectories of heterogeneous UAVs with different energy constraints to maximize the overall data collection utility. Therefore, the optimization problem can be formulated as

$$P1: \max\_{P \subseteq C,\, L\_i,\, 1 \le i \le k} Q(P) \tag{12}$$

$$\text{s.t.} \quad E\_i \le E\_{\max,i}, \forall i,\tag{13}$$

$$\left\|\dot{l}\_{i}(t)\right\| \le \upsilon\_{\max}, \forall i, t \in [0, T], \tag{14}$$

$$L\_i(0) = L\_i(T) = r, \forall i,\tag{15}$$

$$\left\|L\_i(t) - L\_{i'}(t)\right\| \ge d\_{\min}, \forall i \ne i', t \in [0, T], \tag{16}$$

where *P* represents the selected cluster heads, *Li* represents the trajectory of UAV *ui*, *r* is the initial location of all UAVs and *d*min denotes the minimum distance between UAVs to ensure collision avoidance. Constraint (13) implies that the energy consumption of UAV *ui* cannot exceed its energy budget Emax,*i*. Constraint (15) ensures that each UAV returns to the initial location *r* by the end of the data collection mission. Since all UAVs fly at the same altitude *H*, their trajectories are also constrained by collision avoidance (16).

#### **4. Solution**

#### *4.1. Hardness Analysis*

The formulated problem combines two levels of optimization. The objective of the upper level is to select cluster heads, and the objective of the lower level is to design trajectories for the energy-constrained heterogeneous UAVs. The result of each level directly affects the other. If we select cluster heads without considering trajectory planning, the UAVs will consume much motion energy. If we plan trajectories without selecting appropriate cluster heads, the data collection utility will not be maximized. Therefore, the two levels are coupled with each other and cannot be solved separately.

Without considering the motion energy consumption, the upper cluster head selection problem can be regarded as a simplified form of the formulated problem

$$P2: \max\_{P \subseteq C,\, L\_i,\, 1 \le i \le k} Q(P) \tag{17}$$

$$\text{s.t.} \quad E\_{h,i} + E\_{c,i} \le E\_{\max,i}, \forall i,\tag{18}$$

$$\left\|\dot{l}\_i(t)\right\| \le \upsilon\_{\max}, \forall i, t \in [0, T], \tag{19}$$

$$L\_i(0) = L\_i(T) = r, \forall i,\tag{20}$$

$$\left\|L\_i(t) - L\_{i'}(t)\right\| \ge d\_{\min}, \forall i \ne i', t \in [0, T]. \tag{21}$$

This problem can be modeled as a multiple knapsack problem with capacity constraints. When *k* = 1, it reduces to the knapsack problem, which is a classical NP-hard problem. Therefore, when *k* is greater than 1, our problem is also NP-hard. The knapsack problem is a combinatorial optimization problem: given items, each with its own weight and value, and a weight limit, the objective is to select the items that maximize the total value without exceeding the limit [18,19].
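For intuition about the single-UAV case, the classical 0/1 knapsack problem can be solved exactly by dynamic programming when the weights are integers. The sketch below is the textbook method, not part of the proposed algorithm; treating each cluster head as an item with an integer energy cost is a simplifying assumption.

```python
def knapsack(values, weights, capacity):
    """Best total value of a subset of items whose total weight fits the capacity."""
    best = [0] * (capacity + 1)           # best[c] = best value with capacity c
    for value, weight in zip(values, weights):
        # iterate capacities downwards so that each item is used at most once
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]

# Items play the role of cluster heads: value = utility, weight = energy cost.
print(knapsack([60, 100, 120], [10, 20, 30], 50))  # 220 (second and third item)
```

The running time grows with the numerical value of the capacity rather than with the size of its encoding, so this pseudo-polynomial method does not contradict NP-hardness.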

In addition, if we consider the motion energy consumption, we need to compute *k* closed trajectories that together cover all cluster heads in the selected set *P*. This problem can be formulated as a multiple Travelling Salesman Problem, which is also an NP-hard problem. Therefore, the original optimization problem, which couples two NP-hard problems, is difficult to solve.

#### *4.2. Submodular Analysis*

To solve this problem, we transform the initial problem into the problem of maximizing a submodular function with energy constraints. We prove the data collection utility function has three tractable properties: submodularity, nonnegativity and monotonicity. We first give some definitions to facilitate further analysis.

**Definition 1.** *(Monotonicity, nonnegativity, and submodularity) Given a finite set* Ψ*, let f* : 2<sup>Ψ</sup> → ℝ *be a set function. f is called monotone (nondecreasing), nonnegative, and submodular if and only if it satisfies the following requirements, respectively.*


**Theorem 1.** *The constructed objective function is submodular, monotone and nonnegative.*

**Proof.** According to the definition of the data collection utility function, *Q*(*P*) ≥ 0, then it is nonnegative.

The data collection utility *Q*(*P*) increases as the number of cluster heads served by UAVs increases. According to the utility model, for sets *X* ⊆ *Y* ⊆ *C*, we obtain the inequality

$$Q(X) \le Q(Y),\tag{22}$$

which implies that *Q*(*P*) is monotone. Next, we prove that *Q*(*P*) is a submodular function by proving the inequality

$$Q(X \cup \{c\}) - Q(X) \ge Q(Y \cup \{c\}) - Q(Y), \quad X \subseteq Y \subseteq C,\ c \in C \backslash Y, \tag{23}$$

where *X* and *Y* are sets of cluster heads in the WSN and *c* is a cluster head not in *Y*. We denote by *S<sub>X</sub>* the set of sensors covered by the cluster heads in *X*. We prove the inequality in two cases.

*Case 1* (*S<sub>c</sub>* ∩ *S<sub>Y</sub>* = ∅): In this case, the data of the sensors in the newly added cluster has not yet been collected. Therefore, the data of every sensor covered by cluster head *c* has its maximum value *V*<sup>max</sup>, regardless of whether *c* is added to *X* or to *Y*. We obtain

$$Q(X \cup \{c\}) - Q(X) = Q(Y \cup \{c\}) - Q(Y). \tag{24}$$

*Case 2* (*S<sub>c</sub>* ∩ *S<sub>Y</sub>* ≠ ∅): In this case, the data of some sensors in the newly added cluster has already been collected. Once the data of a sensor is collected, its value is set to *V*<sup>min</sup>. Since *X* ⊆ *Y* implies *S<sub>X</sub>* ⊆ *S<sub>Y</sub>*, every sensor of *c* already collected under *X* is also already collected under *Y*, so the marginal gain of adding *c* to *X* is at least that of adding it to *Y*. Therefore, we obtain

$$Q(X \cup \{c\}) - Q(X) \ge Q(Y \cup \{c\}) - Q(Y). \tag{25}$$

Therefore, *Q*(*P*) is a submodular function. The initial problem is thus transformed into a submodular function maximization problem with energy constraints. We propose a novel trajectory planning algorithm that builds on the ideas of [44,45]. It aims to maximize the overall data collection utility, while accounting for cluster head selection and the different energy constraints of heterogeneous UAVs.
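The three properties can also be checked by brute force on a small instance. The sketch below uses a simplified stand-in for *Q*(*P*) in which the first collection of a sensor earns its maximum value and every repeated collection earns its minimum value, ignoring the time-dependent recovery of Equation (5); the helper names and the numbers are assumptions.

```python
from itertools import combinations

def utility(selected, clusters, bits, v_max, v_min):
    """Toy Q(P): the first collection of a sensor earns v_max, repeats earn v_min."""
    visits = {}
    for a in selected:
        for j in clusters[a]:
            visits[j] = visits.get(j, 0) + 1
    return sum(bits[j] * (v_max[j] + (n - 1) * v_min[j]) for j, n in visits.items())

def is_monotone_submodular(ground, q, tol=1e-9):
    """Exhaustively test inequalities (22) and (23) on every pair X subset of Y."""
    ground = frozenset(ground)
    subsets = [frozenset(s) for r in range(len(ground) + 1)
               for s in combinations(sorted(ground), r)]
    for x in subsets:
        for y in subsets:
            if not x <= y:
                continue
            if q(x) > q(y) + tol:                       # monotonicity (22)
                return False
            for c in ground - y:                        # diminishing returns (23)
                if q(x | {c}) - q(x) < q(y | {c}) - q(y) - tol:
                    return False
    return True

# Hypothetical instance with three overlapping clusters over four sensors.
clusters = {0: [0, 1], 1: [1, 2], 2: [2, 3]}
bits = [1.0, 2.0, 3.0, 4.0]
v_max = [10.0, 8.0, 6.0, 9.0]
v_min = [1.0, 1.0, 2.0, 3.0]
q = lambda p: utility(p, clusters, bits, v_max, v_min)
ok = is_monotone_submodular(clusters.keys(), q)
```

The check enumerates all pairs of subsets, so it is only practical for a handful of cluster heads; it is a sanity check on the definitions, not a substitute for the proof.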

#### *4.3. Algorithm*

Based on the submodular function, we jointly consider the upper level optimization and lower level optimization, and then we design a simple but efficient algorithm referring to the idea of [44,45]. The algorithm attempts to select appropriate cluster heads to collect data and design the collecting sequence. The core idea of our algorithm is to iteratively select a new cluster head *cj* by greedy method, which has the maximum utility-cost ratio. For example, in iteration *j*, the selected cluster head can be expressed as follows

$$c\_j = \underset{c \in I \backslash P\_{j-1}}{\arg\max} \frac{Q(P\_{j-1} \cup \{c\}) - Q(P\_{j-1})}{E(P\_{j-1} \cup \{c\}) - E(P\_{j-1})}.\tag{26}$$

Algorithm 1 consists of a parent loop and a child loop. After inputting and initializing the relevant parameters, we use the parent loop to select cluster heads and plan trajectories (Line 2–Line 12). The parent loop terminates once all cluster heads have been traversed. Inside the parent loop, a child loop evaluates every remaining cluster head (Line 3–Line 6). In each iteration of the child loop, we use Algorithm 2 to calculate the energy cost and the trajectory, taking the energy constraint of each UAV into account. The utility and energy cost are calculated for the previously selected cluster heads plus the candidate cluster head, and then we pick the cluster head with the highest utility-cost ratio (Line 7). Afterwards, we check whether the UAVs satisfy their respective energy constraints and, if so, add the cluster head to the selection (Line 8–Line 10). Next, the cluster head is removed from the candidate set (Line 11) and the next iteration of the parent loop begins. Each iteration of the parent loop returns a solution that is at least as good as the previous one, and the quality of the result depends on the quality of Algorithm 2. Finally, we obtain the selected cluster heads *P* and the trajectories of UAVs *L* which satisfy the respective energy constraints.

#### **Algorithm 1** Cluster Head Selection and Trajectory Planning Algorithm

**Input:** Cluster head set *C*, energy constraints *E*<sub>max,*i*</sub>, 1 ≤ *i* ≤ *k*.

**Output:** Selected cluster heads *P* ⊆ *C* and *k* trajectories of UAVs.

1: Initialize *I* ← *C*, *P*<sub>0</sub> ← ∅, *E*(*P*<sub>0</sub>) ← 0, *Q*(*P*<sub>0</sub>) ← 0, *L* ← ∅, *j* ← 1;

2: **while** *I* ≠ ∅ **do**

3: **for** *i* = 1 to |*I*| **do**

4: Calculate *Q*(*P*<sub>*j*−1</sub> ∪ {*c<sub>i</sub>*}) and *Q*(*P*<sub>*j*−1</sub>);

5: Use Algorithm 2 to get the trajectories and the energy cost *E*(*P*<sub>*j*−1</sub> ∪ {*c<sub>i</sub>*});

6: **end for**

7: *c<sub>j</sub>* = arg max<sub>*c*∈*I*</sub> [*Q*(*P*<sub>*j*−1</sub> ∪ {*c*}) − *Q*(*P*<sub>*j*−1</sub>)] / [*E*(*P*<sub>*j*−1</sub> ∪ {*c*}) − *E*(*P*<sub>*j*−1</sub>)];

8: **if** *E*(*P*<sub>*j*−1</sub> ∪ {*c<sub>j</sub>*}) ≠ *Inf* **then**

9: *P<sub>j</sub>* ← *P*<sub>*j*−1</sub> ∪ {*c<sub>j</sub>*}, *j* ← *j* + 1, *L* ← *L<sub>cj</sub>*;

10: **end if**

11: *I* ← *I* ∖ {*c<sub>j</sub>*};

12: **end while**

13: Output *P* ← *P*<sub>*j*−1</sub>, *L*.
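The control flow of Algorithm 1 can be sketched compactly. In the version below, `utility` and `cost` are caller-supplied functions standing in for *Q*(·) and for the energy cost returned by Algorithm 2; the additive single-UAV cost in the example is a hypothetical stand-in and not the trajectory-based cost of the paper.

```python
import math

def greedy_select(candidates, utility, cost):
    """Greedy selection by utility-cost ratio, following the shape of (26).

    `utility(P)` and `cost(P)` take a list of cluster heads; `cost` returns
    math.inf when the UAVs cannot serve P within their energy budgets.
    """
    remaining, chosen = set(candidates), []
    while remaining:
        q0, e0 = utility(chosen), cost(chosen)

        def ratio(c):
            e1 = cost(chosen + [c])
            if math.isinf(e1):
                return 0.0                      # finite gain over infinite cost
            gain, extra = utility(chosen + [c]) - q0, e1 - e0
            return gain / extra if extra > 0 else math.inf

        best = max(sorted(remaining), key=ratio)
        if not math.isinf(cost(chosen + [best])):
            chosen.append(best)                 # feasible: keep it
        remaining.discard(best)                 # either way, never retry it
    return chosen

# Hypothetical single-UAV instance: additive costs, budget of 10 energy units.
q = {"a": 6.0, "b": 5.0, "c": 4.0}
w = {"a": 5.0, "b": 4.0, "c": 6.0}
utility = lambda p: sum(q[c] for c in p)
cost = lambda p: sum(w[c] for c in p) if sum(w[c] for c in p) <= 10.0 else math.inf
selected = greedy_select(q, utility, cost)
print(selected)  # ['b', 'a']: head 'c' no longer fits once 'b' and 'a' are taken
```

Each candidate is examined once per parent iteration and removed after being picked, so the loop runs at most |*C*| times, as in Algorithm 1.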

**Algorithm 2** Multiple Energy-constrained Heterogeneous UAV Trajectory Planning Algorithm

**Input:** *P*, Emax,*i*, 1 ≤ *i* ≤ *k*, starting location *r*.

**Output:** *k* trajectories of UAVs and the energy cost *E*.

1: Initialize *Y* ← *P*, *E* ← ∅, MAXE = 0, *L* ← ∅;


```
4: for i = 1 to |ζ| do
7: Use the TSP algorithm to calculate the energy cost Ei,j and the trajectory Li,j
   which covers all cluster heads in Xi ∪ cj;
8: if Ei,j ≤ Emax,i then
9: Xi ← Xi ∪ cj, Ei ← Ei,j, Li ← Li,j;
10: end if
11: end for
12: Y ← Y\Xi, E ← E + Ei, L ← L ∪ Li;
13: end for
14: if Y ≠ ∅ then
15: E = Inf;
16: end if
17: end while
```
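Since some steps of Algorithm 2 are not reproduced above, the following sketch is only one plausible reading of its visible steps: UAVs are processed in the given order, each tentatively adds a remaining cluster head if the resulting closed tour fits its budget, and the total cost is infinite if any cluster head is left unserved. The nearest-neighbour tour is a stand-in for "the TSP algorithm", only motion energy is counted, and all names and numbers are assumptions.

```python
import math

def tour_length(base, points):
    """Length of a closed tour base -> points -> base (nearest-neighbour heuristic)."""
    pos, todo, length = base, list(points), 0.0
    while todo:
        nxt = min(todo, key=lambda p: math.dist(pos, p))
        length += math.dist(pos, nxt)
        todo.remove(nxt)
        pos = nxt
    return length + math.dist(pos, base)

def plan_routes(heads, budgets, joules_per_metre, base):
    """Assign cluster heads to UAVs one UAV at a time; return (routes, total energy).

    The total is math.inf if some cluster head cannot be served.
    """
    remaining, routes, total = list(heads), [], 0.0
    for budget, jpm in zip(budgets, joules_per_metre):
        route, energy = [], 0.0
        for c in remaining:
            trial = jpm * tour_length(base, route + [c])
            if trial <= budget:                 # head fits this UAV's budget
                route.append(c)
                energy = trial
        remaining = [c for c in remaining if c not in route]
        routes.append(route)
        total += energy
    return routes, (math.inf if remaining else total)

# Hypothetical instance: two UAVs, 1 J per metre each, three cluster heads.
base = (0.0, 0.0)
heads = [(3.0, 4.0), (6.0, 8.0), (0.0, 10.0)]
routes, energy = plan_routes(heads, [25.0, 25.0], [1.0, 1.0], base)
_, short = plan_routes(heads, [25.0, 15.0], [1.0, 1.0], base)
```

Heterogeneity enters through the per-UAV budget and the per-metre energy, which by Equation (1) corresponds to the minimum motion power divided by the product of efficiency and speed.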
#### **5. Simulation Results**

In this paper, our algorithm aims to maximize the data collection utility by optimizing the trajectory of each UAV. The data collection utility is calculated by the value and amount of data. The value of data depends on the importance of the monitoring area of the sensor and the freshness of collected data.

#### *5.1. Simulation Setup*

We consider a mission area of size 1 km × 1 km. The simulations are performed according to parameters specified in Table 2 [38–41]. The time requirement for data uploading is not a constant. It can be changed depending on the amount of data and data transmission rate. In fact, sensors continuously monitor the area and generate new data. However, since the time of data collection is relatively short, we do not consider the changes of data's amount and value in the process of data collection. We assume that communication links between UAVs and sensors are dominated by the LoS links where the channel quality mainly depends on the UAV-sensor distance [16,33,46]. Meanwhile, since the UAVs fly at a fixed altitude, we can set the data transmission rate to be 2 Mbps. Furthermore, the simulation results are averaged over extensive simulation runs.



#### *5.2. Baseline Setup*

To demonstrate the performance of the proposed UAV trajectory planning algorithm (UE), we compare and implement the following four benchmark schemes:


#### *5.3. Different Number of Sensors*

In this simulation, we set the number of UAVs to 3. Figure 2 shows the trend of the data collection utility as the number of sensors changes. We can observe that the data collection utility gradually increases as the number of sensors increases. Our algorithm achieves almost the same performance as the optimal scheme when the number of sensors is small. However, the gap between our algorithm and the optimal scheme widens from 5.2% to 66.6% as the number of sensors increases. Meanwhile, the proposed algorithm shows better performance when the number of sensors is large. Compared with the RAN algorithm, the data collection utility of our algorithm is improved by 103%–134%. This is reasonable since our algorithm chooses the data collection point with the highest utility-cost ratio each time, which saves the energy of UAVs and improves their data collection utility. Compared with the EC algorithm, the data collection utility of our algorithm is improved by 49%–62%, because our algorithm considers the value of data when selecting cluster heads, not only the amount of data collected. Furthermore, the GU algorithm chooses the cluster with the highest data collection utility but does not consider the energy consumed for collecting data from this cluster. Compared with the GU algorithm, the data collection utility of our algorithm is improved by 72%–102%. Our proposed algorithm makes reasonable and effective use of the UAVs' energy to collect more valuable sensor data.

**Figure 2.** Different Number of Sensors.

In Figure 3, we illustrate the convergence of our algorithm under different numbers of sensors. The figure shows that our algorithm converges quickly in all three cases. Meanwhile, the number of iterations of the proposed algorithm is related to the number of sensors.

**Figure 3.** Convergence of UE Algorithm (*n* = 20, 60, 120).

#### *5.4. Different Number of UAVs*

In this simulation, the number of sensors is set to 100. As shown in Figure 4, as the number of UAVs increases, the advantages of our algorithm become more obvious. This is because we fully consider the energy constraint and power efficiency of each UAV in trajectory planning. Therefore, as the number of heterogeneous UAVs increases, the data collection utility gradually increases, and our algorithm is closer to the optimal scheme than the other three algorithms. The data collection utility of the optimal scheme is 17.6%–38.9% higher than that of our algorithm. In practical applications, we should consider the mission requirements and existing equipment to dispatch an appropriate number of UAVs to perform the mission. Using too many UAVs may bring economic pressure and reduce the energy efficiency. Figure 5 shows the convergence of the proposed algorithm under different numbers of UAVs. We can observe that our algorithm converges quickly and that the number of UAVs has little effect on its convergence.

**Figure 4.** Different Number of UAVs.

**Figure 5.** Convergence of UE Algorithm (*k* = 2, 5, 9).

#### *5.5. Different Mission Area Sizes*

Figure 6 shows the trend of data collection utility as the size of the mission area changes. We set the number of sensors to 100 and the number of UAVs to 3. We assume the mission area is square, and the variable in this simulation is the side length of the mission area. As shown in Figure 6, the data collection utility gradually decreases as the mission area expands. This is reasonable since the UAVs need to consume more energy to fly between data collection points when the sensors are distributed over a large mission area. In this scenario, less energy remains for data collection, which decreases the data collection utility. However, even when the mission area is large, the data collection utility of our algorithm is still higher than that of the other algorithms. For example, when the mission area is 1500 m × 1500 m, the data collection utility of our algorithm is 52%–134% higher than that of the compared algorithms. Meanwhile, the data collection utility of the optimal scheme is 8.7%–35.1% higher than that of our algorithm as the mission area expands. In Figure 7, we illustrate the convergence of our algorithm under different mission area sizes. We can see that the algorithm converges quickly for all of the considered area sizes.

**Figure 6.** Different Mission Area Sizes.

**Figure 7.** Convergence of UE Algorithm (800 m, 1200 m, 1500 m).

#### *5.6. Trajectories of UAVs*

In this subsection, we use Figure 8 to show the trajectories produced by each of the algorithms. The serial number of a cluster represents the importance of its coverage area. The UE algorithm chooses the data collection points with the highest utility-cost ratio each time, which saves the energy of UAVs and improves their data collection utility. Unlike the other three algorithms, our algorithm takes into account both the data collection utility and the energy consumption.

**Figure 8.** Trajectories of Three UAVs.

#### **6. Discussion**

In this paper, we mainly focus on two-dimensional trajectory planning of UAVs. In fact, it is also worthwhile to optimize the UAVs' altitude. However, optimizing the flight altitude brings some challenges. First, the ascent and descent of UAVs consume extra energy. Second, the flight altitude of UAVs influences the quality of the communication channel. Third, it introduces new optimization variables and enlarges the search space. Further study is needed to solve these problems. In the future, to further improve the performance of the multi-UAV data collection system, we will present a new design framework for three-dimensional UAV trajectories.

#### **7. Conclusions**

In this paper, we consider exploiting UAVs to collect data from sensors. The value of data collected from each sensor is different. It depends on the importance of the monitoring area of the sensor and the freshness of collected data. To improve the data collection utility, we optimize the trajectory planning, communication scheduling and sensor node association. The data collection utility is determined by the amount and value of data. First, we formulate this problem as a variant of the multiple knapsack problem, which is a classical NP-hard problem. We transform the initial problem into the problem of maximizing a submodular function under energy constraints. To maximize the data collection utility, we propose a novel trajectory planning algorithm that accounts for the different values of data and the different energy constraints of heterogeneous UAVs. Extensive simulations are performed to demonstrate the validity and applicability of the proposed algorithm. The results show that our algorithm increases the data collection utility by up to 134%.

**Author Contributions:** Conceptualization, Z.Q., C.D. and H.W.; methodology, H.D. and Z.Q.; software, A.L. and W.S.; validation, Z.Q. and C.D.; formal analysis, H.W. and A.L.; investigation, A.L., W.S. and Z.X.; writing–original draft preparation, Z.Q.; writing–review and editing, C.D., H.D. and A.L.

**Funding:** This work was supported in part by the National Natural Science Foundation of China under Grants (No. 61931011, No. 61872178, No. 61827801, No. 61702545, No. 61702525, and No. 61631020), in part by the Fundamental Research Funds for the Central Universities No. 021014380079, and in part by the Natural Science Foundation of Jiangsu Province under Grant No. BK20181251.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Completion Time Minimization for Multi-UAV Information Collection via Trajectory Planning**

#### **Zhen Qin <sup>1</sup>, Aijing Li <sup>1</sup>, Chao Dong <sup>2,</sup>\*, Haipeng Dai <sup>3</sup> and Zhengqin Xu <sup>1</sup>**


Received: 5 August 2019; Accepted: 16 September 2019; Published: 18 September 2019

**Abstract:** Unmanned Aerial Vehicles (UAVs) are widely used as mobile information collectors for sensors to prolong the network lifetime in Wireless Sensor Networks (WSNs) due to their flexible deployment, high mobility, and low cost. This paper focuses on the scenario where rotary-wing UAVs complete an information collection mission cooperatively. For the first time, we study the problem of minimizing the mission completion time for a multi-UAV system in a monitoring scenario while considering the information collection quality. The mission completion time includes flying time and hovering time. By optimizing the trajectories of all UAVs, we minimize the mission completion time while ensuring that the information of each sensor is collected. This problem can be formulated as a mixed-integer non-convex problem, which has been proved to be NP-hard. To solve the formulated problem, we first propose a hovering point selection algorithm to select appropriate hovering points where the UAVs can sequentially collect the information from multiple sensors. We model this problem as a base station (BS) coverage problem with the information collection quality taken into consideration. Then, we use a min-max cycle cover algorithm to assign these hovering points and obtain the trajectory of each UAV. Finally, with the obtained UAV trajectories, we further consider that the UAVs can also collect information while flying and optimize the time allocations. The performance of our algorithm is verified by simulations, which show that it achieves the shortest mission completion time among the compared state-of-the-art algorithms.

**Keywords:** wireless sensor networks; unmanned aerial vehicle; mission completion time; trajectory planning

#### **1. Introduction**

Due to their tremendous potential in military and civilian applications, Wireless Sensor Networks (WSNs) have attracted increasing research interest in many fields, such as industrial process control, environmental monitoring, and battlefield surveillance [1–4]. Generally, WSNs are composed of small, battery-limited sensors which cannot transmit over a long distance [5]. Recently, there has been great interest in utilizing rotary-wing Unmanned Aerial Vehicles (UAVs) as mobile information collectors in WSNs [6–8]. They can move towards the sensors and establish reliable connections with them due to their high mobility and flexible deployment [9]. Rotary-wing UAVs are equipped with propellers that enable them to hover over fixed locations. This advantage makes them suitable candidates for mobile information collectors, as they can hover over sensors to collect information [10,11]. Nowadays, with the popularity of WSNs, a single UAV cannot meet the demands of monitoring missions over larger monitoring areas, so information collection with multi-UAV cooperation is more desirable. Currently, there are two types of methods to collect the information from WSNs with multiple UAVs. The first is to use static UAVs to form a network to cover all sensors, which is suitable for small area monitoring [12–15]. The second is to use mobile UAVs to cover a larger area [16–18]. As shown in Figure 1, we mainly focus on the second scenario in this paper.

**Figure 1.** Monitoring scenario.

For many WSN applications, the timeliness of monitoring information is critical. Therefore, the mission completion time of multi-UAV cooperative information collection is very important. For example, in a city security monitoring system, if the control center gets fire information as soon as possible, the city may avoid serious economic loss and casualties. In traffic monitoring applications, real-time road information helps cars effectively avoid traffic congestion. In this paper, the mission completion time includes flying time and hovering time, both of which are related to the trajectory planning of the UAVs. Therefore, the trajectories need to be carefully planned such that UAVs can complete their missions quickly and satisfy the communication requirements along their entire trajectories. By optimizing the trajectory of each UAV, we seek to minimize the mission completion time of the multi-UAV system.

Some works have studied how to minimize a single UAV's mission completion time [16,17,19–21]. However, these works cannot be used directly for multi-UAV information collection. In fact, a multi-UAV system requires solving two sub-problems, task assignment and trajectory optimization. Generally, we cannot address either sub-problem individually because the two are tightly coupled. If task assignment is not considered in trajectory planning, it may result in unreasonable task assignment. For example, a UAV may be responsible for too many tasks, which can lead to a longer mission completion time. If trajectory optimization is not considered in task assignment, it may result in a long flying distance, which may also lead to a longer mission completion time. Therefore, we cannot solve the two sub-problems separately. In other words, we cannot assign tasks first and then apply a single-UAV algorithm to solve this problem. Since we use a multi-UAV system to collect information, the completion time of the entire mission is the time at which all tasks have been completed by the UAVs. In other words, it is the maximum completion time among the UAVs. The completion time of a UAV is mainly divided into two parts, flying time and hovering time. Figure 1 gives a toy example illustrating flying time and hovering time. The flying time is the time taken for the UAV to fly from point A to point B. The hovering time is the time the UAV stays at a hovering point (A or B) to collect information from sensors. Our work differs from previous works in two main respects. First, some works only consider flying time or only consider hovering time in UAV-enabled networks [22–24]. Second, some works assume that UAVs only collect information from sensors when hovering [25–30]. If flying time is fully utilized, the mission completion time can be reduced. Therefore, we consider that the UAVs can collect information both when they are hovering and when they are flying.

In this paper, our objective is to plan the trajectories of all UAVs to minimize the mission completion time, while ensuring that the information of each sensor is collected. Rotary-wing UAVs can collect information when they are hovering and flying. This problem can be formulated as a mixed-integer non-convex problem. Since the problem involves infinitely many variables over time, it is difficult to optimize trajectories together with hovering and flying communication scheduling at the same time. To tackle this problem, we propose an improved fly-hover-fly trajectory planning algorithm. We first propose a hovering point selection algorithm to obtain hovering points. Then, based on these hovering points, we plan the trajectory of each UAV. Intuitively, one could use an algorithm for the Multiple Traveling Salesman Problem (MTSP) to plan the trajectories of the UAVs. However, the goal of MTSP is to minimize the total completion time of all UAVs. Different from MTSP, our problem aims to collect all sensors' information with multiple UAVs such that the maximum completion time among UAVs is minimized. By using the Base Station (BS) coverage algorithm and a min-max cycle cover algorithm, we select appropriate hovering points where the UAVs can sequentially collect the information from multiple sensors and obtain the trajectory of each UAV. Finally, based on the obtained trajectories, we propose an effective method to optimize the UAVs' hovering and fly-collection time allocations. The main contributions of this paper are summarized as follows:


The rest of this paper is organized as follows. In Section 2, we introduce the related work. The system model and problem formulation are presented in Section 3. Then, we propose the FHF algorithm to optimize the hovering points and obtain the trajectories in Section 4. In Section 5, we optimize the time allocations. Simulation results are provided and analyzed in Section 6. Discussion is provided in Section 7. Finally, we conclude the paper in Section 8.

#### **2. Related Work**

#### *2.1. Single UAV's Mission Completion Time Problem*

There are many studies on minimizing a single UAV's mission completion time. In [16], the authors studied a UAV-enabled multicasting system and planned the trajectory of the UAV to minimize the mission completion time, while ensuring that ground terminals successfully recovered files. To reduce the single UAV's mission completion time and periodic flight duration, Zhang et al. [17] optimized the trajectory of the UAV and the communication resource allocation, while satisfying the data requirements of ground users. Zhang et al. [19] focused on a scenario where a single UAV communicated with multiple ground base stations. They aimed to minimize the mission completion time by planning the trajectory of the UAV, subject to a practical communication connectivity constraint with the ground base stations. One of the most important reconnaissance missions of a UAV is to take pictures of targets of interest in a wide area. Under this scenario, Lim et al. [20] proposed a trajectory generation algorithm for a reconnaissance UAV with time constraints. Zhang et al. [21] proposed a sparse A\* search approach for single-UAV route planning with time constraints. The above works only discuss a single UAV's mission completion time in different scenarios and cannot be used directly for multi-UAV information collection. Therefore, the mission completion time problem for multi-UAV systems is worth further study.

#### *2.2. Multi-UAV's Mission Completion Time Problem*

There are some studies on completing missions with multiple UAVs within a specified time. Pohl et al. [31] studied the routing of multiple UAVs, where the trajectories of the UAVs met the constraints of total path cost optimization, total mission time, enemy radar avoidance, and time on target. This work aimed to minimize the total distance by planning trajectories. Shi et al. [32] proposed a statistical physics method for cooperative UAV reconnaissance mission planning, in which the trajectories satisfied the reconnaissance time requirements. However, in many applications, it is more important to complete the missions as soon as possible. In [23], the authors introduced a geometric path planning method and proposed a resource allocation algorithm for multiple UAVs. However, they only considered the flying time. Hayat et al. [33] optimized the trajectory of each UAV and synchronized the UAVs so that they could fly to the next task only after all UAVs had completed their previous task. Mousavi et al. [34] aimed to form UAV coalitions to reduce the mission completion time. Wang et al. [35] found the optimal schedule to perform diverse missions with different time windows at various locations using heterogeneous fixed-wing UAVs. These solutions for minimizing the mission completion time of multi-UAV systems are not directly applicable to our problem.

#### **3. System Model and Problem Formulation**

#### *3.1. System Model*

As shown in Figure 1, we consider a monitoring scenario where multiple rotary-wing UAVs are employed to collect information from sensors whose distribution follows a homogeneous Poisson Point Process. Traditionally, sensors transmit their monitored information to a base station via multi-hop transmissions. Therefore, a sensor is required to not only transmit its own information, but also relay the information of other sensors. As a consequence, the sensors' batteries may drain quickly and the multi-hop network connection may be lost [22]. By utilizing the UAVs as mobile information collectors, sensors can directly send monitored information to UAVs. However, as UAVs have limited on-board battery capacity, it is important to reduce the completion time needed for the information collection mission. In this paper, we aim to minimize the maximum completion time among all UAVs by optimizing the trajectories of UAVs to complete the information collection mission as soon as possible. We assume the UAVs fly at a fixed altitude. Each UAV collects information from sensors via time-division multiple access (TDMA), which means that the UAV communicates with at most one sensor at any time. Meanwhile, we consider a scenario where ground users are mainly composed of sensors. The extension to three-dimensional (3D) trajectory planning and other multiple access schemes is left for future work. For convenience, Table 1 provides the major notations used in this paper.


**Table 1.** Major notations.

The UAV and sensor sets are denoted as $U \triangleq \{u_1, \ldots, u_k\}$ and $P \triangleq \{p_1, \ldots, p_n\}$, respectively. Considering a 3D Cartesian coordinate system, sensor $p_j$ is fixed at $c_j = (x_j, y_j, 0)$. Denote the trajectory of UAV $u_i$ projected on the ground as $l_i(t) = [x_i(t), y_i(t)]^T \in \mathbb{R}^{2 \times 1}$, where $0 \le t \le T$. The trajectory of each UAV should satisfy the following constraint

$$l\_i(0) = l\_i(T), \forall i. \tag{1}$$

which means that the UAVs need to fly back to their starting location by the end of monitoring mission. Furthermore, the trajectory of each UAV is subject to the velocity constraints, which can be given by

$$\left\| \dot{l}_i(t) \right\| \le v_{\max}, \forall i, t \in [0, T], \tag{2}$$

where $v_{\max}$ denotes the maximum velocity of the UAVs and $\dot{l}_i(t)$ is the time derivative of $l_i(t)$. As the UAVs fly at a constant altitude, the trajectory of each UAV is expected to satisfy the collision avoidance constraint [36]

$$\left\| l_i(t) - l_{i'}(t) \right\| \ge d_{\min}, \forall i \ne i', t \in [0, T], \tag{3}$$

where $d_{\min}$ is the minimum distance between UAVs required to avoid collision. In addition, the time-varying distance between UAV $u_i$ and sensor $p_j$ can be expressed as

$$d_{i,j}(t) = \sqrt{H^2 + \left(x_i(t) - x_j\right)^2 + \left(y_i(t) - y_j\right)^2}. \tag{4}$$

Due to scattering and ground reflection, the communication channels between UAVs and sensors include both Line-of-Sight (LoS) and Non-Line-of-Sight (NLoS) links [16]. The UAV–ground link channel is characterized by the presence of a strong LoS path. The Rician channel model is an appropriate choice, as it can effectively reflect the combination of scattering and LoS that exists in the links between the UAVs and sensors [13]. In this paper, we therefore adopt the more practical Rician fading channel model between the UAVs and sensors [13,16,37–39]. The instantaneous channel gain between UAV $u_i$ and sensor $p_j$ can be expressed as [16,37]

$$h_{i,j}(t) = \sqrt{\beta_{i,j}(t)}\, g_{i,j}, \tag{5}$$

where $g_{i,j}$ is the small-scale fading coefficient and $\beta_{i,j}(t)$ is the average channel power gain, which can be expressed as [16,37]

$$\beta_{i,j}(t) = \beta_0 d_{i,j}^{-\alpha}(t) = \frac{\beta_0}{\left[H^2 + (x_i(t) - x_j)^2 + (y_i(t) - y_j)^2\right]^{\alpha/2}}, \tag{6}$$

where $\alpha$ is the path loss exponent and $\beta_0$ denotes the average channel power gain at the reference distance $d_0 = 1$ m. The coefficient $g_{i,j}$ is a random variable that captures the effects of small-scale fading with $\mathbb{E}\left[|g_{i,j}|^2\right] = 1$, and can be given by [16,37]

$$g_{i,j} = \sqrt{\frac{K_R}{K_R + 1}}\, g + \sqrt{\frac{1}{K_R + 1}}\, \tilde{g}, \tag{7}$$

where $K_R$ is the Rician factor of the channel and $g$ denotes the deterministic LoS channel component with $|g| = 1$. $\tilde{g}$ represents the random scattered component, which is a zero-mean unit-variance circularly symmetric complex Gaussian (CSCG) random variable. For Rician fading, the cumulative distribution function (CDF) of $|g_{i,j}|^2$ is explicitly represented as [37,38]

$$F_{|g_{i,j}|^2}(x) = 1 - Q_1\left(\sqrt{2K_R}, \sqrt{2(K_R + 1)x}\right), \tag{8}$$

where $Q_1(a, b)$ is the standard Marcum-Q function. We assume that $p$ is the transmission power of sensors. The instantaneous channel capacity is expressed as [36]

$$R_{i,j}(t) = B \log_2\left(1 + \frac{p|h_{i,j}(t)|^2}{\sum_{m=1, m \neq j}^{n} p|h_{i,m}(t)|^2 x_m(t) + \sigma^2}\right), \tag{9}$$

where $B$ is the channel bandwidth and $\sigma^2$ denotes the white Gaussian noise power at the receiver. The term $\sum_{m=1, m \neq j}^{n} p|h_{i,m}(t)|^2 x_m(t)$ in Equation (9) represents the co-channel interference caused by the transmissions of other sensors at time $t$. The binary variable $x_m(t)$ indicates whether sensor $p_m$ transmits information to UAVs at time $t$, which can be given by

$$x_m(t) = \begin{cases} 1 & p_m \text{ transmits information to UAVs at time } t \\ 0 & \text{otherwise} \end{cases} \tag{10}$$
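To make the fading model of Equations (7) and (8) concrete, the following Python sketch draws samples of the coefficient $g_{i,j}$ and evaluates the CDF of $|g_{i,j}|^2$. It is only an illustration under stated assumptions: the function names, the LoS phase of zero, and the value $K_R = 3$ in the usage note are hypothetical, and the Marcum-Q term is evaluated through the Poisson-mixture series of the noncentral chi-square distribution rather than through a library routine.

```python
import cmath
import math
import random


def rician_coefficient(k_r, rng, los_phase=0.0):
    """Draw one small-scale fading coefficient following Eq. (7)."""
    g_los = cmath.exp(1j * los_phase)  # deterministic LoS component, |g| = 1
    # Zero-mean unit-variance CSCG: real and imaginary parts have variance 1/2 each.
    sigma = math.sqrt(0.5)
    g_scatter = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
    return (math.sqrt(k_r / (k_r + 1.0)) * g_los
            + math.sqrt(1.0 / (k_r + 1.0)) * g_scatter)


def rician_power_cdf(x, k_r, terms=200):
    """Evaluate Eq. (8): F(x) = 1 - Q_1(sqrt(2 K_R), sqrt(2 (K_R + 1) x)).

    The value is computed as a Poisson(K_R)-weighted sum of Erlang CDFs,
    which equals the noncentral chi-square CDF with two degrees of freedom.
    """
    if x <= 0.0:
        return 0.0
    y = (k_r + 1.0) * x
    poisson_weight = math.exp(-k_r)  # e^{-K} K^k / k! for k = 0
    erlang_term = math.exp(-y)       # e^{-y} y^m / m! for m = 0
    erlang_tail = erlang_term        # sum over m <= k of e^{-y} y^m / m!
    total = 0.0
    for k in range(terms):
        total += poisson_weight * (1.0 - erlang_tail)
        poisson_weight *= k_r / (k + 1)
        erlang_term *= y / (k + 1)
        erlang_tail += erlang_term
    return total
```

For example, averaging `abs(rician_coefficient(3.0, rng)) ** 2` over many draws from a seeded `random.Random` should give a value close to 1, in line with the unit-power normalization, and for $K_R = 0$ the CDF reduces to the Rayleigh case $1 - e^{-x}$.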

The maximum communication radius projected on the ground is denoted by $r$. It depends on the maximum communication distance $d_{U-S}^{\max}$ and the UAV altitude $H$, and is determined as [15,40]

$$r = \sqrt{\left(d_{U-S}^{\max}\right)^2 - H^2}. \tag{11}$$

*Sensors* **2019**, *19*, 4032

We assume that the UAVs fly at a determined altitude $H$ when they collect information from the sensors. Therefore, $r$ mainly depends on the maximum communication distance. To successfully collect information from sensors, the received power at the UAV must be higher than or equal to the minimum decodable power $p_{U,\min}$ [40]. The maximum communication distance between the sensor and the UAV can be calculated as

$$d_{U-S}^{\max} = \sqrt{\frac{G_S^t \cdot G_U^r \cdot \lambda^2 \cdot p}{(4\pi)^2 \cdot p_{U,\min}}}, \tag{12}$$

where $G_S^t$ is the transmit antenna gain of the sensor and $G_U^r$ is the receive antenna gain of the UAV. $\lambda$ represents the wavelength of the signal transmitted by a sensor, which is determined by the signal frequency. Based on the maximum communication radius $r$, the communication condition can be expressed as

$$\left\| l_i(t) - c_j' \right\| \le r, \tag{13}$$

where $c_j' = (x_j, y_j)$ is the coordinate representation of sensor $p_j$ on the two-dimensional plane. It is worth noting that, if UAV $u_i$ needs to collect information from sensor $p_j$ at time $t$, the distance between them must satisfy Equation (13) to ensure reliable transmission.
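As a small worked illustration of Equations (11)–(13), the Python sketch below computes the maximum communication distance, projects it to the ground radius, and tests the communication condition. The function and parameter names are hypothetical and chosen for this example only; gains are linear (not in dB) and powers are in watts.

```python
import math


def max_comm_distance(tx_gain, rx_gain, wavelength_m, tx_power_w, min_rx_power_w):
    """Eq. (12): largest UAV-sensor distance at which the signal is still decodable."""
    return math.sqrt(tx_gain * rx_gain * wavelength_m ** 2 * tx_power_w
                     / ((4.0 * math.pi) ** 2 * min_rx_power_w))


def ground_radius(d_max_m, altitude_m):
    """Eq. (11): communication radius projected on the ground."""
    if d_max_m < altitude_m:
        raise ValueError("altitude exceeds the maximum communication distance")
    return math.sqrt(d_max_m ** 2 - altitude_m ** 2)


def can_communicate(uav_xy, sensor_xy, radius_m):
    """Eq. (13): the sensor is reachable if its ground distance to the UAV is at most r."""
    return math.dist(uav_xy, sensor_xy) <= radius_m
```

For instance, `ground_radius(5.0, 3.0)` returns `4.0`, so a UAV hovering at `(0, 0)` under these toy values can serve a sensor at `(3, 4)` only if the radius is at least 5.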

We define a binary variable $f_{i,j}(t)$ that indicates whether sensor $p_j$ is served by UAV $u_i$ at time $t$:

$$f_{i,j}(t) = \begin{cases} 1 & u_i \text{ collects information from } p_j \text{ at time } t \\ 0 & \text{otherwise} \end{cases} \tag{14}$$

These binary variables specify not only the communication scheduling across time, but also the association between UAVs and sensors. In this paper, each UAV collects information from sensors via TDMA, which means that the UAV communicates with at most one sensor at any time. Meanwhile, each sensor is served by only one UAV at any time instant, but can be served by different UAVs over different time slots. These constraints can be expressed as

$$\sum_{j=1}^{n} f_{i,j}(t) \le 1, \forall i, j, \tag{15}$$

$$\sum_{i=1}^{k} f_{i,j}(t) \le 1, \forall i, j, \tag{16}$$
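Constraints (15) and (16) can be checked mechanically for a single time instant. The following Python sketch is a minimal illustration; the function name and the nested-list representation of $f_{i,j}(t)$ (rows are UAVs, columns are sensors) are assumptions made for this example.

```python
def schedule_is_feasible(f):
    """Check constraints (15) and (16) for one time instant.

    f[i][j] is 1 if UAV i collects from sensor j at this instant, else 0.
    """
    if any(value not in (0, 1) for row in f for value in row):
        return False
    # (15): each UAV serves at most one sensor (row sums).
    uav_ok = all(sum(row) <= 1 for row in f)
    # (16): each sensor is served by at most one UAV (column sums).
    sensor_ok = all(sum(column) <= 1 for column in zip(*f))
    return uav_ok and sensor_ok
```

With two UAVs and three sensors, `[[1, 0, 0], [0, 0, 1]]` is feasible, whereas `[[1, 1, 0], [0, 0, 0]]` violates (15) and `[[1, 0, 0], [1, 0, 0]]` violates (16).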

Furthermore, the total amount of data $\bar{R}_j$ transmitted from sensor $p_j$ to UAV $u_i$ is a function of the UAV trajectory $l_i(t)$, which is expressed as

$$\bar{R}_j(l_i(t)) = \int_{0}^{T} B \log_2\left(1 + \frac{p\left|h_{i,j}(t)\right|^2}{\sum_{m=1, m \neq j}^{n} p\left|h_{i,m}(t)\right|^2 x_m(t) + \sigma^2}\right) f_{i,j}(t)\,dt. \tag{17}$$
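The integral in Equation (17) rarely has a closed form for a moving UAV, but it is straightforward to approximate numerically. The Python sketch below uses the trapezoidal rule under two simplifying assumptions that are ours, not the paper's: no co-channel interference (only the scheduled sensor transmits) and $|g_{i,j}|^2$ replaced by its mean value of 1. All function and parameter names are hypothetical.

```python
import math


def average_rate_bps(uav_xy, sensor_xy, altitude_m, bandwidth_hz, tx_power_w,
                     noise_power_w, beta0, alpha):
    """Eq. (9) without interference, using the average channel power gain only."""
    d = math.sqrt(altitude_m ** 2 + math.dist(uav_xy, sensor_xy) ** 2)  # Eq. (4)
    beta = beta0 * d ** (-alpha)                                         # Eq. (6)
    return bandwidth_hz * math.log2(1.0 + tx_power_w * beta / noise_power_w)


def collected_bits(trajectory, schedule, duration_s, steps, **link):
    """Trapezoidal approximation of Eq. (17).

    trajectory(t) returns the UAV ground position (x, y) at time t;
    schedule(t) returns f_{i,j}(t), i.e., 1 while the sensor is served.
    """
    dt = duration_s / steps
    previous = schedule(0.0) * average_rate_bps(trajectory(0.0), **link)
    total = 0.0
    for n in range(1, steps + 1):
        t = n * dt
        current = schedule(t) * average_rate_bps(trajectory(t), **link)
        total += 0.5 * (previous + current) * dt
        previous = current
    return total
```

Passing `trajectory=lambda t: (0.0, 0.0)` models hovering directly above a sensor at the origin, in which case the result is simply the constant rate multiplied by the duration; a trajectory that flies away from the sensor collects fewer bits in the same time.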

#### *3.2. Problem Formulation*

The throughput requirement corresponding to sensor *pj* is assumed to be *Cj* bits. By optimizing the trajectories of UAVs, our objective is to minimize mission completion time, while ensuring that the information of sensors is collected. The mission completion time refers to the time taken by all UAVs to complete respective tasks. The completion time of a UAV is mainly divided into two parts, flying time and hovering time. Therefore, the completion time of a single UAV can be represented as

$$T_i = T_F^i + T_H^i, \tag{18}$$

where $T_F^i$ represents the flying time and $T_H^i$ represents the hovering time.
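As a brief illustration of Equation (18) and of the min-max objective described in the Introduction, the following Python sketch computes the completion time of each UAV and the resulting mission completion time. It assumes, for simplicity, that each UAV flies its whole cycle at a constant speed; the function names and the numbers in the usage note are hypothetical.

```python
def uav_completion_time(flight_distance_m, speed_mps, hover_times_s):
    """Eq. (18): flying time plus total hovering time of one UAV."""
    flying_time = flight_distance_m / speed_mps
    hovering_time = sum(hover_times_s)
    return flying_time + hovering_time


def mission_completion_time(per_uav_times_s):
    """The mission finishes when the slowest UAV finishes."""
    return max(per_uav_times_s)
```

For example, a UAV that flies 1000 m at 20 m/s and hovers for 10 s and 5 s needs 65 s; if a second UAV needs 40 s, the mission completion time is 65 s.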

We denote $\mathbf{F} = \{f_{i,j}(t), \forall i, j, t\}$ and $\mathbf{L} = \{l_i(t), \forall i, t\}$. Assuming the sensors' locations are given, we aim to minimize the mission completion time by jointly optimizing the UAVs' trajectories (i.e., $\mathbf{L}$) and the communication scheduling and association (i.e., $\mathbf{F}$). Define $T(\mathbf{F}, \mathbf{L}) = \max_{1 \le i \le k} T_i$ as the mission completion time, which is a function of $\mathbf{F}$ and $\mathbf{L}$. The optimization problem can be formulated as

$$\min\_{\mathbf{F}, \mathbf{L}, \{T\_H\}, \{T\_F\}} \quad T \tag{19}$$

$$\text{s.t.}\quad l\_i(0) = l\_i(T), \forall i,\tag{20}$$

$$\bar{R}_j \ge C_j, \forall j, \tag{21}$$
$$\sum\_{i=1}^{k} f\_{i,j}(t) \le 1, \forall i, j, t \in [0, T], \tag{22}$$

$$\sum\_{j=1}^{n} f\_{i,j}(t) \le 1, \forall i, j, t \in [0, T], \tag{23}$$

$$\left\| \dot{l}_i(t) \right\| \le v_{\max}, \forall i, t \in [0, T], \tag{24}$$

$$\left\| l_i(t) - l_{i'}(t) \right\| \ge d_{\min}, \forall i \ne i', t \in [0, T], \tag{25}$$

$$0 \le \left\| l_i(t) - c_j' \right\| f_{i,j}(t) \le r, \forall i, j, t \in [0, T]. \tag{26}$$

This problem is challenging to solve for three main reasons. First, the problem requires optimizing the continuous-time functions **F** and **L**, which essentially involve an infinite number of optimization variables that are closely coupled with each other. Second, since the optimization variables **F** for UAV–sensor association and communication scheduling are binary, the formulated problem is non-convex. Third, it is difficult to optimize trajectories together with hovering and flying communication scheduling at the same time. To solve this problem, we propose an improved fly-hover-fly (FHF) trajectory planning algorithm, in which the UAVs successively hover at a finite number of hovering points, each for an optimized hovering duration. Different from traditional fly-hover-fly trajectory designs [25–30], UAVs can collect information from sensors not only while hovering, but also while flying. First, we propose the FHF algorithm to optimize the hovering points and obtain the trajectories in Section 4. Then, given the obtained trajectories, we further consider that the UAVs can also collect information when flying and optimize the time allocations in Section 5.

#### **4. FHF Trajectory Planning Algorithm**

As shown in Figure 2, we propose the FHF algorithm, which includes two steps. First, we propose a hovering point selection algorithm to select appropriate hovering points while considering the information collection quality. Red dots represent sensors and blue dots represent optimized hovering points. The number of hovering points is smaller than the number of ground sensors. Instead of stopping above each sensor node, we select appropriate hovering points where the UAVs can sequentially collect the information from multiple sensors. By hovering at different points, which are close to different subsets of sensors, the UAVs can obtain better wireless communication links and decrease the flying time compared with the policy of hovering above each sensor. The policy of hovering above each sensor is used as a comparison algorithm in our simulations.

Second, we use a min-max cycle cover algorithm to assign these hovering points and obtain the trajectory of each UAV. Based on the hovering points optimized in the first step, we use a Traveling Salesman Problem (TSP) algorithm to calculate a cycle covering all hovering points. Then, we use a novel k-cycles algorithm to obtain the trajectory of each UAV, which minimizes the maximum time of the cycles, including flying time and hovering time.

**Figure 2.** FHF trajectory planning algorithm.

#### *4.1. Hovering Point Selection*

For a given SNR threshold, it is generally unnecessary for UAVs to fly over all sensors. We model the hovering point selection problem as a BS coverage problem [15]. Specifically, given the sensors' locations and the UAV's communication radius, the objective of the hovering point selection problem is to obtain a minimum number of hovering points and their respective locations, while ensuring that every sensor is covered by at least one hovering point. As shown in Algorithm 1, we use the min-max location algorithm as a sub-algorithm to solve this problem. The major notations used in Algorithm 1 are provided in Table 2.

#### **Algorithm 1** Hovering point selection algorithm.

**Input:** *r*, sensor set *P* with known locations {*cj*}, *pj* ∈ *P*.

**Output:** Hovering point set *V*, with optimized locations {*hm*}, *vm* ∈ *V*.

1. Initialize *X* ← ∅, *Y* ← *P*, *V* ← ∅, *m* ← 1;
2. **while** |*X*| < *n* **do**
3. Calculate boundary sensor set *Ybo* ⊆ *Y*, update inner sensor set *Yin* ← *Y* \ *Ybo*;
4. Randomly select a sensor *b*<sup>0</sup> ∈ *Ybo*. Denote *ρ* ← {*b* | ‖*b*<sup>0</sup> − *b*‖ ≤ 2*r*, *b* ∈ *Ybo*};
5. Use the min-max location algorithm to find hovering point's location *hm* to cover *ρ*. Denote the maximum distance as *d*;
6. **while** *d* > *r* **do**
7. *ρ* ← *ρ* \ arg max<sub>*b*∈*ρ*</sub> ‖*b* − *hm*‖. Repeat step 5;
8. **end while**
9. *ρ* ← *ρ* ∪ {*b* | ‖*b* − *hm*‖ ≤ *r*, *b* ∈ *Yin*};
10. *θ* ← {*b* | *r* < ‖*b* − *hm*‖ ≤ 2*r*, *b* ∈ *Yin*};
11. **while** *θ* ≠ ∅ **do**
12. Find a sensor *b* ∈ *θ* which has the shortest distance to *hm*;
13. Remove (add) *b* from (to) *θ* (*ρ*) if *b* is covered by changing *hm* via using min-max location algorithm. Stop otherwise;
14. **end while**
15. *X* ← *X* ∪ *ρ*;
16. *Y* ← *Y* \ *ρ*;
17. *V* ← *V* ∪ {*vm*}, *m* ← *m* + 1.
18. **end while**


**Table 2.** Major notations in Algorithm 1.

Now, we describe the hovering point selection algorithm in detail. First, we define *Y* as the set of uncovered sensors, which is initialized to *P*; *X* as the set of covered sensors, which is initialized to the empty set; and *V* as the set of hovering points, which is also initialized to the empty set (Line 1). We use the convex hull to define boundary sensors, although other definitions of boundary sensors can also be used and produce similar results. To reduce the occurrence of outlier sensors, each of which requires a dedicated hovering point to cover it, we give higher priority to the boundary sensors so that they are guaranteed to be covered first. The boundary sensors are stored in *Ybo* and the inner sensors in *Yin* (Line 3). We randomly choose a boundary sensor *b*<sup>0</sup> in *Ybo*. The boundary sensors that are no more than 2*r* away from *b*<sup>0</sup> are put in *ρ* (Line 4).

At this point, finding the position of the hovering point can be converted into the min-max location problem. In the facility location branch of operations research, the min-max location problem is a classical combinatorial optimization problem. It can be described as follows: given a function to calculate the cost between demand points and a facility, a space of feasible locations of the facility, and a set of *n* demand points, find the facility's location which minimizes the maximum facility-demand point cost [41,42]. In the simple special case where the demand points and feasible locations are in the plane with Euclidean distance as cost, it is also known as the smallest circle problem [43]. Our goal is to minimize the maximum sensor-hovering point distance. Therefore, we use the min-max location algorithm to find the hovering point's location *hm* to cover all sensors in *ρ* [44] (Line 5). This algorithm minimizes the maximum distance from the sensors to *hm*. The farthest distance is defined as *d*.
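In the notation of Algorithm 1, the min-max location step can be restated compactly as follows. This restatement assumes that hovering points are chosen in the horizontal plane (fixed altitude), so that all distances are planar Euclidean distances:

$$h_m = \arg\min_{h \in \mathbb{R}^2} \max_{b \in \rho} \lVert b - h \rVert, \qquad d = \max_{b \in \rho} \lVert b - h_m \rVert.$$

Under this reading, the set *ρ* is fully covered by the *m*th hovering point exactly when *d* ≤ *r*.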

If the value of *d* exceeds *r*, *ρ* needs to be rebuilt until the farthest distance is no greater than *r*. In each iteration, the sensor that is farthest from *hm* is deleted (Lines 6–8). Then, *ρ* is updated by adding the inner sensors that are within distance *r* of *hm* (Line 9). This check reduces the number of executions of the min-max location algorithm in the next step. The sensors in *Yin* whose distance to *hm* is in the range (*r*, 2*r*] are put in *θ* (Line 10). We then find the sensor *b* ∈ *θ* that is closest to *hm*. If *b* can be covered by changing *hm* via the min-max location algorithm, *b* is removed from *θ* and added to *ρ*; otherwise, the loop stops. When *θ* is empty or a new *hm* is not found, the *m*th hovering point's location *hm* is determined (Lines 11–14). *X*, *Y* and *V* are updated based on the results (Lines 15–17). When all sensors are covered, the algorithm stops. In Figure 3, we show an example of Algorithm 1. There are seventy sensors in a square area of 5 km × 5 km and *r* equals 500 m. By utilizing Algorithm 1, we obtain twenty hovering points and their respective locations.
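The shrinking step (Lines 5–8 of Algorithm 1) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the min-max location routine of [44] is replaced here by a brute-force smallest-enclosing-circle search over pairs and triples of points, and the function names `min_max_location` and `place_hovering_point` are hypothetical. Sensors are assumed to be 2D coordinates in metres.

```python
import itertools
import math

def _circle_from_two(a, b):
    # Circle whose diameter is the segment a-b.
    center = ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)
    return center, math.dist(a, b) / 2.0

def _circle_from_three(a, b, c):
    # Circumcircle of a, b, c; returns None when the points are collinear.
    d = 2.0 * (a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]))
    if abs(d) < 1e-12:
        return None
    sa, sb, sc = (p[0] ** 2 + p[1] ** 2 for p in (a, b, c))
    ux = (sa * (b[1] - c[1]) + sb * (c[1] - a[1]) + sc * (a[1] - b[1])) / d
    uy = (sa * (c[0] - b[0]) + sb * (a[0] - c[0]) + sc * (b[0] - a[0])) / d
    return (ux, uy), math.dist((ux, uy), a)

def min_max_location(points):
    """Brute-force smallest enclosing circle: returns (center, radius)."""
    if len(points) == 1:
        return points[0], 0.0
    candidates = [_circle_from_two(a, b) for a, b in itertools.combinations(points, 2)]
    for triple in itertools.combinations(points, 3):
        circle = _circle_from_three(*triple)
        if circle is not None:
            candidates.append(circle)
    eps = 1e-9  # tolerance for points lying on the circle boundary
    feasible = [(ctr, rad) for ctr, rad in candidates
                if all(math.dist(ctr, p) <= rad + eps for p in points)]
    return min(feasible, key=lambda circle: circle[1])

def place_hovering_point(rho, r):
    """Drop the farthest sensor until the remaining ones fit in a circle of radius r."""
    rho = list(rho)
    center, d = min_max_location(rho)
    while d > r:
        farthest = max(rho, key=lambda b: math.dist(b, center))
        rho.remove(farthest)
        center, d = min_max_location(rho)
    return center, rho
```

Because at least two sensors always lie on the boundary of the smallest enclosing circle, the "farthest sensor" is never unique; this sketch simply removes the first one found, which is one of several reasonable tie-breaking choices.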

For the hovering point selection algorithm, the convex hull algorithm used to find the boundary sensors among *n* points has complexity *O*(*n* log *n*) [15,45]. In addition, each hovering point executes the min-max location algorithm up to *O*(*n*) times, and the complexity of the min-max location algorithm is *O*(*n*<sup>2</sup>) [44]. Since the number of hovering points is at most *O*(*n*), the overall complexity of the hovering point selection algorithm is upper-bounded by *O*(*n*[*n* log *n* + *n*<sup>3</sup>]). To sum up, the complexity of the hovering point selection algorithm is upper-bounded by *O*(*n*<sup>4</sup>).
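The bound can be unpacked term by term. Each hovering point costs one convex hull computation plus at most *O*(*n*) calls to the min-max location routine, and there are at most *O*(*n*) hovering points:

$$\underbrace{O(n)}_{\text{hovering points}} \cdot \Big[ \underbrace{O(n \log n)}_{\text{convex hull}} + \underbrace{O(n)}_{\text{calls}} \cdot \underbrace{O(n^2)}_{\text{min-max location}} \Big] = O(n^2 \log n + n^4) = O(n^4).$$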

#### *4.2. Min-Max Cycle Cover*

The multi-UAV system is modeled as a complete undirected graph *G* = (*V*, *E*), where the vertices in *V* represent hovering points of the UAVs and the edges in *E* represent flying paths of the UAVs. For each edge *e*(*vm*, *vm*+1) ∈ *E*, with *vm*, *vm*+1 ∈ *V*, an edge weight *w*(*vm*, *vm*+1) is given to represent the flying time. For each vertex *vm*, a vertex weight *h*(*vm*) is given to represent the hovering time needed to collect information. Since we use a multi-UAV system to collect information, the mission is complete only when all UAVs have completed their tasks. In other words, the mission completion time is the maximum completion time over the UAVs. The minimum mission completion time problem aims to cover the hovering points with multiple UAVs such that the maximum weight of the cycles, including flying time and hovering time, is minimized. To tackle this problem, we formulate it as a min-max cycle cover problem. The min-max cycle cover problem is an extension of the classical TSP. Since it reduces to the TSP, which is NP-hard, when *k* = 1, the min-max cycle cover problem is also NP-hard for any *k* ≥ 1 [46].

The heuristic algorithm we propose plans the trajectory of each UAV in two steps. First, the ant colony algorithm is used to obtain a cycle *C* covering all hovering points. As a global search algorithm, the ant colony algorithm can solve the TSP [47]. Then, we use a novel *k*-cycles algorithm to obtain the trajectory of each UAV [48]. This algorithm decomposes cycle *C* into *k* segments.
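The min-max objective can be made concrete with a short Python sketch. It evaluates the weight of each UAV's cycle (flying time plus hovering time) and takes the maximum. The function names, the dictionary of hovering times and the constant flying speed are assumptions made for illustration only.

```python
import math

def cycle_time(depot, stops, hover_time, speed):
    """Flying plus hovering time of one closed tour depot -> stops -> depot."""
    path = [depot] + list(stops) + [depot]
    # Edge weights: flying time along consecutive hovering points.
    flying = sum(math.dist(a, b) for a, b in zip(path, path[1:])) / speed
    # Vertex weights: hovering time at each visited hovering point.
    hovering = sum(hover_time[s] for s in stops)
    return flying + hovering

def mission_completion_time(depot, tours, hover_time, speed):
    """The mission ends when the slowest UAV returns: max over all cycles."""
    return max(cycle_time(depot, tour, hover_time, speed) for tour in tours)

# Hypothetical example: two UAVs, three hovering points, 50 m/s flying speed.
hover_time = {(300, 0): 10.0, (0, 400): 5.0, (300, 400): 5.0}
tours = [[(300, 0)], [(0, 400), (300, 400)]]
completion = mission_completion_time((0, 0), tours, hover_time, 50.0)
```

A min-max cycle cover heuristic searches over the assignment of hovering points to `tours` so that this maximum is as small as possible.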

The *k*-cycles algorithm for obtaining the *k* trajectories of the UAVs is shown in Figure 4. This algorithm takes the number of UAVs, cycle *C* and the complete undirected graph *G* = (*V*, *E*) as input. To decompose cycle *C* into *k* trajectories, we first compute the bound vector *Q* = (*Q*<sub>1</sub>, ..., *Q*<sub>*i*</sub>, ..., *Q*<sub>*k*</sub>). *Q*<sub>*i*</sub> is calculated on the basis of the total weight of cycle *C*: *Q*<sub>*i*</sub> = (*i*/*k*)*W*(*C*), for all 1 ≤ *i* ≤ *k*, where *W*(*C*) = *h*(*C*) + *w*(*C*). Here, *h*(*C*) and *w*(*C*) represent the weight of all vertices and the weight of all edges in cycle *C*, respectively. Second, along the cycle, we build a mixed set of edges and vertices containing 2|*V*| + 1 elements, denoted as *VE* = (*ve*<sub>0</sub> = *v*<sub>0</sub>, *ve*<sub>1</sub> = *e*(*v*<sub>0</sub>, *v*<sub>1</sub>), *ve*<sub>2</sub> = *v*<sub>1</sub>, *ve*<sub>3</sub> = *e*(*v*<sub>1</sub>, *v*<sub>2</sub>), ..., *ve*<sub>2|*V*|+1</sub> = *v*<sub>0</sub>). *wh*(*ve*<sub>0</sub>, ..., *ve*<sub>*m*</sub>) represents the total weight of the vertices and edges from *ve*<sub>0</sub> to *ve*<sub>*m*</sub>. In contrast to the approach in [49], the weight of each vertex, which represents the hovering time, is counted once in the *k*-cycles algorithm. By constructing a mixed set of vertices and edges, the weight of the vertices can be taken into account in the cycle splitting. Next, starting from *ve*<sub>0</sub>, we aim to find an element *ve*<sub>*m*(*i*)</sub> along cycle *C* such that the weight of the segment satisfies *wh*(*ve*<sub>0</sub>, *ve*<sub>1</sub>, ..., *ve*<sub>*m*(*i*)</sub>) ≤ *Q*<sub>*i*</sub>, for each *i*, 1 ≤ *i* ≤ *k*. There are two possibilities for the demarcation point *ve*<sub>*m*(*i*)</sub>.


Then, we obtain *k* segments of cycle *C* (i.e., {*C*<sub>1</sub>, ..., *C*<sub>*i*</sub>, ..., *C*<sub>*k*</sub>}). Finally, the starting location is added to each segment to build the closed trajectory of each UAV.

**Figure 4.** *k*-cycles algorithm.
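The splitting idea can be sketched in Python under simplifying assumptions. The sketch walks along the cycle, accumulates hovering and flying weights, and assigns each hovering point to the first segment whose bound *Q*<sub>*i*</sub> = (*i*/*k*)*W*(*C*) is not exceeded by the cumulative weight up to and including that point. This single rule stands in for the paper's case analysis of the demarcation point and is not the authors' exact procedure; `split_cycle` and its arguments are hypothetical names.

```python
def split_cycle(hover, fly, k):
    """Split a cycle into k consecutive segments with balanced weight.

    hover[m] is the hovering time at the m-th point in cycle order and
    fly[m] is the flying time from point m to the next point on the cycle.
    Returns the bound vector Q and the list of point indices per segment.
    """
    total = sum(hover) + sum(fly)                        # W(C) = h(C) + w(C)
    bounds = [i * total / k for i in range(1, k + 1)]    # Q_i = (i / k) * W(C)
    segments = [[] for _ in range(k)]
    prefix = 0.0   # cumulative weight of vertices and edges walked so far
    i = 0          # index of the segment currently being filled
    for m, h in enumerate(hover):
        prefix += h                                      # count the vertex weight once
        while i < k - 1 and prefix > bounds[i] + 1e-12:
            i += 1                                       # bound exceeded: start next segment
        segments[i].append(m)
        prefix += fly[m]                                 # then the edge leaving the vertex
    return bounds, segments

# Hypothetical example: four hovering points, unit weights, two UAVs.
bounds, segments = split_cycle([1, 1, 1, 1], [1, 1, 1, 1], 2)
```

Each resulting segment would then be closed with the starting location to form one UAV's trajectory.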

The complexity of the ant colony algorithm is related to the number of ants, the number of hovering points and the number of iterations. The number of hovering points cannot exceed *n*. Thus, the complexity of the ant colony algorithm is upper-bounded by *O*(*iter* · *A* · *n*<sup>2</sup>), where *iter* is the number of iterations and *A* is the number of ants. Furthermore, the complexity of the *k*-cycles algorithm depends on the number of sensors; it is a polynomial-time algorithm with complexity *O*(*n*). In addition, the complexity of the hovering point selection algorithm is upper-bounded by *O*(*n*<sup>4</sup>). To sum up, the complexity of the hover-collection UAV trajectory planning algorithm is upper-bounded by *O*(*n*<sup>2</sup>[*n*<sup>2</sup> + *iter* · *A*]).

#### **5. Time Allocation**

With the obtained UAV trajectories, we further consider that the UAVs can also collect information when flying. We propose an effective method to optimize the UAVs' fly-collection and hovering time allocations.

We define *T̃*<sub>*H*,*m*</sub> ≥ 0 as the newly allocated time for the UAVs to collect information when hovering at location *vm*. Since the UAVs can collect information when flying, we obtain the constraint *T̃*<sub>*H*,*m*</sub> ≤ *T*<sub>*H*,*m*</sub>, ∀*m*. The time allocation problem is challenging to solve for three main reasons. First, it requires optimizing the continuous functions **F**, which essentially involve an infinite number of optimization variables. Second, the mission completion time is unknown, so the problem cannot be solved by the widely used time discretization method. Third, the UAV–sensor association and communication scheduling variables are binary, so the formulated problem is non-convex. To tackle this problem, we use a different discretization method, called path discretization [50,51].

With the UAVs' trajectories *UT* and hovering points *V* given, the trajectory *UT*<sub>*i*</sub> of UAV *u*<sub>*i*</sub> can be discretized into *Z*<sup>*i*</sup> line segments, which are represented by the *Z*<sup>*i*</sup> sampling points {*q*<sub>*z*<sup>*i*</sup></sub>}. The length of the *z*<sup>*i*</sup>th line segment can be expressed as *μ*<sub>*z*<sup>*i*</sup></sub> = ‖*q*<sub>*z*<sup>*i*</sup>+1</sub> − *q*<sub>*z*<sup>*i*</sup></sub>‖. Note that *Z*<sup>*i*</sup> needs to be chosen sufficiently large so that *μ*<sub>*z*<sup>*i*</sup></sub> ≤ *μ*<sup>*i*</sup><sub>max</sub>, where *μ*<sup>*i*</sup><sub>max</sub> is an appropriately chosen value such that the distance between each sensor and UAV *u*<sub>*i*</sub> is approximately unchanged within each line segment. Furthermore, we denote the time for flying along the *z*<sup>*i*</sup>th line segment as *π*<sub>*z*<sup>*i*</sup></sub>, and define *λ*<sub>*z*<sup>*i*</sup></sub> as the information collection time allocated when *u*<sub>*i*</sub> flies along the *z*<sup>*i*</sup>th line segment, where *λ*<sub>*z*<sup>*i*</sup></sub> ≤ *π*<sub>*z*<sup>*i*</sup></sub>. When information collection occurs in the mobile state, the UAV sends an activation signal to a sensor that satisfies the SNR conditions. Then, the UAV collects information from the activated sensor.

The formulated problem reduces to optimizing the communication scheduling **F**, the hovering time *T̃*<sub>*H*</sub> and the fly-collection time {*λ*<sub>*z*<sup>*i*</sup></sub>}. Therefore, we obtain
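Path discretization can be illustrated with a minimal Python sketch, assuming a piecewise-linear trajectory given as a list of 2D waypoints and a constant flying speed. Each leg is cut into the smallest number of equal pieces whose length does not exceed *μ*<sub>max</sub>. The function names and the uniform subdivision rule are assumptions for illustration, not the procedure of [50,51].

```python
import math

def discretize_path(waypoints, mu_max):
    """Return sampling points so that consecutive points are at most mu_max apart."""
    points = [tuple(waypoints[0])]
    for a, b in zip(waypoints, waypoints[1:]):
        length = math.dist(a, b)
        pieces = max(1, math.ceil(length / mu_max))   # pieces needed on this leg
        for s in range(1, pieces + 1):
            t = s / pieces                            # fraction of the leg covered
            points.append((a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])))
    return points

def segment_lengths(points):
    """Lengths mu_z of the line segments between consecutive sampling points."""
    return [math.dist(p, q) for p, q in zip(points, points[1:])]

def flying_times(points, speed):
    """Flying times pi_z on each segment, assuming a constant speed."""
    return [length / speed for length in segment_lengths(points)]

# Hypothetical example: an L-shaped path sampled with mu_max = 3 m.
samples = discretize_path([(0, 0), (10, 0), (10, 5)], 3.0)
```

The fly-collection time allocated on each segment is then bounded above by the corresponding entry of `flying_times`.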

$$\min_{\mathbf{F}, \{\tilde{T}_{H,m}\}, \{\lambda_{z^i}\}} \quad T \tag{27}$$

$$\text{s.t.} \quad \tilde{R}_j \ge C_j, \forall j, \tag{28}$$

$$\sum_{i=1}^{k} f_{i,j}(t) \le 1, \forall i, j, t \in [0, T], \tag{29}$$

$$\sum_{j=1}^{n} f_{i,j}(t) \le 1, \forall i, j, t \in [0, T], \tag{30}$$

$$0 \le \left\| l_i(t) - c_j' \right\| f_{i,j}(t) \le r, \forall i, j, t \in [0, T], \tag{31}$$

$$\tilde{T}_{H,m} \le T_{H,m}, \forall m, \tag{32}$$

$$\lambda_{z^i} \le \pi_{z^i}, \forall z, i. \tag{33}$$

With the given trajectories, since the velocity of the UAVs remains unchanged, the new hovering time cannot exceed its initial value. The newly formulated problem is an integer programming problem, which can be solved efficiently using existing software toolboxes such as CVX. Therefore, under the given trajectories, we can obtain the fly-collection and hovering time allocations. The joint optimization of multi-UAV trajectories, communication scheduling and time allocation is left for future work.

#### **6. Simulation Results**

In this section, we present the simulation results to validate the proposed FLY algorithm, which mainly includes the FHF algorithm and time allocations.

#### *6.1. Simulation Setup*

We assumed that the size of the monitoring area was 5 km × 5 km. The total bandwidth was 1 MHz, and the average channel power gain at 1 m was −50 dBm [52]. The maximum velocity of a UAV was set to 50 m/s. Furthermore, the minimum distance between UAVs was set to 100 m [36] and UAVs flew at a constant altitude of 50 m. We assumed that the transmission power equaled 10 dBm [16]. The white Gaussian noise power was equal to −110 dBm [16]. Following the authors of [15,40], the communication radius was 500 m, which was calculated by the altitude and maximum communication radius. As stated in [36], the Rician channel model can be utilized for estimating the link performance for UAV-to-ground links. Simulations were conducted using the parameters specified in Table 3.
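To give a feel for these values, the following Python sketch computes a rough link budget at the edge of the communication radius and directly above a sensor. It rests on several assumptions that are not stated in the text: a distance-based path loss with exponent 2, no Rician fading term, the −50 figure read as the channel gain in dB at the 1 m reference distance, and the 500 m radius read as a horizontal distance. The function name `snr_db` is hypothetical.

```python
import math

def snr_db(tx_power_dbm, gain_at_1m_db, noise_dbm,
           horizontal_m, altitude_m, path_loss_exp=2.0):
    """Average SNR in dB for a UAV at a given horizontal offset and altitude."""
    distance = math.hypot(horizontal_m, altitude_m)          # UAV-sensor distance
    path_gain_db = gain_at_1m_db - 10.0 * path_loss_exp * math.log10(distance)
    return tx_power_dbm + path_gain_db - noise_dbm

# Values from the simulation setup: 10 dBm, -50 at 1 m, -110 dBm noise, 50 m altitude.
snr_edge = snr_db(10.0, -50.0, -110.0, horizontal_m=500.0, altitude_m=50.0)
snr_overhead = snr_db(10.0, -50.0, -110.0, horizontal_m=0.0, altitude_m=50.0)
```

Under these assumptions, the gap between `snr_overhead` and `snr_edge` indicates how much link quality is traded for covering several sensors from one hovering point.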


**Table 3.** Simulation parameters.

#### *6.2. Baseline Setup*

To demonstrate the performance of the proposed information-collection UAV trajectory planning algorithm (FLY), we implemented the following six benchmark schemes and compared them with it:


#### *6.3. Different Number of Sensors*

In this simulation, the number of UAVs was fixed at three, the number of sensors was varied, and the seven algorithms were compared. The trend of the mission completion time as the number of sensors changes is shown in Figure 5. The algorithms we proposed significantly outperformed the four other algorithms, and FHF and FLY were the closest to the optimal solution. The FLY algorithm achieved almost the same performance as the optimal scheme when *n* < 20, with a difference in mission completion time of less than 6%. When the number of sensors was greater than 20, the FLY algorithm was also closer to the optimal scheme than the five other algorithms. For example, when *n* = 120, the mission completion time of the FLY algorithm was 119% of that of the optimal solution, which was the biggest gap between the FLY algorithm and the optimal scheme in this simulation. It was also observed that the completion time of a mission in which information is collected only in the static case could not be smaller than that of a mission in which it is collected in both the static and mobile cases. The mission completion time of the FLY algorithm was 7.0–14.1% lower than that of the FHF algorithm. This was expected: since the UAVs used the flying phase to collect information, the hovering time was reduced.

Meanwhile, we found that our sub-algorithm (FHF) performed better than the other compared algorithms. The mission completion time of the FHF algorithm was 3.5–46.9% lower than that of the four other algorithms. When the number of sensors was 10, it was 3.5% lower than that of the KMEAN algorithm. When the number of sensors was 120, it was 46.9% lower than that of the KTSP algorithm. By hovering at a point where it could communicate with multiple sensors, the UAV could decrease the flying time compared with the SHP, PB and KTSP algorithms. In particular, when the number of sensors increased, the UAV could communicate with more sensors at one hovering point. Therefore, our algorithms performed better when the number of sensors was large. Moreover, the SHP and PB algorithms use the same TSP algorithm, so their performance was relatively close. The KMEAN algorithm selected hovering points without considering the information collection quality of the UAVs. As a result, retransmissions occurred during transmission, which increased the information collection time, and the KMEAN algorithm did not obtain the intended benefit of selecting hovering points. Therefore, the performance of the KMEAN algorithm was close to that of the SHP and PB algorithms.

**Figure 5.** Different number of sensors.

#### *6.4. Different Number of UAVs*

With the number of sensors set to 100, we compared the seven algorithms under different numbers of UAVs. The trend of the mission completion time as the number of UAVs changes is shown in Figure 6. Increasing the number of UAVs sped up monitoring, so the mission could be completed in less time. However, increasing the number of UAVs also brought some overhead and economic pressure. As shown in Figure 6, from one UAV to four UAVs, the mission completion time dropped significantly. When the number of UAVs exceeded five, the mission completion time tended to flatten. More UAVs could share the monitoring mission and thus reduce the mission completion time. However, some sensors were far from the initial location, and a UAV might take considerable time going back and forth. Therefore, the maximum completion time of the UAVs had a lower bound and could not keep falling as the number of UAVs increased. Compared with the four other algorithms, our algorithms were closer to the optimal scheme. As shown in Figure 6, the gap between our algorithms and the optimal solution decreased as the number of UAVs increased. When *k* = 1, the mission completion time of the FHF algorithm was 86.7% higher than that of the optimal scheme and that of the FLY algorithm was 52.4% higher. When *k* = 10, the FHF algorithm was 12% higher than the optimal scheme and the FLY algorithm was 3.1% higher. As the number of UAVs changed, the mission completion time of the FLY algorithm was 8.0–18.1% lower than that of the FHF algorithm. These results demonstrate that collecting information while both flying and hovering is better than collecting it only while hovering.

**Figure 6.** Different number of UAVs.

#### *6.5. Different Communication Radius*

With the number of UAVs set to 3 and the number of sensors set to 120, we compared the seven algorithms under different communication radii. Figure 7 shows the trend of the mission completion time as the communication radius changes. Across the communication radii, our algorithms were closer to the optimal scheme than the four other algorithms, and their advantage became more evident as the communication radius increased. When the communication radius was 100 m, the mission completion time of the FLY algorithm was 145% of that of the optimal algorithm, and that of the FHF algorithm was 169%. When the communication radius was 800 m, the FLY algorithm was at 116% of the optimal algorithm, and the FHF algorithm at 130%. As shown in Figure 7, the mission completion time of our algorithms and the KMEAN algorithm gradually decreased as the communication radius increased, whereas the other algorithms changed relatively little. The reason is that the communication radius mainly affects the hovering point selection, which is an important difference between these three algorithms and the other three. Since we considered the information collection quality in the hovering point selection algorithm, our algorithms showed a larger performance improvement than the KMEAN algorithm. Of course, in practical applications, the communication range of the UAV is related to the altitude and the communication equipment of the UAV.

**Figure 7.** Different communication radius.

#### *6.6. Different Monitoring Area Size*

With the number of UAVs set to 3 and the number of sensors set to 100, we compared the seven algorithms under different monitoring area sizes. Figure 8 shows the trend of the mission completion time as the monitoring area size changes. The FHF and FLY algorithms were closer to the optimal solution than the four other algorithms. The gap between the FHF algorithm and the optimal solution increased from 21.6% to 67.7% as the monitoring area expanded, and the gap between the FLY algorithm and the optimal solution increased from 10.3% to 41.3%. As the monitoring area became larger, the advantage of the FLY algorithm was more evident: its mission completion time was 6–15.6% lower than that of the FHF algorithm. This was expected: since a larger area leads to a longer flying time, the UAVs had more time to collect information while flying. Furthermore, the mission completion time of the FHF algorithm was 9.7–40.2% lower than that of the four other algorithms. As the monitoring area size increased, the flying scope of the UAVs became larger and the flying time became longer, resulting in a longer mission completion time. As shown in Figure 8, the FLY algorithm performed the best under different monitoring area sizes. Since it fully considers the information collection quality and the positions of the sensors, the algorithm can be applied to different monitoring area sizes. Based on the simulation results in the previous subsection, if the communication range of the UAVs is appropriately adjusted, the mission completion time will be shorter.

**Figure 8.** Different monitoring area size.

#### **7. Discussion**

#### *7.1. Trade-Offs in Communication and Trajectory Design*

There are some new and interesting trade-offs among energy, delay and throughput in UAV communication and trajectory design, which differ from those in traditional terrestrial communication [55]. In this section, we mainly discuss the trade-offs that arise when UAVs are used as mobile information collectors. First, there is a trade-off between delay and throughput in UAV-enabled wireless networks. To maximize throughput, UAVs fly sufficiently close to the sensors to improve the link capacity. Although this improves the throughput, it also introduces delay due to the movement of the UAV. Second, there is a trade-off between energy consumption and throughput. To gain higher throughput, UAVs generally fly close to the sensors, which consumes more energy for movement. Third, the above two trade-offs naturally imply an energy-delay trade-off. To reduce the delay in UAV–sensor communication, UAVs need to move faster toward the sensors, which consumes more energy. In fact, by planning the trajectory of the UAV, energy, delay and throughput can be traded off against each other. Finally, there is a trade-off between sensors and UAVs [9]. The sensors have limited battery capacity and low power. To prolong the lifetime of the sensors, UAVs can move close to them and collect their information with minimum transmit power. However, the UAVs will then consume more energy to move. In this paper, we mainly focus on minimizing the mission completion time on the premise of meeting the throughput requirements. In the future, we will consider these trade-offs in trajectory design.

#### *7.2. Multiple Access Schemes*

In this paper, we consider that the UAVs collect information from sensors via time-division multiple access (TDMA), which means that only one sensor is scheduled for information collection at a time [36,37,39,55]. Some works use other multiple access schemes [25,52]. For example, the authors of [52] considered a scenario in which a UAV is dispatched as a mobile BS to serve ground users via orthogonal frequency division multiple access (OFDMA). Besides orthogonal multiple access schemes such as TDMA and OFDMA, non-orthogonal multiple access schemes have also been considered in UAV-enabled wireless networks. To further improve the capacity limits, Wu et al. found that non-orthogonal multiple access schemes based on superposition coding (SC) can be jointly designed with the UAV trajectory [25]. The authors considered a two-user broadcast channel and, to achieve the capacity region, proposed a practical fly-hover-fly trajectory with SC. However, they only considered two users. In the future, for next-generation wireless networks with massive connectivity, we will study efficient trajectory algorithms designed jointly with other multiple access schemes.

#### *7.3. The Case of a Large Number of Sensors*

In practical applications, there may be many sensors deployed in the monitoring area. The complexity of the hovering point selection problem becomes extremely high when the number of sensors *n* is very large, which limits the application scope of the proposed algorithm. To expand its application scope, we propose an efficient method. When the number of sensors is large, the monitoring area can be divided into subareas. In other words, the ground sensors can be partitioned into *M* disjoint sets, *P*<sub>1</sub>, *P*<sub>2</sub>, ..., *P*<sub>*M*</sub>, with each set corresponding to the sensors in its subarea. Then, the hovering point selection algorithm can be run on the sensors in each subarea. By dividing the monitoring area, the complexity of the trajectory planning problem can be reduced, and the application scope of the proposed algorithm can be broadened.
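One simple way to realize such a partition is a uniform grid over the square monitoring area, sketched below in Python. The grid rule, the function name `partition_into_subareas` and its parameters are assumptions for illustration; the text does not prescribe how the subareas are formed.

```python
def partition_into_subareas(sensors, area_size, cells_per_side):
    """Partition sensors into disjoint sets, one per non-empty grid cell.

    sensors is a list of (x, y) positions inside [0, area_size] x [0, area_size].
    Returns a dict mapping (row, col) to the list of sensors in that cell.
    """
    cell = area_size / cells_per_side
    subareas = {}
    for x, y in sensors:
        # Sensors on the far boundary are clamped into the last cell.
        col = min(int(x // cell), cells_per_side - 1)
        row = min(int(y // cell), cells_per_side - 1)
        subareas.setdefault((row, col), []).append((x, y))
    return subareas

# Hypothetical example: a 5 km x 5 km area split into 2 x 2 subareas.
sensors = [(100, 100), (4900, 100), (2500, 2500), (5000, 5000)]
subareas = partition_into_subareas(sensors, 5000.0, 2)
```

The hovering point selection algorithm would then be run independently on each value of `subareas`, so the quartic cost applies to the subarea sizes rather than to *n*.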

#### **8. Conclusions**

In this paper, we aim to minimize the mission completion time of a multi-UAV system by optimizing the trajectories of the UAVs. We formulate this problem as a mixed-integer non-convex problem. It is difficult to optimize the trajectories together with the hovering and flying communication scheduling at the same time. To tackle the formulated problem, we propose an improved fly-hover-fly trajectory planning algorithm, which includes two steps. First, we propose the FHF algorithm to optimize the hovering points and obtain the trajectories. Second, with the obtained UAV trajectories, we further consider that the UAVs can also collect information when flying, and we propose an effective method to optimize the UAVs' fly-collection and hovering time allocations. The simulation results show that our algorithm achieves the shortest mission completion time among the compared algorithms.

Since we add some constraints to the model, there are some limitations in practical applications. For example, we assume that the UAVs fly at a fixed altitude. In fact, the UAV's altitude directly affects the Rician factor and the information transmission rate, so it is also worthwhile to optimize the altitude. In the future, we will exploit the vertical trajectory of the UAV and present a new design framework for 3D UAV trajectories to further improve the performance of the multi-UAV information collection system. In addition, for next-generation wireless networks with massive connectivity, we will study efficient trajectory algorithms designed jointly with other multiple access schemes. In our future work, other practical constraints on the UAV's trajectory will also be considered, such as the maximum acceleration, the maximum instantaneous output power of the engine and the maximum turning angle.

**Author Contributions:** Conceptualization, Z.Q. and C.D.; methodology, H.D. and Z.Q.; software, A.L. and Z.X.; validation, Z.Q. and C.D.; formal analysis, A.L.; investigation, A.L. and Z.X.; writing—original draft preparation, Z.Q.; and writing—review and editing, C.D., H.D. and A.L.

**Funding:** This work was supported in part by the National Natural Science Foundation of China under Grants (No. 61931011, No. 61872178, No. 61827801, No. 61702545, No. 61702525 and No. 61631020), in part by the Fundamental Research Funds for the Central Universities No. 021014380079 and in part by the Natural Science Foundation of Jiangsu Province under Grant No. BK20181251.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## **Aerial Cooperative Jamming for Cellular-Enabled UAV Secure Communication Network: Joint Trajectory and Power Control Design**

#### **Hanming Sun <sup>1</sup>, Bin Duo <sup>2,</sup>\*, Zhengqiang Wang <sup>3</sup>, Xiaochen Lin <sup>4</sup> and Changchun Gao <sup>1</sup>**


Received: 22 September 2019; Accepted: 11 October 2019; Published: 14 October 2019

**Abstract:** To improve the secrecy performance of cellular-enabled unmanned aerial vehicle (UAV) communication networks, this paper proposes an aerial cooperative jamming scheme and studies its optimal design to achieve the maximum average secrecy rate. Specifically, a base station (BS) transmits confidential messages to a UAV while another UAV acts as an aerial jammer by cooperatively sending jamming signals to counter multiple suspicious eavesdroppers on the ground. As the UAVs have the advantage of controllable mobility, the objective is to maximize the worst-case average secrecy rate by jointly optimizing the two UAVs' trajectories and the BS's/UAV jammer's transmit/jamming power over a given mission period. The objective function of the formulated problem is highly non-linear in the optimization variables and the problem has non-convex constraints, so it is, in general, difficult to obtain a globally optimal solution. Thus, we divide the original problem into four subproblems and then solve them by applying the successive convex approximation (SCA) and block coordinate descent (BCD) methods. Numerical results demonstrate that the proposed algorithm achieves significantly better secrecy performance than benchmark schemes.

**Keywords:** UAV secure communication; secrecy rate maximization; jamming; trajectory design; power control

#### **1. Introduction**

Due to the many advantages of controllable mobility, such as on-demand fast deployment, wide coverage, low cost, and line-of-sight (LoS) transmission that offers good channel capacity, unmanned aerial vehicles (UAVs) have been extensively utilized in different scenarios, e.g., surveillance and monitoring [1–3], search and rescue [4,5], cargo transportation [6], data collection [7] and mobile relays [8].

Recently, UAVs have attracted increasing attention in wireless communications, and are anticipated to play an important role in next-generation wireless networks [9,10]. Generally, there are two promising solutions for UAV communication applications: cellular-enabled UAV communication (CEUC) and UAV-assisted terrestrial communication (UATC) networks [11]. In UATC, the UAVs are flexibly deployed as aerial base stations (BSs) or mobile relays to assist in providing reliable communication services for terrestrial networks [12–14]. By contrast, in the CEUC system, the UAVs are integrated into the wireless network as aerial users served by ground BSs [15]. Owing to the almost ubiquitous accessibility of the existing LTE (Long Term Evolution) and the forthcoming (beyond) fifth-generation ((B)5G) cellular networks, reliable communications can be supported between UAVs and their corresponding BSs [16,17]. The CEUC is anticipated to have a number of appealing advantages over the existing ground-to-UAV communications, including ease of monitoring and management, ubiquitous accessibility, robust navigation and enhanced performance [11]. Despite its merits, UAV communication based on the future (B)5G cellular networks is more susceptible to suspicious eavesdropping on the ground, which poses a severe security challenge that urgently needs to be solved.

Currently, UAV trajectory design combined with physical-layer security techniques has drawn significant attention as a promising way to safeguard UAV communication. Specifically, UAV secure communications are studied in [18,19], where the average secrecy rate is significantly improved by optimizing the trajectory of the UAV jointly with the power control over a finite mission duration. As cooperative jamming is one of the important physical-layer techniques that can enhance the secrecy performance, reference [20] proposed to employ a UAV as a friendly mobile jammer to ensure the secrecy of the ground wiretap channel. In [21], a novel full-duplex operation was applied to a rotary-wing UAV to further improve the energy efficiency (EE) of UAV secrecy communications, and the EE was maximized by jointly optimizing the source transmit power, the UAV's jamming power, and the UAV trajectory. A four-node mobile relay and eavesdropper system was proposed in [22], where the UAV was employed as a mobile relay to assist terrestrial communications. To cope with the non-convex secrecy rate maximization problem, an alternating optimization algorithm was designed that optimizes the power control and the UAV trajectory alternately. The authors in [23,24] proposed a dual-UAV UATC network to enhance the communication quality and improve the secrecy performance, where the downlink transmission from the UAVs is established by adaptively adjusting the UAVs' trajectories and transmit powers. Note that most of the above studies focus only on the security issues in UATC systems. However, how to design efficient anti-eavesdropping methods that protect the legitimate BS-to-UAV transmission in CEUC networks has not been investigated, and thus remains a challenging problem to address.

In light of the above, we propose an anti-eavesdropping scheme that employs an aerial UAV jammer in the CEUC network, where one UAV flies to receive confidential messages from a BS while the mobile UAV jammer confuses multiple suspicious eavesdroppers on the ground by sending jamming signals. Specifically, we jointly optimize both UAVs' trajectories, the BS's transmit power, and the UAV jammer's jamming power, in order to maximize the average worst-case secrecy rate of the UAV receiver over a given finite period. In the proposed scheme, the UAVs are subject to practical mobility constraints, and the powers are subject to both average and peak power constraints. In contrast with the above-mentioned existing works, the UAVs' trajectory design in our proposed CEUC network is particularly important, as the interference from other UAVs cannot be practically cancelled, which leads to a different objective function and different constraints. Therefore, well-designed trajectories of the UAVs can not only avoid severe interference between the UAVs, but also deliver effective jamming signals to the eavesdroppers, which is expected to notably enhance the secrecy performance. As both the objective function and the constraints of the formulated optimization problem are non-convex, it is very hard to obtain a globally optimal solution. (Since the original problem is NP-hard, it is generally impossible to obtain the globally optimal solution by using the present optimization techniques.) To tackle this challenging problem, we first transform it into a more tractable lower-bound expression. Then, an efficient algorithm is designed by applying the block coordinate descent (BCD) method [25,26]. To be specific, we partition the optimization variables into four blocks: the two UAVs' trajectories, the BS's transmit power, and the UAV jammer's jamming power. Each block is then optimized in turn in each iteration with the other blocks fixed. Even with the other three blocks fixed, the corresponding optimization problem remains intractable because of its non-convexity. To obtain a high-quality approximate solution, we thus introduce a series of slack variables and apply the successive convex approximation (SCA) technique [27,28]. The proposed algorithm has manageable complexity and is guaranteed to converge to a locally optimal solution to this problem. To the best of our knowledge, this is the first work that exploits anti-eavesdropping UAV trajectory design to solve physical-layer security issues of the CEUC system. The numerical results illustrate that the designed algorithm achieves significantly better secrecy performance than all benchmarks without trajectory or power control design, and especially than the scheme without the UAV jammer, as in [18].

The rest of this paper is organized as follows. Section 2 gives the system model and problem formulation. In Section 3, a joint optimization algorithm is proposed and its complexity and convergence performance are also analyzed. The simulations are presented in Section 4 to verify the effectiveness of the proposed algorithm. Finally, Section 5 concludes the paper.

#### **2. System Model and Problem Formulation**

#### *2.1. System Model*

Consider a CEUC network, as shown in Figure 1, where a ground BS transmits confidential messages to a mobile UAV receiver (denoted by U) within a given UAV flight period *T*, while *I* malicious eavesdroppers on the ground, denoted by E<sub>*i*</sub> for *i* ∈ ℐ ≜ {1, ···, *I*}, attempt to intercept the messages of the legitimate UAV communication. To safeguard the legitimate transmission, the potential eavesdroppers are kept under surveillance by an aerial UAV jammer (denoted by J). The aim of the UAV J is to cooperatively send jamming signals to the eavesdroppers to resist their wiretapping. Notice that if there is no friendly UAV J and only one eavesdropper is considered, the proposed scenario reduces to the ground-to-UAV transmission in [18].

**Figure 1.** Cellular-enabled UAV secure communication network with aerial cooperative jamming.

Based on the three-dimensional Cartesian coordinate system, we denote **w**<sub>B</sub> = [*x*<sub>B</sub>, *y*<sub>B</sub>]<sup>*T*</sup> and **w**<sub>E<sub>*i*</sub></sub> = [*x*<sub>E<sub>*i*</sub></sub>, *y*<sub>E<sub>*i*</sub></sub>]<sup>*T*</sup> as the horizontal coordinates of the BS and E<sub>*i*</sub>, respectively, which are assumed to be fixed and known beforehand to the UAVs. The assumption that **w**<sub>E<sub>*i*</sub></sub> is known in the network is reasonable when E<sub>*i*</sub> is an active ground node but untrusted by the UAV [29]. In that case, E<sub>*i*</sub> can be detected by the synthetic aperture radar or optical camera mounted on the UAV [18]. The initial and final locations of the UAVs are assumed to be pre-specified, and are denoted by **q**<sub>*k*,0</sub> = [*x*<sub>*k*,0</sub>, *y*<sub>*k*,0</sub>]<sup>*T*</sup> and **q**<sub>*k*,F</sub> = [*x*<sub>*k*,F</sub>, *y*<sub>*k*,F</sub>]<sup>*T*</sup> for *k* ∈ {U, J}, respectively. To make the problem more manageable, the period *T* is partitioned into *N* equal-length time slots, i.e., *T* = *δ*<sub>*t*</sub>*N*, where *δ*<sub>*t*</sub> is the length of one time slot. As such, the UAV trajectory in time slot *n* ∈ *N* can be represented approximately by **q**<sub>*k*</sub>[*n*] ≜ [*x*<sub>*k*</sub>[*n*], *y*<sub>*k*</sub>[*n*]]<sup>*T*</sup> for *k* ∈ {U, J}, with a fixed altitude *H*. Let *Ω* = *V*<sub>max</sub>*δ*<sub>*t*</sub> be the maximum horizontal distance that the UAV can travel in a single time slot, where *V*<sub>max</sub> is the maximum speed of the UAV. In practice, the UAVs should satisfy the following mobility constraints,

$$||\mathbf{q}\_k[n+1] - \mathbf{q}\_k[n]||^2 \le \Omega^2, n = 0, \dots, N-1,\tag{1}$$

$$\mathbf{q}\_k[0] = \mathbf{q}\_{k,0}, \quad \mathbf{q}\_k[N] = \mathbf{q}\_{k,{\rm F}}, \tag{2}$$

$$||\mathbf{q}\_{\rm U}[n] - \mathbf{q}\_{\rm J}[n]||^2 \ge d\_{\min}^2, \quad n = 0, \dots, N, \tag{3}$$

where *d*min is the minimum tolerable distance between the two UAVs that ensures the avoidance of a collision.
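The mobility constraints (1)–(3) are simple to check for a candidate pair of trajectories. The following is a minimal sketch, assuming each trajectory is stored as a list of (x, y) waypoints for slots 0, ..., *N*; the function name `satisfies_mobility`, its argument names, and the numerical tolerance are illustrative choices and not part of the original formulation.

```python
def satisfies_mobility(q_u, q_j, omega, d_min, ends_u=None, ends_j=None, tol=1e-9):
    """Check constraints (1)-(3) for two horizontal trajectories.

    q_u, q_j : lists of (x, y) waypoints for slots n = 0, ..., N (assumed layout)
    omega    : maximum horizontal distance per slot, i.e. V_max * delta_t
    d_min    : minimum tolerable distance between the two UAVs
    ends_u/j : optional ((x0, y0), (xF, yF)) pairs used for constraint (2)
    """
    def dist2(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

    if len(q_u) != len(q_j):
        return False
    # Constraint (1): displacement per slot bounded by omega.
    for traj in (q_u, q_j):
        for n in range(len(traj) - 1):
            if dist2(traj[n + 1], traj[n]) > omega ** 2 + tol:
                return False
    # Constraint (2): pre-specified initial and final locations.
    for traj, ends in ((q_u, ends_u), (q_j, ends_j)):
        if ends is not None:
            if dist2(traj[0], ends[0]) > tol or dist2(traj[-1], ends[1]) > tol:
                return False
    # Constraint (3): collision avoidance in every slot.
    return all(dist2(a, b) >= d_min ** 2 - tol for a, b in zip(q_u, q_j))
```

A check of this kind is useful for validating an initial trajectory before it is passed to an iterative solver, since SCA-type methods generally need a feasible starting point.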

We assume that the ground-to-UAV and UAV-to-UAV transmissions are mainly governed by LoS channels [18,20,23,24], so the corresponding channel power gains in time slot *n* follow the free-space path loss model. (This assumption is reasonable when the UAV is deployed in a rural area at a sufficiently high altitude, where the probability of the non-LoS state is negligible compared to the dominant LoS state [30]. The proposed design is nevertheless readily extendable to more general channel models in urban areas with non-LoS effects, e.g., [30].) The channel power gains are, respectively, given by

$$h\_{\rm BU}[n] = \rho\_0 d\_{\rm BU}^{-2}[n] = \frac{\rho\_0}{(H - H\_{\rm B})^2 + ||\mathbf{q}\_{\rm U}[n] - \mathbf{w}\_{\rm B}||^2} \tag{4}$$

$$h\_{{\rm JE}\_i}[n] = \rho\_0 d\_{{\rm JE}\_i}^{-2}[n] = \frac{\rho\_0}{H^2 + ||\mathbf{q}\_{\rm J}[n] - \mathbf{w}\_{{\rm E}\_i}||^2},\tag{5}$$

$$h\_{\rm JU}[n] = \rho\_0 d\_{\rm JU}^{-2}[n] = \frac{\rho\_0}{||\mathbf{q}\_{\rm U}[n] - \mathbf{q}\_{\rm J}[n]||^2} \tag{6}$$

where *d*<sub>BU</sub>[*n*], *d*<sub>JE<sub>*i*</sub></sub>[*n*] and *d*<sub>JU</sub>[*n*] are the distances from the BS to the UAV U, from the UAV J to the eavesdropper E<sub>*i*</sub>, and between the two UAVs in time slot *n*, respectively, *ρ*<sub>0</sub> is the channel power gain at the reference distance *d*<sub>0</sub> = 1 m, and *H*<sub>B</sub> is the altitude of the BS. The ground-to-ground transmission is assumed to follow the Rayleigh fading channel. As such, the channel power gain is given by

$$h\_{\rm BE\_i} = \rho\_0 \zeta\_i d\_{\rm BE\_i}^{-\kappa} = \frac{\rho\_0 \zeta\_i}{||\mathbf{w}\_{\rm B} - \mathbf{w}\_{\rm E\_i}||^\kappa} \tag{7}$$

where *d*<sub>BE<sub>*i*</sub></sub> is the distance between the BS and the eavesdropper E<sub>*i*</sub>, *ζ*<sub>*i*</sub> is an exponentially distributed random variable with unit mean representing small-scale Rayleigh fading, and *κ* ≥ 2 is the distance-dependent path loss exponent.

Denote by *P*[*n*] and *Q*[*n*] the BS's transmit power and the UAV J's jamming power in time slot *n*, respectively. In practice, they should satisfy the respective average power constraints *P̄* and *Q̄*, and peak power constraints *P̂* and *Q̂*, i.e.,

$$\frac{1}{N} \sum\_{n=1}^{N} P[n] \le \bar{P}, 0 \le P[n] \le \hat{P},\tag{8}$$

$$\frac{1}{N} \sum\_{n=1}^{N} Q[n] \le \bar{Q}, 0 \le Q[n] \le \hat{Q}. \tag{9}$$

where *P̄* ≤ *P̂* and *Q̄* ≤ *Q̂*. Then, the achievable rate in bits/second/Hertz (bps/Hz) of the UAV U in time slot *n* is given by

$$R\_{\rm U}[n] = \log\_2\left(1 + \frac{P[n]h\_{\rm BU}[n]}{Q[n]h\_{\rm JU}[n] + \sigma^2}\right),\tag{10}$$

where *Q*[*n*]*h*JU[*n*] is the jamming interference from the UAV J, and *σ*<sup>2</sup> is the additive white Gaussian noise power at the receivers. Similarly, the achievable rate of the eavesdropper E*<sup>i</sup>* in time slot *n* can be expressed as

$$\begin{aligned} R\_{{\rm E}\_i}[n] &= \mathbb{E}\_{\zeta\_i} \left[ \log\_2 \left( 1 + \frac{P[n]h\_{{\rm BE}\_i}}{Q[n]h\_{{\rm JE}\_i}[n] + \sigma^2} \right) \right] \\ &\leq \log\_2 \left( 1 + \frac{P[n]\rho\_0 ||\mathbf{w}\_{\rm B} - \mathbf{w}\_{{\rm E}\_i}||^{-\kappa}}{Q[n]h\_{{\rm JE}\_i}[n] + \sigma^2} \right) \\ &\triangleq \hat{R}\_{{\rm E}\_i}[n], \end{aligned} \tag{11}$$

where 𝔼<sub>*ζ*<sub>*i*</sub></sub>[·] is the expectation operator with respect to (w.r.t.) *ζ*<sub>*i*</sub>. Note that *R*<sub>E<sub>*i*</sub></sub>[*n*] is upper bounded by *R̂*<sub>E<sub>*i*</sub></sub>[*n*] based on Jensen's inequality and the concavity of the rate expression in (11) w.r.t. *ζ*<sub>*i*</sub>, so that *R̂*<sub>E<sub>*i*</sub></sub>[*n*] is the largest rate that E<sub>*i*</sub> can achieve. Therefore, in accordance with the theoretical results in [31], the worst-case secrecy rate for each time slot can be lower bounded by

$$R\_{\text{wcs}}[n] = \max\left(R\_{\rm U}[n] - \max\_{i \in \mathcal{I}} \hat{R}\_{{\rm E}\_i}[n], 0\right). \tag{12}$$

Note that by setting *P*[*n*] = 0, a secrecy rate of at least zero can be achieved in any time slot *n* without violating the power constraint (8). Therefore, the max(·, 0) operation in (12) can be dropped in the following optimization problems.
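To make the chain from the channel gains (4)–(7) to the worst-case secrecy rate (12) concrete, the following minimal sketch evaluates one time slot. It replaces the Rayleigh fading term in (7) by its unit mean, which is exactly the Jensen upper bound used in (11). The function name, the argument names, and all numerical values used when calling it are illustrative assumptions, not parameters taken from the paper.

```python
import math

def secrecy_rate_slot(P, Q, q_u, q_j, w_b, w_e_list, H, H_b, rho0, sigma2, kappa):
    """Worst-case secrecy rate (12) in one slot, in bps/Hz.

    P, Q     : BS transmit power and UAV J jamming power in this slot
    q_u, q_j : horizontal (x, y) positions of UAV U and UAV J (must differ)
    w_b      : horizontal (x, y) position of the BS
    w_e_list : list of horizontal (x, y) positions of the eavesdroppers
    """
    def dist2(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

    h_bu = rho0 / ((H - H_b) ** 2 + dist2(q_u, w_b))        # gain (4)
    h_ju = rho0 / dist2(q_u, q_j)                            # gain (6)
    r_u = math.log2(1 + P * h_bu / (Q * h_ju + sigma2))      # rate (10)

    r_e = 0.0
    for w_e in w_e_list:
        h_je = rho0 / (H ** 2 + dist2(q_j, w_e))             # gain (5)
        g_be = rho0 * math.sqrt(dist2(w_b, w_e)) ** (-kappa) # mean of gain (7)
        r_e = max(r_e, math.log2(1 + P * g_be / (Q * h_je + sigma2)))  # bound (11)

    return max(r_u - r_e, 0.0)                               # rate (12)
```

Averaging this quantity over the *N* slots gives the objective that the joint trajectory and power design seeks to maximize.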

#### *2.2. Problem Formulation*

In this paper, we aim to maximize the average worst-case achievable secrecy rate from the BS to the UAV U over *N* time slots, by jointly optimizing the BS's transmit power **P** ≜ {*P*[*n*], *n* ∈ *N*}, the jamming power **Q** ≜ {*Q*[*n*], *n* ∈ *N*} of the UAV J, and the UAV trajectories **q**<sub>*k*</sub> ≜ {**q**<sub>*k*</sub>[*n*], *n* ∈ *N*} for *k* ∈ {U, J}. Thus, this optimization problem can be formulated as

$$\max\_{\mathbf{P}, \mathbf{Q}, \mathbf{q}\_{\rm U}, \mathbf{q}\_{\rm J}} \frac{1}{N} \sum\_{n=1}^{N} \left( R\_{\rm U}[n] - R\_{\rm E}[n] \right) \tag{13}$$

$$\text{s.t. (1)-(3), (8)-(9),}$$

where we let *R*<sub>E</sub>[*n*] = max<sub>*i*∈ℐ</sub> *R̂*<sub>E<sub>*i*</sub></sub>[*n*], and thus *R*<sub>E</sub>[*n*] corresponds to the maximum achievable rate among the multiple eavesdroppers in time slot *n*. Optimally solving problem (13) is difficult, in general, for two main reasons: (1) the objective function is not concave w.r.t. the optimization variables of any block, even with the variables of the other blocks fixed, and (2) the constraint in (3) is non-convex w.r.t. the UAVs' trajectory variables.

#### **3. Joint Trajectory and Power Control Algorithm**

In this section, an efficient algorithm is proposed to obtain a sub-optimal solution to problem (13). Specifically, we cope with problem (13) by solving four subproblems iteratively, i.e., by alternately optimizing the transmit power **P**, the jamming power **Q**, the UAV U's trajectory **q**<sub>U</sub>, and the UAV J's trajectory **q**<sub>J</sub>, each with the other three blocks of variables fixed. Furthermore, the overall algorithm is presented, and its complexity and convergence are analyzed.

#### *3.1. Transmit Power Optimization*

For simplicity, let

$$a\_n = \frac{\gamma\_0}{d\_{\rm BU}^2[n]\left(1 + Q[n]\gamma\_0 / d\_{\rm JU}^2[n]\right)}, \quad b\_n = \frac{\gamma\_0 ||\mathbf{w}\_{\rm B} - \mathbf{w}\_{\rm E}||^{-\kappa}}{1 + Q[n]\gamma\_0 / d\_{\rm JE}^2[n]},$$

where *γ*<sub>0</sub> = *ρ*<sub>0</sub>/*σ*<sup>2</sup> is the reference signal-to-noise ratio (SNR), **w**<sub>E</sub> denotes the horizontal location of the eavesdropper that achieves the largest rate, and *d*<sub>JE</sub>[*n*] is the distance from the UAV J to that eavesdropper. Thus, with given **Q**, **q**<sub>U</sub>, and **q**<sub>J</sub>, problem (13) can be simplified as

$$\max\_{\mathbf{P}} \sum\_{n=1}^{N} \left[ \log\_2 \left( 1 + a\_n P[n] \right) - \log\_2 \left( 1 + b\_n P[n] \right) \right] \tag{14}$$

s.t. (8).

Based on the result in [18], the closed-form solution to this problem is given by

$$P^{\ast}[n] = \begin{cases} \min\left([\Lambda\_n]^+, \hat{P}\right), & a\_n > b\_n, \\ 0, & \text{otherwise}, \end{cases}$$

where [*x*]<sup>+</sup> ≜ max(*x*, 0) and

$$\Lambda\_n = \sqrt{\left(\frac{1}{2b\_n} - \frac{1}{2a\_n}\right)^2 + \frac{1}{\lambda \ln 2}\left(\frac{1}{b\_n} - \frac{1}{a\_n}\right)} - \frac{1}{2a\_n} - \frac{1}{2b\_n}.$$

Here, *λ* ≥ 0 is a constant that ensures constraint (8) is met, which can be obtained cost-effectively via the bisection algorithm [32]. Once obtained, the optimal transmit power **P** serves as the given input for the jamming power optimization problem in the next subsection.
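The closed-form power allocation and the bisection search for *λ* can be sketched as follows. This is a minimal illustration under stated assumptions: the coefficients *a<sub>n</sub>* and *b<sub>n</sub>* are taken as given positive numbers, the function name `optimal_transmit_power` and the fixed number of bisection steps are our own choices, and the case in which the peak power alone already satisfies the average constraint is handled separately by taking *λ* = 0.

```python
import math

def optimal_transmit_power(a, b, p_avg, p_peak, steps=100):
    """Closed-form solution of problem (14) with bisection on lambda.

    a, b   : lists of positive coefficients a_n and b_n
    p_avg  : average power limit; p_peak : peak power limit
    """
    def power(lam):
        out = []
        for an, bn in zip(a, b):
            if an <= bn:
                out.append(0.0)  # no positive secrecy rate is possible in this slot
                continue
            disc = (1 / (2 * bn) - 1 / (2 * an)) ** 2 \
                + (1 / bn - 1 / an) / (lam * math.log(2))
            val = math.sqrt(disc) - 1 / (2 * an) - 1 / (2 * bn)
            out.append(min(max(val, 0.0), p_peak))
        return out

    def mean(p):
        return sum(p) / len(p)

    # If transmitting at peak power in every useful slot meets the average
    # constraint, the constraint is inactive (lambda = 0).
    p_max = [p_peak if an > bn else 0.0 for an, bn in zip(a, b)]
    if mean(p_max) <= p_avg:
        return p_max

    # The average power decreases with lambda: bracket the root, then bisect.
    lo, hi = 0.0, 1.0
    while mean(power(hi)) > p_avg:
        hi *= 2.0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if mean(power(mid)) > p_avg:
            lo = mid
        else:
            hi = mid
    return power(hi)
```

Slots with *a<sub>n</sub>* ≤ *b<sub>n</sub>* receive zero power, so the available average power is concentrated on slots in which the legitimate link is stronger than the best eavesdropping link.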

#### *3.2. Jamming Power Optimization*

Let

$$c\_n = \frac{P[n]\gamma\_0}{d\_{\rm BU}^2[n]}, \quad d\_n = \frac{\gamma\_0}{d\_{\rm JU}^2[n]}, \quad e\_n = P[n]\gamma\_0 ||\mathbf{w}\_{\rm B} - \mathbf{w}\_{\rm E}||^{-\kappa}, \quad f\_n = \frac{\gamma\_0}{d\_{\rm JE}^2[n]}.$$

With given **P**, **q**<sub>U</sub>, and **q**<sub>J</sub>, we can reformulate problem (13) as

$$\max\_{\mathbf{Q}} \sum\_{n=1}^{N} \left[ \log\_2 \left( 1 + \frac{c\_n}{1 + d\_n Q[n]} \right) - \log\_2 \left( 1 + \frac{e\_n}{1 + f\_n Q[n]} \right) \right] \tag{15}$$

$$\text{s.t. (9).}$$

Problem (15) is non-convex because its objective function is not concave, which makes it difficult to solve for general *N*. However, the first term in (15) is convex w.r.t. *Q*[*n*], and thus it can be replaced by an affine lower bound within each iteration by applying the SCA method. It is known that the first-order Taylor expansion of a convex function is a global under-estimator at any point [32]. Thus, denoting by **Q**<sup>*l*</sup> = {*Q*<sup>*l*</sup>[*n*], *n* ∈ *N*} the given local point in the *l*-th iteration, we have

$$\log\_2\left(1+\frac{c\_n}{1+d\_nQ[n]}\right) \ge A\_n^l + B\_n^l(Q[n]-Q^l[n])\tag{16}$$

where

$$A\_n^l = \log\_2\left(1+\frac{c\_n}{1+d\_nQ^l[n]}\right)$$

and

$$B\_n^l = \frac{-c\_n d\_n}{\ln 2 (1 + d\_n Q^l[n]) (1 + c\_n + d\_n Q^l[n])}.$$

With (16), problem (15) is lower bounded by the following problem for any given **Q**<sup>*l*</sup>,

$$\max\_{\mathbf{Q}} \sum\_{n=1}^{N} \left[ B\_n^l Q[n] - \log\_2 \left( 1 + \frac{e\_n}{1 + f\_n Q[n]} \right) \right] \tag{17}$$

$$\text{s.t. (9).}$$

Observe that the objective function of this subproblem is concave w.r.t. *Q*[*n*] and the constraints in (9) are linear, so the subproblem is convex and can be solved efficiently by the interior-point method [32]. After solving problem (17), the obtained jamming power **Q** serves as the given input for the trajectory optimization problems of the UAVs.
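The key step of this subsection is the affine under-estimator (16). A minimal sketch that computes the coefficients *A* and *B* at a local point and allows the bound to be checked numerically is given below; the function names and the sample values of *c<sub>n</sub>* and *d<sub>n</sub>* used when calling them are illustrative assumptions.

```python
import math

def first_term(c, d, q):
    """First term of the objective in (15) for one slot."""
    return math.log2(1 + c / (1 + d * q))

def taylor_lower_bound(c, d, q_local):
    """Coefficients (A, B) of the affine lower bound (16) at q_local."""
    A = math.log2(1 + c / (1 + d * q_local))
    B = -c * d / (math.log(2) * (1 + d * q_local) * (1 + c + d * q_local))
    return A, B
```

Because the first term is convex in *Q*[*n*], the affine function *A* + *B*(*Q*[*n*] − *Q*<sup>*l*</sup>[*n*]) never exceeds it and touches it at the local point, which is what makes the surrogate problem (17) a valid lower bound of (15).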

#### *3.3. Trajectory Optimization of the UAV U*

Even with given **P**, **Q**, and **q**<sub>J</sub>, it is still hard to obtain the optimal solution to problem (13), due to the non-concavity of the objective function w.r.t. **q**<sub>U</sub> and the non-convexity of the constraint (3). To tackle this subproblem, we first introduce the slack variables **α** = {*α*[*n*] = (*H* − *H*<sub>B</sub>)<sup>2</sup> + ||**q**<sub>U</sub>[*n*] − **w**<sub>B</sub>||<sup>2</sup>, *n* ∈ *N*} and **β** = {*β*[*n*] = ||**q**<sub>U</sub>[*n*] − **q**<sub>J</sub>[*n*]||<sup>2</sup>, *n* ∈ *N*}. After some simple transformations, solving problem (13) is equivalent to solving the following problem,

$$\max\_{\mathbf{q}\_{\rm U}, \boldsymbol{\alpha}, \boldsymbol{\beta}} \sum\_{n=1}^{N} \left[ \log\_2 \left( 1 + \frac{P[n]\gamma\_0}{\alpha[n]} + \frac{Q[n]\gamma\_0}{\beta[n]} \right) - \log\_2 \left( 1 + \frac{Q[n]\gamma\_0}{\beta[n]} \right) \right] \tag{18}$$

$$\text{s.t.}\ \alpha[n] \ge (H - H\_{\rm B})^2 + ||\mathbf{q}\_{\rm U}[n] - \mathbf{w}\_{\rm B}||^2,\tag{19}$$

$$\beta[n] \le ||\mathbf{q}\_{\rm U}[n] - \mathbf{q}\_{\rm J}[n]||^2,\tag{20}$$

$$(1)-(3).$$

In fact, if *α*[*n*] (*β*[*n*]) is increased (decreased), the objective value decreases, and thus the constraints for **α** and **β** must hold with equality at the optimum. Problem (18) is still non-convex, because of the non-concave objective function in (18) and the non-convex constraints in (3) and (20). To tackle this, an important lemma is provided below.

**Lemma 1.** *Given K*<sub>1</sub> > 0 *and K*<sub>2</sub> > 0*, the function f*(*x*, *y*) = log<sub>2</sub>(1 + *K*<sub>1</sub>/*x* + *K*<sub>2</sub>/*y*) *is jointly convex w.r.t. x* > 0 *and y* > 0*.*

**Proof.** See Appendix A.

Based on Lemma 1, it is easy to prove the convexity of the first term in problem (18). A convex function *f*(*x*, *y*) is lower bounded by its first-order Taylor expansion at any point (*x*<sub>0</sub>, *y*<sub>0</sub>), i.e., *f*(*x*, *y*) ≥ *f*(*x*<sub>0</sub>, *y*<sub>0</sub>) + *f*<sub>*x*</sub>(*x*<sub>0</sub>, *y*<sub>0</sub>)(*x* − *x*<sub>0</sub>) + *f*<sub>*y*</sub>(*x*<sub>0</sub>, *y*<sub>0</sub>)(*y* − *y*<sub>0</sub>). Hence, at the given local points denoted by **α**<sup>*l*</sup> = {*α*<sup>*l*</sup>[*n*], *n* ∈ *N*} and **β**<sup>*l*</sup> = {*β*<sup>*l*</sup>[*n*], *n* ∈ *N*} in the *l*-th iteration, the first term in (18) can be lower bounded as follows,

$$\log\_2\left(1+\frac{P[n]\gamma\_0}{\alpha[n]}+\frac{Q[n]\gamma\_0}{\beta[n]}\right) \ge \log\_2 C\_n^l - \frac{D\_n^l}{C\_n^l \ln 2}, \tag{21}$$

where

$$C\_n^l = 1 + \frac{P[n]\gamma\_0}{\alpha^l[n]} + \frac{Q[n]\gamma\_0}{\beta^l[n]}, \quad D\_n^l = P[n]\gamma\_0(\alpha^l[n])^{-2}(\alpha[n] - \alpha^l[n]) + Q[n]\gamma\_0(\beta^l[n])^{-2}(\beta[n] - \beta^l[n]).$$

Similarly, by using the first-order Taylor expansion at the given local point denoted by **q**<sub>U</sub><sup>*l*</sup> = {**q**<sub>U</sub><sup>*l*</sup>[*n*], *n* ∈ *N*} in the *l*-th iteration, the convex function ||**q**<sub>U</sub>[*n*] − **q**<sub>J</sub>[*n*]||<sup>2</sup> w.r.t. **q**<sub>U</sub>[*n*] in constraints (3) and (20) can be replaced by its affine lower bound, i.e.,

$$||\mathbf{q}\_{\rm U}[n]-\mathbf{q}\_{\rm J}[n]||^{2} \geq ||\mathbf{q}\_{\rm U}^{l}[n]-\mathbf{q}\_{\rm J}[n]||^{2} + 2\left(\mathbf{q}\_{\rm U}^{l}[n]-\mathbf{q}\_{\rm J}[n]\right)^{T}\left(\mathbf{q}\_{\rm U}[n]-\mathbf{q}\_{\rm U}^{l}[n]\right). \tag{22}$$

As a result, by applying SCA technique in each iteration, we approximate the original convex functions to more manageable functions at given local points. Therefore, with (21)–(22), we have the following optimization problem

$$\max\_{\mathbf{q}\_{\rm U}, \boldsymbol{\alpha}, \boldsymbol{\beta}} \sum\_{n=1}^{N} \left[ -\frac{D\_n^l}{C\_n^l \ln 2} - \log\_2 \left( 1 + \frac{Q[n]\gamma\_0}{\beta[n]} \right) \right] \tag{23}$$

$$\text{s.t.}\ \beta[n] \le ||\mathbf{q}\_{\rm U}^l[n] - \mathbf{q}\_{\rm J}[n]||^2 + 2(\mathbf{q}\_{\rm U}^l[n] - \mathbf{q}\_{\rm J}[n])^T (\mathbf{q}\_{\rm U}[n] - \mathbf{q}\_{\rm U}^l[n]),\tag{24}$$

$$d\_{\min}^2 \le ||\mathbf{q}\_{\rm U}^l[n] - \mathbf{q}\_{\rm J}[n]||^2 + 2(\mathbf{q}\_{\rm U}^l[n] - \mathbf{q}\_{\rm J}[n])^T(\mathbf{q}\_{\rm U}[n] - \mathbf{q}\_{\rm U}^l[n]),\tag{25}$$

$$(1), (2), (19).$$

It is observed that problem (23) is now convex with all convex constraints. As such, the interior-point method can be used efficiently to solve this problem. Note that the lower bounds given by the Taylor expansions imply that the optimal objective value obtained by solving problem (23) is a lower bound of that of problem (18). In the next subsection, the obtained **q**<sub>U</sub> serves as the given input for the trajectory optimization problem of the UAV J.
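The two first-order bounds used in this subsection, (21) for the logarithmic term and (22) for the squared distance, can be verified numerically with a short sketch such as the following. The function names and the test values are illustrative assumptions; `K1` and `K2` play the roles of *P*[*n*]*γ*<sub>0</sub> and *Q*[*n*]*γ*<sub>0</sub> as in Lemma 1.

```python
import math

def log_term(K1, K2, x, y):
    """f(x, y) = log2(1 + K1/x + K2/y) from Lemma 1."""
    return math.log2(1 + K1 / x + K2 / y)

def log_term_bound(K1, K2, x, y, x0, y0):
    """Right-hand side of (21): first-order expansion of f at (x0, y0)."""
    C = 1 + K1 / x0 + K2 / y0
    D = K1 * x0 ** -2 * (x - x0) + K2 * y0 ** -2 * (y - y0)
    return math.log2(C) - D / (C * math.log(2))

def sqdist_bound(q_u, q_j, q_u_local):
    """Right-hand side of (22): affine lower bound of ||q_u - q_j||^2."""
    dx, dy = q_u_local[0] - q_j[0], q_u_local[1] - q_j[1]
    base = dx ** 2 + dy ** 2
    lin = 2 * (dx * (q_u[0] - q_u_local[0]) + dy * (q_u[1] - q_u_local[1]))
    return base + lin
```

For (22), the gap between the squared distance and its bound equals the squared distance between the new and the local position of the UAV U, so the approximation is tight when the trajectory changes little between iterations.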

#### *3.4. Trajectory Optimization of the UAV J*

With given **P**, **Q**, and **q**<sub>U</sub>, we let **δ** = {*δ*[*n*] = *H*<sup>2</sup> + ||**q**<sub>J</sub>[*n*] − **w**<sub>E</sub>||<sup>2</sup>, *n* ∈ *N*}, to tackle the non-concavity of the objective function w.r.t. **q**<sub>J</sub>. Therefore, problem (13) can be rewritten as

$$\begin{split} \max\_{\mathbf{q}\_{\rm J}, \boldsymbol{\beta}, \boldsymbol{\delta}} \sum\_{n=1}^{N} \Big[ &\log\_2 \left( \beta[n] + c\_n \beta[n] + Q[n] \gamma\_0 \right) - \log\_2 \left( \beta[n] + Q[n] \gamma\_0 \right) \\ &- \log\_2 \left( 1 + \frac{e\_n \delta[n]}{\delta[n] + Q[n] \gamma\_0} \right) \Big] \end{split} \tag{26}$$

$$\text{s.t.}\ \delta[n] \ge H^2 + ||\mathbf{q}\_{\rm J}[n] - \mathbf{w}\_{\rm E}||^2,\tag{27}$$

$$(1)-(3), (20).$$

Also, the constraint for **δ** holds with equality at the optimum, since increasing *δ*[*n*] decreases the objective value. Similarly, by using the first-order Taylor expansion at the given local points denoted by **δ**<sup>*l*</sup> = {*δ*<sup>*l*</sup>[*n*], *n* ∈ *N*}, **β**<sup>*l*</sup> = {*β*<sup>*l*</sup>[*n*], *n* ∈ *N*} and **q**<sub>J</sub><sup>*l*</sup> = {**q**<sub>J</sub><sup>*l*</sup>[*n*], *n* ∈ *N*} in the *l*-th iteration, the concave second and third terms in problem (26) can be replaced by their affine upper bounds, and the convex function ||**q**<sub>U</sub>[*n*] − **q**<sub>J</sub>[*n*]||<sup>2</sup> in (3) and (20) can be replaced by its affine lower bound, i.e.,

$$\log\_2\left(\beta[n] + Q[n]\gamma\_0\right) \le \log\_2\left(\beta^l[n] + Q[n]\gamma\_0\right) + \frac{\beta[n] - \beta^l[n]}{\ln 2 (\beta^l[n] + Q[n]\gamma\_0)},\tag{28}$$

$$\log\_2\left(1+\frac{e\_n\delta[n]}{\delta[n]+Q[n]\gamma\_0}\right) \le E\_n^l + F\_n^l(\delta[n]-\delta^l[n]),\tag{29}$$

where

$$E\_n^l = \log\_2\left(1+\frac{e\_n\delta^l[n]}{\delta^l[n]+Q[n]\gamma\_0}\right), \quad F\_n^l = \frac{e\_n \gamma\_0 Q[n]}{\ln 2 (\gamma\_0 Q[n] + (e\_n + 1)\delta^l[n])(\gamma\_0 Q[n] + \delta^l[n])},$$

and

$$||\mathbf{q}\_{\rm U}[n] - \mathbf{q}\_{\rm J}[n]||^{2} \geq ||\mathbf{q}\_{\rm U}[n] - \mathbf{q}\_{\rm J}^{l}[n]||^{2} - 2 \left( \mathbf{q}\_{\rm U}[n] - \mathbf{q}\_{\rm J}^{l}[n] \right)^{T} \left( \mathbf{q}\_{\rm J}[n] - \mathbf{q}\_{\rm J}^{l}[n] \right). \tag{30}$$

With (28)–(30), we approximate problem (26) as the following optimization problem

$$\max\_{\mathbf{q}\_{\rm J}, \boldsymbol{\beta}, \boldsymbol{\delta}} \sum\_{n=1}^{N} \left[ \log\_2 \left( \beta[n] + c\_n \beta[n] + Q[n] \gamma\_0 \right) - F\_n^l \delta[n] - \frac{\beta[n]}{\ln 2 (\beta^l[n] + Q[n] \gamma\_0)} \right] \tag{31}$$

$$\text{s.t.}\ d\_{\min}^2 \le ||\mathbf{q}\_{\rm U}[n] - \mathbf{q}\_{\rm J}^l[n]||^2 - 2(\mathbf{q}\_{\rm U}[n] - \mathbf{q}\_{\rm J}^l[n])^T (\mathbf{q}\_{\rm J}[n] - \mathbf{q}\_{\rm J}^l[n]),\tag{32}$$

$$\beta[n] \le ||\mathbf{q}\_{\rm U}[n] - \mathbf{q}\_{\rm J}^{l}[n]||^{2} - 2(\mathbf{q}\_{\rm U}[n] - \mathbf{q}\_{\rm J}^{l}[n])^{T}(\mathbf{q}\_{\rm J}[n] - \mathbf{q}\_{\rm J}^{l}[n]),\tag{33}$$

$$(1), (2), (27).$$

Problem (31) is now a convex optimization problem that can be cost-effectively solved by the interior-point method. Furthermore, the Taylor expansions in (28)–(30) indicate that the optimal objective value of problem (26) is at least as large as that obtained by solving problem (31). Note that all the obtained variables **P**, **Q**, **q**<sub>U</sub>, and **q**<sub>J</sub> are used as the given variables for the next iteration.

#### *3.5. Overall Algorithm*

In summary, the overall algorithm obtains a locally optimal solution to problem (13) by jointly optimizing the BS's transmit power **P**, the UAV J's jamming power **Q**, and the two UAVs' trajectories **q**<sub>U</sub> and **q**<sub>J</sub>, via alternately solving subproblems (14), (17), (23) and (31) in an iterative way. The detailed procedure for solving problem (13) is summarized in Algorithm 1.

In the following, we analyze the computational complexity of Algorithm 1. In each iteration, the BS's transmit power, the UAV J's jamming power, and the trajectories of UAVs U and J are optimized in sequence, based on the interior-point method by using existing solvers, such as CVX [33]. Therefore, the complexity for solving the four subproblems can be expressed by *O*(log *N*), *O*(*N*<sup>3.5</sup> log(1/*ε*)), *O*((3*N*)<sup>3.5</sup> log(1/*ε*)), and *O*((3*N*)<sup>3.5</sup> log(1/*ε*)), respectively, for a given solution precision *ε* > 0 [34]. In addition, as the complexity for updating all variables in the BCD iterations is on the order of log(1/*ε*), the total computational complexity of the proposed algorithm is *O*(*N*<sup>3.5</sup> log<sup>2</sup>(1/*ε*)). Owing to its polynomial time complexity, Algorithm 1 is applicable to aerial cooperative jamming for cellular-enabled UAV networks.



8: **until** convergence to a pre-specified precision *ε* > 0.

Next, the convergence of Algorithm 1 is discussed as follows. Let *ψ*(**P**<sup>*l*</sup>, **Q**<sup>*l*</sup>, **q**<sub>U</sub><sup>*l*</sup>, **q**<sub>J</sub><sup>*l*</sup>) denote the value of the objective function in problem (13) in the *l*-th iteration. Then, we have

$$
\psi(\mathbf{P}^l, \mathbf{Q}^l, \mathbf{q}\_{\rm U}^l, \mathbf{q}\_{\rm J}^l) \le \psi\_{\mathbf{P}}(\mathbf{P}^{l+1}, \mathbf{Q}^l, \mathbf{q}\_{\rm U}^l, \mathbf{q}\_{\rm J}^l), \tag{34}
$$

where *ψ*<sub>**P**</sub>(**P**<sup>*l*+1</sup>, **Q**<sup>*l*</sup>, **q**<sub>U</sub><sup>*l*</sup>, **q**<sub>J</sub><sup>*l*</sup>) is defined as the obtained objective value of problem (14) and **P**<sup>*l*+1</sup> is the optimal solution to problem (14). For the optimization of the jamming power **Q**, the following relations hold,

$$\begin{split} \psi(\mathbf{P}^{l+1}, \mathbf{Q}^{l}, \mathbf{q}\_{\rm U}^{l}, \mathbf{q}\_{\rm J}^{l}) &\overset{(j\_{1})}{=} \psi\_{\mathbf{Q}}^{\text{lb}}(\mathbf{P}^{l+1}, \mathbf{Q}^{l}, \mathbf{q}\_{\rm U}^{l}, \mathbf{q}\_{\rm J}^{l}) \\ &\overset{(j\_{2})}{\leq} \psi\_{\mathbf{Q}}^{\text{lb}}(\mathbf{P}^{l+1}, \mathbf{Q}^{l+1}, \mathbf{q}\_{\rm U}^{l}, \mathbf{q}\_{\rm J}^{l}) \\ &\overset{(j\_{3})}{\leq} \psi(\mathbf{P}^{l+1}, \mathbf{Q}^{l+1}, \mathbf{q}\_{\rm U}^{l}, \mathbf{q}\_{\rm J}^{l}), \end{split} \tag{35}$$

where *ψ*<sup>lb</sup><sub>**Q**</sub> denotes the objective value of problem (17), (*j*<sub>1</sub>) holds since the first-order Taylor expansion in (16) is tight at the local point **Q**<sup>*l*</sup>, (*j*<sub>2</sub>) holds because **Q**<sup>*l*+1</sup> is the optimal solution to problem (17), and (*j*<sub>3</sub>) holds because the objective value of problem (17) is a lower bound of that of problem (15). For the optimization of the two UAVs' trajectories, a derivation similar to (35) applies, which gives

$$\begin{split} \psi(\mathbf{P}^{l+1}, \mathbf{Q}^{l+1}, \mathbf{q}\_{\rm U}^{l}, \mathbf{q}\_{\rm J}^{l}) &= \psi^{\text{lb}}\_{\mathbf{q}\_{\rm U}}(\mathbf{P}^{l+1}, \mathbf{Q}^{l+1}, \mathbf{q}\_{\rm U}^{l}, \mathbf{q}\_{\rm J}^{l}) \\ &\leq \psi^{\text{lb}}\_{\mathbf{q}\_{\rm U}}(\mathbf{P}^{l+1}, \mathbf{Q}^{l+1}, \mathbf{q}\_{\rm U}^{l+1}, \mathbf{q}\_{\rm J}^{l}) \\ &\leq \psi(\mathbf{P}^{l+1}, \mathbf{Q}^{l+1}, \mathbf{q}\_{\rm U}^{l+1}, \mathbf{q}\_{\rm J}^{l}), \end{split} \tag{36}$$

$$\begin{split} \psi(\mathbf{P}^{l+1}, \mathbf{Q}^{l+1}, \mathbf{q}\_{\rm U}^{l+1}, \mathbf{q}\_{\rm J}^{l}) &= \psi\_{\mathbf{q}\_{\rm J}}^{\text{lb}}(\mathbf{P}^{l+1}, \mathbf{Q}^{l+1}, \mathbf{q}\_{\rm U}^{l+1}, \mathbf{q}\_{\rm J}^{l}) \\ &\leq \psi\_{\mathbf{q}\_{\rm J}}^{\text{lb}}(\mathbf{P}^{l+1}, \mathbf{Q}^{l+1}, \mathbf{q}\_{\rm U}^{l+1}, \mathbf{q}\_{\rm J}^{l+1}) \\ &\leq \psi(\mathbf{P}^{l+1}, \mathbf{Q}^{l+1}, \mathbf{q}\_{\rm U}^{l+1}, \mathbf{q}\_{\rm J}^{l+1}). \end{split} \tag{37}$$

With (34)–(37), we finally obtain that

$$
\psi(\mathbf{P}^l, \mathbf{Q}^l, \mathbf{q}\_{\rm U}^l, \mathbf{q}\_{\rm J}^l) \le \psi(\mathbf{P}^{l+1}, \mathbf{Q}^{l+1}, \mathbf{q}\_{\rm U}^{l+1}, \mathbf{q}\_{\rm J}^{l+1}). \tag{38}
$$

As a result, Algorithm 1 ensures that the obtained objective value of problem (13) is non-decreasing over the iterations. Since the objective value is also bounded above by a finite value under the peak power constraints, Algorithm 1 is guaranteed to converge to a locally optimal solution to problem (13).
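The monotonicity argument in (34)–(38) is the standard one for block coordinate methods: each block update can only improve the objective, so the sequence of objective values is non-decreasing. The following minimal sketch illustrates this outer loop on a toy concave problem with two scalar blocks and exact per-block maximizers. It is not the paper's algorithm: the function `block_coordinate_ascent`, the toy objective, and the stopping rule based on the objective improvement are illustrative assumptions, and in Algorithm 1 the block updates would be the solutions of subproblems (14), (17), (23) and (31).

```python
def block_coordinate_ascent(objective, updates, x0, eps=1e-6, max_iter=100):
    """Generic BCD outer loop for a maximization problem.

    objective : callable mapping the list of blocks to a scalar value
    updates   : one callable per block, returning the new value of that
                block given the current list of blocks
    x0        : initial (feasible) list of blocks
    """
    x = list(x0)
    history = [objective(x)]
    for _ in range(max_iter):
        # Update the blocks in sequence, each with the others held fixed.
        for k, update in enumerate(updates):
            x[k] = update(x)
        history.append(objective(x))
        # Stop once the improvement falls below the target precision.
        if history[-1] - history[-2] < eps:
            break
    return x, history


# Toy problem: maximize -(x - 1)^2 - (y - 2)^2 - x*y, whose maximizer is (0, 2).
toy_objective = lambda v: -(v[0] - 1) ** 2 - (v[1] - 2) ** 2 - v[0] * v[1]
toy_updates = [lambda v: 1 - v[1] / 2,   # exact maximizer over x for fixed y
               lambda v: 2 - v[0] / 2]   # exact maximizer over y for fixed x
solution, values = block_coordinate_ascent(toy_objective, toy_updates,
                                           [5.0, -3.0], eps=1e-12)
```

In this toy run the recorded objective values never decrease, mirroring the property stated in (38).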

#### **4. Numerical Results**

In this section, we verify our joint trajectories and powers optimization (denoted as 2T&P) algorithm through simulations. Three benchmark schemes are taken into account as a comparison:


Specifically, the 2T/NP scheme sets the powers of the BS and the UAV J as *P*[*n*] = *P̄* and *Q*[*n*] = *Q̄*, ∀*n*, respectively, and the trajectories of the two UAVs are obtained by solving problems (23) and (31) iteratively until convergence. In the 2HT/P scheme, the UAV U flies directly to the point above the BS at its maximum speed, then hovers there for as long as possible, and finally travels directly to its destination at its maximum speed so as to arrive by the end of *T*. Different from the UAV U, the UAV J keeps hovering right above the eavesdropper with the largest achievable rate. Given the heuristic trajectories in 2HT/P, the powers *P*[*n*] and *Q*[*n*] can be obtained by solving problems (14) and (17), respectively. The initial UAV trajectories for the 2T&P and 2T/NP schemes are set to the heuristic UAV trajectories of 2HT/P. The simulation parameters are specified in Table 1.
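The heuristic fly-hover-fly trajectory of the UAV U described above can be constructed slot by slot. The following is a minimal sketch, assuming the flight period is long enough to reach the BS and still arrive at the destination in time; the function name, the argument names, and the choice to raise an error otherwise are our own assumptions rather than details given in the paper.

```python
import math

def heuristic_trajectory(q0, qf, w_b, v_max, delta_t, n_slots):
    """Fly to the BS at maximum speed, hover, then fly to the destination.

    Returns a list of n_slots + 1 horizontal (x, y) waypoints for n = 0..N.
    """
    step = v_max * delta_t  # maximum distance per slot

    def straight(a, b):
        # Waypoints after leaving a, ending exactly at b, one hop per slot.
        dist = math.hypot(b[0] - a[0], b[1] - a[1])
        hops = max(0, math.ceil(dist / step - 1e-9))
        pts = [(a[0] + (b[0] - a[0]) * i * step / dist,
                a[1] + (b[1] - a[1]) * i * step / dist)
               for i in range(1, hops)]
        return pts + [b] if hops > 0 else []

    go = straight(q0, w_b)
    back = straight(w_b, qf)
    hover = n_slots - len(go) - len(back)
    if hover < 0:
        raise ValueError("flight period too short to reach the BS and return")
    return [q0] + go + [w_b] * hover + back
```

Under the same assumptions, the heuristic trajectory of the UAV J would simply repeat the position above the selected eavesdropper, subject to its own initial and final locations.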

We first verify the convergence behaviour of the proposed Algorithm 1 versus the number of iterations for different *T* in Figure 2. The average secrecy rate increases quickly and converges within five iterations, and it increases significantly with *T*. This confirms that the proposed algorithm reaches a locally optimal solution to problem (13).

Figure 3 illustrates the trajectories of the two UAVs optimized by the different schemes when *T* is sufficiently large, e.g., *T* = 300 s. The hovering locations of the UAV U are directly above the BS for all schemes. This occurs because the locations of the eavesdroppers do not affect the UAV U's trajectory under ground-to-air transmission, and thus the UAV U obtains its maximum achievable rate by hovering directly above the BS. In addition, the trajectories of the UAV U in 2T&P and 2T/NP follow curved paths in order to escape the unintended interference caused by the UAV J. However, the trajectories of the UAV J differ significantly. In particular, for our 2T&P scheme in Figure 3a, the UAV J first flies along an arc-like path, which avoids a collision with the UAV U, and reaches a certain point close to the eavesdropper E1; then, it remains static at this hovering location for the permitted period, and finally reaches its destination by the end of *T*, again along an arc-like path so as not to cause much interference to the UAV U. Notice that the hovering location of the UAV J is closer to E1 than to E2, as the channel quality of the BS-to-E1 link is much better than that of the BS-to-E2 link. The BS-to-E2 link is also degraded provided that the UAV J guarantees the secrecy of the worst case, i.e., the BS-to-E1 link transmission, by taking advantage of the dominant air-to-ground links. Moreover, at their hovering locations, the UAVs achieve a better secrecy rate by effectively balancing between enhancing the communication of the ground-to-air link and degrading the quality of the BS-to-E*i* channel. In contrast with the 2T&P scheme, on its way to the final location in the 2T/NP scheme, the UAV J flies along a wide arc to keep away from the UAV U, as shown in Figure 3c. This is because the BS's transmit power and the UAV J's jamming power in 2T/NP are fixed, and thus the UAV J has to fly as far away as possible to avoid severe interference with the UAV U over the whole duration *T*.


**Table 1.** Simulation parameters.

**Figure 2.** The convergence performance of the proposed Algorithm 1.

(**c**) UAVs' trajectories for the 2T/NP scheme.

**Figure 3.** UAVs' trajectories by different schemes for *T* = 300 s. All trajectories are sampled every 5 s. The horizontal locations of the BS, eavesdroppers, UAVs' initial and final locations are marked with , •, and ♦, respectively.

Note that there is a tradeoff between improving the average achievable secrecy rate of the UAV U and avoiding the interference induced by the UAV J. For 2HT/P in Figure 3b, with the pre-specified UAV trajectories, the BS and the UAV jammer can adjust their power allocations to enhance the secrecy performance. Specifically, the BS gradually increases its transmit power before the UAV U reaches its hovering location, while the UAV J decreases its jamming power when it arrives above E1 to suppress the interference to the UAV U. In contrast, a secure communication-aware UAV trajectory design provides additional flexibility to avoid interference between UAVs in our 2T&P scheme. Thus, the UAV J adaptively adjusts its jamming power and trajectory according to the BS's transmit power and the location of the UAV U to achieve a better secrecy rate.

Figure 4 illustrates the average secrecy rate versus *T*. As expected, the average secrecy rates obtained by all schemes rise with *T*, and the proposed 2T&P scheme significantly outperforms the other benchmark schemes owing to its joint optimization. Moreover, the proposed 2T&P scheme provides a significant gain over the scheme in [18], i.e., 1T&P. This indicates that aerial cooperative jamming contributes notably to improving the average secrecy rate. However, 2T/NP shows the worst performance, which demonstrates that power control also plays a key role in avoiding jamming from other UAVs and is therefore necessary in our cellular-enabled UAV communication networks with aerial cooperative jamming; otherwise, the secrecy rate can be significantly degraded, as shown in Figure 4. Note that we expect the proposed 2T&P algorithm to still achieve the best secrecy performance via the joint design even if the number of eavesdroppers increases. This is because the joint design guarantees that the eavesdropper with the best channel condition is effectively jammed by the UAV J, so that the other eavesdroppers cannot wiretap confidential messages from the BS. The obtained results validate the advantages of introducing aerial cooperative jamming and of jointly optimizing the UAV trajectories and power allocations.

**Figure 4.** Average secrecy rate versus *T* with different trajectory and power control designs.

#### **5. Conclusions**

Integrating UAVs into the forthcoming 5G cellular networks raises new security challenges. Thus, a new type of cooperative aerial jamming scheme for cellular-enabled UAV secure communication networks has been investigated in this paper. In particular, the UAV receiver and the UAV jammer cooperate closely with each other to maximize the worst-case average secrecy rate by jointly optimizing their trajectories and the BS/UAV transmit/jamming power. An efficient iterative solution has been proposed to approximately solve the secrecy rate maximization problem over a given flight period by means of the BCD and SCA methods. The proposed algorithm is guaranteed to converge to a locally optimal solution with suitable computational complexity. We have demonstrated, by numerical results, that the friendly UAV jammer provides flexible mobility for interfering with the ground eavesdroppers, as well as effective power control to prevent it from jamming the UAV receiver, and thereby improves the system secrecy performance. Furthermore, the proposed scheme significantly outperforms the benchmark schemes with simple heuristic trajectories and pre-configured powers. The current scenario can also be extended to the general case with multiple legitimate UAVs, where optimal communication scheduling between the BS and each UAV should be considered. In this case, the design of the UAV trajectories needs to avoid collisions between UAVs more effectively and to reconcile a tradeoff between maximizing the minimum secrecy rate among multiple UAVs and suppressing the interference from the UAV jammer, which is an interesting problem to be resolved in the future.

**Author Contributions:** Conceptualization, methodology, and software, H.S. and B.D.; validation and investigation, Z.W. and X.L.; and writing—original draft preparation, B.D.; and writing—review and editing, H.S. and C.G.

**Funding:** This work was partially supported by the National Natural Science Foundation of China (Nos. 61701064, 71874027), partially by the China Scholarship Council (No. 201808510008), partially by the Sichuan Science and Technology Program (No. 2018GZ0454), partially by the Chengdu Science and Technology Program (No. 2017-RK00-00363-ZF), and partially by the Chongqing Natural Science Foundation (cstc2019jcyj-msxmX0264).

**Acknowledgments:** The authors appreciate the suggestions of the editors and reviewers, which are of great value to the paper.

**Conflicts of Interest:** The authors declare no conflicts of interest.

#### **Appendix A**

We prove Lemma 1 via the definition of convex functions. First, since *f*(*x*, *y*) = log<sub>2</sub>(1 + *K*<sub>1</sub>/*x* + *K*<sub>2</sub>/*y*), where *x* > 0, *y* > 0, *K*<sub>1</sub> > 0 and *K*<sub>2</sub> > 0, the first-order derivatives of *f*(*x*, *y*) w.r.t. *x* and *y* are given by

$$f\_x(x, y) = -\frac{K\_1}{x^2 G \ln 2}, \qquad f\_y(x, y) = -\frac{K\_2}{y^2 G \ln 2},\tag{A1}$$

where we let *G* = 1 + *K*<sub>1</sub>/*x* + *K*<sub>2</sub>/*y* for brevity. Then, the Hessian of *f*(*x*, *y*) is

$$\nabla^2 f(x, y) = \begin{bmatrix} \frac{2K\_1 xy + 2K\_1 K\_2 x + K\_1^2 y}{x^4 y G^2 \ln 2} & -\frac{K\_1 K\_2}{x^2 y^2 G^2 \ln 2} \\ -\frac{K\_1 K\_2}{x^2 y^2 G^2 \ln 2} & \frac{2K\_2 xy + 2K\_1 K\_2 y + K\_2^2 x}{xy^4 G^2 \ln 2} \end{bmatrix} \tag{A2}$$

For any **t** = [*t*<sub>1</sub>, *t*<sub>2</sub>]<sup>*T*</sup>, we have

$$\mathbf{t}^T \nabla^2 f(x, y)\mathbf{t} = \frac{F\_1(x, y) + F\_2(x, y) + F\_3(x, y)}{x^4 y^4 G^2 \ln 2} \ge 0$$

where *F*<sub>1</sub>(*x*, *y*) = 2*K*<sub>1</sub>*t*<sub>1</sub><sup>2</sup>*xy*<sup>3</sup>(*y* + *K*<sub>2</sub>), *F*<sub>2</sub>(*x*, *y*) = 2*K*<sub>2</sub>*t*<sub>2</sub><sup>2</sup>*x*<sup>3</sup>*y*(*x* + *K*<sub>1</sub>) and *F*<sub>3</sub>(*x*, *y*) = (*t*<sub>1</sub>*K*<sub>1</sub>*y*<sup>2</sup> − *t*<sub>2</sub>*K*<sub>2</sub>*x*<sup>2</sup>)<sup>2</sup> are all non-negative for *x* > 0, *y* > 0, *K*<sub>1</sub> > 0 and *K*<sub>2</sub> > 0. The Hessian is therefore positive semidefinite, which finally leads to the convexity of the function *f*(*x*, *y*).
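The convexity argument can also be checked numerically. The sketch below is only an illustration: it evaluates the closed-form Hessian of *f*(*x*, *y*) = log<sub>2</sub>(1 + *K*<sub>1</sub>/*x* + *K*<sub>2</sub>/*y*) at randomly drawn points and reports whether its smallest eigenvalue stays non-negative. The function names, the sampling range [0.5, 5] and the random seed are assumptions made for this example and are not part of the original proof.

```python
import numpy as np

def f(x, y, K1, K2):
    """f(x, y) = log2(1 + K1/x + K2/y) for x, y, K1, K2 > 0."""
    return np.log2(1.0 + K1 / x + K2 / y)

def hessian(x, y, K1, K2):
    """Closed-form Hessian of f, written term by term as in (A2)."""
    G = 1.0 + K1 / x + K2 / y
    c = G**2 * np.log(2.0)
    fxx = (2 * K1 * x * y + 2 * K1 * K2 * x + K1**2 * y) / (x**4 * y * c)
    fyy = (2 * K2 * x * y + 2 * K1 * K2 * y + K2**2 * x) / (x * y**4 * c)
    fxy = -K1 * K2 / (x**2 * y**2 * c)
    return np.array([[fxx, fxy], [fxy, fyy]])

def min_eigenvalue(n_points=1000, seed=0):
    """Smallest Hessian eigenvalue over randomly sampled (x, y, K1, K2)."""
    rng = np.random.default_rng(seed)
    worst = np.inf
    for _ in range(n_points):
        x, y, K1, K2 = rng.uniform(0.5, 5.0, size=4)  # illustrative range
        worst = min(worst, np.linalg.eigvalsh(hessian(x, y, K1, K2)).min())
    return worst

# A non-negative value is consistent with positive semidefiniteness.
print(min_eigenvalue() >= 0.0)
```

A random search of this kind cannot replace the proof; it only guards against algebra slips in the Hessian entries.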

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **A Sensor-Driven Analysis of Distributed Direction Finding Systems Based on UAV Swarms**

#### **Zhong Chen <sup>1</sup>, Shihyuan Yeh <sup>1</sup>, Jean-Francois Chamberland <sup>1</sup> and Gregory H. Huff <sup>2,</sup>\***


Received: 29 April 2019; Accepted: 8 June 2019; Published: 12 June 2019

**Abstract:** This paper reports on factors that impact the accuracy and efficiency of an unmanned aerial vehicle (UAV) based radio frequency (RF) and microwave data collection system. The swarming UAVs (agents) can be utilized to create micro-UAV swarm-based (MUSB) aperiodic antenna arrays that reduce angle ambiguity and improve convergence in sub-space direction-of-arrival (DOA) techniques. A mathematical data model is presented in this paper to demonstrate fundamental properties of MUSB antenna arrays and to study the performance of the data collection system framework. The Cramer–Rao bound (CRB) associated with two-dimensional (2D) DOAs of sources in the presence of sensor gain and phase coefficients is derived. The single-source case is studied in detail. The vector-space of emitters is exploited and the iterative-MUSIC (multiple signal classification) algorithm is created to estimate the 2D DOAs of emitters. Numerical examples and practical measurements are provided to demonstrate the feasibility of the proposed MUSB data collection system framework using the iterative-MUSIC algorithm and to benchmark theoretical expectations.

**Keywords:** direction-of-arrival estimation; unmanned aerial vehicles; UAV swarm; aperiodic arrays; MUSIC; Cramer–Rao bound

#### **1. Introduction**

DOA estimation of sources using sensor arrays plays an important role in the field of array signal processing. DOA estimation includes one-dimensional (1D) DOA estimation (azimuth) and 2D DOA estimation (azimuth and elevation). The accuracy of DOA estimation is mainly affected by the algorithm, the geometry of the sensor array, the signal-to-noise ratio (SNR), and the number of snapshots. Algorithms and array geometry are the two essential research topics.

DOA estimation methods have been studied for decades and remain an active research topic. Initially, DOA estimation based on sensor arrays used the Bartlett beamforming method, which cannot offer high resolution due to the Rayleigh limit [1,2]. Then, Burg came up with the maximum entropy (ME) method, which is a high-resolution method but has low robustness and a considerable computational cost [3]. Later, a series of high-resolution spatial spectrum estimation methods based on matrix eigendecomposition for array signal processing came out and opened a new era. These spatial spectrum estimation methods are represented by MUSIC and estimation of signal parameters via rotational invariance techniques (ESPRIT) [4–7]. These two methods have greater resolution and accuracy than other classical methods. The simulations in [8] indicate that the MUSIC algorithm is more accurate and stable than the ESPRIT algorithm for a uniform linear array (ULA). Furthermore, ESPRIT can only be used for invariant geometries, while MUSIC can be applied to arbitrary non-uniform sensor arrays and multiple-source estimation. The MUSIC algorithm, a milestone among spatial spectrum estimation methods, has attracted more and more attention since it was introduced. Under certain conditions, the MUSIC algorithm is a 1D implementation of the maximum likelihood (ML) method and shares the same characteristics as ML [9,10]. Then, due to the relatively high computational complexity of MUSIC, some search-free algorithms, such as root-MUSIC and manifold separation based on root-MUSIC, use a root-solving technique to reduce the computational complexity [11,12]. Nevertheless, root-MUSIC can only be used for ULAs. In order to deal with arbitrary non-uniform arrays, the Fourier domain root-MUSIC (FD root-MUSIC) algorithm was proposed [13], but the FD root-MUSIC algorithm is mainly used for 1D DOA estimation.

Most of the algorithms above are still at the simulation and theoretical stages. Recently, practical implementations of DOA estimation systems using field programmable gate arrays (FPGAs) and digital signal processing (DSP) have been reported [14–16]; those implementations achieve real-time DOA estimation based on the MUSIC algorithm. Unlike other well-known subspace-based techniques such as ESPRIT, the MUSIC algorithm has many advantages in real implementations owing to its simplicity and suitability for parallel processing [16].

Apart from DOA estimation methods, the geometry of the antenna array is also very important. Most investigations of the array structure are limited to either 1D linear array or 2D planar array configurations [17]. A 1D array can only detect 1D DOAs, and 2D array structures such as the uniform rectangular array (URA) and the uniform circular array (UCA) can estimate azimuth angles well but cannot estimate elevation angles well due to their small antenna aperture in the elevation direction. In order to improve the elevation angle estimation accuracy, one may develop a three-dimensional (3D) array structure by putting more elements in the vertical direction to obtain a large elevation aperture [18,19]. However, this requires additional hardware and computational cost. Furthermore, uniform linear and planar arrays cause the ambiguity problem (angle aliasing) due to the symmetry of the array structures [20–22]. Xia et al. reported that cubic arrays still have the ambiguity problem and that a spherical array can significantly reduce it [22]. Recently, 3D antenna array configurations have attracted much more research interest in array signal processing [22–28]. Most of those 3D arrays are constructed from regular structures (e.g., cubic, cylindrical), by extending a planar array (e.g., URA, UCA), or by configuring a virtual 3D array based on a planar array. Even though those special 3D arrays increase the elevation angle estimation accuracy, their array apertures are still relatively small since the physical size of static arrays is restricted. Moreover, conventional DOA estimation requires the number of sensors to exceed the number of received signals, which increases hardware cost and system complexity.

In contrast, in order to investigate the compromise between hardware cost and signal processing time, time-variant arrays, whose element positions change over time, have been examined; they use time-divided sampling rather than the simultaneous sampling of a static array. Many researchers have reported using time-variant arrays to improve DOA estimation performance [29–35]. Instead of using a set of different elements to process the incident signals, a time-variant array can use only one or a small number of moving elements to construct virtual antenna arrays. Wan et al. proposed a method that combines the characteristics of arbitrary virtual baselines to construct a virtual 3D array [29]. However, the number of sub-array elements is too small to achieve high resolution. Liu examined a rotating long baseline interferometer, whose length is much larger than one wavelength, to estimate 2D DOAs by constructing virtual 2D circular arrays [30]. However, the 2D circular array has a limited elevation aperture and still cannot estimate the elevation angle well.

The MUSB aperiodic array reconstructed from swarming UAVs proposed herein has aperture dimensions in both the azimuth and elevation directions, which increases the accuracy of both azimuth and elevation angle estimation. Corner and Lamont proposed a parallel simulation of UAV swarm scenarios [36], and Saad et al. reported a testbed for vehicle swarm rapid prototyping [37]. Recently, many researchers have investigated the methods and impact factors for designing robust MUSB antenna arrays for signal collection platforms [38–41]. However, those works are limited to investigations of MUSB array construction, including UAV positional precision, turbulence of the environment, micro-UAV swarm algorithms, and swarm-based real-time data collection. In this paper, 2D DOA estimation using the MUSB aperiodic array is presented: a mathematical model of the MUSB data collection system for signal processing is first introduced, and the impact of the associated parameters on DOA estimation accuracy and convergence in this model is analyzed. The MUSB arrays have the advantages of large aperture, large inter-element spacing, no shadowing effect, low mutual coupling, and a large amount of spatial sampling data from different locations in free space.

In practice, the MUSB system for DOA estimation has to work with a small number of snapshots and might be applied in low-SNR scenarios. However, subspace-based techniques require adequate SNR and a sufficient number of snapshots to guarantee good performance. We therefore use an iterative method that lowers the noise floor by multiplying the current MUSIC spectrum by the previous spectrum at each iteration. The details of the iterative-MUSIC algorithm are presented in Section 5.
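The spectrum-multiplication idea can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the algorithm of Section 5: it uses a single source, unit sensor gains, element positions drawn uniformly in a cube of side four wavelengths, a 2° search grid, and a standard MUSIC pseudo-spectrum per iteration. All function names and parameter values (`M`, `K`, `I`, `sigma2`, the seed) are hypothetical choices made for this sketch.

```python
import numpy as np

def steering(pos, theta, phi):
    """Array response for element positions `pos` (M x 3, in wavelengths)."""
    u = np.stack([np.sin(theta) * np.cos(phi),
                  np.sin(theta) * np.sin(phi),
                  np.cos(theta)], axis=-1)          # unit vector(s) toward the source
    return np.exp(2j * np.pi * (u @ pos.T))         # shape (..., M)

def music_spectrum(X, pos, thetas, phis, n_sources):
    """Standard MUSIC pseudo-spectrum on a (theta, phi) grid from snapshots X (M x K)."""
    R = X @ X.conj().T / X.shape[1]                 # sample covariance matrix
    _, vecs = np.linalg.eigh(R)                     # eigenvalues in ascending order
    En = vecs[:, : X.shape[0] - n_sources]          # noise-subspace eigenvectors
    T, P = np.meshgrid(thetas, phis, indexing="ij")
    a = steering(pos, T, P)                         # shape (n_theta, n_phi, M)
    return 1.0 / np.linalg.norm(a.conj() @ En, axis=-1) ** 2

def iterative_music(true_theta, true_phi, M=8, K=50, I=5, sigma2=0.1, seed=0):
    """Multiply the normalized MUSIC spectra of I swarm geometries and return the peak (degrees)."""
    rng = np.random.default_rng(seed)
    thetas = np.deg2rad(np.arange(0.0, 91.0, 2.0))
    phis = np.deg2rad(np.arange(0.0, 360.0, 2.0))
    combined = np.ones((thetas.size, phis.size))
    for _ in range(I):
        pos = rng.uniform(-2.0, 2.0, size=(M, 3))   # new swarm geometry per iteration
        a = steering(pos, true_theta, true_phi)
        s = (rng.standard_normal(K) + 1j * rng.standard_normal(K)) / np.sqrt(2)
        w = np.sqrt(sigma2 / 2) * (rng.standard_normal((M, K))
                                   + 1j * rng.standard_normal((M, K)))
        X = np.outer(a, s) + w                      # K noisy snapshots
        spec = music_spectrum(X, pos, thetas, phis, n_sources=1)
        combined *= spec / spec.max()               # product with the previous spectra
    it, ip = np.unravel_index(np.argmax(combined), combined.shape)
    return np.rad2deg(thetas[it]), np.rad2deg(phis[ip])

print(iterative_music(np.deg2rad(40.0), np.deg2rad(120.0)))
```

Normalizing each spectrum before taking the product keeps the running result within floating-point range; the product suppresses sidelobes that do not recur across geometries, which is one way to read the "lower noise floor" described above.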

This paper mainly contributes to the array signal processing area in three aspects. First, the mathematical model of the MUSB aperiodic array data collection system is introduced to demonstrate the fundamental DOA estimation impact factors and performance. Second, the CRB associated with DOAs in the presence of the gain and phase coefficient in the system is derived to reveal some direction-finding properties such as the global convergence, snapshots, SNR, and the number of arrays. Third, a successive DOA refinement procedure with iterative-MUSIC algorithm is provided based on the reconstructed arrays and spectrum to meet the requirement of high-precision DOA estimation. The rest of this paper mainly consists of six sections. Section 2 introduces the MUSB mathematical model; Section 3 derives the CRB associated with source DOAs in the presence of the sensor gain and phase coefficient; Section 4 proposes performance analysis of the MUSB system for the single-emitter case; Section 5 gives the algorithm applied in this paper; Section 6 provides simulation and measurement results in different scenarios and Section 7 concludes the paper.

The glossary of notation is listed below: *C*<sup>*k*×*p*</sup> = the space of *k* × *p* complex-valued matrices; *E* = the expectation operator; *A*<sub>*ij*</sub> = the (*i*, *j*) element of a general matrix *A* ∈ *C*<sup>*k*×*p*</sup>; *A*<sup>*T*</sup> = the transpose of *A* ∈ *C*<sup>*k*×*p*</sup>; *A*<sup>*H*</sup> = the conjugate transpose of *A* ∈ *C*<sup>*k*×*p*</sup>; Re(*A*) = the real part of *A* ∈ *C*<sup>*k*×*p*</sup>; Im(*A*) = the imaginary part of *A* ∈ *C*<sup>*k*×*p*</sup>; tr(*A*) = the trace of *A* ∈ *C*<sup>*k*×*k*</sup>; det(*A*) = the determinant of *A* ∈ *C*<sup>*k*×*k*</sup>; *A* ⊙ *B* = the Schur–Hadamard matrix product of *A*, *B* ∈ *C*<sup>*k*×*p*</sup>, defined by [*A* ⊙ *B*]<sub>*ij*</sub> = *A*<sub>*ij*</sub>*B*<sub>*ij*</sub>; *A* ⊗ *B* = the Kronecker matrix product of *A*, *B* ∈ *C*<sup>*k*×*p*</sup>, defined by

$$A \otimes B = \begin{bmatrix} a\_{11}B & \cdots & a\_{1j}B \\ \vdots & \ddots & \vdots \\ a\_{i1}B & \cdots & a\_{ij}B \end{bmatrix} \tag{1}$$

*z* ∼ *cN*(μ(α), ζ(α)) = the complex Gaussian distribution of the complex random vector *z* with mean μ and variance ζ, and α is a real-valued parameter vector that completely and uniquely specifies the distribution of z (see [42]).
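For readers who want to check the two matrix products in the glossary, the following short example evaluates them with NumPy on small matrices. The matrices `A` and `B` are arbitrary illustrative values.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

hadamard = A * B            # Schur-Hadamard product: entry (i, j) is A[i, j] * B[i, j]
kronecker = np.kron(A, B)   # Kronecker product: block (i, j) is A[i, j] * B

print(hadamard)
print(kronecker)
```

The Hadamard product keeps the 2 × 2 shape, whereas the Kronecker product of two 2 × 2 matrices is 4 × 4, matching the block structure of Equation (1).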

#### **2. Problem Formulation**

#### *2.1. Swarming UAV Synthetic Aperture*

A swarming UAV synthetic aperture was presented in our earlier paper [43]. Figure 1 shows a graphical representation of a UAV swarm as it morphs in time (indexed by the iteration *i* = 1, 2, ···, *I* in this paper). Each of the *M* agents in the swarm has a location, orientation, and trajectory. Notionally, these have position *P*<sub>*m*,*i*</sub>(*r*, θ, φ), where *m* is the agent's index and *i* is the iteration. During swarming, the agents undergo rotations and translations, and a dual quaternion framework provides a convenient mechanism to handle this behavior. This motion rotates the agents' local (*u*, *v*, *w*) coordinate systems, which describe the spatial orientation of their antenna radiation patterns with respect to (w.r.t.) the global coordinate system and the direction to the incoming signal of interest *S*(θ<sub>*n*</sub>, φ<sub>*n*</sub>), which is the *n*-th source. The collection of these measurements over the iterations creates a synthetic aperture that can be used to calculate the parameters of interest (θ<sub>*n*</sub>, φ<sub>*n*</sub>). Notionally, *K* is the number of independent data samples collected by each agent in each iteration; each such sample is usually called a "snapshot".

**Figure 1.** Morphing micro-UAV swarm-based (MUSB) antenna array configuration.

#### *2.2. Signal Model*

Friedlander and Weiss presented a signal model in the presence of sensor mutual coupling, gain, and phase uncertainties [44]. We neglect the mutual coupling effect in the signal model since the element spacing of the aperiodic array reconstructed from swarming UAVs is expected to be much larger than a wavelength. Consider an arbitrary array of *M* elements that receives *N* uncorrelated incident signals in the far field, as described in Section 2.1. The received signal for the *i*-th iteration can then be represented as

$$X\_i(k) = \Gamma\_i \widetilde{A}\_i S\_i(k) + W\_i(k), \quad k = 1, 2, \cdots, K; \quad i = 1, 2, \cdots, I \tag{2}$$

where

$$\begin{aligned} X\_i(k) &= [X\_{1,1}(k), \cdots, X\_{1,M}(k), \cdots, X\_{i,1}(k), \cdots, X\_{i,M}(k)]^T, \\ S\_i(k) &= [S\_{1,1}(k), \cdots, S\_{1,N}(k), \cdots, S\_{i,1}(k), \cdots, S\_{i,N}(k)]^T, \\ W\_i(k) &= [W\_{1,1}(k), \cdots, W\_{1,M}(k), \cdots, W\_{i,1}(k), \cdots, W\_{i,M}(k)]^T, \\ \Gamma\_i &= \mathrm{diag}\left(g\_{1,1}e^{-j\omega\_0\psi\_{1,1}}, \cdots, g\_{1,M}e^{-j\omega\_0\psi\_{1,M}}, \cdots, g\_{i,1}e^{-j\omega\_0\psi\_{i,1}}, \cdots, g\_{i,M}e^{-j\omega\_0\psi\_{i,M}}\right), \\ \widetilde{A}\_{i,mn} &= e^{-j\omega\_0\tau\_{i,mn}}, \end{aligned}$$

and *m* = 1, 2, ···, *M*; *n* = 1, 2, ···, *N*; *i* = 1, 2, ···, *I*. Then,

$$X(k) = \sum\_{i=1}^{I} X\_i(k) = \sum\_{i=1}^{I} \left[ \Gamma\_i \widetilde{A}\_i S\_i(k) + W\_i(k) \right] \tag{3}$$

therefore,

$$S(k) = \sum\_{i=1}^{I} S\_i(k), \quad W(k) = \sum\_{i=1}^{I} W\_i(k), \quad \Gamma = \sum\_{i=1}^{I} \Gamma\_i, \quad \widetilde{A} = \sum\_{i=1}^{I} \widetilde{A}\_i.$$

Note that the sensor gain *g*<sub>*i*,*m*</sub> and phase ψ<sub>*i*,*m*</sub> change w.r.t. the element location based on the orientation of the UAV; τ<sub>*i*,*mn*</sub> changes w.r.t. location; *S*<sub>*i*</sub> is constant; *W*<sub>*i*</sub> may change w.r.t. the location and velocity of the UAV and the environment.

Since the sources considered here are in the far field of the observing array, τ<sub>*i*,*mn*</sub> can be represented by τ<sub>*i*,*mn*</sub> = −*d*<sub>*i*,*mn*</sub>/*c*, with

$$d\_{i,mn} = x\_{i,m} \sin \theta\_n \cos \phi\_n + y\_{i,m} \sin \theta\_n \sin \phi\_n + z\_{i,m} \cos \theta\_n \tag{4}$$

where *d*<sub>*i*,*mn*</sub> is the distance from the origin (reference sensor) of the coordinate system to the *m*-th sensor in the direction of the *n*-th source for the *i*-th iteration, *c* is the propagation velocity in free space, (*x*<sub>*i*,*m*</sub>, *y*<sub>*i*,*m*</sub>, *z*<sub>*i*,*m*</sub>) are the coordinates of the *m*-th sensor for the *i*-th iteration, and (θ<sub>*n*</sub>, φ<sub>*n*</sub>) are the DOAs of the *n*-th source in spherical coordinates. From Equation (4), the entries of the matrix *Ã* are given by

$$\widetilde{A}\_{i,mn} = e^{j(\omega\_0/c)\left(x\_{i,m}\sin\theta\_n\cos\phi\_n + y\_{i,m}\sin\theta\_n\sin\phi\_n + z\_{i,m}\cos\theta\_n\right)} = e^{j(2\pi/\lambda)\left(x\_{i,m}\sin\theta\_n\cos\phi\_n + y\_{i,m}\sin\theta\_n\sin\phi\_n + z\_{i,m}\cos\theta\_n\right)} \tag{5}$$

where λ is wavelength.
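As an illustration of the signal model in Equations (2), (4) and (5), the sketch below builds the steering matrix for one iteration and draws noisy snapshots. It is a sketch under assumptions rather than the authors' simulator: positions are expressed in wavelengths, the sources and noise are unit-variance and σ²-variance circular complex Gaussian sequences, and the array `phases` stands for the products ω<sub>0</sub>ψ<sub>*i*,*m*</sub>. All function and variable names are hypothetical.

```python
import numpy as np

def steering_matrix(pos, thetas, phis, wavelength=1.0):
    """Entries exp(j*2*pi/lambda * d_mn) with d_mn from Eq. (4); `pos` is (M, 3)."""
    u = np.stack([np.sin(thetas) * np.cos(phis),
                  np.sin(thetas) * np.sin(phis),
                  np.cos(thetas)])                     # (3, N) unit vectors toward the sources
    return np.exp(2j * np.pi / wavelength * (pos @ u))  # (M, N)

def simulate_snapshots(pos, thetas, phis, gains, phases, K, sigma2, rng):
    """Draw K snapshots X = Gamma @ A_tilde @ S + W for one iteration, as in Eq. (2)."""
    M, N = pos.shape[0], len(thetas)
    gamma = np.diag(gains * np.exp(-1j * phases))      # sensor gain and phase terms
    A_tilde = steering_matrix(pos, thetas, phis)
    S = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)
    W = np.sqrt(sigma2 / 2) * (rng.standard_normal((M, K))
                               + 1j * rng.standard_normal((M, K)))
    return gamma @ A_tilde @ S + W

rng = np.random.default_rng(1)
# Reference sensor at the origin plus five randomly placed agents (illustrative geometry).
pos = np.vstack([np.zeros(3), rng.uniform(-3.0, 3.0, size=(5, 3))])
thetas = np.deg2rad([30.0, 60.0])
phis = np.deg2rad([45.0, 200.0])
X = simulate_snapshots(pos, thetas, phis, np.ones(6), np.zeros(6),
                       K=100, sigma2=0.1, rng=rng)
print(X.shape)
```

Because τ<sub>*i*,*mn*</sub> = −*d*<sub>*i*,*mn*</sub>/*c*, the exponent in the steering matrix is positive, and the row belonging to the reference sensor at the origin is identically one.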

#### **3. The CRB**

#### *3.1. Swarming UAV Synthetic Aperture*

In theory, for a static array, the steering vector *Ã* is considered invariant over different snapshots since the array geometry is invariant. Assume *z* ∼ *cN*(μ(α), ζ(α)); then, the general formula for the (*m*, *n*)-th element of the Fisher information matrix (FIM), whose inverse bounds the covariance matrix of any unbiased estimate of α, is:

$$F\_{mn} = \mathrm{tr}\left[\zeta^{-1}(\alpha)\frac{\partial\zeta(\alpha)}{\partial \alpha\_m}\zeta^{-1}(\alpha)\frac{\partial\zeta(\alpha)}{\partial \alpha\_n}\right] + 2\mathrm{Re}\left\{\left[\frac{\partial\mu(\alpha)}{\partial \alpha\_m}\right]^H\zeta^{-1}(\alpha)\frac{\partial\mu(\alpha)}{\partial \alpha\_n}\right\} \tag{6}$$

where α*<sup>m</sup>* denotes the *m-*th component of α. The general formula has been presented in [45] and proved in [46].

Petre and Nehorai presented the deterministic and stochastic CRB in [46]. For the deterministic CRB, the parameters, mean, and variance of the complex distribution are given by α = [θ, {Re[*s*[*k*]], Im[*s*[*k*]]}<sub>*k*=1</sub><sup>*K*</sup>, σ<sup>2</sup>], μ(α) = [*As*[*k*]]<sub>*k*=1</sub><sup>*K*</sup>, and ζ(α) = block-diag(σ<sup>2</sup>*I*). Then, the (*m*, *n*)-th element of the FIM is

$$F\_{mn} = \frac{KM}{\sigma^4} \frac{\partial \sigma^2}{\partial \alpha\_m} \frac{\partial \sigma^2}{\partial \alpha\_n} + \frac{2}{\sigma^2} \mathrm{Re}\left\{ \sum\_{k=1}^K \left[ \frac{\partial}{\partial \alpha\_m} A s[k] \right]^H \left[ \frac{\partial}{\partial \alpha\_n} A s[k] \right] \right\} \tag{7}$$

For the stochastic CRB, the parameters, mean, and variance of the complex distribution are given by α = [θ, {Re{*P*<sub>*mn*</sub>}, Im{*P*<sub>*mn*</sub>}}<sub>*m*,*n*=1</sub><sup>*K*</sup>, σ<sup>2</sup>], μ(α) = 0, and ζ(α) = block-diag(*R*). Then, the (*m*, *n*)-th element of the FIM is

$$F\_{mn} = K\, \mathrm{tr}\left[ R^{-1}(\alpha) \frac{\partial R(\alpha)}{\partial \alpha\_m} R^{-1}(\alpha) \frac{\partial R(\alpha)}{\partial \alpha\_n} \right] \tag{8}$$

Since the signals to be estimated cannot be known completely and practical signals are stochastic, this paper considers the stochastic CRB.

#### *3.2. CRB for the UAV Swarming System*

Before deriving the CRB, we assume that both the incident signals and the noise are stationary, ergodic complex Gaussian random processes with zero mean and nonsingular covariance matrices, and that they are uncorrelated with each other. The columns of *A* = Γ*Ã* are assumed to be linearly independent. An additional assumption is that the number of array elements reconstructed from swarming UAVs is greater than the number of sources. Therefore, the steering matrix has full column rank.

The covariance matrices of signal, noise and observation vectors for *i-*th iteration are given by

$$\begin{array}{c} P\_i = E\left[S\_i S\_i^H\right], \quad \sigma\_i^2 I\_0 = E\left[W\_i W\_i^H\right], \\ R\_i = E\left[X\_i(k) X\_i^H(k)\right] = \Gamma\_i \widetilde{A}\_i P\_i \widetilde{A}\_i^H \Gamma\_i^H + \sigma\_i^2 I\_0 = A\_i P\_i A\_i^H + \sigma\_i^2 I\_0 \end{array} \tag{9}$$

where *Ã*<sub>*i*</sub> is the steering matrix for the *i*-th iteration and *A*<sub>*i*</sub> ≜ Γ<sub>*i*</sub>*Ã*<sub>*i*</sub>. It is useful to observe that if we define the sample data covariance matrices as

$$\hat{R}\_i = \frac{1}{K}\sum\_{k=1}^{K} X\_i(k) X\_i^H(k), \qquad \hat{R} = \frac{1}{I}\sum\_{i=1}^{I} \hat{R}\_i,$$

then lim<sub>*K*→∞</sub> *R̂*<sub>*i*</sub> = *R*<sub>*i*</sub> and lim<sub>*I*→∞</sub> *R̂* = *R*.

The log-likelihood function of *K* independent samples in *i-*th iteration of a zero-mean complex Gaussian random process *Xi*(*k*) whose statistics depend on a parameter vector α is given by

$$\begin{split} L\_i(\alpha) &= -K \ln[\det(\pi R\_i)] - \sum\_{k=1}^{K} x\_i^H(k) R\_i^{-1} x\_i(k) \\ &= Z - K \ln[\det(R\_i)] - \sum\_{k=1}^{K} x\_i^H(k) R\_i^{-1} x\_i(k) \end{split} \tag{10}$$

where *Z* denotes the constant term of the log-likelihood function, det(R) represents the determinant of the matrix R, and *Ri* is the time-varying covariance matrix w.r.t. the iteration. Thus, the log-likelihood function of *X*(*k*) is:

$$\begin{split} L(\alpha) &= -K \sum\_{i=1}^{I} \ln[\det(\pi R\_i)] - \sum\_{i=1}^{I} \sum\_{k=1}^{K} x\_i^H(k) R\_i^{-1} x\_i(k) \\ &= \sum\_{i=1}^{I} \left\{ Z - K \ln[\det(R\_i)] - \sum\_{k=1}^{K} x\_i^H(k) R\_i^{-1} x\_i(k) \right\} \\ &= \sum\_{i=1}^{I} L\_i(\alpha) \end{split} \tag{11}$$

Therefore, the (*m*, *n*)-th element of the FIM is given by

$$\begin{split} F\_{mn} &= -E\left[\frac{\partial^2 L(\alpha)}{\partial \alpha\_m \partial \alpha\_n}\right] = -\sum\_{i=1}^{I} E\left[\frac{\partial^2 L\_i(\alpha)}{\partial \alpha\_m \partial \alpha\_n}\right] \\ &= \sum\_{i=1}^{I} \left\{ K\, \mathrm{tr}\left[ R\_i^{-1}(\alpha) \frac{\partial R\_i(\alpha)}{\partial \alpha\_m} R\_i^{-1}(\alpha) \frac{\partial R\_i(\alpha)}{\partial \alpha\_n} \right] \right\} = \sum\_{i=1}^{I} F\_{i,mn} \end{split} \tag{12}$$

It follows that the FIM submatrix *F*<sub>*mn*</sub> for the UAV swarming data collection system can be obtained by summing the single-iteration submatrices *F*<sub>*i*,*mn*</sub> over the iterations. Furthermore, *F*<sub>*i*,*mn*</sub> can be obtained by multiplying the single-snapshot *F*<sub>*i*,*mn*</sub> by the number of snapshots. Thus, we only need to know the single-snapshot *F*<sub>*i*,*mn*</sub> for the *i*-th iteration. The problem of major interest is the estimation of the incident angles of the sources. An expression for the CRB of the 1D DOA for each iteration of the present problem is given in [47]. The CRB of 2D DOA estimation with an arbitrary array for the *i*-th iteration, which can be regarded as an arbitrary static array, is given in Appendix A.
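The additivity of the FIM over iterations and snapshots can be illustrated numerically. The sketch below is a simplified example, not the closed-form CRB of Appendix A: it assumes a single source, unit sensor gains, and known source power and noise variance, so that only (θ, φ) are unknown, and it approximates the derivatives of *R*<sub>*i*</sub> by central differences. The resulting bound is therefore more optimistic than one that also treats gains, phases, or powers as unknown. All names and parameter values are hypothetical.

```python
import numpy as np

def covariance(pos, alpha, P=1.0, sigma2=0.1):
    """R_i = P a a^H + sigma^2 I for a single source at alpha = (theta, phi)."""
    theta, phi = alpha
    u = np.array([np.sin(theta) * np.cos(phi),
                  np.sin(theta) * np.sin(phi),
                  np.cos(theta)])
    a = np.exp(2j * np.pi * (pos @ u))        # positions in wavelengths, unit gains
    return P * np.outer(a, a.conj()) + sigma2 * np.eye(len(a))

def fim_iteration(pos, alpha, K, h=1e-6):
    """F_i[m, n] = K tr(R^-1 dR/da_m R^-1 dR/da_n), derivatives by central differences."""
    alpha = np.asarray(alpha, dtype=float)
    R_inv = np.linalg.inv(covariance(pos, alpha))
    dR = []
    for m in range(alpha.size):
        step = np.zeros_like(alpha)
        step[m] = h
        dR.append((covariance(pos, alpha + step)
                   - covariance(pos, alpha - step)) / (2 * h))
    F = np.empty((alpha.size, alpha.size))
    for m in range(alpha.size):
        for n in range(alpha.size):
            F[m, n] = K * np.real(np.trace(R_inv @ dR[m] @ R_inv @ dR[n]))
    return F

def crb(alpha, n_iterations, M=6, K=20, seed=0):
    """Diagonal of the inverse of the FIM summed over randomly drawn swarm geometries."""
    rng = np.random.default_rng(seed)
    F = np.zeros((2, 2))
    for _ in range(n_iterations):
        pos = rng.uniform(-2.0, 2.0, size=(M, 3))   # illustrative geometry per iteration
        F += fim_iteration(pos, alpha, K)           # sum over iterations, as in Eq. (12)
    return np.diag(np.linalg.inv(F))

alpha = np.deg2rad([40.0, 120.0])
print(crb(alpha, 1), crb(alpha, 5))
```

Because each additional iteration adds a positive semidefinite term to the FIM, the bound on both angles can only shrink as iterations accumulate, which is the behaviour the summation in Equation (12) predicts.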

#### **4. Analysis of Single-Emitter Case**

In this section, we investigate the MUSB data collection system in more detail for the single-emitter case. The unknown parameters considered here are the 2D DOAs (θ, φ). Assume that the source variance is *P*, the noise variance is σ<sup>2</sup>, the number of snapshots per iteration is *K*, and the number of iterations is *I*. From Appendix A, we have the following formula for the FIM w.r.t. the 2D DOAs in the *i*-th iteration of the MUSB system.

$$F\_{i, \varphi, 2} = \begin{bmatrix} F\_{i,\theta\theta} & F\_{i,\theta\phi} \\ F\_{i,\phi\theta} & F\_{i,\phi\phi} \end{bmatrix} = \frac{2K}{\sigma^2} \left\{ \mathrm{Re}\left[D\_i^H A\_i^{\perp} D\_i \odot 1\_2 1\_2^T \otimes U\_i\right] \right\}\tag{13}$$

The *m*th element of the steering vector for the *i-*th iteration is given by

$$a\_{i,m}(\theta,\phi) = \gamma\_{i,m} \cdot \exp\left[j\frac{2\pi}{\lambda}\left(x\_{i,m}\sin\theta\cos\phi + y\_{i,m}\sin\theta\sin\phi + z\_{i,m}\cos\theta\right)\right]\tag{14}$$

where $\gamma\_{i,m}$ is the gain and phase parameter, which can be represented as $\gamma\_{i,m} = g\_{i,m} e^{-j\omega\_0 \psi\_{i,m}}$. Taking the derivative of $a\_i(\theta, \phi)$ w.r.t. θ and φ for the *i-*th iteration, we obtain

$$d\_{i, \theta} = \frac{\partial}{\partial\theta} a\_i(\theta, \phi) = j \frac{2\pi}{\lambda} b\_i \odot a\_i(\theta, \phi), \quad d\_{i, \phi} = \frac{\partial}{\partial\phi} a\_i(\theta, \phi) = j \frac{2\pi}{\lambda} q\_i \odot a\_i(\theta, \phi) \tag{15}$$

where $b\_i$ and $q\_i$ are the vectors whose *m-*th elements are $b\_{i,m} = x\_{i,m}\cos\theta\cos\phi + y\_{i,m}\cos\theta\sin\phi - z\_{i,m}\sin\theta$ and $q\_{i,m} = -x\_{i,m}\sin\theta\sin\phi + y\_{i,m}\sin\theta\cos\phi$. Note that

$$A\_i^H A\_i = \sum\_{m=1}^{M} g\_{i,m}^2 = G\_{i,M} \tag{16}$$

Therefore, it is straightforward to verify that

$$D\_i^H(\theta)D\_i(\theta) = \left(j\frac{2\pi}{\lambda}b\_i \odot A\_i\right)^H\left(j\frac{2\pi}{\lambda}b\_i \odot A\_i\right) = \frac{4\pi^2}{\lambda^2}\sum\_{m=1}^M b\_{i,m}^2 g\_{i,m}^2\tag{17}$$

$$D\_i^H(\phi)D\_i(\phi) = \left(j\frac{2\pi}{\lambda}q\_i \odot A\_i\right)^H\left(j\frac{2\pi}{\lambda}q\_i \odot A\_i\right) = \frac{4\pi^2}{\lambda^2}\sum\_{m=1}^M q\_{i,m}^2 g\_{i,m}^2 \tag{18}$$

$$D\_i^H(\theta)D\_i(\phi) = \left(j\frac{2\pi}{\lambda}b\_i \odot A\_i\right)^H\left(j\frac{2\pi}{\lambda}q\_i \odot A\_i\right) = \frac{4\pi^2}{\lambda^2}\sum\_{m=1}^M b\_{i,m}q\_{i,m}g\_{i,m}^2 = D\_i^H(\phi)D\_i(\theta) \tag{19}$$

Substituting Equations (14), (15), and (17), we can obtain

$$\begin{split} D\_{i}^{H}(\theta)A\_{i}^{\perp}D\_{i}(\theta) &= D\_{i}^{H}(\theta) \left( I\_{0} - A\_{i} \left( A\_{i}^{H}A\_{i} \right)^{-1} A\_{i}^{H} \right) D\_{i}(\theta) = D\_{i}^{H}(\theta)D\_{i}(\theta) - \frac{D\_{i}^{H}(\theta)A\_{i}A\_{i}^{H}D\_{i}(\theta)}{G\_{i,M}} \\ &= \frac{4\pi^{2}}{\lambda^{2}} \sum\_{m=1}^{M} b\_{i,m}^{2} g\_{i,m}^{2} - \frac{4\pi^{2}}{\lambda^{2}} \frac{1}{G\_{i,M}} \left( \sum\_{m=1}^{M} b\_{i,m} g\_{i,m}^{2} \right)^{2} = \frac{4\pi^{2}}{\lambda^{2}} \left\{ \sum\_{m=1}^{M} b\_{i,m}^{2} g\_{i,m}^{2} - \frac{1}{G\_{i,M}} \left( \sum\_{m=1}^{M} b\_{i,m} g\_{i,m}^{2} \right)^{2} \right\} \end{split} \tag{20}$$

Using the same derivation procedure, we can obtain

$$D\_i^H(\phi)A\_i^\perp D\_i(\phi) = \frac{4\pi^2}{\lambda^2} \left\{ \sum\_{m=1}^M q\_{i,m}^2 g\_{i,m}^2 - \frac{1}{G\_{i,M}} \left( \sum\_{m=1}^M q\_{i,m} g\_{i,m}^2 \right)^2 \right\} \tag{21}$$

$$D\_i^H(\theta)A\_i^\perp D\_i(\phi) = \frac{4\pi^2}{\lambda^2} \left\{ \sum\_{m=1}^M b\_{i,m} q\_{i,m} g\_{i,m}^2 - \frac{1}{G\_{i,M}} \sum\_{m=1}^M b\_{i,m} g\_{i,m}^2 \sum\_{m=1}^M q\_{i,m} g\_{i,m}^2 \right\} = D\_i^H(\phi) A\_i^\perp D\_i(\theta) \tag{22}$$

Furthermore,

$$U\_i = P A\_i^H R\_i^{-1} A\_i P = P^2 A\_i^H \left( A\_i P A\_i^H + \sigma^2 I\_0 \right)^{-1} A\_i = P^2 G\_{i,M} \left( P G\_{i,M} + \sigma^2 \right)^{-1} = \frac{P^2 G\_{i,M}}{P G\_{i,M} + \sigma^2}\tag{23}$$
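As a check on Equation (23), the following worked step is a sketch under the single-emitter assumption, taking $R\_i = P A\_i A\_i^H + \sigma^2 I\_0$ with $A\_i$ a single column and $A\_i^H A\_i = G\_{i,M}$ from Equation (16). Applying the Sherman–Morrison identity to the rank-one update gives

$$\left( \sigma^2 I\_0 + P A\_i A\_i^H \right)^{-1} = \frac{1}{\sigma^2} \left( I\_0 - \frac{P A\_i A\_i^H}{\sigma^2 + P G\_{i,M}} \right) \;\Rightarrow\; A\_i^H R\_i^{-1} A\_i = \frac{1}{\sigma^2} \left( G\_{i,M} - \frac{P G\_{i,M}^2}{\sigma^2 + P G\_{i,M}} \right) = \frac{G\_{i,M}}{P G\_{i,M} + \sigma^2}$$

so that $P A\_i^H R\_i^{-1} A\_i P = P^2 G\_{i,M} / (P G\_{i,M} + \sigma^2)$, in agreement with the last expression in Equation (23).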

Therefore,

$$F\_{i,\theta\theta} = \frac{8K\pi^2}{\lambda^2 \sigma^2} \frac{P^2 G\_{i,M}^2}{P G\_{i,M} + \sigma^2} \left\{ \frac{1}{G\_{i,M}} \sum\_{m=1}^M b\_{i,m}^2 g\_{i,m}^2 - \left( \frac{1}{G\_{i,M}} \sum\_{m=1}^M b\_{i,m} g\_{i,m}^2 \right)^2 \right\} \tag{24}$$

$$F\_{i, \phi \phi} = \frac{8K\pi^2}{\lambda^2 \sigma^2} \frac{P^2 G\_{i,M}^2}{P G\_{i,M} + \sigma^2} \left\{ \frac{1}{G\_{i,M}} \sum\_{m=1}^M q\_{i,m}^2 g\_{i,m}^2 - \left( \frac{1}{G\_{i,M}} \sum\_{m=1}^M q\_{i,m} g\_{i,m}^2 \right)^2 \right\} \tag{25}$$

$$F\_{i,\theta\phi} = \frac{8K\pi^2}{\lambda^2 \sigma^2} \frac{P^2 G\_{i,M}^2}{P G\_{i,M} + \sigma^2} \left\{ \frac{1}{G\_{i,M}} \sum\_{m=1}^M b\_{i,m} q\_{i,m} g\_{i,m}^2 - \frac{1}{G\_{i,M}^2} \sum\_{m=1}^M b\_{i,m} g\_{i,m}^2 \sum\_{m=1}^M q\_{i,m} g\_{i,m}^2 \right\} = F\_{i,\phi\theta} \tag{26}$$

Define

$$\begin{gathered} B\_i = \frac{1}{G\_{i,M}} \sum\_{m=1}^{M} b\_{i,m}^2 g\_{i,m}^2 - \left(\frac{1}{G\_{i,M}} \sum\_{m=1}^{M} b\_{i,m} g\_{i,m}^2\right)^2, \\ Q\_i = \frac{1}{G\_{i,M}} \sum\_{m=1}^{M} q\_{i,m}^2 g\_{i,m}^2 - \left(\frac{1}{G\_{i,M}} \sum\_{m=1}^{M} q\_{i,m} g\_{i,m}^2\right)^2 \end{gathered} \tag{27}$$

$$\begin{gathered} V\_i = \frac{1}{G\_{i,M}} \sum\_{m=1}^{M} b\_{i,m} q\_{i,m} g\_{i,m}^2 - \frac{1}{G\_{i,M}^2} \sum\_{m=1}^{M} b\_{i,m} g\_{i,m}^2 \sum\_{m=1}^{M} q\_{i,m} g\_{i,m}^2, \\ C\_i = \frac{G\_{i,M}^2 SNR^2}{G\_{i,M} SNR + 1} \end{gathered} \tag{28}$$

where *SNR* = *P*/σ<sup>2</sup>. Thus, summing over the iterations, we obtain the CRB

$$\text{CRB} = \begin{bmatrix} F\_{\theta\theta} & F\_{\theta\phi} \\ F\_{\phi\theta} & F\_{\phi\phi} \end{bmatrix}^{-1} \tag{29}$$

where $F\_{\theta\theta} = \frac{8K\pi^2}{\lambda^2} \sum\_{i=1}^{I} C\_i B\_i$, $F\_{\phi\phi} = \frac{8K\pi^2}{\lambda^2} \sum\_{i=1}^{I} C\_i Q\_i$, and $F\_{\theta\phi} = \frac{8K\pi^2}{\lambda^2} \sum\_{i=1}^{I} C\_i V\_i = F\_{\phi\theta}$. If we ignore the sensor orientation (i.e., Γ = 1) and let *K* = 1, then $A\_i^H A\_i = M$ and the FIM w.r.t. the 1D DOA is given by

$$F\_{\theta} = \frac{8\pi^2}{\lambda^2} \frac{M^2 SNR^2}{M \cdot SNR + 1} B \tag{30}$$

where $B = \sum\_{i=1}^{I} B\_i$ and $B\_i = \frac{1}{M} \sum\_{m=1}^{M} b\_{i,m}^2 - \left( \frac{1}{M} \sum\_{m=1}^{M} b\_{i,m} \right)^2$, which coincides with the results in [47].
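The closed-form single-emitter bound can be evaluated numerically. The following is a minimal sketch, assuming UAV positions and sensor gains are known for every iteration; the function name `single_emitter_crb` and its argument layout are hypothetical and only mirror the symbols of Equations (24)–(29).

```python
import math

def single_emitter_crb(positions, gains, theta, phi, wavelength, snr, K=1):
    """Closed-form 2D-DOA CRB for one emitter, following Equations (24)-(29).

    positions[i][m] = (x, y, z) of UAV m in iteration i (same unit as wavelength)
    gains[i][m]     = sensor gain g_{i,m}
    theta, phi      = elevation and azimuth in radians
    snr             = P / sigma^2 on a linear scale
    Returns the 2x2 CRB matrix [[theta, cross], [cross, phi]] in rad^2.
    """
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    f_tt = f_pp = f_tp = 0.0
    for pos_i, g_i in zip(positions, gains):
        w = [g * g for g in g_i]                       # g_{i,m}^2
        G = sum(w)                                     # G_{i,M}, Equation (16)
        b = [x * ct * cp + y * ct * sp - z * st for x, y, z in pos_i]
        q = [-x * st * sp + y * st * cp for x, y, z in pos_i]
        mean_b = sum(bm * wm for bm, wm in zip(b, w)) / G
        mean_q = sum(qm * wm for qm, wm in zip(q, w)) / G
        B = sum(bm * bm * wm for bm, wm in zip(b, w)) / G - mean_b ** 2
        Q = sum(qm * qm * wm for qm, wm in zip(q, w)) / G - mean_q ** 2
        V = sum(bm * qm * wm for bm, qm, wm in zip(b, q, w)) / G - mean_b * mean_q
        C = G * G * snr * snr / (G * snr + 1.0)        # Equation (28)
        f_tt += C * B
        f_pp += C * Q
        f_tp += C * V
    scale = 8.0 * K * math.pi ** 2 / wavelength ** 2
    f_tt, f_pp, f_tp = scale * f_tt, scale * f_pp, scale * f_tp
    det = f_tt * f_pp - f_tp * f_tp                    # assumes a non-singular FIM
    return [[f_pp / det, -f_tp / det], [-f_tp / det, f_tt / det]]
```

Because the FIM is linear in *K*, doubling the number of snapshots halves every entry of the returned matrix, which gives a quick sanity check for any implementation.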

#### **5. The Algorithm**

#### *5.1. UAV Parameters*

Assume there are *M* swarming UAVs and each UAV swarms in a cylindrical region (radius *r*, height *h* = 2*r*). Each UAV has an initial location (*x*, *y*, *z*) in the swarming region and a velocity vector $\vec{V}\_{i,m}$ in each iteration; the initial locations of the *M* UAVs are considered as the first iteration. The short distance swarmed in each iteration is represented by a vector $\vec{d}\_{i,m}$. Its magnitude, denoted *d*, is given by the following relation:

$$d = \beta\_{t} \cdot \lambda / p \tag{31}$$

where $\beta\_t$ is a matrix of uniformly distributed random numbers between zero and one, λ is the wavelength, and *p* is the coefficient determining the distribution mean of the swarm distance. Figure 2 shows the distribution of the short distance with mean μ = 0.25 wavelength. Here, the range of *d* also depends on the speed of the UAV and the data sampling interval, and can be configured by the user. We can obtain

$$\vec{d}\_{i,m} = \vec{V}\_{i,m} \cdot t \tag{32}$$

**Figure 2.** Distribution of short distance that one unmanned aerial vehicle (UAV) swarms.
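The relation in Equation (31) can be sampled directly. The sketch below is illustrative only: the function name `swarm_step_lengths` is hypothetical, and the value *p* = 2 is an inference rather than a stated setting (for a uniform random number on [0, 1), the mean of *d* is λ/(2*p*), so a mean of 0.25 wavelength would correspond to *p* = 2).

```python
import random

def swarm_step_lengths(count, wavelength, p, rng):
    """Draw `count` step lengths d = beta * wavelength / p with beta ~ U[0, 1)."""
    return [rng.random() * wavelength / p for _ in range(count)]

rng = random.Random(0)  # fixed seed only to make the sketch repeatable
steps = swarm_step_lengths(20000, wavelength=1.0, p=2.0, rng=rng)
mean_step = sum(steps) / len(steps)  # expected to be close to 0.25 wavelength
```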

#### *5.2. Data Processing and Algorithm*

When the UAVs swarm, a significant amount of data may be sampled, depending on the number of swarming UAVs and the number of iterations. Herein, the location of each UAV is considered as one data point. Thus, when *M* UAVs morph *I* times, we have *M* × *I* data points, which are used to reconstruct virtual (i.e., synthetic) 3D aperiodic arrays to calculate the MUSIC spectrum and estimate DOAs. The number of data points (denoted *Nd*) is equivalent to the number of elements of a static array.

One challenge is how to choose *Nd* for computing the MUSIC spectrum at each signal processing iteration (denoted p-iteration). A larger *Nd* in each p-iteration is desirable, since it corresponds to using more array elements in a static antenna array and therefore gives more accurate DOA estimates. However, more data points incur a higher computational cost. Thus, it is necessary to trade off *Nd* at each p-iteration against computational complexity.

Another problem is how to choose the data collecting and processing methods. A wide range of methods can be deployed to collect, concatenate, and process data from time-dependent measurements. The MUSIC algorithm requires a minimum of three unique samples from these measurements to provide a DOA estimate. The required samples can be obtained from agents in the swarm (e.g., one UAV locally collecting three samples at different times, or three spatially distributed UAVs each collecting one sample simultaneously). We need to consider the spatiotemporal distribution of samples (such as sampling rate w.r.t. wavelength, trajectory and velocity of UAVs, orientation of sensors, etc.) and the iterative processing of measurements (using all data collected, truncating, or applying a moving window, etc.). The iterative processing method with a moving window used in this paper is called iterative-MUSIC.

Figure 3 shows the data processing schematic for three data points in each iteration within the MUSB system. The left part of Figure 3 shows a UAV swarm and the sampling method, and the right part shows the algorithm and data processing method. When a UAV swarms to a certain location, we sample *K* times, and each snapshot takes time *dt*. After taking *K* snapshots, the program sets up a data point. When *Nd* = 3, the program computes the standard MUSIC spectrum using three data points and stores the result. When the UAV swarms again, we combine the current data point with the two previous data points to calculate the MUSIC spectrum. We then multiply the current and previous MUSIC spectra at each p-iteration to obtain the iterative-MUSIC spectrum, which reduces the noise level and improves DOA estimation performance. Note that if the number of p-iterations is too large, the values of the spectral points become very small and might be rounded to zero. In that case, DOAs cannot be estimated, and the spectrum may be expressed in dB instead of as a linear value. Finally, we refine and estimate the DOAs and use predefined DOA estimation precision criteria to stop the process.

**Figure 3.** Data processing schematic for MUSB system.
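The moving window and the spectrum multiplication described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: spectra are plain lists of positive values on a common angle grid, the helper names are hypothetical, and the product is accumulated as a sum in dB, which is one way to avoid the underflow mentioned in the text.

```python
import math

def to_db(spectrum, floor=1e-300):
    """Convert a linear spectrum (list of positive floats) to dB."""
    return [10.0 * math.log10(max(p, floor)) for p in spectrum]

def iterative_music_db(spectra):
    """Accumulate the product of successive spectra as a running sum in dB."""
    acc = None
    for spectrum in spectra:
        db = to_db(spectrum)
        acc = db if acc is None else [a + d for a, d in zip(acc, db)]
    return acc

def moving_windows(data_points, nd=3):
    """Return the windows of the nd most recent data points, one per p-iteration."""
    return [data_points[end - nd:end] for end in range(nd, len(data_points) + 1)]
```

With 200 identical spectra whose values are around 1e-3, the linear product underflows to zero in double precision, while the dB sum stays finite and keeps the peak location.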

#### *5.3. MUSIC Algorithm for MUSB Array*

As stated in Section 5.2, the number of elements of the reconstructed MUSB array is *Nd*. Here, we rewrite the covariance of signals given in part 2 of Section 3 and give the first p-iteration (slightly different from a UAV swarming iteration) of the iterative-MUSIC algorithm.

Rewriting the data model (1), we obtain

$$X(k) = \Gamma \tilde{A} \cdot S(k) + W(k) = A \cdot S(k) + W(k), \qquad k = 1, 2, \cdots, K \tag{33}$$

where $X(k) \in \mathbb{C}^{N\_d \times 1}$ are the vectors of sampled data, $S(k) \in \mathbb{C}^{N \times 1}$ are the source signals, $\Gamma \in \mathbb{C}^{N\_d \times N\_d}$ contains the gain and phase of the sensors, and $\tilde{A} \in \mathbb{C}^{N\_d \times N}$ contains the regular steering vectors. Thus, the covariance of *X*(*k*) is

$$R = E \left[ X(k) X^H(k) \right] = \Gamma \tilde{A} P \tilde{A}^H \Gamma^H + \sigma^2 I\_0 = A P A^H + \sigma^2 I\_0 \tag{34}$$

Define

$$R\_{s} = APA^H \tag{35}$$

$R\_s$ is an *Nd* × *Nd* matrix with rank *N*. Therefore, *R* has *Nd* − *N* eigenvectors corresponding to the repeated minimum eigenvalue σ<sup>2</sup>. Let $e\_i$ be such an eigenvector, so that $R\_s e\_i = 0$ or, equivalently, $A^H e\_i = 0$. Thus, the *Nd* − *N* eigenvectors $e\_i$ corresponding to the minimum eigenvalue are orthogonal to each of the *N* signal columns of $A = \Gamma \tilde{A}$, as proved in [5]. The (*Nd* − *N*)-dimensional subspace spanned by the noise eigenvectors is defined as the noise subspace, and the *N*-dimensional subspace spanned by the incident signal mode vectors is defined as the signal subspace.

Let $Q\_n$ be the *Nd* × (*Nd* − *N*) matrix of noise eigenvectors; then the MUSIC spatial spectrum function is given by

$$P\_{\rm MU}(\theta,\phi) = \frac{1}{\tilde{a}(\theta,\phi)^H \Gamma^H Q\_n Q\_n^H \Gamma \tilde{a}(\theta,\phi)} = \frac{1}{\left\| Q\_n^H \Gamma \tilde{a}(\theta,\phi) \right\|^2} = \frac{1}{\left\| Q\_n^H a(\theta,\phi) \right\|^2} \tag{36}$$

Then, we search for the spectrum peaks over the range of θ and φ; the peak locations are the estimates of the arrival angles of the incident waves.
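The spectrum search of Equation (36) can be illustrated for the single-emitter case. The sketch below makes several simplifying assumptions that are not taken from the paper: unit sensor gains (Γ equal to the identity), exactly one source, and a power iteration in place of a full eigendecomposition, so that the noise-subspace projector is obtained as the identity minus the outer product of the signal eigenvector. All function names are hypothetical.

```python
import cmath
import math

def steering(positions, theta, phi, wavelength):
    """Unit-gain steering vector (Equation (14) with gamma = 1)."""
    k = 2.0 * math.pi / wavelength
    ux = math.sin(theta) * math.cos(phi)
    uy = math.sin(theta) * math.sin(phi)
    uz = math.cos(theta)
    return [cmath.exp(1j * k * (x * ux + y * uy + z * uz)) for x, y, z in positions]

def principal_eigvec(R, iters=200):
    """Dominant eigenvector of a Hermitian matrix R (nested lists) by power iteration."""
    n = len(R)
    v = [complex(1.0, 0.1 * i) for i in range(n)]
    for _ in range(iters):
        w = [sum(R[r][c] * v[c] for c in range(n)) for r in range(n)]
        norm = math.sqrt(sum(abs(t) ** 2 for t in w))
        v = [t / norm for t in w]
    return v

def music_spectrum(R, positions, wavelength, thetas, phis):
    """Single-source MUSIC spectrum on a grid, using Qn Qn^H = I - u u^H."""
    u = principal_eigvec(R)
    spec = {}
    for th in thetas:
        for ph in phis:
            a = steering(positions, th, ph, wavelength)
            proj = abs(sum(ui.conjugate() * ai for ui, ai in zip(u, a))) ** 2
            denom = sum(abs(ai) ** 2 for ai in a) - proj   # ||Qn^H a||^2
            spec[(th, ph)] = 1.0 / max(denom, 1e-12)       # clamp to avoid division by zero
    return spec
```

A usage example would build a covariance matrix from snapshots, call `music_spectrum` on a grid of angles in radians, and take `max(spec, key=spec.get)` as the DOA estimate. For more than one source, a full eigendecomposition of the covariance matrix would be needed instead of the power iteration.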

#### *5.4. Convergence Check*

The algorithm iterates until the system converges. Convergence can be guaranteed since the estimated DOAs form a convergent sequence.

When the signal is buried in a high noise level, the estimated DOAs might be far from the ground truth and convergence cannot be judged. However, as the number of iterations increases, the noise level is reduced and the estimated DOAs converge gradually. The criterion for judging convergence is given by

$$DOA\_{i+1} - DOA\_i \le \varepsilon, \qquad \qquad i = 1, 2, \dots, I\_{\text{m}} \tag{37}$$

where *Im* is the number of p-iterations of the iterative-MUSIC algorithm and ε is the preset threshold. The numerical simulation is given in Section 6.1.
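A stopping test in the spirit of Equation (37) might look like the sketch below. Two details are assumptions added for illustration and are not stated in the text: the difference is taken in absolute value, and the azimuth difference is wrapped so that 359.9° and 0.1° count as close. The function name `has_converged` is hypothetical.

```python
def has_converged(doa_history, eps):
    """Return True if the last two (theta, phi) estimates, in degrees, differ by at most eps."""
    if len(doa_history) < 2:
        return False
    (t_prev, p_prev), (t_new, p_new) = doa_history[-2], doa_history[-1]
    d_theta = abs(t_new - t_prev)
    d_phi = abs(p_new - p_prev) % 360.0
    d_phi = min(d_phi, 360.0 - d_phi)   # wrap-around on the azimuth circle (assumption)
    return d_theta <= eps and d_phi <= eps
```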

#### *5.5. Computational Complexity Analysis*

The orders of computational complexity of conventional spectral MUSIC [5], iterative-MUSIC, and the eigenstructure-based algorithm with array interpolation (denoted array interpolation) [48] are compared in Table 1. In this table, we assume that the number of interpolated elements in the array interpolation technique is equal to the number of actual sensors ($M\_1 = N\_d$), and the total number of angular sectors is denoted $I\_{\theta}I\_{\phi}$. The methods in Table 1 include the eigendecomposition step, represented by the term $O(N\_d^2 N)$, and the computation of $J\_{\theta}J\_{\phi}$ samples of the MUSIC null-spectrum function, represented by $O(J\_{\theta}J\_{\phi}(N\_d + 1)(N\_d - N))$, where $J\_{\theta}$ and $J\_{\phi}$ stand for the numbers of search points along the directions of θ and φ. $\bar{J}\_{\theta}\bar{J}\_{\phi}$ stands for the total number of search points in each sector of the array interpolation algorithm. The iterative-MUSIC at each iteration may use very few snapshots (*K* = 1), unlike the traditional spectral MUSIC algorithm, which requires many snapshots. The computational complexity of the iterative-MUSIC algorithm is slightly lower than that of array interpolation at each iteration ($J\_{\theta}J\_{\phi} \approx I\_{\theta}I\_{\phi}\bar{J}\_{\theta}\bar{J}\_{\phi} \gg N\_d$), but it is higher over the whole UAV swarming period. However, the array interpolation algorithm becomes very complex when applied to 3D random arrays.

**Table 1.** The orders of computational complexities of real-valued operations.


#### **6. Simulation and Measurement Results**

In this section, several groups of simulations are carried out to demonstrate the performance of the distributed direction-finding system presented in this paper. As the framework of the MUSB system is introduced here for the first time, we focus mainly on analyzing the impact of various factors on the feasibility of the MUSB system. Practical measurements in the lab are also given in Section 6.3 to show the DOA estimation performance in practice.

The wavelength of the signal is fixed at 1 meter (m), and the simulation in each scenario is repeated 500 times. The elevation angle of the source emitter is 85° and the azimuth angle is 270°. Once one UAV has swarmed until *Nd* = 3, the iterative-MUSIC algorithm begins to search the (0°, 179°) space for the elevation angle and the (0°, 359°) space for the azimuth angle with a 1° interval to form the overcomplete MUSIC spectrum for each p-iteration. Then, the UAV keeps swarming and the iterative-MUSIC algorithm keeps computing the MUSIC spectrum until the required precision of DOA estimation is satisfied. When the preset threshold is satisfied, the UAV stops swarming and the reconstruction of phased arrays is terminated. The refined DOA estimates are obtained by scanning the reconstructed signal peaks from the iterative-MUSIC algorithm with a 0.1° step, with the refinement procedure performed 10 times.

Moreover, the speed of the UAV influences the number of snapshots at each location where the system samples the source emitter. The number of snapshots at each location must be very low if the UAV swarms very fast, and can be high when the swarming rate is low. In this paper, the speed of the UAV is represented by the distance between two swarming iterations, shown in Figure 2.

The joint root-mean-square error (RMSE) of incident signals is used for statistical DOA estimation precision evaluation, which is defined as

$$RMSE\_{\theta,\phi} = \sqrt{\sum\_{w=1}^{W} \sum\_{n=1}^{N} \left[ \left( \hat{\theta}\_n^w - \theta\_n^w \right)^2 + \left( \hat{\phi}\_n^w - \phi\_n^w \right)^2 \right] / (2WN)} \tag{38}$$

where *W* is the number of Monte Carlo simulations, $(\theta\_n^w, \phi\_n^w)$ represent the actual DOAs of the *n-*th signal, and $(\hat{\theta}\_n^w, \hat{\phi}\_n^w)$ represent the estimated DOAs of the *n-*th signal in the *w-*th simulation.
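Equation (38) translates directly into code. In the sketch below, the function name `joint_rmse` and the nested-list layout (one list per Monte Carlo run, one (θ, φ) pair per signal) are illustrative choices rather than part of the paper.

```python
import math

def joint_rmse(estimated, actual):
    """Joint RMSE of Equation (38).

    estimated[w][n] = (theta_hat, phi_hat) and actual[w][n] = (theta, phi)
    for Monte Carlo run w and signal n, all in the same angular unit.
    """
    W = len(estimated)
    N = len(estimated[0])
    total = 0.0
    for est_run, act_run in zip(estimated, actual):
        for (t_hat, p_hat), (t, p) in zip(est_run, act_run):
            total += (t_hat - t) ** 2 + (p_hat - p) ** 2
    return math.sqrt(total / (2 * W * N))
```

For example, two runs with one source at (85°, 270°) and estimates (86°, 271°) and (84°, 269°) give a squared-error sum of 4 and a divisor 2*WN* = 4, hence a joint RMSE of 1°.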

First, the system convergence will be studied in a typical scenario. Second, the DOA estimation performance using the iterative-MUSIC algorithm will be compared with CRB in various scenarios. Finally, DOA estimation performance in practice will be investigated.

#### *6.1. System Convergence*

Figure 4 provides an example of the MUSB distributed direction-finding system gradually converging to the ground truth as the number of iterations increases when using the iterative-MUSIC algorithm.

**Figure 4.** Direction-of-arrival (DOA) estimation convergence for the MUSB system. (**a**) *Nd* = 3, SNR = 0 dB, *K* = 1. (**b**) *Nd* = 3, SNR = 0 dB, *K* = 16.

#### *6.2. DOA Estimation Performance*

The performance of DOA estimation depends on multiple factors such as SNR, *K*, *Nd*, the velocity of the UAV, and the number of iterations. Furthermore, the performance also depends on the distinctness of the array geometries, due to the diversity of observations at different time instants. In this section, we only consider the single-emitter case. A typical scenario is set in which the mean of the UAV swarming short distance is μ = 2λ; the number of Monte Carlo simulations and the angles of the source are the same as before. Figure 5 depicts the results predicted by the stochastic CRB derived in Section 4. The joint RMSE of the elevation and azimuth angles obtained from the iterative-MUSIC algorithm, together with the CRB from Section 4, is shown in Figures 6 and 7.

The extreme case is *K* = 1, *Nd* = 3 (2D DOA estimation requires a minimum of three unique samples). The fixed and varied settings are as follows: Figure 5a varies the number of snapshots *K* from 1 to 96 for different SNRs with iteration *i* = 100, *Nd* = 3; Figure 5b varies SNR from −30 to 0 dB for different numbers of UAV iterations with *K* = 1, *Nd* = 3; Figure 5c varies SNR from −30 to 0 dB for different numbers of data points *Nd* in each iteration with *K* = 1, *i* = 100; Figure 5d varies SNR for different UAV speeds with *K* = 1, *Nd* = 3, *i* = 100; Figure 6a varies the SNR of one signal from −20 to 20 dB with *K* = 1, *Nd* = 3; Figure 6b varies *K* from 1 to 99 with SNR = 0 dB, *Nd* = 3; Figure 6c varies the UAV speed with *K* = 1, *Nd* = 3, SNR = 0 dB; Figure 6d varies the incident elevation angle from 5° to 90° with SNR = 0 dB, *Nd* = 3, *K* = 1; Figure 7a varies the sensor gain deviation with SNR = 0 dB, *Nd* = 3, *K* = 1; Figure 7b varies the sensor phase deviation with SNR = 0 dB, *Nd* = 3, *K* = 1. Scenarios (a)–(d) in Figure 5 give the lower bounds of the MUSB system. Scenarios (a)–(d) in Figure 6 show the performance of the MUSB array without element rotation (sensor gain *g* = 1 and sensor phase ψ = 0), and scenarios (a)–(b) in Figure 7 show the impact of element rotation of the MUSB array (sensor gain and phase coefficients have certain deviations).

Figures 6 and 7 lead to the following observations. Figure 6a shows that the DOA estimation performance of the MUSB system improves with increasing SNR; Figure 6b shows that the system can estimate DOAs even when the number of snapshots is *K* = 1; Figure 6c shows that the precision of DOA estimation increases as the UAV speed increases; Figure 6d shows that the precision of DOA estimation increases with increasing elevation angle. Figure 7 shows the performance of the MUSB array when UAV rotation causes the sensor gain and phase to vary. Figure 7a shows that the DOA estimation is not significantly affected as the standard deviation of the sensor gain increases; Figure 7b shows that the precision of DOA estimation decreases significantly once the standard deviation of the sensor phase increases beyond a certain value.

In Figure 7b, we find that the stochastic CRB is flat for the single-emitter case as the sensor phase standard deviation varies, because Equation (16) in Section 4 cancels the sensor phase errors by multiplying the steering vector *A* by its conjugate transpose, whereas the iterative-MUSIC algorithm cannot exploit this cancellation. Equation (39) shows this advantage of the CRB for the single-emitter case, and Equations (16)–(21) give the derivation in which the sensor phase errors cancel.

$$A\_i^H A\_i = \left[ g\_{i,1} e^{-j\varphi\_{i,1}}, \cdots, g\_{i,M} e^{-j\varphi\_{i,M}} \right] \begin{bmatrix} g\_{i,1} e^{j\varphi\_{i,1}} \\ \vdots \\ g\_{i,M} e^{j\varphi\_{i,M}} \end{bmatrix} = \sum\_{m=1}^{M} g\_{i,m}^2 \tag{39}$$

where $\varphi\_{i,m}$ includes the sensor phase deviation and the phase difference from sensor *m* to the relative phase center in the *i-*th iteration.
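The cancellation in Equation (39) is easy to confirm numerically. The sketch below, with the hypothetical helper `steering_gram`, builds a single steering vector from gains and arbitrary phases and shows that the product of the vector with its conjugate transpose depends only on the gains.

```python
import cmath

def steering_gram(gains, phases):
    """Return A^H A for one steering vector with elements g_m * exp(j * phase_m)."""
    a = [g * cmath.exp(1j * ph) for g, ph in zip(gains, phases)]
    return sum((am.conjugate() * am).real for am in a)

gains = [1.0, 0.8, 1.2]
no_error = steering_gram(gains, [0.0, 0.0, 0.0])
with_error = steering_gram(gains, [0.3, -1.1, 2.5])  # arbitrary phase errors in radians
```

Both values equal the sum of the squared gains up to rounding, which is why the single-emitter CRB does not react to phase deviations.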

**Figure 5.** DOA standard deviation predicted by the stochastic Cramér–Rao bound (CRB). (**a**) Varying snapshots with different signal-to-noise ratio (SNR); (**b**) varying SNR with different iterations of UAV swarming; (**c**) varying SNR with different array element number; (**d**) varying SNR with different average speed.

**Figure 6.** DOA estimation root mean square error (RMSE) of iterative-MUSIC (multiple signal classification) and CRB in different scenarios. (**a**) Varying SNR (*Nd* = 3); (**b**) varying *K*; (**c**) varying the speed of UAV w.r.t. wavelength; (**d**) varying incident elevation angles.

**Figure 7.** DOA estimation RMSE of iterative-MUSIC and CRB in different scenarios. (**a**) Varying sensor gain standard deviation; (**b**) varying sensor phase standard deviation.

#### *6.3. Measurement*

In the experiment, a test fixture provides a convenient platform to study this morphing in time (using sixteen elements). Randomly positioned rectangular patch antennas designed for 2.45 GHz are used, with a randomly morphing volume provided by a moving platform named "Medusa", to estimate DOAs. Figure 8 shows that one source in the far field transmits and the receiving sensors receive the signal, with a vector network analyzer (VNA) measuring the *S21* between the transmitting and receiving antennas. Figure 9 shows the measured MUSIC spectrum with 16 spatially distributed elements forming a static volumetric random array in Medusa with element rotation (random rotation of 0–45 degrees around the x, y, and z axes, no UAV swarming). Figure 10 shows that the measured azimuth and elevation errors gradually converge from around 10–20 degrees to around 1–2 degrees as the number of iterations increases with UAV swarming. In Figure 10, we take the extreme case (*Nd* = 3, *K* = 1).

**Figure 8.** Test diagram. (**a**) Schematic diagram; (**b**) practical test diagram.

**Figure 9.** Measured MUSIC spectrum with iteration *i* = 1, *K* = 1, *Nd* = 16, and an incident signal of azimuth 356.3° and elevation 18°. (**a**) 2D MUSIC spectrum; (**b**) 3D MUSIC spectrum.

**Figure 10.** Measured errors compared to iterations with *Nd* = 3, *K* = 1. (**a**) Element without rotation; (**b**) element with rotation (element rotates 0–45 degrees around the x, y, and z axes randomly to simulate UAV flying).

#### **7. Conclusions**

This paper establishes a MUSB data collection framework for the first time, which makes it possible to realize source estimation with few snapshots in a low-SNR environment during a UAV swarming period. Theoretical and experimental results are given to reveal the performance of the MUSB phased array system used for 2D DOA estimation, which supports the feasibility of the system. The iterative-MUSIC algorithm is applied within the framework, and it can estimate the DOAs efficiently with only one snapshot in each iteration when the UAV swarms very fast. The UAV speed controls the structure of the reconstructed aperiodic phased arrays from the MUSB system, and the DOA estimation precision increases when the distance between two iterations of the swarming UAV increases. The impact of known sensor gain errors and phase errors from the UAV rotation on the DOA estimation performance is also investigated. Practical experimental results match the theoretical expectations for the MUSB system using the iterative-MUSIC algorithm. Our results will benefit future research on performance analysis and optimal design of time-varying antenna arrays based on UAV swarms. Extending the results to the case where position errors are present is an interesting direction for future work.

**Author Contributions:** Z.C. and G.H.H. proposed the main idea; Z.C., G.H.H., S.Y., G.-F.C. conceived and designed the simulations; Z.C. wrote the paper.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**

According to [49], the stochastic FIM $F\_{\theta}$ w.r.t. the 1D incident angle θ of a static array is given by

$$F\_{\theta} = \text{CRB}^{-1} = \frac{2K}{\sigma^2} \left\{ \text{Re} \left[ H \odot U \right] \right\} \tag{A1}$$

where $H = D^H \left[ I\_0 - A \left( A^H A \right)^{-1} A^H \right] D$, $D = [d\_1, d\_2, \cdots, d\_N]$, $d\_n = \left. \frac{d a(\theta)}{d\theta} \right|\_{\theta=\theta\_n}$, and $U = P A^H R^{-1} A P$. For each iteration, the FIM $F\_{i,\theta}$ for the problem presented in this paper is given by

$$F\_{i,\theta} = \text{CRB}\_{i,\theta}^{-1} = \frac{2K}{\sigma^2} \left\{ \text{Re} \left[ H\_i \odot U\_i \right] \right\} \tag{A2}$$

where $H\_i = D\_i^H \left[ I\_0 - A\_i \left( A\_i^H A\_i \right)^{-1} A\_i^H \right] D\_i$, $D\_i = [d\_{i,1}, d\_{i,2}, \cdots, d\_{i,N}]$, $d\_{i,n} = \left. \frac{d a\_i(\theta)}{d\theta} \right|\_{\theta=\theta\_n}$, and $U\_i = P\_i A\_i^H R\_i^{-1} A\_i P\_i$. As already stated,

$$F\_{\theta} = \sum\_{i=1}^{I} F\_{i,\theta} \tag{A3}$$

Omitting the parameters Re($P\_{mn}$), Im($P\_{mn}$), σ<sup>2</sup>, and the gain and phase of the signals, we consider only the estimation of the elevation and azimuth angles $(\theta\_n, \phi\_n)$. The parameter vector is then $\alpha = \left[ \theta^T, \phi^T \right]^T$, with $\theta = [\theta\_1, \theta\_2, \cdots, \theta\_N]^T$ and $\phi = [\phi\_1, \phi\_2, \cdots, \phi\_N]^T$. Thus, the submatrix of the FIM associated with the 2D DOAs is

$$F\_{i,s,2} = \begin{bmatrix} F\_{i,\theta\theta} & F\_{i,\theta\phi} \\ F\_{i,\phi\theta} & F\_{i,\phi\phi} \end{bmatrix} = \frac{2K}{\sigma^2} \left\{ \text{Re} \left[ \left( D\_i^H A\_i^\perp D\_i \right) \odot \left( \mathbf{1}\_2 \mathbf{1}\_2^T \otimes U\_i \right) \right] \right\}\tag{A4}$$

$$A\_i^\perp = I\_0 - A\_i \left(A\_i^H A\_i\right)^{-1} A\_i^H \tag{A5}$$

where "*i*, *s*, 2" denotes the 2D stochastic bound for the *i-*th iteration, $\mathbf{1}\_2$ represents the 2 × 1 vector of ones, and

$$\begin{array}{l} F\_{i,\theta\theta} = \frac{2K}{\sigma^2} \left\{ \text{Re} \left[ \left( D\_i^H(\theta) A\_i^\perp D\_i(\theta) \right) \odot U\_i \right] \right\} \\ F\_{i,\theta\phi} = \frac{2K}{\sigma^2} \left\{ \text{Re} \left[ \left( D\_i^H(\theta) A\_i^\perp D\_i(\phi) \right) \odot U\_i \right] \right\} \end{array} \tag{A6}$$

$$\begin{array}{l} F\_{i,\phi\theta} = \frac{2K}{\sigma^2} \left\{ \text{Re} \left[ \left( D\_i^H(\phi) A\_i^\perp D\_i(\theta) \right) \odot U\_i \right] \right\} \\ F\_{i,\phi\phi} = \frac{2K}{\sigma^2} \left\{ \text{Re} \left[ \left( D\_i^H(\phi) A\_i^\perp D\_i(\phi) \right) \odot U\_i \right] \right\} \end{array} \tag{A7}$$

where $A\_i(\theta, \phi) = [a\_i(\theta\_1, \phi\_1), a\_i(\theta\_2, \phi\_2), \cdots, a\_i(\theta\_N, \phi\_N)]$. Thus, the system stochastic FIM is given by

$$F\_{s,2} = \sum\_{i=1}^{I} F\_{i,s,2} = \begin{bmatrix} F\_{\theta\theta} & F\_{\theta\phi} \\ F\_{\phi\theta} & F\_{\phi\phi} \end{bmatrix} \tag{A8}$$

where $F\_{\theta\theta} = \sum\_{i=1}^{I} F\_{i,\theta\theta}$, $F\_{\theta\phi} = \sum\_{i=1}^{I} F\_{i,\theta\phi}$, $F\_{\phi\theta} = \sum\_{i=1}^{I} F\_{i,\phi\theta}$, and $F\_{\phi\phi} = \sum\_{i=1}^{I} F\_{i,\phi\phi}$.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Energy-Aware Management in Multi-UAV Deployments: Modelling and Strategies**

#### **Victor Sanchez-Aguero 1,2,\*, Francisco Valera 2, Ivan Vidal 2, Christian Tipantuña 3,4 and Xavier Hesselbach <sup>3</sup>**


Received: 30 March 2020; Accepted: 12 May 2020; Published: 14 May 2020

**Abstract:** Nowadays, Unmanned Aerial Vehicles (UAV) are frequently present in the civilian environment. However, proper implementations of different solutions based on these aircraft still face important challenges. This article deals with multi-UAV systems forming aerial networks, mainly employed to provide Internet connectivity and different network services to ground users. However, the mission duration (hours) is longer than the limited battery lifetime of the UAVs (minutes). This paper introduces the UAV replacement procedure as a way to guarantee ground users' connectivity over time. This article also formulates the practical UAV replacement problem in moderately large multi-UAV swarms and proves it to be an NP-hard problem in which an optimal solution has exponential complexity. In this regard, the main objective of this article is to evaluate the suitability of heuristic approaches for different scenarios. This paper proposes the betweenness centrality heuristic algorithm (BETA), a graph theory-based heuristic algorithm. BETA not only generates solutions close to the optimal (even with 99% similarity to the exact result) but also improves on two ground-truth solutions, especially in low-resource scenarios.

**Keywords:** UAV; UAV fleet; UAV swarm; energy consumption; self-organization; algorithms; optimization; UAV replacement

#### **1. Introduction**

The unstoppable growth of the Unmanned Aerial Vehicle (UAV) ecosystem (UAVs are commonly known as drones) in recent years has proven to be just the beginning of a near-future global phenomenon. The US Federal Aviation Administration predicts [1] that UAVs providing commercial services will triple over the next five years, and will overtake consumer off-the-shelf UAVs by the year 2024. UAVs will grow eightfold over the next decade and will become the largest segment of the civilian market.

The utilization of multi-UAV systems, because of their rapid deployment, mobility, and flexibility, has recently attracted attention as a means to support or extend the fifth-generation cellular network technology (5G) in extraordinary situations (e.g., mass events, natural disasters, infrastructure failures). 5G networks will certainly bring faster upload and download speeds, in combination with a dramatic decrease in network latency. However, in exceptional or emergency circumstances, the deployment of 5G terrestrial infrastructure may not be economically viable. In addition, the deployment times of these extraordinary on-demand 5G network services should meet the Key Performance Indicators (KPI) [2] defined by the 5G-PPP, which state that new deployments must finish within 90 min. Accordingly, it is here where UAVs are expected to play a crucial role. If properly deployed and configured, UAV networks can provide fast, ubiquitous 5G access (which is also among the 5G KPIs) employing wireless communication solutions in a diversity of real-world scenarios.

5G UAV missions, e.g., to complement existing cellular networks in high-density environments, to deliver network coverage in hard-to-reach rural areas (Remote Access Networks), or to serve IoT scenarios, may require the management of moderately large UAV fleets. UAVs mainly act as aerial communication platforms such as (i) aerial base stations (BS) (to support existing 5G infrastructure under high traffic demand) [3], or (ii) aerial WiFi access points (AP) forming a Flying Ad Hoc Network (FANET) (to create new networks) [4]. UAV research thus aims to extend the 5G network (where it has no range) or support the existing 5G network (when it is not enough) using radio solutions as payload [5], e.g., 5G or LTE microcells, or multi-hop solutions based on commodity WiFi. Compared to terrestrial antennas, aerial units have some advantages: they can change their altitude to avoid obstacles, and there is no geographical restriction on the antenna location. However, these advantages turn into crucial design challenges, such as optimal positioning, limited flight time, or the calculation of optimal trajectories and network planning [6].

In particular, multi-UAV environments may give rise to long-endurance missions that require uninterrupted service provisioning (by performing UAV replacements), which is not achievable using a single UAV due to battery capacity constraints (at most around 20 min of flight [7]). A UAV replacement means that a UAV waiting in the Ground Control Station (GCS) becomes active and goes into the scenario to substitute one of the UAVs that are in service, so as to provide the same functionality. This is only possible with a fleet larger than the number of UAVs that have to be in service at the same time, i.e., there is a reasonable number of fresh UAVs available for replacement.

Nevertheless, developing an appropriate UAV replacement strategy is one of the critical hurdles that have not yet been properly addressed by the research community. The replacement strategy makes it possible to optimize the cost in terms of required aerial infrastructure resources while keeping the provided level of service. To guarantee that long-term (beyond battery life) services can be deployed, some of the UAVs that are at the GCS must be able to successfully replace the UAVs that provide the actual service on the stage whenever necessary (for example, when a UAV has low battery or fails). However, the economic cost of oversizing the fleet is enormous, as these devices commonly have high prices. For this reason, it is necessary to develop a resource optimization mechanism that allows intelligent and autonomous UAV systems to be managed with the lowest possible number of UAVs.

Figure 1 illustrates a representative use case of UAVs delivering network coverage. As can be seen, some UAVs provide connectivity to several end-users. A Controller entity, located in the GCS, is in charge of scheduling when UAV replacements will take place. Once the replacement procedure is started for a certain UAV, that UAV goes directly back to the GCS to change its battery (1), while another one takes its place (2). As soon as the returned UAV has a charged battery installed, it is available again in the replacement pool to substitute other UAVs when required. By following this methodology, an uninterrupted service may be provided. Reducing the UAV fleet while ensuring a reasonable quality of service is not a straightforward procedure (further details about the methodology depicted in Figure 1 can be found in Sections 4.1 and 4.4).

This article presents the practical UAV replacement problem and analyses its complexity. Next, different scenarios, as well as the methodology to evaluate the service performance, are presented. A sub-optimal heuristic algorithm is proposed to model and control a fleet of UAVs that provides Internet connectivity while minimizing the fleet size. This algorithm is validated by comparing it with the optimal solution, obtained with an improved version of the brute-force combinatorial search algorithm developed in our previous work [8]. Finally, the practical limitations of the proposal are analyzed, and possible alternative solutions are considered.

The rest of the article is organized as follows: the related work and background are reviewed in Section 2. Section 3 states the proposed problem and analyses its complexity. Section 4 describes the methodology to operate large multi-UAV fleets. Then, Section 5 details the suggested scenarios and discusses the results obtained from the simulation. Finally, Section 6 concludes the article and outlines some future research lines.

**Figure 1.** Typical Unmanned Aerial Vehicles (UAV) use case using the proposed methodology.

#### **2. Related Work and Background**

Due to the versatility of UAVs, these devices are used in a wide variety of fields. The following section introduces the evolution of management strategies used to overcome battery limitations and algorithmic solutions for UAV battery replacements which are the main topics of this article. It also explains how UAVs may complement 5G networks, and wireless communication solutions in large geographical areas using UAV swarms.

Ubiquitous connectivity is one of the current challenges of 5G networks and beyond [6]. UAVs have appeared as a promising solution to provide reliable and flexible wireless communication services for ground users in a wide variety of scenarios [3]. The usage of UAVs promises to provide cost-effective wireless connectivity for devices without infrastructure coverage. Concretely, UAVs are considered as flying BSs for coverage extension and capacity enhancement of the existing 5G cellular networks. In [9], the authors explore the use of UAV-BSs to provide coverage during natural disasters. In [5], an evolved packet core (EPC) inside a UAV is introduced to orchestrate the LTE RAN in the presence of multiple BSs. This EPC can also interoperate with commercial BSs as well as commodity user equipment. In [10], the authors provide an overview of UAV-aided networks, introducing the underlying architecture and wireless channel characteristics.

One of the most critical design challenges in multi-UAV systems is achieving all-to-all communication between UAVs, which is necessary for cooperation and collaboration [11,12]. If every UAV is connected to existing network infrastructure such as a GCS, a satellite network, or base stations, swarm communications can be delivered via this infrastructure. This type of network scheme avoids some problems associated with UAV ad hoc network alternatives, such as routing protocols or the distributed control of the network. However, it also brings certain limitations, such as expensive equipment (long-range or satellite antennas) and less flexibility, since the deployment is tied to existing infrastructure. An alternative solution is the usage of FANETs. In this type of system, UAVs have several roles: they act not only as functional devices (providing coverage, gathering sensor data, or disseminating video) but also as network relays that connect all UAVs through the UAV network itself. Commonly, only one UAV or a few UAVs (also known as backbone UAVs) are required to be connected to the fixed infrastructure (GCS). The backbone UAV is generally equipped with two radios: (i) a low-power radio (WiFi or Bluetooth, for instance) used for communication between the UAVs and (ii) a high-power long-range radio to communicate with the GCS [13]. Several research works use FANETs to support 5G networks [14]. For instance, Reference [4] extends a 5G network slice for video monitoring with a FANET composed of small low-altitude UAVs with multi-access edge computing (MEC) facilities to allow high-speed transmission.

Although the development of UAV networks is receiving significant attention from the research community, some challenges must be solved before their proper deployment and consolidation. One of them is the limited battery capacity, since a UAV's power source is normally a small battery (this article considers small rotary-wing UAVs, not big fixed-wing UAVs with fuel engines). Consequently, these SUAVs (Small UAVs) are hardware-constrained devices that cannot be too heavy or carry heavy payloads. Besides the power consumption of the flight engines, it is essential to consider the additional energy required by onboard computers, which may not carry their own external batteries; if they did, extra weight would be added to the system. As a consequence, the useful lifetime of a UAV system is clearly limited by these restrictions. Different research works propose solutions to provide uninterrupted service on long-endurance missions and overcome the reduced-battery challenge. For instance, Ref. [15] presents an algorithm to offer continuous structural inspection services using UAVs, evaluated not only through simulation but also through an implementation. In this solution, the authors replace a UAV unit before its battery is drained. The replacement algorithm of [15] is used later in this paper to compare and contextualize the proposed solution. The modification of their methodology (the authors work with a single-UAV system, while the proposed scenarios require moderately large fleets of UAVs) is detailed in Section 5.2.2. In [16], the authors consider UAV replacement (among other possible alternatives, such as refueling [17] or recharging) to maintain total surveillance of an area perimeter. Additionally, some articles propose automatic battery replacement [18–20], offering a GCS capable of swapping UAV batteries without human interaction.
Ground task automation not only reduces human interaction but also increases the multi-UAV system operation area, improving the coverage and enabling operation in hazardous environments. This trend leads us to choose battery replacement as the preferred option in the solution proposed in this paper. Battery price is considerably lower than the cost of a UAV, and the time to replace the battery is remarkably shorter than the time to recharge it. Moreover, we use the results of these studies and their practical experimentation as input for our scheduling algorithms, to improve the accuracy of the design of UAV replacement strategies. Other works address the limited battery life inherent to current SUAVs by proposing different alternatives. In [21], the UAVs land to provide service (when this is possible and safe). The work in [22] summarizes different techniques to prolong the UAV operation time, from battery dumping [23] to photovoltaic arrays [24,25]. Additional techniques have also been proposed, such as wireless charging using lasers [26].

Optimization aimed at improving the restricted communication performance of UAV networks while using the minimum amount of physical resources is also a current topic of discussion in the state of the art. In [27], the effective use of flight-time constrained devices is investigated, maximizing the average data service to ground users following a fair resource allocation policy. The solution of the cooperative allocation problem proposed in [28] significantly improves the performance of several network parameters. In [29], the authors try to minimize the number of vehicle-mounted BSs required to guarantee wireless coverage for a group of distributed ground users. Similar work in [30] proposes a placement algorithm for vehicle-mounted BSs that maximizes the number of covered ground users using the minimum number of UAVs. In [31], the authors investigate the UAV coverage problem and propose a multi-UAV coverage model based on energy-efficient communication. The work in [32,33] focuses on the application of a multi-layout multi-subpopulation genetic algorithm, which achieves significantly better performance than the other meta-heuristic algorithms considered for improving the coverage deployment of multi-UAV networks. An explicit definition of the minimum-energy paths between a predefined initial and final configuration of a quadrotor, obtained by solving an optimal control problem concerning the angular accelerations of the rotors, is detailed in [34]. That solution yields minimum-energy and fixed-energy paths for the aerial vehicle.

#### **3. Problem Statement**

As has just been mentioned, one of the main challenges in multi-UAV systems is to keep all the target geographic areas covered by UAVs over time, since their battery life is limited (minutes) compared to the typical mission timelines (hours). To address this problem, our approach is to use a fleet with the number of UAVs required to cover the whole scenario, and then maintain extra UAVs in a backup pool to serve as replacement units (as can be seen in Figure 2). Once a replacement has been scheduled by the GCS, a UAV with a fresh battery enters the scenario while the replaced UAV goes back home to swap its empty battery, so that it is ready to substitute the next active UAV that requires a replacement. However, identifying the minimum number of necessary (extra) UAVs and scheduling UAV replacements at the appropriate moment (to guarantee a minimum level of service availability) requires a sophisticated approach and is the main problem treated in this paper.

**Figure 2.** Multi-UAV system during a mission for three target areas (*j* = 3) and four UAVs (*i* = 4).

Figure 2 depicts the reference scenario considered in our analysis. In this scenario, different colors are used to represent different geographic areas, which encompass the target areas where UAVs are intended to provide network coverage to end-users, the geographic location where UAVs are directed for a battery replacement, and the specific area where the backup pool of UAVs is kept for subsequent use. These color patterns have been reproduced in Figure 3 to classify not only what task the UAVs are doing at a certain moment but also to show which UAVs are covering the target areas at any given time. The following subsection faces the practical UAV replacement problem from an optimization viewpoint stating a simplified and manageable procedure, checking its complexity, and solving it through different approaches (optimal brute force algorithm, heuristic algorithm).

#### *Complexity Analysis*

In this subsection, we prove that a simplified version of the proposed problem maps to an NP-hard problem (the bin-packing problem [35] in this particular case), which allows us to state its complexity. We denote a UAV using the index *i* and the target areas using the index *j*. Each UAV *i* has battery level $C_{B_i}(t)$ at instant *t* and may be in one of four states: (i) battery replacement state (landed in the GCS), (ii) flying state (towards the GCS, or towards a region where it is intended to provide a network service), (iii) covering a region, or (iv) waiting in the reserve UAVs area to replace an active UAV. These four states can be seen in Figure 3. This diagram represents a hypothetical scenario with three regions (*j* = 3) and four UAVs (*i* = 4), also indicating when the replacements take place to guarantee system availability over time. Note that a region/area indicates where a UAV has to fly. When two UAVs have to be geographically near each other (e.g., because of the number of users or a high volume of traffic), we consider two different areas.

**Figure 3.** Multi-UAV system states during a mission for three target areas (*j* = 3) and four UAVs (*i* = 4).

This problem statement must guarantee each region *j* is always covered by a UAV, i.e.,

$$\sum_{i} x_{i,j}(t) = 1, \quad \forall j, \ \forall t \tag{1}$$

with $x_{i,j}(t) = 1$ whenever UAV *i* covers region *j*. Moreover, at any time *t*, the battery level of UAV *i* has to stay above a safe threshold $a_j$ (e.g., 20%) for each region, so as to ensure the flight back to the GCS:

$$\sum_{j} a_{j} x_{i,j}(t) \le C_{B_i}(t) \, y_{i}(t), \quad \forall i, \ \forall t, \tag{2}$$

where $y_i(t) = 1$ whenever UAV *i* is not in the GCS.

Additionally, the battery level keeps decreasing while a UAV is away from the GCS. Otherwise, we consider that its battery level is set to 100% once it has returned to the GCS and the operator has replaced the battery:

$$C_{B_i}(t+1) = \left(C_{B_i}(t) - c\right) y_i(t) + R_{T_d, T_r} \left(1 - y_i(t)\right), \quad \forall i, \ \forall t, \tag{3}$$

with $c$ the battery consumption, and $R_{T_d,T_r}$ the average battery charge ratio during the time spent in returning to the GCS, $T_d$, and in the operator replacement task, $T_r$.

$T_d$ remains constant in this simplified version of the problem, no matter how far the region *j* that UAV *i* was covering is from the GCS.
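As a minimal sketch of the battery dynamics in (3), the following Python function applies one discrete time step. The function name, the per-step consumption of 1%, and the choice of 100 for the charge ratio are illustrative assumptions, not values taken from the paper.

```python
def next_battery(cb: float, y: int, c: float, r: float) -> float:
    """One discrete-time step of Eq. (3).

    cb: current battery level C_{B_i}(t), in percent
    y:  1 if the UAV is away from the GCS, 0 if it is at the GCS
    c:  battery consumed per time step (assumed constant)
    r:  charge ratio R_{T_d,T_r} applied while the UAV is at the GCS
    """
    return (cb - c) * y + r * (1 - y)


# Hypothetical trace: three steps in service, one step at the GCS
# (battery swapped, level reset to 100), then back in service.
level = 100.0
trace = []
for y in [1, 1, 1, 0, 1]:
    level = next_battery(level, y, c=1.0, r=100.0)
    trace.append(level)
```

With these assumed values the level decreases by one point per step while the UAV is away and is restored when it visits the GCS.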

The main goal of this problem is to minimize the number of active UAVs over time:

$$\min \sum_{i,t} y_i(t). \tag{4}$$

This optimization problem with objective function (4), and constraints (1) and (2), maps to the bin-packing problem. In this simplified problem, the drones are the bins and the areas are the items. The battery levels $C_{B_i}(t)$ are the bin capacities, and the battery threshold $a_j$ of each region *j* is the item's weight. Note that having time-dependent variables corresponds to having each variable repeated once per time instant, i.e., with $t \in \{1, 2\}$, $y_i(t)$ is expressed as two different variables $y_{i,1}$ and $y_{i,2}$. Thus, constraint (2) is just the bin-packing restriction that prevents exceeding bin capacities. Furthermore, constraint (1) imposes that all items (our regions *j*) are fitted inside a bin.

Without considering (3), i.e., keeping only the objective (4) and constraints (1) and (2), we already have an instance of the bin-packing problem. Since this special case of our problem is NP-hard, our reduced problem is NP-hard as well. The next step is therefore to design a heuristic algorithm that provides a sub-optimal solution. At the same time, a methodology is required to evaluate the algorithm.
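To make the mapping concrete, the sketch below checks constraints (1) and (2) and evaluates the objective (4) for a single time instant. The function names and the numerical instance (four UAVs, three regions, 20% thresholds) are hypothetical and only illustrate the structure of the formulation.

```python
from typing import List


def covers_every_region(x: List[List[int]]) -> bool:
    """Constraint (1): each region j is covered by exactly one UAV."""
    n_regions = len(x[0])
    return all(sum(row[j] for row in x) == 1 for j in range(n_regions))


def respects_battery(x: List[List[int]], a: List[float],
                     cb: List[float], y: List[int]) -> bool:
    """Constraint (2): for each UAV i, sum_j a_j * x_ij <= C_Bi * y_i."""
    return all(
        sum(a_j * x_ij for a_j, x_ij in zip(a, row)) <= cb_i * y_i
        for row, cb_i, y_i in zip(x, cb, y)
    )


def active_uavs(y: List[int]) -> int:
    """Objective (4) restricted to one instant: UAVs away from the GCS."""
    return sum(y)


# Hypothetical instance: x[i][j] = 1 if UAV i covers region j.
x = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]]
a = [20.0, 20.0, 20.0]          # safe thresholds per region (percent)
cb = [80.0, 35.0, 60.0, 100.0]  # battery levels (percent)
y = [1, 1, 1, 0]                # UAV 3 waits at the GCS

feasible = covers_every_region(x) and respects_battery(x, a, cb, y)
```

In bin-packing terms, each row of `x` is a bin, each column is an item of weight `a[j]`, and `cb[i] * y[i]` is the capacity of bin *i*.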

#### **4. Methodology**

This section describes the different elements depicted in Figure 1 and explains the steps to be followed by the mission planner to provide uninterrupted network services. It first describes the parameters that UAVs must report to the GCS in order to serve as input for the scheduler algorithms. Then, it details the diverse assumptions taken for system modeling, that enable simulations to evaluate the preliminary proposals. Later, it presents the metrics to assess the performance of the proposed solutions. Finally, it describes the different strategies used in this article to schedule UAVs replacements.

#### *4.1. Reported Parameters*

Current UAV systems regularly report to their control station their location (GPS coordinates if the UAV incorporates this type of navigation) and the remaining battery. However, this knowledge may not be enough to have a holistic view of the UAV network which enables the scheduler algorithm to satisfy the objective function (4) of minimizing the number of required UAVs to provide guaranteed service availability. This is the list of parameters periodically reported by the UAVs to the GCS that enable the calculation of essential inputs for the scheduler algorithms that will be defined later to make the appropriate replacement scheduling:


#### *4.2. Assumptions*

Using a discrete-time model and in order to provide a reasonable implementation of the UAV replacement strategies, it is required to make some assumptions (simplifications):


Although all these assumptions may affect the results of the simulations, the primary purpose of this article is not to achieve accurate results but to verify that it is worth using a replacement scheduler algorithm to manage moderately large UAV fleets. Once this hypothesis is demonstrated, the progressive replacement of each simplification opens interesting future work for the evaluation of more realistic results.

#### *4.3. Performance Metric*

The average number of users connected (over time) is used as the metric to evaluate the performance of UAV replacement strategies. Each sampling period (e.g., 5 s in our simulations), this metric is examined in order to calculate the percentage of end-users connected to the GCS, which is in charge of providing Internet connectivity (i.e., the number of end-users that through a path established across the UAVs network are actually connected to the GCS). The average value of all the partial results during the simulation time will be used as performance metric.
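The metric can be sketched as follows. This is only an illustration under stated assumptions: the network at each sampling instant is given as an adjacency dictionary, users are counted per serving UAV, and the node name `"GCS"` and the helper names are hypothetical.

```python
from collections import deque


def connected_users(links, users, gcs="GCS"):
    """Count users whose serving UAV has a multi-hop path to the GCS."""
    reachable = {gcs}
    queue = deque([gcs])
    while queue:  # breadth-first search starting at the GCS
        node = queue.popleft()
        for nxt in links.get(node, ()):
            if nxt not in reachable:
                reachable.add(nxt)
                queue.append(nxt)
    return sum(n for uav, n in users.items() if uav in reachable)


def average_connectivity(samples):
    """Mean percentage of connected users over all sampling periods."""
    ratios = []
    for links, users in samples:
        total = sum(users.values())
        ratios.append(100.0 * connected_users(links, users) / total)
    return sum(ratios) / len(ratios)


users = {"u1": 4, "u2": 6}
# Sample 1: both UAVs in place; sample 2: u2 is away being replaced.
sample_1 = ({"GCS": ["u1"], "u1": ["GCS", "u2"], "u2": ["u1"]}, users)
sample_2 = ({"GCS": ["u1"], "u1": ["GCS"]}, users)
metric = average_connectivity([sample_1, sample_2])
```

In this toy trace all users are connected in the first sample and only the four users of `u1` in the second, so the metric averages the two partial results.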

#### *4.4. Scheduler Algorithm Proposals*

This subsection outlines the strategies that have been taken into account when performing the simulations. Obtaining the optimal solution and defining a heuristic algorithm are both part of the optimization process. The optimal solution is not expected to be computable in a reasonable time for large scenarios, but it serves to validate the heuristic algorithm in small scenarios before its future application in real environments. A summary of the parameters that describe the proposed UAV replacement strategies is shown in Table 1.

#### 4.4.1. Optimal Algorithm

To find the optimal UAV scheduling strategy that minimizes the number of UAVs used to cover a certain analysis region and a given number of users, a brute-force algorithm has been proposed. This algorithm is an evolution of the strategy that we presented in [8]; it additionally incorporates positioning information, the number of users, and specific parameters related to the displacement between the GCS and the regions to be covered (i.e., landing time, takeoff time, and cruising speed of the UAVs). The UAV scheduling strategy is explained in Algorithm 1, and its operation can be summarized in the following three stages.



a detailed example of the combinatorial analysis of UAVs to cover services is described in Table 4 in [8].

• Analysis of the percentage of battery charge to perform the replacement. With the information from the previous step (UAV allocation per service region), the algorithm analyses the optimal charge level at which each UAV must be replaced. This procedure is carried out by means of an exhaustive exploration of each level of charge for every UAV, and seeks to guarantee the highest service availability time and an efficient use of the available resources (minimization of the number of UAVs for replacements). In a traditional approach, as shown in Figure 4a, replacement is performed when the battery capacity reaches its minimum threshold ($T_B$ = 75 s in the example). Although this procedure allows the full capacity of the battery to be used, the simultaneous discharge of several or all UAVs may cause a greater demand for resources (UAVs) in the subsequent allocations (in the worst case *i* = *j*) and the unavailability of one or several regions if there are no UAVs available for replacements. On the contrary, a desynchronization of the replacement times, as shown in Figure 4b, allows not only a greater availability of services but also a minimization of the number of UAVs in the system. In the example presented in Figure 4a, four UAVs are required (two UAVs in service and two in the reserve) to guarantee a service availability of 100%, whereas in Figure 4b, only three UAVs are necessary to reach the same availability level. Once the algorithm has determined the charge levels for a replacement that allow the maximum service availability and the maximum number of UAVs available for the next allocation, these UAVs are allocated to their corresponding regions. The allocation process continues iteratively (i.e., execution of step two and step three) until reaching the maximum time horizon $T_w$, as shown in the example in Figure 4b with $T_w$ = 100 s.

(**a**) UAV replacement considering all battery consumption

**Figure 4.** Differences between the analysed scheduling procedures. Example for *j* = 2 and *i* = 3 (2 UAVs in services and 1 UAV for replacement).

The proposed optimal strategy is an offline exhaustive search mechanism whose complexity is given by (5)

$$f(i, j, T_B, T_s) = C(i \times j, j) + \left( \left\lceil \frac{T_B}{T_s} \right\rceil \right)^{j} \tag{5}$$

where the first term represents the combinatorial analysis for the allocation of UAVs and the second term corresponds to the analysis of the charge levels for replacements. Both terms in (5) are non-polynomial; the first term is dominant and, according to the Big-O classification [39], the order of growth of the algorithm is $O(C(i \times j, j))$, i.e., non-polynomial. Based on preliminary tests, we can report that if the UAVs have the same characteristics (i.e., equal battery capacity), the analysis in step three (exploration of the charge levels) is only necessary for the first allocation process, because desynchronization is maintained in all other allocations, as shown in Figure 5. While this mechanism can partially reduce the complexity of the algorithm, obtaining an optimal solution by exhaustive search rules out its use in real-time applications and restricts the number of regions and UAVs that can be handled (at most *i* = 12 and *j* = 6). In this regard, this strategy can be used in planning stages to estimate the number of UAVs needed for a mission, such as an emergency or rescue scenario. However, in these cases a suitable alternative is the strategy described in [8], because it is a more generic and less complex approach.
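To give a feeling for how quickly (5) grows, the following sketch evaluates it with Python's standard library. The function name is hypothetical, and reading $T_s$ as a time step (so that $\lceil T_B / T_s \rceil$ is the number of charge levels explored per UAV) is an assumption made here for illustration.

```python
from math import ceil, comb


def search_space(i: int, j: int, t_b: float, t_s: float) -> int:
    """Evaluate Eq. (5): allocation combinations plus charge-level combinations."""
    allocation = comb(i * j, j)            # C(i x j, j)
    charge_levels = ceil(t_b / t_s) ** j   # (ceil(T_B / T_s))^j
    return allocation + charge_levels


# Small case from the Figure 4 setting (i = 3, j = 2, T_B = 75 s),
# with an assumed step T_s = 5 s.
small = search_space(3, 2, 75, 5)

# Largest case reported as tractable (i = 12, j = 6), with the same
# assumed T_B and T_s.
large = search_space(12, 6, 75, 5)
```

Comparing `small` and `large` shows the combinatorial term taking over as the number of UAVs and regions increases.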

**Figure 5.** Proposed scenarios for algorithm performance evaluation.

Therefore, the hardness of the problem analysed in Section 3 (NP-Hard) and the complexity of the optimal solution shown in (5) (exponential) demonstrate the need for less complex heuristic mechanisms that can be used in real-time implementations. These strategies are described in the following sections and represent the major contributions of this paper.

#### 4.4.2. BETA: Betweenness Centrality Heuristic Algorithm

Heuristic algorithms are employed to solve optimization problems that exact algorithms cannot solve in a reasonable time. In this particular case, it is also essential that the heuristic algorithm has a fast execution time because it must run in real time. BETA schedules the replacements based on the relevance of each participant within the network. To determine the relevance of an area in a network scenario, we apply graph theory fundamentals. Each area/UAV (an area is covered by a UAV) corresponds to a graph vertex (also called node), while the links among UAVs correspond to the graph edges (also called links or lines). One of the best-known metrics to identify the most significant vertices in a graph is centrality, more specifically the betweenness centrality, which measures the number of times a vertex lies on the shortest path between two other nodes. However, in the proposed multi-UAV networks, nodes do not communicate with arbitrary other nodes; they communicate with those that have Internet connectivity to the public network (either the GCS or a 5G-enabled UAV), as this provides the ground users with Internet connectivity.

To formulate this custom metric, we have divided the graph into two sub-graphs: (i) sub-graph *A*, which is composed of those UAVs that do not have Internet connectivity, and (ii) sub-graph *B*, which is formed not only by the GCS but also by the UAVs that may eventually have connectivity to the core network. With these definitions, the centrality metric is calculated as follows:

$$g'(v) = \sum_{\substack{s \neq v \\ s \in A \\ t \in B}} \left( \frac{\sigma_{st}(v)}{\sigma_{st}} \, U_s \right), \quad v \in A \tag{6}$$

where $\sigma_{st}(v)$ is the number of shortest paths from UAV *s* to UAV *t* that traverse UAV *v*, and $\sigma_{st}$ is the total number of shortest paths from UAV *s* to UAV *t*. $U_s$ is the number of users connected to UAV *s*. The number of users is crucial, since if there are no users connected to UAV *s*, there is no impact on the network. Expression (6), despite being designed for FANETs, is quite versatile and is also suitable for BS scenarios (where UAVs are directly connected to the core network).
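The metric in (6) can be sketched with two breadth-first searches per source/target pair. This is an illustrative implementation under assumptions, not the authors' code: the graph is an undirected adjacency dictionary with unit-cost links, and the node names and user counts are hypothetical.

```python
from collections import deque


def bfs_counts(adj, source):
    """Return (dist, sigma): hop distance and number of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:            # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:   # u precedes w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma


def beta_centrality(adj, A, B, users):
    """g'(v) for every v in A: user-weighted share of shortest s-t paths through v."""
    info = {n: bfs_counts(adj, n) for n in set(A) | set(B)}
    g = {v: 0.0 for v in A}
    for s in A:
        dist_s, sigma_s = info[s]
        for t in B:
            if t not in dist_s:
                continue                 # s has no path to t
            dist_t, sigma_t = info[t]
            for v in A:
                if v == s or v not in dist_s or v not in dist_t:
                    continue
                if dist_s[v] + dist_t[v] == dist_s[t]:
                    share = sigma_s[v] * sigma_t[v] / sigma_s[t]
                    g[v] += share * users.get(s, 0)
    return g


# Hypothetical FANET: GCS - u1 - u2 - u3, with u4 also attached to u1.
adj = {"GCS": ["u1"], "u1": ["GCS", "u2", "u4"],
       "u2": ["u1", "u3"], "u3": ["u2"], "u4": ["u1"]}
users = {"u1": 2, "u2": 3, "u3": 5, "u4": 1}
g = beta_centrality(adj, ["u1", "u2", "u3", "u4"], ["GCS"], users)
```

In this example `u1` relays the traffic of every other UAV towards the GCS and therefore obtains the highest value, while the leaf UAVs `u3` and `u4` obtain zero.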

A ranking is computed using the $g'(v)$ metric as input. If two UAVs have the same value of $g'(v)$, the one closer to the GCS is placed above the other in the ranking, since this minimizes the total replacement procedure time and the replaced UAV will be available sooner to perform another UAV replacement. With this ranking, the scheduling strategy can be defined. BETA follows the strategy below (see Algorithm 2): (i) if there is any topological change or the ground users move around the scenario, the algorithm must compute the ranking again; (ii) whenever there is a UAV available in the reserve, the algorithm schedules a replacement for the UAV with the least remaining battery. However, this replacement takes place only if it does not affect UAVs that have a higher position in the ranking; a higher-ranked UAV is affected when its remaining lifetime is shorter than the time needed to make a UAV available again (after flying towards the GCS and battery replacement). If this UAV replacement cannot be performed, the same analysis is repeated for the UAV with the next-lowest battery, until the algorithm finds a UAV to replace. For this algorithm to work correctly, it has to be executed periodically. In our case, BETA runs every 5 s, which coincides with the sampling period.
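The ranking and the replacement decision described above can be sketched as follows. This is one possible reading of the rule, written under assumptions: the function names are hypothetical, remaining lifetimes and the turnaround time are expressed in seconds, and the function is assumed to be called only when a reserve UAV is available.

```python
def rank_uavs(g, dist_to_gcs):
    """Highest g'(v) first; ties are broken by the shorter distance to the GCS."""
    return sorted(g, key=lambda v: (-g[v], dist_to_gcs[v]))


def pick_replacement(ranking, remaining, turnaround):
    """Return the UAV to replace now, or None if no replacement is safe.

    ranking:    UAVs ordered from most to least relevant
    remaining:  remaining flight time of each active UAV (seconds)
    turnaround: time until the replaced UAV is available again (seconds)
    """
    position = {v: k for k, v in enumerate(ranking)}
    # Try candidates from the lowest remaining battery upwards.
    for cand in sorted(ranking, key=lambda v: remaining[v]):
        higher = ranking[:position[cand]]
        # Skip the candidate if a higher-ranked UAV would run out of
        # battery before a UAV becomes available again.
        if all(remaining[v] >= turnaround for v in higher):
            return cand
    return None


ranking = rank_uavs({"u1": 9.0, "u2": 5.0, "u3": 0.0, "u4": 0.0},
                    {"u1": 1, "u2": 2, "u3": 3, "u4": 2})
choice_a = pick_replacement(["u1", "u2", "u3"],
                            {"u1": 300, "u2": 400, "u3": 120}, turnaround=200)
choice_b = pick_replacement(["u1", "u2", "u3"],
                            {"u1": 150, "u2": 400, "u3": 120}, turnaround=200)
```

In the second call the low-ranked `u3` is skipped, because replacing it would leave the top-ranked `u1` without a fresh UAV in time, so `u1` is replaced instead.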


#### **5. Simulation Details and Results**

In order to validate these algorithms we have used different scenarios with different properties that will be discussed in this section. The following subsections detail *(i)* the simulation parameters and the justification of their selection, *(ii)* the ground-truth solutions with which the BETA algorithm is also compared, *(iii)* the simulation setup, and finally *(iv)* the simulated scenarios in combination with achieved results.

#### *5.1. Simulation Parameters*

This subsection details the parameters that have been taken into account to carry out the simulations and the selection criteria. This data, together with the related notation, can be seen in Tables 1 and 2.



**Table 2.** Simulation parameters.


The time needed to perform a battery replacement is based on [15]. The battery capacity is based on the Parrot Bebop 2 specifications [7]. This model was chosen because we have performed several tests with it and it is the unit used in our previous technical validations [38,40,41], where it has proven able to carry an onboard single-board computer, such as a Raspberry Pi, for a reasonable time and at a reasonable cost. To calculate the device consumption, we have assumed that the UAV flies for 20 min (as stated in the technical specifications). Regarding the WiFi range, although the standards state that it is quite large, in practice we have found it to be relatively short for an acceptable received signal level [42]. The cruising speed has been calculated based on the maximum speed (also taken from the technical specifications). Takeoff and landing times are based on our own measurements, since we have not found accurate information. The simulations are iterated assuming a fixed number of areas to be served (with one UAV per area) and then increasing the number of replacement UAVs (starting at 0 and increasing until the number of UAVs in reserve equals the number of UAVs in the scenario, which means doubling the size of the fleet).

#### *5.2. Ground-Truth Solutions*

To provide context for the performance of the BETA and optimal algorithms, both will be compared with two alternative solutions of lower complexity. The primary purpose of the article is not to measure how far the heuristic solution is from the optimal one, but to show that this type of solution is worthwhile, and under which conditions and in which scenarios. To that end, the four scheduling techniques (BETA, optimal, baseline, and simple scheduling) will be compared and conclusions drawn.

#### 5.2.1. Baseline

This is the simplest strategy. UAVs are assumed to periodically send their current battery level and GPS position; however, no further calculations are made at the GCS. When an active UAV reaches a minimum battery threshold, i.e., the battery required to return to the GCS plus a safety margin (e.g., 20%), a replacement is scheduled if UAVs are available: the drained UAV flies to the GCS and, at the moment it starts flying back, a fresh UAV takes off and flies to the uncovered target area to provide the service. If no fresh UAVs are available, there will be no service in that area until a UAV is ready to cover it again. The lack of intelligence in this baseline solution prevents it from reaching 100% service provisioning in any case: even with infinite UAVs available as fresh replacements, there will always be a gap without network service, corresponding to the time that passes between the drained UAV leaving the area towards the GCS and the new UAV arriving and starting to operate.

#### 5.2.2. Simple Scheduling

This strategy is inspired by [15]. UAVs are also assumed to send their current battery level and GPS position periodically. However, in this case, the controller is required to estimate a battery threshold (based on battery reports and other parameters such as the UAV speed and the takeoff and landing times) that includes not only the minimum battery needed to return to the GCS but also the time needed for the fresh UAV (if units are available) to reach the target area. That way, the new UAV will start serving the area just as the old one leaves and, predictably, if there are enough UAVs in reserve, all the areas can be covered for the whole mission time (or at least a high percentage of it). If the active UAV reaches the threshold and there is no fresh UAV to perform the replacement, the UAV can continue providing service until it reaches the battery level needed to return to the GCS, thus extending the service time.
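The difference between the two replacement triggers can be illustrated numerically. The following Python snippet is only a minimal sketch: it assumes a battery that drains linearly with time, and the function names and all numeric values (endurance, travel times) are hypothetical and are not taken from Table 2 or from the simulator code.

```python
def baseline_threshold(return_time_s, flight_time_s, safety_margin=0.20):
    """Battery fraction at which the baseline strategy sends a UAV home.

    Assumption (not from the paper): the battery drains linearly over
    flight_time_s, so a trip of return_time_s costs return_time_s / flight_time_s.
    """
    return return_time_s / flight_time_s + safety_margin


def simple_scheduling_threshold(return_time_s, outbound_time_s,
                                flight_time_s, safety_margin=0.20):
    """Threshold that also covers the time the fresh UAV needs to arrive."""
    base = baseline_threshold(return_time_s, flight_time_s, safety_margin)
    return base + outbound_time_s / flight_time_s


# Hypothetical values: 20 min endurance, 90 s per trip (takeoff/landing included).
flight_time = 20 * 60
t_return = 90
t_outbound = 90

base = baseline_threshold(t_return, flight_time)
simple = simple_scheduling_threshold(t_return, t_outbound, flight_time)
# In the baseline, the area is uncovered while the fresh UAV is still travelling.
service_gap_baseline = t_outbound

print(f"baseline threshold: {base:.3f}")
print(f"simple scheduling threshold: {simple:.3f}")
print(f"baseline service gap per replacement: {service_gap_baseline} s")
```

Under these assumptions, the simple scheduling threshold is higher than the baseline one by exactly the share of battery spent while the fresh UAV is on its way, which is what removes the service gap when a replacement is available.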

#### *5.3. Simulation Setup*

All the results in this paper have been obtained with an event-based simulator written in Matlab (R2017b). The BETA, simple scheduling, and baseline solutions were computed for all the scenarios on a computer equipped with a 2.6 GHz Intel Core i5 processor and 8 GB of RAM. The optimal algorithm was run on a computer equipped with a 12-core 3.33 GHz Intel Core i7 Extreme processor and 24 GB of RAM. For readers interested in reproducing the experiment, the code is available in the repository [43].

#### *5.4. Validation Scenarios*

#### 5.4.1. Scenario I: Proof of Concept

To start the analysis, we have defined a basic scenario (shown in Figure 5a) as a proof of concept. In this scenario, there are a total of 6 coverage areas and 300 heterogeneously distributed ground users. In a scenario of such reduced dimensions, the algorithm that provides the optimal solution can be run in a reasonable computation time, so all the alternatives can be compared. Figure 6a depicts the average connected users for the four algorithms when the UAVs act as a FANET, which means that the UAVs carry commodity WiFi equipment onboard and use the resulting UAV WiFi ad hoc network itself to connect to the GCS (which in turn provides the Internet connectivity). Therefore, if one of the UAVs that are geographically closer to the GCS (and hence connect part of the topology to the GCS) runs out of battery and there is no possible UAV replacement, some parts of the network may get disconnected even though the rest of the UAVs may be successfully covering other target areas. Following this logic, whenever the backbone UAV fails, the system gets completely divided. On the other hand, Figure 6b depicts the average connected users for the four algorithms when the UAVs act as BSs (they are directly connected to the public network without the need for a hop-by-hop network like a FANET). These scenarios are usually employed in crowded events where the existing cellular network is operating correctly but may be insufficient. As expected, these results are better than the FANET results, since each UAV is only responsible for its own end users. However, these onboard BS solutions are usually more expensive and not always viable (for instance, when the infrastructure does not exist or is temporarily damaged).
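The different impact of a missing UAV in the FANET and BS cases can be sketched with a small graph search. The Python example below is an illustration only: the chain topology, the user counts, and the function `connected_users` are hypothetical and are not part of the simulator in [43].

```python
from collections import deque


def connected_users(links, users, active, gcs="GCS", mode="FANET"):
    """Count the ground users that currently have service.

    links:  dict node -> list of neighbours (undirected UAV/GCS topology)
    users:  dict UAV -> number of ground users in its coverage area
    active: set of UAVs that are currently covering their area
    """
    if mode == "BS":
        # Each active UAV reaches the public network on its own.
        return sum(users[u] for u in active)
    # FANET: a UAV serves its users only if a multi-hop path of active
    # UAVs connects it to the GCS (breadth-first search from the GCS).
    reached, queue = {gcs}, deque([gcs])
    while queue:
        node = queue.popleft()
        for nxt in links.get(node, []):
            if nxt in active and nxt not in reached:
                reached.add(nxt)
                queue.append(nxt)
    return sum(users[u] for u in reached if u != gcs)


# Hypothetical chain topology: GCS - A - B - C
links = {"GCS": ["A"], "A": ["GCS", "B"], "B": ["A", "C"], "C": ["B"]}
users = {"A": 10, "B": 20, "C": 30}

all_up = connected_users(links, users, {"A", "B", "C"})
backbone_down_fanet = connected_users(links, users, {"B", "C"})
backbone_down_bs = connected_users(links, users, {"B", "C"}, mode="BS")
print(all_up, backbone_down_fanet, backbone_down_bs)  # prints: 60 0 50
```

In this toy chain, losing the UAV closest to the GCS disconnects every user in the FANET case, whereas in the BS case only the users of the missing UAV lose service.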

Figure 6a shows that the BETA and simple scheduling strategies perform similarly and are close to the optimal solution. To reach 100% of connected users with the simple scheduling approach, the UAV fleet must be doubled (12 UAVs), but in any case, in small scenarios the simple scheduling solution is enough to provide an adequate service. On the other hand, the baseline algorithm provides erratic and counterintuitive results, since its performance decreases as the fleet increases. This happens because, although the UAVs spend more time covering the target areas, the network is disconnected for longer, i.e., having more UAVs does not guarantee overall connectivity if the backbone UAV is not working. If there are no UAVs in reserve (the fleet size is equal to the number of target areas), the return and battery replacement processes of all the UAVs in the scenario are almost synchronized. However, if there are some UAVs in reserve, these processes may become unsynchronized. For this reason, the baseline results decrease and, as a consequence, are worse and inconsistent.

The results in Figure 6b (UAVs acting as BSs) are better, as noted above, and again the BETA and simple scheduling strategies are close to the optimal solution. The performance of the baseline solution improves in this case as the size of the fleet increases. In fact, all the strategies stabilize with a fleet of 8 UAVs, two in reserve (fleet 25% oversize), and both BETA and simple scheduling achieve acceptable values.

In scenarios with reduced dimensions this 25% of fleet oversize (having two UAVs in reserve) seems quite reasonable. However, in a scenario with numerous areas, e.g., 25 areas, 50 areas, this oversize may imply a rather expensive operation. It is then important to validate the solutions in much bigger scenarios and see the performance of the algorithms there.

#### 5.4.2. Scenario II: Grid

Figure 5b shows a scenario with 25 coverage areas and 250 ground users homogeneously distributed, i.e., ten ground users per area. We have selected a grid topology which is fail-tolerant since there are multiple alternative paths to reach the GCS from each area. Moreover, all the areas have the same number of users, which makes the difference in the UAV ranking insignificant in the FANET scenario and almost nonexistent in the BSs scenario.

Figure 6c shows that the heuristic strategy performs better than the simple scheduling solution (when the fleet is formed by 30 UAVs, 5 of them in reserve, the results improve by more than 10%). Both strategies reach acceptable levels from a fleet of 35 UAVs. The heuristic solution reaches 100% of connected users with a fleet of 38 UAVs, while the simple scheduling solution, as in the first scenario, needs to double the fleet size to reach 100% of connected users. On the other hand, the baseline solution behaves similarly to the previous scenario. This outcome highlights that if no strategy (however simple) is used to schedule the UAV replacements, the results can be harmful, and even over-dimensioning the resources does not guarantee favorable results.

**Figure 6.** Average number of users connected in different scenarios increasing the fleet size.

Figure 6d presents the results of UAVs acting as BSs. In this case, the heuristic algorithm behaves similarly to the simple scheduling solution. The heuristic algorithm schedules the UAV replacements based on the *g* (*v*) metric (6), which is determined using graph theory. In this scenario, the nodes representing the UAV network have the same *g* (*v*) since they are all directly connected to the infrastructure and provide connectivity to the same number of ground users; therefore, all UAVs connect the same number of users to the network. For this reason, scheduling the UAV replacements using the heuristic strategy has no advantage other than that they are performed as soon as there is an available fresh UAV. This phenomenon reveals that the heuristic solution makes the difference in scenarios where UAVs have different relevance within the network.

#### 5.4.3. Scenario III and Scenario IV: Tree

Finally, we have designed two tree-type scenarios with very heterogeneously distributed users. This type of scheme makes some UAVs much more relevant than others, so scheduling replacements effectively is expected to have a substantial impact on the final performance. The first scenario has 25 coverage areas and 300 ground users. The second scenario has 50 coverage areas and 500 ground users. The areas and user distribution are shown in Figure 5c,d.

Figure 6e reveals that the difference between the BETA solution and the simple scheduling solution is significant in these scenarios. For a fleet of 30 UAVs (5 UAVs in reserve), we achieve a 20% improvement by using the heuristic scheduler, which is an important variation when providing a network service. The heuristic strategy obtains 100% of connected users from 36 UAVs. As in previous scenarios, the baseline solution produces insufficient results. It is interesting to observe that, with the baseline solution, increasing the UAV fleet (which has a high economic cost) does not keep the users connected for longer.

Similarly, when UAVs act as BSs, Figure 6f, we obtain better results using the heuristic strategy. This variation is because, in this scenario, the ground users are heterogeneously distributed, and consequently, UAVs have different *g* (*v*) since they connect diverse numbers of ground users, which implies that performing the correct replacement has more impact.

The conclusions of scenario IV are similar to the ones of scenario III, although the performance (Figure 6g,h) is worse because of the greater complexity of the UAV network topology and the greater failure possibility.

#### *5.5. Comparison of the UAV Replacement Strategies*

Once the results have been obtained (Figure 6) and discussed (Section 5.4), in this section we will perform a comparison of the results by computing: *(i)* for scenario I, the distance from the optimal solution (*Opt*) to the suboptimal or approximate solutions (*SubOpt*), in order to verify the quality of the results, and *(ii)* for the succeeding scenarios, the difference between the simple scheduling and baseline approaches against the Heuristic strategy. To this end, the criterion of approximation ratio (*ρ*) has been used [23]. This parameter, in the context of the proposal, estimates how many times lower the approximate solution is as compared to the exact or optimal result; it is defined as the average value of the ratio between the suboptimal and optimal solutions, as shown in

$$\rho = \frac{1}{i} \sum_{i} \frac{SubOpt_i}{Opt_i}, \tag{7}$$

where *SubOpt<sub>i</sub>* and *Opt<sub>i</sub>* are the results of the suboptimal and optimal strategies, respectively, for each number of UAVs in reserve (from zero to the fleet size). The *ρ* factor ranges from 0 (when *Opt* and *SubOpt* have completely different values) to 1 (when *Opt* and *SubOpt* produce the same solution); an intermediate value (i.e., 0 < *ρ* < 1) represents the similarity or closeness factor to the optimal solution (*SubOpt* = *ρ* × *Opt*) [23]. For a better understanding of the calculation of this parameter, (8) presents an example for the simple scheduling approach of Scenario I when UAVs act as APs (Figure 6a).

$$\rho = \frac{1}{7} \times \left( \frac{72.95}{74.99} + \frac{83.38}{87.51} + \frac{99.02}{100} + \frac{99.24}{100} + \frac{99.27}{100} + \frac{99.28}{100} + \frac{100}{100} \right) = 0.98. \tag{8}$$

The result of (8) shows that the suboptimal solution (simple scheduling approach) matches the optimal solution by a factor of 0.98 (98% similarity between solutions). The remaining *ρ* factors for Scenario I (Figure 6a,b) are summarized in Figure 7, while the *ρ* values for the other scenarios are presented in Figure 8. The comparison between the optimal solution and the approximate solutions in Scenario I, based on the *ρ* factor, reveals that all the proposed strategies produce not only near-optimal solutions but also a stable performance (i.e., high-quality feasible solutions). In all cases, as illustrated in Figure 6a,b and corroborated in Figure 7, the approximate algorithms (BETA, simple scheduling and baseline) generate solutions very close to the optimal, reaching up to 99% similarity to the exact result (1% error), which is achieved by the BETA approach. For this reason, BETA is used as the reference against which the other strategies (simple scheduling and baseline) are evaluated in Scenario II, Scenario III, and Scenario IV. In summary, the average number of connected users analyzed in Section 5.4 shows where one algorithm improves on another, while the distance to the optimal solution computed in this section quantifies this variation. To provide a higher level of detail, *ρ* has been analyzed by ranges of fleet size (the whole analysis has been carried out by gradually increasing the number of UAVs), in addition to being computed at a global level. The aim is to identify in which ranges the solution improves and to quantify the improvement: since the solutions provide similar results beyond a particular fleet size, a metric computed only at a global level makes the areas of improvement difficult to recognize.
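The computation of the approximation ratio in (7) can be reproduced with a few lines of Python. The sketch below uses the values of the worked example in (8); the function name `approximation_ratio` is ours, and the second call, restricted to the two smallest fleets, is only an illustration of how a per-range value could be obtained, not a result reported in the figures.

```python
def approximation_ratio(subopt, opt):
    """Mean of SubOpt_i / Opt_i over all evaluated fleet sizes, as in (7)."""
    if len(subopt) != len(opt) or not opt:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(s / o for s, o in zip(subopt, opt)) / len(opt)


# Values taken from the worked example in (8): simple scheduling vs. optimal.
simple_scheduling = [72.95, 83.38, 99.02, 99.24, 99.27, 99.28, 100.0]
optimal = [74.99, 87.51, 100.0, 100.0, 100.0, 100.0, 100.0]

rho_global = approximation_ratio(simple_scheduling, optimal)
# Illustrative per-range value: only the two smallest fleet sizes.
rho_small_fleets = approximation_ratio(simple_scheduling[:2], optimal[:2])

print(f"global rho = {rho_global:.2f}")                        # global rho = 0.98
print(f"rho for the two smallest fleets = {rho_small_fleets:.2f}")
```

Restricting the average to the smallest fleets gives a lower value than the global one, which is consistent with the remark that a global metric hides the range in which the strategies actually differ.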

**Figure 7.** Approximation ratio *ρ*: optimal strategy vs. heuristic strategies for Scenario I.

It can be seen in Figure 8 that the main area of improvement lies in the first two ranges, i.e., from 25 to 35 UAVs. This result is positive because, as expected, if there is a reasonable number of UAVs (with their corresponding cost), a typical solution can perform adequately. However, in scenarios with limited resources, the heuristic strategy improves on the simple scheduling solution in all cases.

Analyzing the above metrics, we can conclude that using strategies to make replacements is worthwhile. However, the heuristic strategy designed in this article is considerably aggressive since it schedules a replacement whenever UAVs are available. This strategy can result in the number of replacements skyrocketing over time, as well as the number of batteries to be used, which would bring a high economic cost. It should be noted that the price of a battery is much lower than that of a UAV but in any case it is not negligible.

Figure 9 displays the number of UAV replacements as a function of the number of UAVs forming the fleet and the mission time for scenario III using the BETA strategy. Approximately 1200 replacements are needed to provide a service of 10 h and obtain 100% of connected users. Furthermore, due to the aggressive nature of BETA, the number of replacements grows exponentially after reaching a fleet size that guarantees 100% of connected users. In addition, it should be mentioned that the main advantage of small UAVs is that deployments are generally done very quickly and very flexibly, but, as we have seen, it is both economically and logistically difficult to achieve reasonable solutions when the service time largely exceeds the battery lifetime. Other alternatives should be used in these cases, such as bigger UAVs with more battery capacity (or even fuel-powered ones), landing the UAVs on the ground to improve their autonomy, or even deploying a fixed infrastructure if the service is expected to be maintained for a long time.

**Figure 9.** Number of UAV replacements using BETA in scenario III.

#### **6. Conclusions and Future Work**

This article states the practical UAV replacement problem, where a multi-UAV network is expected to provide long-endurance network services (in the order of hours) using constrained devices with limited autonomy (in the order of minutes). It is verified that the optimal UAV scheduling problem, which minimizes the number of UAVs for replacements while providing a guaranteed service availability, is NP-hard and that its optimal solution has exponential complexity. In this regard, some heuristic approaches have been analyzed and evaluated.

Secondly, the article details a methodology, including the simulation environment and the parametrization required to perform a preliminary evaluation of these heuristic strategies. The simulator code is available in [43] to reproduce the experiment and evaluate future strategies.

The article also introduces BETA, a heuristic replacement strategy that performs the replacements as soon as possible based on the relevance of each UAV within the network. BETA is presented as an example in order to verify whether using a heuristic replacement approach is worthwhile. BETA is capable of running in real time with a 99% similarity to the optimal solution in some simple scenarios (scenario I). BETA improves on the basic solutions, achieving the most significant improvement when the scenarios are heterogeneous and the resources are limited. Furthermore, we conclude that it is far better to have a replacement strategy (no matter how simple it is) than to have no strategy at all. BETA is compared with the optimal algorithm whenever possible, in order to evaluate the distance between them, and with other alternatives in the remaining scenarios; the comparison shows that in some situations its advantages are less relevant than in others.

The article opens several lines of future research, such as prioritizing the replacement of UAVs that serve users in emergency/disaster scenarios. The application of replacement strategies in disaster scenarios includes a degree of uncertainty caused by several factors, such as UAVs moving while they are operating (which may change the topology), or extreme conditions that may strain the engines and increase battery consumption. Other future lines include combining FANETs and BSs in the same scenario, testing UAV models with different battery capacities, modeling the energy consumption according to more realistic consumption patterns based on experimentation, and scheduling UAV replacements that do not go through the GCS.

**Author Contributions:** Conceptualization, V.S.-A., F.V., I.V., C.T. and X.H.; methodology, V.S.-A., F.V., I.V., C.T. and X.H.; software, V.S.-A. and C.T.; validation, V.S.-A. and C.T.; formal analysis, V.S.-A., F.V. and C.T.; investigation, V.S.-A., F.V., I.V., C.T. and X.H.; data curation, V.S.-A. and C.T.; writing–original draft preparation, V.S.-A., F.V. and C.T.; writing–review and editing, V.S.-A., F.V., I.V., C.T. and X.H.; visualization, V.S.-A. and C.T.; supervision, F.V. and X.H.; project administration, F.V. and X.H.; funding acquisition, F.V. and X.H. All authors have read and agreed to the published version of the manuscript.

**Funding:** This article has been partially supported by the 5G-City project (TEC2016-76795-C6-1/3-R) funded by the Spanish Ministry of Economy and Competitiveness, and the H2020 5GRANGE project (grant agreement 777137).

**Acknowledgments:** The authors wish to thank Jorge Martín-Pérez for his guidance and support in the problem statement section. Christian Tipantuña acknowledges the support from Escuela Politécnica Nacional and from SENESCYT for his doctoral studies at UPC.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**

The following abbreviations are used in this manuscript:


#### **References**




© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **A Decentralized Low-Chattering Sliding Mode Formation Flight Controller for a Swarm of UAVs**

**Thiago F. K. Cordeiro <sup>1</sup>, João Y. Ishihara <sup>2</sup> and Henrique C. Ferreira <sup>2,</sup>\***


Received: 30 March 2020; Accepted: 7 May 2020; Published: 30 May 2020

**Abstract:** In this paper, a nonlinear robust formation flight controller for a swarm of unmanned aerial vehicles (UAVs) is presented. It is based on the virtual leader approach and is capable of achieving and maintaining a formation with time-varying shape. By using a decentralized architecture, the local controller in each UAV uses information only from the UAV itself, its neighbors, and from the virtual leader. Also, a synchronization control objective provides a mechanism to weight between the fleet achieving the desired formation shape, that is, achieving the desired relative position between the UAVs, and each UAV achieving its desired absolute position. The use of a combination of a sliding mode controller and a low pass filter reduces the usual chattering effect, providing a smooth control signal while maintaining robustness. Simulation results show the effectiveness of the proposed decentralized controller.

**Keywords:** unmanned aerial vehicle; synchronized multi-agent formation; decentralized sliding mode control

#### **1. Introduction**

The use of an unmanned aerial vehicle (UAV) swarm brings several advantages in search and rescue, disaster monitoring, aerial mapping, traffic monitoring, reconnaissance missions, and surveillance [1–3]. A swarm of UAVs provides system redundancy, reconfiguration ability, and structure flexibility, being more effective, flexible, robust, and reliable than single vehicles [4,5]. Formation control is a critical task for achieving cooperation among UAVs. In general, a formation control problem is to find a coordination scheme that enables UAVs to reach and maintain some desired, possibly time-varying formation or group configuration [6].

From the viewpoint of communication networks, the existing formation control approaches can be classified into the centralized method, where a single controller is used to control the whole team based on the information from the whole team [7], and the decentralized method, where each team member generates its own control based on local information from its neighbors [1,2,4,8–11]. Centralized formation control can be a good strategy for a small team of UAVs. When considering a team with a large number of UAVs, the need for greater computational capacity and a large communication bandwidth would mandate a decentralized formation control [4].

The main structures considered for formation control of UAV swarms are leader-follower, behavioral, and virtual structure/virtual leader [12,13]. In the leader-follower approach [3,6,10,14], a common leader is chosen and the rest of the agents are assigned as followers. The group leader broadcasts its position information to the followers, which then follow the leader at an offset. In the behavioral approach [15,16], several desired behaviors are prescribed for the agents. Such desired behaviors may include cohesion, collision avoidance, and obstacle avoidance. In the virtual structure/virtual leader approach [4,9,11], the entire formation is treated as a single rigid body. The virtual structure can evolve as a whole in a given direction with some given orientation and can maintain a rigid geometric relationship among multiple vehicles. In the virtual leader approach, the leader is a known virtual entity and its information can be made available to the software of each agent.

There are several control techniques used in UAV formation control, based on distinct premises and aiming to achieve distinct objectives. A common approach is to use a nonlinear dynamic inversion (NLDI) which, via nonlinear functions, encapsulates the nonlinear system in a box with virtual inputs/outputs that behaves as a linear system. This linearized system acts as a set of double integrators, which is then controlled by any linear or nonlinear technique, such as pole placement [14,17], H<sub>∞</sub> control [9], a differential game approach [10], or sliding mode control (SMC) [8,18]. There are two main approaches when using NLDI in fixed-wing UAV formation flight control, which differ in the frame in which the whole formation is described. One is to choose a global frame, such as north-east-down [9,10], in which case the virtual inputs are accelerations in the north/south, east/west, and up/down directions. The other is to use a leader-related frame [8,14,17,18], in which case the virtual inputs are accelerations in the forward/backward, left/right, and above/below directions relative to the leader.

In References [14,17], classic controllers were designed after applying the NLDI procedure to the nonlinear dynamics of UAV formation flight. In Reference [10], a differential game approach is used to achieve an optimal controller that weights between minimizing the terminal position and velocity error of each UAV and minimizing the control effort. Another option, as in Reference [2], is model predictive control (MPC), which can be used to compute an optimal control output to achieve formation control while avoiding obstacles and dealing with actuator saturation. It is, however, computationally expensive, since it reevaluates, at each time instant, the optimal control output over a finite time horizon. In Reference [2], the computational cost is partially reduced by maintaining the previously computed control output and reevaluating it only when certain trigger events indicate that the control output must be changed, which works well in steady maneuvers, such as straight level flight or constant-rate turns. In all of these approaches [2,10,14,17], the design does not account for the effect of disturbances or model uncertainty. Robust [8,9,11,18,19] and adaptive [1] approaches are appropriate for tackling this problem: robust approaches usually have a fast response but a high control effort and/or chattering, whereas adaptive approaches have a slower convergence but use a smoother control signal. Therefore, the robust approach is recommended if the precision of the formation is more important than the control effort. In Reference [9], the proposed H<sub>∞</sub> linear controller is robust to noise, disturbances, and delays in communication between the UAVs. The SMC is an interesting technique since, ideally, it can completely compensate for the effects of model uncertainties and bounded disturbances. As a disadvantage, it provides a discontinuous, chattering control signal whose source is a signum (sign) function [11,19,20].

A possible solution is to change this signum function to a saturation (sat) function, as in our previous work for UAV formation flight [18], but this generates a trade-off between precision and chattering. Another solution is the use of a second-order SMC (SOSMC) [21,22], which uses the integral of the chattering signal as the control input of a plant. A generalization of the second-order SMC is the low-pass filter (LPF) [23–26]. The integral, or the more general low-pass filter, is included as part of the plant model, which improves precision. For example, in Reference [23] an architecture with sliding mode control and a low-pass filter is proposed for synchronous position control of multiple robotic manipulator systems. However, its control law involves the computation of derivatives whose order is higher than the plant order, which can make implementation difficult, for example, in an embedded system. In Reference [24], an attitude controller for the reentry of a space vehicle based on a low-pass filter SMC architecture is proposed. Unlike Reference [23], for example, the LPF in Reference [24] is used to filter only the signal component that contains a signum function, while the smooth component of the control signal is passed directly to the plant. This approach, compared to the approach in Reference [23], avoids the computation of higher-order derivatives.
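The chattering-reduction options discussed above can be illustrated with a toy numerical example. The Python sketch below is not the controller proposed in this paper nor any of the cited ones: it only compares a signum switching term, its saturation-based replacement, and a generic first-order discrete low-pass filter applied to the switching term. The sliding variable samples, the boundary layer width `phi`, and the filter gain `alpha` are hypothetical values chosen for illustration.

```python
def sign(s):
    """Signum function used by the ideal sliding mode switching term."""
    return (s > 0) - (s < 0)


def sat(s, phi):
    """Saturation function: linear inside the boundary layer |s| <= phi."""
    return max(-1.0, min(1.0, s / phi))


def low_pass(signal, alpha):
    """First-order discrete filter: y[k] = y[k-1] + alpha * (u[k] - y[k-1])."""
    y, out = 0.0, []
    for u in signal:
        y += alpha * (u - y)
        out.append(y)
    return out


# Hypothetical chattering: the sliding variable flips sign at every sample.
s_samples = [0.01 * (-1) ** k for k in range(100)]

switching = [sign(s) for s in s_samples]              # jumps between +1 and -1
smoothed_sat = [sat(s, phi=0.05) for s in s_samples]  # scaled down inside the layer
filtered = low_pass(switching, alpha=0.1)             # ripple attenuated by the filter

for name, sig in [("sign", switching), ("sat", smoothed_sat), ("sign + LPF", filtered)]:
    print(f"{name}: peak amplitude over last samples = {max(abs(u) for u in sig[-10:]):.3f}")
```

In this toy setting, both alternatives reduce the amplitude of the switching signal that would reach the plant; the trade-offs in precision and in the required derivatives, which are the actual subject of the discussion above, are not captured by such a simple example.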

In this paper, using an LPF-based SMC approach, a decentralized controller for a time-varying synchronized formation of multiple UAVs with a virtual leader is proposed. It is considered that each UAV is subject to an unknown bounded disturbance. The computation of higher-order derivatives is not required. This is achieved by decomposing the control signal into smooth and chattering components and filtering only the chattering component. Compared with Reference [24], which considers a single space vehicle, the proposed controller considers a synchronized and decentralized formation of multiple UAVs. In a synchronized formation, multiple UAVs simultaneously converge to desired positions. In comparison with Reference [23], which uses LPF SMC for synchronized position control of multiple robotic manipulator systems in a ring-link communication topology, the proposed controller does not require the computation of higher-order derivatives, adopts a more general information exchange topology, and considers the problem of UAV formation flying. Unlike our previous work [18], the proposed controller uses an LPF for chattering attenuation. The finite-time convergence to a linear sliding surface is proven by introducing an appropriate Lyapunov function candidate, and simulation results show the effectiveness of the proposed control architecture.

To use the LPF-based SMC, the upper bound of the disturbance must be known. Most of this disturbance is better described in the wind frame of the aircraft. For example, a model uncertainty can affect the lift force computation. The difference between the true and computed lift forces is equivalent to a disturbance force applied in the lift direction. A similar discussion can be made for the thrust, drag, and side forces. If the NLDI linearizes the system in the leader's wind frame [8,14,17,18], this upper bound can be used directly, under the assumption that the leader's wind frame is similar enough to the followers' wind frames, as the fleet in formation flies in approximately the same direction. If, however, the NLDI linearizes the system in a global frame [9,10], the upper bound described in the wind frame must be translated to the global frame. The equations to translate the disturbance upper bound to the global frame are developed in this paper.

The main contribution is now summarized. A formation flight controller is developed that includes, in a single controller, the following characteristics:


It is worth noting that each individual characteristic of the controller listed above has already been developed in other papers but, to the best knowledge of the authors, there is no controller that includes all of these characteristics in a single controller. It is also worth noting that, to include all characteristics in a single controller, an appropriate Lyapunov function that unifies these characteristics is developed.

As a second contribution, a set of equations that translate the disturbance upper bound and the upper bound of its derivative from the wind frame to the global frame is proposed. These equations are used in the proposed LPF-SMC, but can be used, with minor modifications, in most fixed-wing formation SMC or SOSMC that are described in a global frame.

The remainder of this paper is organized as follows. Section 2 defines the problem, presents the mathematical models for the UAVs, formation flight, and communication graph. Section 3 presents the proposed controller, equations to compute the disturbance's upper bound, and proves the stability of the controller. Section 4 evaluates the proposed controller by simulation against an unfiltered SMC, where it is shown that the controller significantly reduces its chattering without significantly reducing its performance, and Section 5 concludes this paper.

#### **2. Preliminaries**

In this section, models for an individual UAV and for a fleet formation are presented.

#### *2.1. UAV Model*

The dynamics relating the input and output of the *i*-th vehicle in a fleet of *n* UAVs can be described by using the so-called point-mass aircraft model. It assumes a non-rotating flat Earth with a constant gravitational acceleration *g*. This model provides adequate precision for aircraft guidance and control problems and for short-range trajectory planning. It also assumes that the wind is mild enough that the airflow can be considered aligned with the vehicle fuselage, that is, that the angle-of-attack and sideslip angle are null, which are reasonable assumptions for cruise flight and coordinated maneuvers. Under these assumptions, as depicted in Figure 1, the drag force **D***i*(*t*), generated by the airflow, is aligned with the fuselage, pointing backward, whereas the lift force **L***i*(*t*) is perpendicular to the fuselage. It is assumed that the propulsive system provides a thrust force vector **T***i*(*t*) aligned with the fuselage/airflow and in the opposite direction to **D***i*(*t*), and that the net angular momentum generated by the propulsive system is null. Finally, it is assumed that the vehicle mass *mi* is (approximately) constant, i.e., the propulsive system is electric or, if fuel-based, the consumption is small compared to the total vehicle mass. These simplifying assumptions are commonly used in the literature [9,10,27,28]. To achieve extra precision in more aggressive maneuvers, the effect of the angle-of-attack and sideslip angle should be included in the model [14]. In general, measuring these angles requires a dedicated sensory system, which is usually not present in small UAVs. In this work, the angle-of-attack and sideslip angle are allowed to be unmeasured, but it is supposed that their effect can be incorporated in the model as a bounded disturbance **b***i*(*t*).

The state vector of the point-mass model of the *i*-th UAV is composed of its position vector **p***i*(*t*) = [ *pxi*(*t*) *pyi*(*t*) *pzi*(*t*) ]*<sup>T</sup>*, described in the inertial Cartesian reference frame NED (North-East-Down), and of its velocity, described in a spherical coordinate system composed of the ground speed *Vi*(*t*), flight path angle *γi*(*t*), and course angle *χi*(*t*). By rotating the reference frame first by *χi*(*t*) around the *z* axis and then by *γi*(*t*) around the rotated *y* axis, the wind frame of the *i*-th aircraft is obtained. Since it is assumed that the aircraft velocity vector is aligned with its fuselage, *χi*(*t*) and *γi*(*t*) are equivalent, respectively, to the yaw *ψi*(*t*) and pitch *θi*(*t*) attitude angles. The point-mass model also includes the roll attitude angle *φi*(*t*), which is a rotation of the fuselage around the direction of the velocity vector. The definition of the roll, pitch, and yaw angles can be seen in [29]. Figure 1 shows the *i*-th UAV and its vectors and attitude angles.

**Figure 1.** The *i*-th Unmanned Aerial Vehicle (UAV), its force and velocity vectors, and attitude angles.

The state change given by the derivative of **p***i*(*t*) is computed as

$$\dot{\mathbf{p}}\_{i}(t) = \begin{bmatrix} \dot{p}\_{xi}(t) \\ \dot{p}\_{yi}(t) \\ \dot{p}\_{zi}(t) \end{bmatrix} = \mathbf{R}\_{i}(t) \begin{bmatrix} V\_{i}(t) \\ 0 \\ 0 \end{bmatrix} = V\_{i}(t) \begin{bmatrix} \cos\gamma\_{i}(t)\cos\chi\_{i}(t) \\ \cos\gamma\_{i}(t)\sin\chi\_{i}(t) \\ -\sin\gamma\_{i}(t) \end{bmatrix},\tag{1}$$

where

$$\mathbf{R}\_{i}(t) = \begin{bmatrix} \cos\gamma\_{i}(t)\cos\chi\_{i}(t) & -\sin\chi\_{i}(t) & \sin\gamma\_{i}(t)\cos\chi\_{i}(t) \\ \cos\gamma\_{i}(t)\sin\chi\_{i}(t) & \cos\chi\_{i}(t) & \sin\gamma\_{i}(t)\sin\chi\_{i}(t) \\ -\sin\gamma\_{i}(t) & 0 & \cos\gamma\_{i}(t) \end{bmatrix} \tag{2}$$

is a rotation matrix that rotates from the wind frame to the reference frame with angular velocity

$$
\omega\_i(t) = \begin{bmatrix} -\dot{\chi}\_i(t)\sin\gamma\_i(t) \ \dot{\gamma}\_i(t) \ \dot{\chi}\_i(t)\cos\gamma\_i(t) \end{bmatrix}^T,\tag{3}
$$

and the variables *Vi*(*t*), *χi*(*t*), and *γi*(*t*) can be computed by

$$\tan \chi\_i(t) = \frac{\dot{p}\_{yi}(t)}{\dot{p}\_{xi}(t)}, \qquad \sin \gamma\_i(t) = -\frac{\dot{p}\_{zi}(t)}{V\_i(t)}, \qquad V\_i^2(t) = \dot{p}\_{xi}^2(t) + \dot{p}\_{yi}^2(t) + \dot{p}\_{zi}^2(t). \tag{4}$$
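As an illustration of Equations (1), (2) and (4), the following minimal Python sketch builds the rotation matrix, computes the NED velocity from the triple (*V*, *γ*, *χ*), and recovers the triple from the velocity. The function names and the numerical values are assumptions made for this example only, and `atan2` is used as a quadrant-aware form of the tangent relation in Equation (4).

```python
import math

def rotation_wind_to_ned(gamma, chi):
    """Rotation matrix of Equation (2), returned as a list of rows."""
    cg, sg = math.cos(gamma), math.sin(gamma)
    cc, sc = math.cos(chi), math.sin(chi)
    return [
        [cg * cc, -sc, sg * cc],
        [cg * sc, cc, sg * sc],
        [-sg, 0.0, cg],
    ]

def velocity_ned(V, gamma, chi):
    """Equation (1): the NED velocity is the first column of R scaled by V."""
    R = rotation_wind_to_ned(gamma, chi)
    return [row[0] * V for row in R]

def speed_and_angles(p_dot):
    """Equation (4): recover (V, gamma, chi) from the NED velocity."""
    vx, vy, vz = p_dot
    V = math.sqrt(vx * vx + vy * vy + vz * vz)
    chi = math.atan2(vy, vx)      # quadrant-aware form of tan(chi) = vy / vx
    gamma = math.asin(-vz / V)    # requires V > 0
    return V, gamma, chi

# Illustrative values (assumed): 45 m/s, 5 deg climb, 30 deg course.
V, gamma, chi = 45.0, math.radians(5.0), math.radians(30.0)
p_dot = velocity_ned(V, gamma, chi)
V_back, gamma_back, chi_back = speed_and_angles(p_dot)
print(p_dot, (V_back, gamma_back, chi_back))
```

Because the *z* axis of the NED frame points down, a positive flight path angle gives a negative third velocity component.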

The UAV dynamics is described by [9]

$$\begin{aligned} \dot{V}\_i(t) &= \frac{T\_i(t) - D\_i(t)}{m\_i} - g\sin\gamma\_i(t) + b\_{ti}(t),\\ \dot{\chi}\_i(t) &= \frac{L\_i(t)\sin\phi\_i(t)}{m\_i V\_i(t)\cos\gamma\_i(t)} + \frac{b\_{\psi i}(t)}{V\_i(t)\cos\gamma\_i(t)},\\ \dot{\gamma}\_i(t) &= \frac{L\_i(t)\cos\phi\_i(t)}{m\_i V\_i(t)} - \frac{g\cos\gamma\_i(t)}{V\_i(t)} + \frac{b\_{\theta i}(t)}{V\_i(t)},\end{aligned} \tag{5}$$

where the disturbance signal **b***i*(*t*) = [ *bti*(*t*) *bθi*(*t*) *bψi*(*t*) ] encompasses model approximations, parameter uncertainty, and disturbances in acceleration generated by several sources, such as wind. It is supposed that **b***i*(*t*) and **ḃ***i*(*t*) are unknown but have known bounds. The subscripts *t*, *θ*, and *ψ* of the elements of **b***i*(*t*) denote, respectively, thrust, pitch, and yaw. The thrust force magnitude *Ti*(*t*) is a function of the engine throttle; *Di*(*t*) is the magnitude of the drag force; the lift force magnitude *Li*(*t*) is a function of several parameters, such as air density and aircraft speed, and is adjusted mainly by changing the elevator position; and *φi*(*t*) is adjusted by a combination of aileron and rudder positions. The variables *Ti*(*t*), *Li*(*t*), and *φi*(*t*) are the control inputs of the *i*-th UAV. Equation (5) presents a singularity when *Vi*(*t*) = 0. Fixed-wing vehicles must maintain non-null airspeed to maintain their lift. Assuming null or mild wind speed, the ground speed *Vi*(*t*) is also non-null and this singularity does not occur. Equation (5) also presents a singularity when cos *γi*(*t*) = 0. This occurs only when the UAV is flying exactly vertically, up or down, a state that is not reached except by highly acrobatic vehicles.

By defining the load factor *ni*(*t*) as [28]

$$n\_i(t) \triangleq \frac{L\_i(t)}{m\_i g}, \tag{6}$$

and defining the following virtual control input

$$\Gamma\_i(t) = \begin{bmatrix} a\_{ti}(t) \\ a\_{\psi i}(t) \\ a\_{\theta i}(t) \end{bmatrix} = \begin{bmatrix} \frac{T\_i(t) - D\_i(t)}{m\_i} \\ g n\_i(t) \sin \phi\_i(t) \\ g n\_i(t) \cos \phi\_i(t) \end{bmatrix} \tag{7}$$

the dynamics Equation (5) can be rewritten as

$$\begin{split} \dot{V}\_{i}(t) &= a\_{ti}(t) - g\sin\gamma\_{i}(t) + b\_{ti}(t), \\ \dot{\chi}\_{i}(t) &= \frac{a\_{\psi i}(t) + b\_{\psi i}(t)}{V\_{i}(t)\cos\gamma\_{i}(t)}, \\ \dot{\gamma}\_{i}(t) &= \frac{a\_{\theta i}(t) - g\cos\gamma\_{i}(t) + b\_{\theta i}(t)}{V\_{i}(t)}. \end{split} \tag{8}$$

Differentiating **ṗ***i*(*t*) in Equation (1) and applying some manipulations yields

$$\ddot{\mathbf{p}}\_i(t) = \mathbf{R}\_i(t) \left(\Gamma\_i(t) + \mathbf{b}\_i(t)\right) + \mathbf{g}, \tag{9}$$

where **g** = [ 0 0 *g* ]*<sup>T</sup>* is the gravitational acceleration vector. By defining

$$\tau\_i(t) = [\ \tau\_{xi}(t) \ \tau\_{yi}(t) \ \tau\_{zi}(t)\ ]^T \triangleq \mathbf{R}\_i(t)\Gamma\_i(t) + \mathbf{g}, \tag{10}$$

$$\mathbf{d}\_{i}(t) = [\ d\_{xi}(t) \ d\_{yi}(t) \ d\_{zi}(t)\ ]^{T} \triangleq \mathbf{R}\_{i}(t)\mathbf{b}\_{i}(t),\tag{11}$$

the dynamics are finally rewritten as

$$\ddot{\mathbf{p}}\_i(t) = \tau\_i(t) + \mathbf{d}\_i(t), \tag{12}$$

where *τi*(*t*) ∈ ℝ<sup>3</sup> is a virtual control input and **d***i*(*t*) ∈ ℝ<sup>3</sup> is the virtual disturbance described in the reference frame.

For the controller design, the model given by Equation (12) will be used. Once the virtual control signals are known, the original variables can be obtained. Since **R***i*<sup>−1</sup>(*t*) = **R***i*<sup>*T*</sup>(*t*) for any rotation matrix, the virtual input **Γ***i*(*t*) can always be obtained from *τi*(*t*) by

$$\Gamma\_i(t) = \mathbf{R}\_i^T(t) \left(\tau\_i(t) - \mathbf{g}\right),\tag{13}$$

and then *Ti*(*t*), *ni*(*t*), *φi*(*t*) can be obtained from Equation (7), which can finally be used as the input of an inner loop controller that actuates over the engine and control surfaces [8].
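The recovery of the physical inputs can be sketched as an algebraic round trip: Equation (7) maps (*T*, *n*, *φ*) to the virtual input, the definition of *τi* maps it to the reference frame, and Equation (13) followed by the inversion of Equation (7) maps it back. The sketch below only checks this algebra; the function names, the value of *g*, the known drag estimate `D`, and the assumption *n* ≥ 0 used to resolve the roll angle are all choices made for this illustration and are not taken from the paper.

```python
import math

G = 9.81  # gravitational acceleration [m/s^2] (assumed value)

def rotation(gamma, chi):
    """Rotation matrix of Equation (2)."""
    cg, sg = math.cos(gamma), math.sin(gamma)
    cc, sc = math.cos(chi), math.sin(chi)
    return [[cg * cc, -sc, sg * cc], [cg * sc, cc, sg * sc], [-sg, 0.0, cg]]

def mat_vec(M, v):
    return [sum(M[r][c] * v[c] for c in range(3)) for r in range(3)]

def transpose(M):
    return [[M[c][r] for c in range(3)] for r in range(3)]

def virtual_input(T, n, phi, D, m):
    """Equation (7): Gamma = [a_t, a_psi, a_theta]."""
    return [(T - D) / m, G * n * math.sin(phi), G * n * math.cos(phi)]

def tau_from_gamma(Gamma, gamma, chi):
    """Definition of tau: R * Gamma + g, with g = [0, 0, G]."""
    r = mat_vec(rotation(gamma, chi), Gamma)
    return [r[0], r[1], r[2] + G]

def inputs_from_tau(tau, gamma, chi, D, m):
    """Equation (13) followed by the inversion of Equation (7), assuming n >= 0."""
    shifted = [tau[0], tau[1], tau[2] - G]
    a_t, a_psi, a_theta = mat_vec(transpose(rotation(gamma, chi)), shifted)
    T = m * a_t + D                      # thrust from the longitudinal component
    n = math.hypot(a_psi, a_theta) / G   # load factor from the two lift components
    phi = math.atan2(a_psi, a_theta)     # roll angle from their ratio
    return T, n, phi

# Round trip with illustrative (assumed) values.
T0, n0, phi0, D0, m0 = 12.0, 1.2, 0.3, 8.0, 5.0
gamma0, chi0 = 0.1, 0.7
tau = tau_from_gamma(virtual_input(T0, n0, phi0, D0, m0), gamma0, chi0)
T1, n1, phi1 = inputs_from_tau(tau, gamma0, chi0, D0, m0)
print(T1, n1, phi1)
```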

#### *2.2. UAV Formation*

A formation of UAVs with a virtual leader scheme is considered. The virtual leader is designated here as the 0-th UAV and consists of a virtual point with position **p**0(*t*) = [ *px*0(*t*) *py*0(*t*) *pz*0(*t*) ]*<sup>T</sup>* in space, known by all UAVs, which describes a smooth trajectory as a function of time. The results of this work can also be used for a non-virtual leader configuration by assuming that the leader UAV can broadcast its position to all follower UAVs.

The fleet formation is planned by generating the desired position **p***i*<sup>*d*</sup>(*t*) = [ *pxi*<sup>*d*</sup>(*t*) *pyi*<sup>*d*</sup>(*t*) *pzi*<sup>*d*</sup>(*t*) ]*<sup>T</sup>* for the *i*-th UAV, which is described as

$$\mathbf{p}\_i^d(t) = \mathbf{p}\_0(t) + \tilde{\mathbf{p}}\_i(t),\tag{14}$$

where **p̃***i*(*t*) = [ *p̃xi*(*t*) *p̃yi*(*t*) *p̃zi*(*t*) ]*<sup>T</sup>* is the desired (time-varying) clearance, which is described in the reference frame.

To achieve a formation shape that rotates with the leader, it is convenient to describe the desired clearance **p̃***i*(*t*) in the leader's wind frame or in any other frame related to the leader as

$$\tilde{\mathbf{p}}\_i(t) = \mathbf{R}\_r(t)\tilde{\mathbf{p}}\_i^r(t), \tag{15}$$

where **p̃***i*<sup>*r*</sup>(*t*) is the clearance vector described in a leader's frame, such as the wind frame, and the formation rotation matrix **R***r*(*t*) rotates from the leader's frame to the reference frame. For example, by defining **R***r*(*t*) = **R**0(*t*) (see Equation (2) with *i* = 0), a formation description aligned with the (virtual) leader's trajectory is obtained, as in, for example, References [8,9]. If, instead, **R***r*(*t*) is defined as

$$\mathbf{R}\_{r}(t) = \mathbf{R}\_{\chi}(t) \triangleq \begin{bmatrix} \cos \chi\_0(t) & -\sin \chi\_0(t) & 0\\ \sin \chi\_0(t) & \cos \chi\_0(t) & 0\\ 0 & 0 & 1 \end{bmatrix},\tag{16}$$

a formation description aligned with the horizontal projection of the (virtual) leader's trajectory is obtained, as used in, for example, References [14,17].

Another option is to describe the formation using a leader's frame defined by the attitude Euler angles yaw *ψ*0(*t*), pitch *θ*0(*t*), and roll *φ*0(*t*). This can be useful, for example, for maneuvers involving close interaction between the leader and the followers, such as boom-receptacle automatic aerial refueling. In this case, **R***r*(*t*) = **R***b*(*t*), where [29]

$$\mathbf{R}\_b(t) \triangleq \begin{bmatrix} \cos\psi\_0\cos\theta\_0 & \cos\psi\_0\sin\theta\_0\sin\phi\_0 - \sin\psi\_0\cos\phi\_0 & \cos\psi\_0\sin\theta\_0\cos\phi\_0 + \sin\psi\_0\sin\phi\_0 \\ \sin\psi\_0\cos\theta\_0 & \sin\psi\_0\sin\theta\_0\sin\phi\_0 + \cos\psi\_0\cos\phi\_0 & \sin\psi\_0\sin\theta\_0\cos\phi\_0 - \cos\psi\_0\sin\phi\_0 \\ -\sin\theta\_0 & \cos\theta\_0\sin\phi\_0 & \cos\theta\_0\cos\phi\_0 \end{bmatrix}. \tag{17}$$

The derivatives of **p***i*<sup>*d*</sup>(*t*) in Equation (14) can be computed as

$$\dot{\mathbf{p}}\_i^d(t) = \dot{\mathbf{p}}\_0(t) + \dot{\tilde{\mathbf{p}}}\_i(t), \tag{18}$$

$$\ddot{\mathbf{p}}\_i^d(t) = \ddot{\mathbf{p}}\_0(t) + \ddot{\tilde{\mathbf{p}}}\_i(t). \tag{19}$$

Using the Theorem of Coriolis [29], the derivatives of **p̃***i*(*t*) in Equation (15) can be computed as

$$\dot{\tilde{\mathbf{p}}}\_{i}(t) = \mathbf{R}\_{r}(t) \left[ \dot{\tilde{\mathbf{p}}}\_{i}^{r}(t) + \boldsymbol{\omega}\_{r}(t) \times \tilde{\mathbf{p}}\_{i}^{r}(t) \right], \tag{20}$$

$$\ddot{\tilde{\mathbf{p}}}\_{i}(t) = \mathbf{R}\_{r}(t) \left\{ \ddot{\tilde{\mathbf{p}}}\_{i}^{r}(t) + 2\boldsymbol{\omega}\_{r}(t) \times \dot{\tilde{\mathbf{p}}}\_{i}^{r}(t) + \dot{\boldsymbol{\omega}}\_{r}(t) \times \tilde{\mathbf{p}}\_{i}^{r}(t) + \boldsymbol{\omega}\_{r}(t) \times \left[ \boldsymbol{\omega}\_{r}(t) \times \tilde{\mathbf{p}}\_{i}^{r}(t) \right] \right\},\tag{21}$$

where *ωr*(*t*) is the angular velocity between the rotating leader's frame and the reference frame and is given by

$$\omega\_r(t) = \begin{cases} \text{leader's gyro measurements,} & \text{if } \mathbf{R}\_r(t) = \mathbf{R}\_b(t), \\ \left[ -\dot{\chi}\_0 \sin \gamma\_0 \ \ \dot{\gamma}\_0 \ \ \dot{\chi}\_0 \cos \gamma\_0 \right]^T, & \text{if } \mathbf{R}\_r(t) = \mathbf{R}\_0(t), \\ \left[ 0 \ \ 0 \ \ \dot{\chi}\_0 \right]^T, & \text{if } \mathbf{R}\_r(t) = \mathbf{R}\_\chi(t). \end{cases} \tag{22}$$

It is worth noting that when using the non-virtual leader's body frame, the angular velocity *ωr*(*t*) is the body angular velocity, which can be directly measured by a gyro sensor at the leader. By using a non-virtual leader and any of the wind frame variants, the ground velocity obtained from a GPS sensor or from a navigation algorithm must be used. When using a virtual leader approach, its trajectory is smooth, pre-known, and artificially generated, in a way that *ωr*(*t*) can be pre-computed analytically or numerically with arbitrary precision depending on how the trajectory is created.
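The first-derivative relation given by the Theorem of Coriolis, that is, the derivative of a rotated clearance equals the rotation applied to the sum of the clearance rate seen in the leader's frame and the cross product of the frame angular velocity with the clearance, can be checked numerically. The sketch below does so for the horizontal frame of Equation (16), whose angular velocity is [ 0 0 *χ̇*0 ]*<sup>T</sup>*. The course-angle profile, the clearance, and all function names are hypothetical and chosen only for this check.

```python
import math

def rot_z(chi):
    """Rotation about the vertical axis, as in Equation (16)."""
    c, s = math.cos(chi), math.sin(chi)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def mat_vec(M, v):
    return [sum(M[r][k] * v[k] for k in range(3)) for r in range(3)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

# Hypothetical leader course angle and clearance in the leader's frame.
def chi0(t): return 0.3 * math.sin(0.1 * t)
def chi0_dot(t): return 0.03 * math.cos(0.1 * t)
def clearance_r(t): return [-40.0, -40.0, 10.0 * math.sin(0.1 * t)]
def clearance_r_dot(t): return [0.0, 0.0, math.cos(0.1 * t)]

def clearance_ned(t):
    """Clearance rotated to the reference frame, as in Equation (15)."""
    return mat_vec(rot_z(chi0(t)), clearance_r(t))

def clearance_ned_dot(t):
    """Analytic derivative: R_r * (rate in leader frame + omega_r x clearance)."""
    omega = [0.0, 0.0, chi0_dot(t)]
    inner = [d + c for d, c in zip(clearance_r_dot(t), cross(omega, clearance_r(t)))]
    return mat_vec(rot_z(chi0(t)), inner)

# Compare against a central finite difference at an arbitrary instant.
t, h = 7.0, 1e-5
numeric = [(a - b) / (2.0 * h) for a, b in zip(clearance_ned(t + h), clearance_ned(t - h))]
analytic = clearance_ned_dot(t)
max_gap = max(abs(a - b) for a, b in zip(numeric, analytic))
print(max_gap)
```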

#### *2.3. Communication Graph*

Each follower UAV can exchange data with its neighbors. The communication network is represented by an undirected graph, which means that, if the *i*-th UAV receives data from the *j*-th UAV, then the *j*-th UAV also receives data from the *i*-th UAV. The set of UAVs that are neighbors of the *i*-th UAV is denoted by N*i*.

The Laplacian matrix **L** represents the connectivity between the UAVs

$$\mathbf{L}\_{ij} = \begin{cases} -a\_{ij}, & \text{if } j \neq i \text{ and } j \in \mathcal{N}\_{i}, \\ \sum\_{k \in \mathcal{N}\_i} a\_{ik}, & \text{if } j = i, \\ 0, & \text{otherwise,} \end{cases} \tag{23}$$

where *aij* = 0 means that there is no communication between the *i*-th and *j*-th UAVs, and *aij* > 0 means that there is a communication link between them, in which case the value of *aij* is used as a weight in the control algorithm developed in this paper. If all UAVs are reachable, that is, if any UAV can be reached from any other UAV via the communication links, **L** is positive semidefinite.

In decentralized controllers, the weight given to the information of the *i*-th UAV itself is also described, by means of the diagonal matrix **Λ**. The matrix **H** includes both the own weight and the neighborhood weights. These matrices are given by

$$\mathbf{H} = \boldsymbol{\Lambda} + \mathbf{L},\tag{24}$$

$$\boldsymbol{\Lambda} = \text{diag}([\ \lambda\_1 \ \dots \ \lambda\_n\ ]). \tag{25}$$

Note that, since *λ*1, ..., *λn* > 0 and **L** is positive semidefinite, the matrix **H** is positive definite and, therefore, invertible.
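The construction of the Laplacian and of **H** can be sketched in a few lines. The example below assumes a five-UAV undirected graph with unit weights whose edge list is read off the Laplacian of Equation (60); the helper names are hypothetical. Positive definiteness of **H** is checked with a hand-written Cholesky factorization, which succeeds only for symmetric positive definite matrices.

```python
def laplacian(n, weighted_edges):
    """Graph Laplacian of Equation (23) from undirected edges (i, j, a_ij)."""
    L = [[0.0] * n for _ in range(n)]
    for i, j, a in weighted_edges:
        L[i][j] -= a
        L[j][i] -= a
        L[i][i] += a
        L[j][j] += a
    return L

def cholesky(M):
    """Lower-triangular Cholesky factor; raises ValueError if M is not positive definite."""
    n = len(M)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = M[i][j] - sum(C[i][k] * C[j][k] for k in range(j))
            if i == j:
                if acc <= 0.0:
                    raise ValueError("matrix is not positive definite")
                C[i][i] = acc ** 0.5
            else:
                C[i][j] = acc / C[j][j]
    return C

# Assumed topology (0-based indices): links 1-2, 1-3, 2-3, 2-4, 3-5 with unit weights.
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0), (1, 3, 1.0), (2, 4, 1.0)]
n = 5
L = laplacian(n, edges)
lam = [1.0] * n                       # own-error weights, Lambda = I
H = [[L[i][j] + (lam[i] if i == j else 0.0) for j in range(n)] for i in range(n)]

row_sums = [sum(row) for row in L]    # each Laplacian row sums to zero
factor = cholesky(H)                  # succeeds because H is positive definite
print(row_sums, [factor[i][i] for i in range(n)])
```

The zero row sums show that **L** alone is singular (the vector of ones lies in its null space), which is why the strictly positive diagonal of **Λ** is needed for **H** to be invertible.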

#### *2.4. Formation Tracking and Synchronization Errors*

The tracking error of each aircraft, **e***i*(*t*) = [ *exi*(*t*) *eyi*(*t*) *ezi*(*t*) ]*<sup>T</sup>* ∈ ℝ<sup>3</sup>, relative to a desired position in the reference frame, is defined as

$$\mathbf{e}\_{i}(t) \triangleq \mathbf{p}\_{i}(t) - \mathbf{p}\_{i}^{d}(t). \tag{26}$$

The synchronization error Δ**e***ij*(*t*) = [ Δ*exij*(*t*) Δ*eyij*(*t*) Δ*ezij*(*t*) ]*<sup>T</sup>* ∈ ℝ<sup>3</sup>, which can be seen as a relative position error between the UAVs, is defined as

$$\Delta \mathbf{e}\_{ij}(t) \triangleq \mathbf{e}\_{i}(t) - \mathbf{e}\_{j}(t) = \mathbf{p}\_{i}(t) - \tilde{\mathbf{p}}\_{i}(t) - \left(\mathbf{p}\_{j}(t) - \tilde{\mathbf{p}}\_{j}(t)\right). \tag{27}$$

It can be seen that Δ**e***ij*(*t*) can be computed without knowing the leader's position. However, since the computation of **p̃***i*(*t*) and **p̃***j*(*t*) in Equation (15) can be chosen to be dependent on the leader's flight direction or attitude angles, it is assumed here that the leader's data is available to all UAVs.

It is assumed that the *i*-th UAV can communicate only with a corresponding set of neighbor UAVs, N*i* ⊂ {1, 2, ... , *n*}. The communication graph is assumed to be undirected, connected, time-invariant, and previously known. Each UAV receives the tracking error information of other UAVs in the fleet only through its neighbors (as, for example, in the simulation in Section 4). The virtual leader can be seen as an extra node in the graph that connects to every other UAV in a directed way, from the leader to each follower.

The coupled error at *i*-th UAV is defined as the weighted sum of its tracking error and the synchronization error with respect to its neighbors, that is,

$$\mathbf{e}\_i^c(t) = [\ e\_{xi}^c(t) \ e\_{yi}^c(t) \ e\_{zi}^c(t)\ ]^T \triangleq \lambda\_i \mathbf{e}\_i(t) + \sum\_{j \in \mathcal{N}\_i} a\_{ij} \Delta \mathbf{e}\_{ij}(t) = \lambda\_i \mathbf{e}\_i(t) + \sum\_{j=1}^n a\_{ij} \Delta \mathbf{e}\_{ij}(t),\tag{28}$$

in which *λi* > 0 weights the UAV's own tracking error and *aij* > 0 weights the error difference between UAV *i* and its neighbor UAV *j*. In the last equality in Equation (28), *aij* = 0 if *j* ∉ N*i*. The synchronization control objective is to make the coupled errors approach zero.
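A minimal sketch of the coupled error computation follows. It assumes a hypothetical three-UAV line graph with unit weights; the function name and the numerical errors are illustrative only.

```python
def coupled_errors(errors, lam, weights):
    """Coupled error of each UAV: own tracking error plus weighted synchronization errors.

    errors  : list of 3-component tracking errors e_i
    lam     : list of own-error weights lambda_i
    weights : symmetric matrix of link weights a_ij (0 when there is no link)
    """
    n = len(errors)
    result = []
    for i in range(n):
        ec = [lam[i] * e for e in errors[i]]
        for j in range(n):
            a = weights[i][j]
            if j != i and a > 0.0:
                for axis in range(3):
                    ec[axis] += a * (errors[i][axis] - errors[j][axis])
        result.append(ec)
    return result

# Hypothetical line graph 1 - 2 - 3 with unit weights.
weights = [[0.0, 1.0, 0.0],
           [1.0, 0.0, 1.0],
           [0.0, 1.0, 0.0]]
lam = [1.0, 1.0, 1.0]

# Case 1: identical tracking errors, so every synchronization error vanishes.
same = coupled_errors([[1.0, 0.0, 0.0]] * 3, lam, weights)

# Case 2: only the first UAV is displaced along x.
mixed = coupled_errors([[2.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], lam, weights)
print(same, mixed)
```

In the second case the neighbor of the displaced UAV obtains a non-zero coupled error even though its own tracking error is zero, which is how the synchronization term propagates information through the graph.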

#### *2.5. A Componentwise Formation Description*

It is supposed that the components of **d***i*(*t*) are independent of each other, which implies that the components of **p̈***i*(*t*) are also independent of each other. In this way, the controller design is simplified, since the description of only one axis is sufficient: a control policy can be developed for a single axis and then directly applied to the other two.

The one-dimensional dynamics for axis *l* = *x*, *y*, *z* of the reference frame are obtained from Equation (12) as

$$\ddot{p}\_{li}(t) = \tau\_{li}(t) + d\_{li}(t). \tag{29}$$

Accordingly, the coupled tracking-synchronization error is obtained from Equation (28) as

*Sensors* **2020**, *20*, 3094

$$e\_{li}^{c}(t) = \lambda\_i e\_{li}(t) + \sum\_{j \in \mathcal{N}\_i} a\_{ij} \left[ e\_{li}(t) - e\_{lj}(t) \right]. \tag{30}$$

#### **3. Proposed Controller**

Here, a synchronous sliding mode controller is proposed. Figure 2 shows the proposed control structure. It achieves robustness against model uncertainty and disturbance. The chattering is attenuated by the use of a low pass filter (LPF).

**Figure 2.** Block diagram of the control structure.

To achieve synchronization, each UAV uses tracking errors of its neighbors to compute a sliding surface in the coupled error space. The sliding surface at the *i*-th UAV for the *l* axis is defined as

$$s\_{li}(t) = \ddot{e}\_{li}^{c}(t) + k\_d \dot{e}\_{li}^{c}(t) + k\_p e\_{li}^{c}(t). \tag{31}$$

As usual for sliding mode controllers, it is shown in Section 3.2 that *sli*(*t*) converges to zero in finite time and remains equal to zero thereafter. On the sliding surface, that is, when *sli*(*t*) = 0, the coupled error behaves according to the linear system

$$\ddot{e}\_{li}^{c}(t) + k\_d \dot{e}\_{li}^{c}(t) + k\_p e\_{li}^{c}(t) = 0,\tag{32}$$

which has all poles in the left half-plane and, therefore, is exponentially stable for design parameters *kd*, *kp* > 0.

The proposed control law for the *i*-th UAV is

$$\tau\_{li}(t) = \tau\_{li}^s(t) + \tau\_{li}^f(t), \tag{33}$$

where *τli*<sup>*s*</sup>(*t*) and *τli*<sup>*f*</sup>(*t*) are, respectively, a smooth signal and a filtered signal of the control law, computed by

$$\tau\_{li}^{s}(t) = \ddot{p}\_{li}^{d}(t) - k\_d \dot{e}\_{li}(t) - k\_p e\_{li}(t), \tag{34}$$

$$\dot{\tau}\_{li}^f(t) + \xi\_i \tau\_{li}^f(t) = u\_{li}(t), \tag{35}$$

$$u\_{li}(t) = -\text{sign}(s\_{li}(t))\eta. \tag{36}$$

Equation (35) defines a low pass filter with cutoff frequency *ξi* > 0 that converts a chattering signal *uli*(*t*) into a smooth signal *τli*<sup>*f*</sup>(*t*). The parameter *η* must be chosen by the designer to guarantee the stability of the overall system.

The proposed control law given by Equations (33)–(36) contains only information from the virtual leader (or from a broadcasting non-virtual leader), from the *i*-th UAV itself, and from its neighborhood N*i*. The neighborhood information is contained in *sli*(*t*), defined in Equation (31), which is a function of *eli*<sup>*c*</sup>(*t*) from Equation (30), which in turn is a function of the local error *eli*(*t*) and the neighborhood errors *elj*(*t*), *j* ∈ N*i*.
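To make the structure of the control law concrete, the sketch below simulates a single axis of a single UAV without neighbors (so that the coupled error reduces to the tracking error with unit own weight) using explicit Euler integration. It is only an illustration under stated assumptions: the gains, the filter cutoff, the switching gain, the disturbance, the desired trajectory, and the availability of the acceleration needed by the sliding surface are all hypothetical choices, not the settings of the paper.

```python
import math

# Assumed illustration values (not taken from the paper's simulation).
KP, KD = 0.0625, 0.5   # critically damped surface with natural frequency 0.25 rad/s
XI = 1.0               # low pass filter cutoff [rad/s]
ETA = 1.0              # switching gain, above 0.25 + XI * 0.5 for the disturbance below
DT = 1e-3              # Euler integration step [s]

def sign(x):
    return (x > 0) - (x < 0)

def disturbance(t):
    """Hypothetical disturbance with magnitude <= 0.5 and rate <= 0.25."""
    return 0.5 * math.cos(0.5 * t)

def simulate(t_end=60.0):
    p, v, tau_f = 30.0, 0.0, 0.0      # position, velocity, filter state
    max_tau_f_step = 0.0              # largest step-to-step change of the filtered term
    max_u_step = 0.0                  # largest step-to-step change of the switching term
    u_prev = 0.0
    e = s = 0.0
    for k in range(int(round(t_end / DT))):
        t = k * DT
        pd = 20.0 * math.cos(0.1 * t)         # desired position and its derivatives
        pd_dot = -2.0 * math.sin(0.1 * t)
        pd_ddot = -0.2 * math.cos(0.1 * t)
        e, e_dot = p - pd, v - pd_dot
        tau_s = pd_ddot - KD * e_dot - KP * e           # smooth term, Equation (34)
        acc = tau_s + tau_f + disturbance(t)            # plant, Equation (29)
        s = (acc - pd_ddot) + KD * e_dot + KP * e       # sliding surface, Equation (31)
        u = -ETA * sign(s)                              # switching term, Equation (36)
        tau_f_next = tau_f + DT * (-XI * tau_f + u)     # filter, Equation (35)
        max_tau_f_step = max(max_tau_f_step, abs(tau_f_next - tau_f))
        if k > 0:
            max_u_step = max(max_u_step, abs(u - u_prev))
        u_prev, tau_f = u, tau_f_next
        v += DT * acc
        p += DT * v
    return e, s, max_tau_f_step, max_u_step

final_error, final_s, tau_f_jump, u_jump = simulate()
print(final_error, final_s, tau_f_jump, u_jump)
```

The switching term jumps between −*η* and +*η*, whereas the filtered term applied to the vehicle changes only by a small amount per step, which is the chattering attenuation the low pass filter is meant to provide.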

**Remark 1.** *The variables kp and kd define the natural frequency and damping factor of the second-order local sliding surface sli of the i-th UAV from Equation* (31)*. As can be seen in [20], these gains also define a control bandwidth, which must be sufficiently small to account for, for example, actuator dynamics. Since the same gains kp and kd are chosen for all UAVs, their sliding surfaces share the same control bandwidth. This is reasonable if all UAVs have similar physical, actuator, and aerodynamic characteristics. However, if the UAVs are distinct, the constants must be chosen to respect the control bandwidth of the UAV with the slowest dynamics.*

#### *3.1. Disturbance Model*

Measurement or computation errors and the effect of non-modeled dynamics are incorporated in the dynamics model, given by Equation (12), as a disturbance signal described in the reference frame, **d***i* = [ *dxi* *dyi* *dzi* ]*<sup>T</sup>*. It is supposed that the controller has no access to **d***i* but that there are known upper bounds Δ*xi*, Δ*yi*, and Δ*zi* on the magnitude of the components of **d***i* and upper bounds Δ̃*xi*, Δ̃*yi*, and Δ̃*zi* on the derivatives of the components of **d***i*, that is,

$$|d\_{li}(t)| \le \Delta\_{li}, \quad |\dot{d}\_{li}(t)| \le \tilde{\Delta}\_{li}, \quad l = \{x, y, z\}. \tag{37}$$

These upper bounds are used to define the value of *η* in Equation (36), as explained in Section 3.2. As a contribution of this paper, it is shown that the upper bounds on the components in the reference frame coordinates can be computed from the upper bounds *δti*, *δθi*, and *δψi* on the components of the disturbance signal in the wind frame **b***i*(*t*),

$$|b\_{ti}(t)| \le \delta\_{ti}, \quad |b\_{\theta i}(t)| \le \delta\_{\theta i}, \quad |b\_{\psi i}(t)| \le \delta\_{\psi i}, \tag{38}$$

and from the upper bounds *δ̃ti*, *δ̃θi*, and *δ̃ψi* on the components of its derivative **ḃ***i*(*t*),

$$|\dot{b}\_{ti}(t)| \le \tilde{\delta}\_{ti}, \quad |\dot{b}\_{\theta i}(t)| \le \tilde{\delta}\_{\theta i}, \quad |\dot{b}\_{\psi i}(t)| \le \tilde{\delta}\_{\psi i}. \tag{39}$$

The wind frame components of the disturbances are more naturally obtained, for example, when describing imprecision in the computation of drag or thrust forces. Assume that there is an upper bound **Ω***i* for the *i*-th UAV angular velocity *ωi* and define the bound vectors *δi* ≜ [ *δti* *δθi* *δψi* ]*<sup>T</sup>* and *δ̃i* ≜ [ *δ̃ti* *δ̃θi* *δ̃ψi* ]*<sup>T</sup>*. From Equation (11), it can be seen that

$$|d\_{li}(t)| \le \|\mathbf{d}\_i(t)\| = \|\mathbf{R}\_i(t)\mathbf{b}\_i(t)\| = \|\mathbf{b}\_i(t)\| \le \|\boldsymbol{\delta}\_i\|. \tag{40}$$

The upper bounds of each component of **d***i* are therefore

$$\Delta\_{xi} = \Delta\_{yi} = \Delta\_{zi} = \|\boldsymbol{\delta}\_i\|.\tag{41}$$

Since Equation (11) involves two frames, one of which rotates with respect to the other, its derivative is obtained by using the Theorem of Coriolis [29]

$$\dot{\mathbf{d}}\_{i}(t) = \mathbf{R}\_{i}(t) \left( \dot{\mathbf{b}}\_{i}(t) + \boldsymbol{\omega}\_{i}(t) \times \mathbf{b}\_{i}(t) \right), \tag{42}$$

where **ḋ***i*(*t*) contains two components. The first, **ḃ***i*(*t*), is the derivative of the disturbance **b***i*(*t*) as seen from the wind frame. The second, *ωi*(*t*) × **b***i*(*t*), is generated by the rotation of the wind frame relative to the inertial frame. Note that a disturbance that is constant in the wind frame is a varying disturbance in the inertial frame, because of this rotation. Finally, **R***i*(*t*) is used to represent the sum of these components in the inertial frame.

From the bounds *δi*, *δ̃i*, and **Ω***i*, it is obtained that

$$\left|\dot{d}\_{li}(t)\right| \le \left\|\dot{\mathbf{d}}\_{i}(t)\right\| \le \left\|\dot{\mathbf{b}}\_{i}(t)\right\| + \left\|\boldsymbol{\omega}\_{i}(t) \times \mathbf{b}\_{i}(t)\right\| \le \left\|\tilde{\boldsymbol{\delta}}\_{i}\right\| + \left\|\Omega\_{i}\right\| \left\|\boldsymbol{\delta}\_{i}\right\|.\tag{43}$$

In this way,

$$\tilde{\Delta}\_{xi} = \tilde{\Delta}\_{yi} = \tilde{\Delta}\_{zi} = \|\tilde{\boldsymbol{\delta}}\_{i}\| + \|\Omega\_{i}\| \|\boldsymbol{\delta}\_{i}\|.\tag{44}$$

Equations (41) and (44) provide the upper bounds used by the proposed controller.
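The translation of the bounds from the wind frame to the reference frame amounts to two norm computations, as in Equations (41) and (44). The sketch below evaluates them for hypothetical bound vectors and also checks, for one arbitrary attitude, that rotating a wind-frame disturbance of worst-case magnitude never produces a reference-frame component above the computed bound. All numerical values and function names are assumptions made for this example.

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def reference_frame_bounds(delta, delta_tilde, omega_bound):
    """Bounds of Equations (41) and (44), valid for each of the x, y, z components."""
    Delta = norm(delta)
    Delta_tilde = norm(delta_tilde) + norm(omega_bound) * Delta
    return Delta, Delta_tilde

def rotation(gamma, chi):
    """Rotation matrix of Equation (2)."""
    cg, sg = math.cos(gamma), math.sin(gamma)
    cc, sc = math.cos(chi), math.sin(chi)
    return [[cg * cc, -sc, sg * cc], [cg * sc, cc, sg * sc], [-sg, 0.0, cg]]

# Hypothetical wind-frame bounds: disturbance, its rate, and angular velocity.
delta = [0.3, 0.4, 0.0]          # [m/s^2]
delta_tilde = [0.0, 0.0, 0.1]    # [m/s^3]
omega_bound = [0.0, 0.0, 0.2]    # [rad/s]

Delta, Delta_tilde = reference_frame_bounds(delta, delta_tilde, omega_bound)

# Sanity check: rotate a worst-case wind-frame disturbance to the reference frame.
R = rotation(0.2, 1.0)
d = [sum(R[r][c] * delta[c] for c in range(3)) for r in range(3)]
largest_component = max(abs(x) for x in d)
print(Delta, Delta_tilde, largest_component)
```

The bound is conservative by construction, since the same value is used for the three axes regardless of the actual attitude.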

#### *3.2. Stability Proof*

To analyze the overall fleet behavior, all local variables must be concatenated into vectors. Concatenating the positions **p***i*(*t*), virtual control inputs *τi*(*t*), and disturbances **d***i*(*t*) from all UAVs of the fleet results in, respectively, **P**(*t*) = [ **p**1<sup>*T*</sup>(*t*) ... **p***n*<sup>*T*</sup>(*t*) ]*<sup>T</sup>*, *τ*(*t*) = [ *τ*1<sup>*T*</sup>(*t*) ... *τn*<sup>*T*</sup>(*t*) ]*<sup>T</sup>*, and **D**(*t*) = [ **d**1<sup>*T*</sup>(*t*) ... **d***n*<sup>*T*</sup>(*t*) ]*<sup>T</sup>*, all in ℝ<sup>3*n*</sup>. In this way, the dynamics of the fleet of UAVs are given by concatenating Equation (29) as

$$\ddot{\mathbf{P}}(t) = \boldsymbol{\tau}(t) + \mathbf{D}(t). \tag{45}$$

Similarly, the tracking errors and coupled errors are concatenated into the ℝ<sup>3*n*</sup> vectors **E**(*t*) = [ **e**1<sup>*T*</sup>(*t*) ... **e***n*<sup>*T*</sup>(*t*) ]*<sup>T</sup>* and **E**<sup>*c*</sup>(*t*) = [ **e**1<sup>*cT*</sup>(*t*) ... **e***n*<sup>*cT*</sup>(*t*) ]*<sup>T</sup>*, which are related by

$$\mathbf{E}^c(t) = (\mathbf{H} \otimes \mathbf{I}\_3)\mathbf{E}(t),\tag{46}$$

where ⊗ denotes the Kronecker product and matrix **H** is given by Equation (24). The concatenation of the *n* UAVs' sliding surfaces, **S**(*t*) = [ **s**1<sup>*T*</sup>(*t*) ... **s***n*<sup>*T*</sup>(*t*) ]*<sup>T</sup>*, is obtained as

$$\mathbf{S}(t) = \ddot{\mathbf{E}}^c(t) + k\_d \dot{\mathbf{E}}^c(t) + k\_p \mathbf{E}^c(t) = (\mathbf{H} \otimes \mathbf{I}\_3) \left( \ddot{\mathbf{E}}(t) + k\_d \dot{\mathbf{E}}(t) + k\_p \mathbf{E}(t) \right). \tag{47}$$

The proposed sliding mode control law is written as

$$\boldsymbol{\tau}(t) = \boldsymbol{\tau}^s(t) + \boldsymbol{\tau}^f(t), \tag{48}$$

where *τ*<sup>*s*</sup>(*t*) and *τ*<sup>*f*</sup>(*t*) are computed by

$$\boldsymbol{\tau}^s(t) = \ddot{\mathbf{P}}^d(t) - k\_d \dot{\mathbf{E}}(t) - k\_p \mathbf{E}(t),\tag{49}$$

$$\dot{\boldsymbol{\tau}}^f(t) + \boldsymbol{\Xi}\boldsymbol{\tau}^f(t) = \mathbf{U}(t), \tag{50}$$

$$\mathbf{U}(t) = -\text{diag}\left\{\frac{\eta}{|s\_{li}(t)|}\right\} \mathbf{S}(t),\tag{51}$$

with **U**(*t*) = [ **u**1<sup>*T*</sup>(*t*) ... **u***n*<sup>*T*</sup>(*t*) ]*<sup>T</sup>*, **u***i*(*t*) = [ *uxi*(*t*) *uyi*(*t*) *uzi*(*t*) ]*<sup>T</sup>*, and **Ξ** ≜ diag([ *ξ*1 ... *ξn* ]) ⊗ **I**<sub>3</sub> ∈ ℝ<sup>3*n*×3*n*</sup>.

To analyze the fleet stability, the following Lyapunov functional candidate is proposed

$$\mathcal{V}(t) = \frac{1}{2} \mathbf{S}^T(t) (\mathbf{H} \otimes \mathbf{I}\_3)^{-1} \mathbf{S}(t). \tag{52}$$

Note that, since **H** and **H** ⊗ **I**<sub>3</sub> are positive definite matrices, **H**<sup>−1</sup> and (**H** ⊗ **I**<sub>3</sub>)<sup>−1</sup> are also positive definite, so V(*t*) is positive for every **S**(*t*) ≠ 0.

By using Equations (45), (48) and (49), the sliding surface given by Equation (47) can be rewritten as

$$\begin{split} \mathbf{S}(t) &= \left( \mathbf{H} \otimes \mathbf{I}\_3 \right) \left( \ddot{\mathbf{P}}(t) - \ddot{\mathbf{P}}^d(t) + k\_d \dot{\mathbf{E}}(t) + k\_p \mathbf{E}(t) \right) \\ &= \left( \mathbf{H} \otimes \mathbf{I}\_3 \right) \left( \boldsymbol{\tau}^s(t) + \boldsymbol{\tau}^f(t) + \mathbf{D}(t) - \ddot{\mathbf{P}}^d(t) + k\_d \dot{\mathbf{E}}(t) + k\_p \mathbf{E}(t) \right) \\ &= \left( \mathbf{H} \otimes \mathbf{I}\_3 \right) \left( \mathbf{D}(t) + \boldsymbol{\tau}^f(t) \right) . \end{split} \tag{53}$$

Since (**H** ⊗ **I**<sub>3</sub>)<sup>−1</sup> is constant and symmetric, the derivative of Equation (52) is

$$\dot{\mathcal{V}}(t) = \mathbf{S}^{T}(t) \left(\mathbf{H} \otimes \mathbf{I}\_{3}\right)^{-1} \dot{\mathbf{S}}(t). \tag{54}$$

Differentiating Equation (53) and then using Equation (50), V̇(*t*) is rewritten as

$$\begin{split} \dot{\mathcal{V}}(t) &= \mathbf{S}^{T}(t) \left( \dot{\mathbf{D}}(t) + \dot{\boldsymbol{\tau}}^{f}(t) \right) \\ &= \mathbf{S}^{T}(t)\dot{\mathbf{D}}(t) - \mathbf{S}^{T}(t)\boldsymbol{\Xi}\boldsymbol{\tau}^{f}(t) + \mathbf{S}^{T}(t)\mathbf{U}(t) \\ &= \sum\_{i=1}^{n} \left( \mathbf{s}\_{i}^{T}(t)\dot{\mathbf{d}}\_{i}(t) - \xi\_{i}\mathbf{s}\_{i}^{T}(t)\boldsymbol{\tau}\_{i}^{f}(t) + \mathbf{s}\_{i}^{T}(t)\mathbf{u}\_{i}(t) \right) \\ &= \sum\_{i=1}^{n} \left[ \sum\_{l=\{x,y,z\}} \left( s\_{li}\dot{d}\_{li} - \xi\_{i}s\_{li}\tau\_{li}^{f} - |s\_{li}|\eta \right) \right] \\ &\leq \sum\_{i=1}^{n} \left[ \sum\_{l=\{x,y,z\}} |s\_{li}| \left( |\dot{d}\_{li}| + \xi\_{i}|\tau\_{li}^{f}| - \eta \right) \right]. \end{split} \tag{55}$$

The upper bounds of the disturbance and of its derivative are given, respectively, by Δ*li* ≥ |*dli*(*t*)| and Δ̃*li* ≥ |*ḋli*(*t*)|, which are computed by, respectively, Equations (41) and (44). It is shown in [24] that |*τli*<sup>*f*</sup>(*t*)| ≤ |*dli*(*t*)| ≤ Δ*li*. By using these upper bounds in Equation (55), it can be seen that

$$\dot{\mathcal{V}}(t) \le \sum\_{i=1}^{n} \left[ \sum\_{l=\{x,y,z\}} |s\_{li}| (\tilde{\Delta}\_{li} + \xi\_{i} \Delta\_{li} - \eta) \right]. \tag{56}$$

By choosing *η* satisfying

$$\eta \ge \tilde{\Delta}\_{li} + \xi\_{i} \Delta\_{li} + \epsilon, \quad \forall i \in \{1, \ldots, n\}, \quad \forall l \in \{x, y, z\}, \tag{57}$$

for some arbitrarily chosen constant *ϵ* > 0, it is obtained that

$$\dot{\mathcal{V}}(t) \le -\sum\_{i=1}^{n} \sum\_{l=\{x,y,z\}} |s\_{li}(t)|\epsilon = -\epsilon \sum\_{i=1}^{n} \sum\_{l=\{x,y,z\}} |s\_{li}(t)| = -\epsilon \left\| \mathbf{S}(t) \right\|\_{1}, \tag{58}$$

where ‖**S**(*t*)‖<sub>1</sub> is the 1-norm of **S**(*t*). Since the 1-norm of a vector is greater than or equal to its Euclidean norm,

$$\dot{\mathcal{V}}(t) \le -\epsilon \|\mathbf{S}(t)\|,\tag{59}$$

which means that V(*t*) and, therefore, **S**(*t*) go to zero in finite time [20]. On the sliding surface, the system behaves as a stable linear system given by Equation (32) and the error converges asymptotically to zero.
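The choice of the switching gain can be illustrated with a short worked computation: *η* must exceed, for every UAV and every axis, the derivative bound plus the filter cutoff times the disturbance bound, with a margin *ϵ*. The sketch below picks the smallest admissible value for a hypothetical two-UAV fleet and also illustrates the norm inequality used in the last step of the proof. All numbers and names are assumptions for this example.

```python
import math

def switching_gain(Delta, Delta_tilde, xi, eps):
    """Smallest eta satisfying the design condition for all UAVs and axes.

    Delta, Delta_tilde : per-UAV lists of per-axis bounds (x, y, z)
    xi                 : per-UAV filter cutoff frequencies
    eps                : positive design margin
    """
    worst = max(
        Delta_tilde[i][axis] + xi[i] * Delta[i][axis]
        for i in range(len(xi))
        for axis in range(3)
    )
    return worst + eps

# Hypothetical bounds for two UAVs (same value on the three axes).
Delta = [[0.5] * 3, [0.8] * 3]
Delta_tilde = [[0.2] * 3, [0.1] * 3]
xi = [1.0, 2.0]
eta = switching_gain(Delta, Delta_tilde, xi, eps=0.05)

# The 1-norm of a vector is never smaller than its Euclidean norm.
S = [3.0, -4.0, 0.0]
norm_1 = sum(abs(x) for x in S)
norm_2 = math.sqrt(sum(x * x for x in S))
print(eta, norm_1, norm_2)
```

In this example the second UAV dominates the choice because its larger filter cutoff amplifies the contribution of its disturbance bound.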

**Remark 2.** *Note that the sliding surface given by Equation* (47)*, when rewritten in Equation* (53)*, is a function only of the disturbance* **D**(*t*) *and the output of the filter τ<sup>f</sup>*(*t*)*. This has two main implications:*


#### **4. Simulation**

In this section, a simulation is performed to show the effectiveness of the proposed controller. A scenario with 5 UAVs and the communication links shown in Figure 3 is used.

**Figure 3.** Five UAVs and their undirected communication links. The virtual leader is not shown here. All UAVs have access to the virtual leader's trajectory information.

The matrices

$$\mathbf{L} = \begin{bmatrix} 2 & -1 & -1 & 0 & 0 \\ -1 & 3 & -1 & -1 & 0 \\ -1 & -1 & 3 & 0 & -1 \\ 0 & -1 & 0 & 1 & 0 \\ 0 & 0 & -1 & 0 & 1 \end{bmatrix} \tag{60}$$

and **Λ** = **I**<sub>5</sub> are chosen to give the same weight to each UAV's own error and to each of its relative errors. The choice *k<sub>p</sub>* = 0.5 and *k<sub>d</sub>* = 0.0625 provides a critically damped sliding surface with natural frequency *ω<sub>n</sub>* = 0.25 rad/s. These gains are chosen relatively small to limit the maximum commanded acceleration, even if the UAVs are initially far from their desired positions. The low-pass filters are set such that **Ξ** = **I**<sub>5</sub> ⊗ **I**<sub>3</sub>.
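The properties claimed for these parameters can be checked with a few lines of code. The sketch below verifies that the matrix in Equation (60) is a valid Laplacian of a connected undirected graph and lists the implied links. It also checks the critical-damping claim under the assumption that the sliding-surface dynamics have the characteristic polynomial λ² + *k<sub>p</sub>*λ + *k<sub>d</sub>*. That reading is inferred from the stated numbers, not taken from Equation (32), and all function names are illustrative.

```python
import math

# Laplacian matrix from Equation (60)
L = [
    [ 2, -1, -1,  0,  0],
    [-1,  3, -1, -1,  0],
    [-1, -1,  3,  0, -1],
    [ 0, -1,  0,  1,  0],
    [ 0,  0, -1,  0,  1],
]

def is_undirected_laplacian(L):
    """Symmetric, zero row sums, non-positive off-diagonal entries."""
    n = len(L)
    symmetric = all(L[i][j] == L[j][i] for i in range(n) for j in range(n))
    zero_rows = all(sum(row) == 0 for row in L)
    offdiag = all(L[i][j] <= 0 for i in range(n) for j in range(n) if i != j)
    return symmetric and zero_rows and offdiag

def edges(L):
    """Undirected links (1-based UAV indices) encoded by the Laplacian."""
    n = len(L)
    return [(i + 1, j + 1) for i in range(n) for j in range(i + 1, n) if L[i][j] != 0]

def is_connected(L):
    """Graph search from UAV 1 over the nonzero off-diagonal entries."""
    n = len(L)
    seen, stack = {0}, [0]
    while stack:
        i = stack.pop()
        for j in range(n):
            if j != i and L[i][j] != 0 and j not in seen:
                seen.add(j)
                stack.append(j)
    return len(seen) == n

def is_critically_damped(kp, kd):
    """Assumed polynomial lambda^2 + kp*lambda + kd has a double root iff kp^2 == 4*kd."""
    return math.isclose(kp * kp, 4.0 * kd)

def natural_frequency(kd):
    """For the assumed polynomial, omega_n = sqrt(kd)."""
    return math.sqrt(kd)

if __name__ == "__main__":
    print(edges(L), is_connected(L))
    print(is_critically_damped(0.5, 0.0625), natural_frequency(0.0625))
```

Under the stated assumption, the double root lies at λ = −0.25, which matches the natural frequency quoted above.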

A fleet with a non-rectilinear 3D trajectory is described, which is defined by the virtual leader path given by

$$\begin{cases} p\_{x0}(t) = 80 + 45t & \text{[m]}, \\ p\_{y0}(t) = 20 \cos(0.1t) & \text{[m]}, \\ \gamma\_0(t) = \frac{\pi}{36} \text{ rad}, & (z\_0(0) = -100 \text{ m}). \end{cases} \tag{61}$$

For ease of visualization, a time-varying formation is considered, whose horizontal projection in the reference frame has a V-shape and whose altitude has a time-varying oscillation. Accordingly, the formation rotation matrix **R***<sup>r</sup>* is defined as **R***<sup>χ</sup>* from Equation (16) and the clearance vectors **p̃**<sub>*i*</sub><sup>*r*</sup>(*t*) related to the virtual leader are

$$\begin{aligned} \tilde{\mathbf{p}}\_1^r(t) &= \begin{bmatrix} 0\\ 0\\ 10\sin(0.1t) \end{bmatrix}, \quad \tilde{\mathbf{p}}\_2^r(t) = \begin{bmatrix} -40\\ -40\\ 10\sin(0.1t + 2\pi/5) \end{bmatrix}, \quad \tilde{\mathbf{p}}\_3^r(t) = \begin{bmatrix} -40\\ 40\\ 10\sin(0.1t + 4\pi/5) \end{bmatrix},\\ \tilde{\mathbf{p}}\_4^r(t) &= \begin{bmatrix} -80\\ -80\\ 10\sin(0.1t + 6\pi/5) \end{bmatrix}, \quad \tilde{\mathbf{p}}\_5^r(t) = \begin{bmatrix} -80\\ 80\\ 10\sin(0.1t + 8\pi/5) \end{bmatrix}. \end{aligned} \tag{62}$$

The initial position of each UAV is defined as

$$\mathbf{p}\_1(0) = \begin{bmatrix} 60 \\ 0 \\ -100 \end{bmatrix}, \quad \mathbf{p}\_2(0) = \begin{bmatrix} 20 \\ -30 \\ -100 \end{bmatrix}, \quad \mathbf{p}\_3(0) = \begin{bmatrix} 50 \\ 20 \\ -100 \end{bmatrix}, \quad \mathbf{p}\_4(0) = \begin{bmatrix} 10 \\ -50 \\ -100 \end{bmatrix}, \quad \mathbf{p}\_5(0) = \begin{bmatrix} 20 \\ 80 \\ -100 \end{bmatrix}. \tag{63}$$

The initial velocity of each UAV is defined as

$$\dot{\mathbf{p}}\_1(0) = \begin{bmatrix} 50 \\ 5 \\ 0 \end{bmatrix}, \quad \dot{\mathbf{p}}\_2(0) = \begin{bmatrix} 40 \\ 10 \\ 0 \end{bmatrix}, \quad \dot{\mathbf{p}}\_3(0) = \begin{bmatrix} 45 \\ -10 \\ 0 \end{bmatrix}, \quad \dot{\mathbf{p}}\_4(0) = \begin{bmatrix} 40 \\ -5 \\ 0 \end{bmatrix}, \quad \dot{\mathbf{p}}\_5(0) = \begin{bmatrix} 45 \\ 0 \\ 0 \end{bmatrix}. \tag{64}$$

The disturbance is simulated as

$$\mathbf{b}\_{i}(t) = 0.2 \begin{bmatrix} \cos(0.5t) & \cos(0.5t) & \cos(0.5t) \end{bmatrix}^{T}, \qquad \forall i \in \{1, 2, 3, 4, 5\}. \tag{65}$$

From Equation (65), the magnitude of the upper bound vector ***δ**<sub>i</sub>* of **b**<sub>*i*</sub>(*t*) is computed as *δ<sub>i</sub>* = 0.35. The magnitude of the upper bound vector ***δ̃**<sub>i</sub>* is also computed from Equation (65), as *δ̃<sub>i</sub>* = 0.17.
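These two magnitudes can be reproduced numerically. The sketch below assumes that "magnitude" means the Euclidean norm of the vector, maximized over one period of the disturbance, and that *δ̃<sub>i</sub>* bounds the time derivative of **b**<sub>*i*</sub>(*t*). Both readings are inferred from the quoted values, and the function names are illustrative.

```python
import math

AMPLITUDE = 0.2  # gain in Equation (65)
OMEGA = 0.5      # angular frequency of the cosine in Equation (65), rad/s

def disturbance(t):
    """b_i(t) from Equation (65): three identical cosine components."""
    c = AMPLITUDE * math.cos(OMEGA * t)
    return (c, c, c)

def disturbance_rate(t):
    """Time derivative of b_i(t), obtained by differentiating Equation (65)."""
    s = -AMPLITUDE * OMEGA * math.sin(OMEGA * t)
    return (s, s, s)

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def peak_norm(signal, step=1e-3):
    """Largest Euclidean norm of signal(t) sampled over one period."""
    period = 2.0 * math.pi / OMEGA
    samples = int(period / step) + 1
    return max(norm(signal(k * step)) for k in range(samples))

if __name__ == "__main__":
    print(round(peak_norm(disturbance), 2))       # 0.2 * sqrt(3)
    print(round(peak_norm(disturbance_rate), 2))  # 0.2 * 0.5 * sqrt(3)
```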

The upper bound of each component of **d**<sub>*i*</sub>(*t*) is computed by Equation (41), resulting in Δ*<sub>xi</sub>* = Δ*<sub>yi</sub>* = Δ*<sub>zi</sub>* = 0.35. Simulation experiments verify that Ω*<sub>i</sub>* = 0.17 rad/s is an upper bound for the angular velocity amplitude; the upper bound on **ḋ**<sub>*i*</sub>(*t*) is computed by Equation (44), resulting in Δ̃*<sub>xi</sub>* = Δ̃*<sub>yi</sub>* = Δ̃*<sub>zi</sub>* = 0.23. By choosing *ε* = 0.42, it is obtained from Equation (57) that *η* = 1.

The system is implemented using an ode4 Runge-Kutta solver with a fixed step size of 1 ms. Since it is impossible to perfectly simulate the effect of a chattering input signal in a continuous differential equation, the controller output is evaluated at 10 ms time steps and held constant between updates.
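The integration scheme described above can be sketched as a classical fourth-order Runge-Kutta loop with a zero-order hold on the control input. The plant below is a hypothetical scalar integrator with a switching control and a cosine disturbance. It is chosen only to keep the example short and is not the UAV formation model; the function and parameter names are illustrative.

```python
import math

def rk4_step(f, t, x, u, h):
    """One classical RK4 step for x' = f(t, x, u) with u held constant."""
    k1 = f(t, x, u)
    k2 = f(t + h / 2, x + h / 2 * k1, u)
    k3 = f(t + h / 2, x + h / 2 * k2, u)
    k4 = f(t + h, x + h * k3, u)
    return x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def sign(x):
    return (x > 0) - (x < 0)

def simulate_sampled_smc(x0=1.0, eta=1.0, h=1e-3, hold_steps=10, t_end=5.0):
    """Integrate at 1 ms while updating the switching control every 10 ms."""
    def plant(t, x, u):
        # Hypothetical scalar plant: integrator with bounded disturbance
        return u + 0.2 * math.cos(0.5 * t)

    x, u = x0, 0.0
    history = []
    for k in range(round(t_end / h)):
        t = k * h
        if k % hold_steps == 0:      # controller runs at the slower rate
            u = -eta * sign(x)       # discontinuous (chattering) control
        x = rk4_step(plant, t, x, u, h)
        history.append((t + h, x, u))
    return history

if __name__ == "__main__":
    final_t, final_x, final_u = simulate_sampled_smc()[-1]
    print(final_t, final_x, final_u)
```

Because the control is frozen for 10 ms, the state of this toy plant cannot stay exactly on the switching surface. It oscillates in a band whose width is set by the hold time and the input amplitude, which is the sampled-data counterpart of chattering.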

For comparison purposes, the unfiltered synchronous formation flight controller presented in Reference [18] is also simulated. It is configured to be as similar as possible to the proposed controller. The first-order sliding surface is defined with the same natural frequency as the proposed controller, that is, *ω<sub>n</sub>* = 0.25 rad/s. By using the same upper bound Δ*<sub>xi</sub>* = Δ*<sub>yi</sub>* = Δ*<sub>zi</sub>* = 0.35 and by choosing the same *ε* = 0.42, the gain is computed as *η* = 0.77. Other parameters are exactly the same as in the proposed controller.
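The two switching gains quoted in this section follow from simple arithmetic. The sketch below uses Equation (57) with *ξ<sub>i</sub>* = 1 (from **Ξ** being an identity matrix) for the proposed controller. For the unfiltered controller, it assumes the rule *η* = Δ + *ε*, which is inferred here from the quoted numbers and is not stated in the text. The function names are illustrative.

```python
def gain_filtered(delta_rate_bound, delta_bound, xi, eps):
    """Smallest eta allowed by Equation (57) for the filtered controller."""
    return delta_rate_bound + xi * delta_bound + eps

def gain_unfiltered(delta_bound, eps):
    """Assumed rule for the unfiltered controller: eta = Delta + eps."""
    return delta_bound + eps

if __name__ == "__main__":
    # Values quoted in the text: Delta = 0.35, Delta_tilde = 0.23, eps = 0.42
    print(round(gain_filtered(0.23, 0.35, 1.0, 0.42), 2))
    print(round(gain_unfiltered(0.35, 0.42), 2))
```

The filtered controller needs the larger gain because its bound also includes the derivative of the disturbance.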

Figure 4 shows the desired trajectory for each UAV in black, and the trajectory achieved by each UAV in distinct colors. Square and '\*' markers show, respectively, the desired and achieved positions at equally spaced time instants. When a '\*' is inside the square, the UAV is in its desired position.

**Figure 4.** Desired trajectory and UAV position.

Figure 5 shows the formation flight error components *e<sub>xi</sub>*, *e<sub>yi</sub>*, and *e<sub>zi</sub>* of each UAV *i* for both controllers. Figure 6 shows the coupled error of each UAV *i*, given by Equation (46), for both controllers. For both controllers, the system rapidly enters sliding mode: the coupled errors slide on the prescribed linear sliding surface and achieve the performance of the linear system that defines it. The errors also converge to zero, which shows that both controllers completely compensate for the added input disturbance.

Figure 7 shows the controller outputs *τ<sub>xi</sub>*, *τ<sub>yi</sub>*, and *τ<sub>zi</sub>* of each UAV *i*. In the proposed controller, the output is the sum of the smooth control signal ***τ**<sup>s</sup><sub>i</sub>* and ***τ**<sup>f</sup><sub>i</sub>*, the latter obtained by filtering the chattering signal **U**<sub>*i*</sub>; in the controller from Reference [18], it is the unfiltered control signal. As can be seen, the proposed control output is smooth, whereas the control output from the unfiltered SMC chatters.

#### **5. Conclusions**

A decentralized architecture for synchronous formation flight of UAVs based on sliding mode control with a low-pass filter was proposed. The SMC technique provides robustness to disturbances, so that the system slides on the prescribed sliding surface even in their presence. The LPF virtually removes the chattering while maintaining convergence to zero steady-state error. In the proposed architecture, only the chattering component of the control signal is filtered. As a result, the controller has a simpler expression than recent results in the literature, such as [23]. An equation is also presented for computing the upper bounds on the disturbance and its derivative for a formation described in a global frame. This equation assumes that the upper bounds are known in the wind frame of each follower UAV. It is proved that the proposed controller is stable and reaches a prescribed sliding surface in finite time.

For future work, more realistic models for UAV and wind gusts can be implemented. Also, it is desired to implement other SOSMC, such as presented in References [21,22], in the context of the synchronous formation flight.

**Author Contributions:** Conceptualization, J.Y.I. and H.C.F.; Formal analysis, T.F.K.C. and J.Y.I.; Investigation, T.F.K.C.; Methodology, T.F.K.C., J.Y.I. and H.C.F.; Software, T.F.K.C.; Supervision, H.C.F.; Visualization, T.F.K.C.; Writing—original draft, T.F.K.C., J.Y.I. and H.C.F. All authors have read and agree to the published version of the manuscript.

**Funding:** This work was supported in part by National Council for Scientific and Technological Development—CNPq, under grants 312528/2017-5/PQ and 460311/2014-0 and by District Federal Research Foundation—FAPDF, under grant 0193.001153/2015.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Article*

## **Modeling and Flight Experiments for Swarms of High Dynamic UAVs: A Stochastic Configuration Control System with Multiplicative Noises**

#### **Hongbo Zhao 1,\*, Sentang Wu 1, Yongming Wen 2, Wenlei Liu <sup>1</sup> and Xiongjun Wu <sup>3</sup>**


Received: 13 June 2019; Accepted: 23 July 2019; Published: 25 July 2019

**Abstract:** A large-scale UAV swarm with a high dynamic configuration requires a high-precision mathematical model to fully exploit its boundary performance. In order to guide engineering applications with high confidence, uncertainties induced by either systematic measurement or the environment cannot be ignored. This paper investigates the Itô stochastic model of the UAV swarm system with multiplicative noises. By combining the cooperative kinematic model with a simplified individual dynamic model of fixed-wing aircraft for the first time, the configuration control model is derived. Considering the uncertainties in actual flight, multiplicative noises are introduced to complete the Itô stochastic model. Following that, the estimator and controller are designed to control the formation. The mean-square uniform boundedness condition of the proposed stochastic system is presented for the closed-loop system. In the simulation, the stochastic robustness analysis and design (SRAD) method is used to optimize the properties of the formation. More importantly, the effectiveness of the proposed model is also verified using real data of five unmanned aircraft collected in outfield formation flight experiments.

**Keywords:** stochastic system; UAV swarm; configuration control; multiplicative noises; dynamic model; stochastic robustness analysis and design

#### **1. Introduction**

Swarms of UAVs, which can autonomously implement missions [1], have received significant attention in recent years. There are many application scenarios for UAV swarms, such as comprehensive combat [1], distributed reconfigurable sensor networks [2], surveillance [3], and reconnaissance systems [4]. The primary concern of the UAV swarm is the configuration control problem, and related research mainly focuses on mathematical modeling [5], control strategies and methods [6,7], and collision and obstacle avoidance algorithms [8,9]. However, most of these studies are based on a deterministic system or a system with ideal Gaussian noises. A high dynamic UAV swarm configuration control (USCC) model with multiplicative noise remains one of the primary practical issues for utilizing UAV swarms in engineering applications.

Generally, the stochastic differential equation refers to a stochastic process-driven system or an ordinary differential equation with a random coefficient [10–14]. Random factors are introduced into the system in the following three ways [10]: 1. the system's initial conditions or inputs are taken as random variables, 2. the system's random external disturbances, 3. the system's parameters and structures taken as random variables. The disturbances in the UAV swarm system are mainly induced from the sensor measurement, internet transmission and the task environment, and they fall under point 2 mentioned above.

The formation configuration control problem has been extensively investigated for low dynamic deterministic systems. In [15], the formation containment problems based on linear state equations of multiagent systems were investigated. In [6,16], some theoretical and practical problems of multiple quadrotors were studied. In [17–19], dynamical-model-based formation control problems of multirobot systems were studied. Most of these studies focus on low-speed vehicles, such as multiple quadrotors or multirobot systems, in an ideal environment; very few consider the formation control problems of high dynamic fixed-wing unmanned aircraft in complex environments. The mathematical modeling of stochastic high dynamic UAV swarms remains to be solved.

In this paper, we introduce the stochastic model for the following two reasons: (1) The UAV swarm, especially a high dynamic, densely configured and large-scale swarm, requires a high-precision mathematical model to describe the dynamic relationships among formation members and fully exploit its boundary performance; (2) When the UAV swarm is carrying out missions, the influence of various uncertainties (random interferences from systematic measurement, from the network and from the mission environment) cannot be ignored, and the relative movements of its members are usually random. Changes in the communication topology caused by network variations inevitably influence the process of sending and receiving information [20]. Therefore, it is necessary to combine the mathematical model of the USCC with the stochastic system to guide engineering practice, such as cooperative detection in complex mission environments, with higher confidence.

Undoubtedly, constructing a more adaptable stochastic model for multiple vehicle systems is an urgent task. The problem of Brownian motion-driven multiagent tracking was discussed in [21], and sufficient conditions for the tracking of multi-agents were obtained by using the auxiliary function of Brownian motion and the Itô stochastic integral. A time-lag multiagent system model with measurement noise was set up in [22], and the stability theory of stochastic differential equations was used. In [23], the stochastic factor was considered in the leader-following multiagent model based on the event-triggered control strategy, the Itô formula and stability theory. Studies [20,24–27] have considered the influence of stochastic disturbances on the multiagent system (MAS) and have established stochastic models to control the MAS. However, existing works on the stochastic MAS are mainly based on assumed state matrices. Although the assumed formulation can be applied across multiple levels, extra difficulties arise in a specific practical system: combining the flight member's dynamics model with the formation's kinematics model is more complex; modeling the process noises is more complex because the coefficients appear in a complicated way in the practical system; and simulation and flight testing are more difficult because of the complex system and high-risk environment. Therefore, numerous problems remain to be solved.

Typically, there are four main methods for formation coordination modeling reported in the literature: the leader-follower framework [28], virtual structure approach [29,30], behavior-based model [31,32] and graph theories [33]. Most of them focus on the consensus problems based on the kinematic model. In this paper, intending to improve both the formation properties and individual capabilities of UAV swarm in complex environments, we first use a simplified nonlinear dynamics model of the fixed-wing aircraft flight control system and then construct the group dynamic cooperative control system model together with the relative kinematic model. Based on Reynolds' three criteria [31], the model comprehensively considers the flight member's individual properties and the whole swarm's cooperativeness.

To the best of our knowledge, although a few efforts have been devoted to the modeling of dynamic formation systems, there is almost no literature on the stochastic model of UAV swarms. For example, a six-degree-of-freedom (DOF) unified nonlinear dynamic model of spacecraft formation was presented in [34]; in [35,36], the formation control laws for YF-22 aircraft models with six-DOF dynamics plus kinematic equations were designed. Although these formation models are efficient for the cooperation of a group without stochastic disturbance, the control strategy may be invalid when the controlled objects move in a noisy environment.

To the best of the authors' knowledge, few papers discuss the quality of the configuration, and most of them have come up with an algorithm to control the formation, thereby achieving the desired configuration or avoiding collision [6–8]. However, they do not discuss the robustness of UAV swarms in much detail. In order to improve the robustness of the stochastic system, the parameters of the estimator and the controller are optimized by the stochastic robustness analysis and design (SRAD) method [37] in the simulation. Furthermore, few studies have carried out fixed-wing flight test experiments. A multi-UAV outfield flight experiment was carried out to verify the effectiveness of the formation collision forecast and coordination algorithm in [9]. In [36], a set of flights was performed to assess the performance of the formation control laws. To extend the previous outfield flight test results, the overall design in this paper is validated experimentally by flight testing using the leader-follower configuration.

Motivated by the discussions above, the stochastic USCC model with multiplicative noises is investigated in this paper. Compared with the existing literature, the main theoretical and experimental contributions of our work are summarized as follows:


The rest of this paper is structured as follows:

In Section 2, the problem's formulation and preliminary studies of the USCC stochastic system are presented. In Section 3, the mathematical model of formation control stochastic system is illustrated in detail. The estimator and controller are also designed to control the formation, and SRAD has been used to optimize the controller and estimator. The mean-square uniformly bounded condition of the proposed stochastic system is then presented. In Section 4, simulations and experiments are conducted to verify the effectiveness of the model. Finally, concluding remarks are given in Section 5.

#### **2. Preliminary and Problem Formulation**

#### *2.1. Itô Stochastic System*

Consider the following linear Itô stochastic system (1), together with its homogeneous form (2), as the model to be investigated:

$$d\mathbf{x} = [A(t)\mathbf{x} + B(t)]dt + \sum\_{i=1}^{m} [F\_i(t)\mathbf{x} + G\_i(t)]dW\_i \tag{1}$$

$$d\mathbf{x} = A(t)\mathbf{x}dt + \sum\_{i=1}^{m} F\_i(t)\mathbf{x}dW\_i \tag{2}$$

where *x*, *B*, *G<sub>i</sub>* ∈ *R<sup>n</sup>*; *A*, *F<sub>i</sub>* ∈ *R<sup>n×n</sup>*; and *W*(*t*) = [*W*<sub>1</sub>(*t*), *W*<sub>2</sub>(*t*), ··· , *W<sub>m</sub>*(*t*)]<sup>*T*</sup> (*t* ≥ 0) is an *m*-dimensional standard Wiener process, defined on the complete probability space (Ω, F, P), whose components are independent of each other. Define the following matrices:

$$\begin{cases} \begin{aligned} M(t) &= A(t) \oplus A(t) + \sum\_{i=1}^{m} F\_i(t) \otimes F\_i(t) \\ R(t) &= 2 \Big[ I\_N \otimes B^T(t) \Big] K\_N + 2 \sum\_{i=1}^{m} \Big[ I\_N \otimes G\_i^T(t) \Big] K\_N F\_i(t) \\ K(t) &= \sum\_{i=1}^{m} G\_i(t) \otimes G\_i(t) \end{aligned} \end{cases} \tag{3}$$

where *N* = *n*<sup>2</sup>, '⊕' denotes the Kronecker sum of matrices, with *A* ⊕ *A* = *I<sub>n</sub>* ⊗ *A* + *A* ⊗ *I<sub>n</sub>*, and '⊗' denotes the Kronecker product of matrices. The matrix *K<sub>N</sub>* is given by

$$K\_N = \begin{bmatrix} 1 & 0 \cdots & 0 & 0 \cdots & 0 & 0 \cdots & 0 \\ 0 & 0 \cdots & 1 & 0 \cdots & 0 & 0 \cdots & 0 \\ & & & \cdots & \cdots & \\ 0 & 0 \cdots & 0 & 0 \cdots & 1 & 0 \cdots & 0 \end{bmatrix} \tag{4}$$

where the element '1' appears in the first column, (*n* + 1)*th* column and [(*n* − 1)*n* + 1]*th* column of *KN*.
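As a numerical illustration of the homogeneous system (2), the sketch below simulates its scalar case, *dx* = *ax dt* + *fx dW*, with the Euler-Maruyama scheme and compares the sampled second moment with the closed form *x*<sub>0</sub><sup>2</sup> exp((2*a* + *f*<sup>2</sup>)*t*) that follows from Itô's formula. For *n* = 1, the quantity 2*a* + *f*<sup>2</sup> is the matrix *M* of Equation (3). The parameter values, the seed and the function names are illustrative choices, not taken from the paper.

```python
import math
import random

def euler_maruyama_second_moment(a, f, x0=1.0, t_end=1.0, dt=0.01,
                                 paths=4000, seed=0):
    """Monte Carlo estimate of E[x(t_end)^2] for dx = a*x*dt + f*x*dW."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    steps = round(t_end / dt)
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(paths):
        x = x0
        for _ in range(steps):
            dW = rng.gauss(0.0, sqrt_dt)   # Wiener increment ~ N(0, dt)
            x += a * x * dt + f * x * dW   # Euler-Maruyama update
        total += x * x
    return total / paths

def exact_second_moment(a, f, x0=1.0, t=1.0):
    """Closed form: d/dt E[x^2] = (2a + f^2) E[x^2]."""
    return x0 * x0 * math.exp((2.0 * a + f * f) * t)

if __name__ == "__main__":
    print(euler_maruyama_second_moment(-1.0, 0.5))
    print(exact_second_moment(-1.0, 0.5))
```

The closed form shows that the multiplicative noise enters the second-moment growth rate through *f*<sup>2</sup>, so a stable drift (*a* < 0) does not by itself guarantee mean-square stability.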

#### *2.2. Mean-Square Uniform Boundedness of the Stochastic System*

Since the stochastic system is complicated by external interferences, its stability condition is relatively strict and there is no trivial solution to the equation. To solve this problem, we take advantage of the mean-square uniform boundedness of the stochastic system. The condition for boundedness is slightly less strict than that for stability. Under the boundedness condition, the states of the system converge to a bounded region rather than to a specific equilibrium point as time tends to infinity. Therefore, for the USCC stochastic system model, stability refers to mean-square uniform boundedness.

**Definition 1.** *If there exists a positive number c such that*

$$\lim\_{t \to \infty} \sup \mathcal{E} \{ \left\| X\_{ij}(t, t\_0, X\_{ij0}) \right\|^2 \} \le c \tag{5}$$

Then the states *X<sub>ij</sub>*(*t*, *t*<sub>0</sub>, *X<sub>ij0</sub>*) are mean-square bounded. The subscript '0' represents the initial value, *X<sub>ij0</sub>* denotes the initial states of the system composed of the two members *i* and *j*, *t*<sub>0</sub> denotes the initial time, *t* denotes the current time, ‖·‖ is the Euclidean norm, E{·} is the mathematical expectation, and sup is the supremum (least upper bound).

**Lemma 1.** *[12] The necessary and sufficient condition for the mean-square boundedness of the solution for the time-varying linear stochastic system (1) is that the following time-varying linear deterministic system is bounded:*

$$
\dot{y} = \begin{bmatrix} M(t) & R(t) \\ 0 & A(t) \end{bmatrix} y + \begin{bmatrix} K(t) \\ B(t) \end{bmatrix} \tag{6}
$$

**Lemma 2.** *[12] If (2) is a time-invariant linear system (i.e., A*, *F<sub>i</sub>* = *const*, *i* = 1, 2, ··· , *m), system (2) is uniformly asymptotically stable if and only if M* = *A* ⊕ *A* + Σ<sub>*i*=1</sub><sup>*m*</sup> *F<sub>i</sub>* ⊗ *F<sub>i</sub> is stable, i.e., M is a Hurwitz matrix, meaning that the real parts of all eigenvalues of M are negative.*

**Lemma 3.** *[12] If B*(*t*), *Gi*(*t*),(*i* = 1, 2, ··· , *m*) *are bounded and system (2) is mean-square uniformly asymptotically stable, then the solution of system (1) is mean-square uniformly bounded.*

Based on Lemmas 1–3, we present and prove sufficient conditions for the stability of the Itô stochastic model.

**Theorem 1.** *The sufficient conditions for the stability of the Itô stochastic model (or the mean-square uniform boundedness of the stochastic system (1)) are:*


**Proof.** For Equation (1), although we can use the conclusion of Lemma 1 to obtain the necessary and sufficient conditions for the mean-square boundedness directly, given that *A* and *Fk*, *k* = 1 ∼ *m* are linear time-invariant matrices, we further simplify the mean-square boundedness conditions.

According to Lemma 3, if condition (1) is satisfied, the mean-square uniform boundedness of system (1) follows from the mean-square uniform asymptotic stability of system (2).

According to Lemma 2, the mean-square uniform asymptotic stability of system (2) is equivalent to condition (2). This completes the proof.
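The Hurwitz test of Lemma 2 can be illustrated with a small pure-Python sketch that assembles *M* = *A* ⊕ *A* + Σ *F<sub>i</sub>* ⊗ *F<sub>i</sub>* from Kronecker products. To avoid an eigenvalue solver, the example restricts itself to upper-triangular *A* and *F<sub>i</sub>*, for which *M* is also upper triangular and its eigenvalues are its diagonal entries. The matrices below are hypothetical values chosen for illustration, not parameters from the paper.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def second_moment_matrix(A, F_list):
    """M = A (+) A + sum_i F_i (x) F_i, with A (+) A = I (x) A + A (x) I."""
    I = identity(len(A))
    M = mat_add(kron(I, A), kron(A, I))
    for F in F_list:
        M = mat_add(M, kron(F, F))
    return M

def is_hurwitz_upper_triangular(M):
    """Hurwitz check valid only when M is upper triangular."""
    n = len(M)
    if any(M[i][j] != 0 for i in range(n) for j in range(i)):
        raise ValueError("M is not upper triangular; use an eigenvalue solver")
    return all(M[i][i] < 0 for i in range(n))

# Hypothetical example: stable drift A with weak or strong multiplicative noise
A = [[-1.0, 1.0], [0.0, -2.0]]
F_WEAK = [[0.5, 0.0], [0.0, 1.0]]
F_STRONG = [[2.0, 0.0], [0.0, 1.0]]

if __name__ == "__main__":
    print(is_hurwitz_upper_triangular(second_moment_matrix(A, [F_WEAK])))
    print(is_hurwitz_upper_triangular(second_moment_matrix(A, [F_STRONG])))
```

With the weak noise the diagonal of *M* is (−1.75, −2.5, −2.5, −3), so the test passes; with the strong noise the first diagonal entry becomes +2 and the test fails even though *A* itself is Hurwitz. For general matrices, a numerical eigenvalue routine would replace the triangular shortcut.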

#### *2.3. Estimator of Itô Stochastic System*

**Lemma 4.** *[38] Consider the Itô stochastic system of the following form:*

$$d\mathbf{x}(t) = A\mathbf{x}(t)dt + A\_0\mathbf{x}(t)dw(t) + dw\_1(t) \tag{7}$$

$$dy(t) = H\mathbf{x}(t)dt + H\_0\mathbf{x}(t)dw(t) + dw\_2(t)\tag{8}$$

*where x*(*t*) ∈ *R<sup>n</sup> and y*(*t*) ∈ *R<sup>m</sup> are the system states and measured values, respectively. A, A*<sub>0</sub>*, H and H*<sub>0</sub> *are constant matrices (they can also be extended to time-varying matrices if needed). w*(*t*) *is a standard scalar Wiener process, and w*<sub>1</sub>(*t*) ∈ *R<sup>n</sup> and w*<sub>2</sub>(*t*) ∈ *R<sup>m</sup> are Wiener processes as well. The initial state x*(0) *is a zero-mean second-order stochastic process.*

Assuming that *x*(0) is independent of *w*(*t*), *w*1(*t*), and *w*2(*t*), and it satisfies:

$$\begin{cases} \begin{aligned} E\{\mathbf{x}(0)\mathbf{x}^T(0)\} &= D(0) \\ E\{dw(t)dw^T(t)\} &= dt \\ E\{dw\_1(t)dw\_1^T(t)\} &= Qdt \\ E\{dw\_2(t)dw\_2^T(t)\} &= Rdt \end{aligned} \end{cases} \tag{9}$$

Then the linear estimator with minimum mean square error is:

$$d\hat{x}(t) = [A - K(t)H]\hat{x}(t)dt + K(t)dy(t) \tag{10}$$

$$
\hat{x}(0) = 0 \tag{11}
$$

The gain of the estimator is:

$$K(t) = \left[P(t)H^T + A\_0 D(t)H\_0^{T}\right] \left[H\_0 D(t)H\_0^{T} + R\right]^{-1} \tag{12}$$

where *P*(*t*) and *D*(*t*) can be obtained as follows:

$$\begin{array}{ll}dP(t) = & AP(t)dt + P(t)A^T dt + A\_0 D(t)A\_0^T dt + Qdt\\ & - K(t)[H\_0 D(t)H\_0^T + R]K^T(t)dt\end{array} \tag{13}$$

$$P(0) = D(0) \tag{14}$$

$$dD(t) = A D(t)dt + D(t)A^T dt + A\_0 D(t)A\_0^{T}dt + Qdt\tag{15}$$
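To see how Equations (12) to (15) interact, the sketch below integrates their scalar form with a forward Euler step: all matrices become scalars and transposes disappear. The numerical values and the function name are hypothetical and serve only to illustrate the recursion.

```python
def scalar_estimator_covariances(a, a0, h, h0, q, r, d0,
                                 dt=1e-3, t_end=10.0):
    """Euler integration of the scalar forms of Equations (12)-(15)."""
    P = D = d0                           # Equation (14): P(0) = D(0)
    history = []
    for k in range(round(t_end / dt)):
        S = h0 * D * h0 + r              # bracketed term inverted in Eq. (12)
        K = (P * h + a0 * D * h0) / S    # estimator gain, Eq. (12)
        dP = (2 * a * P + a0 * a0 * D + q - K * S * K) * dt   # Eq. (13)
        dD = (2 * a * D + a0 * a0 * D + q) * dt               # Eq. (15)
        P, D = P + dP, D + dD
        history.append(((k + 1) * dt, K, P, D))
    return history

if __name__ == "__main__":
    t, K, P, D = scalar_estimator_covariances(
        a=-1.0, a0=0.5, h=1.0, h0=0.2, q=0.1, r=0.05, d0=1.0)[-1]
    print(t, K, P, D)
```

In the scalar case, *D* settles at *q*/(−2*a* − *a*<sub>0</sub><sup>2</sup>) whenever 2*a* + *a*<sub>0</sub><sup>2</sup> < 0, which is the scalar version of the Hurwitz condition in Lemma 2, and *P* stays below *D* because the gain term in Equation (13) is non-negative.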

**Remark 1.** *In this study, according to the above lemmas, we can construct a more applicable and effective USCC stochastic model to improve the formation properties of the UAV swarm system. Moreover, the measurement equation with Gaussian noises, the optimal estimator and the controller are all included in the closed-loop model. The mean-square uniform boundedness condition of the USCC stochastic system can be obtained based on the lemmas and theorem proposed above.*

#### **3. USCC Stochastic System Modeling**

The group dynamic cooperative control system model comprehensively considers the personality of the individual members and the cooperativeness of the whole formation based on Reynolds' three criteria [31]. Generally, the model is built with virtual forces: individuality and interaction forces. Individuality force describes the nodes' individual characteristics. Interaction force indicates the quality of group collaboration among nodes and describes the group dynamic cooperative characteristics, reflecting the ability to obey Reynolds' three criteria. We will use it as a general theory to guide the modeling of the USCC stochastic system.

#### *3.1. Model of Individual Flight Control System*

The individual flight control system of the UAV swarm adopts the north-up-east coordinate system.

**Assumption 1.** *The formation moves in a two-dimensional plane, thus the flight path inclination and pitch velocity are zero; the aircraft adopts side slip turning, thus the speed inclination angle, roll angle, roll angle velocity, angle of attack and side slip angle are all small values.*

**Assumption 2.** *Thrust P is independent of velocity V*.

The simplified nonlinear mathematical model of individual flight control system is:

$$\begin{cases} m \frac{dV}{dt} = P - X \\ mV \frac{d\varphi}{dt} = -P\beta + Z \\ mV \frac{d\beta}{dt} = \omega\_y + P\beta - Z \\ J\_y \frac{d\omega\_y}{dt} = M\_y \\ \frac{dP}{dt} = -\frac{1}{T\_P}P + \frac{K\_P}{T\_P}\delta\_{Pc} \\ \frac{d\delta\_y}{dt} = -\frac{1}{T\_{\delta\_y}}\delta\_y + \frac{K\_{\delta\_y}}{T\_{\delta\_y}}\delta\_{yc} \end{cases} \tag{16}$$

where *V* is the flight velocity; ϕ is the flight path declination; β is the lateral slip angle; ω is the angular velocity of the body coordinate system relative to the ground coordinate system, and the subscript *y* denotes its *y*-component; *J<sub>y</sub>* is the *y*-component of the moment of inertia in the body coordinate system; *M<sub>y</sub>* is the *y*-component of the moment about the center of mass caused by the external forces (including thrust); *P* is the thrust; *X* is the resistance force; *Z* is the lateral force; δ is the rudder declination; *K*<sub>δ</sub> and *T*<sub>δ</sub> are the gain and time constant of the control surface response, respectively (subscripts *x*, *y*, *z* denote aileron, rudder, and elevator, respectively); *K<sub>P</sub>* and *T<sub>P</sub>* are the gain and time constant of the thrust response, respectively; and δ<sub>*c*</sub> and δ<sub>*Pc*</sub> are the rudder angle command and thrust command, respectively.

By performing a small-disturbance linearization on (16) [39], we can obtain:

$$\begin{cases} \frac{d\Delta V}{dt} = -\frac{X^{V}}{m} \Delta V + \frac{1}{m} \Delta P\\ \frac{d\Delta \varphi}{dt} = \frac{P - Z^{\beta}}{mV} \Delta \beta - \frac{Z^{\delta\_{y}}}{mV} \Delta \delta\_{y}\\ \frac{d\Delta \beta}{dt} = -\frac{P - Z^{\beta}}{mV} \Delta \beta + \Delta \omega\_{y} + \frac{Z^{\delta\_{y}}}{mV} \Delta \delta\_{y}\\ \frac{d\Delta \omega\_{y}}{dt} = \left(\frac{M\_{y}^{\beta}}{J\_{y}} - \frac{M\_{y}^{\dot{\beta}}}{J\_{y}} \frac{P - Z^{\beta}}{mV}\right) \Delta \beta + \left(\frac{M\_{y}^{\omega\_{y}}}{J\_{y}} + \frac{M\_{y}^{\dot{\beta}}}{J\_{y}}\right) \Delta \omega\_{y} + \left(\frac{M\_{y}^{\delta\_{y}}}{J\_{y}} + \frac{M\_{y}^{\dot{\beta}}}{J\_{y}} \frac{Z^{\delta\_{y}}}{mV}\right) \Delta \delta\_{y}\\ \frac{d\Delta P}{dt} = -\frac{1}{T\_{P}} \Delta P + \frac{K\_{P}}{T\_{P}} \Delta \delta\_{Pc}\\ \frac{d\Delta \delta\_{y}}{dt} = -\frac{1}{T\_{\delta\_{y}}} \Delta \delta\_{y} + \frac{K\_{\delta\_{y}}}{T\_{\delta\_{y}}} \Delta \delta\_{yc} \end{cases} \tag{17}$$

where *X<sup>V</sup>* = ∂*X*/∂*V*, and the other superscripted coefficients in (17) are partial derivatives defined in the same way.

#### *3.2. Model of Formation Control System*

For the convenience of outfield experiments and the safety of UAV swarm, we construct the model in a two-dimensional (2D) plane to make it more adaptive to complex task environments such as flat and dense formations, under which the aircraft would carry out missions at low altitude with almost no vertical maneuver space. Moreover, theoretical results can be covered fully and extended to the three-dimensional (3D) space. Therefore, we focus on the problem of USCC in the two-dimensional plane, as shown in Figure 1.

**Figure 1.** Relative movement between neighboring nodes.

In Figure 1, *v<sub>i</sub>* and *v<sub>j</sub>* are two nodes adjacent to each other. The flight path coordinate system of node *v<sub>i</sub>* is set as the relative coordinate system, in which the x-axis represents the direction of the velocity and the z-axis is perpendicular to the x-axis, as shown in Figure 1.

The ground coordinate system is set as the fixed coordinate system. *d<sub>ij</sub>* is the distance between the two nodes; *x<sub>ij</sub>* and *z<sub>ij</sub>* are the relative distances in the forward and lateral directions of the flight path coordinate system, respectively. *V<sub>i</sub>*, *V<sub>j</sub>*, ϕ<sub>*i*</sub> and ϕ<sub>*j*</sub> represent their velocities and flight path declinations in the ground coordinate system, respectively.

With the relationship from theoretical mechanics: absolute velocity = relative velocity + convected velocity, the following kinematics equation for node *vi* and node *vj* can be derived:

$$
\overrightarrow{V}\_{j} = \dot{\overrightarrow{d}}\_{ij} + \overrightarrow{V}\_{i} + \dot{\overrightarrow{\varphi}}\_{i} \times \overrightarrow{d}\_{ij} \tag{18}
$$

The above equation can be decomposed in the flight path coordinate system of the node *vi* as:

$$\begin{cases} \frac{dx\_{ij}}{dt} = V\_j \cos(\varphi\_j - \varphi\_i) - V\_i + \frac{d\varphi\_i}{dt} z\_{ij} \\ \frac{dz\_{ij}}{dt} = V\_j \sin(\varphi\_j - \varphi\_i) - \frac{d\varphi\_i}{dt} x\_{ij} \end{cases} \tag{19}$$
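Equation (19) can be evaluated directly, as in the sketch below. The helper `relative_position` expresses the position of node *v<sub>j</sub>* in the flight path frame of node *v<sub>i</sub>*. It assumes planar ground axes in which ϕ is measured from the first axis toward the second, a convention chosen here only so that the rates can be cross-checked by finite differences. The function names are illustrative.

```python
import math

def relative_rates(V_i, V_j, phi_i, phi_j, phi_i_dot, x_ij, z_ij):
    """Right-hand side of Equation (19): (dx_ij/dt, dz_ij/dt)."""
    dx = V_j * math.cos(phi_j - phi_i) - V_i + phi_i_dot * z_ij
    dz = V_j * math.sin(phi_j - phi_i) - phi_i_dot * x_ij
    return dx, dz

def relative_position(p_i, p_j, phi_i):
    """Components of p_j - p_i along and across the velocity of node i.

    Assumed convention: ground positions are (X, Z) pairs and the ground
    velocity of a node is V * (cos(phi), sin(phi)).
    """
    dX, dZ = p_j[0] - p_i[0], p_j[1] - p_i[1]
    c, s = math.cos(phi_i), math.sin(phi_i)
    return c * dX + s * dZ, -s * dX + c * dZ

if __name__ == "__main__":
    x0, z0 = relative_position((0.0, 0.0), (30.0, -20.0), 0.1)
    print(relative_rates(40.0, 45.0, 0.1, 0.3, 0.05, x0, z0))
```

When both nodes fly with the same speed and heading and node *v<sub>i</sub>* is not turning, both rates vanish, so the relative geometry is frozen, as expected for a rigid formation in straight flight.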

Substituting the second formula in (17) into (19) and performing small perturbation linearization yields:

$$\begin{cases} \begin{aligned} \frac{d\Delta x\_{ij}}{dt} &= \left[-1 + \frac{(P\_{i} - X\_{i})(-P\_{i}\beta\_{i} + Z\_{i}) + m\_{i}V\_{i}Z\_{i}^{V\_{i}}}{m\_{i}^{2}V\_{i}^{2}} z\_{ij}\right]\Delta V\_{i} + V\_{j}\sin(\varphi\_{j} - \varphi\_{i})\Delta \varphi\_{i} \\ &+ \frac{(-P\_{i} + Z\_{i}^{\beta\_{i}})z\_{ij}}{m\_{i}V\_{i}}\Delta\beta\_{i} - \frac{\beta\_{i}z\_{ij}}{m\_{i}V\_{i}}\Delta P\_{i} - \frac{Z\_{i}^{\delta\_{iy}}z\_{ij}}{m\_{i}V\_{i}}\Delta\delta\_{iy} + \cos(\varphi\_{j} - \varphi\_{i})\Delta V\_{j} - V\_{j}\sin(\varphi\_{j} - \varphi\_{i})\Delta \varphi\_{j} \\ \frac{d\Delta z\_{ij}}{dt} &= \left[\frac{(P\_{i} - X\_{i})(-P\_{i}\beta\_{i} + Z\_{i}) + m\_{i}V\_{i}Z\_{i}^{V\_{i}}}{m\_{i}^{2}V\_{i}^{2}}x\_{ij}\right]\Delta V\_{i} - V\_{j}\cos(\varphi\_{j} - \varphi\_{i})\Delta \varphi\_{i} \\ &- \frac{(-P\_{i} + Z\_{i}^{\beta\_{i}})x\_{ij}}{m\_{i}V\_{i}}\Delta\beta\_{i} + \frac{\beta\_{i}x\_{ij}}{m\_{i}V\_{i}}\Delta P\_{i} + \frac{Z\_{i}^{\delta\_{iy}}x\_{ij}}{m\_{i}V\_{i}}\Delta\delta\_{iy} + \sin(\varphi\_{j} - \varphi\_{i})\Delta V\_{j} + V\_{j}\cos(\varphi\_{j} - \varphi\_{i})\Delta\varphi\_{j} \end{aligned} \end{cases} \tag{20}$$

Note that the aircraft flight momentum $m\_i V\_i$ is relatively large. Moreover, $\beta\_i$ is small according to Assumption 1, and $\varphi\_j - \varphi\_i \approx 0$; thus $\frac{(P\_i - X\_i)(-P\_i\beta\_i + Z\_i)}{m\_i^2 V\_i^2}\Delta V\_i$, $\frac{\beta\_i}{m\_i V\_i}\Delta P\_i$, and $\sin(\varphi\_j - \varphi\_i)\Delta\varphi\_i$ are second-order small quantities. Meanwhile, $\cos(\varphi\_j - \varphi\_i) \approx 1$ and $\sin(\varphi\_j - \varphi\_i) \approx \varphi\_j - \varphi\_i$. Ignoring these second-order small quantities and simplifying (20), we get:

$$\begin{cases} \frac{d\Delta x\_{ij}}{dt} = \left(\frac{Z\_i^{V\_i} z\_{ij}}{m\_i V\_i} - 1\right)\Delta V\_i + \frac{(-P\_i + Z\_i^{\beta\_i}) z\_{ij}}{m\_i V\_i}\Delta\beta\_i - \frac{Z\_i^{\delta\_{iy}} z\_{ij}}{m\_i V\_i}\Delta\delta\_{iy} + \Delta V\_j\\ \frac{d\Delta z\_{ij}}{dt} = \frac{Z\_i^{V\_i} x\_{ij}}{m\_i V\_i}\Delta V\_i - V\_j\Delta \varphi\_i - \frac{(-P\_i + Z\_i^{\beta\_i}) x\_{ij}}{m\_i V\_i}\Delta\beta\_i + \frac{Z\_i^{\delta\_{iy}} x\_{ij}}{m\_i V\_i}\Delta\delta\_{iy} + V\_j\Delta \varphi\_j \end{cases} \tag{21}$$

Combining Equations (17) with (19) and (21) yields the formation control system model:

$$\begin{cases} \frac{d\Delta x\_{ij}}{dt} = a\_1 \Delta V\_i + a\_2 \Delta \beta\_i - a\_3 \Delta \delta\_{iy} + \Delta V\_j\\ \frac{d\Delta z\_{ij}}{dt} = a\_4 \Delta V\_i - V\_j \Delta \varphi\_i - a\_5 \Delta \beta\_i + a\_6 \Delta \delta\_{iy} + V\_j \Delta \varphi\_j\\ \frac{d\Delta V\_i}{dt} = -a\_7 \Delta V\_i + \frac{1}{m\_i} \Delta P\_i\\ \frac{d\Delta \varphi\_i}{dt} = a\_8 \Delta \beta\_i - a\_9 \Delta \delta\_{iy}\\ \frac{d\Delta \beta\_i}{dt} = -a\_8 \Delta \beta\_i + \Delta \omega\_{iy} + a\_9 \Delta \delta\_{iy}\\ \frac{d\Delta \omega\_{iy}}{dt} = a\_{10} \Delta \beta\_i + a\_{11} \Delta \omega\_{iy} + a\_{12} \Delta \delta\_{iy}\\ \frac{d\Delta P\_i}{dt} = -\frac{1}{T\_{iP}} \Delta P\_i + \frac{K\_{iP}}{T\_{iP}} \Delta \delta\_{iPc}\\ \frac{d\Delta \delta\_{iy}}{dt} = -\frac{1}{T\_{i\delta\_y}} \Delta \delta\_{iy} + \frac{K\_{i\delta\_y}}{T\_{i\delta\_y}} \Delta \delta\_{iyc} \end{cases} \tag{22}$$

where

$$\begin{aligned} a\_1 &= \frac{Z\_i^{V\_i} z\_{ij}}{m\_i V\_i} - 1, & a\_2 &= \frac{(-P\_i + Z\_i^{\beta\_i}) z\_{ij}}{m\_i V\_i}, & a\_3 &= \frac{Z\_i^{\delta\_{iy}} z\_{ij}}{m\_i V\_i}, \\ a\_4 &= \frac{Z\_i^{V\_i} x\_{ij}}{m\_i V\_i}, & a\_5 &= \frac{(-P\_i + Z\_i^{\beta\_i}) x\_{ij}}{m\_i V\_i}, & a\_6 &= \frac{Z\_i^{\delta\_{iy}} x\_{ij}}{m\_i V\_i}, \\ a\_7 &= \frac{X\_i^{V\_i}}{m\_i}, & a\_8 &= \frac{P\_i - Z\_i^{\beta\_i}}{m\_i V\_i}, & a\_9 &= \frac{Z\_i^{\delta\_{iy}}}{m\_i V\_i}, \\ a\_{10} &= \frac{M\_{iy}^{\beta\_i}}{J\_{iy}} - \frac{M\_{iy}^{\dot{\beta}\_i}}{J\_{iy}} \frac{P\_i - Z\_i^{\beta\_i}}{m\_i V\_i}, & a\_{11} &= \frac{M\_{iy}^{\omega\_{iy}}}{J\_{iy}} + \frac{M\_{iy}^{\dot{\beta}\_i}}{J\_{iy}}, & a\_{12} &= \frac{M\_{iy}^{\delta\_{iy}}}{J\_{iy}} + \frac{M\_{iy}^{\dot{\beta}\_i}}{J\_{iy}} \frac{Z\_i^{\delta\_{iy}}}{m\_i V\_i}. \end{aligned}$$

Note that the state coefficients $V\_i$, $P\_i$, $x\_{ij}$, $z\_{ij}$ and $Z\_i^{V\_i}$ in (22) are obtained at the balanced point.

*3.3. Random Noises Analysis and Its Modeling*

#### 3.3.1. Process Noises

The formation could be easily affected by various forces in the atmosphere that cannot be accurately measured in advance. Therefore, the process noises cannot be ignored.

For the node $\nu\_i$, assume that the mass $m\_i$, together with the velocity $V\_j$ and flight path declination angle $\varphi\_j$ of the adjacent node $\nu\_j$, which are obtained from the supporting network, are given values (i.e., $V\_j$ and $\varphi\_j$ are already estimated in $\nu\_j$, and the random transmission interference is ignored). In contrast, $x\_{ij}$, $z\_{ij}$, $V\_i$, $P\_i$, $X\_i^{V\_i}$, $Z\_i^{V\_i}$, $Z\_i^{\beta\_i}$, $Z\_i^{\delta\_{iy}}$, $M\_{iy}^{\beta\_i}$, $M\_{iy}^{\dot{\beta}\_i}$, $M\_{iy}^{\omega\_{iy}}$ and $M\_{iy}^{\delta\_{iy}}$ are determined by the aircraft's instantaneous state (such as velocity, altitude, attack angle, yaw angle, etc.). These states and their influences are random in the real flight environment. Therefore, based on the central limit theorem, we assume that the above parameters approximately obey the Gaussian distribution, that is:

$$
\begin{array}{lll}
(1)\ \begin{cases} x\_{ij} = x\_{ijb} + w\_{x\_{ij}} \\ w\_{x\_{ij}} \sim N(0, \sigma^{2}\_{x\_{ij}}) \end{cases}, & (2)\ \begin{cases} z\_{ij} = z\_{ijb} + w\_{z\_{ij}} \\ w\_{z\_{ij}} \sim N(0, \sigma^{2}\_{z\_{ij}}) \end{cases}, & (3)\ \begin{cases} V\_{i} = V\_{ib} + w\_{V\_{i}} \\ w\_{V\_{i}} \sim N(0, \sigma^{2}\_{V\_{i}}) \end{cases}, \\
(4)\ \begin{cases} P\_{i} = P\_{ib} + w\_{P\_{i}} \\ w\_{P\_{i}} \sim N(0, \sigma^{2}\_{P\_{i}}) \end{cases}, & (5)\ \begin{cases} X\_{i}^{V\_i} = X\_{ib}^{V\_i} + w\_{X\_{i}^{V\_i}} \\ w\_{X\_{i}^{V\_i}} \sim N(0, \sigma^{2}\_{X\_{i}^{V\_i}}) \end{cases}, & (6)\ \begin{cases} Z\_{i}^{V\_i} = Z\_{ib}^{V\_i} + w\_{Z\_{i}^{V\_i}} \\ w\_{Z\_{i}^{V\_i}} \sim N(0, \sigma^{2}\_{Z\_{i}^{V\_i}}) \end{cases}, \\
(7)\ \begin{cases} Z\_{i}^{\beta\_i} = Z\_{ib}^{\beta\_i} + w\_{Z\_{i}^{\beta\_i}} \\ w\_{Z\_{i}^{\beta\_i}} \sim N(0, \sigma^{2}\_{Z\_{i}^{\beta\_i}}) \end{cases}, & (8)\ \begin{cases} Z\_{i}^{\delta\_{iy}} = Z\_{ib}^{\delta\_{iy}} + w\_{Z\_{i}^{\delta\_{iy}}} \\ w\_{Z\_{i}^{\delta\_{iy}}} \sim N(0, \sigma^{2}\_{Z\_{i}^{\delta\_{iy}}}) \end{cases}, & (9)\ \begin{cases} M\_{iy}^{\beta\_i} = M\_{iyb}^{\beta\_i} + w\_{M\_{iy}^{\beta\_i}} \\ w\_{M\_{iy}^{\beta\_i}} \sim N(0, \sigma^{2}\_{M\_{iy}^{\beta\_i}}) \end{cases}, \\
(10)\ \begin{cases} M\_{iy}^{\dot{\beta}\_i} = M\_{iyb}^{\dot{\beta}\_i} + w\_{M\_{iy}^{\dot{\beta}\_i}} \\ w\_{M\_{iy}^{\dot{\beta}\_i}} \sim N(0, \sigma^{2}\_{M\_{iy}^{\dot{\beta}\_i}}) \end{cases}, & (11)\ \begin{cases} M\_{iy}^{\omega\_{iy}} = M\_{iyb}^{\omega\_{iy}} + w\_{M\_{iy}^{\omega\_{iy}}} \\ w\_{M\_{iy}^{\omega\_{iy}}} \sim N(0, \sigma^{2}\_{M\_{iy}^{\omega\_{iy}}}) \end{cases}, & (12)\ \begin{cases} M\_{iy}^{\delta\_{iy}} = M\_{iyb}^{\delta\_{iy}} + w\_{M\_{iy}^{\delta\_{iy}}} \\ w\_{M\_{iy}^{\delta\_{iy}}} \sim N(0, \sigma^{2}\_{M\_{iy}^{\delta\_{iy}}}) \end{cases}
\end{array} \tag{23}
$$
The subscript "*b*" denotes deterministic values obtained at the balanced point. The formation states do not change much when the aircraft fly around the balanced point; thus, the variances of the random variables are approximately constant, and the above parameters can be assumed to be independent of each other. In the following, we use $n\_1, n\_2, \cdots, n\_{12}$ to represent the standard Gaussian variables associated with the 12 random terms in (23), so that each random term equals its standard deviation multiplied by the corresponding $n\_k$.

For $a\_{1} = \frac{Z\_{i}^{V\_{i}}z\_{ij}}{m\_{i}V\_{i}} - 1$, substituting (2), (3) and (6) in (23) into $a\_{1}$ yields:

$$a\_{1} = \frac{Z\_{i}^{V\_{i}}z\_{ij}}{m\_{i}V\_{i}} - 1 = \frac{(Z\_{ib}^{V\_{i}} + w\_{Z\_i^{V\_i}})(z\_{ijb} + w\_{z\_{ij}})}{m\_{i}(V\_{ib} + w\_{V\_{i}})} - 1 = \frac{Z\_{ib}^{V\_{i}}z\_{ijb}}{m\_{i}(V\_{ib} + w\_{V\_{i}})} - 1 + \frac{Z\_{ib}^{V\_{i}}w\_{z\_{ij}}}{m\_{i}(V\_{ib} + w\_{V\_{i}})} + \frac{z\_{ijb}w\_{Z\_i^{V\_i}}}{m\_{i}(V\_{ib} + w\_{V\_{i}})} + \frac{w\_{Z\_i^{V\_i}}w\_{z\_{ij}}}{m\_{i}(V\_{ib} + w\_{V\_{i}})}\tag{24}$$

Assume that $w\_{V\_i}$ is relatively small compared to the aircraft speed $V\_{ib}$. Since $V\_{ib} + w\_{V\_i}$ is in the denominator, the impact of $w\_{V\_i}$ on $a\_1$ is small and can be ignored. Assuming that both $w\_{Z\_i^{V\_i}}$ and $w\_{z\_{ij}}$ are small, their product $w\_{Z\_i^{V\_i}} w\_{z\_{ij}}$ is a second-order small quantity and can also be ignored. Then we get:

$$a\_1 = \left(\frac{Z\_{ib}^{V\_i} z\_{ijb}}{m\_i V\_{ib}} - 1\right) + \frac{Z\_{ib}^{V\_i} w\_{z\_{ij}}}{m\_i V\_{ib}} + \frac{z\_{ijb} w\_{Z\_i^{V\_i}}}{m\_i V\_{ib}} = \left(\frac{Z\_{ib}^{V\_i} z\_{ijb}}{m\_i V\_{ib}} - 1\right) + \frac{Z\_{ib}^{V\_i} \sigma\_{z\_{ij}}}{m\_i V\_{ib}} n\_2 + \frac{z\_{ijb} \sigma\_{Z\_i^{V\_i}}}{m\_i V\_{ib}} n\_6 \triangleq a\_{1b} + a\_{1b2}n\_2 + a\_{1b6}n\_6\tag{25}$$
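The size of the term dropped between (24) and (25) can be checked numerically. The sketch below compares the exact perturbed coefficient with its first-order form, using the mass, speed, lateral offset and standard deviations quoted in Section 4.1; the balanced value `Z_b = 20.0` kg/s for $Z\_{ib}^{V\_i}$ is a purely hypothetical placeholder, and the function names are illustrative.

```python
def a1_exact(Z_b, z_b, m, V_b, w_Z, w_z, w_V=0.0):
    """Perturbed coefficient a1 before any term is neglected, as in Eq. (24)."""
    return (Z_b + w_Z) * (z_b + w_z) / (m * (V_b + w_V)) - 1.0


def a1_linear(Z_b, z_b, m, V_b, sigma_Z, sigma_z, n2, n6):
    """First-order form of Eq. (25): a1b + a1b2 * n2 + a1b6 * n6."""
    a1b = Z_b * z_b / (m * V_b) - 1.0
    a1b2 = Z_b * sigma_z / (m * V_b)
    a1b6 = z_b * sigma_Z / (m * V_b)
    return a1b + a1b2 * n2 + a1b6 * n6


# Hypothetical balanced derivative Z_b; other numbers follow Section 4.1.
Z_b, z_b, m, V_b = 20.0, -173.2, 1400.0, 100.0
sigma_Z, sigma_z = 0.5, 1.0

# One-sigma draw on both noises (n2 = n6 = 1), speed noise neglected.
exact = a1_exact(Z_b, z_b, m, V_b, w_Z=sigma_Z, w_z=sigma_z)
linear = a1_linear(Z_b, z_b, m, V_b, sigma_Z, sigma_z, n2=1.0, n6=1.0)
neglected = exact - linear  # equals sigma_Z * sigma_z / (m * V_b) here
```

With the speed noise set to zero, the difference is exactly the product term $w\_{Z\_i^{V\_i}} w\_{z\_{ij}} / (m\_i V\_{ib})$, which is several orders of magnitude below the first-order terms for these values.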

Assume that all the random parts of (23) are small and ignore the second-order small quantities. Following the same reasoning as for $a\_1$, the expressions of the coefficients $a\_2 \sim a\_{12}$ can be derived. The results are given as follows:

$$a\_2 = \frac{(-P\_{ib} + Z\_{ib}^{\beta\_i})z\_{ijb}}{m\_i V\_{ib}} + \frac{(-P\_{ib} + Z\_{ib}^{\beta\_i})\sigma\_{z\_{ij}}}{m\_i V\_{ib}}n\_2 + \frac{-z\_{ijb}\sigma\_{P\_i}}{m\_i V\_{ib}}n\_4 + \frac{z\_{ijb}\sigma\_{Z\_i^{\beta\_i}}}{m\_i V\_{ib}}n\_7 \triangleq a\_{2b} + a\_{2b2}n\_2 + a\_{2b4}n\_4 + a\_{2b7}n\_7 \tag{26}$$

$$a\_3 = \frac{Z\_{ib}^{\delta\_{iy}} z\_{ijb}}{m\_i V\_{ib}} + \frac{Z\_{ib}^{\delta\_{iy}} \sigma\_{z\_{ij}}}{m\_i V\_{ib}} n\_2 + \frac{z\_{ijb} \sigma\_{Z\_i^{\delta\_{iy}}}}{m\_i V\_{ib}} n\_8 \triangleq a\_{3b} + a\_{3b2} n\_2 + a\_{3b8} n\_8 \tag{27}$$

$$a\_4 = \frac{Z\_{ib}^{V\_i} x\_{ijb}}{m\_i V\_{ib}} + \frac{Z\_{ib}^{V\_i} \sigma\_{x\_{ij}}}{m\_i V\_{ib}}n\_1 + \frac{x\_{ijb}\sigma\_{Z\_i^{V\_i}}}{m\_i V\_{ib}}n\_6 \triangleq a\_{4b} + a\_{4b1}n\_1 + a\_{4b6}n\_6\tag{28}$$

$$a\_5 = \frac{(-P\_{ib} + Z\_{ib}^{\beta\_i})x\_{ijb}}{m\_i V\_{ib}} + \frac{(-P\_{ib} + Z\_{ib}^{\beta\_i})\sigma\_{x\_{ij}}}{m\_i V\_{ib}} n\_1 + \frac{-x\_{ijb}\sigma\_{P\_i}}{m\_i V\_{ib}} n\_4 + \frac{x\_{ijb}\sigma\_{Z\_i^{\beta\_i}}}{m\_i V\_{ib}} n\_7 \triangleq a\_{5b} + a\_{5b1}n\_1 + a\_{5b4}n\_4 + a\_{5b7}n\_7 \tag{29}$$

$$a\_6 = \frac{Z\_{ib}^{\delta\_{iy}} x\_{ijb}}{m\_i V\_{ib}} + \frac{Z\_{ib}^{\delta\_{iy}} \sigma\_{x\_{ij}}}{m\_i V\_{ib}}n\_1 + \frac{x\_{ijb}\sigma\_{Z\_i^{\delta\_{iy}}}}{m\_i V\_{ib}}n\_8 \triangleq a\_{6b} + a\_{6b1}n\_1 + a\_{6b8}n\_8 \tag{30}$$

$$a\_7 = \frac{X\_{ib}^{V\_i}}{m\_i} + \frac{\sigma\_{X\_i^{V\_i}}}{m\_i}n\_5 \triangleq a\_{7b} + a\_{7b5}n\_5\tag{31}$$

$$a\_8 = \frac{P\_{ib} - Z\_{ib}^{\beta\_i}}{m\_i V\_{ib}} + \frac{\sigma\_{P\_i}}{m\_i V\_{ib}} n\_4 + \frac{-\sigma\_{Z\_i^{\beta\_i}}}{m\_i V\_{ib}} n\_7 \triangleq a\_{8b} + a\_{8b4} n\_4 + a\_{8b7} n\_7 \tag{32}$$

$$a\_9 = \frac{Z\_{ib}^{\delta\_{iy}}}{m\_i V\_{ib}} + \frac{\sigma\_{Z\_i^{\delta\_{iy}}}}{m\_i V\_{ib}} n\_8 \triangleq a\_{9b} + a\_{9b8} n\_8 \tag{33}$$

$$\begin{aligned} a\_{10} =\ & \left(\frac{M\_{iyb}^{\beta\_i}}{J\_{iy}} - \frac{M\_{iyb}^{\dot{\beta}\_i}}{J\_{iy}} \frac{P\_{ib} - Z\_{ib}^{\beta\_i}}{m\_i V\_{ib}}\right) + \frac{-M\_{iyb}^{\dot{\beta}\_i}\sigma\_{P\_i}}{J\_{iy} m\_i V\_{ib}}n\_4 + \frac{M\_{iyb}^{\dot{\beta}\_i}\sigma\_{Z\_i^{\beta\_i}}}{J\_{iy} m\_i V\_{ib}}n\_7 + \frac{\sigma\_{M\_{iy}^{\beta\_i}}}{J\_{iy}}n\_9 + \frac{-(P\_{ib} - Z\_{ib}^{\beta\_i})\sigma\_{M\_{iy}^{\dot{\beta}\_i}}}{J\_{iy} m\_i V\_{ib}}n\_{10} \\ \triangleq\ & a\_{10b} + a\_{10b4}n\_4 + a\_{10b7}n\_7 + a\_{10b9}n\_9 + a\_{10b10}n\_{10} \end{aligned} \tag{34}$$

$$a\_{11} = \frac{M\_{iyb}^{\omega\_{iy}} + M\_{iyb}^{\dot{\beta}\_i}}{J\_{iy}} + \frac{\sigma\_{M\_{iy}^{\dot{\beta}\_i}}}{J\_{iy}} n\_{10} + \frac{\sigma\_{M\_{iy}^{\omega\_{iy}}}}{J\_{iy}} n\_{11} \triangleq a\_{11b} + a\_{11b10}n\_{10} + a\_{11b11}n\_{11} \tag{35}$$

$$\begin{aligned} a\_{12} =\ & \left( \frac{M\_{iyb}^{\delta\_{iy}}}{J\_{iy}} + \frac{M\_{iyb}^{\dot{\beta}\_i}}{J\_{iy}} \frac{Z\_{ib}^{\delta\_{iy}}}{m\_i V\_{ib}} \right) + \frac{M\_{iyb}^{\dot{\beta}\_i}\sigma\_{Z\_i^{\delta\_{iy}}}}{J\_{iy} m\_i V\_{ib}}n\_8 + \frac{Z\_{ib}^{\delta\_{iy}}\sigma\_{M\_{iy}^{\dot{\beta}\_i}}}{J\_{iy} m\_i V\_{ib}}n\_{10} + \frac{\sigma\_{M\_{iy}^{\delta\_{iy}}}}{J\_{iy}}n\_{12} \\ \triangleq\ & a\_{12b} + a\_{12b8}n\_8 + a\_{12b10}n\_{10} + a\_{12b12}n\_{12} \end{aligned} \tag{36}$$

The states of the system are $X\_{ij} = [\Delta x\_{ij}\ \ \Delta z\_{ij}\ \ \Delta V\_i\ \ \Delta\varphi\_i\ \ \Delta\beta\_i\ \ \Delta\omega\_{iy}\ \ \Delta P\_i\ \ \Delta\delta\_{iy}]^T$; the inputs are $U\_{ij} = [\Delta\delta\_{iPc}\ \ \Delta\delta\_{iyc}\ \ \Delta V\_j\ \ \Delta\varphi\_j]^T$, where $\Delta V\_j$ and $\Delta\varphi\_j$ are the determined interference inputs from the adjacent node. Substituting (25)–(36) into (22), the open-loop state equation of the formation stochastic system is obtained:

$$
\dot{X}\_{ij} = A\_{ij}X\_{ij} + B\_{ij}U\_{ij} + \sum\_{k=1}^{12} F\_{ijk}X\_{ij}n\_k \tag{37}
$$

$$
\text{where } A\_{ij} = \begin{bmatrix} 0 & 0 & a\_{1b} & 0 & a\_{2b} & 0 & 0 & -a\_{3b} \\ 0 & 0 & a\_{4b} & -V\_j & -a\_{5b} & 0 & 0 & a\_{6b} \\ 0 & 0 & -a\_{7b} & 0 & 0 & 0 & 1/m\_i & 0 \\ 0 & 0 & 0 & 0 & a\_{8b} & 0 & 0 & -a\_{9b} \\ 0 & 0 & 0 & 0 & -a\_{8b} & 1 & 0 & a\_{9b} \\ 0 & 0 & 0 & 0 & a\_{10b} & a\_{11b} & 0 & a\_{12b} \\ 0 & 0 & 0 & 0 & 0 & 0 & -1/T\_{iP} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1/T\_{i\delta\_y} \end{bmatrix}, B\_{ij} = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & V\_j \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ K\_{iP}/T\_{iP} & 0 & 0 & 0 \\ 0 & K\_{i\delta\_y}/T\_{i\delta\_y} & 0 & 0 \end{bmatrix}.
$$

For convenience of description, we define the square matrix $T\_{mn}$ of order eight that has only one nonzero element; the value of the nonzero element is '1' and it lies in row *m* and column *n*, matching the position of the corresponding coefficient in $A\_{ij}$.

Then matrices *Fijk*(*k* = 1, 2, ... , 12) can be described as follows:

$$\begin{aligned} F\_{ij1} &= a\_{4b1} T\_{23} - a\_{5b1} T\_{25} + a\_{6b1} T\_{28}, \\ F\_{ij2} &= a\_{2b2} T\_{15} - a\_{3b2} T\_{18}, \\ F\_{ij3} &= a\_{1b3} T\_{13}, \\ F\_{ij4} &= a\_{2b4} T\_{15} - a\_{5b4} T\_{25} + a\_{8b4} T\_{45} - a\_{8b4} T\_{55} + a\_{10b4} T\_{65}, \\ F\_{ij5} &= -a\_{7b5} T\_{33}, \\ F\_{ij6} &= a\_{1b6} T\_{13} + a\_{4b6} T\_{23}, \\ F\_{ij7} &= a\_{2b7} T\_{15} - a\_{5b7} T\_{25} + a\_{8b7} T\_{45} - a\_{8b7} T\_{55} + a\_{10b7} T\_{65}, \\ F\_{ij8} &= -a\_{3b8} T\_{18} + a\_{6b8} T\_{28} - a\_{9b8} T\_{48} + a\_{9b8} T\_{58} + a\_{12b8} T\_{68}, \\ F\_{ij9} &= a\_{10b9} T\_{65}, \\ F\_{ij10} &= a\_{10b10} T\_{65} + a\_{11b10} T\_{66} + a\_{12b10} T\_{68}, \\ F\_{ij11} &= a\_{11b11} T\_{66}, \\ F\_{ij12} &= a\_{12b12} T\_{68}. \end{aligned}$$
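The noise matrices are sparse sums of scaled elementary matrices, so they are easy to assemble programmatically. The sketch below is a minimal illustration that reads $T\_{mn}$ as having its single 1 in row *m*, column *n* (the reading that matches the coefficient positions in $A\_{ij}$); the helper names and the numeric coefficient values are placeholders, not values from the paper.

```python
import numpy as np


def T(m, n, order=8):
    """Elementary matrix with a single 1 in row m, column n (1-based indices)."""
    mat = np.zeros((order, order))
    mat[m - 1, n - 1] = 1.0
    return mat


def assemble_F(terms, order=8):
    """Sum coefficient * T(m, n) over a list of (coefficient, m, n) triples."""
    F = np.zeros((order, order))
    for coef, m, n in terms:
        F += coef * T(m, n, order)
    return F


# F_ij2 = a_2b2 * T_15 - a_3b2 * T_18, with placeholder coefficient values.
a_2b2, a_3b2 = 0.3, 0.7
F_ij2 = assemble_F([(a_2b2, 1, 5), (-a_3b2, 1, 8)])
```

Each `F_ijk` built this way has nonzero entries only where the noise source $n\_k$ enters a coefficient of $A\_{ij}$.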

#### 3.3.2. Measurement Noises

The states of the stochastic system (37) are measured by the support network and relative navigation in the UAV swarm's information acquisition system. The measurement vector is defined as $Y\_{ij} = [\Delta x\_{ijm}\ \ \Delta z\_{ijm}\ \ \Delta V\_{im}\ \ \Delta\varphi\_{im}\ \ \Delta\beta\_{im}\ \ \Delta\omega\_{iym}\ \ \Delta\delta\_{iym}]^T$, assuming that $\Delta P\_i$ cannot be measured. These measured values are mainly affected by sensor measurement errors, network transmission and random disturbances in the flight environment. Assuming that the measurement noises of the system approximately obey a Gaussian distribution with zero mean and variance $\sigma\_m^2$, the measurement equation is:

$$\begin{cases} \Delta x\_{ijm} = \Delta x\_{ij} + \sigma\_{\Delta x\_{ijm}} n\_{13} \\ \Delta z\_{ijm} = \Delta z\_{ij} + \sigma\_{\Delta z\_{ijm}} n\_{14} \\ \Delta V\_{im} = \Delta V\_i + \sigma\_{\Delta v\_{im}} n\_{15} \\ \Delta \varphi\_{im} = \Delta \varphi\_i + \sigma\_{\Delta \varphi\_{im}} n\_{16} \\ \Delta \beta\_{im} = \Delta \beta\_i + \sigma\_{\Delta \beta\_{im}} n\_{17} \\ \Delta \omega\_{iym} = \Delta \omega\_{iy} + \sigma\_{\Delta \omega\_{iym}} n\_{18} \\ \Delta \delta\_{iym} = \Delta \delta\_{iy} + \sigma\_{\Delta \delta\_{iym}} n\_{19} \end{cases} \tag{38}$$

The subscript "*m*" denotes the measured value of the system; $n\_{13}, n\_{14}, \cdots, n\_{19}$ are standard Gaussian white noises that are independent of each other. They are also independent of $n\_1, n\_2, \cdots, n\_{12}$.

In summary, the measurement equation is:

$$Y\_{ij} = H\_{ij} X\_{ij} + \sum\_{k=13}^{19} E\_{ijk} n\_k \tag{39}$$

$$\begin{aligned} \text{where } H\_{ij} &= \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}, \\ E\_{ij13} &= [\sigma\_{\Delta x\_{ijm}}, 0, 0, 0, 0, 0, 0]^T, \quad E\_{ij14} = [0, \sigma\_{\Delta z\_{ijm}}, 0, 0, 0, 0, 0]^T, \quad E\_{ij15} = [0, 0, \sigma\_{\Delta v\_{im}}, 0, 0, 0, 0]^T, \\ E\_{ij16} &= [0, 0, 0, \sigma\_{\Delta \varphi\_{im}}, 0, 0, 0]^T, \quad E\_{ij17} = [0, 0, 0, 0, \sigma\_{\Delta \beta\_{im}}, 0, 0]^T, \quad E\_{ij18} = [0, 0, 0, 0, 0, \sigma\_{\Delta \omega\_{iym}}, 0]^T, \\ E\_{ij19} &= [0, 0, 0, 0, 0, 0, \sigma\_{\Delta \delta\_{iym}}]^T. \end{aligned}$$
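The measurement matrices follow directly from the state ordering and from (38). As a minimal sketch, they can be generated rather than typed out: $H\_{ij}$ is the 8×8 identity with the row of the unmeasured thrust state removed, and each $E\_{ijk}$ is a scaled unit vector. The variable names are illustrative; the standard deviations are those listed in Section 4.1.3.

```python
import numpy as np

# State order: [dx, dz, dV, dphi, dbeta, domega, dP, ddelta].
# dP (zero-based index 6) is assumed unmeasured, so its row is removed.
H = np.delete(np.eye(8), 6, axis=0)  # shape (7, 8)

# Measurement standard deviations in the order of Eq. (38) (Section 4.1.3).
sigma_meas = np.array([4.8, 4.8, 6.9, 0.005, 0.01, 0.01, 0.001])

# E[0] corresponds to E_ij13, ..., E[6] to E_ij19 (each a length-7 vector).
E = [sigma_meas[k] * np.eye(7)[k] for k in range(7)]

# Covariance rate of the summed measurement noise: diag(sigma^2).
R = sum(np.outer(e, e) for e in E)
```

Because the noise channels are independent, `R` is diagonal with the squared standard deviations on its diagonal.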

#### *3.4. Formation Estimator Design*

For the state estimation problem of the formation control stochastic system (37) and (39), we use a novel Itô stochastic system estimator with a fixed gain, following [38].

In Lemma 4, the gain of the estimator is time-varying. In this paper, however, the USCC problem is investigated during cruising, when the formation maintains a given configuration at a given speed and height and therefore does not change very much. We thus use an estimator with fixed gain in (10) to estimate the states, which significantly simplifies the computation and improves the real-time performance of the system.

The formation control stochastic system (37) and (39) can be described as the following Itô stochastic system:

$$dX\_{ij}(t) = A\_{ij}X\_{ij}(t)dt + B\_{ij}U\_{ij}(t)dt + \sum\_{k=1}^{12} F\_{ijk}X\_{ij}(t)dW\_k(t) \tag{40}$$

$$dY\_{ij}(t) = H\_{ij}dX\_{ij}(t) + \sum\_{k=13}^{19} E\_{ijk} dW\_k(t) \tag{41}$$

where $W\_k(t)$ ($k = 1 \sim 12$) in (40) are standard scalar Wiener processes, and $\sum\_{k=13}^{19} E\_{ijk}dW\_k(t)$ is a 7-dimensional Wiener process.

The initial states *Xij*(0) satisfy:

$$\begin{cases} E\left[X\_{ij}(0)X\_{ij}^{T}(0)\right] = D\_{ij}(0) \\ E\left[dW\_k(t)dW\_k^{T}(t)\right] = dt \\ E\left\{\left[\sum\_{k=13}^{19} E\_{ijk} dW\_k(t)\right] \left[\sum\_{k=13}^{19} E\_{ijk} dW\_k(t)\right]^T\right\} = R\_{ij}dt \end{cases} \tag{42}$$

Then, substituting (40) into (41), the fixed-gain estimator of the formation control stochastic system is obtained from Lemma 4:

$$d\hat{X}\_{ij}(t) = (A\_{ij} - K\_f H\_{ij} A\_{ij}) \hat{X}\_{ij}(t)dt + B\_{ij} U\_{ij}(t)dt + K\_f dY\_{ij}(t) \tag{43}$$

$$
\hat{X}\_{ij}(0) = 0 \tag{44}
$$

where $\hat{X}\_{ij} = [\Delta\hat{x}\_{ij}\ \ \Delta\hat{z}\_{ij}\ \ \Delta\hat{V}\_i\ \ \Delta\hat{\varphi}\_i\ \ \Delta\hat{\beta}\_i\ \ \Delta\hat{\omega}\_{iy}\ \ \Delta\hat{P}\_i\ \ \Delta\hat{\delta}\_{iy}]^T$ is the state estimate vector, $K\_f$ is the fixed gain of the estimator, and $U\_{ij}$ is the control input.
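To make the interplay of (40), (41) and (43) concrete, the following sketch propagates the plant and the fixed-gain estimator with a simple Euler-Maruyama step. The discretization scheme, the function name and the scalar toy system in the usage lines are assumptions chosen for illustration; they are not the simulation setup used in the paper.

```python
import numpy as np


def simulate_estimator(A, B, F_list, H, E_list, K_f, U, X0, dt, steps, rng):
    """Euler-Maruyama propagation of the plant (40), the measurement
    increment (41) and the fixed-gain estimator (43), with X_hat(0) = 0."""
    X = np.array(X0, dtype=float)
    X_hat = np.zeros_like(X)
    for _ in range(steps):
        dW_proc = rng.normal(0.0, np.sqrt(dt), size=len(F_list))
        dW_meas = rng.normal(0.0, np.sqrt(dt), size=len(E_list))
        # Plant increment, Eq. (40): drift plus state-multiplicative noise.
        dX = (A @ X + B @ U) * dt
        for F, dw in zip(F_list, dW_proc):
            dX = dX + (F @ X) * dw
        # Measurement increment, Eq. (41).
        dY = H @ dX
        for E, dw in zip(E_list, dW_meas):
            dY = dY + E * dw
        # Estimator increment, Eq. (43).
        dX_hat = ((A - K_f @ H @ A) @ X_hat + B @ U) * dt + K_f @ dY
        X = X + dX
        X_hat = X_hat + dX_hat
    return X, X_hat


# Noise-free scalar toy system: the estimation error decays geometrically.
A = np.array([[-1.0]])
B = np.array([[0.0]])
H = np.array([[1.0]])
K_f = np.array([[0.5]])
X_end, X_hat_end = simulate_estimator(
    A, B, [], H, [], K_f, np.array([0.0]), [1.0],
    dt=0.01, steps=1000, rng=np.random.default_rng(0),
)
```

In this toy case the error obeys $e\_{k+1} = (1 - 0.5\,dt)\,e\_k$, so the estimate converges to the state even though it starts from zero.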

#### *3.5. Formation Controller Design*

It can be seen from (22) that the system states are highly coupled (for example, the forward distance $\Delta x\_{ij}$ with the sideslip angle $\Delta\beta\_i$, and the lateral distance $\Delta z\_{ij}$ with $\Delta V\_i$), so the forward and lateral distances are controlled jointly in this paper. The PID-based formation control structure is a commonly used design method in current engineering applications [40]. According to the clustering algorithm proposed in [41], the PID formation controller we adopt is:

$$
U\_{ij} = K\_{c} K\_{\omega ij} (\hat{X}\_{ij} - X\_{ij}^\*) + U\_{jd} \tag{45}
$$

$$X\_{ij}^\* = \begin{bmatrix} \Delta x\_{ij}^\*, \Delta z\_{ij}^\*, \Delta V\_{i}^\*, \Delta \varphi\_{i}^\*, \Delta \beta\_{i}^\*, 0, 0, 0 \end{bmatrix}^T \tag{46}$$

$$K\_{\omega ij} = \text{diag}(\omega\_{ij}, \omega\_{ij}, 1, 1, 1, 1, 1, 1) \tag{47}$$

$$U\_{jd} = \begin{bmatrix} 0, 0, \Delta V\_j, \Delta \varphi\_j \end{bmatrix}^T \tag{48}$$

where $\hat{X}\_{ij}$ is the output of the estimator; $X\_{ij}^\*$ is the system command (the superscript "∗" indicates a command, here and below); $U\_{jd}$ is the determined interference input vector of the adjacent node; $K\_c \in R^{4 \times 8}$ is the control law, in which the last two rows are zero vectors because $\Delta V\_j$ and $\Delta\varphi\_j$ are the determined interference inputs in $U\_{ij}$; $K\_{\omega ij}$ is the adjacency adjustment matrix, and $0 \le \omega\_{ij} \le 1$ are the adjacency coefficients. The larger $\omega\_{ij}$ is, the stronger the adjacency relationship between nodes $\nu\_i$ and $\nu\_j$.
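The control law (45)–(48) amounts to a weighted state-error feedback plus a pass-through of the neighbor's inputs. A minimal sketch follows; the function name and the gain values in the usage lines are hypothetical placeholders, not the optimized gains of the paper.

```python
import numpy as np


def formation_control(K_c, omega_ij, X_hat, X_cmd, dV_j, dphi_j):
    """Control input U_ij of Eq. (45) for one pair of nodes."""
    # Adjacency adjustment matrix, Eq. (47).
    K_omega = np.diag([omega_ij, omega_ij, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])
    # Determined interference input of the adjacent node, Eq. (48).
    U_jd = np.array([0.0, 0.0, dV_j, dphi_j])
    return K_c @ K_omega @ (np.asarray(X_hat) - np.asarray(X_cmd)) + U_jd


# Placeholder gain: the last two rows of K_c are zero, as stated in the text.
K_c = np.zeros((4, 8))
K_c[0, 0], K_c[1, 1] = 2.0, 3.0
X_hat = np.array([1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
U = formation_control(K_c, 0.5, X_hat, np.zeros(8), dV_j=0.3, dphi_j=0.1)
```

Setting `omega_ij` to zero removes the position-error feedback entirely, which is how a weak adjacency relationship decouples a node from its neighbor.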

The key to designing a good PID controller is choosing proper gain parameters. To obtain better feedback coefficients, we use the SRAD method, which combines the genetic algorithm with Monte Carlo simulation to improve the robustness of the stochastic system.

We can see from (45) that $K\_{c\Delta V\_i}(\Delta\hat{V}\_i - \Delta V\_i^\*)$ and $K\_{c\Delta\varphi\_i}(\Delta\hat{\varphi}\_i - \Delta\varphi\_i^\*)$ reflect the individuality forces of the individual aircraft, while $K\_{c\Delta x\_{ij}}\omega\_{ij}(\Delta\hat{x}\_{ij} - \Delta x\_{ij}^\*)$ and $K\_{c\Delta z\_{ij}}\omega\_{ij}(\Delta\hat{z}\_{ij} - \Delta z\_{ij}^\*)$ reflect the interaction forces, which represent the quality of the formation cooperation. Both contribute to maintaining the configuration during formation maneuvering.

#### *3.6. Closed-Loop USCC Stochastic System*

In summary, the Itô stochastic system of the USCC is as follows:

$$\begin{cases} dX\_{ij} = A\_{ij}X\_{ij}dt + B\_{ij}U\_{ij}dt + \sum\_{k=1}^{12} F\_{ijk}X\_{ij}dW\_{k} \\ dY\_{ij} = H\_{ij}dX\_{ij} + \sum\_{k=13}^{19} E\_{ijk}dW\_{k} \\ U\_{ij} = K\_{c}K\_{\omega ij}(\hat{X}\_{ij} - X\_{ij}^{\*}) + U\_{jd} \\ d\hat{X}\_{ij} = (A\_{ij} - K\_{f}H\_{ij}A\_{ij})\hat{X}\_{ij}dt + B\_{ij}U\_{ij}dt + K\_{f}dY\_{ij} \end{cases} \tag{49}$$

where the first equation is the state equation, the second is the measurement equation, the third is the control input, and the fourth is the state estimation equation. The above four equations together construct the expansion closed-loop equation for the stochastic system of USCC (as shown in Figure 2):

$$d\overline{X}\_{ij} = \left[\overline{A}\_{ij}\overline{X}\_{ij} + \overline{B}\_{ij}(t)\right]dt + \sum\_{k=1}^{19} \left[\overline{F}\_{ijk}\overline{X}\_{ij} + \overline{G}\_{ijk}\right]dW\_{k} \tag{50}$$

**Figure 2.** The framework of the closed-loop system.

The states of the expansion system are:

$$\overline{X}\_{ij} = \begin{bmatrix} X\_{ij} \\ \hat{X}\_{ij} \end{bmatrix} \tag{51}$$

where $\overline{X}\_{ij} \in R^{16 \times 1}$; $X\_{ij} \in R^{8 \times 1}$ are the original states; $\hat{X}\_{ij} \in R^{8 \times 1}$ are the estimated states.

The state transfer matrix of the expansion system is:

$$
\overline{A}\_{ij} = \begin{bmatrix} A\_{ij} & B\_{ij} K\_{c} K\_{\omega ij} \\ K\_f H\_{ij} A\_{ij} & \left[ (I\_{8 \times 8} - K\_f H\_{ij}) A\_{ij} + (I\_{8 \times 8} + K\_f H\_{ij}) B\_{ij} K\_c K\_{\omega ij} \right] \end{bmatrix} \tag{52}
$$

where $\overline{A}\_{ij} \in R^{16 \times 16}$; $A\_{ij} \in R^{8 \times 8}$ is the state transfer matrix of the original system; $B\_{ij} \in R^{8 \times 4}$ is the input matrix of the original system; $H\_{ij} \in R^{7 \times 8}$ is the measurement matrix of the original system; $K\_{\omega ij} \in R^{8 \times 8}$ is the adjacency adjustment matrix in $U\_{ij}$; $K\_c \in R^{4 \times 8}$ is the control law; $K\_f \in R^{8 \times 7}$ is the gain of the estimator.

The input matrix of the expansion system is:

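$$
\overline{B}\_{ij}(t) = \begin{bmatrix} B\_{ij}(U\_{jd} - K\_c K\_{\omega ij} X\_{ij}^\*) \\ (I\_{8 \times 8} + K\_f H\_{ij}) B\_{ij}(U\_{jd} - K\_c K\_{\omega ij} X\_{ij}^\*) \end{bmatrix} \tag{53}
$$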

where $\overline{B}\_{ij}(t) \in R^{16 \times 1}$; $X\_{ij}^\* \in R^{8 \times 1}$ is the system command; $U\_{jd} \in R^{4 \times 1}$ is the determined input of the adjacent node in $U\_{ij}$. Note that $\overline{B}\_{ij}(t)$ is a time-varying matrix because $X\_{ij}^\*$ is time-varying.

The expansion stochastic state transfer matrix is:

$$
\overline{F}\_{ijk} = \begin{bmatrix} F\_{ijk} & 0 \\ K\_f H\_{ij} F\_{ijk} & 0 \end{bmatrix} \tag{54}
$$

where $\overline{F}\_{ijk} \in R^{16 \times 16}$; $F\_{ijk} = 0\_{8 \times 8}$ ($k = 13 \sim 19$).

The stochastic input matrix of the expansion system is:

$$\overline{G}\_{ijk} = \begin{bmatrix} 0\\ K\_f E\_{ijk} \end{bmatrix} \tag{55}$$

where $\overline{G}\_{ijk} \in R^{16 \times 1}$; $E\_{ijk} = 0\_{7 \times 1}$ ($k = 1 \sim 12$).

The standard Wiener process is:

$$W = \begin{bmatrix} W\_1, W\_2, \dots, W\_{19} \end{bmatrix}^T \tag{56}$$

where *W*<sup>1</sup> ∼ *W*<sup>19</sup> are the Wiener processes in (49); *W* is an independent 19-dimension standard Wiener process defined in the complete probability space: (Ω, F, P).
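The block structure of (52), (54) and (55) can be assembled mechanically from the original matrices. The sketch below is one possible way to do so; the function name and argument layout are assumptions, and the deterministic input term of (53) is deliberately left out.

```python
import numpy as np


def build_augmented(A, B, H, K_f, K_c, K_omega, F_list, E_list):
    """Assemble the expansion-system matrices of Eqs. (52), (54) and (55).

    F_list holds the 8x8 matrices F_ijk (k = 1..12) and E_list holds the
    length-7 vectors E_ijk (k = 13..19).
    """
    n = A.shape[0]
    I = np.eye(n)
    BK = B @ K_c @ K_omega
    A_bar = np.block([
        [A, BK],
        [K_f @ H @ A, (I - K_f @ H) @ A + (I + K_f @ H) @ BK],
    ])
    zeros = np.zeros((n, n))
    F_bar = [np.block([[F, zeros], [K_f @ H @ F, zeros]]) for F in F_list]
    G_bar = [np.concatenate([np.zeros(n), K_f @ E]) for E in E_list]
    return A_bar, F_bar, G_bar
```

For the dimensions used in this section (8 states, 4 inputs, 7 measurements), `A_bar` and every `F_bar` entry are 16×16 and every `G_bar` entry has length 16.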

*3.7. Main Results*

Define the following matrix:

$$M\_{ij} = \overline{A}\_{ij} \otimes \overline{A}\_{ij} + \sum\_{k=1}^{19} \overline{F}\_{ijk} \otimes \overline{F}\_{ijk} \tag{57}$$

According to Theorem 1, the sufficient conditions of the stability of USCC stochastic model are:


It can be seen from (53) and (55) that all the elements of the matrices $\overline{B}\_{ij}(t)$ and $\overline{G}\_{ijk}$ are bounded by definition: in $B\_{ij}$, $K\_{iP}$ and $T\_{iP}$ are constant parameters of the thrust response, and $K\_{i\delta\_y}$ and $T\_{i\delta\_y}$ are constant parameters of the elevator response; $K\_c \in R^{4 \times 8}$ is the control law, which can be obtained after simulation; $K\_{\omega ij}$ is the adjacency adjustment matrix, whose elements are in the range [0, 1]; $X\_{ij}^\*$ are bounded command values; $U\_{jd}$ is the determined interference input vector of the adjacent node; $K\_f \in R^{8 \times 7}$ is the fixed gain of the estimator, which can be determined from simulation; the nonzero elements of the matrix $H\_{ij} \in R^{7 \times 8}$ are '1'; and in $\overline{G}\_{ijk}$, $E\_{ijk}$ ($k = 13 \sim 19$) is the bounded standard deviation vector of the measurement noises. Therefore, we can further deduce the following:

**Proposition 1.** *Under normal circumstances, a sufficient condition for the stability of the USCC system (that is, the mean square uniform boundedness of the stochastic system (50)) is that the real parts of all eigenvalues of* $M\_{ij}$ *are negative, that is:*

$$\max\{\text{Re}\lambda(M\_{ij})\} < 0\tag{58}$$

*where* $\max\{\text{Re}\lambda(M\_{ij})\}$ *is the maximum real part of the eigenvalues of* $M\_{ij}$.
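Condition (58) can be checked numerically once the expansion matrices are available. The sketch below assumes that the drift part of $M\_{ij}$ is the Kronecker sum $\overline{A}\_{ij} \otimes I + I \otimes \overline{A}\_{ij}$, which is the form obtained when the second-moment equation of a linear Itô system is vectorized; this reading of (57), together with the function names, is an assumption rather than something stated in the text.

```python
import numpy as np


def mean_square_matrix(A_bar, F_bar_list):
    """M = (A (+) A) + sum_k F_k (x) F_k, with (+) the Kronecker sum (assumed)."""
    n = A_bar.shape[0]
    I = np.eye(n)
    M = np.kron(A_bar, I) + np.kron(I, A_bar)
    for F in F_bar_list:
        M = M + np.kron(F, F)
    return M


def is_mean_square_stable(A_bar, F_bar_list):
    """Check the eigenvalue condition of Eq. (58)."""
    M = mean_square_matrix(A_bar, F_bar_list)
    return bool(np.max(np.linalg.eigvals(M).real) < 0.0)


# Scalar check: dX = a X dt + f X dW is mean-square stable iff 2a + f^2 < 0.
stable = is_mean_square_stable(np.array([[-1.0]]), [np.array([[1.0]])])
unstable = is_mean_square_stable(np.array([[-1.0]]), [np.array([[2.0]])])
```

For the 16-state expansion system the matrix is 256×256, which is still inexpensive to evaluate inside an optimization loop.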

#### **4. Simulation and Experiments**

With the aim of cooperative detection under complex environments, we ran simulations with two aircraft to verify the effectiveness of the proposed stochastic model. Moreover, the equivalent outfield flight test was carried out to complete the mission of cooperative detection in a certain area. The results demonstrate that the formation could be achieved effectively and accurately.

#### *4.1. Simulation Results*

#### 4.1.1. Initial State

For convenience of analysis, we set the number of UAVs in the swarm to *n* = 2. The configuration of the formation during cruising is $x\_{12b} = 100$ m, $z\_{12b} = -173.2$ m (that is, $\nu\_2$ is located 100 m forward and 173.2 m to the left of $\nu\_1$), and the cruising speeds are $V\_{1b} = V\_{2b} = 100$ m/s. The cruising trajectory declination is $\varphi\_{1b} = \varphi\_{2b} = 0$ rad. The current configuration of the formation is the cruising formation, the current speed is $V\_1 = V\_2 = 100$ m/s, and the current flight path declination is $\varphi\_1 = \varphi\_2 = 0$ rad. The original mass is $m\_1 = m\_2 = 1400$ kg, and the y-components of the moments of inertia are $I\_{y1} = I\_{y2} = 3980$ kg·m<sup>2</sup>.

#### 4.1.2. Formation Parameters

Assume that the supporting network is strongly connected and that the adjacency coefficient ω*ij* in (47) is 1.

#### 4.1.3. Standard Deviation of Random Interference

Assume that the standard deviations of random interference in Equations (23) and (38) are:

$\sigma\_{x\_{ij}} = 1$ m, $\sigma\_{z\_{ij}} = 1$ m, $\sigma\_{V\_i} = 0.1$ m/s, $\sigma\_{P\_i} = 5$ N, $\sigma\_{X\_i^{V\_i}} = 0.25$ kg/s, $\sigma\_{Z\_i^{V\_i}} = 0.5$ kg/s, $\sigma\_{Z\_i^{\beta\_i}} = 0.5$ N/rad, $\sigma\_{Z\_i^{\delta\_{iy}}} = 1.0$ N/rad, $\sigma\_{M\_{iy}^{\beta\_i}} = 1.0$ Nm/rad, $\sigma\_{M\_{iy}^{\dot{\beta}\_i}} = 0.5$ Nms/rad, $\sigma\_{M\_{iy}^{\omega\_{iy}}} = 1.0$ Nms/rad, $\sigma\_{M\_{iy}^{\delta\_{iy}}} = 1.0$ Nm/rad, $\sigma\_{\Delta x\_{ijm}} = 4.8$ m, $\sigma\_{\Delta z\_{ijm}} = 4.8$ m, $\sigma\_{\Delta v\_{im}} = 6.9$ m/s, $\sigma\_{\Delta \varphi\_{im}} = 0.005$ rad, $\sigma\_{\Delta \beta\_{im}} = 0.01$ rad, $\sigma\_{\Delta \omega\_{iym}} = 0.01$ rad/s, $\sigma\_{\Delta \delta\_{iym}} = 0.001$ rad.

#### 4.1.4. Formation Commands

The expected configuration of the system is $x\_{ij}^\* = 50$ m, $z\_{ij}^\* = -73.2$ m, the expected formation speed is $V\_f = 100$ m/s, and the expected formation declination is $\varphi\_f = 0$ rad.

#### 4.1.5. Optimal Design of Estimator Gain and Control Law

In order to improve the robustness of the stochastic system, the estimator gain in (43) and the control law in (45) are optimized by the SRAD [37].

The SRAD design flow is shown in Figure 3. It is composed of two parts: a modern optimization algorithm and a control structure design. MCE denotes Monte Carlo evaluation, and SRA denotes stochastic robustness analysis.

The cost function we designed is $J\_{ij} = \sum\_{i=1}^{12} w\_i I\_i^2(q\_i)$, where $q\_i$ represents the 12 indicators shown in Table 1, $w\_i$ is the weight of each indicator, and $I\_i(\cdot)$ is the membership function of each indicator, which obeys the rising-ridge distribution (59) or the 0–1 distribution (60).

**Figure 3.** The design flow of SRAD.

**Table 1.** The stability and indicators.


For indicators 1–9 and 12, the membership function obeys the rising-ridge distribution, i.e.,

$$I(\mathbf{x}) = \begin{cases} 0 & \mathbf{x} \le a \\ \frac{1}{2} + \frac{1}{2} \sin \frac{\pi}{b-a} (\mathbf{x} - \frac{a+b}{2}) & a < \mathbf{x} \le b \\ 1 & \mathbf{x} > b \end{cases} \tag{59}$$

where *a*, *b* are the best and allowable value of the indicators respectively. *x* is the simulation result. For indicators 10 and 11, the membership function obeys the 0–1 distribution, i.e.,

$$I(\mathbf{x}) = \begin{cases} 1 & \mathbf{x} < a \\ 0 & a \le \mathbf{x} \le b \\ 1 & \mathbf{x} > b \end{cases} \tag{60}$$

where (*a*, *b*) is the allowable range of the indicators, *x* is the simulation result.
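The two membership functions and the weighted cost $J = \sum\_i w\_i I\_i^2(q\_i)$ are straightforward to evaluate. The sketch below implements (59) and (60) as written; the function names and the indicator values, thresholds and weights in the usage lines are hypothetical and serve only to show the call pattern.

```python
import math


def rising_ridge(x, a, b):
    """Rising-ridge membership of Eq. (59): 0 at or below the best value a,
    1 above the allowable value b, and a half-sine ramp in between."""
    if x <= a:
        return 0.0
    if x > b:
        return 1.0
    return 0.5 + 0.5 * math.sin(math.pi / (b - a) * (x - (a + b) / 2.0))


def zero_one(x, a, b):
    """0-1 membership of Eq. (60): 0 inside the allowable range [a, b], else 1."""
    return 0.0 if a <= x <= b else 1.0


def cost(q, w, memberships):
    """Weighted cost J = sum_i w_i * I_i(q_i)**2."""
    return sum(w_i * I_i(q_i) ** 2 for q_i, w_i, I_i in zip(q, w, memberships))


# Two hypothetical indicators: one halfway up the ramp, one out of range.
J = cost(
    q=[0.5, 2.0],
    w=[1.0, 2.0],
    memberships=[lambda x: rising_ridge(x, 0.0, 1.0),
                 lambda x: zero_one(x, 0.0, 1.0)],
)
```

Both functions return 0 for a fully satisfied indicator and 1 for a violated one, so a smaller cost corresponds to a design that meets more of its requirements.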

$\hat{p}\_i(K\_c, K\_f)$ is the probability that indicator *i* fails to satisfy the stability and performance requirements, whose probability distribution function is $I\_i(\cdot)$.

The minimum cost function J reflects the minimum probability that any indicator cannot satisfy the stability and requirements. It also indicates the minimum errors of the selected properties. Therefore, the obtained controller has high-quality robustness, and the probability that the control system does not meet the requirements is significantly reduced after multiple simulations.
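As a minimal sketch of how the membership functions (59) and (60) and the weighted cost can be evaluated, the following Python code may help. The function names and the numerical values in the example are hypothetical and are not taken from Table 1; the cost simply sums the weighted squared memberships, as in the cost function stated above.

```python
import math


def rising_ridge(x, a, b):
    """Rising-ridge membership, Equation (59): 0 at or below the best
    value a, 1 above the allowable value b, smooth sine ramp in between."""
    if x <= a:
        return 0.0
    if x > b:
        return 1.0
    return 0.5 + 0.5 * math.sin(math.pi / (b - a) * (x - (a + b) / 2.0))


def zero_one(x, a, b):
    """0-1 membership, Equation (60): 0 inside the allowable range [a, b],
    1 outside of it."""
    return 0.0 if a <= x <= b else 1.0


def cost(weights, memberships):
    """Weighted sum of squared membership values."""
    return sum(w * m ** 2 for w, m in zip(weights, memberships))


# Hypothetical example with two indicators (values are illustrative only).
memberships = [rising_ridge(0.5, 0.0, 1.0), zero_one(2.0, 0.0, 1.0)]
print(cost([1.0, 2.0], memberships))
```

A membership value of 0 means the indicator is at its best value (or within its allowable range), so a lower cost corresponds to a design that violates fewer requirements.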

The design steps in Figure 3 are as follows:


(3) Carry out Monte Carlo simulation on the closed-loop system to obtain the probability $\hat{p}_i(K_c, K_f)$ that the stability and performance requirements are not satisfied;

(4) Construct a stochastic cost function $\hat{J}(K_c, K_f)$ that accounts for both robust stability and performance;

(5) Apply a modern optimization algorithm to find the optimal value. Once the minimum value of $\hat{J}(K_c, K_f)$ is found, we obtain a stochastic robust optimal controller and an optimal estimator.
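The interplay between Monte Carlo evaluation and the optimization algorithm in steps (3)–(5) can be illustrated with a toy sketch. The code below is not the authors' implementation: the closed-loop system (49) is replaced by a hypothetical scalar stand-in (`simulate_indicator`), the design variable is a single gain rather than the pair (*Kc*, *Kf*), and the genetic algorithm is a deliberately simple variant. Only the loop structure (a Monte Carlo estimate of the failure probability inside a population-based search) is meant to mirror the design flow.

```python
import random


def simulate_indicator(gain, rng):
    """Hypothetical stand-in for one closed-loop run: returns a scalar
    indicator that shrinks as the gain grows, perturbed by Gaussian noise."""
    return abs(1.0 / (1.0 + gain) + rng.gauss(0.0, 0.1))


def failure_probability(gain, rng, threshold=0.3, runs=100):
    """Monte Carlo estimate of the probability that the indicator exceeds
    its (assumed) allowable threshold."""
    failures = sum(simulate_indicator(gain, rng) > threshold for _ in range(runs))
    return failures / runs


def stochastic_cost(gain, rng):
    """Toy single-indicator cost: squared estimated failure probability."""
    return failure_probability(gain, rng) ** 2


def genetic_search(rng, generations=15, population_size=20, lower=0.0, upper=10.0):
    """Minimal genetic algorithm: keep the better half of the population,
    then refill it with averaged and mutated children."""
    population = [rng.uniform(lower, upper) for _ in range(population_size)]
    best_cost, best_gain = float("inf"), None
    for _ in range(generations):
        scored = sorted((stochastic_cost(g, rng), g) for g in population)
        if scored[0][0] < best_cost:
            best_cost, best_gain = scored[0]
        parents = [g for _, g in scored[: population_size // 2]]
        children = []
        while len(parents) + len(children) < population_size:
            child = 0.5 * (rng.choice(parents) + rng.choice(parents))  # crossover
            child += rng.gauss(0.0, 0.5)  # mutation
            children.append(min(upper, max(lower, child)))
        population = parents + children
    return best_cost, best_gain


rng = random.Random(0)  # fixed seed so the sketch is reproducible
best_cost, best_gain = genetic_search(rng)
print(best_cost, best_gain)
```

Because the cost is itself a Monte Carlo estimate, it is noisy: evaluating the same gain twice can give different values, which is one reason a population-based search is a natural fit here.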

The optimization process of the designed parameters *Kf* and *Kc* is shown in Figure 4. After iterating 15 times with the genetic algorithm and running the Monte Carlo simulation 100 times per iteration, we got the optimal cost value: *J* = 4.56 and the optimal parameters:

**Figure 4.** The iterative process of the genetic algorithm. (Note that the best fitness is the minimum value of the cost function in each iteration and the mean fitness is the mean value of the 100 Monte Carlo simulations in each iteration).

#### 4.1.6. Simulation Framework

The framework of the simulation model is shown in Figure 5.

In the framework shown in Figure 5, the flight members in the formation are referred to as nodes in the supporting network. Each node obtains information through the formation support network and the sensor system, including neighbor nodes' information and environment information. The decision management system allocates missions to flight members and plans the flight route for the formation. Finally, the formation control system and the member flight control system carry out missions using the Itô stochastic system model (49) and the parameters presented above.

**Figure 5.** Simulation framework.

#### 4.1.7. Simulation Results

In order to better reflect the performance of the USCC stochastic system, we define the following indicator.

**Definition 2.** *The weighted variance of the estimation error* $\widetilde{X}_{ij} = X_{ij} - \hat{X}_{ij}$ *is:*

$$V(\widetilde{X}_{ij}) = \frac{1}{t_f - t_0} \int_{t_0}^{t_f} (\widetilde{X}_{ij} - E\{\widetilde{X}_{ij}\})^T W_v (\widetilde{X}_{ij} - E\{\widetilde{X}_{ij}\})\, dt \tag{63}$$

where the diagonal matrix $W_v \in \mathbb{R}^{8 \times 8}$ is the weighting matrix of the weighted variance, and its diagonal elements correspond to the states $\hat{X}_{ij}$ of the estimator. The larger the weight is, the more accurate the estimation of the corresponding state becomes. The trace of $W_v$ is $tr(W_v) = 1$. Assuming that the estimation accuracies of $\Delta\hat{x}_{ij}$, $\Delta\hat{z}_{ij}$, $\Delta\hat{V}_i$ and $\Delta\hat{\phi}_i$ are required to be higher, the weighting matrix we designed is $W_v = diag(0.2, 0.2, 0.2, 0.2, 0.1, 0.05, 0, 0.05)$.
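For sampled simulation data, Equation (63) can be approximated numerically. The sketch below makes two assumptions that are not stated in the text: the error vectors are sampled at uniformly spaced times, so the time integral becomes a plain average, and the expectation E{·} is replaced by the sample mean over the run. The function name `weighted_variance` is hypothetical.

```python
def weighted_variance(errors, weights):
    """Discrete approximation of Equation (63) for a diagonal weighting matrix.

    errors  : list of error vectors sampled at uniformly spaced times
    weights : diagonal entries of the weighting matrix
    """
    n_samples = len(errors)
    dim = len(weights)
    # Assumption: the expectation E{.} is replaced by the sample mean over time.
    means = [sum(e[k] for e in errors) / n_samples for k in range(dim)]
    total = 0.0
    for e in errors:
        # For a diagonal matrix the quadratic form reduces to a weighted sum.
        total += sum(weights[k] * (e[k] - means[k]) ** 2 for k in range(dim))
    # Averaging uniformly spaced samples approximates (1/(tf - t0)) * integral.
    return total / n_samples


# Diagonal of the weighting matrix given in the text; its entries sum to 1.
W_V = [0.2, 0.2, 0.2, 0.2, 0.1, 0.05, 0.0, 0.05]

# Illustrative data only: the first error component alternates between +1 and -1.
samples = [[1.0] + [0.0] * 7, [-1.0] + [0.0] * 7]
print(weighted_variance(samples, W_V))
```

Note that the seventh state has zero weight in this design, so its estimation error does not contribute to the indicator at all.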

The simulation results corresponding to the optimal cost value and the optimal parameters are shown in Table 1 and Figure 6. Note that the fluctuation in the table refers to the standard deviation of the difference between the real-time configuration and the expected configuration.

It can be observed from Table 1 that all the indicators are in the appropriate range. The real part of the maximum eigenvalue is negative, and the weighted variance of the estimation error calculated from the simulation results is 3.86. From Figure 6 we can observe that the two aircraft achieved the desired configuration after 54.7 s, with a forward-distance steady-state error of 0.0332 m and a lateral-distance steady-state error of 0.0944 m. The formation speed and the formation declination meet the design requirements well. Thus, the system is mean-square uniformly bounded according to Proposition 1.

**Figure 6.** Simulation results of node $\nu_i$. (**a**) forward distance; (**b**) lateral distance; (**c**) speed; (**d**) flight path declination.

#### *4.2. Autonomous Flight Experiments*

In order to observe the performance of the model and verify the effectiveness of the USCC stochastic system, we conducted an equivalent outfield autonomous formation flight test using multiple UAVs. As shown in Figure 7, we carried out experiments with seven nonholonomic UAVs. The UAV swarm can cooperatively search a given area with different configurations.

**Figure 7.** Seven UAVs.

The loads in each cabin of the UAV are shown in Figure 8, including the power module, formation cooperative guidance module, autopilot module, detection module, formation communication module and flight data transmission module. In the experiment, we adopted the proposed USCC stochastic model into the formation cooperative guidance module to instruct the flight members to reach the desired position and maintain a steady configuration. The framework of the whole system is shown in Figure 9.

**Figure 8.** The loads in the cabin of the UAV.

**Figure 9.** The framework of the USCC outfield flight system.

The formation ground station is set to observe the real-time formation flight process, and upload commands to instruct the formation to change its configuration or adjust relative distances. It can also control the pod to capture the configuration. The flight member's digital monitor station is set to observe the real-time flight member's flight statuses and to ensure the safety of the whole flight process.

Two configurations of five drones we designed in the experiments are shown in Figure 10a,b. The lateral and forward distance between neighbor UAVs was set to 50 m. Moreover, the configurations in Figure 10c,d were captured by the UAV that was flying higher with a pod.

It can be observed from Figure 10 that the UAVs achieved the desired configuration smoothly and maintained the formation effectively under the proposed model.

For the convenience of analysis, the actual flight data were saved through the formation monitoring station. In this paper, we use the flight data of the wedge configuration, recorded with and without the USCC stochastic system model in the experiment, to reconstruct the flight process and evaluate the effectiveness of the USCC stochastic system. The results are shown in Figure 11.

The curves in Figure 11 demonstrate that the five UAVs achieved the desired configuration and steadily maintained the formation. The flight paths in the rectangles that the arrows "1" and "2" point to in Figure 11a are magnified in Figure 11c,d. The data are summarized in Table 2, from which we can observe that in the flight test without the model, the relative height of the UAVs had to be maintained at no less than 50 m to avoid the risk of collision, while the heights of the five UAVs with the USCC stochastic system model converged to 200 m and the relative distance in height was zero. The average 3D distance between flight members in the flight test with the proposed model was shortened by 32.14% compared to the test without the model.

**Figure 10.** Configurations of five drones. (**a**) Designed lateral configuration; (**b**) Designed wedge configuration; (**c**) Real time lateral configuration; (**d**) Real time wedge configuration.

**Figure 11.** Experimental results of the wedge configuration. (**a**). The flight path of the five UAVs; (**b**). The height of five UAVs. The flight paths in full line belong to the UAV0 to UAV4 with USCC stochastic system model while the dotted line belong to the UAV0' to UAV4' are without the model; (**c**). Northward distances between the five UAVs which are indicated by "2" in (**a**). The flight paths in full line of UAV0 to UAV4 utilize the USCC stochastic system model while the dotted line of UAV0' to UAV4' do not use the model; (**d**). Eastward distances between the five UAVs which are indicated by "1" in (**a**). The flight paths in full line of UAV0 to UAV4 use USCC stochastic system model while the dotted line of UAV0' to UAV4' do not use the model.


**Table 2.** The data of configurations with and without USCC stochastic system model.

The results show that the introduction of multiplicative noises improves the formation maintenance performance and extends the boundary properties of the formation flight: the safety distance is shorter and the relative height can be eliminated. Therefore, the proposed stochastic model could provide the UAV swarm with a larger maneuvering space, and thereby improve the efficiency and quality of mission execution, enhance the operational capability in high-risk environments, and improve the adaptation of the system to complex environments.

#### **5. Conclusions**

In this paper, the problem of state estimation and control of the UAV swarm system with consideration of multiplicative noises is studied. The closed-loop Itô stochastic system we constructed combines a state equation, derived from the group kinematic model and the individual dynamic model with multiplicative noises, an observation equation that considers the measurement noises, an estimator, and a controller. Following that, a proof of the mean-square uniform boundedness of the system is presented. The optimal estimator and controller are obtained with the use of SRAD in the simulation. Finally, simulation results show that the system is stable and the selected indicators meet the requirements. The outfield experiment results demonstrate that the configuration with the proposed model could be condensed by 32.14% compared to the test with the traditional model. Therefore, the stochastic system of USCC with multiplicative noises proposed in this paper could contribute to effectively exploiting the boundary performances of the system and constructing a high dynamic formation in practical application, thus better matching the actual environment.

However, there is much to be researched further in this area. For example, in the practical application, the time delay cannot be ignored, especially for large scale formations. The modeling of the USCC stochastic system considering time delay is currently under investigation.

**Author Contributions:** H.Z., S.W. and Y.W. proposed the research ideas and conducted modeling design. H.Z. developed the software and wrote the manuscript, S.W. reviewed the model and instructed the simulation and experiments, Y.W. improved the software and implemented simulation, W.L. carried out flight experiments, and X.W. reviewed the modeling process and polished the paper.

**Funding:** This work was funded by the Industrial Technology Development Program under Grant B1120131046.

**Acknowledgments:** The authors appreciate editors and reviewers for their suggestions which are of great value to the paper.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Article*

## **Data Offloading in UAV-Assisted Multi-Access Edge Computing Systems: A Resource-Based Pricing and User Risk-Awareness Approach**

#### **Giorgos Mitsis <sup>1</sup>, Eirini Eleni Tsiropoulou <sup>2,</sup>\* and Symeon Papavassiliou <sup>1</sup>**


Received: 21 March 2020; Accepted: 15 April 2020; Published: 24 April 2020

**Abstract:** Unmanned Aerial Vehicle (UAV)-assisted Multi-access Edge Computing (MEC) systems have emerged recently as a flexible and dynamic computing environment, providing task offloading service to the users. In order for such a paradigm to be viable, the operator of a UAV-mounted MEC server should enjoy some form of profit by offering its computing capabilities to the end users. To deal with this issue in this paper, we apply a usage-based pricing policy for allowing the exploitation of the servers' computing resources. The proposed pricing mechanism implicitly introduces a more social behavior to the users with respect to competing for the UAV-mounted MEC servers' computation resources. In order to properly model the users' risk-aware behavior within the overall data offloading decision-making process the principles of Prospect Theory are adopted, while the exploitation of the available computation resources is considered based on the theory of the Tragedy of the Commons. Initially, the user's prospect-theoretic utility function is formulated by quantifying the user's risk seeking and loss aversion behavior, while taking into account the pricing mechanism. Accordingly, the users' pricing and risk-aware data offloading problem is formulated as a distributed maximization problem of each user's expected prospect-theoretic utility function and addressed as a non-cooperative game among the users. The existence of a Pure Nash Equilibrium (PNE) for the formulated non-cooperative game is shown based on the theory of submodular games. An iterative and distributed algorithm is introduced which converges to the PNE, following the learning rule of the best response dynamics. The performance evaluation of the proposed approach is achieved via modeling and simulation, and detailed numerical results are presented highlighting its key operation features and benefits.

**Keywords:** data offloading; UAV-enabled computing; resource-based pricing; risk-awareness; multi-access edge computing systems

#### **1. Introduction**

Ubiquitous, distributed, and intelligent computing is one of the key enabling technologies for realizing the emerging applications supported by fifth generation (5G) wireless networks and the Internet of Things (IoT), which demand ultra-reliable and low latency communication (URLLC). IoT is foreseen to reach 500 billion devices connected to the Internet by 2030 [1], while the global mobile traffic is expected to increase seven-fold by 2021 [2]. Thus, it is evident that traditional cloud computing architectures cannot support the latency constraints of the next generation networking environments, such as Tactile Internet [3]. The reason is that the powerful cloud centers are often deployed far away from the end users; thus, huge amounts of traffic are usually transmitted through intermediate nodes, resulting in heavy load, congestion, delay uncertainties, high energy consumption, and multiple security threats [4]. Thus, multi-access edge computing (MEC), which brings computing resources from the core network to the edge network, becomes a natural and promising solution to support these applications.

Combined with the MEC concept, Unmanned Aerial Vehicles (UAVs), equipped with communication and computing facilities, become a core component of next generation networks due to their salient attributes, such as hovering ability, flexibility and effortless deployment, maneuverability, mobility, low cost, strong line-of-sight (LoS) connection links, adjustable usage, and adaptive altitude [5]. The MEC servers are embedded in the UAVs that fly in closer proximity to the users compared to the conventional MEC servers typically residing at the Macro Base Stations (MBSs) of the macrocells or at the Access Points (APs) of the small cells [6]. Thus, the UAV-mounted MEC servers more efficiently support the end users applications' data offloading and processing at the flying edge servers, by creating a flexible and dynamic computing environment paradigm [7].

#### *1.1. Related Work*

Cheng et al. [8] discuss the benefits introduced by the UAV-mounted MEC servers with respect to caching and computing, in a hybrid architecture consisting of UAV-mounted and ground MEC servers. Luo et al. [9] introduce a cloud-based UAV-assisted system and study its stability with respect to the offloading rate of the sensors' big data. Valentino et al. [10] consider a fleet of UAV-mounted MEC servers, and formulate and solve the optimization problem of increasing the UAV fleet's lifetime while decreasing the overall computation time of the users' offloaded tasks. In particular, the authors exploit neighboring UAV clusters with sufficient computing resources to offload the users' computation tasks.

Xiong et al. [11] formulate a joint optimization problem to optimize the users' data offloading to the UAV-mounted MEC servers, the UAVs' trajectory, and the data allocation during transmission to the different UAVs. An end-to-end solution is introduced by Jeong et al. [12], where the authors jointly optimize the users' data offloading to the UAV-mounted MEC servers (i.e., uplink) and the output processed data returned to the users (i.e., downlink), while considering the computation tasks' latency constraints. Jeong et al. [13] focus on the UAV-mounted MEC servers' energy constraints to jointly optimize the users' data offloading, by considering orthogonal and non-orthogonal multiple access communication techniques, and the UAVs' trajectory. Furthermore, Zhou et al. [14,15] consider a wireless powered communication environment, where the UAVs, apart from acting as UAV-mounted MEC servers providing computing services to the end-users, also provide energy to them. Accordingly, the users can exploit the harvested energy to perform local computing and/or transmit their data to the UAV-mounted MEC servers.

#### *1.2. Motivation and Contributions*

All the aforementioned research works, though having demonstrated significant benefits and potential, have made two fundamental assumptions regarding the examined UAV-assisted multi-access edge computing system: (i) the UAV-mounted MEC servers offer their communication and computing services to the users for free; and (ii) the users act as neutral utility maximizers aiming at simply maximizing their perceived satisfaction from offloading and processing their data to the UAV-mounted MEC servers, thus exhibiting a risk-neutral behavior. However, in a realistic communication and computing environment, both assumptions may not always hold true.

The operator of the UAV-mounted MEC server should enjoy some form of profit by offering its computing services and capabilities to the end-users. Depending on the operation mode, business model and use case under consideration, the profit, whether explicit or implicit, could be expressed in different forms (e.g., monetary cost) [16,17]. In our work, we consider that the profit for the UAV-mounted MEC server originates directly from applying a usage-based charging policy for allowing the exploitation of the server's computing resources.

Moreover, it has been argued recently that in real life the end-users are characterized by loss averse and risk seeking behavior in terms of exploiting the system's available resources, especially in resource-constrained environments (e.g., [18–20]). In our case, the resources that the users may opt for and compete for refer to the UAV-mounted MEC server's computing resources. Specifically, based on the users' behavioral characteristics, some users may act aggressively and opportunistically in terms of offloading their data to the UAV-mounted MEC servers in order to avoid consuming their personal resources to process their data. Those users exhibit risk seeking behavior, as the UAV-mounted MEC server may not be able to serve all the users' data offloading and processing requests. On the other hand, there are more conservative users, who exhibit loss averse behavior, thus being more willing to process their data locally at their devices, instead of taking the risk of finally not being served by the UAV-mounted MEC server due to the potential overexploitation of the latter.

Towards jointly addressing the aforementioned assumptions and filling the respective research gaps, in this paper, we exploit the power and principles of Prospect Theory [18] to capture the users' risk-based behavior in their data offloading decision-making process, under the operation framework of a usage-based pricing mechanism of the UAV-mounted MEC servers' computing resources. To the best of our knowledge, this is the first research work in the existing literature that combines a pricing-aware and risk-aware framework to deal with the data offloading problem in UAV-mounted multi-access edge computing systems.

In particular, we assume that the users have two available options for executing their tasks, namely the local computation and the remote computation, the latter achieved through data offloading. The local computation resources of the user's device act as safe resources, since the users do not compete with each other for consuming those resources. On the other hand, the computation resources of the UAV-mounted MEC server are treated as a Common Pool of Resources (CPR), as they are non-excludable, i.e., all the users have the right to exploit them, while they are rivalrous and subtractable, i.e., their exploitation by one user reduces the ability to be exploited by another user. In principle, the UAV-mounted MEC server resources have the potential to provide significantly higher satisfaction to the user (compared to the lower satisfaction that could be obtained through the limited user local computation resources), if properly utilized and allocated. However, if the users selfishly offload their data to the UAV-mounted MEC server, then the computing capabilities of the latter will be overexploited resulting in suboptimal outcomes for the entire set of users, possibly leading to the complete "failure" of the CPR UAV-mounted MEC server. The failure of the CPR UAV-mounted MEC server refers to its inability to concurrently handle the large amount of offloaded data and corresponding computation tasks by the users, due to its limited computation capability.

To treat this situation and differentiate the performance and usage of the available computation resources, we capitalize on the theory of Tragedy of the Commons [21,22], while also introducing a usage-based pricing mechanism to capture the users' cost for exploiting the server's computation resources. The proposed pricing mechanism implicitly introduces a more social behavior to the users and supports the fairness among them, in terms of competing for the UAV-mounted MEC server's computation resources. The users' behavioral characteristics in the data offloading decision-making process are captured through the principles of Prospect Theory, which models users' decisions under the uncertainty of the available computation resources at the UAV-mounted MEC server. Prospect Theory is a well-known behavioral economic theory studying the autonomous decision-making of individuals under risk and uncertainty of the associated payoff of their choices, which is estimated with some probability [23]. The performance evaluation and validation of the proposed pricing and risk-aware data offloading framework in UAV-assisted MEC systems is achieved via modeling and simulation, in terms of efficient exploitation of all the available computation resources, realistic capturing of users' interaction with the computing environment, and its scalability.

#### *1.3. Outline*

The rest of this paper is organized as follows. Section 2 presents the considered system model, while Section 3 initially discusses the main principles of Prospect Theory and the theory of the Tragedy of the Commons, and then accordingly designs and formulates the users' prospect-theoretic utility function. Section 4 introduces the pricing and risk-aware data offloading framework in a UAV-assisted multi-access edge computing system by formulating and solving the corresponding optimization data offloading problem, via adopting the theory of S-modular games. In Section 5, a low complexity and iterative algorithm is introduced to determine the Pure Nash Equilibrium (PNE) of the users' data offloading optimization problem. Section 6 contains the performance evaluation of the proposed framework, while Section 7 concludes the paper.

#### **2. System Model**

A UAV-assisted multi-access edge computing system is considered, consisting of a set of mobile users $\mathcal{N} = \{1, \ldots, n, \ldots, N\}$ and a UAV-mounted MEC server. Each user *n* has a computation task $J_n$ that needs to be executed. Each task is defined as $J_n = (b_n, d_n)$, where $b_n$ [bits] is the size of user *n*'s input data needed for the computation task and $d_n$ [CPU-cycles] is the number of CPU cycles required to accomplish the computation task. The UAV-mounted MEC server is available to the users to offload and process their data remotely, instead of processing them locally on their device and consuming their own local resources. Each user decides to offload $b_n^{MEC}$ [bits] of data to the UAV-mounted MEC server, while the remaining $(b_n - b_n^{MEC})$ [bits] of data are processed locally on the user's device. An indicative topology of the considered UAV-assisted MEC system is presented in Figure 1. In this work we mainly focus on the modeling and provisioning of the computing resources, rather than on the user-to-UAV wireless communication aspects. The UAV flexibility and adaptability capabilities can ensure strong communication channels and links with the users.

**Figure 1.** UAV-assisted multi-access edge computing system.

For each user *n*, the time $\hat{t}_n$ [s] to process the whole amount of data $b_n$ locally is defined as:

$$\hat{t}_n = \frac{d_n}{f_n} \tag{1}$$

where $f_n$ [CPU-cycles/s] is the computation capability of user *n*'s device. Apart from the processing time needed, each computation task has some energy requirements as well. The energy $\hat{e}_n$ [J] needed to process the whole amount of data $b_n$ locally for each user *n* is defined as:

$$\hat{e}_n = \gamma_n d_n \tag{2}$$

where $\gamma_n$ [J/CPU-cycle] is the coefficient denoting the energy consumed per CPU cycle locally at user *n*'s device.
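As a quick illustration of Equations (1) and (2), the following sketch computes the local processing time and energy for one user. The function names and the numerical values are hypothetical and chosen only to make the units easy to follow.

```python
def local_processing_time(d_n, f_n):
    """Equation (1): time [s] to run d_n CPU cycles on a device that
    executes f_n CPU cycles per second."""
    return d_n / f_n


def local_energy(gamma_n, d_n):
    """Equation (2): energy [J] to run d_n CPU cycles at gamma_n joules
    per CPU cycle."""
    return gamma_n * d_n


# Hypothetical task and device (illustrative values only).
d_n = 2e9        # CPU cycles required by the task
f_n = 1e9        # device capability in CPU cycles per second
gamma_n = 1e-9   # energy per CPU cycle in joules

t_hat = local_processing_time(d_n, f_n)
e_hat = local_energy(gamma_n, d_n)
print(t_hat, e_hat)
```

Both quantities depend only on the number of CPU cycles of the task and on the device, not on the input size in bits.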

We assume that the UAV-mounted MEC server applies a fair usage-based pricing policy to the users, charging them proportionally to their offloaded data and to their demand for computation resources, as indicated by the nature of their computation task. Thus, the cost imposed by the UAV-mounted MEC server on user *n* in order to process the user's offloaded data $b_n^{MEC}$ is defined as:

$$c\_n(b\_n^{MEC}) = c d\_n \frac{b\_n^{MEC}}{b\_n} \tag{3}$$

where *c* [1/CPU-cycles] represents a constant pricing factor imposed by the UAV-mounted MEC server on every user. Intuitively, the cost imposed on each user (Equation (3)) is proportional to the portion of the $d_n$ CPU cycles of the user's computation task that is actually offloaded, i.e., the greater the part of the computation task offloaded to the UAV-mounted MEC server, the greater the cost that the user incurs for having its data processed remotely. It is noted that, without loss of generality, the cost $c_n(b_n^{MEC})$ imposed by the UAV-mounted MEC server on user *n* in order to process the offloaded data of the latter is assumed to be a unitless metric in this research work, and can represent any type of usage-based cost or monetary cost in a realistic implementation. Based on the above model, we can therefore formulate the problem of determining the optimal amount of data $b_n^{MEC*}$ that each user should offload, considering each user's risk-aware behavioral characteristics and the pricing imposed by the UAV-mounted MEC server.
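The usage-based cost of Equation (3) is straightforward to evaluate. The sketch below uses a hypothetical function name and illustrative numbers; the range check on the offloaded amount is an added safeguard, not part of the model as stated.

```python
def offloading_cost(c, d_n, b_mec, b_n):
    """Equation (3): cost charged for offloading b_mec out of b_n bits of a
    task that needs d_n CPU cycles, with pricing factor c."""
    if not 0 <= b_mec <= b_n:
        raise ValueError("offloaded data must lie between 0 and b_n")
    return c * d_n * b_mec / b_n


# Hypothetical values: offloading a quarter of the data costs a quarter
# of the full-task price c * d_n.
print(offloading_cost(c=0.5, d_n=100.0, b_mec=25.0, b_n=100.0))
```

Offloading nothing costs nothing, and offloading the whole task costs the pricing factor times the total number of CPU cycles of the task.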

#### **3. Users Prospect-Theoretic Utility Function in UAV-Assisted MEC Environment**

In the dynamic computation environment considered in this research work, consisting of the UAV-mounted MEC server's and the users' local computing capabilities, the users exhibit a risk-aware behavior in terms of deciding where to process the data of their computation tasks. Therefore, the users do not act as risk-neutral utility maximizers following the conventional Expected Utility Theory (EUT) [23], but instead they rather exhibit a loss averse or gain seeking behavior when utilizing the UAV-mounted MEC server's computation resources. To capture the exploitation and usage characteristics and principles of the available computation resources in the considered UAV-assisted MEC system, we adopt the theory of the Tragedy of the Commons [22]. Specifically, the UAV-mounted MEC server's computation resources are considered as a Common Pool of Resources (CPR), as all the users have access to them and can offload their data to the UAV-mounted MEC server in order to be processed. If the users overexploit the computation resources of the UAV-mounted MEC server, the latter will fail to serve their computation demands and none of the users will be satisfied. On the other hand, the user's device's local computation resources are considered as safe resources, as each user exclusively exploits them for its own benefit. It is noted that the safe resources provide a guaranteed satisfaction to the user; however, the user can potentially experience lower satisfaction compared to exploiting the CPR, as the user has to spend its own resources, e.g., energy to process locally its data.

As mentioned before, towards capturing the users' loss averse and gain seeking behavior in terms of exploiting the CPR and safe computation resources, the principles of Prospect Theory are adopted [24]. Prospect Theory is a behavioral economic theory that quantifies individuals' behavioral patterns, which demonstrate systematic deviations from the Expected Utility Theory. Under the prospect theoretic model, the users experience greater dissatisfaction from a potential outcome of losses compared to their satisfaction from gains of the same amount. In addition, the level of the users' satisfaction and dissatisfaction is evaluated with respect to a reference point, which is considered as the ground truth of the examined system. Recently, several efforts have appeared in the literature, where Prospect Theory has been adopted in various environments and application domains, including dynamic resource management in 5G wireless networks [18,25], public safety networks [19], anti-jamming communications in cognitive radio networks [20], users' transmission power management and anti-jamming techniques in UAV-assisted networks [5], and Quality of Experience in cyber-physical social systems [21].

Following the principles of Prospect Theory, the user's prospect theoretic utility is defined as [24,26]:

$$P_n(U_n) = \begin{cases} (U_n - U_{n,0})^{\alpha_n}, & \text{if } U_n \ge U_{n,0} \\ -k_n (U_{n,0} - U_n)^{\beta_n}, & \text{otherwise} \end{cases} \tag{4}$$

where $U_{n,0} = \frac{1}{\hat{t}_n \hat{e}_n} b_n$ denotes the reference point expressing user *n*'s perceived satisfaction from processing all of its data locally at its device, which is the safe choice in terms of receiving a guaranteed satisfaction. Similarly, $U_n$ denotes the user's actual perceived satisfaction from offloading part of its data to the UAV-mounted MEC server, and is given by Equation (5) below.

The parameters $\alpha_n, \beta_n \in (0, 1]$ express the sensitivity of users to the gains and losses of their actual perceived satisfaction $U_n$, respectively. In particular, the user's risk averse behavior in gains and risk seeking behavior in losses is captured by small values of the parameter $\alpha_n \in (0, 1]$. Similarly, a small value of the parameter $\beta_n \in (0, 1]$ captures a higher decrease in the user's prospect theoretic utility, when its actual perceived satisfaction is close to the reference point. It is noted that the values of the parameters $\alpha_n$, $\beta_n$ can be determined and quantified based on statistical analysis of existing open datasets stemming from qualitative results of users' behavioral models (e.g., [27]). Furthermore, the loss aversion parameter $k_n \in \mathbb{R}^+$ quantifies the impact of losses compared to the gains in the user's prospect theoretic utility. Specifically, for $k_n > 1$, the user weighs the losses more than the gains, while the exact opposite holds true for $0 \le k_n \le 1$. For simplicity and without loss of generality, in this work, we assume $\alpha_n = \beta_n$.
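The shape of the prospect-theoretic utility in Equation (4) can be explored with a short sketch. The function names and the parameter values below are hypothetical; they are chosen so that the asymmetry between gains and losses is easy to see.

```python
def reference_point(b_n, t_hat, e_hat):
    """Reference point U_{n,0}: satisfaction from processing all b_n bits
    locally, given the local processing time t_hat and energy e_hat."""
    return b_n / (t_hat * e_hat)


def prospect_utility(u, u_ref, alpha, beta, k):
    """Equation (4): concave power function for gains above the reference
    point, and a loss branch scaled by the loss aversion parameter k."""
    if u >= u_ref:
        return (u - u_ref) ** alpha
    return -k * (u_ref - u) ** beta


# Illustrative values: a gain and a loss of the same size (4 units)
# around a reference point of 10, with alpha = beta = 0.5 and k = 2.
gain = prospect_utility(14.0, 10.0, alpha=0.5, beta=0.5, k=2.0)
loss = prospect_utility(6.0, 10.0, alpha=0.5, beta=0.5, k=2.0)
print(gain, loss)
```

With a loss aversion parameter above 1, the loss of 4 units is felt twice as strongly as the gain of 4 units, which is the asymmetry described above.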

Specifically, the user's actual perceived satisfaction from offloading part of its data (denoted by $b_n^{MEC}$) to the UAV-mounted MEC server is denoted as $U_n(b_n^{MEC})$ and is formally defined as follows:

$$U_n(b_n^{MEC}) = \begin{cases} \frac{1}{\hat{t}_n \hat{e}_n} b_n, & \text{if } b_n^{MEC} = 0 \\ \frac{1}{\hat{t}_n \hat{e}_n}(b_n - b_n^{MEC}) + b_n^{MEC} RoR(d_\tau) - c_n(b_n^{MEC}), & \text{if } b_n^{MEC} \neq 0 \text{ and MEC survives} \\ \frac{1}{\hat{t}_n \hat{e}_n}(b_n - b_n^{MEC}) - c_n(b_n^{MEC}), & \text{if } b_n^{MEC} \neq 0 \text{ and MEC fails} \end{cases} \tag{5}$$

The first branch of Equation (5) expresses the user's actual perceived satisfaction from processing all of its data locally on its mobile device. The second branch captures the user's actual perceived satisfaction from processing part of its data locally (first term) and part of it at the UAV-mounted MEC server (second term), while experiencing the corresponding usage-based cost (third term) for exploiting the server's computation resources, in the case that the MEC server can process all the users' requests. The third branch represents the user's utility in the case that the MEC server fails to process the users' data due to its overexploitation. The user's actual perceived satisfaction from processing part of its data at the UAV-mounted MEC server depends on the server's rate of return function $RoR(d_\tau)$, where $d_\tau(\mathbf{b}^{MEC})$ is a normalized increasing function of the users' total demand of computation resources from the UAV-mounted MEC server. The vector $\mathbf{b}^{MEC} = (b_1^{MEC}, \dots, b_N^{MEC})$ denotes the data offloading strategies of all the users in the examined system. For demonstration purposes and without loss of generality, the users' total demand function $d_\tau(\mathbf{b}^{MEC}) \in [0, 1]$ of computation resources from the UAV-mounted MEC server is defined as follows:

$$d_\tau(\mathbf{b}^{MEC}) = -1 + \frac{2}{1 + e^{-\theta \sum_{n=1}^{N} d_n \frac{b_n^{MEC}}{b_n}}} \tag{6}$$

where *θ* > 0 is a positive constant calibrating the sigmoidal curve of Equation (6) based on the computing capabilities of the UAV-mounted MEC server. The users' total computation demand function *dτ*(**bMEC**) is a continuous and strictly increasing function with respect to the users' total amount of offloaded data. Equation (6) is a representative example of the users' total computation demand function, while any other function that follows the above described properties can be adopted for the following analysis without loss of generality. In a nutshell, the UAV-mounted MEC server's rate of return function *RoR*(*dτ*) provides positive experience, i.e., *RoR*(*dτ*) > 0, if the server has sufficient computation resources to serve the users' total computation demand *dτ*(**bMEC**). The UAV-mounted MEC server's rate of return function *RoR*(*dτ*) is a continuous, monotonically decreasing, and concave function with respect to the users' total demand of computation resources, since the server's computation resources assigned to each user and correspondingly the users' perceived actual satisfaction decrease for increasing values of the users' total computation demand [28]. For demonstration purposes, in this paper, we adopt an indicative rate of return function that respects all aforementioned properties and is defined as follows:

$$RoR(d_\tau) = 2 - e^{d_\tau - 1} \tag{7}$$
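Equations (6) and (7) can be evaluated directly, which makes the coupling between the users' offloading decisions and the server's rate of return easy to inspect. The sketch below is illustrative only: the function names `total_demand` and `rate_of_return`, the three-user setup, and all numeric values are assumptions, not taken from the paper's simulation.

```python
import math


def total_demand(offloaded, data, demand, theta):
    """Equation (6): normalized total demand d_tau, between 0 and 1.

    offloaded -- list of b_n^MEC (data each user offloads)
    data      -- list of b_n (total data of each user)
    demand    -- list of d_n
    theta     -- positive constant calibrating the sigmoid
    """
    load = sum(d * x / b for x, b, d in zip(offloaded, data, demand))
    return -1.0 + 2.0 / (1.0 + math.exp(-theta * load))


def rate_of_return(d_tau):
    """Equation (7): decreasing, concave rate of return of the MEC server."""
    return 2.0 - math.exp(d_tau - 1.0)


# Hypothetical three-user example: the larger the offloaded share, the
# larger d_tau becomes and the smaller the rate of return.
data = [10.0, 10.0, 10.0]
demand = [1.0, 1.0, 1.0]
for share in (0.0, 0.5, 1.0):
    offloaded = [share * b for b in data]
    d_tau = total_demand(offloaded, data, demand, theta=1.0)
    print(share, round(d_tau, 3), round(rate_of_return(d_tau), 3))
```

Since $d_\tau$ stays below one for any finite load, the rate of return in this sketch remains above $RoR(1) = 1$.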

Following the above discussion and focusing on the user's prospect theoretic utility function, as defined in Equation (4), it is noted that the first branch of Equation (4) expresses user $n$'s risk-aware satisfaction in the case that the UAV-mounted MEC server survives and can support the users' total computation demand. In that case, each user aims at maximizing its gains, while, in the opposite case, i.e., the second branch of Equation (4), the user aims at minimizing its losses, as the UAV-mounted MEC server has failed due to overexploitation.

If the UAV-mounted MEC-server survives, then the user's actual utility is determined by the second branch of Equation (5), given that the user offloaded part of its data to the MEC server. Thus, in combination with the first branch of Equation (4), the user's prospect theoretic utility is given as follows:

$$\begin{split} P_n^{surv.}(U_n) &= (U_n - U_{n,0})^{\alpha_n} \\ &= (b_n^{MEC})^{\alpha_n} \left[ (2 - e^{d_\tau - 1}) - \frac{1}{\hat{t}_n \hat{e}_n} - c \frac{d_n}{b_n} \right]^{\alpha_n} \end{split} \tag{8}$$

If the opposite holds true, that is, the UAV-mounted MEC server's computation resources are overexploited by the users and the server fails to serve them, then by combining the second branch of Equation (4) and the third branch of Equation (5), the user's prospect theoretic utility can be written as follows:

$$\begin{split} P_n^{fail}(U_n) &= -k_n (U_{n,0} - U_n)^{\alpha_n} \\ &= -k_n (b_n^{MEC})^{\alpha_n} \left( \frac{1}{\hat{t}_n \hat{e}_n} + c \frac{d_n}{b_n} \right)^{\alpha_n} \end{split} \tag{9}$$

Furthermore, the probability of failure of the UAV-mounted MEC server, i.e., the probability that the server fails to serve the users' total computation demand $d_\tau$ (Equation (6)), is denoted by $Pr(d_\tau)$. The failure probability function $Pr(d_\tau)$, $0 \le Pr(d_\tau) \le 1$, is assumed to be a continuous, strictly increasing, convex, and twice differentiable function of the users' total computation demand $d_\tau$. In the following, we adopt the square function to represent the UAV-mounted MEC server's probability of failure, as shown below:

$$Pr(d_\tau) = d_\tau^2 \tag{10}$$

It is noted that the rest of the paper's analysis still holds true for any probability of failure function that is characterized by the properties described above; the square function is selected mainly for presentation purposes. Accordingly, the UAV-mounted MEC server's probability to survive and process the users' total amount of offloaded data is $(1 - Pr(d_\tau))$. Moreover, due to the sigmoidal nature of the users' total computation demand (Equation (6)), the UAV-mounted MEC server's probability of failure (Equation (10)), viewed as a function of the offloaded load, is convex for low to medium computation demand and concave for high demand, while it asymptotically converges to one, as shown in Figure 2.
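The convex-then-concave shape described above can be checked numerically. The following is a rough sketch under the assumption that the aggregate load $x = \theta \sum_n d_n b_n^{MEC} / b_n$ is treated as a single scalar; the helper names and the step size are hypothetical. For this particular composition of Equations (6) and (10), the curvature changes sign where $d_\tau^2 = 1/3$, i.e., near $x \approx 1.32$.

```python
import math


def demand(x):
    """Equation (6) as a function of the aggregate load x (an assumption)."""
    return -1.0 + 2.0 / (1.0 + math.exp(-x))


def failure_probability(x):
    """Equation (10) composed with Equation (6)."""
    return demand(x) ** 2


def second_difference(f, x, h=1e-3):
    """Central finite-difference estimate of the second derivative of f at x."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


# Positive curvature means locally convex, negative means locally concave.
for x in (0.5, 1.0, 2.0, 4.0):
    curvature = second_difference(failure_probability, x)
    shape = "convex" if curvature > 0 else "concave"
    print(x, round(failure_probability(x), 4), shape)
```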

Combining Equations (8)–(10), the user's expected prospect theoretic utility from offloading $b_n^{MEC}$ data to the UAV-mounted MEC server is defined as follows, jointly capturing the uncertainty of the UAV-mounted MEC server's computation resources, the pricing of the UAV-mounted MEC server, as well as the user's risk-aware characteristics in its data offloading decision:

$$\mathbb{E}(\mathcal{U}\_n) = P\_n^{surv.} (\mathcal{U}\_n) (1 - Pr(d\_\tau)) + P\_n^{fail}(\mathcal{U}\_n) Pr(d\_\tau). \tag{11}$$
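To make the interplay of Equations (8)–(11) concrete, the expected utility of a single user can be sketched as below. This is only a sketch under stated assumptions: the total demand `d_tau` is passed in as a given value, `local_gain` is a stand-in name for $1/(\hat{t}_n \hat{e}_n)$, and the parameter values are hypothetical.

```python
import math


def expected_utility(b_mec, d_tau, b, d, c, k, alpha, local_gain):
    """Expected prospect-theoretic utility of Equation (11).

    local_gain stands for 1 / (t_hat_n * e_hat_n), the satisfaction per unit
    of data processed locally (treated as an input here, not computed).
    """
    price = c * d / b                                   # c * d_n / b_n
    margin = (2.0 - math.exp(d_tau - 1.0)) - local_gain - price
    if margin < 0.0:
        raise ValueError("negative offloading margin: Equation (8) is undefined")
    p_fail = d_tau ** 2                                 # Equation (10)
    p_surv = b_mec ** alpha * margin ** alpha           # Equation (8)
    p_loss = -k * b_mec ** alpha * (local_gain + price) ** alpha  # Equation (9)
    return p_surv * (1.0 - p_fail) + p_loss * p_fail    # Equation (11)


# Hypothetical user: the same offloaded amount becomes less attractive as the
# server gets more congested (larger d_tau).
for d_tau in (0.0, 0.5, 1.0):
    value = expected_utility(4.0, d_tau, b=10.0, d=5.0, c=0.2,
                             k=2.0, alpha=0.5, local_gain=0.3)
    print(d_tau, round(value, 3))
```

At $d_\tau = 0$ only the survival branch contributes, whereas at $d_\tau = 1$ the failure branch fully determines the value.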

**Figure 2.** Probability of failure vs. $x$ when $Pr(x) = \left(-1 + \frac{2}{1 + e^{-x}}\right)^2$.

#### **4. Pricing and Risk-Aware Data Offloading in UAV-Assisted MEC Systems**

In this section, the distributed pricing and risk-aware data offloading problem in UAV-assisted multi-access edge computing systems is formulated by adopting the principles of non-cooperative game theory and solved based on the theory of S-modular games.

#### *4.1. Problem Formulation*

Each user aims at maximizing its expected prospect theoretic utility function (Equation (11)) by deciding, in a distributed and autonomous manner, its optimal data offloading strategy $b_n^{MEC*}$ to the UAV-mounted MEC server, while considering the imposed pricing policy and its personal risk-aware characteristics. Accordingly, the users' pricing and risk-aware data offloading problem is formulated as a distributed optimization problem as follows:

$$\max\_{b\_{\mathbf{n}}^{MEC} \in [0, b\_{\mathbf{n}}]} \mathbb{E}(\mathcal{U}\_{\mathbf{n}}(b\_{\mathbf{n}}^{MEC}, \mathbf{b}\_{-\mathbf{n}}^{MEC})) \tag{12a}$$

$$\text{s.t.} \quad 0 \le b\_n^{MEC} \le b\_n \tag{12b}$$

where $\mathbf{b}_{-\mathbf{n}}^{MEC}$ denotes the amounts of data offloaded by all users other than user $n$.

The distributed optimization problem of users' data offloading can be formulated as a non-cooperative game among the users, $G = [\mathcal{N}, A_n, \mathbb{E}(U_n(b_n^{MEC}, \mathbf{b}_{-\mathbf{n}}^{MEC}))]$, where $\mathcal{N}$ is the set of users, $A_n = [0, b_n]$ is user $n$'s data offloading strategy space, and $\mathbb{E}(U_n(b_n^{MEC}, \mathbf{b}_{-\mathbf{n}}^{MEC}))$ denotes user $n$'s expected prospect theoretic utility function, as defined in the previous section. The solution of the non-cooperative game $G$ should determine each user's optimal data offloading strategy $b_n^{MEC*}$ in order to maximize its expected prospect theoretic utility. The Pure Nash Equilibrium (PNE) approach is adopted and described below, towards analytically seeking the solution of the pricing and risk-aware data offloading problem (Equations (12a) and (12b)).

**Definition 1.** *(Pure Nash Equilibrium Point): A data offloading vector* $\mathbf{b}_{\mathbf{n}}^{MEC*} = (b_1^{MEC*}, \dots, b_N^{MEC*})$ *in the strategy space* $b_n^{MEC*} \in A_n = [0, b_n]$ *is a Pure Nash Equilibrium point if, for every user* $n$*, the following condition holds true:*

$$\mathbb{E}(U_n(b_n^{MEC*}, \mathbf{b}_{-\mathbf{n}}^{MEC*})) \geq \mathbb{E}(U_n(b_n^{MEC}, \mathbf{b}_{-\mathbf{n}}^{MEC*})) \tag{13}$$

*for all* $b_n^{MEC} \in A_n$*.*

The physical interpretation of the above definition is that, at the Pure Nash Equilibrium point, no user has the incentive to unilaterally change its data offloading strategy to the UAV-mounted MEC server given the data offloading strategies of the rest of the users, as its achieved expected prospect theoretic utility cannot be improved.
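The no-unilateral-deviation condition of Equation (13) lends itself to a simple numerical check. The sketch below tests a candidate profile against deviations on a grid, so it can only approximate the condition over the continuous strategy space. The function name `is_pure_nash` and the toy utility are hypothetical; the toy utility is a stand-in with a known equilibrium and is not Equation (11).

```python
def is_pure_nash(profile, upper_bounds, utility, grid=100, tol=1e-9):
    """Grid-based check of the condition in Equation (13).

    profile      -- candidate strategies, one per user
    upper_bounds -- b_n for each user, so that A_n = [0, b_n]
    utility      -- callable utility(n, profile) returning user n's payoff
    """
    for n, bound in enumerate(upper_bounds):
        current = utility(n, profile)
        for i in range(grid + 1):
            deviation = list(profile)
            deviation[n] = bound * i / grid      # unilateral deviation of user n
            if utility(n, deviation) > current + tol:
                return False                     # user n can improve on its own
    return True


# Toy stand-in utility (NOT Equation (11)): each user gains from its own
# offloading, but all users suffer as the shared resource fills up.
def toy_utility(n, profile):
    return profile[n] * (1.0 - sum(profile))


print(is_pure_nash([1 / 3, 1 / 3], [1.0, 1.0], toy_utility))  # True
print(is_pure_nash([0.5, 0.5], [1.0, 1.0], toy_utility))      # False
```

In the toy game, each user's best reply to an opponent playing $1/3$ is again $1/3$, so the first profile passes, whereas at $(0.5, 0.5)$ either user gains by offloading less.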

#### *4.2. Problem Solution*

In order to prove the existence of at least one PNE of the non-cooperative game $G$, as a solution of the maximization problem (Equations (12a) and (12b)), the theory of submodular games is adopted [29]. Submodular games are characterized by strategic substitutes, i.e., when a user offloads more data to the UAV-mounted MEC server, the rest of the users tend to avoid following similar behavior, as the UAV-mounted MEC server's computation resources can become overexploited, leaving none of the users satisfied. Submodular games are of great interest and practical importance as an optimization tool, because they guarantee the existence of at least one PNE, while learning and adjustment tools (such as the best response dynamics) can be used in order to determine such a point.

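**Definition 2.** *(Submodular Games): The non-cooperative game* $G = [\mathcal{N}, A_n, \mathbb{E}(U_n(b_n^{MEC}, \mathbf{b}_{-\mathbf{n}}^{MEC}))]$ *is submodular if, for all the users, the following conditions hold true [30]: (i) the strategy space* $A_n$ *is a compact subset of a Euclidean space; and (ii) the utility function* $\mathbb{E}(U_n(b_n^{MEC}, \mathbf{b}_{-\mathbf{n}}^{MEC}))$ *is smooth, submodular in* $b_n^{MEC}$*, and has non-increasing differences in* $(b_n^{MEC}, \mathbf{b}_{-\mathbf{n}}^{MEC})$*, i.e.,* $\frac{\partial^2 \mathbb{E}(U_n(b_n^{MEC}, \mathbf{b}_{-\mathbf{n}}^{MEC}))}{\partial b_j^{MEC} \partial b_n^{MEC}} \le 0$ *for all* $j \neq n$*.*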


Additionally, in a submodular game, there always exist extremal equilibria [31]: a largest best response strategy $\overline{b}_n^{MEC} = \sup\{b_n^{MEC} \in A_n : BR(b_n^{MEC}, \mathbf{b}_{-\mathbf{n}}^{MEC}) \ge b_n^{MEC}\}$ and a smallest best response strategy $\underline{b}_n^{MEC} = \inf\{b_n^{MEC} \in A_n : BR(b_n^{MEC}, \mathbf{b}_{-\mathbf{n}}^{MEC}) \le b_n^{MEC}\}$ of the non-empty set of Pure Nash Equilibria, where $BR(b_n^{MEC}, \mathbf{b}_{-\mathbf{n}}^{MEC})$ denotes user $n$'s best response strategy to the other users' strategies.

**Theorem 1.** *The non-cooperative game* $G = [\mathcal{N}, A_n, \mathbb{E}(U_n(b_n^{MEC}, \mathbf{b}_{-\mathbf{n}}^{MEC}))]$ *is submodular for all* $d_\tau \in (0, \mu)$*, where* $\mu \in (0, 1)$*, and* $c < \frac{b_n}{d_n}\left(1 - \frac{1}{\hat{t}_n \hat{e}_n}\right)$*, and has at least one Pure Nash Equilibrium point.*

**Proof.** The strategy space $A_n = [0, b_n]$ is a compact subset of a Euclidean space. The user's expected prospect theoretic utility function $\mathbb{E}(U_n(b_n^{MEC}, \mathbf{b}_{-\mathbf{n}}^{MEC}))$, as defined in Equation (11), is smooth, as it has derivatives of all orders everywhere in its domain $A_n$. Towards showing that the user's expected prospect theoretic utility function is submodular in $b_n^{MEC}$ and has non-increasing differences in $(b_n^{MEC}, \mathbf{b}_{-\mathbf{n}}^{MEC})$, we examine the properties of the second order partial derivative of the user's expected prospect theoretic utility function, i.e., $\frac{\partial^2 \mathbb{E}(U_n(b_n^{MEC}, \mathbf{b}_{-\mathbf{n}}^{MEC}))}{\partial b_j^{MEC} \partial b_n^{MEC}} \le 0$.

We can rewrite Equation (11) using Equations (8) and (9), as follows:

$$\mathbb{E}(U_n(b_n^{MEC}, \mathbf{b}_{-\mathbf{n}}^{MEC})) = (b_n^{MEC})^{\alpha_n} \left\{ \left[ (2 - e^{d_\tau - 1}) - \frac{1}{\hat{t}_n \hat{e}_n} - c \frac{d_n}{b_n} \right]^{\alpha_n} (1 - Pr(d_\tau)) - k_n \left( \frac{1}{\hat{t}_n \hat{e}_n} + c \frac{d_n}{b_n} \right)^{\alpha_n} Pr(d_\tau) \right\} \tag{14}$$

We define $\overline{RoR}(d_\tau) = \left[ (2 - e^{d_\tau - 1}) - \frac{1}{\hat{t}_n \hat{e}_n} - c \frac{d_n}{b_n} \right]^{\alpha_n}$ as the user's specific rate of return, which should be positive in order for the user to have an incentive to offload part of its data to the UAV-mounted MEC server. From Equation (7), the UAV-mounted MEC server's rate of return function $RoR(d_\tau)$ is decreasing. Thus, the minimum value of $RoR(d_\tau)$, and correspondingly of the function $\overline{RoR}(d_\tau)$, is attained at $d_\tau = 1$. The physical meaning of $d_\tau = 1$ is that all the users offload their total amount of data to the UAV-mounted MEC server for further processing. Following this observation, we can determine the bounds of the constant pricing factor $c$ that the UAV-mounted MEC server imposes on the users, such that the users still have an incentive to offload part of their data to the MEC server and the imposed pricing does not become prohibitive. Therefore, the feasible bound of the constant pricing factor is determined as follows:

$$\overline{RoR}(d_\tau = 1) > 0 \Rightarrow c < \frac{b_n}{d_n}\left(1 - \frac{1}{\hat{t}_n \hat{e}_n}\right) \tag{15}$$
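The bound of Equation (15) is straightforward to evaluate per user. The sketch below also takes the minimum over users to obtain a bound for a single pricing factor $c$ shared by all users; that aggregation step is an assumption made here for illustration, as are the function names and the sample values, with `local_gain` standing for $1/(\hat{t}_n \hat{e}_n)$.

```python
def pricing_upper_bound(b, d, local_gain):
    """Right-hand side of Equation (15) for one user.

    local_gain stands for 1 / (t_hat_n * e_hat_n).
    """
    return (b / d) * (1.0 - local_gain)


def common_pricing_bound(data, demand, local_gains):
    """Tightest per-user bound, for one pricing factor shared by all users."""
    return min(pricing_upper_bound(b, d, g)
               for b, d, g in zip(data, demand, local_gains))


# Hypothetical users described by (b_n, d_n, 1 / (t_hat_n * e_hat_n)).
data, demand, gains = [10.0, 8.0], [5.0, 2.0], [0.5, 0.5]
print(pricing_upper_bound(10.0, 5.0, 0.5))        # 1.0
print(common_pricing_bound(data, demand, gains))  # 1.0
```

For the first user, setting $c$ exactly at the bound makes the bracketed term of $\overline{RoR}(d_\tau = 1)$ vanish, $(2 - 1) - 0.5 - 1.0 \cdot 5/10 = 0$, which is why the inequality in Equation (15) is strict.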

In addition, the following conditions hold true by performing the corresponding derivations: $\frac{\partial d_\tau}{\partial b_n^{MEC}} > 0$, $\frac{\partial d_\tau}{\partial b_j^{MEC}} > 0$, $\frac{\partial \overline{RoR}(d_\tau)}{\partial b_n^{MEC}} < 0$, $\frac{\partial \overline{RoR}(d_\tau)}{\partial b_j^{MEC}} < 0$, $\frac{\partial^2 \overline{RoR}(d_\tau)}{\partial b_j^{MEC} \partial b_n^{MEC}} < 0$, $\frac{\partial Pr(d_\tau)}{\partial b_n^{MEC}} > 0$, $\frac{\partial Pr(d_\tau)}{\partial b_j^{MEC}} > 0$, $\frac{\partial^2 Pr(d_\tau)}{\partial b_n^{MEC} \partial b_j^{MEC}} = 0$. For notational convenience, we set $A = k_n \left( \frac{1}{\hat{t}_n \hat{e}_n} + c \frac{d_n}{b_n} \right)^{\alpha_n} > 0$, and we calculate the second order partial derivative of the user's expected prospect theoretic utility function, as follows:

$$\begin{split} \frac{\partial^2 \mathbb{E}(U_n(b_n^{MEC}, \mathbf{b}_{-\mathbf{n}}^{MEC}))}{\partial b_j^{MEC} \partial b_n^{MEC}} &= \alpha_n (b_n^{MEC})^{\alpha_n - 1} \left\{ \frac{\partial \overline{RoR}(d_\tau)}{\partial b_j^{MEC}} [1 - Pr(d_\tau)] - \overline{RoR}(d_\tau) \frac{\partial Pr(d_\tau)}{\partial b_j^{MEC}} - A \frac{\partial Pr(d_\tau)}{\partial b_j^{MEC}} \right\} \\ &\quad + (b_n^{MEC})^{\alpha_n} \left\{ \frac{\partial^2 \overline{RoR}(d_\tau)}{\partial b_j^{MEC} \partial b_n^{MEC}} [1 - Pr(d_\tau)] - \frac{\partial \overline{RoR}(d_\tau)}{\partial b_n^{MEC}} \frac{\partial Pr(d_\tau)}{\partial b_j^{MEC}} - \frac{\partial \overline{RoR}(d_\tau)}{\partial b_j^{MEC}} \frac{\partial Pr(d_\tau)}{\partial b_n^{MEC}} \right\} \\ &= (b_n^{MEC})^{\alpha_n - 1} \left\{ \alpha_n \frac{\partial \overline{RoR}(d_\tau)}{\partial b_j^{MEC}} [1 - Pr(d_\tau)] - \alpha_n \overline{RoR}(d_\tau) \frac{\partial Pr(d_\tau)}{\partial b_j^{MEC}} - A \alpha_n \frac{\partial Pr(d_\tau)}{\partial b_j^{MEC}} \right. \\ &\quad \left. + b_n^{MEC} \frac{\partial^2 \overline{RoR}(d_\tau)}{\partial b_j^{MEC} \partial b_n^{MEC}} [1 - Pr(d_\tau)] - b_n^{MEC} \frac{\partial \overline{RoR}(d_\tau)}{\partial b_n^{MEC}} \frac{\partial Pr(d_\tau)}{\partial b_j^{MEC}} - b_n^{MEC} \frac{\partial \overline{RoR}(d_\tau)}{\partial b_j^{MEC}} \frac{\partial Pr(d_\tau)}{\partial b_n^{MEC}} \right\} \end{split} \tag{16}$$

Let $\psi(d_\tau) = \frac{\partial \overline{RoR}(d_\tau)}{\partial b_j^{MEC}} \left[ \alpha_n - \alpha_n Pr(d_\tau) - b_n^{MEC} \frac{\partial Pr(d_\tau)}{\partial b_n^{MEC}} \right] - b_n^{MEC} \frac{\partial \overline{RoR}(d_\tau)}{\partial b_n^{MEC}} \frac{\partial Pr(d_\tau)}{\partial b_j^{MEC}}$. We can rewrite Equation (16), as follows:

$$\frac{\partial^2 \mathbb{E}(U_n(b_n^{MEC}, \mathbf{b}_{-\mathbf{n}}^{MEC}))}{\partial b_j^{MEC} \partial b_n^{MEC}} = (b_n^{MEC})^{\alpha_n - 1} \left\{ \psi(d_\tau) - \alpha_n \overline{RoR}(d_\tau) \frac{\partial Pr(d_\tau)}{\partial b_j^{MEC}} - A \alpha_n \frac{\partial Pr(d_\tau)}{\partial b_j^{MEC}} + b_n^{MEC} \frac{\partial^2 \overline{RoR}(d_\tau)}{\partial b_j^{MEC} \partial b_n^{MEC}} [1 - Pr(d_\tau)] \right\} \tag{17}$$

It is observed that the last three terms of Equation (17) are negative; thus, we study the properties of the function $\psi(d_\tau)$, $\forall n \in \mathcal{N}$. For $d_\tau = 0$, we have $b_n^{MEC} = 0$. Thus, we calculate:

$$\psi(d_\tau = 0) = \frac{\partial \overline{RoR}(0)}{\partial b_j^{MEC}} \alpha_n < 0 \tag{18}$$

For $d_\tau \approx 1$, we have $b_n^{MEC} = b_n$, $\forall n \in \mathcal{N}$. Thus, we calculate:

$$\psi(d_\tau \approx 1) = -b_n \left[ \frac{\partial \overline{RoR}(1)}{\partial b_j^{MEC}} \frac{\partial Pr(1)}{\partial b_n^{MEC}} + \frac{\partial \overline{RoR}(1)}{\partial b_n^{MEC}} \frac{\partial Pr(1)}{\partial b_j^{MEC}} \right] > 0 \tag{19}$$

Since $\psi(d_\tau)$ is continuous, negative at $d_\tau = 0$ (Equation (18)), and positive for $d_\tau \approx 1$ (Equation (19)), the Bolzano Theorem [32] implies that there exists at least one $\mu \in (0, 1)$ such that $\psi(d_\tau = \mu) = 0$. If $\mu$ is the smallest value in $(0, 1)$ such that $\psi(d_\tau = \mu) = 0$, then $\psi(d_\tau) < 0$, $\forall d_\tau \in (0, \mu)$. Thus, we conclude that

$$\frac{\partial^2 \mathbb{E}(U_n(b_n^{MEC}, \mathbf{b}_{-\mathbf{n}}^{MEC}))}{\partial b_j^{MEC} \partial b_n^{MEC}} < 0, \quad \forall d_\tau \in (0, \mu), \; \mu \in (0, 1). \tag{20}$$

Thus, the non-cooperative game $G$ is submodular $\forall d_\tau \in (0, \mu)$, $\mu \in (0, 1)$, and $c < \frac{b_n}{d_n}\left(1 - \frac{1}{\hat{t}_n \hat{e}_n}\right)$. Therefore, the non-cooperative game $G = [\mathcal{N}, A_n, \mathbb{E}(U_n(b_n^{MEC}, \mathbf{b}_{-\mathbf{n}}^{MEC}))]$ has at least one Pure Nash Equilibrium point $\mathbf{b}_{\mathbf{n}}^{MEC*} = (b_1^{MEC*}, \dots, b_N^{MEC*})$ [33].
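The sign condition of Equation (20) can also be probed numerically with a central finite-difference estimate of the mixed partial derivative. The sketch below is a rough illustration only: it assumes a symmetric two-user system with hypothetical parameter values (chosen so that Equation (15) holds), and `local_gain` stands for $1/(\hat{t}_n \hat{e}_n)$. It reports the observed sign at a few points and is not a substitute for the analytical argument.

```python
import math


def mixed_partial(f, x, y, h=1e-3):
    """Central finite-difference estimate of d^2 f / (dx dy)."""
    return (f(x + h, y + h) - f(x + h, y - h)
            - f(x - h, y + h) + f(x - h, y - h)) / (4.0 * h * h)


def expected_utility_two_users(b1, b2, b=1.0, d=1.0, c=0.1, k=2.0,
                               alpha=0.5, local_gain=0.2, theta=1.0):
    """Equation (14) for user 1 in a hypothetical symmetric two-user system."""
    load = d * b1 / b + d * b2 / b
    d_tau = -1.0 + 2.0 / (1.0 + math.exp(-theta * load))      # Equation (6)
    price = c * d / b
    margin = (2.0 - math.exp(d_tau - 1.0)) - local_gain - price
    p_fail = d_tau ** 2                                        # Equation (10)
    return b1 ** alpha * (margin ** alpha * (1.0 - p_fail)
                          - k * (local_gain + price) ** alpha * p_fail)


# Sanity check on a function whose mixed derivative is known to be -1.
print(round(mixed_partial(lambda x, y: -x * y, 0.3, 0.4), 6))

# Observed sign of the mixed derivative of user 1's expected utility.
for b1, b2 in ((0.2, 0.2), (0.5, 0.5), (0.8, 0.8)):
    value = mixed_partial(expected_utility_two_users, b1, b2)
    print(b1, b2, "negative" if value < 0 else "non-negative")
```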

#### **5. Pricing and Risk-Aware Distributed Data Offloading Algorithm**

Towards enabling the users to determine their optimal data offloading strategy $b_n^{MEC*}$ in a distributed manner, the Best Response Dynamics (BRD) approach is adopted. The best response strategy of each user subject to the selected data offloading strategies of the rest of the users is formally determined as follows:

$$BR(b_n^{MEC}, \mathbf{b}_{-\mathbf{n}}^{MEC}) = b_n^{MEC*} = \underset{b_n^{MEC} \in [0, b_n]}{\arg\max} \; \mathbb{E}(U_n(b_n^{MEC}, \mathbf{b}_{-\mathbf{n}}^{MEC})). \tag{21}$$

Given that we have already proven that the non-cooperative game $G = [\mathcal{N}, A_n, \mathbb{E}(U_n(b_n^{MEC}, \mathbf{b}_{-\mathbf{n}}^{MEC}))]$ belongs to the class of submodular games as stated above, and therefore possesses at least one PNE point, it also readily follows that the iterated best-response dynamics always converges to a Pure Nash Equilibrium point [34,35].

Subsequently, capitalizing on the above argumentation, a distributed, iterative, and low-complexity algorithm is introduced in order to determine the users' optimal data offloading strategies to the UAV-mounted MEC server (see Algorithm 1). The proposed algorithm follows the philosophy and principles of the best response dynamics learning mechanism: at each iteration, each user aims at maximizing its expected prospect theoretic utility given the data offloading strategies of the rest of the users. The complexity of the pricing and risk-aware data offloading algorithm is $O(N \cdot Ite \cdot A)$, where $Ite$ is the total number of iterations required for the algorithm to converge to the PNE, and $A$ is the complexity of solving Equation (21). Detailed numerical results regarding the operation performance and scalability of our approach and algorithm, in terms of iterations, are presented in the following section as well.

#### **Algorithm 1** Pricing and risk-aware data offloading algorithm

**Input:** $N$, $c$, $b_n$, $d_n$, $f_n$, $\gamma_n$, $\forall n \in \mathcal{N}$

**Output:** $\mathbf{b}^{MEC*}$

**Initialization:** $ite = 0$, $Convergence$ = **false**, $b_n^{MEC(ite=0)}$, $\forall n \in \mathcal{N}$

- **while** $Convergence$ == **false** **do**
  - $ite = ite + 1$;
  - **for** $n = 1$ to $N$ **do**
    - user $n$ determines $b_n^{MEC*(ite)}$ w.r.t. $\mathbf{b}_{-\mathbf{n}}^{MEC*(ite-1)}$ (Equation (21)) and receives $\mathbb{E}(U_n)^{(ite)}$
  - **end for**
  - **if** $b_n^{MEC*(ite)} = b_n^{MEC*(ite-1)}$ **then**
    - $Convergence$ = **true**
  - **end if**
- **end while**
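The loop of Algorithm 1 can be sketched in Python as follows. This is a minimal sketch under several assumptions, not the authors' simulation code: Equation (21) is solved by a plain grid search over $[0, b_n]$, all users update simultaneously from the previous iteration's strategies, `local_gain` stands for $1/(\hat{t}_n \hat{e}_n)$, the iteration cap `max_iter` is an added safeguard, and all parameter values are hypothetical.

```python
import math


def total_demand(offloaded, data, cycles, theta):
    """Equation (6)."""
    load = sum(d * x / b for x, b, d in zip(offloaded, data, cycles))
    return -1.0 + 2.0 / (1.0 + math.exp(-theta * load))


def expected_utility(n, offloaded, data, cycles, c, k, alpha, local_gain, theta):
    """Equation (14) for user n, given the strategies of all users."""
    d_tau = total_demand(offloaded, data, cycles, theta)
    price = c * cycles[n] / data[n]
    # Equation (15) keeps this margin positive; clamp to guard the power.
    margin = max((2.0 - math.exp(d_tau - 1.0)) - local_gain[n] - price, 0.0)
    scale = offloaded[n] ** alpha[n]
    p_fail = d_tau ** 2                                   # Equation (10)
    survive = scale * margin ** alpha[n]
    fail = -k[n] * scale * (local_gain[n] + price) ** alpha[n]
    return survive * (1.0 - p_fail) + fail * p_fail


def best_response(n, offloaded, data, grid, **params):
    """Equation (21), approximated by a grid search over [0, b_n]."""
    best_x, best_u = 0.0, -math.inf
    for i in range(grid + 1):
        trial = list(offloaded)
        trial[n] = data[n] * i / grid
        u = expected_utility(n, trial, data, **params)
        if u > best_u:
            best_x, best_u = trial[n], u
    return best_x


def best_response_dynamics(data, start, grid=200, max_iter=100, **params):
    """Iterate best responses until no user changes its strategy."""
    current = list(start)
    for ite in range(1, max_iter + 1):
        updated = [best_response(n, current, data, grid, **params)
                   for n in range(len(data))]
        if updated == current:
            return current, ite, True
        current = updated
    return current, max_iter, False


# Hypothetical three-user system.
data = [1.0, 1.0, 1.0]
params = dict(cycles=[1.0, 1.0, 1.0], c=0.1, k=[2.0] * 3,
              alpha=[0.5] * 3, local_gain=[0.2] * 3, theta=1.0)
profile, iterations, converged = best_response_dynamics(
    data, start=[0.0, 0.0, 0.0], **params)
print(profile, iterations, converged)
```

The function reports whether the stopping test of Algorithm 1 was met within `max_iter` iterations; on a discretized strategy space a profile returned with `converged == True` is a fixed point of the grid-based best responses.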

#### **6. Numerical Results**

In this section, we provide a series of numerical results, obtained via modeling and simulation, evaluating the performance and the inherent attributes of the proposed pricing and risk-aware data offloading framework. Initially, the pure operational characteristics of the proposed framework are presented (Section 6.1), while the impact of the introduced usage-based pricing scheme is quantified and studied (Section 6.2). Moreover, a scalability analysis of the proposed framework is performed in Section 6.3, while the impact of the prospect theoretic parameters, reflecting the user behavioral pattern in terms of loss aversion and sensitivity, on the overall system performance is evaluated in Section 6.4. The simulations were executed on an Intel Core i5-4300U CPU @ 1.90 GHz × 4 with 8 GB RAM (New York, NY, USA). The main parameters used in our simulation, along with their typical values, are presented in Table 1. In the rest of the analysis, and in particular in Sections 6.1 and 6.2, we have considered $N = 25$ users, and sensitivity ($\alpha_n$) and loss aversion ($k_n$) parameter values as indicated in Table 1. However, in Sections 6.3 and 6.4, a wider range of the number of users and of the loss aversion and sensitivity parameters is considered.



#### *6.1. Pure Operation of the Framework*

Figure 3 presents the amount of offloaded data by each user to the UAV-mounted MEC server, as well as the average amount of offloaded data, as a function of the pricing and risk-aware data offloading algorithm's iterations. The results reveal that the introduced best response dynamics-based algorithm converges to the PNE quickly (fewer than 10 iterations are required for all users). Moreover, Figures 4 and 5 illustrate each user's expected prospect-theoretic utility and the corresponding usage-based pricing imposed by the UAV-mounted MEC server as a function of the algorithm's iterations. The corresponding results reveal that initially the users tend to offload a great portion of their data to the MEC server, as observed in Figure 3, and therefore their expected prospect-theoretic utility increases (Figure 4). Specifically, at the first iteration of the algorithm, the users behave aggressively, offloading a large amount of data to the UAV-mounted MEC server (Figure 3) towards enjoying a high expected utility (Figure 4). However, at the same time, this behavior is expected to increase the probability of failure of the UAV-mounted MEC server (as confirmed below in Figure 7), and accordingly to lead to the users being penalized with a high price. This is demonstrated in Figure 5, where, because the users exploit the computing capabilities of the MEC server more heavily, the latter imposes on them a higher usage-based pricing. Consequently, in combination with the impact of the probability of failure and the rate of return, as the iterations evolve, the users decrease the amount of data that they offload to the MEC server (Figure 3) following the learning mechanism of the best response dynamics, in order to experience a lower pricing (Figure 5), and finally they converge to the PNE.

**Figure 3.** Amount of data offloaded by each user vs. iterations.

**Figure 4.** Expected utility of each user vs. iterations.

**Figure 5.** Pricing imposed by the server on each user vs. iterations.

Figure 6 depicts the users' average expected prospect-theoretic utility and the users' average experienced usage-based pricing for exploiting the UAV-mounted MEC server's computing capabilities, as a function of the algorithm's iterations. In addition, Figure 7 presents the UAV-mounted MEC server's probability of failure as a function of the algorithm's iterations. The trend in the users' data offloading strategies described above is also observed from the system's point of view. Specifically, all the users initially tend to aggressively offload a large amount of data to the MEC server in order to achieve a greater utility (Figure 6). However, the probability of failure of the UAV-mounted MEC server increases due to the over-exploitation of its computing capabilities (Figure 7). Thus, the MEC server imposes a higher pricing on the users (Figure 6) to control their greedy and selfish data offloading behavior.

**Figure 6.** Average users' expected utility and average users' pricing vs. iterations.

**Figure 7.** Probability of failure of MEC server vs. iterations.

#### *6.2. Impact of Usage-Based Pricing*

In this section, we study the impact of the usage-based pricing imposed by the UAV-mounted MEC server on the users' data offloading strategies, as well as on the overall operation of the system. Specifically, Figure 8 presents the probability of failure of the MEC server as a function of the pricing factor $c$ (Equation (3)). Moreover, the users' average expected utility, the users' average amount of offloaded data, and the pricing imposed by the MEC server are presented in Figure 9, as a function of the pricing factor $c$ as well. The results reveal that, as the pricing policy becomes stricter (i.e., increasing values of the pricing factor), the usage-based pricing experienced by the users increases (Figure 9) and the exploitation of the MEC server's computing capabilities becomes cost inefficient after some point (with respect to the total offloaded data). Consequently, the users tend to offload a smaller amount of data to the MEC server (Figure 9), the MEC server becomes less congested in terms of processing the users' computation tasks, and its probability of failure decreases (Figure 8).

Based on the results presented in Figure 9, it is observed that the users' average expected utility is concave with respect to the pricing factor. Specifically, small values of the pricing factor correspond to less strict pricing policies; thus, the users over-exploit the MEC server's computing capabilities (i.e., high values of the MEC server's probability of failure are observed), resulting in low values of expected utility. On the other hand, high values of the pricing factor discourage the users from exploiting the UAV-mounted MEC server's computing capabilities, again leading to low levels of users' average expected utility. Therefore, a balanced pricing policy is required to keep the quality of experience of the users at high levels.

**Figure 8.** Probability of failure of MEC server vs. the pricing factor.

**Figure 9.** Average expected utility, offloaded data, and pricing vs. the pricing factor.

#### *6.3. Scalability Evaluation*

In this section, a scalability evaluation of the proposed pricing and risk-aware data offloading framework is provided, considering an increasing number of users in the system. Table 2 presents the iterations and the corresponding overall execution time required by the proposed algorithm to converge to the PNE point. Given the distributed nature of the best response dynamics approach, we observe that its execution time scales quite well for an increasing number of users, achieving a close to real-time implementation in realistic scenarios. Respectively, the users' average expected utility, the users' average amount of offloaded data, and the imposed pricing by the UAV-mounted MEC server are presented in Figure 10, as a function of the number of users. The scalability evaluation is complemented by the results presented in Figure 11, which depict the convergence of the users' average amount of offloaded data as a function of the required number of iterations, for different numbers of users. In particular, we observe that, as the number of users in the system increases, they tend to offload a lower average amount of data to the MEC server (Figures 10 and 11), as the latter becomes over-congested. Thus, they experience both lower pricing and lower expected utility (Figure 10), as they are driven to process more data on their local devices and accordingly consume their own resources, i.e., battery. It is also observed that the user's experienced pricing $c_n(b_n^{MEC})$ and the user's offloaded data $b_n^{MEC}$ (Figure 10) have the same trend, due to their one-to-one relationship stemming from Equation (3), while the corresponding curves also appear to be overlapping. However, it should be noted that the actual values of the two curves are different, since there are two different right vertical axes in Figure 10 (each one reflecting the values of the respective curve).

**Table 2.** Algorithm's execution time per user for a different number of users.


**Figure 10.** Users' average expected utility, users' average offloaded data and pricing at the PNE vs. number of users on the system.

**Figure 11.** Users' average data offloading vs. iterations for different numbers of users.

#### *6.4. Impact of Prospect Theoretic Parameters and User Competition*

In the following, the impact of the prospect theoretic parameters, reflecting the user behavioral pattern in terms of loss aversion and sensitivity, on the overall system performance is evaluated.

Specifically, in Figures 12 and 13, we present the average user offloaded data and the corresponding Probability of Failure (PoF) of the server as functions of the sensitivity parameter $\alpha_n$ and the loss aversion index $k_n$, respectively. As can be seen from Figure 12, as the sensitivity parameter $\alpha_n$ increases, the users tend to offload more data to the MEC server, since they value larger gains more than gains of smaller magnitude. The increased volume of offloaded data results in an increase in the corresponding Probability of Failure of the server as well. In Figure 13, on the other hand, we can see that, as the loss aversion index $k_n$ increases, less data are offloaded to the server, since a higher value signifies greater loss aversion for the users, resulting in a smaller Probability of Failure of the server.

**Figure 12.** Average offloading data and PoF vs. sensitivity parameter $\alpha_n$.

**Figure 13.** Average offloading data and PoF vs. loss aversion index $k_n$.

In order to further study the effect of competition of users for the CPR (i.e., UAV-mounted MEC server), we use the Fragility under Competition (FuC) metric [38]. This metric is expressed as the ratio between the Probability of Failure of the MEC server when $N$ users are competing for the MEC server's resources at the equilibrium state, versus the Probability of Failure of the MEC server when there is only one user offloading data. Formally, the Fragility under Competition is defined as

$$FuC = \frac{Pr(\mathbf{b}^{MEC*}_{N})}{Pr(\mathbf{b}^{MEC*}_{1})},$$

where $\mathbf{b}^{MEC*}_{N}$ denotes the equilibrium point when $N$ users are present and $\mathbf{b}^{MEC*}_{1}$ denotes the corresponding equilibrium point if only one user was present, with the same risk preferences as the group of $N$ users.
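As a minimal illustration of the ratio defined above, the following Python sketch computes the FuC from two equilibrium offloading profiles. The failure probability function `prob_failure`, its `scale` parameter, and the numerical equilibrium values are hypothetical placeholders chosen only for demonstration; the only property borrowed from the text is that the failure probability increases with the total offloaded data.

```python
import math
from typing import Callable, Sequence


def prob_failure(total_offloaded: float, scale: float = 10.0) -> float:
    """Hypothetical failure probability (an assumption, not the paper's function).

    It is increasing in the total offloaded data and bounded in [0, 1).
    """
    return 1.0 - math.exp(-total_offloaded / scale)


def fragility_under_competition(
    b_eq_n: Sequence[float],
    b_eq_1: float,
    pr: Callable[[float], float] = prob_failure,
) -> float:
    """Return FuC = Pr(b*_N) / Pr(b*_1).

    b_eq_n: equilibrium offloaded data of each of the N competing users.
    b_eq_1: equilibrium offloaded data when a single user is present.
    """
    pr_single = pr(b_eq_1)
    if pr_single <= 0.0:
        raise ValueError("single-user failure probability must be positive")
    return pr(sum(b_eq_n)) / pr_single


# Illustrative (made-up) equilibria: three users offloading 1.0 each,
# versus a single user offloading 2.0 when alone.
fuc = fragility_under_competition([1.0, 1.0, 1.0], 2.0)
```

A value above 1 indicates that competition makes the shared server more likely to fail than in the single-user case.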

In Figures 14 and 15, we present the FuC metric as a function of the number of users in the system, for different values of the sensitivity parameter $\alpha_n$ and the loss aversion index $k_n$, respectively. In both figures, we observe that, as the number of users increases, the FuC increases as well, since more users are competing for the CPR and consequently more data are offloaded to the server, until it eventually plateaus. Concerning the effect that the prospect theoretic parameters have on the FuC metric, in Figure 14, we can see that the higher the value of the sensitivity parameter $\alpha_n$, the higher the FuC as well. This is justified by the fact that the higher the value of $\alpha_n$, the greater the sensitivity of the users towards gains and losses of higher magnitude compared to those of smaller magnitude (Figure 12). As a result, users tend to offload more data to the MEC server and the server is more prone to failure, and accordingly an increase in FuC is expected. With respect to the loss aversion index $k_n$, we can see in Figure 15 that, as $k_n$ increases, the FuC decreases. This is due to the fact that, as $k_n$ increases, users become more loss averse and thus tend to offload less data to the MEC server in order to avoid potential failure, as already shown in Figure 13. The less data are offloaded to the server, the lower the probability that the server will fail, thus resulting in a lower FuC. Furthermore, based on the results of Figures 14 and 15, the FuC appears initially more sensitive to the number of users in the case of $\alpha_n$ compared to $k_n$, in our setting and experiments. It is clarified that the overall observed increasing trend of the FuC with respect to the increasing number of users in these figures is well aligned with the fact that the failure probability is an increasing function of the total offloaded data of all users.
However, the actual slope of the corresponding curves mainly depends on the used values for *α<sup>n</sup>* and *kn* for the generation of these curves, which are selected here only for demonstration purposes, and are not correlated with each other in any way.

**Figure 14.** Fragility under Competition vs. no. of users for different sensitivity parameters $\alpha_n$.

**Figure 15.** Fragility under Competition vs. no. of users for different loss aversion indices $k_n$.

#### **7. Conclusions**

In this paper, a resource-based pricing and user risk-aware data offloading framework is proposed for UAV-assisted multi-access edge computing systems. In particular, a usage-based pricing mechanism is introduced regarding the exploitation of the MEC server's computing capabilities by the users, and is properly incorporated within the principles and modeling of Prospect Theory, which is used to capture the users' risk-aware behavior in the overall data offloading decision-making. Initially, the user's prospect-theoretic utility function is formulated by quantifying the user's risk-seeking and loss aversion behavior, while taking into account the pricing mechanism. Accordingly, the users' pricing and risk-aware data offloading problem is formulated as a distributed maximization problem of each user's expected prospect-theoretic utility function and addressed as a non-cooperative game among the users. The existence of a Pure Nash Equilibrium for the formulated non-cooperative game is shown based on the theory of submodular games. An iterative and distributed algorithm is introduced that converges to the PNE, following the learning rule of the best response dynamics. Detailed numerical results are presented highlighting the operational features and scalability properties of the proposed framework, while at the same time providing useful insights about the benefits of adopting the usage-based pricing scheme.

Our current and future research work focuses on treating the overall key problem of data offloading in various cloud computing environments, such as fog computing, where a large number of computing devices imposes additional scalability and stability challenges. Moreover, it is noted that, in this work, the data offloading problem was mainly treated from a computing resources perspective. However, depending on the environment assumed, the overall process could be affected by the wireless communication aspects between the UAV and users. The proposed framework could be adapted and extended to treat this aspect, either implicitly through the cost factors and functions considered when using the server resources, or explicitly by modeling the transmission characteristics (e.g., delay, rate, energy) involved in the offloading process.

**Author Contributions:** All authors contributed extensively to the work presented in this paper. G.M. contributed to the design of the algorithm, developed the code of the overall framework, executed the evaluation experiment, and contributed to the discussions and analysis of the overall theoretical framework. E.E.T. and S.P. were responsible for the overall orchestration of the performance evaluation work and had the overall coordination in the writing of the article. All authors have read and agreed to the published version of the manuscript.

**Funding:** The research work was supported by the Hellenic Foundation for Research and Innovation (H.F.R.I.) under the "First Call for H.F.R.I. Research Projects to support Faculty members and Researchers and the procurement of high-cost research equipment grant" (Project Number: HFRI-FM17-2436). The research of Eirini Eleni Tsiropoulou was conducted as part of the NSF CRII-1849739.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Edge Computing Resource Allocation for Dynamic Networks: The DRUID-NET Vision and Perspective**

#### **Dimitrios Dechouniotis 1,\*, Nikolaos Athanasopoulos 2, Aris Leivadeas 3, Nathalie Mitton 4, Raphael Jungers <sup>5</sup> and Symeon Papavassiliou <sup>1</sup>**


Received: 23 March 2020; Accepted: 10 April 2020; Published: 14 April 2020

**Abstract:** The potential offered by the abundance of sensors, actuators, and communications in the Internet of Things (IoT) era is hindered by the limited computational capacity of local nodes. Several key challenges should be addressed to optimally and jointly exploit the network, computing, and storage resources, guaranteeing at the same time feasibility for time-critical and mission-critical tasks. We propose the DRUID-NET framework to take upon these challenges by dynamically distributing resources when the demand is rapidly varying. It includes analytic dynamical modeling of the resources, offered workload, and networking environment, incorporating phenomena typically met in wireless communications and mobile edge computing, together with new estimators of time-varying profiles. Building on this framework, we aim to develop novel resource allocation mechanisms that explicitly include service differentiation and context-awareness, being capable of guaranteeing well-defined Quality of Service (QoS) metrics. DRUID-NET goes beyond the state of the art in the design of control algorithms by incorporating resource allocation mechanisms to the decision strategy itself. To achieve these breakthroughs, we combine tools from Automata and Graph theory, Machine Learning, Modern Control Theory, and Network Theory. DRUID-NET constitutes the first truly holistic, multidisciplinary approach that extends recent, albeit fragmented results from all aforementioned fields, thus bridging the gap between efforts of different communities.

**Keywords:** edge computing; internet of things; mobile robots; resource allocation; control co-design

#### **1. Introduction**

The Internet of Things (IoT) consists of low-cost, efficient sensors, actuators, and computing units that together form a system of interrelated computing, sensing, and communication devices, which facilitates and improves everyday life in cities and industry. IoT is foreseen to reach 500 billion devices connected to the Internet by 2030 [1], while the global mobile traffic is expected to increase sevenfold by 2021 [2]. Although significant improvements have been achieved in terms of hardware advances and processing capabilities at the device level, in most cases IoT devices (e.g., smart devices, sensors, actuators, mobile agents) still cannot meet, and more importantly cannot guarantee, the required high performance and/or fulfillment of time constraints for time-critical and mission-critical IoT-enabled applications. Thus, offloading computation- and energy-intensive tasks to powerful computing infrastructure for further processing becomes of vital importance.

The success of computation offloading, and consequently the performance of IoT-enabled applications, depends on many contextual parameters, e.g., the user's mobility, various wireless parameters, and the availability of computing resources in the data center. Most modern IoT-enabled applications rely on continuously moving people or mobile agents. Regarding the latter, various types of autonomous mobile agents or unmanned vehicles are used. Typical examples of these agents are unmanned aerial vehicles (UAVs), which are widely used in several human activities in the context of smart cities, agriculture, area surveillance, rescue missions, and event coverage [3]. UAVs can be used individually or in a swarm, and they are equipped with various sensors in order to complete a mission or to execute their own tasks, such as trajectory planning and positioning. Their limited computing resources and energy reserves do not allow local data processing. Thus, data offloading seems the only viable solution for the massive use of UAVs in various daily scenarios. Data are transmitted through wireless links, i.e., cellular or WiFi, and the quality of the wireless connection heavily depends on signal strength, interference, packet dropouts, and other parameters related to the wireless environment, which must be considered in the offloading decision.

Computation offloading aims to save time and energy at the end user's side. Cloud computing seems the natural selection for offloading, as it is the prevalent service delivery model nowadays. However, the high network delay for sending data over the public Internet counterbalances the benefits of the powerful computing resources that are available at a cloud data center. Accordingly, Multi-Access Edge Computing (MEC) [4] and Fog Computing [5] have arisen as promising approaches to overcome this obstacle and provide the benefits of cloud computing in the proximity of the end-users. Over the last few years, powerful UAVs have been considered as a means to provide computing support to the end-users by acting as UAV-mounted MEC servers [6]. In that respect, the UAV-mounted MEC servers in combination with ground MEC servers collectively create a fog computing system [7], supporting the task offloading of end-users' applications. Similarly, the use of clusters of UAV-mounted MEC servers is suggested [8], allowing opportunistic task offloading to neighboring UAV clusters with sufficient computing resources. In such a UAV-assisted network, computing-intensive tasks are offloaded and executed in a nearby small-size edge data center, which is either directly connected to a wireless access point or embedded on the UAV itself. The key difference between cloud and edge data centers is that the latter have a finite amount of computing resources, which requires fine-grained resource management towards meeting the strict constraints of the deployed time- and mission-critical applications.

#### *The DRUID-NET Perspective and Contributions*

This article presents the vision and perspective of the DRUID-NET (eDge computing ResoUrce allocatIon for Dynamic NETworks) framework, along with a detailed description of its main concepts and objectives. While considering the end-user's mobility and the parameters of the wireless connection environment, the DRUID-NET framework aims at developing workload profiles and holistic, modular dynamic performance models of IoT-enabled applications based on the appropriate theoretical tools. Furthermore, the article aims to outline the control principles of novel resource management systems for these kinds of applications. In particular, the key research threads and topics of this article are summarized as follows:


prediction mechanisms will enable more accurate, adaptive, and successful data offloading and resource allocation mechanisms.


The rest of the article is organized as follows. In Section 2, the current state of the art is presented. Section 3 demonstrates the conceptual architecture of the DRUID-NET framework, while Section 4 describes three IoT-enabled use cases where the proposed solution is applicable. Finally, Section 5 draws the conclusions and future directives of our research.

#### **2. Related Work and Motivation**

This section provides a comprehensive presentation of the studies in the recent literature that are most relevant to the DRUID-NET framework. Aligned with the DRUID-NET objectives, Abdelzaher et al. [9] presented five challenges on IoT applications and Edge Computing. This study focused mostly on deep learning-based application modeling, optimal offloading, closed loop guarantees, and collaborative offloading. Towards these directions, the related work is categorized under three major classes: (i) IoT workload profile, (ii) performance modeling and resource allocation, and (iii) control co-design.

#### *2.1. IoT Workload Profile*

The estimation of the workload and communication patterns in IoT-Fog/Edge networks has been explored only slightly, due to the high heterogeneity of co-existing devices. Nevertheless, there is no doubt that the proper estimation of the offered workload and communication patterns could lead to a more efficient utilization of the underlying infrastructure.

The authors of [10] considered a two-tier network architecture consisting of shallow and deep cloudlets, and explored the benefits of hierarchical capacity provisioning based on queuing analysis. Although shown to be efficient in very specific cases, this approach cannot be generalized in principle. Osmotic Computing [11] relied on the deployment of lightweight microservices on resource-constrained IoT platforms at the network edge, coupled with more complex microservices running on large-scale datacenters. MobiQoR [12] introduced a new metric, Quality of Results, to validate the quality of edge resource deployment. Nevertheless, none of these approaches attempted to estimate the IoT workload, which in turn could significantly enhance the corresponding deployments. The authors of [13] and subsequently of [14] analyzed the resource allocation of a three-layer infrastructure (IoT, Edge, Cloud) under dynamic network conditions. However, they took into consideration the dynamic opt-in and opt-out of IoT devices, while ignoring their instantaneous workload generation. To the best of our knowledge, the only attempts to estimate workload refer to cloud utilization [15,16], and as such they did not capture the locality of the heterogeneous IoT traffic.

A promising approach to derive workload profiles is to use machine learning techniques. Applying deep or machine learning techniques for IoT applications is not new [17], but most of the time they are centralized and do not need any adaptation to fit the limitations of specific IoT devices. In DRUID-NET, we will rely on existing estimation methods, such as [18], to estimate the workload of hardware-constrained devices. In existing works, the focus is placed on one specific resource each time (e.g., energy, memory, computing, etc.) [19]. The DRUID-NET framework aims at extending them to multiple resources, while combining these approaches with predictive methods, which have only been slightly explored for IoT due to resource limitations. Thus far, methods such as ARIMA [20], dead reckoning [21], Kalman filters [22], Thompson sampling [23], or Bayesian approaches [24] have mainly been investigated for navigation and position prediction [20], data reduction [24], link prediction [25], or medium occupation [23]. Our aim is to provide a unique distributed and adaptive multi-resource estimation and prediction suitable for IoT devices. The DRUID-NET goal is to derive communication patterns, clearly defined in time and size, towards assessing the need for edge resources in time and space. This edge-resource sizing, combined with performance modeling, controlled mobility of edge resources, and resource allocation, will enable the adaptive deployment of sufficient resources, on demand, and in an efficient manner.
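To make the idea of lightweight workload prediction more concrete, the following Python sketch applies a scalar Kalman filter to a stream of noisy workload measurements. It is only an illustration under stated assumptions: the random-walk workload model, the function name `kalman_workload_estimates`, and all variance values are hypothetical and are not taken from the cited works or from the DRUID-NET design.

```python
from typing import Iterable, List


def kalman_workload_estimates(
    measurements: Iterable[float],
    process_var: float = 1.0,       # assumed variance of the workload drift per step
    measurement_var: float = 4.0,   # assumed variance of the measurement noise
    initial_estimate: float = 0.0,
    initial_var: float = 100.0,     # large value: little confidence in the initial guess
) -> List[float]:
    """Filter noisy workload samples with a scalar Kalman filter.

    Assumed model (random walk): x_k = x_{k-1} + w_k,  z_k = x_k + v_k.
    """
    estimate, var = initial_estimate, initial_var
    estimates: List[float] = []
    for z in measurements:
        # Prediction step: the mean is unchanged, the uncertainty grows.
        var += process_var
        # Update step: blend prediction and measurement via the Kalman gain.
        gain = var / (var + measurement_var)
        estimate += gain * (z - estimate)
        var *= 1.0 - gain
        estimates.append(estimate)
    return estimates


# Example: requests per second reported by a (hypothetical) IoT gateway.
smoothed = kalman_workload_estimates([48.0, 52.0, 50.5, 49.0, 51.0])
```

The filter keeps only two numbers of state per monitored resource, which is why this family of estimators is attractive for resource-constrained devices.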

#### *2.2. Performance Modeling and Resource Allocation in Cloud and Edge Computing*

Resource allocation has become one of the most important open research problems in Cloud and Edge computing and IoT. In the cloud computing environment, the computing resources are assumed to be infinite; thus, static or empirical models combined with coarse resource scheduling techniques have been shown to be sufficient to provide high performance through over-provisioning. However, these approaches are neither optimal nor able to provide QoS guarantees. Regarding application performance modeling, empirical or fixed models assume known request sizes and execution times, which are not only hardware-specific, but generally very difficult to compute precisely. Furthermore, many studies relied on queuing models [26], e.g., G/G/1 or G/G/n, which are reliable only for the steady state. It is obvious that this kind of modeling cannot capture transient phenomena due to dynamic workload demand. In this regard, System Theory [27] can provide dynamic modeling methodologies, appropriate for Cloud/IoT-based applications. The interested reader may refer to the survey [28] for an extended analysis of control theoretic approaches on performance modeling and cloud elasticity. Close to DRUID-NET concepts, Dechouniotis et al. [29] proposed Linear Parameter Varying (LPV) modeling of cloud applications combined with set-theoretic controllers to guarantee a feasible solution of the elasticity in cloud data centers, while Leontiou et al. [30] derived fuzzy Takagi–Sugeno models and designed robust controllers to address simultaneously the problems of vertical and horizontal scaling, and load balancing with stability guarantee.
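The steady-state character of queuing models can be illustrated with Kingman's well-known approximation for the mean waiting time in a G/G/1 queue, $W_q \approx \frac{\rho}{1-\rho}\cdot\frac{c_a^2 + c_s^2}{2}\cdot E[S]$, where $\rho$ is the utilization, $c_a^2$ and $c_s^2$ are the squared coefficients of variation of inter-arrival and service times, and $E[S]$ is the mean service time. The Python sketch below is not part of the cited studies; the function name and the example numbers are assumptions for illustration only.

```python
def kingman_mean_wait(
    arrival_rate: float,       # lambda, requests per second
    mean_service_time: float,  # E[S], seconds per request
    ca2: float,                # squared coefficient of variation of inter-arrival times
    cs2: float,                # squared coefficient of variation of service times
) -> float:
    """Approximate steady-state mean waiting time in a G/G/1 queue (Kingman)."""
    rho = arrival_rate * mean_service_time  # server utilization
    if not 0.0 <= rho < 1.0:
        # No steady state exists when the server is saturated.
        raise ValueError("utilization must satisfy 0 <= rho < 1")
    return (rho / (1.0 - rho)) * ((ca2 + cs2) / 2.0) * mean_service_time


# Illustrative numbers: 8 requests/s, 0.1 s mean service time,
# exponential inter-arrival and service times (ca2 = cs2 = 1).
wait = kingman_mean_wait(8.0, 0.1, 1.0, 1.0)
```

The formula takes a single fixed arrival rate and is undefined once the utilization reaches 1, so it says nothing about how the queue evolves during a workload burst; this is the gap that dynamic models are meant to fill.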

In contrast to cloud computing, the resources of edge computing are rather limited; thus, static allocation techniques cannot achieve optimal resource utilization. Furthermore, modern time- and mission-critical IoT-enabled applications [31,32] have strict performance requirements that only dynamic modeling and intelligent allocation algorithms can guarantee. As in the cloud, most of the relevant studies in the edge computing context proposed static models along with the optimization of a single performance criterion, e.g., energy consumption or response time. Towards this direction, Sonmez et al. [33] proposed a two-stage fuzzy mechanism for offloading requests to edge and cloud infrastructure. The set of fuzzy rules is empirically decided and the VM (Virtual Machine) utilization modeling is threshold-based, which is applicable only for specific types of IoT applications. Queec [34] formulated the problem of scheduling multi-user tasks to multiple edge nodes as an optimization problem which minimizes the overall offloading latency of all tasks. Jalali et al. [35] analyzed fixed flow-based and time-based energy consumption models, and they presented a detailed comparison on energy consumption between cloud and edge computing systems under various network settings. Lyu et al. [36] presented a collaborative Cloud-MEC-IoT architecture and proposed a request modeling scheme and an admission control framework to address the scalability problem of these platforms. Although the authors considered heterogeneous edge resources, the computation model was not dynamic. The authors of [37] addressed both the problems of network selection and service placement for MEC infrastructure. Towards the reduction of the complexity of the general problem, they decomposed it into a series of sub-problems and solved them in an iterative fashion. However, the proposed performance model focused only on network-related parameters, ignoring the processing time of the application.

In the 5G era, Network Functions Virtualization (NFV) and Software Defined Networks (SDN) play key roles for the realization of many types of verticals, which are comprised of several IoT applications. In this context, virtualized and isolated Service Chains (SCs) comprised of a series of Virtualized Network Functions (VNFs) implemented as VMs need to be deployed in the available MEC infrastructure to offer networking services to the IoT traffic. Normally, the objective of this kind of resource allocation mechanism is to minimize the overall deployment cost (e.g., the computational and communication resources that an SC needs in order to be provisioned) [38]. Another common approach is to minimize the overall delay, since several IoT applications are characterized as mission critical and delay sensitive. Thus, a valid approach is to utilize the MEC resources that are closer to the IoT devices [39]. An alternative approach to minimize the delay is to create resource clusters inside the MEC infrastructure, where the various requested SCs can be deployed [40]. Minimizing the number of clusters and appropriately positioning the VNFs can lead to a reduction of the communication delay. Efforts have also been dedicated to optimizing the energy consumption. The authors of [41] modeled the energy dissipation of the resources in the IoT and MEC infrastructures and constructed a Linear Programming algorithm to carefully select the resources to place the SCs. Another objective focuses on the optimal allocation and scheduling of the available edge resources. This objective can be translated into either: (a) minimizing the overall resource usage, to enable multiple heterogeneous SCs, servicing heterogeneous IoT applications, to co-exist in the MEC layer [42], or (b) minimizing the resource idleness of the infrastructure [43]. Load balancing can also be applied by minimizing the maximum link utilization and reducing the bandwidth consumption [44].
This can be achieved by adopting appropriate queuing and QoS modeling during the optimization problem to minimize the resource utilization [45]. Even though all the above solutions target valid and open challenges of resource allocation in the IoT/MEC, they only propose static approaches failing to provide a holistic mechanism that takes into consideration a multi-objective and dynamic solution. Following the performance modeling and control design principles of [9,46], DRUID-NET aspires to provide multi-variable dynamic models and design modern control methodologies that ensure the desired user's performance requirements and optimize the utilization objectives of the infrastructure provider simultaneously.

#### *2.3. Control-Theoretic Resource Allocation and Control Co-Design*

In control theory, the effect of a shared, imperfect communication network between the controller and the sensor/actuator network has been studied extensively for almost three decades, generating the separate branch of Networked Control Systems (NCS) [47,48]. NCS suffer from many non-idealities. For instance, network-induced delays or, even worse, packet dropouts occur, as the information from the sensor to the controller or from the controller to the actuator(s) can be lost in a time interval. Moreover, due to the limited energy available at decentralized nodes, bandwidth can be low, so that the effect of quantization in the communication channels may not be neglected. In addition, switching or hybrid phenomena may occur due to the asynchrony between disconnected agents, or due to event-triggered strategies. Finally, the computational problem to be performed at the nodes may be part of a global optimization problem, which is split into decentralized subtasks.

Several methods have addressed these non-idealities separately. Time delays, for instance, have been tackled utilizing perturbation theory, Lyapunov stability theory, and hybrid systems analysis, but also probabilistic methods involving Markov chains and stochastic automata [49]. Quantization problems have led to a rich literature, where the controllability of a plant subject to quantized control is ruled by the so-called *entropy* of the system [50]. From the hybrid control point of view, researchers from real-time computing have dealt with the schedulability problem of distributed control settings, leading to the design of several protocols for a stable closed-loop behavior [51–53]. Decentralized computation/optimization has been another major topic of research in Systems and Control [54]. Here, though the state of the art is rich, the interaction of this constraint with others is not well understood and studied. Let us note, however, that the consensus problem has been deeply studied in many settings, e.g., quantized communications [55].

In addition to the stochastic results [56], recent theoretical work on the controllability and observability properties of the NCS [57] has shown that a more refined modeling of the communication network allows the proper definition and verification of such properties, thus adding new tools to the NCS community. Furthermore, proof-of-concept work has shown that, under a new modeling framework for hybrid systems and specifically constrained switching systems [58], the control performance can be directly associated with the network quality [59].

Rather than designing the control and communication protocol in two steps, co-design methods aim to simultaneously synthesize the controllers and the communication patterns (sampling, delays, scheduling protocols). Applied only to networked control systems with constrained communication resources so far, co-design methods have been extensively studied over the last decade [60–63]. Perhaps the most relevant breakthrough in this area is the emergence of event-triggered and self-triggered control mechanisms that allow asynchronous sampling, thus reducing the network traffic, while at the same time behaving sub-optimally [64–66].
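The principle of event-triggered sampling can be sketched with a toy simulation in Python. The example below is not taken from the cited works: the scalar plant, the feedback gain, and the relative triggering threshold `sigma` are all assumptions chosen for illustration. The sensor transmits the state only when the controller's last received copy deviates from the current state by more than a fraction `sigma` of its magnitude.

```python
from typing import Tuple


def simulate_event_triggered(
    steps: int = 30,
    x0: float = 1.0,
    a: float = 1.0,      # assumed plant dynamics: x[k+1] = a*x[k] + b*u[k]
    b: float = 1.0,
    gain: float = 0.1,   # assumed state-feedback gain: u = -gain * x_sent
    sigma: float = 0.3,  # assumed relative triggering threshold
) -> Tuple[float, int]:
    """Simulate a scalar loop with event-triggered state transmission.

    Returns the final state and the number of transmissions used.
    """
    x = x0
    x_sent = 0.0          # last state value received by the controller
    transmissions = 0
    for _ in range(steps):
        # Sensor side: transmit only if the controller's copy is too stale.
        if abs(x - x_sent) > sigma * abs(x):
            x_sent = x
            transmissions += 1
        # Controller side: act on the last received state.
        u = -gain * x_sent
        x = a * x + b * u
    return x, transmissions


final_state, sent = simulate_event_triggered()
```

With these particular values the state still decays towards zero while only a fraction of the 30 sampling instants trigger a transmission, whereas a periodic scheme would transmit at every step.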

Nevertheless, there is limited work on the co-design of controllers taking into account simultaneously more than one phenomenon (schedulability, network utilization, edge resource utilization, energy consumption, etc.) caused by the distribution of computing and communication resources. It is anticipated that the research developments in the upcoming decades will allow for encapsulating, comparing, and subsequently altering the impact of the several non-idealities, and this in turn will have a significant impact on future control applications, where resources must be used parsimoniously, in balance with the constraints and the overall considered objective. This will require and motivate new paradigms in Systems and Control, where multi-objective optimization, model-free (data-driven) approaches, approximate optimality (however with firm safety guarantees), reconfigurability, and resilience take a central place.

#### **3. DRUID-NET Conceptual Architecture**

Figure 1 illustrates a high-level overview of the overall DRUID-NET framework. The architecture follows the NFV/SDN paradigm and separates the flow of information into control and data planes. At the lowest layer, the IoT applications are deployed, and the generated workload (data flow) can be offloaded for further processing at the upper level of Edge Computing. In this layer, any component of the application is provided as a virtualized service. As shown in the figure, a virtualized service corresponds either to IoT-specific functionalities, e.g., path planning and image recognition, or to control components such as learning algorithms or optimization solvers. The modeling and control framework collects information (control flow) about the status of the computing and network infrastructure at the edge computing level in order to create workload-resource profiles, update the performance model for every application, and realize the feedback control mechanism for the resource allocation, while simultaneously implementing a resource-aware control strategy for the cyber-physical system to be controlled (control flow). This holistic approach allows the dynamical modeling of the application to take various contextual information into account. Furthermore, the controller co-design treats the resource allocation algorithms as application components in the virtualized services. Each major component of the modeling and control framework is described in more detail in the following subsections.

**Figure 1.** Conceptual architecture.

#### *3.1. IoT Workload Profiling*

As mentioned before, a major challenge for solving the resource allocation problem in edge computing settings is to predict the time-varying characteristics of the workload/traffic, as different traffic flows and volatile conditions can influence significantly the resource allocation mechanism. Aspects such as the load generated from an IoT device, latency specifications, the transmitting data frequency, the wireless protocol, the mobility of the devices, and the number of devices associated with the IoT gateway, change the amount of resources requested from the edge, while also influencing the scheduling process. Until now, only generic traffic models have been proposed to estimate the traffic aggregated at the edge layer, while stationary IoT devices are assumed, leading to a static rate model, which however limits its effectiveness and applicability in real scenarios.

Going a step beyond the pertinent literature, which only considers average and general traffic characteristics of the IoT applications (e.g., Brownian motion as a one-size-fits-all model), the DRUID-NET framework aims to differentiate and categorize the requirements of various IoT applications using appropriate data analytics and mathematical models. In particular, we classify and categorize the IoT applications by leveraging the transmission patterns, the spatial and temporal correlation of the traffic, as well as other traffic-related characteristics such as the frame size distribution and the burstiness of the traffic of the IoT applications. The novelty of this approach is that we create prediction mechanisms to treat the dynamics and uncertainty in the corresponding traffic profiles. Each predictive mechanism targets specific categories of IoT applications with similar requirements and characteristics to define the type, the size, as well as the time and the location of the requested resources. Furthermore, with this approach, we can dispose of the erroneous assumption that specific tasks are associated with static and pre-specified resource footprints. Instead, we replace this association with an opportunistic one between the requested resources and the IoT traffic dynamicity, thus introducing a holistic mechanism inspired by data analytics and traffic analysis methods.

#### 3.1.1. IoT Applications Classification

A first classification of the IoT applications can be produced by simply answering yes or no to questions regarding the involved "things"/devices. Indicative questions include the following: (i) Are the devices heterogeneous? (ii) Are they battery-powered? (iii) Are they sending data with high or low frequency? (iv) Are they data rich (e.g., multiple sensor measurements)? (v) Are the devices mobile?

The answers to such questions will help us to create a first clustering of the IoT applications. These clusters will contain IoT applications with similar device characteristics and behavior. Nonetheless, this first-phase categorization does not necessarily mean that the IoT applications belonging to the same cluster will present exactly the same resource requirements at the Edge. The reason is that different network access technologies can significantly affect the network requirements of the IoT applications. For example, different access technologies (e.g., LoRaWAN, Wi-Fi, IEEE 802.15.4, cellular, etc.) have different characteristics in terms of packet length, transmission range supported, MAC mechanisms, topological characteristics of the associated IoT devices (e.g., star, mesh, peer-to-peer), number of device connections supported, etc.

Thus, DRUID-NET takes into consideration both the functional and network requirements of the IoT applications in order to provide a complete and realistic IoT application classification.
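The two-phase classification described above can be illustrated with a minimal sketch. It assumes that each application is summarized by its yes/no answers to questions (i) to (v) plus the name of its access technology; the `IoTApp` type, the field names, and the sample applications are hypothetical and are not part of the DRUID-NET framework itself.

```python
from collections import defaultdict
from typing import NamedTuple


class IoTApp(NamedTuple):
    """Hypothetical application descriptor (illustrative only)."""
    name: str
    heterogeneous: bool      # question (i)
    battery_powered: bool    # question (ii)
    high_frequency: bool     # question (iii)
    data_rich: bool          # question (iv)
    mobile: bool             # question (v)
    access_technology: str   # network requirement, e.g. "LoRaWAN"


def first_phase_key(app):
    """Tuple of yes/no answers; equal tuples fall into the same first-phase cluster."""
    return (app.heterogeneous, app.battery_powered, app.high_frequency,
            app.data_rich, app.mobile)


def classify(apps):
    """Refine the first-phase clusters by access technology."""
    clusters = defaultdict(list)
    for app in apps:
        clusters[first_phase_key(app) + (app.access_technology,)].append(app.name)
    return dict(clusters)


apps = [
    IoTApp("soil-sensor", False, True, False, False, False, "LoRaWAN"),
    IoTApp("air-quality", False, True, False, False, False, "LoRaWAN"),
    IoTApp("parking-sensor", False, True, False, False, False, "IEEE 802.15.4"),
    IoTApp("uav-video", True, True, True, True, True, "Wi-Fi"),
]
clusters = classify(apps)
```

In this toy data set, the first three applications share the same answers to the device questions, yet the parking sensor ends up in its own cluster because it uses a different access technology.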

#### 3.1.2. IoT Applications Workload Prediction

The above categorization will help us to extract the workload generated from each cluster of applications in terms of bandwidth, latency, and other important Key Performance Indicators (KPIs) during the offloading of IoT tasks to the Edge. Specifically, through this approach, we can propose appropriate mathematical models to simulate the traffic behavior of the various IoT applications. Nonetheless, even with this modeling, considerable uncertainty will remain, because IoT access networks are usually wireless, lossy, and unreliable. Hence, the goal of DRUID-NET is not only to categorize and classify IoT applications based on their traffic profiling, but also to apply network analytics to make the communication as deterministic as possible.

Our goal is to replace the average estimates of the IoT applications used so far with instantaneous and accurate transmission metrics. To this end, appropriate machine learning algorithms (e.g., Thompson sampling, ARMA, Bayesian methods) need to be integrated in the traffic profiling in order to learn and predict the network conditions between the IoT and the Edge. This can be decisive for the performance of the subsequent resource allocation at the Edge. The Edge controller will be able to predict and adapt to the changing workload arriving at the Edge infrastructure, creating a holistic and realistic resource allocation approach.
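As a rough illustration of such a predictor, the sketch below fits a first-order autoregressive model, the simplest special case of ARMA, to a workload series by ordinary least squares and rolls it forward. It is only a sketch under stated assumptions: the function names and the sample series are hypothetical, and the text does not specify which model order or estimator DRUID-NET would use.

```python
def fit_ar1(series):
    """Least-squares fit of x[t+1] = c + phi * x[t]; returns (c, phi)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    if var_x == 0:
        # Constant history: fall back to predicting the mean.
        return mean_y, 0.0
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = cov / var_x
    return mean_y - phi * mean_x, phi


def forecast(series, steps=1):
    """Iterate the fitted AR(1) recursion `steps` samples ahead."""
    c, phi = fit_ar1(series)
    predictions, last = [], series[-1]
    for _ in range(steps):
        last = c + phi * last
        predictions.append(last)
    return predictions


# Hypothetical workload samples (e.g. requests per second at the gateway).
workload = [10.0, 7.0, 5.5, 4.75, 4.375]
next_two = forecast(workload, steps=2)
```

The sample series was generated from the recursion with c = 2 and phi = 0.5, so the fit recovers those values up to rounding; real traffic would of course be noisy and would call for a richer model.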

#### *3.2. Performance Modeling*

The available resource models are usually single-input single-output. Energy or response time is typically the model's output, while computing resources (e.g., CPU, memory), incoming requests, and network bandwidth are the control variables. In most of the current studies, the relation between input and output is fixed and empirically derived. For example, the processing time of a request is assumed to be proportional to its file size and inversely proportional to the service rate measured in CPU cycles or millions of instructions per second. Although this assumption is reasonable, the actual processing time depends on several time-varying parameters, which are not easily measured. Furthermore, in combination with static resource allocation mechanisms, the offloading decision performs adequately only for specific operating conditions, being unable to guarantee stability under fluctuating workload and heterogeneous IoT communication infrastructure.

Contrary to current approaches that provide empirical static models, we aim to develop formal, realistic, and dynamic traffic and resource models capable of emulating the traffic generated by various IoT applications. For this purpose, DRUID-NET adopts hybrid dynamical models [67] that have the capacity to include several performance metrics (i.e., state variables) and resources as control parameters (input variables). This type of modeling takes into account, in a single formulation, the various contributions of the diverse objectives and constraints to the performance/cost. This framework moreover allows for discovering the trade-offs between accuracy, complexity of representation, and real-time feasibility of the resource allocation strategy. Furthermore, the chosen framework will be capable of capturing structural changes interpreted as discrete jumps in the dynamics, e.g., user mobility, change in wireless protocols and topology, and addition/removal of edge servers. Finally, alongside the dynamic models, the DRUID-NET framework aims to identify the uncertainties of these models and quantify their bounds in order to facilitate the design of the respective control laws.
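To make the notion of a hybrid model concrete, the following toy sketch combines a continuous state (a request backlog) with a discrete mode (the number of active edge servers). The class name, the backlog dynamics, and the numeric rates are assumptions chosen for illustration; they are not the models of [67] or of DRUID-NET.

```python
from dataclasses import dataclass


@dataclass
class HybridEdgeModel:
    """Toy hybrid system: continuous backlog plus a discrete server count."""
    backlog: float = 0.0           # state variable: queued requests
    active_servers: int = 1        # discrete mode
    rate_per_server: float = 10.0  # hypothetical requests served per step at full CPU

    def jump(self, active_servers):
        """Discrete transition, e.g. an edge server is added or removed."""
        if active_servers < 0:
            raise ValueError("server count cannot be negative")
        self.active_servers = active_servers

    def step(self, arrivals, cpu_share):
        """Flow within the current mode; cpu_share in [0, 1] is the control input."""
        cpu_share = min(max(cpu_share, 0.0), 1.0)
        served = self.active_servers * self.rate_per_server * cpu_share
        self.backlog = max(0.0, self.backlog + arrivals - served)
        return self.backlog


model = HybridEdgeModel()
trace = [model.step(arrivals=15.0, cpu_share=1.0) for _ in range(3)]
model.jump(active_servers=2)  # structural change: a second server comes online
trace += [model.step(arrivals=15.0, cpu_share=1.0) for _ in range(3)]
```

With one server the backlog grows by five requests per step; after the jump to two servers the same arrival rate is drained, which is the kind of structural change the hybrid formulation is meant to capture.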

#### *3.3. Resource Allocation*

The workload profile estimator, the dynamic model of the resources, and the overall status of the network/servers provide the foundation upon which the resource allocation algorithm will be developed. Specifically, the objective is to develop a joint communication, computing, and storage virtualization paradigm that is updated and adapted dynamically. For this purpose, we consider the problem of simultaneously (i) allocating storage, computing, and communication resources, (ii) modifying network topology/protocol, and (iii) structuring the edge computing data centres (such as VM distribution). Two distinct approaches relating to static and dynamic resource allocation are considered.

#### 3.3.1. Static Resource Allocation

In this approach, we do not take into account the dynamic nature of the processes under study; however, we consider the full resource allocation problem. The method is oriented towards quickly solving multi-objective optimization problems, which will in turn provide the optimal operating point for the communication network and for the computing and storage allocation in the edge/cloud servers. Our goal is to describe the complex interrelations between the aforementioned resources in an analytic manner, merging available models, e.g., from queuing theory and Markov models. Next, we plan to solve the optimization problems using mixed-integer, linear, and nonlinear programming. Since the complexity of these problems often does not allow exact real-time solutions, our intention is to propose approximate solution algorithms that provide guarantees on the level of suboptimality of the identified solution. Additionally, we will employ machine learning algorithms to reduce the complexity of these highly nonlinear/nonconvex problems so they can be solved in real time, thus respecting hard time constraints. This approach will focus on problems involving complex specifications and mostly static models, aiming to maximize the QoS delivered.
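A small example may help to fix ideas about static allocation built on a queuing model. The sketch below assumes that each application behaves as an M/M/1 queue whose service rate scales linearly with the number of CPU units it receives, and it distributes a fixed budget of units with a greedy marginal-gain heuristic. Both the queue model and the heuristic are illustrative assumptions; the text does not commit to a specific formulation or solver.

```python
import math


def response_time(arrival_rate, unit_rate, units):
    """Mean M/M/1 response time 1 / (capacity - arrival_rate); infinite if unstable."""
    capacity = unit_rate * units
    return 1.0 / (capacity - arrival_rate) if capacity > arrival_rate else math.inf


def allocate(arrival_rates, unit_rate, total_units):
    """Greedy allocation of integer CPU units that lowers the summed response time."""
    # Start from the smallest allocation that keeps every queue stable.
    alloc = [math.floor(lam / unit_rate) + 1 for lam in arrival_rates]
    if sum(alloc) > total_units:
        raise ValueError("not enough units to stabilise every queue")
    for _ in range(total_units - sum(alloc)):
        # Give the next unit to the application that benefits the most.
        gains = [response_time(lam, unit_rate, c) - response_time(lam, unit_rate, c + 1)
                 for lam, c in zip(arrival_rates, alloc)]
        best = max(range(len(alloc)), key=gains.__getitem__)
        alloc[best] += 1
    return alloc


# Two hypothetical applications sharing six CPU units of five requests/s each.
plan = allocate(arrival_rates=[9.0, 3.0], unit_rate=5.0, total_units=6)
```

Here the heavily loaded application receives four units and the lighter one two. A multi-objective or mixed-integer formulation, as envisaged in the text, would replace the single summed-delay objective used in this sketch.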

#### 3.3.2. Dynamic Resource Allocation

In this approach, the DRUID-NET framework proposes dynamic control-theoretic resource allocation mechanisms. Utilizing the models established by capturing the dynamics of the performance metrics in a hybrid dynamical system, our goal is to follow control-theoretic approaches that provide formal guarantees on important properties describing the resource allocation problem. For example, a main objective is to provide guarantees on the speed of convergence of the performance metrics to a pre-determined range, defined by translating the QoS requirements into mathematical statements. Moreover, our goal is to provide decision mechanisms that allow structural changes in some cases (for example, turning edge servers in a cluster on and off, or changing the topology of a communication network), together with continuous strategies (such as CPU and memory utilization in a server). The main challenge in this approach is the scalability of the decision algorithm, which will be tackled by proposing smart allocation strategies that allow trade-offs between performance and real-time implementation. Another challenge is to establish resource allocation mechanisms using only partial information, which is the most realistic scenario. This issue will be addressed by proposing distributed control mechanisms that continuously take local information into account and only intermittently receive information about the state of the whole system.
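The feedback idea can be sketched with a very simple loop: an integral controller adjusts the CPU share of a service so that the measured response time converges to a target value even when the workload changes. The plant model (response time equal to work divided by CPU share), the gain, and the target are assumptions made for illustration only and carry none of the formal guarantees discussed above.

```python
def clip(value, low, high):
    """Keep the control input inside its admissible range."""
    return max(low, min(high, value))


def plant(work, cpu_share):
    """Hypothetical plant: response time shrinks as the CPU share grows."""
    return work / cpu_share


def run_controller(workloads, target, gain=0.5, cpu_share=1.0):
    """Integral control: raise the CPU share when responses are too slow, lower it otherwise."""
    history = []
    for work in workloads:
        measured = plant(work, cpu_share)
        history.append((cpu_share, measured))
        cpu_share = clip(cpu_share + gain * (measured - target), 0.05, 1.0)
    return history


# Workload steps from 0.2 to 0.3 seconds of work per request halfway through.
history = run_controller([0.2] * 30 + [0.3] * 30, target=0.5)
```

In this run the response time settles near the 0.5 s target, jumps when the workload increases at step 30, and is brought back to the target by a larger CPU share. The controller releases CPU when the service is faster than required, which reflects the resource-aware aspect of the allocation.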

#### *3.4. Co-Design of Controllers*

As we have mentioned before, in the broad field of Systems and Control, several different paradigms have emerged in the last few decades to deal with the control of IoT-enabled cyber-physical systems. Indicative examples include hybrid behavior, quantized control, varying delays, safety-criticality, nonlinear control, etc. Although these challenges are typically met together in IoT environments, the research activities have led to disconnected communities and, likewise, to very specific and custom control techniques that limit their implementability in a holistic framework. In a real-life IoT control application, these non-idealities occur simultaneously. We argue that the different paradigms separately introduced for each of these non-idealities are hard to reconcile; thus, the DRUID-NET framework is devoted to deploying the theoretical results in actual applications.

Modern IoT applications need controllers that address a mixture of these undesired phenomena. Our goal is to establish a formal decision mechanism that will be able to change the provisioning of the resources in real time, adapt its control objective to the available bandwidth, weigh the cost of communication against the advantage of involving decentralized agents, and eventually address a multitude of practical challenges appearing in networked, resource-constrained control applications. Such a new generation of controllers will be made possible by the merging of two sets of hybrid models, namely (a) the performance model, having as internal variables performance metrics of the infrastructure and as inputs the resource distribution and utilization, and (b) the process model (having, for example, variables related to position, orientation, velocity and acceleration of mobile agents, lighting conditions, room temperature, mode of operation of sensors, etc.). To provide an example of the challenges that will be met in this setting, let us raise the following question: what do traditional data-rate theorems from quantized control (see, e.g., [68]) become in an environment with packet losses, varying delays, and varying computational resources? Another important line of research that will be necessary to follow in a time-varying resources setting is to categorize and model the complexity of the control algorithms. Allowing their dynamic adjustment will eventually provide the coupling between the process/application to be controlled and the control algorithm resource provisioning. In turn, this will enable (i) the establishment of real-time control mechanisms with formal guarantees for the closed-loop system, and (ii) the optimal utilization of resources, either in the network or at the edge.

#### **4. IoT-Enabled Applications**

The proposed architecture is generic enough to offer a holistic paradigm, while its estimation, modeling, and control methodologies are applicable in several categories of IoT applications, such as the ones based on mobile agents (e.g., UAVs) or designed for crowded smart areas or emergency scenarios. The following subsections demonstrate three representative use cases of the DRUID-NET framework.

#### *4.1. Collaborative Robotics*

Collaborative robotics is a prerequisite for Industry 4.0, especially in the Industrial Internet of Things setting. The current trend is to produce and program robots that have the capability to work together with, or in close proximity to, humans in a shared environment. Removing a physical (or virtual) cage from the robot brings many challenges, the most critical of which is guaranteeing safety/avoiding collision, without leading to an unsatisfactory performance, e.g., the robot working at an unacceptable speed. The setting can be extended to the case where there are many robotic agents and humans sharing the factory floor, or any other indoor or outdoor environment, e.g., a logistics warehouse, an airport, UAV swarm networks, etc. In all aforementioned cases, similar challenges appear, namely: (i) intermittent and noisy measurements of the position of the agents either by static sensors or sensors mounted on the robots, (ii) faulty wireless communication networks, (iii) stringent safety specifications as humans and robots move freely in the same environment, and (iv) time-critical specifications. These challenges become harder when the computing/storing/communication resources are limited, or not always available to a control application, which is the typical case. Thus far, a few approaches, aligned with the ones appearing in the cyber-physical systems control problems, explicitly take into account a part of these challenges, e.g., [69]. Controllers that are co-designed with the resource allocation and computation offloading mechanisms can be used for human–robot collaboration in the IoT-enabled environment or for real-time, large-scale coordination of mobile robots. It should be noted that, in addition to the presence of constraints, a further control challenge in this case is the complex temporal specifications that need to be satisfied. Currently, the control objective has moved away from just ensuring stability or tracking for a prespecified set of reference trajectories, to satisfying statements such as "robot A and B should collaborate towards a task X and eventually return to their initial positions if these positions are not occupied", as shown in Figure 2. These control applications are often time-critical as well as safety-critical; thus, a very careful co-design procedure should be developed for the controller that leads to formal guarantees without requiring many, possibly idle, resources.

**Figure 2.** Human–Robot collaboration.

This scenario enables, and is enabled by, a combination of almost all components of the DRUID-NET framework, namely workload and resource estimation, and control co-design of a set of control applications in a platform where resources are shared and their availability is volatile.

#### *4.2. Rapid Resource Deployment for Physical Disaster Scenarios*

In the case of a physical disaster, the fixed communication infrastructure could be destroyed or unavailable due to high workload demand. Furthermore, for rescue operations, it could be critical to deploy additional on-demand computing and network resources at the proper place and time, in order to relieve any remaining network infrastructure and collect data from remaining communicating devices such as mobile phones or sensors, towards helping to locate and rescue survivors. Mobile agents, especially UAVs, are suitable for these kinds of missions and can provide additional edge resources capable of processing the data at low latency and organizing the rescue operation. In order to serve the survivors' devices as much as possible, there is a need to predict the kind and amount of resources these devices will request and the location of these resources. Some UAV-mounted edge resources may need to be deployed sporadically and temporarily at different locations based on the needs and mobility of the IoT devices. Thus, there is a need to anticipate the deployment of edge services and to estimate how long they will be required at a given place, in order to decide whether it is worth deploying durable edge resources or whether temporary mobile resources could suffice. In the latter case, the location and quantity of required resources should be estimated in advance to allow their timely deployment. The edge resources will be deployed such that a maximum number of IoT devices can be served within the required latency, either directly or through multi-hop communications. Direct communications will be favored for devices with very low latency requirements, while multi-hop communications could be used for non-essential communications with weaker latency requirements. The trajectories of the distributed UAVs should be carefully planned accordingly, taking into consideration the time restrictions (robots should be deployed at the proper place before they are needed).

Figure 3 illustrates the operation of the proposed framework under a physical disaster scenario, such as a gas leak in a large factory. In this case, swarms of mobile robots will be deployed in order to find victims or survivors that require immediate medical assistance. Two types of mobile robots can be deployed, namely, (i) unmanned ground vehicles (UGVs) and (ii) UAVs. UGVs will cover the ground area (x,y dimensions), while UAVs can provide a certain altitude coverage (z dimension) or coverage in areas not accessible to the UGVs (e.g., upper floors, atriums, etc.). Normally, we expect to find more obstacles in the ground area (e.g., offices, machines, shelves), which translates into a higher number of UGVs in comparison with the UAVs (a ratio of 2:1), as shown in Figure 3. The goal of the interconnected UAVs is to locate living or dead persons, while at the same time sending footage of the interior of the factory in order to create a 3D visualization of the area. In this manner, the users of the application (e.g., fire brigade) can immediately detect persons in need and send help to the corresponding location, eliminating the risk of long exposure to harmful gas for the rescuers. For the path planning of the mobile agents, the robots will be capable of detecting the Wi-Fi or LTE preamble and accordingly plan the route towards the source of the signal. The notion behind this behavior is that people normally keep their mobile phones or other wireless devices (e.g., smart watches) in close vicinity. This will facilitate the path planning and limit the pointless roaming of the UAVs in space. In order to prioritize the traffic and reduce the impact of poor wireless communication, the swarm of robots can intensify the load of images/video and increase their quality only in areas with a high probability of detecting a person, and send this traffic to the Edge for further processing. UGVs can approach the victims and sense if they are alive or unconscious (e.g., detect eye movement, detect sound, etc.). When a UGV finds a survivor, it can communicate with a nearby UAV of the swarm, which in turn can lower its altitude close to the position of the person in need and drop an oxygen mask until help arrives.

In this case, using swarms of mobile robots will assist in eliminating the non-essential communications. By combining service differentiation and smart data offloading to UAVs, unnecessary communication between users and the overhead due to extensive signaling will be reduced. Since the number of available robots may still be inadequate to serve all ground services, the prioritization of the applications, flows, and devices is of paramount importance for the success of critical missions. Under these circumstances, priority will be given to the areas with many victims or of major importance for the completion of the mission; thus, the available swarms of mobile robots should be distributed accordingly. Even in the case of homogeneous UGVs and UAVs with identical computing and networking capabilities, the optimal allocation of UAVs or UGVs constitutes a dynamic optimization problem, which depends on the size of the damaged area, the communication ranges of both UGVs and UAVs, the flying altitude of UAVs, the propagation conditions, the data communication requirements (amount of data, frequency of collection, etc.), and the number and type of devices to serve. For example, if the victims are equally spread over different locations, the robot swarms would be equally scattered to deploy their resources in these areas so that the maximum number of devices would be served directly. Otherwise, the mobile robots will be driven to the most damaged area in order to serve the required network traffic.
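The distribution rule sketched in the example above (equal spread for equally loaded areas, concentration on the most affected area otherwise) can be approximated by a proportional split. The sketch below uses a largest-remainder rounding of the proportional shares; the area names, the weights, and the choice of rounding rule are hypothetical, and the sketch ignores the communication ranges, altitudes, and propagation conditions that the full optimization problem would include.

```python
def distribute_robots(total_robots, area_weights):
    """Split robots across areas in proportion to their priority weights."""
    weight_sum = sum(area_weights.values())
    if weight_sum <= 0:
        raise ValueError("at least one area needs a positive weight")
    quotas = {area: total_robots * w / weight_sum for area, w in area_weights.items()}
    # Whole robots first, then hand out the remainder by largest fractional part.
    alloc = {area: int(q) for area, q in quotas.items()}
    leftover = total_robots - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda a: quotas[a] - alloc[a], reverse=True)
    for area in by_remainder[:leftover]:
        alloc[area] += 1
    return alloc


# Weights could be, for instance, the estimated number of victims per area.
even = distribute_robots(9, {"hall": 1, "storage": 1, "offices": 1})
skewed = distribute_robots(10, {"hall": 6, "storage": 1, "offices": 1})
```

With equal weights every area receives three robots, whereas the skewed case sends eight of the ten robots to the most affected area.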

**Figure 3.** Rapid resource deployment for physical disasters.

This scenario illustrates the use and combination of the different control components of the DRUID-NET framework, and in particular: (1) workload estimation in quantity, time, and space, (2) resource allocation (task assignment to UAVs and/or robots), and (3) path and trajectory planning.

#### *4.3. Mobility-Aware Edge Computing*

Most modern smart city applications rely on the mobile end-devices of continuously moving humans. Thus, the user's mobility is a dominant parameter of IoT systems. As shown in Figure 4, in the case of urban tourist areas, e.g., museums and squares, the visitors collect information about Points of Interest (PoIs) (i.e., exhibits or social events) using their mobile devices. For example, leveraging augmented or virtual reality technologies, they can retrieve media-enriched information about the surrounding PoIs. However, it is prohibitive for mobile end-devices with limited resources to run these types of applications locally. Thus, the edge computing infrastructure is essential to host the smart applications and meet the user's QoS requirements. Additionally, in crowded tourist areas, the number of visitors varies significantly over short-term (i.e., a day) or long-term periods (summer or winter); therefore, an accurate prediction methodology is important for optimal resource scheduling. Moreover, the offloading decision should be based on both the user's transmission capability and the availability of edge resources. To this end, in order to maximize the admittance of users, the main features of the overall generated traffic should be extracted alongside the patterns of the user's mobility. Then, utilizing the dynamic models, effective controllers can be designed towards the horizontal and vertical scaling of resources and the simultaneous guarantee of any QoS and Quality of Experience (QoE) requirements.

**Figure 4.** Mobility-aware edge computing.

This use case illustrates the necessity and the collaboration of the involved components of the DRUID-NET framework. Particularly, workload estimation, dynamic performance modeling, and resource allocation components interact to meet the respective requirements and optimize the resource utilization under varying workload conditions.

#### **5. Conclusions**

This article presents the most important challenges of IoT-enabled applications, along with the perspective and the basic concepts and objectives of the novel DRUID-NET framework. The corresponding components of the DRUID-NET framework are carefully designed to address several emerging challenges at any level of the IoT/Edge/Cloud system, stemming from mobile end-devices up to powerful cloud data servers.

In particular, the workload estimation aims to create a profile of IoT applications that includes features of the generated data, parameters of the wireless connection, and patterns of the user's mobility. The performance modeling components identify the multi-input multi-output dynamical systems that capture the dynamic operation of the applications, and are utilized to design the resource allocation and the offloading decision strategies. The resource allocation component is in turn responsible for deciding any control action at any level of the hierarchical system. Depending on the objectives of the controller, the resource allocation can be either static or dynamic, providing guarantees on QoS metrics, e.g., response time or energy consumption, and system properties, such as stability. Finally, the co-design of the controllers enables the binding between the IoT application and the resource control algorithm in order to provide guarantees for the closed-loop system.

The DRUID-NET framework aspires to verify its modular architecture and components through different IoT scenarios. These use cases are carefully selected in order to cover all main challenging and emerging aspects of the IoT applications and Edge computing paradigm.

**Author Contributions:** All authors contributed equally to conceptualization, investigation, and writing—original draft. All authors have read and agreed to the published version of the manuscript.

**Funding:** Nikolaos Athanasopoulos, Aris Leivadeas, Nathalie Mitton, and Raphael Jungers are partially supported by the CHIST-ERA-2018-DRUID-NET project.

**Conflicts of Interest:** The authors declare no conflict of interest.



#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Article*

## **A Comparison of the Influence of Vegetation Cover on the Precision of an UAV 3D Model and Ground Measurement Data for Archaeological Investigations: A Case Study of the Lepelionys Mound, Middle Lithuania**

#### **Algimantas Česnulevičius \*, Artūras Bautrėnas, Linas Bevainis and Donatas Ovodas**

Department of Cartography and Geoinformatics, Institute of Geosciences, Faculty of Chemistry and Geosciences, Vilnius University, LT-03101 Vilnius, Lithuania; arturas.bautrenas@gf.vu.lt (A.B.); linas.bevainis@gf.vu.lt (L.B.); ovodas@gmail.com (D.O.)

**\*** Correspondence: algimantas.cesnulevicius@gf.vu.lt

Received: 10 October 2019; Accepted: 28 November 2019; Published: 2 December 2019

**Abstract:** The aim of this research was to conduct a comparative analysis of the precision of ground geodetic data versus the three-dimensional (3D) measurements from unmanned aerial vehicles (UAV), while establishing the impact of herbaceous vegetation on the UAV 3D model. Low (up to 0.5 m high) herbaceous vegetation can impede the establishment of the anthropogenic roughness of the surface. The identification of minor surface alterations, which enables the determination of their anthropogenic origin, is of utmost importance in archaeological investigations. Vegetation cover is regarded as one of the factors influencing the identification of such minor forms of relief. The research was conducted on the Lepelionys Mound (Prienai District Municipality, Lithuania). Ground measurements were obtained using Trimble GPS, and UAV "Inspire 1" was used for taking aerial photographs. Based on the data from the ground measurements and aerial photographs, large-scale surface maps were drawn and the errors in the measurement of the position of the isolines were compared. The results showed that the largest errors in the positional measurements of fixed objects were conditioned by the height of grass. Grass with a height of up to 0.1 m resulted in discrepancies of up to 0.5 m, whereas grass that was up to 0.5 m high led to discrepancies of up to 1.3 m.

**Keywords:** GPS measurement; UAV; 3D models; measurement precision

#### **1. Introduction**

During the initial stage of an archaeological investigation, one of the most important principles is to identify a potential object, to determine its boundaries and area. Traditionally, large-scale topographic maps and geodetic measurements are widely used during the initial stage of reconstruction.

Aerial photographs were mainly used where archaeological sites coincided with the areas covered by aerial topography, and only on a fragmentary basis due to their high cost. High-resolution space images have only become possible within the last decade, but they do not cover continuous areas. Moreover, high-resolution photos are not always available for academic research or studies. At the beginning of the 21st century, unmanned aerial vehicles, better known as drones, were employed to identify and map potential archaeological objects. They have a number of advantages, including low price, high resolution, large scale, and multispectral capability. A very important advantage of unmanned aerial vehicles is the creation of 3D models using photogrammetric techniques. These 3D models reveal the small roughness of the surface. Such alterations in the surface serve as identifiers when searching for potential archaeological sites.

The issues of reliability and accuracy of aerial photographs obtained using unmanned aerial vehicles have already been addressed in studies by many academic researchers [1–21]. The use of unmanned aerial vehicles provides a fast and inexpensive way to explore the ground surface and to identify objects of interest [22]; however, research on assessing the precision of aerial images from unmanned aerial vehicles is scarce [23–26]. The accuracy of aerial images produced with the help of unmanned aerial vehicles (UAVs) can be affected by a number of factors, for example, altitude of flight, the image quality of the photo camera, the design of the UAV route, the methods of georeferencing, and others. An appropriate design of the UAV route ensures cruise altitude and constant aerial image coverage of the whole territory. An appropriate flight plan and a high-quality photo camera affect the efficiency of photogrammetric processing of the images obtained. Further investigations are simplified by using the well-tested and broadly applied mathematical and photogrammetric algorithms for image processing. The problems occur while designing a 3D model of the territory captured in aerial images. The initial 3D model is created in a conditional coordinate system, which is later linked to the officially used coordinate system. The coordinates can be connected in one of the following two ways: by direct graphical connection of the position of the object in the aerial image to the coordinate system (less precise) or by linking the GPS measurements of fixed objects to the coordinates of the aerial images. The accuracy of the vertical positioning of given points is a highly important factor in designing the 3D relief models that are used for identification, analysis, and mapping of archaeological objects.

Recent research [18–32] has shown that while aiming for high accuracy of the vertical positioning of the objects, it is not enough to use a global navigation satellite system (GNSS); ground control points (GCPs) have to be applied as well. Such a combined technique allows for the design of a more accurate digital relief model (DRM), where the precision of vertical positioning of points equals 0.7 cm.

The aim of this research is to conduct a comparative analysis of the precision of ground geodetic measurements and aerial photographs from an unmanned aerial vehicle, while establishing the positional accuracy of the identified objects. The archaeological objects of the Middle Ages on the eastern coast of the Baltic Sea are often related to natural relief forms, which were modified by people while building fortifications and settlements around them [33–37]. These archaeological objects are now in forests, agricultural lands, and urbanized territories. The surface of archaeological objects in urban territories has been exposed to significant changes or has been fully destroyed, so the use of aerial images from unmanned aerial vehicles for the positional identification of archaeological objects there is highly limited. Due to dense vegetation and the foliage of tall trees, the application of aerial imaging in wooded territories is also restricted. The surface of archaeological objects in agricultural territories is partially extant. Therefore, aerial images can be rather efficient in seeking to identify positions of archaeological objects in meadows and woodless territories.

The narrow spectral and surface thermal analysis methods are applied for the investigation of the structural diversity of vegetation cover on the basis of UAV aerial images [38–42]. Studies have mainly focused on the influence of big ligneous plants on the mapping of surface elements, whereas the impact of low herbaceous vegetation on low forms of archaeological relief has, so far, not been exhaustively researched [43]. Our research aims to assess the quality of aerial images, ultimately seeking to design accurate digital 3D relief models for the identification of archaeological objects [44–50].

For the identification of small surface irregularities (small archaeological objects) we applied the computer program "Circle\_3p", developed by the Department of Cartography and Geoinformatics, Vilnius University, applying the classical Delaunay method (author Artūras Bautrėnas). The results of the study showed that this method is effective in grassy mounds.

#### **2. Research Object, Materials, and Methods**

The object of the research is the Lepelionys Mound, which is located in the Prienai Administrative Region of Kaunas County (Figure 1). It dates back to the second half of the first millennium. At the beginning of the second millennium a settlement was established there, covering an area of 9 hectares around the mound. The Lepelionys Mound is on the left side of the road from Vilnius to Prienai (60 km to the west of Vilnius). The territory of the ancient settlement is on both sides of the road, but its bigger part is located on the left side. The Vilnius-Prienai road was built in the second half of the 20th century. While designing the road, the relief of the former ancient settlement was affected but some small and low relief forms of anthropogenic origin still remain, dating back to between the 9th and 12th centuries [51,52]. The main archaeological object, the Lepelionys Mound, was investigated by archaeologists in the second half of the 20th century. During these archaeological investigations the territory boundaries and the protection zone of the ancient settlement were distinguished (Figure 1).

**Figure 1.** The location of Lepelionys Mound and the ancient settlement: (**1**) mound boundary and (**2**) ancient settlement territory boundary (according to V. Juškaitis [51]).

Ground geodetic measurements and photos taken by the camera on the unmanned aerial vehicles were applied while designing the three-dimensional relief models. Comparisons of accuracy between the UAV 3D model and the ground measurements of the Lepelionys mound were carried out twice, in August 2018 and June 2019.

Ground geodetic measurements were carried out with a Trimble R4 GPS device (measurable accuracy in favorable conditions: X, Y is set to ±8 mm and Z to ±15 mm) on 9 August 2018. Since the mound is in a fully open area and not covered by buildings or greenery (Figure 1), the measurements were collected with maximum accuracy. During the collection of these measurements, the coordinates of 212 characteristic ground-surface points were recorded. After analyzing the accuracy of the measured point coordinates, 179 points were mapped to the LKS-94 coordinate system (Figure 2).

**Figure 2.** The selected points that were mapped to the LKS-94 coordinate system.

Since the topographic photograph can be used to estimate the accuracy of the aerial photographs, 10 ground control points (GCPs) were measured in parallel to the ground points (Figure 3).

**Figure 3.** The ground control point marks (**A**) and the diagram of the ground control point (GCP) arrangement (**B**). The red circle defines the location of the ground mark.

Figure 4 shows two objects, the coordinates of which were used for creating the aerial photograph model.

**Figure 4.** Examples of identified objects. The red circles mark the locations of the small objects whose measured coordinates are used to adjust the 3D model.

The vegetation is one of the most important indicators of archaeological objects. Information on human activities is reflected in the variation of the lushness of vegetation. Homogeneous vegetation is characteristic of the investigated territory, since for several decades most of the surroundings of the mound have been used as pasture. Local differences in herbaceous vegetation in the mound surroundings over a long period of time have been predetermined by changes in the surface relief layer caused by the following human activities:


All the aforesaid factors resulted in physical differences in the present vegetation, i.e., lusher or sparser vegetation. It is important to point out that currently there is a pasture in the former territory of the settlement, where grazing starts at the end of April and lasts until October. The whole area is grazed in this time and the anthropogenic impact on the surface was equal during photofixation in August.

The picture in Figure 5 provides a visual representation of the camera sensor and the field of view. Using the width of the camera sensor, the focal length, and the drone altitude the ground sample distance (GSD) can be calculated (Figure 5).

The equation we use to calculate the *GSD* is:

$$GSD = \frac{\text{sensor width} \times \text{altitude} \times 100}{\text{focal length} \times \text{image width}} \tag{1}$$
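As a minimal sketch of Equation (1), the function below computes the GSD in centimetres per pixel. It assumes the sensor width and focal length are given in millimetres, the altitude in metres, and the image width in pixels; the camera values in the example are hypothetical placeholders and are not the parameters from Table 1.

```python
def ground_sample_distance_cm(sensor_width_mm, focal_length_mm,
                              altitude_m, image_width_px):
    """Ground sample distance in cm per pixel, following Equation (1).

    Assumed units: sensor width and focal length in mm, altitude in m,
    image width in pixels. The factor 100 converts metres to centimetres.
    """
    if focal_length_mm <= 0 or image_width_px <= 0:
        raise ValueError("focal length and image width must be positive")
    return (sensor_width_mm * altitude_m * 100.0) / (
        focal_length_mm * image_width_px
    )


# Hypothetical camera parameters, chosen only for illustration.
gsd = ground_sample_distance_cm(
    sensor_width_mm=6.0,
    focal_length_mm=4.0,
    altitude_m=50.0,
    image_width_px=4000,
)
print(f"GSD = {gsd:.3f} cm/px")
```

Because the GSD is proportional to the altitude, halving the flight height halves the ground size of each pixel for the same camera.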

Photofixation of aerial images was conducted using the unmanned aerial vehicle (UAV) INSPIRE 1. Its technical parameters are presented in Table 1.

**Figure 5.** Visual representation of the nadir-facing camera on the drone.



The front overlap of the pictures taken was 80% and the side overlap was 70%. The "double grid" mission flight plan was used for a more detailed and accurate 3D model. The flight was made at a height of 50 m, which gives a GSD of 1.7 cm.

A total of 199 photos were processed with the special photogrammetric software "Pixoprocessing". The point cloud, the digital surface model (DSM), and the orthomosaic were obtained during this process.

The study included an assessment of the mismatches between the elevation isoline positions acquired from the ground geodetic measurements and from the aerial images from the UAVs. Artūras Bautrėnas, an associate professor of the Department of Cartography and Geoinformatics, designed the computer program "Circle\_3p", which employs the classical method of Delaunay and ensures a consistent systemic selection of points. Using the Delaunay triangulation method, altitude interpolation of the ground measurement points was performed and an isoline view was generated. An analogous method was used for the interpolation of the elevation of surface points and the generation of isolines using the images taken by the camera on the UAV (Figure 6). The following indicators were calculated: ±N<sub>i</sub>, the sequence number of the analyzed point in the positive or the negative deviation from the base (ground geodetic measurement) isoline; ±ΔS<sub>i</sub>, the length of the perpendicular to the positive or the negative side of the analyzed point; ±ΔZ<sub>i</sub>, the calculated height correction to the positive or the negative side; and ±D, the distance between the base (ground geodetic measurement) isoline point and the UAV isoline point.

**Figure 6.** The scheme of elevation isoline position assessment using ground geodetic measurements and the mismatch between it and the aerial images from UAVs.

For the calculation of the deviation of the target position the following formula was used:

$$\pm \Delta S_i = (y_{N_i} - y_{T_i}) \cos \alpha - (x_{N_i} - x_{T_i}) \sin \alpha, \tag{2}$$

where α is the directional angle of the segment N<sub>i</sub>–N<sub>i+1</sub>, T<sub>i</sub> is the number of the interpolated UAV measurement point, and i is the number of the point for each fragment of the ground geodetic and UAV isolines.

Since the isoline step is known (0.5 m), the distance between the horizontals (±*D*) at each N<sub>i</sub> point can be calculated by geometric interpolation. The difference in height ±Δ*Z<sub>i</sub>* is calculated using the formula:

$$\pm \Delta Z_i = \frac{\pm h}{\pm D_i} \times \pm \Delta S_i,$$

where *h* is the isoline step, *D* is the distance between the horizontals at the calculated point, and the ± sign depends on the direction of the horizontal deviation.
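The two formulas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: points are given as (x, y) pairs in metres, the directional angle α is in radians, and the distance *D* between neighbouring horizontals is passed in as a known value. The function names and the sample coordinates are hypothetical and are not taken from the "Circle\_3p" program.

```python
import math


def plane_deviation(n_point, t_point, alpha_rad):
    """Signed perpendicular offset (delta S) between the ground isoline
    point N and the interpolated UAV isoline point T, as in Equation (2).

    alpha_rad is the directional angle of the ground isoline segment
    starting at N. The sign tells on which side of the segment T lies.
    """
    x_n, y_n = n_point
    x_t, y_t = t_point
    return (y_n - y_t) * math.cos(alpha_rad) - (x_n - x_t) * math.sin(alpha_rad)


def height_deviation(delta_s, horizontal_spacing, isoline_step=0.5):
    """Height difference (delta Z) = h / D * delta S.

    horizontal_spacing is D, the plan distance between neighbouring
    horizontals at the point; isoline_step is h (0.5 m in the study).
    """
    if horizontal_spacing == 0:
        raise ValueError("horizontal spacing D must be non-zero")
    return isoline_step / horizontal_spacing * delta_s


# Hypothetical example: the segment runs along the x axis (alpha = 0) and
# the UAV isoline point lies 0.3 m to one side of the ground point.
delta_s = plane_deviation((0.0, 0.0), (0.0, -0.3), alpha_rad=0.0)
delta_z = height_deviation(delta_s, horizontal_spacing=2.0)
print(f"delta S = {delta_s:.3f} m, delta Z = {delta_z:.3f} m")
```

The ratio *h*/*D* is the local slope between two neighbouring horizontals, so the same plane offset translates into a larger height difference on steep slopes than on flat ground.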

#### **3. Results**

#### *3.1. Creating a Two-Dimensional (2D) Relief Model*

To state with confidence that the 2D relief model is sufficiently precise and can be used as a benchmark for evaluating the models made using aerial photogrammetric methods, the horizontals were drawn automatically in accordance with strict interpolation rules.

In order to perform the automated relief modelling, it was necessary to select pairs of measured points, among which it would be possible to calculate the exact horizontal surfaces of the relief, i.e., interpolate heights. Therefore, the Delaunay triangulation method was chosen to interpolate the heights [53].

#### *3.2. Drawing of a Topographic Plan*

First, the Delaunay triangulation (Figure 7) was completed among 179 selected topographic points using the program "Circle\_3p". This allowed 321 triangles to be selected, among which the interpolation of the triangle vertices was performed.
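As a rough consistency check on these counts (an inference, not an output of the program): a triangulation of *n* points in general position, *h* of which lie on the convex hull, contains *T* = 2*n* − 2 − *h* triangles. Assuming that all triangles of the triangulation were retained, the reported numbers would correspond to about 35 points on the outer boundary of the measured area:

$$T = 2n - 2 - h \quad\Rightarrow\quad h = 2 \times 179 - 2 - 321 = 35.$$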

**Figure 7.** The Delaunay triangulation (**A**) among the 179 selected topographic points (**B**) using the program "Circle\_3p".

Among these triangle vertices, horizontal interpolation was performed in the LAS07 height system using a selected step of 0.5 m (Figure 8). The coordinates of 1676 extra points, plotted as horizontals, were calculated during the interpolation.
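The interpolation step can be illustrated with a short sketch. It assumes plain linear interpolation along one triangle edge and contour levels at integer multiples of the step; the function name and the sample coordinates are hypothetical, and the actual rules implemented in "Circle\_3p" may differ.

```python
import math


def contour_points_on_edge(p, q, step=0.5):
    """Return the (x, y, z) points where contour levels cross the edge p-q.

    p and q are (x, y, z) tuples for two triangle vertices. Levels are
    taken at integer multiples of `step`, and positions are found by
    linear interpolation of height along the edge. A level that coincides
    exactly with a vertex height is reported at that vertex.
    """
    (x1, y1, z1), (x2, y2, z2) = p, q
    if z1 == z2:
        return []  # a level edge has no single crossing point
    lowest, highest = min(z1, z2), max(z1, z2)
    k = math.ceil(lowest / step)
    points = []
    while k * step <= highest:
        level = k * step
        t = (level - z1) / (z2 - z1)  # fraction of the way from p to q
        points.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1), level))
        k += 1
    return points


# Hypothetical edge: 10 m long, rising from 100.2 m to 101.4 m.
crossings = contour_points_on_edge((0.0, 0.0, 100.2), (10.0, 0.0, 101.4))
for x, y, z in crossings:
    print(f"level {z:.1f} m at x = {x:.2f} m, y = {y:.2f} m")
```

Repeating this for every edge of every triangle yields the extra points through which the horizontals are then drawn.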

**Figure 8.** The interpolation among the vertices of selected triangles.

The interpolation points were uploaded to TopoPlan (AutoCAD 2016). The horizontals were plotted using the "Spline" function (Figures 9 and 10).

**Figure 9.** The points of interpolation (**A**), an interpolation sequence (**B**) and the relief isolines (**C**).

**Figure 10.** A cross-section of the Lepelionys Mound. The roughness in the red line indicates the remains of the former tree trunk fencing.

The cross-sections of the Lepelionys Mound were created with the help of aerial images taken by the camera on UAV, which highlighted the minor anthropogenic forms of relief on the slope of the mound, i.e., the remains of the former tree trunk wall (Figure 11).

**Figure 11.** The variance of the equal height isolines in plane position.

#### **4. Discussion**

#### *4.1. Evaluation of the Precision of the Aerial Images*

One of the most time-consuming tasks in aerial photography is to set out the GCPs and to coordinate them. Therefore, it is necessary to find the optimal number of GCPs in order to minimize the preparatory work. It should also be possible to estimate the feasible use of coordinated stable land objects (Figure 4) instead of bearing marks, which would further simplify the preparatory work. Therefore, ten ground control points (marks) in the study area are used to estimate the accuracy of the coordinates of 10 objects in the study area (Figure 3).

To evaluate the accuracy of the 3D model, the model was created using all ten marks, and the coordinates of all the objects were measured in it. The differences between the coordinates of objects in the 3D model and the coordinates measured from the topographic image do not exceed twice the Trimble GPS accuracy (Tables 2 and 3) for those objects that are clearly seen in the 3D model (np-508, -515, -582, and -587). The accuracy of the other objects is poorer due to the vegetation (grass), which complicates their identification.

The random error distribution depends on the accuracy of the object identification, and therefore the graph shows the errors as absolute values in mm.

The error analysis shows that the errors increase significantly when there are fewer than five orientation marks (Table 5, Figure 5), even for those objects that are visible in the 3D model (np-508, -515, -582, and -587). Therefore, it can be argued that in order to maintain the accuracy of measurements, there should be at least five orientation marks. It has been noticed that the error rate is influenced not only by the vegetation but also by the experience and thoroughness of the operator measuring the 3D model. As the precision of the well-known objects is practically unchanged (from ten to five marks), it can be argued that the 3D model should operate with maximum accuracy with five GCPs and the use of easily visible coordinated objects (Figure 12, Tables 2–5).



**Figure 12.** A diagram illustrating the parameters of variance of equal height isolines in plane position.


**Table 3.** The errors of the measurements of the objects when using ten marks.


**Table 4.** The absolute errors of objects when a 3D model is made on the basis of 10 marks with ground control points.

**Table 5.** A comparison of the absolute errors when different numbers of ground control points are used for a precise calculation. Values are in mm.


As seen in Table 3, the random error distribution depends on the accuracy of the object identification. Similarly, the absolute errors of objects have been calculated for the 3D models with different numbers of ground control points (Tables 4 and 5).

#### *4.2. Evaluation of the Influence of Vegetation Covers*

In 2019, a comparison of the Lepelionys mound surface isolines obtained using the UAV 3D model and the ground measurements showed that there are significant deviations in the plane and height positions between the two. The comparison was carried out in different vegetation height zones (Figure 13). At the top of the mound, where the grass was mown and its height was only 1 to 2 cm, the maximum discrepancies between the UAV 3D model and the ground measurement isolines were 0.75 m for the plane position and 0.42 m for the height. On the slopes of the mound, where the height of the grass was between 5 and 10 cm, the maximum discrepancies in the plane position of the isolines were up to 0.41 m, and up to 0.42 m for the height. At the foot of the mound, where the height of the unmown grass was 60 to 100 cm, the maximum discrepancies in the plane position of the isolines reached 6.63 m, and up to 0.77 m for the height. The results of the discrepancies between the plane position of the isolines and their height are presented in Table 6.

The sharpness and contrast in aerial images are both becoming important issues for the use of UAV aerial imagery. Aerial image contrast problems occur in areas that fall under the shadow of trees or rough terrain on a sunny day. In this study, the image contrast of the aerial photographs was adjusted and, where necessary, increased. During the 2018 photofixation, the western and southwestern parts of the mound slope were in shadow. To highlight the terrain microforms in parts of the image on the southwest slope we used the brightness/contrast, shadows/highlights, color balance, hue/saturation, and photo filter tools in Adobe Photoshop software.

The comparison of the large-scale maps of the Lepelionys mound surface created using the UAV 3D model and using the ground topographic measurements shows that the plane position of the isolines in the 3D model is highly micro-sinuous. This is due to the methods of isoline interpolation applied in the UAV 3D model, i.e., the calculation of the interfaces between multiple point pairs (about seven million pixel pairs) creates non-continuous isolines.

Three-dimensional terrain modelling using UAV aerial imagery is currently expanding. The wider application of collaborative mapping initiatives in archaeology [54–56] will lead to an increasing use of nonprofessional UAV aerial imagery to identify undefined and unexplored archaeological sites from the 19th to the early 20th century.

**Figure 13.** The zones of different herbaceous height on the Lepelionys mound.


**Table 6.** The deviation between the ground measurement and the UAV isoline plane and height positions (data from June 2019).



\* Explanation of the superscripts in the table: <sup>a</sup> number of negative (−N) and positive (+N) deviations, <sup>b</sup> maximum negative (−Δ) and positive (+Δ) plane deviation, <sup>c</sup> average negative (−Δ/−N) and positive (+Δ/+N) plane deviation, <sup>d</sup> maximum negative (−Δz max) and positive (+Δz max) height deviation, <sup>e</sup> average negative (−Δz/Nz) and positive (+Δz/Nz) height deviation, and <sup>f</sup> ratio of average positive/negative deviation and of the number of positive/negative deviations (+Δ/−Δ and N+/N−).

#### **5. Conclusions**


**Author Contributions:** A.Č.: conceptualization, methodology, visualization, writing – original draft, writing – review & editing. A.B.: data curation, formal analysis, investigation, validation. L.B.: investigation, software, visualization. D.O.: investigation, software, validation.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Complex Field Network Coding for Multi-Source Multi-Relay Single-Destination UAV Cooperative Surveillance Networks**

#### **Rui Xue <sup>1,</sup>\*, Lu Han <sup>1</sup> and Huisi Chai <sup>2</sup>**


Received: 31 January 2020; Accepted: 10 March 2020; Published: 11 March 2020

**Abstract:** Relay-based cooperative communication for unmanned aerial vehicle (UAV) networks can obtain spatial diversity gains, expand coverage, and potentially increase the network capacity. A multi-source multi-relay single-destination structure is the main topology structure for UAV cooperative surveillance networks, which is similar to the structure of network coding (NC). Compared with conventional NC schemes, complex field network coding (CFNC) can achieve a higher throughput and is introduced to surveillance networks in this paper. According to whether there is a direct communication link between any source drone and the destination, the information transfer mechanism at the downlink is set to one of two modes, either mixed or relay transmission, and two corresponding irregular topology structures for CFNC-based networks are proposed. Theoretical analysis and simulation results with an additive white Gaussian noise (AWGN) channel show that the CFNC obtains a throughput as high as 1/2 symbol per source per channel use. Moreover, the CFNC applied to the proposed irregular structures under the two transmission modes can achieve better reliability due to full diversity gain as compared to that based on the regular structure. In addition, the reliability of the CFNC scheme can be further improved by combining channel coding and modulation techniques at the expense of rate loss.

**Keywords:** unmanned aerial vehicle (UAV); cooperative communication; topology structure; complex field network coding (CFNC)

#### **1. Introduction**

Recently, wireless communications aided by unmanned aerial vehicles (UAVs, also known as drones) have drawn a lot of attention from academic and industrial fields, as well as the general public [1]. Due to their ease of deployment, low cost, high mobility, and ability to hover [2] compared to conventional terrestrial infrastructure, UAVs hovering in the air are more likely to set up wireless links with favorable channel conditions and thus are considered as a promising vector of support for wireless communications in a great number of practical applications [3], such as security and surveillance, the real-time monitoring of road traffic, providing wireless coverage, remote sensing, search and rescue operations, the delivery of goods, precision agriculture, and civil infrastructure inspection [4]. However, it is difficult to complete complex missions with a single UAV because of its limited detection capacity, energy resources, load, and other factors [5]. The solution to such a problem is ad-hoc formation using multiple UAVs [6]. The number of UAVs and their travel distances vary over a wide range for different applications, as shown in Figure 1 [2]. Swarms of multiple small UAVs that complete various tasks have gained more interest, as they improve on the effectiveness of a single-UAV system [7].

**Figure 1.** Application areas over a range of distance vs. number of nodes.

An emerging swarm application is the use of small UAVs as source nodes to collect information by their own airborne sensors, and the use of other UAVs as relay nodes to form reliable communication links for ad-hoc ground networks in tactical situations [8–10]. With the application of new sensors (e.g., high-definition aviation digital cameras, airborne imaging spectrometers, aviation imaging radars, etc.) in a single UAV, the information gathered from several source drones is sharply increased. Therefore, determining how to improve the throughput of UAV surveillance networks is a problem worth studying. A multi-source multi-relay single-destination (MSMRSD) structure is the main topology structure of UAV cooperative surveillance networks, and clusters are formed respectively among the source nodes and relay nodes. Effective information sharing among closely spaced intra-cluster nodes (i.e., among source nodes and/or among relay UAVs) is used to facilitate the cooperation [11], which is similar to the structure of network coding (NC) [12]. NC is an effective method to increase network throughput, and the real-time performance of a UAV communication system can be greatly enhanced by introducing network coding principles.

NC is a technique used for effective and secure communication by improving network capacity, throughput, efficiency, and robustness [13]. Its core idea is to have intermediate nodes process the received data, i.e., apply a linear combination or some other kind of coding to previously received information, rather than simply forward it. The destination nodes can recover the original data from part of the received data, such that the throughput of the network is efficiently improved and the network's security is increased [14]. Up to now, the main application of network coding in UAV communication networks has been random linear network coding (RLNC) [15,16] or physical-layer network coding (PNC) [17–19]. RLNC can achieve throughput arbitrarily close to the capacity in an unreliable single-hop broadcast network while yielding an acceptable decoding delay [20]. However, the throughput advantage of RLNC in a dynamic UAV network does not seem to be remarkable when the topology of a UAV network is relatively complex [21,22]. Besides, traditional RLNC comes with a sacrifice in service delay because, if the users are not able to collect a full set of the encoded packets, the useful information cannot be recovered under the wireless fading channel [23]. Compared with the conventional relay system, PNC can double the throughput of a two-way relay channel (TWRC) by reducing the time slots for the exchange of one packet from four to two [24]. It has been a common belief that PNC requires tight synchronization [25], which is difficult to achieve in UAV networks. Complex field network coding (CFNC), as a generalized version of RLNC, is simple to implement and can facilitate the transmission of 1/2 symbol per source per channel use for multi-source cooperative relay networks [26]. Furthermore, the symbol-level synchronization of CFNC is more convenient to attain than bit-level synchronization [27]. In view of the above advantages, the CFNC is introduced to UAV cooperative surveillance networks in this paper.
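The core idea of combining data at an intermediate node can be illustrated with the simplest textbook case, bitwise XOR coding over a two-way relay channel. The sketch below is only an illustration of that idea, not of RLNC, PNC, or CFNC as used in this paper; the packet contents and names are hypothetical. In this scheme the two end nodes send in two separate slots and the relay broadcasts one coded packet in a third slot, instead of forwarding the two packets in two further slots.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length packets (a linear combination over GF(2))."""
    if len(a) != len(b):
        raise ValueError("packets must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical packets held by two end nodes A and B.
packet_a = b"UAV-A:42"
packet_b = b"UAV-B:17"

# Slots 1 and 2: A and B each send their packet to the relay.
# Slot 3: the relay broadcasts a single coded packet instead of two plain ones.
coded = xor_bytes(packet_a, packet_b)

# Each end node removes its own packet to recover the other node's packet.
recovered_at_a = xor_bytes(coded, packet_a)
recovered_at_b = xor_bytes(coded, packet_b)

print(recovered_at_a)  # the packet that B sent
print(recovered_at_b)  # the packet that A sent
```

The relay never forwards either packet unchanged, yet both end nodes recover what they need, which is where the saving in time slots comes from.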

The topology structure of NC is also multi-source multi-relay single-destination, as shown in Figure 2 [28]. In this structure, each source node is simultaneously connected to all relay nodes and the destination node. Moreover, all relay nodes are connected to the destination node. Every source node links the same number of edges, as does every relay node, so this structure is called the regular structure in this paper. However, the regular structure is inapplicable to a dynamic time-varying UAV network for two main reasons. One is that not all source drones are always connected with the command and control center (destination node): the distance between them may be beyond the communication range, typically for the purpose of expanding the surveillance range, or some obstacles may lie between them, as illustrated in Figure 3 [29]. It can be seen from Figure 3 that drone number 4 does not have a direct communication link to the command and control center because of a mountain barrier. The other reason is that not every source drone can always be connected with all relay nodes, due to its own mobility or obstacles between them. In practical applications, a source drone will not always be connected with all relay nodes and the destination node simultaneously, and the corresponding structure is described as an irregular structure. According to whether there is a direct communication link between any source drone and the command and control center, the information transfer mechanism in the downlink is set to one of two modes, either mixed or relay transmission. The specific meaning of mixed and relay transmissions will be expanded upon in Section 2.

**Figure 2.** The conventional topology structure of network coding.

**Figure 3.** An example of an unmanned aerial vehicle (UAV) cooperative surveillance network being applied in a mountainous area.

The rest of this paper is organized as follows: Section 2 presents two irregular topology structures for a CFNC-based network according to the mixed and relay modes. Section 3 provides the throughput performance evaluation of the different NC schemes and the encoding/decoding derivation of CFNC in the two modes. Section 4 mainly analyzes the reliability of CFNC combined with the two proposed topology structures over an additive white Gaussian noise (AWGN) channel. Finally, we conclude the paper in Section 5.

#### **2. Design of the Topology Structure**

In order to enlarge the coverage area, a UAV cooperative network for surveillance purposes has to employ some drones as relay nodes to transmit messages. A very common topology structure in UAV cooperative networks is multiple surveillance drones, multiple relay drones, and a single command and control center. As shown in Figure 2, a conventional topology structure of NC consists of some source and relay nodes, as well as a destination node. If the source nodes, relay nodes, and the destination node are considered as surveillance drones, relay drones, and the command and control center, respectively, the topology structure of NC is similar to that of the surveillance network. Theoretically, the structure of the former could be applied to the latter.

The prominent feature of an NC structure is that each source node is always connected with all relay drones and the command and control center on the ground. However, this feature is not suitable for the changing dynamics of UAV cooperative networks. On the one hand, some source drones cannot deliver messages to the destination directly because the distance between them exceeds the maximum communication range or because direct communication is blocked by certain obstacles, such as mountains or buildings. On the other hand, it is unreasonable to expect every source drone to connect with all relay drones, as obstacle blocking is likely to appear or the distance among them may be beyond their own individual communication range. From this point of view, the topology structure of NC needs to be appropriately revised before application.

For the multi-source multi-relay single-destination structure expressed as *Ns*-*Nr*-1, the edges among different types of nodes are the most important factor influencing the total performance of the UAV cooperative surveillance network when the number of source drones (*Ns*) and relay drones (*Nr*) is fixed. In this paper, an edge refers to a direct communication link between two nodes of different types. These edges are divided into three groups, namely, edges between source nodes and the destination node, edges between source nodes and relay nodes, and edges between relay nodes and the destination node. The *Ns*-*Nr*-1 structure is made up of three types of node and a certain number of edges, so we can consider the structure as a special triple bipartite graph. Based on the characteristics of the bipartite graph, the three groups of edges can be represented by different matrices. A row matrix, **M**, is introduced to express the edges between the source nodes and the destination node. If the *i*th element *m<sub>i</sub>* of the matrix is equal to '1', this indicates that the *i*th source node *S<sub>i</sub>* can deliver messages to the destination node *D* directly without a relay. If *m<sub>i</sub>* = 0, there is no direct communication link between *S<sub>i</sub>* and *D*. Likewise, matrix **G** is employed to represent the edges between the source nodes and relay nodes, and the rows and columns of this matrix indicate the relay and source nodes, respectively. If the element *G<sub>ij</sub>* of the matrix is '1', there is a direct communication link between the source node *S<sub>j</sub>* and the relay node *R<sub>i</sub>*; if *G<sub>ij</sub>* = 0, *S<sub>j</sub>* cannot send messages to *R<sub>i</sub>*. For convenience, we assume that all relay nodes are always connected to the destination node, which means the edges between them can be expressed as an all-ones row matrix.
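As a minimal sketch of this representation, the function below builds **M** and **G** as plain Python lists from a description of which links exist. The function name, the argument names, and the small 3-2-1 example are hypothetical; note that the code uses 0-based indices, whereas the text numbers nodes from 1.

```python
def build_topology(num_sources, num_relays, direct_links, relay_links):
    """Build the row matrix M and the matrix G for an Ns-Nr-1 structure.

    direct_links: set of source indices j (0-based) that reach D directly.
    relay_links:  set of (i, j) pairs meaning source j can reach relay i.
    Returns (m, g), where m has Ns entries and g has Nr rows of Ns entries.
    """
    m = [1 if j in direct_links else 0 for j in range(num_sources)]
    g = [
        [1 if (i, j) in relay_links else 0 for j in range(num_sources)]
        for i in range(num_relays)
    ]
    return m, g


# Hypothetical 3-2-1 structure: S1 and S3 reach D directly,
# R1 hears S1 and S2, and R2 hears S2 and S3.
m, g = build_topology(
    num_sources=3,
    num_relays=2,
    direct_links={0, 2},
    relay_links={(0, 0), (0, 1), (1, 1), (1, 2)},
)
print("M =", m)
print("G =", g)
```

Each row of **G** lists the sources heard by one relay, so the number of ones in a row is the number of source symbols that relay has to combine.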

For the conventional topology structure of NC, as illustrated in Figure 2, every element of **M**<sub>1×*n*</sub> and **G**<sub>*k*×*n*</sub> is equal to '1' (both are all-ones matrices), which is why we call the structure a regular structure. From the above analysis, we may conclude that the regular structure of NC is not suitable for UAV cooperative surveillance networks; that is, the elements of **M** and **G** cannot all always be equal to '1'. The number of edges is variable, even if *Ns* and *Nr* are constant, which leads to diversity in structure. Similar to the characteristics of a check matrix in low-density parity-check (LDPC) codes, the density of '1's in both matrices is uncertain. This uncertainty results in a large number of irregular structures, even if the values of *Ns* and *Nr* are small. According to whether there is a direct communication link between any source drone and the command and control center, the information transfer mechanism at the downlink is set to one of two modes, either mixed or relay transmission. In the first mode, the information is transmitted from the source drones to the destination by at least one direct link together with multi-relay forwarding, which indicates that **M** is a non-zero matrix. In the other mode, all the source drones deliver messages to relay nodes within their communication range; that is, no direct communication link between the source nodes and the destination can be utilized, which means that **M** is a zero matrix.
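The distinction between the two modes and the regular structure reduces to simple checks on the entries of **M** and **G**. The sketch below assumes the matrices are given as lists of 0/1 values and treats the regular structure as the case in which every entry of both matrices is 1; the function names and example matrices are hypothetical.

```python
def transmission_mode(m):
    """Return 'mixed' if at least one source has a direct link to D
    (M is non-zero), otherwise 'relay' (M is a zero matrix)."""
    return "mixed" if any(m) else "relay"


def is_regular(m, g):
    """True if every source is linked to D and to every relay,
    i.e. every entry of M and G equals 1."""
    return all(m) and all(all(row) for row in g)


# Hypothetical 3-2-1 examples.
m_mixed, g_mixed = [1, 0, 1], [[1, 1, 0], [0, 1, 1]]
m_relay, g_relay = [0, 0, 0], [[1, 1, 0], [0, 1, 1]]
m_full, g_full = [1, 1, 1], [[1, 1, 1], [1, 1, 1]]

print(transmission_mode(m_mixed), is_regular(m_mixed, g_mixed))
print(transmission_mode(m_relay), is_regular(m_relay, g_relay))
print(transmission_mode(m_full), is_regular(m_full, g_full))
```

Under these definitions the regular structure is a special case of the mixed mode, since its **M** is non-zero.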

Based on the two modes, two corresponding irregular topology structures for a CFNC-based network are proposed, and Figures 4 and 5 serve as examples. The matrix **M** is set to [1 0 ··· 1]<sub>1×*Ns*</sub> and [0 0 ··· 0]<sub>1×*Ns*</sub> in the mixed and relay modes, respectively, and the matrix **G** in the two modes is represented as follows, respectively:

$$\begin{bmatrix} 1 & 1 & \cdots & 0 \\ 1 & 1 & \cdots & 0 \\ \vdots & \vdots & \cdots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}_{N_r \times N_s} \tag{1}$$

$$\begin{bmatrix} 1 & 1 & \cdots & 0 \\ 1 & 1 & \cdots & 0 \\ \vdots & \vdots & \cdots & \vdots \\ 0 & 1 & \cdots & 1 \end{bmatrix}_{N_r' \times N_s'} \tag{2}$$

The process of information transmission in the two topology structures is quite different. For the mixed mode, source drones will transmit information to the destination node via available direct links and the relay nodes within communication range simultaneously in the first time slot. In the second time slot, the relay nodes deliver the demodulated information to the destination node. In the second mode, all the source drones will transmit information to the relay nodes within communication range in the first time slot, then the relay drones deliver the demodulated information to the destination node in the second time slot.
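As a minimal sketch of this representation (not the authors' implementation), the edge matrices can be held as plain Python lists. The 3-source, 2-relay values and the helper names `transmission_mode` and `first_slot_receivers` are assumptions chosen purely for illustration.

```python
def transmission_mode(M):
    """Return 'mixed' if at least one source reaches D directly, else 'relay'."""
    return "mixed" if any(M) else "relay"


def first_slot_receivers(M, G, source):
    """List the nodes that hear source `source` (0-based) in the first time slot."""
    # G[i][j] == 1 means relay R(i+1) has a link from source S(j+1).
    receivers = [f"R{i + 1}" for i, row in enumerate(G) if row[source] == 1]
    if M[source] == 1:  # direct source-to-destination edge
        receivers.append("D")
    return receivers


# Hypothetical edge settings (illustration only, not taken from the tables).
M_mixed = [1, 0, 1]  # S1 and S3 reach D directly
M_relay = [0, 0, 0]  # no direct links at all
G = [
    [1, 1, 0],  # R1 hears S1 and S2
    [0, 1, 1],  # R2 hears S2 and S3
]

print(transmission_mode(M_mixed), first_slot_receivers(M_mixed, G, 0))
print(transmission_mode(M_relay), first_slot_receivers(M_relay, G, 1))
```

In both modes the second time slot is the same: every relay forwards what it demodulated to *D*, so only the first-slot receiver sets differ.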

**Figure 4.** The irregular topology structure for the mixed transmission mode.

**Figure 5.** The irregular topology structure for the relay transmission mode.

#### **3. Network Coding**

In traditional relay communications, each source node uses a different time slot to transmit information, and each relay node likewise uses a separate time slot to forward it, which results in poor real-time performance for information transmission [30]. Network coding can greatly reduce the number of time slots required, which suggests that it has a very promising future in wireless multicast networks [31,32]. This section presents the classification of network coding, a performance evaluation of different network coding schemes, and the encoding and decoding derivation of CFNC in the two modes.

#### *3.1. The Classification of Network Coding*

Based on the arithmetic field in which the coding is performed, network coding can be divided into several categories, such as binary field, Galois field, and complex field network coding. The application of network coding in a UAV cluster must consider the characteristics of UAV communication. With the adoption of new mission payloads, such as large-area and high-resolution digital aerial cameras, synthetic aperture radars, and infrared imagers, the quantity of information collected by drones is growing exponentially. Saving on the return time of reconnaissance information implies a decrease in discovery probability. Next, we investigate which network coding scheme has the best real-time performance.

In general, network coding designs are based on the Galois field, which implements bit-level operations. This coding scheme can improve throughput to some extent, but the advantage diminishes as the number of source and relay nodes increases. An *Ns*-source, *Nr*-relay, single-destination structure with traditional relaying is depicted in Figure 6. Assuming that each node is equipped with an antenna, *Ns* sources (*S*1, ··· , *SNs*) transmit information to the destination (*D*) directly and via the relays (*R*1, *R*2, ··· , *RNr*). To avoid interference, sources *S*1, ··· , *SNs*, in the traditional relay format, transmit over orthogonal channels, e.g., via time division multiple access (TDMA) [27]. To start with, source *S*1 transmits information symbol *x*1 to *R*1, *R*2, ··· , *RNr* and *D* simultaneously during channel use (CU) 1. Then, the relay *R*1 forwards *x̂*1 to *D* in CU 2, where *x̂*1 is the decoding output of *R*1 for *x*1. From CU 3 to CU (*Nr*+1), the relays *R*2, ··· , *RNr* send *x̂*1 to *D* successively. The information symbol *x*1 therefore takes (*Nr*+1) CUs to travel from source *S*1 to the destination *D* through relays *R*1, *R*2, ··· , *RNr*. For the information symbol sequence {*x*1, *x*2, ··· , *xNs*}, a total of *Ns*(*Nr*+1) channel uses are needed to deliver *Ns* symbols from *Ns* sources, and the throughput of this scheme is 1/(*Ns*(*Nr*+1)) symbol per source per channel use (sym/S/CU).

**Figure 6.** Traditional relay.

The relay scheme based on Galois field network coding (GFNC) is depicted in Figure 7. In CU 1, source *S*1 transmits information symbol *x*1 to *R*1, *R*2, ··· , *RNr* and *D*, the same as in a traditional relay. From CU 2 to CU *Ns*, information symbols *x*2, ··· , *xNs* are sent to *R*1, *R*2, ··· , *RNr* and *D* successively. *R*1 forwards the Galois field coded symbol *x̂*1 ⊕ *x̂*2 ⊕ ··· ⊕ *x̂Ns* to *D* in CU (*Ns*+1), where ⊕ denotes the bitwise exclusive-OR (XOR) operation. Likewise, *RNr* forwards the Galois field coded symbol *x̂*1 ⊕ *x̂*2 ⊕ ··· ⊕ *x̂Ns* to *D* in CU (*Ns*+*Nr*). From the above analysis, we can deduce that (*Ns*+*Nr*) channel uses are needed to transmit the information symbol sequence {*x*1, *x*2, ··· , *xNs*} from *Ns* sources to *D*. Thus, the throughput of a GFNC-based relay is 1/(*Ns*+*Nr*) sym/S/CU.
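The XOR combining performed at a GFNC relay can be sketched in a few lines. This is an illustration only: each symbol is reduced to a single bit, and the erasure-recovery step at the destination is an assumption about how the coded symbol might be used, since the text does not spell out the destination processing. The function names are hypothetical.

```python
from functools import reduce
from operator import xor


def gfnc_relay_symbol(detected_bits):
    """XOR of all detected source bits, as forwarded by one relay."""
    return reduce(xor, detected_bits)


def recover_missing(known_bits, coded_bit):
    """Recover one erased source bit from the other bits and the XOR symbol."""
    return reduce(xor, known_bits, coded_bit)


def gfnc_channel_uses(ns, nr):
    """Ns source transmissions followed by Nr relay transmissions."""
    return ns + nr


bits = [1, 0, 1, 1]               # x1..x4 detected at the relay
coded = gfnc_relay_symbol(bits)   # single symbol sent to D
# Suppose D missed x3 on the direct link but knows x1, x2, x4 and the XOR.
x3 = recover_missing([bits[0], bits[1], bits[3]], coded)
print(coded, x3, gfnc_channel_uses(ns=4, nr=3))
```

The XOR symbol alone cannot identify the individual bits, because many bit patterns share the same parity; it is only useful together with the directly received symbols.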

**Figure 7.** Relay with Galois field network coding (GFNC).

To improve the real-time performance, CFNC is introduced in this paper. As illustrated in Figure 8, before transmission in time slot 1, the source information $x_i$ from $S_i$ is multiplied by $\theta_i$, which is the $i$th element of $\boldsymbol{\theta}_S^T = [\theta_1, \theta_2, \cdots, \theta_{N_s}]$. We assume that $\boldsymbol{\theta}_S^T$ is available at every node in the network. The choice of a diversity-maximizing $\boldsymbol{\theta}_S^T$ is not unique, but one is available for any $N_s$. Among the different (parametric/non-parametric) choices for $\boldsymbol{\theta}_S^T$, [28] takes it to be any row of the Vandermonde matrix, i.e.:

$$
\boldsymbol{\Theta} = \begin{bmatrix}
1 & \delta_1 & \cdots & \delta_1^{N_s-1} \\
1 & \delta_2 & \cdots & \delta_2^{N_s-1} \\
\vdots & \vdots & \cdots & \vdots \\
1 & \delta_{N_s} & \cdots & \delta_{N_s}^{N_s-1}
\end{bmatrix}_{N_s \times N_s} \tag{3}
$$

where the so-called generators, $\{\delta_n\}_{n=1}^{N_s}$, have unit modulus in the complex field $\mathbb{C}$. Relays $R_1, \cdots, R_{N_r}$ simultaneously receive the information symbols $\theta_1 x_1, \cdots, \theta_{N_s} x_{N_s}$ transmitted by $S_1, \cdots, S_{N_s}$ in CU 1; the agreed coefficients $\theta_1, \cdots, \theta_{N_s}$ drawn from $\mathbb{C}$ will be specified later. After detecting $x_1, \cdots, x_{N_s}$ as $\hat{x}_1, \cdots, \hat{x}_{N_s}$, the relays $R_1, \cdots, R_{N_r}$ forward $\theta_1 \hat{x}_1 + \cdots + \theta_{N_s} \hat{x}_{N_s}$ to $D$ in CU 2. Therefore, the throughput of CFNC is 1/2 sym/S/CU. The throughput comparison of the above three schemes is listed in Table 1.

**Figure 8.** Relay with complex field network coding (CFNC).


**Table 1.** The throughput performance of various network coding schemes.

As can be seen from Table 1, GFNC is superior to traditional relaying in terms of throughput, but its advantage gradually decreases as the number of source and relay nodes increases; CFNC naturally avoids this problem. The coding method employed by CFNC raises the throughput to 1/2 sym/S/CU regardless of the number of nodes, which is beneficial to the real-time performance. Moreover, the XOR operation usually adopted by GFNC makes a one-to-one mapping between the source information and the received information impossible. By contrast, the received information $\hat{u}$ ($\hat{u} = \theta_1 \hat{x}_1 + \cdots + \theta_{N_s} \hat{x}_{N_s}$) and the information symbol sequence $\{x_1, \cdots, x_{N_s}\}$ easily satisfy a one-to-one mapping, unless $x_1 = x_2 = \cdots = x_{N_s}$. Meanwhile, this mapping offers a method to detect $\hat{x}_1, \cdots, \hat{x}_{N_s}$ from the received information $\hat{u}$.
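The three throughput expressions derived in this section can be evaluated side by side. The sketch below simply restates those formulas with exact fractions; the function names and the sample (*Ns*, *Nr*) pairs are chosen for illustration.

```python
from fractions import Fraction


def throughput_traditional(ns, nr):
    """Traditional relay: Ns symbols need Ns * (Nr + 1) channel uses."""
    return Fraction(1, ns * (nr + 1))


def throughput_gfnc(ns, nr):
    """GFNC relay: Ns symbols need Ns + Nr channel uses."""
    return Fraction(1, ns + nr)


def throughput_cfnc(ns, nr):
    """CFNC relay: two channel uses regardless of Ns and Nr."""
    return Fraction(1, 2)


for ns, nr in [(2, 2), (6, 6), (8, 10)]:
    print(
        f"Ns={ns}, Nr={nr}: "
        f"traditional={throughput_traditional(ns, nr)}, "
        f"GFNC={throughput_gfnc(ns, nr)}, "
        f"CFNC={throughput_cfnc(ns, nr)} sym/S/CU"
    )
```

For a 6-6-1 structure, for example, the values are 1/42, 1/12, and 1/2 sym/S/CU, which shows how the gap widens as the network grows.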

#### *3.2. Information Transmission Based on Complex Field Network Coding (CFNC) in Mixed Mode*

Based on the theoretical analysis in the previous section, CFNC has a clear throughput advantage over the other network coding schemes when the numbers of source and relay nodes are large. Next, the information transmission based on CFNC applied to the proposed topology structures is derived for the mixed and relay modes, respectively. According to the irregular *Ns*-*Nr*-1 topology structure for the mixed mode, as shown in Figure 4, the information symbol transmission based on CFNC involves only two channel uses. The received symbols at *Rj* and *D* after CU 1 are given as follows (see Figure 9):

$$\begin{array}{rl} y_{SR_j}(t) &= h_{S_1R_j}\theta_1 x_1(t) + \cdots + h_{S_{N_s}R_j}\theta_{N_s} x_{N_s}(t) + n_{SR_j}(t) \\ &= \boldsymbol{\theta}_S^T \mathbf{H}_{SR_j} \mathbf{x}(t) + n_{SR_j}(t) \end{array} \tag{4}$$

$$\begin{array}{rl} y_{SD}(t) &= h_{S_1D}\theta_1 x_1(t) + \cdots + h_{S_{N_s}D}\theta_{N_s} x_{N_s}(t) + n_{SD}(t) \\ &= \boldsymbol{\theta}_S^T \mathbf{H}_{SD} \mathbf{x}(t) + n_{SD}(t) \end{array} \tag{5}$$

where, for each subscript pair $ij$, $h_{ij} \sim \mathcal{CN}(0, \sigma_{ij}^2)$ denotes the channel coefficient and $n_{ij} \sim \mathcal{CN}(0, N_0)$ denotes the AWGN term. The instantaneous and average signal-to-noise ratios (SNRs) are given respectively by $\gamma_{ij} = \bar{\gamma}|h_{ij}|^2$ and $\bar{\gamma}_{ij} = \bar{\gamma}\sigma_{ij}^2$, where $\bar{\gamma} = P_x/N_0$ and $P_x$ denotes the average transmission power of source symbol $x$, which is assumed to be drawn from a finite alphabet $A_x$ with cardinality $|A_x|$ [27]. Here, $\mathbf{H}_{SR_j} = \mathrm{diag}(h_{S_1R_j}, h_{S_2R_j}, \cdots, h_{S_{N_s}R_j})$, $\mathbf{H}_{SD} = \mathrm{diag}(h_{S_1D}, h_{S_2D}, \cdots, h_{S_{N_s}D})$, and the information symbol vector is $\mathbf{x}(t) = [x_1(t), \cdots, x_{N_s}(t)]^T$, where $t = 1, \cdots, N_r$ and $j = 1, \cdots, N_r$.
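A minimal numerical sketch of the signal model in Equations (4) and (5) is given below, using only the Python standard library. The coefficient values, the unit channel variance, and the helper names are assumptions for illustration; the SNR helper follows the usual definition of instantaneous SNR as the squared channel magnitude times the ratio of symbol power to noise power.

```python
import cmath
import math
import random


def complex_gaussian(rng, variance):
    """Draw one sample of a circularly symmetric CN(0, variance) variable."""
    sigma = math.sqrt(variance / 2)  # each of real/imag carries half the variance
    return complex(rng.gauss(0, sigma), rng.gauss(0, sigma))


def received_symbol(theta, h, x, noise=0j):
    """Superposition of all weighted source symbols plus noise at one receiver."""
    return sum(h_i * t_i * x_i for h_i, t_i, x_i in zip(h, theta, x)) + noise


def instantaneous_snr(h, px, n0):
    """|h|^2 * Px / N0 for a single link."""
    return abs(h) ** 2 * px / n0


rng = random.Random(1)
theta = [1, cmath.exp(3j * math.pi / 4)]  # hypothetical unit-modulus coefficients
x = [1, -1]                               # BPSK symbols from two sources
h = [complex_gaussian(rng, 1.0) for _ in x]
n0 = 0.1
y = received_symbol(theta, h, x, complex_gaussian(rng, n0))
print(y, [instantaneous_snr(h_i, 1.0, n0) for h_i in h])
```

Because the channel matrix is diagonal, the matrix form reduces to the element-wise sum computed in `received_symbol`.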

**Figure 9.** *Ns*-source setup with CFNC in the mixed mode.

The design of $\boldsymbol{\theta}_S^T$ in Equations (4) and (5) is critical to CFNC. The design relates to the linear complex field (LCF) encoder given in [33] for multiple input multiple output (MIMO) systems. Based on the concept of Euler numbers and their properties, two systematic designs of these generators are provided in [34]: $\delta_n = e^{j\pi(4n-1)/(2N_s)}$ if $N_s = 2^k$ and $\delta_n = e^{j\pi(6n-1)/(3N_s)}$ if $N_s = 3 \times 2^k$, where $n$ indicates the $n$th row of the Vandermonde matrix. In other words, $\theta_i = e^{j\pi(4n-1)(i-1)/(2N_s)}$ if $N_s = 2^k$ and $\theta_i = e^{j\pi(6n-1)(i-1)/(3N_s)}$ if $N_s = 3 \times 2^k$, where $i = 1, \cdots, N_s$. However, the similarities with MIMO-LCF designs stop here. Notice that the coded symbol $u = \theta_1 x_1 + \cdots + \theta_{N_s} x_{N_s}$ in CFNC is transmitted through different nodes (sources) in the network simultaneously, instead of through multiple co-located antennas on one terminal [33]. Therefore, a normalizing factor, as in ([34], Eq. (3.68)), to meet the power constraint on one node is not necessary here [28].
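The generator formulas above can be checked numerically. The following sketch builds one row of the Vandermonde matrix and verifies, by brute force over BPSK symbols, that the coded symbol is distinct for every symbol combination. The function names, the BPSK alphabet, and the rounding tolerance are assumptions made for this illustration.

```python
import cmath
import itertools
import math


def is_power_of_two(m):
    return m >= 1 and (m & (m - 1)) == 0


def generator(n, ns):
    """Generator delta_n (n is 1-based) for Ns = 2^k or Ns = 3 * 2^k."""
    if is_power_of_two(ns):
        return cmath.exp(1j * math.pi * (4 * n - 1) / (2 * ns))
    if ns % 3 == 0 and is_power_of_two(ns // 3):
        return cmath.exp(1j * math.pi * (6 * n - 1) / (3 * ns))
    raise ValueError("these designs require Ns = 2^k or Ns = 3 * 2^k")


def theta_row(n, ns):
    """n-th Vandermonde row: [1, delta_n, ..., delta_n^(Ns-1)]."""
    delta = generator(n, ns)
    return [delta ** i for i in range(ns)]


def is_one_to_one(theta, alphabet=(-1, 1), digits=6):
    """True if every symbol vector maps to a distinct coded symbol u."""
    seen = set()
    for cand in itertools.product(alphabet, repeat=len(theta)):
        u = sum(t * x for t, x in zip(theta, cand))
        seen.add((round(u.real, digits), round(u.imag, digits)))
    return len(seen) == len(alphabet) ** len(theta)


theta4 = theta_row(1, 4)
print([round(abs(t), 12) for t in theta4])  # all entries lie on the unit circle
print(is_one_to_one(theta4))                # designed coefficients
print(is_one_to_one([1, 1]))                # naive all-ones coefficients collide
```

With all-ones coefficients the vectors (+1, −1) and (−1, +1) both map to zero, which is the kind of ambiguity the complex-field coefficients are meant to remove.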

After *Nr* channel uses, the maximum likelihood (ML) detection at relay *Rj* is given as follows:

$$\hat{\mathbf{x}}_j(t) = \arg\min_{\mathbf{x}(t)} \left\| y_{SR_j}(t) - \boldsymbol{\theta}_S^T \mathbf{H}_{SR_j} \mathbf{x}(t) \right\|^2, \tag{6}$$
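For a small alphabet, the ML rule in Equation (6) can be sketched as an exhaustive search over all candidate symbol vectors. The sketch assumes BPSK symbols and perfect channel knowledge at the relay; the coefficient and channel values and the name `ml_detect` are hypothetical.

```python
import cmath
import itertools
import math


def ml_detect(y, theta, h, alphabet=(-1, 1)):
    """Return the symbol vector whose noiseless image is closest to y."""
    best, best_metric = None, float("inf")
    for cand in itertools.product(alphabet, repeat=len(theta)):
        predicted = sum(h_i * t_i * x_i for h_i, t_i, x_i in zip(h, theta, cand))
        metric = abs(y - predicted) ** 2  # squared Euclidean distance
        if metric < best_metric:
            best, best_metric = cand, metric
    return list(best)


theta = [1, cmath.exp(3j * math.pi / 4)]  # two-source coefficients
h = [0.8 + 0.3j, -0.5 + 1.1j]             # hypothetical source-to-relay gains
x_true = [1, -1]
noise = 0.05 + 0.02j                      # small fixed perturbation for the demo
y = sum(h_i * t_i * x_i for h_i, t_i, x_i in zip(h, theta, x_true)) + noise
print(ml_detect(y, theta, h))
```

The search visits |*Ax*| to the power *Ns* candidates, so this brute-force form is only practical for a handful of sources.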

The relay node *Rj* re-encodes the demodulated result and then sends it to the destination node. The input/output (I/O) relationship in CU 2 is expressed as follows:

$$y_{R_jD}(t) = \sqrt{\alpha_j}\, h_{R_jD}\, \boldsymbol{\theta}_R^T \hat{\mathbf{x}}_j + n_{R_jD}, \quad j = 1, \cdots, N_r, \tag{7}$$

where $\hat{\mathbf{x}}_j = [\hat{\mathbf{x}}_j^T(1), \cdots, \hat{\mathbf{x}}_j^T(N_r)]^T$, $\alpha_j$ represents a link-adaptive scalar which controls the transmission power at $R_j$, and $\boldsymbol{\theta}_R$ is an $N_sN_r \times 1$ vector designed as above, i.e., $\boldsymbol{\theta}_R^T = [\theta'_1, \theta'_2, \cdots, \theta'_{N_s \times N_r}]$. For $N_r \times N_s = 2^k$, the entries of $\boldsymbol{\theta}_R$ are given by $\theta'_i = e^{j\pi(4n-1)(i-1)/(2N_s \times N_r)}$, and for $N_r \times N_s = 3 \times 2^k$, by $\theta'_i = e^{j\pi(6n-1)(i-1)/(3N_s \times N_r)}$, where $i = 1, 2, \cdots, N_s \times N_r$ and $n$ may be any of $1, 2, \cdots, N_s \times N_r$.

The symbol rate is 1/2 sym/S/CU, because *Ns* sources transmit *Ns* symbols over 2 channel uses. After these 2 channel uses, the ML detection result at *D* is given as follows:

$$\hat{\mathbf{x}}_D = \arg\min_{\mathbf{x}'} \left\{ \sum_{t=1}^{N_r} \left\| y_{SD}(t) - \boldsymbol{\theta}_S^T \mathbf{H}_{SD} \mathbf{x}(t) \right\|^2 + \sum_{j=1}^{N_r} \left\| y_{R_jD}(t) - \sqrt{\alpha_j}\, h_{R_jD}\, \boldsymbol{\theta}_R^T \mathbf{x}' \right\|^2 \right\}, \tag{8}$$

where $\mathbf{x}' = [\mathbf{x}^T(1), \cdots, \mathbf{x}^T(N_r)]^T$.

#### *3.3. Information Transmission Based on CFNC in Relay Mode*

There are no direct communication links between the source drones and the command and control center when the source drones move beyond their communication range or the links between them are totally blocked. In such a situation, the conventional topology structure of NC exhibited in Figure 4 is inapplicable. Thus, an irregular topology structure for the relay mode is proposed in this paper, as depicted in Figure 5. The received symbols at $R_j$ after CU 1 (see Figure 10) are the same as in Section 3.2, i.e., $y_{SR_j}(t) = \boldsymbol{\theta}_S^T \mathbf{H}_{SR_j} \mathbf{x}(t) + n_{SR_j}(t)$.

**Figure 10.** *Ns*-source setup with CFNC in the relay mode.

After $N_r$ channel uses, relay $R_j$ detects $\hat{\mathbf{x}}_j(t) = \arg\min_{\mathbf{x}(t)} \| y_{SR_j}(t) - \boldsymbol{\theta}_S^T \mathbf{H}_{SR_j} \mathbf{x}(t) \|^2$ and forwards this demodulated symbol with scaling coefficient $\alpha_j$ in the next CU. The I/O relationship is $y_{R_jD}(t) = \sqrt{\alpha_j}\, h_{R_jD}\, \boldsymbol{\theta}_R^T \hat{\mathbf{x}}_j + n_{R_jD}$, where $j = 1, 2, \cdots, N_r$ and $\boldsymbol{\theta}_R$ is the $N_sN_r \times 1$ vector designed in Section 3.2.

Since *Nr* symbols are transmitted per source over 2*Nr* channel uses, the symbol rate is clearly 1/2 sym/S/CU. After these channel uses, the ML detection result at *D* is given as follows:

$$\hat{\mathbf{x}}_D = \arg\min_{\mathbf{x}'} \left\{ \sum_{t=1}^{N_r} \sum_{j=1}^{N_r} \left\| y_{R_jD}(t) - \sqrt{\alpha_j}\, h_{R_jD}\, \boldsymbol{\theta}_R^T \mathbf{x}' \right\|^2 \right\}, \tag{9}$$

where $\boldsymbol{\theta}_R$ is calculated as described in Section 3.2.
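The combining of relay observations in Equation (9) can be illustrated for a tiny 2-2-1 network. This is a simplified sketch under several assumptions that are not stated in the text: BPSK symbols, error-free detection at the relays, known channel gains, and one forwarded observation per relay (the sum over *t* is collapsed). All numerical values and function names are hypothetical.

```python
import cmath
import itertools
import math


def theta_r(ns, nr, n=1):
    """Entries exp(j*pi*(4n-1)(i-1)/(2*Ns*Nr)); valid when Ns*Nr is a power of two."""
    size = ns * nr
    return [cmath.exp(1j * math.pi * (4 * n - 1) * i / (2 * size)) for i in range(size)]


def relay_forward(x_hat, alpha_j, h_rd, theta, noise=0j):
    """Symbol received at D from one relay in the second channel use."""
    coded = sum(t * x for t, x in zip(theta, x_hat))
    return math.sqrt(alpha_j) * h_rd * coded + noise


def destination_ml(y_rd, alpha, h_rd, theta, alphabet=(-1, 1)):
    """Exhaustive search minimising the summed squared distance over all relays."""
    def metric(cand):
        coded = sum(t * x for t, x in zip(theta, cand))
        return sum(
            abs(y - math.sqrt(a) * h * coded) ** 2
            for y, a, h in zip(y_rd, alpha, h_rd)
        )
    return list(min(itertools.product(alphabet, repeat=len(theta)), key=metric))


theta = theta_r(ns=2, nr=2)         # length Ns * Nr = 4
x_true = [1, -1, -1, 1]             # stacked block of source symbols
alpha = [1.0, 0.5]                  # power scalars of R1 and R2
h_rd = [0.9 - 0.2j, 0.4 + 0.7j]     # relay-to-destination gains
y_rd = [relay_forward(x_true, a, h, theta) for a, h in zip(alpha, h_rd)]
print(destination_ml(y_rd, alpha, h_rd, theta))
```

Each extra relay adds one more term to the metric, which is where the diversity gain discussed in the simulation section comes from.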

#### **4. Simulation Results and Analysis**

#### *4.1. Topology Structure Performance Evaluation*

The throughput performance of CFNC based on an irregular topology structure in the mixed mode has been assessed in Section 3.1. In this section, the reliability of CFNC applied in the two proposed topology structures over an AWGN channel is evaluated by Monte Carlo simulations using MATLAB and compared with that of CFNC based on the conventional topology structure. We mainly investigate the influence of the numbers of source and relay nodes on the symbol error probability (SEP) of the two proposed structures. In all simulations, the frame length of the information transmitted by each source node was 1000 bits, and the bits in the same position of every information frame constituted a single symbol, i.e., a symbol contained *Ns* bits. The number of frames for each source node was fixed at 1500.
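To give a feel for how an SEP point is estimated by Monte Carlo simulation, the sketch below uses Python rather than MATLAB and a much simpler setting than the networks studied here: a single AWGN link, BPSK sources, unit symbol power, and the first Vandermonde row for a power-of-two number of sources. The function name, the trial count, and the SNR definition (symbol power over noise power) are assumptions for illustration.

```python
import cmath
import itertools
import math
import random


def simulate_sep(ns, snr_db, trials, seed=0):
    """Estimate the symbol error probability of a CFNC symbol over one AWGN link."""
    rng = random.Random(seed)
    # First row (n = 1) of the design for Ns = 2^k: exp(j*pi*3*(i-1)/(2*Ns)).
    theta = [cmath.exp(1j * math.pi * 3 * i / (2 * ns)) for i in range(ns)]
    candidates = list(itertools.product((-1, 1), repeat=ns))
    points = [sum(t * x for t, x in zip(theta, c)) for c in candidates]
    n0 = 10 ** (-snr_db / 10)        # noise power for unit symbol power
    sigma = math.sqrt(n0 / 2)        # per-dimension noise standard deviation
    errors = 0
    for _ in range(trials):
        sent = rng.randrange(len(candidates))
        y = points[sent] + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
        detected = min(range(len(points)), key=lambda k: abs(y - points[k]))
        errors += detected != sent   # a symbol error is any wrong vector
    return errors / trials


for snr_db in (0, 10, 20, 30):
    print(snr_db, simulate_sep(ns=2, snr_db=snr_db, trials=2000))
```

A symbol here is the whole vector of *Ns* bits, matching the convention that one symbol contains one bit from each source.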

We investigated the mixed mode reliability of the proposed irregular topology structure with different numbers of source and relay drones compared to the regular structure. Figures 11 and 12 show the SEP performance of the mixed mode with different numbers of relays in the 6-*Nr*-1 and 8-*Nr*-1 structures, respectively. The edge parameters of the 6-*Nr*-1 and 8-*Nr*-1 structures in the mixed mode are given in Tables 2 and 3, respectively, and the other simulation parameters were the same as mentioned above unless otherwise indicated. It can be seen from Figure 11 that the SEP performance of the mixed mode improves as the number of relay drones increases when the number of source drones is fixed at 6. This is due to the higher diversity gains originating from the increasing number of relay nodes. However, the room for SEP improvement gradually diminishes once *Nr* is larger than 6. In order to reduce the complexity and cost of UAV cooperative networks, we selected the number of relay drones as 6 for the 6-*Nr*-1 structure. Compared with the regular 6-6-1 structure, the proposed 6-6-1 structure achieves a gain of at least 3 dB in the region of SEP = 10<sup>−3</sup>; that is to say, the irregular structure can remarkably improve reliability over the regular structure under the same parameters.

**Figure 11.** The symbol error probability (SEP) of the mixed mode with different numbers of relays in a 6-*Nr*-1 CFNC-based structure.

**Figure 12.** The SEP of the mixed mode with different numbers of relays in an 8-*Nr*-1 CFNC-based structure.


**Table 2.** The edge parameters of the 6-*Nr*-1 structure in the mixed mode.


**Table 3.** The edge parameters of the 8-*Nr*-1 structure in the mixed mode.

For the 8-*Nr*-1 irregular structure in the mixed mode, the simulation results of the SEP performance shown in Figure 12 are very similar to those in Figure 11. As we see from Figure 12, the SEP decreased with an increasing number of relay drones when the number of source drones was set at 8. It is noteworthy that the improvement on SEP is smaller when the number of relay drones is greater than 10. Too many relay nodes will increase the complexity and cost of a UAV cluster. In view of the reasons given above, the number of relay nodes was selected as 10 for the 8-*Nr*-1 structure. In addition, the reliability of the irregular 8-10-1 structure was superior to that of the regular structure under the same simulation parameters. Through the above analysis, we can deduce that the proposed irregular topology structure in the mixed mode has certain advantages in terms of the reliability when compared with the regular structure under the same conditions.

The effect of the source drone number on the SEP performance of the irregular structure in the mixed mode is illustrated in Figure 13. More details about the edge setting in the *Ns*-6-1 structure are exhibited in Table 4. We can observe from Figure 13 that the SEP performance worsens with an increasing number of source drones when the number of relay drones is fixed at 6. For a single relay node, the more information it receives from the connected source drones, the worse the SEP performance is. The interference among different messages will be intensified when a relay node processes or forwards information, which results in poor SEP performance.

**Figure 13.** The SEP of the mixed mode with different numbers of sources in the *Ns*-6-1 CFNC-based structure.


**Table 4.** The edge parameters of the *Ns*-6-1 structure in the mixed mode.

In the relay mode, there is no connection between the source drones and the command and control center, which indicates that **M** is a zero-row matrix. Figure 14 shows the SEP performance of the relay mode with different numbers of relays in the 2-*Nr*-1 irregular topology structure. More details about the edge setting in the 2-*Nr*-1 structure are given in Table 5. As we see from Figure 14, the SEP performance gradually improves with an increasing number of relay drones when the number of source drones is fixed at 2. This is because more relay nodes bring more diversity gains, which leads to better reliability. It is noteworthy that the room for improvement in the SEP performance is limited when the number of relay nodes is larger than a certain value. Moreover, an increasing number of relay nodes imposes a relatively high implementation complexity and cost on cooperative UAV networks. Therefore, the selection of the relay number should take into account reliability, network complexity, system cost, and so on.

**Figure 14.** The SEP of the relay mode with different numbers of relays in the 2-*Nr*-1 CFNC-based structure.


**Table 5.** The edge parameters of the 2-*Nr*-1 structure in the relay mode.

The effect of the source drone number on the SEP performance of the irregular structure in the relay mode is illustrated in Figure 15. The detailed edge parameters in the *Ns*-2-1 structure are given in Table 6. It can be observed from Figure 15 that the SEP performance gets worse with an increasing number of source drones. The reason for this is similar to that of the mixed mode. The greater the number of source drones a single relay node links to, the more messages it receives. The mutual interference among messages hampers data processing and forwarding, which leads to a considerable decline in reliability.

**Figure 15.** The SEP of the relay mode with different numbers of sources in the *Ns*-2-1 CFNC-based structure.


**Table 6.** The edge parameters of the *Ns*-2-1 structure in the relay mode.

#### *4.2. The Combination of CFNC and Conventional Unmanned Aerial Vehicle (UAV) Datalink*

From the above analysis, we can deduce that CFNC applied in the proposed irregular structures under the two transmission modes has a distinct advantage in terms of reliability and throughput. Next, we discuss the performance of CFNC combined with the signal system of a conventional UAV datalink, convolutional coded binary phase shift keying (CC-BPSK) modulation, which is a common transmission scheme used in existing UAV datalinks. Figure 16 shows a block diagram of CC-BPSK combined with CFNC (abbreviated as CC-BPSK-CFNC). In this system, the simulation parameters were set as follows: the structure of the convolutional code was (2, 1, 3), i.e., one information bit was encoded into a 2-bit codeword each time (code rate 1/2) with a constraint length of 3; the generator matrix was [1 1 1; 1 0 1]; and the Viterbi algorithm was adopted for decoding. The irregular topology structure of the CFNC in the two modes was chosen as 8-8-1, and the edge parameters in the structure are shown in Table 3.
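The (2, 1, 3) convolutional code with generator matrix [1 1 1; 1 0 1] and its Viterbi decoder can be sketched compactly. The version below is an illustration rather than the simulation code used for the results: it works on hard decisions, appends two zero tail bits to terminate the trellis, and omits the BPSK mapping and the CFNC stage. The function names are hypothetical.

```python
def step(state, bit):
    """One encoder transition: returns (next_state, (out1, out2))."""
    s1, s2 = state >> 1, state & 1  # previous bit and the bit before it
    out1 = bit ^ s1 ^ s2            # generator 1 1 1
    out2 = bit ^ s2                 # generator 1 0 1
    return (bit << 1) | s1, (out1, out2)


def conv_encode(bits):
    """Encode bits, appending two zero tail bits so the encoder ends in state 0."""
    state, coded = 0, []
    for bit in list(bits) + [0, 0]:
        state, pair = step(state, bit)
        coded.extend(pair)
    return coded


def viterbi_decode(coded):
    """Hard-decision Viterbi decoding of a terminated codeword."""
    inf = float("inf")
    metric = [0, inf, inf, inf]     # the encoder starts in state 0
    paths = [[], [], [], []]
    for k in range(0, len(coded), 2):
        r1, r2 = coded[k], coded[k + 1]
        new_metric = [inf] * 4
        new_paths = [[] for _ in range(4)]
        for state in range(4):
            if metric[state] == inf:
                continue            # state not reachable yet
            for bit in (0, 1):
                nxt, (o1, o2) = step(state, bit)
                m = metric[state] + (o1 != r1) + (o2 != r2)  # Hamming distance
                if m < new_metric[nxt]:
                    new_metric[nxt] = m
                    new_paths[nxt] = paths[state] + [bit]
        metric, paths = new_metric, new_paths
    return paths[0][:-2]            # survivor ending in state 0, tail bits removed


message = [1, 0, 1, 1]
codeword = conv_encode(message)
corrupted = codeword[:]
corrupted[3] ^= 1                   # flip one coded bit
print(codeword, viterbi_decode(corrupted))
```

A soft-decision variant would replace the Hamming distance with a Euclidean distance on the received BPSK samples, which is the more common choice in practice.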

**Figure 16.** The transmission scheme of convolutional coded binary phase shift keying complex field network coding (CC-BPSK-CFNC).

The SEP comparison of CC-BPSK-CFNC, based on the mixed mode in regular and irregular structures, is illustrated in Figure 17. As shown in Figure 17, an SEP of 10<sup>−4</sup> is attainable for CC-BPSK-CFNC in the irregular structure at an SNR of around 12 dB, whereas CFNC based on the same structure without channel coding and modulation requires an SNR of about 30 dB for the equivalent SEP performance (as shown in Figure 12). Note that the reliability could be improved by invoking a few coded modulation techniques at the expense of rate loss. The transmission scheme, i.e., CC-BPSK-CFNC, in the irregular structure could obtain at least a 14 dB gain at an SEP of 5 × 10<sup>−3</sup> compared with the scheme in the regular structure. The SEP comparison of CC-BPSK-CFNC, based on the relay mode in regular and irregular structures, is depicted in Figure 18. We can see that an SEP of 10<sup>−4</sup> is attainable for CC-BPSK-CFNC in the irregular structure when the SNR is greater than 18 dB. Compared with the regular structure, the scheme based on the irregular one can earn at least a 6.5 dB gain at an SEP of 10<sup>−3</sup>.

**Figure 17.** The SEP comparison of CC-BPSK-CFNC based on the mixed mode in regular and irregular structures.

**Figure 18.** The SEP comparison of CC-BPSK-CFNC based on the relay mode in regular and irregular structures.

#### **5. Conclusions**

Using multiple drones to form a collaborative network will become one of the main trends of UAV development in the future, and the amount of information exchanged among drones in such a network is expected to increase greatly. Complex field network coding (CFNC) is an effective method to improve network throughput and has been introduced to UAV cooperative surveillance networks in this paper; its throughput was found to be as high as 1/2 sym/S/CU, which is superior to the other network coding schemes considered. According to whether there is a direct communication link between any source drone and the destination, the information transfer mechanism at the downlink was set to one of two modes, either mixed or relay transmission. Two corresponding irregular topology structures for a CFNC-based network have been proposed, and the information transmission based on CFNC in the mixed and relay modes was derived. The MATLAB simulation results over an AWGN channel show that CFNC applied in the proposed irregular structures under the two transmission modes can remarkably improve reliability compared with the regular structures under the same parameters. Moreover, CFNC can easily be combined with the existing channel coding and modulation of UAV datalinks, such as CC-BPSK, which further enhances the SEP performance to a great extent.

**Author Contributions:** The work presented in this paper was carried out in collaboration with all authors. R.X. conceived the ideas and concept. L.H. implemented the software and carried out the experiments and wrote the manuscript. H.C. critically reviewed and edited the paper. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was partially funded by the National Natural Science Foundation of China (Grant No. 61873070), the Technology Development Project of the China Research Institute of Radiowave Propagation (Grant No. JW2019-114), and the Fundamental Research Funds for the Central Universities (Grant No. HEUCFM180803).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**

The following abbreviations are used in this manuscript:


#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Sensors* Editorial Office E-mail: sensors@mdpi.com www.mdpi.com/journal/sensors
