*Article* **Enhanced Dynamic Spectrum Access in UAV Wireless Networks for Post-Disaster Area Surveillance System: A Multi-Player Multi-Armed Bandit Approach**

**Amr Amrallah <sup>1,</sup>\*, Ehab Mahmoud Mohamed <sup>2,3</sup>, Gia Khanh Tran <sup>1</sup> and Kei Sakaguchi <sup>1</sup>**


**Abstract:** Modern wireless networks are notorious for being very dense, uncoordinated, and selfish, especially with greedy user needs. This leads to a critical scarcity problem in spectrum resources. The Dynamic Spectrum Access (DSA) system is considered a promising solution for this scarcity problem. With the aid of Unmanned Aerial Vehicles (UAVs), a post-disaster surveillance system is implemented using a Cognitive Radio Network (CRN). UAVs are distributed in the disaster area to capture live images of the damaged area and send them to the disaster management center. The CRN enables UAVs to utilize a portion of the spectrum of the Electronic Toll Collection (ETC) gates operating in the same area. In this paper, a joint transmission power selection, data-rate maximization, and interference mitigation problem is addressed. Considering all these conflicting parameters, this problem is investigated as a budget-constrained multi-player multi-armed bandit (MAB) problem. The whole process is done in a decentralized manner, where no information is exchanged between UAVs. To achieve this, two power-budget-aware MAB (PBA-MAB) algorithms, namely the upper confidence bound (PBA-UCB) algorithm and the Thompson sampling (PBA-TS) algorithm, are proposed to select the transmission power value efficiently. The proposed PBA-MAB algorithms outperform random power value selection in terms of achievable data rate.

**Keywords:** unmanned aerial vehicles; dynamic spectrum access; quality of service; reinforcement learning; multi-armed bandit

## **1. Introduction**

The fast development of UAVs, commonly known as drones, has received much attention in various domains [1,2]. Although their usage was largely restricted to military applications until the last few years, UAVs have recently been leveraged for civil applications. This is a promising direction because UAVs can fly, are maneuverable, and are easy to deploy. Hence, UAVs can handle different tasks such as delivery services, traffic monitoring, aerial photography, disaster management, rescue operations, and wireless communications [1,2]. In recent years, major disasters have occurred around the world, such as the great Tohoku earthquake and tsunami, which hit Japan in 2011; Hurricane Sandy on the northeastern coast of the USA in 2012; the Nepal earthquake in 2015; the massive explosion in the port of Beirut, Lebanon, in 2020; and the wildfires in North America and Europe in 2021. All these disasters caused terrible damage to infrastructure and loss of human lives. The first few hours after a disaster are considered the golden relief time to provide support and emergency aid to save lives. Therefore, this paper focuses on wireless communications applications for UAVs to support a post-disaster area surveillance system. Specifically, UAVs can fly over post-disaster areas to collect live photos of the current situation and send this information to a disaster management center to be analyzed. This enables rescue teams to obtain information promptly about the actual situation in the affected area, which improves their response time [3].

**Citation:** Amrallah, A.; Mohamed, E.M.; Tran, G.K.; Sakaguchi, K. Enhanced Dynamic Spectrum Access in UAV Wireless Networks for Post-Disaster Area Surveillance System. *Sensors* **2021**, *21*, 7855. https://doi.org/10.3390/s21237855

Academic Editor: Margot Deruyck

Received: 5 November 2021 Accepted: 23 November 2021 Published: 25 November 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

On the other hand, due to the persistent increase in demand for mobile services, spectrum resources are becoming more and more scarce [4]. Therefore, future mobile networks are expected to host communications technology that supports advanced networking architectures and energy-efficient devices. Realizing these concepts raises new fundamental challenges. Unlike wired communications systems, the wireless world has a limited number of links to distribute because of national spectrum regulations and hardware limitations. Consequently, traditional spectrum regulation will need fundamental reform to allow more efficient use of spectrum resources. Spectrum inefficiency has become a major concern; hence, it is imperative to search for an effective solution to resource allocation problems from both the spectrum-efficiency and power-efficiency points of view. This solution should achieve three main goals. Firstly, it should be amenable to distributed implementation. Secondly, it should be capable of dealing with the uncertainty caused by the lack of information. Thirdly, it should deal with users' selfishness. One of the most promising solutions is the DSA system [5], which can be implemented as a CRN [6]. A DSA system can enhance spectrum utilization efficiency [7]. A CRN allows unlicensed Secondary Users (SUs) to coexist with licensed Primary Users (PUs) in the licensed band without causing any harmful impact on PUs in terms of different Quality of Service (QoS) aspects. In other words, SUs can utilize a portion of the licensed PU spectrum under certain QoS constraints [6]. Therefore, to enhance network efficiency, SUs' spectrum utilization should be maximized while maintaining the QoS level of the high-priority traffic, i.e., the PU traffic, to avoid any service interruption to the highly prioritized data transmission.

This resource allocation issue is a challenging problem for two reasons. First, the resource allocation process can span a large number of orthogonal communication dimensions such as time, frequency, code, space, and antenna direction [8]. Second, in order to enhance spectrum utilization, QoS for both PUs and SUs should be maximized. To achieve this, different conflicting parameters need to be jointly optimized, such as transmitted power, channel occupation, total throughput, and the mutual interference level between simultaneous users. Therefore, for a certain number of PUs and SUs, the optimization algorithm needs indispensable inputs such as the interference threshold for each PU, the channel state information, and the geographical location of both PUs and SUs. Moreover, this optimization scenario can be decentralized; in other words, there is no need to deploy a fusion center to collect enough information from the environment and carry out the optimization process. Since energy levels are not observed in general, and both PUs and SUs form a distributed network, the system can exploit that distribution to sense the available energy at each node. From this point of view, the design of an efficient future wireless network needs to deal with the uncertainty of information as well as competition and selfishness among users. Hence, it becomes mandatory to search for a powerful mathematical tool that can deal with such network problems.

Machine learning (ML) algorithms, more precisely reinforcement learning (RL) algorithms, are leveraged to deal with these kinds of optimization problems [9]. RL algorithms are selected because they generalize well and are efficient, which lets them tackle real-life problems, especially in the field of wireless communications [9]. Furthermore, RL algorithms are able to deal with the conflicting optimization parameters of the resource allocation problem for the DSA system [10]. Without prior information about the environment, an agent can learn to enhance its future actions based on its past experience. MAB algorithms are one such class of RL algorithms. A MAB problem can be described as a set of actions (arms) of a bandit machine, where each arm leads to a certain reward [11]. A player needs to maximize their accumulated reward over the playing epoch by choosing one arm to pull in each playing round. Moreover, this player has no prior knowledge of the reward behind each arm; the instantaneous reward of an arm is revealed only once the player selects it. Therefore, in this hidden setting, the player may lose some reward in each trial by not selecting the arm that leads to the highest reward value. This loss is called regret [12]. Thus, each player should select a sequence of arms that maximizes their total reward over the horizon or, equivalently, minimizes their total regret over the horizon. Doing so requires balancing the exploration of rarely tried arms against the exploitation of arms that have performed well so far. This common dilemma of MAB algorithms is called the exploration–exploitation trade-off [13–15].
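To make the notion of regret concrete, the following minimal Python sketch computes the cumulative expected regret of a given sequence of arm pulls. The function name `cumulative_regret` and the example reward means are hypothetical and chosen only for illustration; they do not come from the paper.

```python
def cumulative_regret(true_means, chosen_arms):
    """Sum, over all rounds, of the expected reward lost by not pulling the best arm.

    true_means  -- expected reward of each arm (hidden from a real player)
    chosen_arms -- index of the arm pulled in each round
    """
    best_mean = max(true_means)
    return sum(best_mean - true_means[arm] for arm in chosen_arms)


if __name__ == "__main__":
    means = [0.2, 0.5, 0.9]              # hypothetical expected rewards of three arms
    explore_once = [0, 1, 2, 2, 2, 2]    # try every arm once, then stay on the best one
    stuck_on_first = [0, 0, 0, 0, 0, 0]  # never explore beyond the first arm
    print(cumulative_regret(means, explore_once))
    print(cumulative_regret(means, stuck_on_first))
```

A player who never explores keeps paying the per-round gap to the best arm, while a player who explores briefly pays only during the exploration rounds. This is the trade-off that UCB-style and Thompson-sampling-style policies try to balance.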

Over the last decade, with the rapid increase in the number of natural disasters occurring throughout the world, there has been an urgent need to develop a smart post-disaster surveillance system. This smart system should operate in a fully decentralized manner, i.e., without a controlling center, to speed up the collection and analysis of data for a post-disaster area, enhance performance, and reduce the response time of rescue operations. DSA systems are a rich topic that was deeply investigated in the early 2000s for some older applications such as analog TV white spaces, especially in the Very High Frequency (VHF) and Ultra High Frequency (UHF) bands [16]. Hence, we aimed to refurbish the well-known DSA system by exploiting ML algorithms as a modern optimization tool. Furthermore, UAVs, which are capable of flying and capturing high-resolution videos using attached cameras, have recently been leveraged to support various civilian applications. All these ideas motivated us to develop a smart and cheap post-disaster surveillance system by combining the advantages of the DSA system, UAVs, and ML algorithms. In addition, this system is presented as an unconventional method to solve the spectrum scarcity problem. In this way, ML-aided DSA systems can open the gate to new applications in the field of UAV wireless communication networks.

In this paper, we aim to design and evaluate spectrum allocation for a DSA system using MAB algorithms to support a post-disaster surveillance system. From a MAB perspective, UAVs, which are considered SU transmitters, act as players who aim to maximize their long-term reward, i.e., data rate. Furthermore, each player is constrained by a limited power budget. The different transmitting power levels act as the arms of the bandit machine. The MAB framework suits our optimization problem because it can deal with online optimization problems without any prior information about the environment except the player's observations of the achieved reward while playing. Our paper adapts two different MAB algorithms, the Upper Confidence Bound (UCB) [15] and Thompson Sampling (TS) [17], to address this optimization problem. The resulting modified version of the MAB algorithms is called the Power-Budget-Aware MAB (PBA-MAB) algorithm. The key idea behind the PBA-MAB algorithm is to include the available power budget of each UAV in the decision-making process when choosing the most appropriate transmitting power value.

From the point of view of the DSA system, the SU network, which consists of UAVs and temporary base stations, shares the spectrum resources as a CRN with the PU network, which consists of highway Electronic Toll Collection (ETC) gates and cars passing these ETC gates, under certain QoS constraints. Hence, SU transmitters are allowed to send their data without causing harmful interference to the high-priority data of the PU network. It should be mentioned that our design allows both the PU network and the SU network to coexist at the same time under a certain signal-to-interference-plus-noise ratio (SINR) threshold. Furthermore, we need a multi-objective formulation. Given the location of each PU and the power budget of each SU, we seek to solve a joint optimization problem considering different conflicting objectives such as interference coordination, sum-rate maximization, and the total number of active SUs in the network, subject to QoS constraints for both PUs and SUs. Despite the adversarial problem definition and the selfish behavior of each UAV toward achieving its maximum data rate, modified MAB algorithms learn how to select the most suitable action over time to enhance the overall system performance, as discussed in [18–20] and illustrated in our paper. The main contributions of this paper can be summarized in the following points:


The rest of the paper is organized as follows. Section 2 overviews the related work. Section 3 introduces the system model and the power value selection optimization problem. Section 4 introduces the proposed PBA-MAB algorithms and how these algorithms can deal with this kind of optimization problem. Section 5 gives the simulation and analysis of the proposed optimization scenario. Finally, we summarize the results and point out future research in Section 6.

#### **2. Related Works**

Since the early 21st century, the idea of DSA has gained increasing attention, especially in the US and Europe, due to spectrum congestion [21]. An overview of the major technical and regulatory issues of DSA systems was presented in [21]. The authors of [22] introduced the concept of multi-dimensional spectrum sensing and discussed the challenges associated with it. They developed prediction algorithms based on past multi-dimensional spectrum utilization information to predict the future usage of the spectrum. With the aid of the DSA system, a CRN can be established to support different applications such as public safety, smart grid, broadband cellular, and medical applications. Ref. [23] discussed some challenges facing the practical application of this idea. An overview of CRN design layers, such as the physical layer (PHY), the medium-access control layer (MAC), and the network layer, is presented in [24]. Furthermore, the authors showed how these layers can interact with each other. The authors of [25] investigated the throughput improvements in a CRN using different channel selection techniques such as frequency hopping, frequency tracking, and frequency coding. Ref. [26] investigated the CRN formed by incorporating radio capabilities of a Wireless Sensor Network (WSN). It addressed both advantages and limitations of CRN for WSN in conjunction with the existing applications and techniques. A continuous-time Markov chain model is implemented in [27] for a DSA system in an open spectrum wireless network. The authors of [28] examined how CRN devices can find an available spectrum channel under different system capabilities, spectrum policies, and environmental conditions. They defined this problem as a "rendezvous" problem.
With the aid of RL algorithms, the authors of [29] proposed a framework for Internet of Things (IoT) devices to capture and model the traffic behavior of short-time spectrum occupancy in order to determine the existing interference in the shared bands. In [30], a novel information and energy cooperation method was introduced for cognitive Heterogeneous Networks (HetNets). This method aimed to enhance energy efficiency by solving an energy efficiency maximization problem with respect to joint time allocation and power control. The authors of [31] proposed an enhanced fusion center rule for soft decision cooperative spectrum sensing using energy detection to mitigate the noise uncertainty effect and to enhance the sensing performance of CRNs.

In recent years, there have been research efforts to use UAVs to support post-disaster area applications. In [32], the authors used UAVs in conjunction with a cellular network and a WSN to aid disaster management applications. A genetic algorithm was used in [33] for UAV location optimization to enhance the overall coverage and data rate of the wireless network. The authors of [34] proposed an effective method to support rescue operations in locating victims of a natural disaster. This was done with the aid of lidar and infrared depth cameras attached to UAVs to build a detection system independent of the illumination intensity. A video recorder and a geolocation module attached to a UAV were used in [35] to search for survivors in a post-disaster area. In [36], the authors examined flying communication services using Wi-Fi, video cameras, and web servers attached to UAVs. They aimed to enable affected users after a disaster to use their smartphones for texting and video communication in real time. The authors of [37] proposed a mobility model for a post-disaster area based on the self-deployment of an aerial ad hoc network using the Jaccard dissimilarity metric. The software simulation integrates the mobility of victims and generates a corresponding UAV mobility model to trace those victims. In [38], the authors proposed energy-efficient task scheduling for the data collected by UAVs from a ground IoT network to support a disaster management system.

In [39], UAVs were used as on-demand airborne relays to connect remote users with a cellular BS when they were separated by vast obstacles. Furthermore, UAVs can be used in WSNs to distribute and collect information in both the Control Plane (CP) and the Data Plane (DP) from wireless sensors deployed at ground level [40,41]. UAVs are also being used to assist the management and control of Vehicle Ad hoc NETworks (VANETs) and extend their coverage [42]. All the above research works assume full awareness of the network parameters, which is not the case in our paper, where no information is exchanged among UAVs while they try to maximize the achievable data rate, as the network is fully decentralized.

On the other hand, RL algorithms have become a promising optimization technique for solving chronic UAV problems that have arisen from integrating UAVs into wireless communication applications. RL algorithms are well known for their capability to achieve near-optimal results in generalization and efficiency. Therefore, they are used to tackle real-time problems in the field of wireless communications. Detailed discussion of different MAB algorithms can be found in [43,44]. It has been shown in several works that MAB algorithms can be adapted to tackle problems related to DSA systems. The authors of [45] proposed MAB learning algorithms for CRN, and particularly for spectrum sensing in a DSA system in licensed bands [7]. Different MAB algorithms, such as UCB and TS, have been used to improve spectrum access in unlicensed Wi-Fi networks [45,46]. The authors of [47] considered a set of policies for multi-user independent and identically distributed (iid) and rested MAB problems with the assumption that each SU declares its action, e.g., the selected channel, to the others, which is considered a strong constraint. A distributed learning and spectrum access policy for iid rewards is discussed in [48], and it was proven that this policy has logarithmic-order regret. In [49], decentralized learning for a DSA system with multiple SUs accessing the spectrum was studied. The authors of [50] proposed a modified MAB algorithm to solve gateway selection in a UAV wireless network for post-disaster area applications. The algorithm considers battery life while searching for the most suitable gateway UAV to maximize the total system throughput. A dynamic wireless channel selection based on the MAB algorithm with a laser chaos time sequence is proposed in [51]. The adaptive channel selection achieved a higher throughput using a four-channel Wireless Local Area Network (WLAN) based on the IEEE 802.11a system.
The authors of [52] proposed a simple and powerful tug-of-war MAB algorithm. Since this algorithm is very simple, it can be applied to wireless network selection for devices with small processing capabilities such as IoT devices and smartphones. Ref. [53] studied millimeter-wave (mmWave) two-hop relaying as a single-player MAB problem in order to enable one relay probing while maximizing the achievable spectral efficiency. This was done by using modified versions of MAB algorithms. The authors of [54] studied the problem of joint neighbor discovery and selection in mmWave device-to-device (D2D) networks using a stochastic budget-constrained MAB algorithm.

#### **3. System Model**

This section discusses the network architecture of the post-disaster area surveillance system using UAVs and the channel model used for transmitting the collected data.

#### *3.1. Post-Disaster Area Surveillance System Architecture*

Figure 1 shows a simplified version of the system architecture of the UAV wireless network in a metropolitan post-disaster area. Since the first few hours after the occurrence of a natural disaster (such as a flood or an earthquake) are considered the golden relief time to save human lives, as discussed in the introduction, UAVs should collect pivotal information about victims in the damaged area using an attached high-definition camera. The collected data can be further analyzed by the disaster management center to identify the victims' exact location, number, age, gender, and injury status. Temporary base stations are deployed in the disaster area to collect this information from surveillance UAVs and send it to the disaster management center to aid rescue teams. These temporary base stations are also used as charging stations for UAVs and are considered their starting flying points. UAVs fly over the disaster area to capture live photos of certain points in the damaged area. The way in which these temporary base stations transmit the collected data to the disaster management center, and the method for selecting surveillance points, are outside the scope of this paper. Moreover, we assumed in this paper that the different locations in the affected area have the same importance, so these points were chosen on a random basis.

**Figure 1.** UAV surveillance-system-assisted DSA for a metropolitan post-disaster area.

On the other hand, our system aims to build this surveillance system using a CRN. Therefore, the SU network, which is represented by UAVs and temporary base stations, will utilize the same frequency band as the PU network. The PU network is represented by ETC gates and passing vehicles on a nearby highway. In this way, we aimed to avoid the cost of reserving dedicated channels for a surveillance system that is used only during natural disasters. Each UAV collects and sends data to its corresponding temporary base station. Furthermore, each UAV should not cause harmful interference to the data transmitted between ETC gates and vehicles on the nearby highway. It should be mentioned that our optimization problem design is considered a soft spectrum allocation. The difference between the conventional spectrum allocation studied in [55,56] and our optimization problem is that the conventional approach treats spectrum allocation as a hard allocation problem; i.e., no two users (PU and SU) can share the same channel at the same time. In contrast, our design introduces the SINR threshold as an additional orthogonal dimension to enable more than one user to coexist in the same frequency band as long as their QoS constraints are not violated. Furthermore, for the sake of generality, we supposed that all PU channels, which connect each ETC gate with the nearby vehicles passing it, are always active and occupied with PU network traffic. In this way, we considered the worst-case design scenario, in which the QoS constraints should be carefully verified during the optimization process.

#### *3.2. Problem Formulation*

In the following, our design employs the physical model proposed in [57], which provides a path-loss model to represent the communication environment. It is assumed that UAVs can communicate with the temporary base station via an air-to-air wireless communication link. This type of link can be treated as a Line of Sight (LoS) wireless communication link. Since the design is built using a CRN, which shares the spectrum between the PU network and the SU network, the shared frequency band is split into *Q* independent sub-bands, and each sub-band has a bandwidth *W* in Hertz. Each primary and secondary transmitter–receiver pair, referred to as a primary or secondary user, is indexed by *ψ* ∈ Ψ = {PU<sub>1</sub>, …, PU<sub>Ψ</sub>} and *ω* ∈ Ω = {SU<sub>1</sub>, …, SU<sub>Ω</sub>}, respectively. Hence, at any time *r*, the general path-loss formula between any transmitter *α* and any receiver *β* in sub-band *q* can be expressed by:

$$L_{\alpha\beta,q}(r) = \frac{G_{\text{Tx},\alpha}\,G_{\text{Rx},\beta}}{d_{\alpha\beta}^{\xi}} \left(\frac{c}{4\pi f_q(r)}\right)^2 \tag{1}$$

where *G*<sub>Tx,*α*</sub> and *G*<sub>Rx,*β*</sub> are the transmit and receive antenna gains, respectively, *d*<sub>*αβ*</sub> is the distance between transmitter *α* and receiver *β*, *c* is the speed of light, *f*<sub>*q*</sub>(*r*) is the carrier frequency of sub-band *q*, and *ξ* is the attenuation constant for the LoS wireless communication link. For the current design, it is assumed that the path loss is the dominant loss factor for the received power. Hence, the effect of multi-path fading and shadowing is ignored. Furthermore, we assumed the transmitted signal is affected by an Additive White Gaussian Noise (AWGN) channel with zero mean and variance *N*<sub>0</sub>. Therefore, the SINR of SU *ω* in sub-band *q* at time *r* can be given by:

$$\gamma_{\omega,q}(r) = \frac{p_{\omega,q}(r)\,L_{\omega\omega,q}(r)}{N_0 + \sum_{\lambda \in \Psi \cup \Omega,\, \lambda \neq \omega} p_{\lambda,q}(r)\,L_{\lambda\omega,q}(r)} \tag{2}$$

where *p*<sub>*ω*,*q*</sub>(*r*) and *p*<sub>*λ*,*q*</sub>(*r*) denote the transmitted power of the *ω*-th SU and of the *λ*-th interfering PU or SU, respectively. For a communication link to be established successfully, the achievable SINR must be greater than the threshold SINR, i.e., *γ*<sub>*ω*,*q*</sub>(*r*) > *γ*<sub>*ω*TH,*q*</sub>(*r*). Under these assumptions, the achievable data rate can be calculated by:

$$R_{\omega,q}(r) = \begin{cases} W \sum_{q=1}^{Q} \log_2\left(1 + \gamma_{\omega,q}(r)\right), & \text{if } \gamma_{\omega,q}(r) > \gamma_{\omega \text{TH},q}(r) \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

where *W* is the bandwidth of the communication channel.
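As a rough illustration of how Equations (1)–(3) fit together, the sketch below evaluates the channel term of Equation (1), the SINR of Equation (2), and the rate contribution of a single sub-band from Equation (3). It is a minimal sketch under stated assumptions: the function names are hypothetical, the default *ξ* = 2 and unit antenna gains are illustrative choices, and the rate function covers one sub-band only (one term of the sum in Equation (3)).

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # c in m/s


def path_gain(distance_m, freq_hz, xi=2.0, g_tx=1.0, g_rx=1.0):
    """Eq. (1): the factor L that scales transmitted power into received power."""
    wavelength_term = (SPEED_OF_LIGHT / (4.0 * math.pi * freq_hz)) ** 2
    return (g_tx * g_rx / distance_m ** xi) * wavelength_term


def sinr(p_signal, gain_signal, interferers, noise_power):
    """Eq. (2): interferers is a list of (tx_power, gain_towards_this_receiver)."""
    interference = sum(p * g for p, g in interferers)
    return p_signal * gain_signal / (noise_power + interference)


def sub_band_rate(bandwidth_hz, sinr_value, sinr_threshold):
    """One sub-band term of Eq. (3): rate in bit/s, zero if the threshold is missed."""
    if sinr_value > sinr_threshold:
        return bandwidth_hz * math.log2(1.0 + sinr_value)
    return 0.0
```

With these helpers, the rate of one SU in one sub-band follows from chaining the three calls: compute the gains of the desired and interfering links, form the SINR, and then apply the threshold test before evaluating the logarithm.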

Since the data rate is measured at the receiver side, we assumed this value is reported to the SU transmitter through a feedback channel. This assumption reflects the flexibility that modern communication systems are expected to offer, for example by splitting the user and control planes to support software-defined networking applications and to allow flexible placement of processing functions between different network nodes [58]. For PUs, it is assumed that they operate in a narrowband network, which means a pre-determined power value is assigned to each PU. This design criterion is suitable when licensed users have to operate on narrowband channels. For a wideband PU network, a straightforward extension can be made without affecting this methodology. Since SUs need to utilize these multiband channels, where each sub-band is previously assigned to a certain PU, each SU has a power budget denoted by *P*<sub>max</sub>. Since it is assumed that the PU and SU networks use omnidirectional antennas, the communication channel can be established according to (1) with antenna gains *G*<sub>Tx,*α*</sub> = *G*<sub>Rx,*β*</sub> = 1, ∀*α*, *β*. Furthermore, it is assumed that each PU transmits using only a single sub-band, and PUs operate in disjoint sub-bands. As a result, the number of PUs equals the number of channels, and hence Ψ = *Q*. The main target of the optimization algorithm is to maximize the sum-rate, i.e., the total throughput, of the SU network. This can be achieved by optimizing the power levels allocated to each SU within each shared traffic channel. The power allocation vector can be defined as **p**<sub>*ω*</sub> = [*p*<sub>*ω*,1</sub>, …, *p*<sub>*ω*,*Q*</sub>]<sup>T</sup>, where each element represents the power value of SU *ω* in sub-band *q*. If an SU has a power vector equal to zero, this SU is inactive. Moreover, during data transmission, SUs should avoid causing any harmful interference to the high-priority traffic that belongs to the PU network. It is mandatory for each SU to satisfy this condition and not exceed its allowed power budget during transmission as well. Considering all these power budget limitations and interference constraints, the sum-rate maximization problem can be formulated as:

$$\begin{aligned} \max\ & \frac{1}{\mathcal{R}} \sum_{r} \sum_{\omega} \sum_{q} R_{\omega,q}(r) \\ \text{s.t. } & \gamma_{\psi,q}(r) > \gamma_{\psi \text{TH},q}(r) \\ & \gamma_{\omega,q}(r) > \gamma_{\omega \text{TH},q}(r) \end{aligned} \tag{4}$$

where ℛ is the total time spent for data transmission, *r* = 1, …, ℛ, and *γ*<sub>*ψ*,*q*</sub>(*r*) > *γ*<sub>*ψ*TH,*q*</sub>(*r*) and *γ*<sub>*ω*,*q*</sub>(*r*) > *γ*<sub>*ω*TH,*q*</sub>(*r*) are the SINR constraints for all PUs and all SUs, respectively. Thus, an SU must satisfy both SINR conditions to utilize a sub-band channel from the PU channels.

Since our network is designed in a decentralized way with no information exchange between different network elements, the only information available to UAVs is the location, the channel frequency, and the transmission power of each ETC gate system. Therefore, we have developed a method to let UAVs estimate the interference caused by their own transmission and calculate the corresponding SINR value at each PU's receiver. With the aid of Equations (1) and (2), each UAV calculates the expected SINR value at each ETC gate under the interference effect of its own data transmission. Then, each UAV can check individually whether the SINR conditions are satisfied for both the PU network and the SU network. In this way, there is no need to deploy a fusion center to share the SINR information between different SU network nodes, and therefore the network can be implemented in a decentralized way.
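The local check described above can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: the function names, the unit antenna gains, the default *ξ* = 2, and the restriction to the UAV's own interference (ignoring other SUs, whose powers a UAV cannot observe in the decentralized setting) are assumptions made for the example.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # c in m/s


def path_gain(distance_m, freq_hz, xi=2.0):
    """Eq. (1) with unit antenna gains (omnidirectional antennas)."""
    return (1.0 / distance_m ** xi) * (SPEED_OF_LIGHT / (4.0 * math.pi * freq_hz)) ** 2


def pu_sinr_with_uav(pu_power, d_pu_link, uav_power, d_uav_to_pu_rx,
                     freq_hz, noise_power, xi=2.0):
    """Expected SINR at a PU receiver when only this UAV interferes (Eq. (2))."""
    signal = pu_power * path_gain(d_pu_link, freq_hz, xi)
    interference = uav_power * path_gain(d_uav_to_pu_rx, freq_hz, xi)
    return signal / (noise_power + interference)


def feasible_powers(candidates, pu_power, d_pu_link, d_uav_to_pu_rx,
                    freq_hz, noise_power, pu_sinr_threshold, xi=2.0):
    """Keep the candidate UAV powers that leave the PU SINR above its threshold."""
    return [
        p for p in candidates
        if pu_sinr_with_uav(pu_power, d_pu_link, p, d_uav_to_pu_rx,
                            freq_hz, noise_power, xi) > pu_sinr_threshold
    ]
```

Each UAV can run such a filter using only the ETC gate location, frequency, and transmission power it already knows, so no SINR reports need to be exchanged between SU nodes.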

#### **4. Proposed Power Budget Aware MAB Algorithm**

This section discusses two proposed algorithms to tackle this rate maximization problem. These algorithms are called Power Budget Aware Upper Confidence Bound (PBA-UCB) and Power Budget Aware Thompson Sampling (PBA-TS).

#### *4.1. Proposed PBA-UCB Algorithm*

UCB is one of the efficient MAB algorithms that can balance the exploration–exploitation dilemma. UCB increases the confidence of the arm selection by decreasing the uncertainty about the reward that will be revealed. Algorithm 1 illustrates a modified version of the UCB algorithm, which is called the PBA-UCB algorithm. This algorithm is applied at each UAV to select the most suitable transmission power in a selfish way to maximize the system rate. It is assumed that each UAV has information about the location of the surrounding ETC gates operating in the surveillance area. Furthermore, each UAV knows the transmitting frequency of each ETC gate. The method by which UAVs detect the location and the operating frequency of each ETC gate is beyond the scope of this paper. Hence, each UAV tries to maximize its own data rate while competing with other UAVs to increase its transmission power, subject to the SINR threshold. At the beginning, i.e., during the first N rounds, the PBA-UCB algorithm, which runs on each UAV, transmits on all available channels with random transmission power and observes the achievable data rate. Afterwards, for the remaining rounds, N + 1 ≤ *r* ≤ ℛ, the PBA-UCB algorithm picks a power value that satisfies:

$$p^{\ast}_{\omega,q}(r) = \underset{p_{\omega,q} \in \mathbf{p}_{\omega}}{\arg\max} \left( \hat{\mu}_{\omega,q}(r-1) + \sqrt{\frac{\eta \ln(r)}{T^{(p)}_{\omega,q}(r-1)}} - \frac{p_{\omega,q}}{p_{\omega,M}} \right) \tag{5}$$

where $p_{\omega,q} \in \mathbf{p}_{\omega}$ is a candidate transmission power value of UAV $\omega$ in channel $q$, and $\hat{\mu}_{\omega,q}(r-1)$ is the average data rate achieved with transmission power value $p$ in channel $q$ up to the previous round $(r-1)$, which can be calculated as:

$$\hat{\mu}_{\omega,q}(r-1) = \frac{1}{T^{(p)}_{\omega,q}(r-1)} \sum_{m=1}^{T^{(p)}_{\omega,q}(r-1)} R_{\omega,q}(m) \tag{6}$$

where $R_{\omega,q}(m)$ is the achievable data rate, which can be obtained from Equation (3), and $T^{(p)}_{\omega,q}(r-1)$ is the number of times this transmission power value has been selected up to the previous round $(r-1)$. $p_{\omega,q}$ is the selected power value for transmission, and $p_{\omega,M}$ is the total available power budget of the UAV. Equation (5) illustrates how PBA-UCB works. If a transmission power value is selected many times, which makes $T^{(p)}_{\omega,q}(r-1)$ large, the confidence bound term $\sqrt{\eta \ln(r) / T^{(p)}_{\omega,q}(r-1)}$ decreases, which causes the UAV to explore other power values that were selected less often in the previous rounds. On the other hand, when a transmission power value achieved a high reward, i.e., a high data rate, during the past rounds, which means $\hat{\mu}_{\omega,q}(r-1)$ is large, the UAV seeks to exploit this high-gain arm in order to achieve the maximum achievable reward during this round. The UCB algorithm originally sets the parameter $\eta$ to 2 in most cases [13], but empirically the performance improves when it is set to $\eta = 0.5$ [12]. In that way, the PBA-UCB algorithm can solve the exploration-exploitation trade-off efficiently. Furthermore, the term $p_{\omega,q}/p_{\omega,M}$ shows how a UAV balances selecting a power value that achieves a high data rate against preserving the remaining power budget for transmission on the next available channels. This last term is the main contribution of our proposed PBA-UCB algorithm.

Whereas the original UCB algorithm only balances exploration and exploitation, our proposed PBA-UCB algorithm also keeps an eye on the remaining power budget while doing so. Furthermore, when selecting a transmission power, the PBA-UCB algorithm checks whether both the PU and SU SINR conditions are satisfied. If they are, the algorithm confirms the use of this transmission power value, starts to transmit data, and calculates the corresponding rate. Otherwise, it sets the transmission power to zero and also sets the corresponding data rate to zero. In this way, the PBA-UCB algorithm ensures that no harmful interference affects the PU data transmission, and it also accounts for the interference threshold of other SUs' data transmissions. Since the SINR condition is a critical design issue, this check is performed in both the initialization phase and the rate maximization phase to ensure the feasibility of the proposed PBA-UCB algorithm. Algorithm 1 illustrates the proposed PBA-UCB algorithm.

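**Algorithm 1** PBA-UCB transmission power selection

1. **for** $\omega \leftarrow 1$ to $\Omega$ **do**
2. &emsp;**for** $1 \le r \le N$ **do** ▷ initialization phase
3. &emsp;&emsp;**for** $q \leftarrow 1$ to $Q$ **do**
4. &emsp;&emsp;&emsp;Select a random value for $p_{\omega,q}(r)$
5. &emsp;&emsp;&emsp;**if** $\gamma_{\psi,q}(r) > \gamma_{\psi_{\mathrm{TH}},q}(r)$ **then**
6. &emsp;&emsp;&emsp;&emsp;**if** $\gamma_{\omega,q}(r) > \gamma_{\omega_{\mathrm{TH}},q}(r)$ **then**
7. &emsp;&emsp;&emsp;&emsp;&emsp;Obtain $R_{\omega,q}(r)$
8. &emsp;&emsp;&emsp;&emsp;&emsp;$T^{(p)}_{\omega,q}(r) \leftarrow 1$
9. &emsp;&emsp;&emsp;&emsp;**else**
10. &emsp;&emsp;&emsp;&emsp;&emsp;$p_{\omega,q}(r) \leftarrow 0$
11. &emsp;&emsp;&emsp;&emsp;**end if**
12. &emsp;&emsp;&emsp;**else**
13. &emsp;&emsp;&emsp;&emsp;$p_{\omega,q}(r) \leftarrow 0$
14. &emsp;&emsp;&emsp;**end if**
15. &emsp;&emsp;**end for**
16. &emsp;**end for**
17. &emsp;**for** $r \leftarrow N + 1$ to $R$ **do** ▷ rate maximization phase
18. &emsp;&emsp;Set $p_{\omega,M} \leftarrow$ max SU Tx power
19. &emsp;&emsp;**for** $q \leftarrow 1$ to $Q$ **do**
20. &emsp;&emsp;&emsp;$p^{\ast}_{\omega,q}(r) = \arg\max_{p_{\omega,q} \in \mathbf{p}_{\omega}} \left( \hat{\mu}_{\omega,q}(r-1) + \sqrt{\frac{\eta \ln(r)}{T^{(p)}_{\omega,q}(r-1)}} - \frac{p_{\omega,q}}{p_{\omega,M}} \right)$
21. &emsp;&emsp;&emsp;**if** $\gamma_{\psi,q}(r) > \gamma_{\psi_{\mathrm{TH}},q}(r)$ **then**
22. &emsp;&emsp;&emsp;&emsp;**if** $\gamma_{\omega,q}(r) > \gamma_{\omega_{\mathrm{TH}},q}(r)$ **then**
23. &emsp;&emsp;&emsp;&emsp;&emsp;Obtain $R_{\omega,q}(r)$ using $p^{\ast}_{\omega,q}(r)$
24. &emsp;&emsp;&emsp;&emsp;&emsp;$T^{(p^{\ast})}_{\omega,q}(r) \leftarrow T^{(p^{\ast})}_{\omega,q}(r-1) + 1$
25. &emsp;&emsp;&emsp;&emsp;&emsp;$\hat{\mu}_{\omega,q}(r) \leftarrow \frac{1}{T^{(p^{\ast})}_{\omega,q}(r)} \sum_{m=1}^{T^{(p^{\ast})}_{\omega,q}(r)} R_{\omega,q}(m)$
26. &emsp;&emsp;&emsp;&emsp;&emsp;$p_{\omega,M} \leftarrow p_{\omega,M} - p^{\ast}_{\omega,q}$
27. &emsp;&emsp;&emsp;&emsp;**else**
28. &emsp;&emsp;&emsp;&emsp;&emsp;$p^{\ast}_{\omega,q}(r) \leftarrow 0$, $R_{\omega,q}(r) \leftarrow 0$
29. &emsp;&emsp;&emsp;&emsp;**end if**
30. &emsp;&emsp;&emsp;**else**
31. &emsp;&emsp;&emsp;&emsp;$p^{\ast}_{\omega,q}(r) \leftarrow 0$, $R_{\omega,q}(r) \leftarrow 0$
32. &emsp;&emsp;&emsp;**end if**
33. &emsp;&emsp;**end for**
34. &emsp;**end for**
35. **end for**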
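The selection step of Equation (5) for one UAV on one channel can be sketched in Python as below. This is a minimal illustration rather than the authors' implementation: the function names, the list-based bookkeeping, and the rule of trying any never-selected power level first are assumptions, and the SINR checks of Algorithm 1 are left out.

```python
import math

def pba_ucb_select(power_levels, mean_rate, counts, round_idx, budget, eta=0.5):
    """Return the index of the power level that maximizes the Eq. (5) score.

    mean_rate[i] and counts[i] are the running average rate and selection
    count of power_levels[i]; budget is the remaining power budget p_{w,M}.
    """
    best_idx, best_score = None, -math.inf
    for i, p in enumerate(power_levels):
        if counts[i] == 0:
            return i  # assumption: an untried level is explored before scoring
        bonus = math.sqrt(eta * math.log(round_idx) / counts[i])  # confidence bound
        score = mean_rate[i] + bonus - p / budget                 # budget penalty
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx

def update_mean(mean_rate, counts, idx, rate):
    """Incremental form of the sample average in Eq. (6)."""
    counts[idx] += 1
    mean_rate[idx] += (rate - mean_rate[idx]) / counts[idx]
```

With equal selection counts, a power level with a slightly higher average rate can still lose to a cheaper one, because the penalty $p_{\omega,q}/p_{\omega,M}$ grows with the requested power.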

#### *4.2. Proposed PBA-TS Algorithm*

The TS algorithm copes with the exploration-exploitation dilemma using a different method from the previously discussed UCB algorithm. Basically, the reward gained by playing different arms using the TS algorithm is drawn from a pure Bayesian probabilistic model [59]. In the beginning, TS uses a prior distribution for the reward based on the initial parameters of the probabilistic model. Afterward, it keeps track of the reward posterior distribution using observations from the environment during the learning process. Thus, it can randomly choose an arm that is likely to be optimal according to the probability model. At each round, random samples are drawn from the constructed posterior distributions of the reward, and TS plays the arm with the maximum sampled value. Then, the arm's posterior distribution is updated by modifying its model parameters. This updated distribution is used for arm selection in the upcoming rounds. TS is known to have strong empirical performance, often better than that of the UCB algorithm.

In our proposed PBA-TS algorithm, it is assumed that the reward, i.e., the achieved data rate, is affected by AWGN noise and mutual interference from other PUs and SUs occupying the same channel. Hence, the assumption of the Gaussian distribution is compatible with our problem formulation. The selection of the most suitable power value for transmission, which can maximize the achieved data rate, can be expressed as:

$$p^{\ast}_{\omega,q}(r) = \underset{p_{\omega,q} \in \mathbf{p}_{\omega}}{\arg\max} \left( \phi_{\omega,q}(r-1) - \frac{p_{\omega,q}}{p_{\omega,M}} \right) \tag{7}$$

where $\phi_{\omega,q}(r-1)$ is a sample drawn from the previously constructed posterior distribution of the data rate achieved by UAV $\omega$ on channel $q$ with transmission power $p_{\omega,q}$. The posterior distribution is modeled as the Gaussian distribution $\mathcal{N}\left(\hat{\mu}_{\omega,q}(r), \sigma^2(r)\right)$, where $\hat{\mu}_{\omega,q}(r)$ and $\sigma^2(r)$ are the mean and the variance of the distribution according to the model in [20], and they can be calculated as:

$$\hat{\mu}_{\omega,q}(r) = \frac{1}{T^{(p)}_{\omega,q}(r)} \sum_{m=1}^{T^{(p)}_{\omega,q}(r)} R_{\omega,q}(m) \tag{8}$$

$$\sigma^2(r) = \frac{1}{T^{(p)}_{\omega,q}(r) + 1} \tag{9}$$

where $R_{\omega,q}(m)$ is the achievable data rate, which can be obtained from Equation (3), and $T^{(p)}_{\omega,q}(r)$ is the number of times this transmission power value has been selected up to the previous round $(r-1)$. The term $p_{\omega,q}/p_{\omega,M}$ is subtracted from the sampled value to balance the rate maximization process against the remaining power budget that should be used to transmit data over the next channels. At each round $r$, a sample $\phi_{\omega,q}(r-1)$ is taken from the previously constructed Gaussian distribution. Then, the optimum power value $p^{\ast}_{\omega,q}$ that maximizes Equation (7) is selected for transmission. After that, UAV $\omega$ starts to transmit over channel $q$ using $p^{\ast}_{\omega,q}$, the corresponding number of selections $T^{(p^{\ast})}_{\omega,q}(r)$ is updated, and the achievable data rate $R_{\omega,q}(r)$ is observed to construct the Gaussian distribution for the next round $r + 1$. This process continues until the last round R. Furthermore, as in the PBA-UCB algorithm, the SINR conditions of both the PU and SU networks are examined each time a power value is chosen for data transmission. If both SINR conditions are satisfied, the PBA-TS algorithm uses this transmission power value and records the corresponding data rate. Otherwise, the PBA-TS algorithm sets the transmission power to zero, which leads to a zero achievable data rate. The whole process of the proposed PBA-TS algorithm is summarized in Algorithm 2.
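The sampling and selection step of Equations (7) to (9) can be sketched as follows for one UAV on one channel. This is a hedged illustration: it assumes one Gaussian posterior per power level, the function name and argument layout are hypothetical, and the SINR checks are omitted.

```python
import random

def pba_ts_select(power_levels, mean_rate, counts, budget, rng=random):
    """Return the index of the power level that maximizes the Eq. (7) score.

    For each level, a sample is drawn from N(mean_rate[i], 1 / (counts[i] + 1))
    and the normalized power cost p / budget is subtracted from it.
    """
    best_idx, best_score = 0, float("-inf")
    for i, p in enumerate(power_levels):
        sigma = (1.0 / (counts[i] + 1)) ** 0.5   # std. dev. from the Eq. (9) variance
        sample = rng.gauss(mean_rate[i], sigma)  # posterior sample phi
        score = sample - p / budget              # Eq. (7) objective
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx
```

Rarely selected power levels keep a wide posterior, so their samples occasionally come out on top, which is how exploration arises here without an explicit confidence bonus.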

#### *4.3. Complexity Analysis of the Proposed Algorithms*

In this paper, we focus on the task of UAVs building a post-disaster surveillance system as a CRN by finding the optimal policy for each UAV. In Algorithms 1 and 2, the learning processes of PBA-UCB and PBA-TS find the optimal transmission power value by examining various transmission power values over every channel for all UAVs using different policies, while keeping the interference level under certain thresholds. Let $\Xi$ represent the total number of available arms, i.e., the number of elements of the power vector $\mathbf{p}$. It is assumed that the action space is deterministic; i.e., all actions are well known to each UAV. Therefore, the number of iterations of PBA-UCB is at most of the order of $\mathcal{O}(\Omega \cdot Q \cdot \Xi)$ steps. In particular, the complexity of PBA-UCB can be expressed as $\mathcal{O}(\Omega \cdot Q^2)$ if the number of available power levels $\Xi$ in the power vector $\mathbf{p}$ is equal to the number of channels $Q$. This means the complexity of the PBA-UCB algorithm is polynomial in $\Omega$ and $Q$. Moreover, PBA-TS has the same computational complexity $\mathcal{O}(\Omega \cdot Q^2)$ as the PBA-UCB algorithm. However, the update strategy in the PBA-TS algorithm is based on sampling from the Gaussian distribution $\mathcal{N}\left(\hat{\mu}_{\omega,q}(r), \sigma^2(r)\right)$; hence, it may impose a slightly higher complexity depending on the sampling process.
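As a rough worked instance of this bound, take the Section 5 setting of 10 UAVs, 10 ETC-gate channels, and 10 power levels, and read the bound as the number of arm-score evaluations in one pass over all UAVs and channels (this reading is an assumption, not something stated explicitly above):

$$\Omega \cdot Q \cdot \Xi = 10 \cdot 10 \cdot 10 = 1000, \qquad \Xi = Q \;\Rightarrow\; \Omega \cdot Q^{2} = 10 \cdot 10^{2} = 1000.$$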


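**Algorithm 2** PBA-TS transmission power selection

1. **for** $\omega \leftarrow 1$ to $\Omega$ **do**
2. &emsp;Set $\hat{\mu}_{\omega,q} \leftarrow 0$, $\sigma^2 \leftarrow 1$
3. &emsp;**for** $r \leftarrow 1$ to $R$ **do**
4. &emsp;&emsp;Set $p_{\omega,M} =$ max SU Tx power
5. &emsp;&emsp;**for** $q \leftarrow 1$ to $Q$ **do**
6. &emsp;&emsp;&emsp;Draw a sample $\phi_{\omega,q}(r-1)$ from the distribution $\mathcal{N}\left(\hat{\mu}_{\omega,q}(r), \sigma^2(r)\right)$
7. &emsp;&emsp;&emsp;$p^{\ast}_{\omega,q}(r) = \arg\max_{p_{\omega,q} \in \mathbf{p}_{\omega}} \left( \phi_{\omega,q}(r-1) - \frac{p_{\omega,q}}{p_{\omega,M}} \right)$
8. &emsp;&emsp;&emsp;**if** $\gamma_{\psi,q}(r) > \gamma_{\psi_{\mathrm{TH}},q}(r)$ **then**
9. &emsp;&emsp;&emsp;&emsp;**if** $\gamma_{\omega,q}(r) > \gamma_{\omega_{\mathrm{TH}},q}(r)$ **then**
10. &emsp;&emsp;&emsp;&emsp;&emsp;Obtain $R_{\omega,q}(r)$ using $p^{\ast}_{\omega,q}(r)$
11. &emsp;&emsp;&emsp;&emsp;&emsp;$T^{(p^{\ast})}_{\omega,q}(r) \leftarrow T^{(p^{\ast})}_{\omega,q}(r-1) + 1$
12. &emsp;&emsp;&emsp;&emsp;&emsp;$\hat{\mu}_{\omega,q}(r) \leftarrow \frac{1}{T^{(p^{\ast})}_{\omega,q}(r)} \sum_{m=1}^{T^{(p^{\ast})}_{\omega,q}(r)} R_{\omega,q}(m)$
13. &emsp;&emsp;&emsp;&emsp;&emsp;$\sigma^2(r) \leftarrow \frac{1}{T^{(p^{\ast})}_{\omega,q}(r) + 1}$
14. &emsp;&emsp;&emsp;&emsp;&emsp;$p_{\omega,M} \leftarrow p_{\omega,M} - p^{\ast}_{\omega,q}$
15. &emsp;&emsp;&emsp;&emsp;**else**
16. &emsp;&emsp;&emsp;&emsp;&emsp;$p^{\ast}_{\omega,q}(r) \leftarrow 0$, $R_{\omega,q}(r) \leftarrow 0$
17. &emsp;&emsp;&emsp;&emsp;**end if**
18. &emsp;&emsp;&emsp;**else**
19. &emsp;&emsp;&emsp;&emsp;$p^{\ast}_{\omega,q}(r) \leftarrow 0$, $R_{\omega,q}(r) \leftarrow 0$
20. &emsp;&emsp;&emsp;**end if**
21. &emsp;&emsp;**end for**
22. &emsp;**end for**
23. **end for**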

#### **5. Simulation Results**

In this section, the simulation results of our proposed algorithms are evaluated in terms of solution performance. We distributed each PU and SU transmitter randomly in a 5 km × 5 km area, while the PU and SU receivers are deployed within a certain distance of their transmitters to comply with the SINR constraint. The SINR threshold is chosen to be 30 dB for the PU network, which is relatively high to ensure that the accumulated data transmission from SUs will not cause any harmful interference to the most valuable traffic. On the other hand, the SINR threshold for the SU network is set to 5 dB to ensure successful data transmission. The transmission powers for the PU and SU networks are set to 24 dBm and 30 dBm, respectively. We deployed a 10-armed bandit whose arms represent 10 different levels of UAV transmission power. These power levels are uniformly distributed, with a separation equal to the maximum transmission power divided by the number of arms. Both the PU and SU networks operate in the 5.8 GHz band with a bandwidth of 10 MHz. Since both networks operate in an open area, the attenuation constant is set to 3 for free-space communication in a metropolitan area. Table 1 summarizes the system parameters used in the simulation.
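The power grid described above can be sketched as follows. The sketch assumes that the uniform spacing is applied on a linear (watt) scale and that the lowest level equals one step rather than zero; neither detail is stated in the text, and the function names are hypothetical.

```python
def dbm_to_watt(p_dbm):
    """Convert a power level from dBm to watts."""
    return 10 ** ((p_dbm - 30) / 10)

def power_levels(p_max_dbm=30.0, n_arms=10):
    """Uniformly spaced power levels with step = maximum power / number of arms."""
    p_max_w = dbm_to_watt(p_max_dbm)
    step = p_max_w / n_arms
    # Assumption: levels run from one step up to the maximum power (no zero level).
    return [step * k for k in range(1, n_arms + 1)]
```

Under these assumptions, the 30 dBm (1 W) SU maximum and 10 arms give levels from 0.1 W to 1 W in 0.1 W steps.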


**Table 1.** Simulation parameters.

Figure 2 shows an example deployment of PU and SU transmitter/receiver pairs. The PU receivers, i.e., cars, were deployed randomly in the simulation area within a distance *δ* from their corresponding transmitters, where *δ* is chosen to achieve 30 dB at the boundary of their deployment region. The number of sub-bands is set equal to the number of PUs, and hence Ψ = *Q*, as described previously in Section 3.

**Figure 2.** Distribution of PUs and SUs Tx/Rx pairs.

#### *5.1. Average Total System Rate*

This section shows the total average system rate in bps/Hz against different numbers of UAVs and ETC gates.

Figure 3 shows the total average system rate using 10 UAVs while increasing the number of ETC gates. The figure shows that the PBA-TS algorithm achieved the highest data rate compared to both the PBA-UCB algorithm and transmission using a random power value. The reason is that the PBA-TS algorithm constructs posterior distributions for the obtained data rates through its integrated Bayesian strategy. On the other hand, transmission using a random power value has the worst performance due to the randomness in the selection of the power value in each round. Thus, each UAV experiences random interference not only from ETC gates but also from other UAVs that share these channels. Furthermore, when the number of ETC gates increases and each ETC gate has its own separate channel, the number of available spectrum resources increases as well. Each UAV is then able to transmit data over a wider band of channels, which increases the total achievable average system rate for both the PBA-TS algorithm and the PBA-UCB algorithm. In contrast, due to the randomness described above, the increase in the achievable total average system rate with random power selection is not as high as with either the PBA-TS algorithm or the PBA-UCB algorithm.

**Figure 3.** Normalized average sum rate against number of ETC gates using 10 UAVs.

Figure 4 shows the achievable total average system rate against an increasing number of UAVs while keeping the number of ETC gates equal to 10. Interestingly, as the first few UAVs are added, the data rate achieved by our proposed PBA-MAB algorithms increases up to a certain point. Beyond that point, the achievable data rate decreases with any further increase in the number of deployed UAVs. The reason is that increasing the number of UAVs also increases the mutual interference between them. Our proposed PBA-MAB algorithms succeed in mitigating the interference effect, which is reflected in the reduction of the achievable data rate. Furthermore, the proposed PBA-TS algorithm still achieves the highest data rate compared to the proposed PBA-UCB algorithm and transmission using a random power value.

**Figure 4.** Normalized average sum rate against number of UAVs using 10 ETC gates.

#### *5.2. Convergence Rate*

The convergence rate is one of the most important criteria for judging the efficiency of online learning algorithms such as MAB algorithms; the faster the algorithm converges, the better the reward that can be gained within just a few attempts. Hence, this section studies the convergence of the achievable total average system rate for our proposed PBA-MAB algorithms under different settings. Figures 5 and 6 show the convergence of the achievable total average system rate using 10 ETC gates with 10 and 30 UAVs, respectively. This shows the convergence behavior of each algorithm under different network setups and different interference levels. In these figures, the horizontal axis indicates the round count. Each algorithm runs its iterative process over the rounds until it converges toward a higher data rate. The proposed PBA-TS algorithm converges faster than the PBA-UCB algorithm because it uses a Bayesian strategy over the posterior distributions of the reward. On the other hand, the PBA-UCB algorithm fluctuates during the first few rounds and takes more time to converge than the PBA-TS algorithm. Furthermore, it saturates at a lower rate than the PBA-TS algorithm by the end of the simulation rounds. From these results, it can be concluded that both proposed PBA-MAB algorithms can deal with the adversarial network setup and the selfish behavior of the UAVs. That is, every UAV learns how to select the most suitable transmission power value to enhance the overall system performance at every round, while keeping an eye on the interference level when choosing this action.

**Figure 5.** Convergence of normalized average sum rate using 10 ETC gates and 10 UAVs.

**Figure 6.** Convergence of normalized average sum rate using 10 ETC gates and 30 UAVs.

#### **6. Conclusions**

In this paper, we have investigated radio resource allocation for a CRN through a DSA system to support a disaster surveillance system using UAV wireless networks. To tackle this problem, we proposed two MAB algorithms, i.e., the PBA-UCB algorithm and the PBA-TS algorithm. The idea behind deploying MAB algorithms, as a class of RL algorithms, is their ability to solve online optimization problems with conflicting parameters that need to be jointly optimized. Since there is no information exchange between UAVs, multi-player PBA-MAB algorithms were introduced to deal with this selfish configuration. The proposed PBA-MAB algorithms show outstanding performance over transmission using random power value selection. Furthermore, the proposed algorithms showed a moderate convergence rate. The obtained results showed the capability of different MAB algorithms to deal with such problems with a high degree of randomness. Therefore, this work can open the way for applying ML algorithms and more precise MAB algorithms to handle various wireless communication problems.

**Author Contributions:** Conceptualization, A.A. and E.M.M.; methodology, A.A. and G.K.T.; software, A.A.; validation, A.A. and G.K.T.; formal analysis, A.A.; investigation, A.A.; resources, A.A.; data curation, A.A.; writing—original draft preparation, A.A.; writing—review and editing, A.A. and G.K.T.; visualization, A.A.; supervision, E.M.M., G.K.T. and K.S.; project administration, G.K.T. and K.S.; funding acquisition, G.K.T. and K.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** We would like to acknowledge the KDDI Foundation International Students Scholarship and the Telecommunications Advancement Foundation for the financial support to complete this research.

**Conflicts of Interest:** The authors declare no conflict of interest.




#### **References**


**Sang Ik Han 1,\* and Jaeuk Baek <sup>2</sup>**


**Abstract:** UAV-equipped three-dimensional (3D) wireless networks can provide a solution for the requirements of 5G communications, such as enhanced Mobile Broadband (eMBB) and massive Machine Type Communications (mMTC). In particular, the introduction of an unmanned aerial vehicle (UAV) as a relay node can improve the connectivity, extend the terrestrial base station (BS) coverage, and enhance the throughput by taking advantage of a strong air-to-ground line of sight (LOS) channel. In this paper, we consider the deployment and resource allocation of a UAV relay network (URN) to maximize the throughput of user equipment (UE) within a cell, while guaranteeing a reliable transmission to UE outside the coverage of the BS. To this end, we formulate joint UAV deployment and resource allocation problems, whose analytical solutions can hardly be obtained in general. We propose a fast and practical algorithm to provide the optimal solution for the number of transmit time slots and the UAV relay location in a sequential manner. The transmit power at the BS and UAV is determined in advance based on the availability of channel state information (CSI). Simulation results demonstrate that the proposed algorithms can significantly reduce the computational effort and complexity to determine the optimal UAV location and transmit time slots compared with an exhaustive search.

**Keywords:** UAV relay networks; UAV positioning; resource management; transmit time allocation

## **1. Introduction**

As one of the diverse emerging applications of unmanned aerial vehicles (UAVs), a UAV can be utilized as an aerial base station (BS) or an aerial relay node in three-dimensional (3D) wireless networks to satisfy the service requirements of fifth generation (5G) communication [1–4], such as enhanced Mobile Broadband (eMBB) and massive Machine Type Communications (mMTC). Due to their mobility, versatile UAVs can adjust their locations to improve the connectivity among user equipment (UEs). The easy deployment of UAVs makes it possible to construct 3D networks efficiently alongside terrestrial networks, which can extend the service coverage or accommodate a large number of devices. By introducing a strong air-to-ground line of sight (LOS) channel, UAVs can improve the capability of networks through diverse applications such as (1) emergency support where communication services are unavailable [5], (2) Internet of Things (IoT) platforms where UAVs can collect data from distributed IoT devices while saving their transmit power [6,7], and (3) terrestrial network support where UAVs can assist terrestrial BS transmission or device-to-device (D2D) transmissions [8].

In UAV networks, UAV positioning and radio resource allocation are key factors to extend cell coverage and to improve network performance. The locations of UAV BSs determine the coverage area and the number of UEs within its service area, whereas the resource allocation affects the overall performance of networks. Likewise for UAVs as relay node, its location and resource management (e.g., power control and transmit time allocation) are critical to guarantee seamless connectivity to UEs outside BS service areas without performance degradation.

**Citation:** Han, S.I.; Baek, J. Optimal UAV Deployment and Resource Management in UAV Relay Networks. *Sensors* **2021**, *21*, 6878. https://doi.org/10.3390/s21206878

Academic Editor: Margot Deruyck

Received: 15 September 2021 Accepted: 8 October 2021 Published: 16 October 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Many studies on the UAV BS scenario (UBS) [9–13] have focused on finding the optimal 3D UAV location to take advantage of strong air-to-ground LOS channels. The study [9] uses a circle packing theory to determine optimal locations of UAVs, and maximizes energy efficiency of UAVs. Another study [10] adopts an optimal transfer theory to minimize total transmit power at UAVs, and investigates the effect of UAV height on power efficiency. The authors of [11] propose a spiral algorithm that sequentially determines the locations of multiple UAVs, and their UAV deployment algorithm is shown to outperform other heuristic schemes in terms of performance and computational time. Study [12] analyzes the effect of interference between UAVs and derives the optimal height of UAV that can maximize the coverage of UAVs. The authors of [13] assume a disaster scenario and propose UAV deployment considering the coexistence of aerial and terrestrial BS. Also, recent works [6,14,15] address more complicated problems of optimizing both UAV deployment and resource allocation to improve network performance. In [6], optimal locations of UAVs, cell association and power controls are provided to maximize energy efficiency in IoT communications. The study [14] optimizes both UAV locations and cell association to minimize network delay. Study [15] achieves capacity enhancement in heterogeneous networks by optimizing UAV deployment, load balancing and traffic offload.

Compared to studies on UBS, research on UAV relay networks (URN) is in its infancy, and more effort is required to optimize both UAV deployment and resource allocation for reliable transmission. In particular, the transmit period of each relay transmission link is one of the most crucial factors in URN because it affects both the optimal UAV location and the network performance. Studies [16,17] analyze the performance of URN during two transmit time slots and find the optimal height of UAV [16] and UAV operation range numerically [17] to guarantee a reliable relay transmission. However, a relay transmission during two transmit time slots is sometimes insufficient when UE stays far from BS or requires a high level of quality of service (QoS). To deal with this, Ref. [18] adopts multiple transmit time slots (>2) and derives the maximum distance between UAVs to achieve a reliable relay transmission. However, it does not consider the optimal height of UAV, and the performance analysis may not be applicable in all circumstances due to a fixed height of UAVs.

Research on URN that optimizes both UAV deployment and resource allocation during multiple transmit time slots (>2) is rare, for two main reasons: relay transmission takes place under time-varying channels, and the joint optimization of UAV deployment and resource allocation is difficult. It is impractical to optimize UAV deployment and resource allocation reflecting channel variations within a single time slot. So, joint optimization of UAV deployment and resource allocation for multiple transmit time slots is required even though they depend on each other.

In this paper, we consider the joint optimization of UAV deployment and resource allocation in URN with no constraint on the number of overall transmit time slots. In particular, the throughput of UE within a cell is maximized while guaranteeing a reliable relay transmission to UE in its extended service area. Multiple transmit time slots are utilized in URN, but the minimum number of overall transmit time slots is used in a relay transmission for efficient resource management and to avoid degrading the performance of UEs within the original coverage due to reduced service opportunity by the BS. The formulated joint UAV deployment and transmit time allocation problem is a mixed-integer nonlinear problem, which is difficult to solve and requires huge computational effort to achieve global optimality. To tackle this, a time-varying channel condition is approximated by the channel expectation in URN. The joint optimization problem is decomposed in a sequential manner. As a solution, we propose the fast and practical UAV deployment and transmit time allocation (UDTA) algorithm, which consists of a novel time slot determination (TSD) algorithm and a UAV deployment (UD) algorithm that determine the optimal number of transmit time slots and the optimal UAV location, respectively. Transmit power at BS and UAV is determined based on channel state information (CSI). To the best of our knowledge, no prior work on URN has optimized UAV deployment and resource allocation for generalized multiple transmit time slots.

The paper is organized as follows. Section 2 describes a URN system model. In Section 3, the joint UAV deployment and transmit time allocation problem for throughput maximization of UEs is formulated. Section 4 optimizes the UAV location for given transmit time slots, and the optimal number of transmit time slots is determined in Section 5. Computational complexity of the proposed algorithm is analyzed in Section 6. Simulation results in Section 7 demonstrate the optimality and low complexity of the proposed algorithm, followed by the conclusion in Section 8.

#### **2. System Model**

We consider a downlink URN, where UAV is used as an aerial relay node to assist BS transmission in the networks, as shown in Figure 1. Two UEs are considered in URN; a UE at the cell edge, denoted as CU, and an isolated UE, denoted as IU. CU can receive a signal from BS through BS-to-CU link, whereas IU can only receive a signal by the relay transmission through BS-to-UAV-to-IU link due to severe pathloss attenuation or blockage between BS and IU. We assume that UAV operates in a half-duplex mode, and hence two transmission phases are considered. UAV receives data from BS in the first transmission phase, and forwards it to IU in the second transmission phase. Multiple time slots are allocated to each transmission phase to guarantee a reliable signal reception at both UAV and IU. Full channel state information (CSI) is assumed at BS, but not at UAV.

**Figure 1.** UAV relay network.

We assume that UAV is located at the height of *H* over the line between BS and IU to avoid unnecessary signal attenuation in relay transmissions. In addition, to investigate the effect of interference from UAV to the cell (especially, the worst case of maximum interference to the cell), CU is assumed to be located on the same line between BS and IU for analytical simplicity. Thereby, UAV and ground nodes can be projected onto a plane (i.e., *x*-*z* plane), which reduces to the line between BS and IU (i.e., *x*-axis). The location of ground node *v* (*v* ∈ {*BS*, *IU*, *CU*}) can be represented by its *x*-coordinate *xv*, and the UAV location, denoted as **U**, can be expressed as **U** = {*xU*, *H*}, where *x<sup>U</sup>* is the *x*-coordinate of UAV. *xCU* ≤ *x<sup>U</sup>* ≤ *xIU* is assumed to set a strong UAV-to-IU link. Note that the projected two-dimensional (2D) space includes the information on UAV height, so it can clearly reflect the air-to-ground LOS channel characteristics in URN.
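The projected geometry above determines the link distance and elevation angle between the UAV and any ground node. A minimal sketch, assuming coordinates in meters and an elevation angle expressed in degrees (the function name and the choice of units are hypothetical):

```python
import math

def uav_link_geometry(x_u, height, x_v):
    """Distance and elevation angle between UAV U = {x_u, height} and a ground node at x_v."""
    horizontal = abs(x_u - x_v)                # separation along the x-axis
    distance = math.hypot(horizontal, height)  # d_{v,U} in the x-z plane
    elevation_deg = math.degrees(math.atan2(height, horizontal))
    return distance, elevation_deg
```

For example, a UAV at `x_u = 300` and `height = 400` is 500 m away from a ground node at the origin, and the elevation angle reaches 90 degrees when the UAV hovers directly above the node.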

#### *2.1. Channel Modeling and Assumption*

Conventional relay systems where all nodes are located on the ground consider only a ground-to-ground link to characterize channels between nodes, whereas URN consists of not only ground nodes (i.e., BS, CU and IU), but also an aerial node (i.e., UAV). Therefore, an air-to-ground link should be considered along with a ground-to-ground link to characterize channels in URN.

For a ground-to-ground link, small-scale fading combined with pathloss-dependent large-scale fading can be used to reflect a rich-scattering environment and signal attenuation [19]. In URN, the channel between BS and CU is modeled as $h\_{BS,CU}\, d\_{BS,CU}^{-\beta\_G}$, where $h\_{BS,CU} \sim \exp(1)$ denotes the small-scale fading gain under Rayleigh fading (i.e., an exponentially distributed power gain with unit mean), and $d\_{BS,CU}^{-\beta\_G}$ denotes the pathloss-dependent large-scale fading. $d\_{BS,CU}$ is the distance between BS and CU, and $\beta\_G$ denotes the pathloss exponent of a ground-to-ground link.

For an air-to-ground link, strong signals in LOS and Non-LOS (NLOS) links dominate the channel characteristics and reduce the randomness of channel fluctuations. Hence, a small-scale fading can be neglected, and only a pathloss dependent large-scale fading in LOS and NLOS links is considered to model an air-to-ground channel in URN [8]. Ref. [20] derives LOS probability of an air-to-ground link between UAV *U* and ground node *v* as

$$\mathrm{p}\_{los}^{v} = F(\theta\_{U,v}) = \frac{1}{1 + C \exp(-B[\theta\_{U,v} - C])}, \tag{1}$$

where $\theta\_{U,v}$ is the elevation angle between UAV $U$ and ground node $v$, as shown in Figure 1. $B$ and $C$ are coefficients that reflect the characteristics of the environment, such as rural, suburban and urban areas. Compared to a LOS link, an NLOS link experiences an additional signal attenuation of $\varsigma$ [dB]. Therefore, an air-to-ground channel between UAV $U$ and ground node $v$ can be modeled as $d\_{v,U}^{-\beta\_A}(\mathrm{p}\_{los}^{v} + \varsigma\, \mathrm{p}\_{nlos}^{v})$, where $d\_{v,U}$ is the distance between UAV $U$ and ground node $v$, $\beta\_A$ is the pathloss exponent of an air-to-ground link, and $\mathrm{p}\_{los}^{v}$ and $\mathrm{p}\_{nlos}^{v}$ are the LOS and NLOS probabilities of the link between UAV $U$ and ground node $v$, with $\mathrm{p}\_{nlos}^{v} = 1 - \mathrm{p}\_{los}^{v}$.
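As a minimal sketch of Eq. (1) and the air-to-ground channel model, the following Python functions evaluate the LOS probability and the resulting channel gain. The function names, the default values of `B`, `C`, `beta_A` and `varsigma`, and the use of degrees for the elevation angle are assumptions made for illustration; `varsigma` is treated here as a linear attenuation factor between 0 and 1.

```python
import math


def los_probability(theta_deg, B=0.16, C=9.61):
    """LOS probability F(theta) of Eq. (1).

    theta_deg: elevation angle in degrees (assumed unit).
    B, C: environment coefficients (illustrative defaults, not from the text).
    """
    return 1.0 / (1.0 + C * math.exp(-B * (theta_deg - C)))


def air_to_ground_gain(d, theta_deg, beta_A=2.0, varsigma=0.1, B=0.16, C=9.61):
    """Channel gain d^(-beta_A) * (p_los + varsigma * p_nlos)."""
    p_los = los_probability(theta_deg, B, C)
    p_nlos = 1.0 - p_los
    return d ** (-beta_A) * (p_los + varsigma * p_nlos)


if __name__ == "__main__":
    # The LOS probability grows with the elevation angle.
    for theta in (10.0, 45.0, 80.0):
        print(theta, los_probability(theta), air_to_ground_gain(500.0, theta))
```

Since $\mathrm{p}\_{nlos}^{v} = 1 - \mathrm{p}\_{los}^{v}$, the bracketed term equals $\mathrm{p}\_{los}^{v}(1 - \varsigma) + \varsigma$, which is the form used in the SINR expressions.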

We assume that the channel condition between BS and CU is better than that between BS and UAV (i.e., $h\_{BS,CU}\, d\_{BS,CU}^{-\beta\_G} > d\_{BS,U}^{-\beta\_A}(\mathrm{p}\_{los}^{BS} + \varsigma\, \mathrm{p}\_{nlos}^{BS})$). The distance between BS and UAV is much longer than that between BS and CU (i.e., $d\_{BS,U} \ge d\_{BS,CU}$), because UAV should be located close to IU for a reliable UAV-to-IU link. Owing to the long distance between BS and UAV, pathloss attenuation dominates the channel condition of the LOS link between BS and UAV. Hence, the channel condition between BS and UAV is worse than that between BS and CU [17].

#### *2.2. Transmission Schemes in URN*

Based on the result of [17] that a non-orthogonal transmission at BS outperforms an orthogonal transmission in URN in terms of overall throughput of UEs in the cell, we adopt the non-orthogonal transmission at BS in the first transmission phase, where BS transmits a superposition-coded signal to CU and UAV simultaneously [21]. On the other hand, in the second transmission phase, the orthogonal transmission is used at BS and UAV, where BS transmits a signal to CU, and UAV forwards the data received from BS in the first transmission phase to IU. In the rest of this paper, the non-orthogonal transmission phase (NOTP) and the orthogonal transmission phase (OTP) denote the first and second transmission phases, respectively.

#### *2.3. Power Control Strategy and Overall Transmit Time Slots*

A pairwise power control [22] is adopted at BS during the entire transmit time slots of URN to guarantee a required QoS in the cell while supporting a relay transmission to IU. In NOTP, BS allocates transmit power $P\_{BS,CU} = \rho\, h\_{BS,CU}^{-1}\, d\_{BS,CU}^{\beta\_G}$ to the BS-to-CU link to guarantee a received signal power of $\rho$ at CU, and the remaining transmit power at BS, $P\_{BS,U}$ (i.e., $P\_{BS,U} = P\_{BS}^{\max} - P\_{BS,CU}$), is allocated to the BS-to-UAV link, where $P\_{BS}^{\max}$ is the maximum transmit power at BS. As full CSI is available at BS, BS can determine $P\_{BS,CU}$ depending on the channel condition of the BS-to-CU link, and then $P\_{BS,U}$ can be determined. Similarly, in OTP, BS applies the pairwise power control to guarantee the received signal power of $\rho$ at CU. On the other hand, since UAV has no CSI, UAV uses its maximum transmit power, $P\_U^{\max}$, to provide seamless communication service to IU.
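The pairwise power control described above can be sketched as follows. The function name, argument names and the default ground-to-ground pathloss exponent are hypothetical choices for this example; the case in which the BS-to-CU allocation exceeds the maximum BS power is not covered in the text and is simply left to the caller here.

```python
def pairwise_power_control(h_bs_cu, d_bs_cu, rho, p_bs_max, beta_G=3.0):
    """Split the BS transmit power between the BS-to-CU and BS-to-UAV links.

    Returns (P_BS_CU, P_BS_U). P_BS_U is negative when the BS-to-CU channel is
    too weak to reach the target received power rho within p_bs_max; handling
    that case is an open choice not specified in the text.
    """
    # Channel inversion: received power at CU equals rho.
    p_bs_cu = rho * d_bs_cu ** beta_G / h_bs_cu
    # Remaining power goes to the BS-to-UAV link.
    p_bs_u = p_bs_max - p_bs_cu
    return p_bs_cu, p_bs_u


if __name__ == "__main__":
    p_cu, p_u = pairwise_power_control(h_bs_cu=0.5, d_bs_cu=100.0,
                                       rho=1e-9, p_bs_max=1.0)
    print(p_cu, p_u)
```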

As illustrated in Figure 1, URN consists of overall *n* = *Kno* + *K<sup>o</sup>* time slots, where NOTP is composed of *Kno* time slots with a time index *kno* ∈ **Kno** = {1, . . . , *Kno*} and OTP has *K<sup>o</sup>* time slots with a time index *k<sup>o</sup>* ∈ **K<sup>o</sup>** = {*Kno* + 1, . . . , *Kno* + *Ko*}. (*Kno*, *Ko*) denotes a pair of time slots for each transmission phase.

#### **3. URN: UAV Relay Network**

#### *3.1. Throughput of CU and IU*

At a time slot $k\_{no}$ ($\forall k\_{no} \in \mathbf{K\_{no}}$) in NOTP, BS transmits a superposition-coded signal to CU and UAV simultaneously with transmit power $P\_{BS,CU}(k\_{no}) = \rho\, h\_{BS,CU}^{-1}(k\_{no})\, d\_{BS,CU}^{\beta\_G}$ and $P\_{BS,U}(k\_{no}) = P\_{BS}^{\max} - P\_{BS,CU}(k\_{no})$, as explained in Sections 2.2 and 2.3. CU can perform successive interference cancellation (SIC) [21] to eliminate the interference from the BS-to-UAV link owing to the channel assumption in Section 2.1 (i.e., $h\_{BS,CU}(k\_{no})\, d\_{BS,CU}^{-\beta\_G} \ge d\_{U,BS}^{-\beta\_A}(\mathrm{p}\_{los}^{BS} + \varsigma\, \mathrm{p}\_{nlos}^{BS})$). On the other hand, UAV cannot eliminate the interference from the BS-to-CU link. Hence, the corresponding signal-to-interference-plus-noise ratios (SINRs) at CU, $\psi\_{CU}^{no}(k\_{no})$, and UAV, $\psi\_U(k\_{no})$, at a time slot $k\_{no}$ in NOTP can be expressed as

$$
\psi\_{\rm CU}^{\rm no}(k\_{\rm no}) = \frac{\rho}{\sigma\_{\rm CU}^2} \,\,\,\,\tag{2}
$$

$$\psi\_{U}(k\_{no}) = \frac{P\_{BS,U}(k\_{no})\, d\_{BS,U}^{-\beta\_A} (\mathrm{p}\_{los}^{BS} (1 - \varsigma) + \varsigma)}{P\_{BS,CU}(k\_{no})\, d\_{BS,U}^{-\beta\_A} (\mathrm{p}\_{los}^{BS} (1 - \varsigma) + \varsigma) + \sigma\_{U}^2}, \tag{3}$$

where $d\_{BS,U}^{-\beta\_A}(\mathrm{p}\_{los}^{BS}(1 - \varsigma) + \varsigma)$ in (3) represents the channel of the BS-to-UAV link with LOS probability $\mathrm{p}\_{los}^{BS}$, and $\sigma\_i^2$ indicates the variance of additive white Gaussian noise (AWGN) at node $i$.

At a time slot $k\_o$ ($\forall k\_o \in \mathbf{K\_o}$) in OTP, BS transmits a signal to CU, and UAV relays the data received from BS in NOTP to IU. Therefore, the SINRs at CU, $\psi\_{CU}^{o}(k\_o)$, and IU, $\psi\_{IU}(k\_o)$, at a time slot $k\_o$ in OTP can be given by

$$
\psi\_{CU}^{o}(k\_o) = \frac{\rho}{I\_{U,CU} + \sigma\_{CU}^2}, \tag{4}
$$

$$\psi\_{IU}(k\_o) = \frac{P\_{U}^{\max}\, d\_{U,IU}^{-\beta\_A} (\mathrm{p}\_{los}^{IU}(1-\varsigma) + \varsigma)}{\sigma\_{IU}^2}, \tag{5}$$

where $I\_{U,CU} \triangleq P\_U^{\max}\, d\_{U,CU}^{-\beta\_A}(\mathrm{p}\_{los}^{CU}(1 - \varsigma) + \varsigma)$ in (4) represents the interference from UAV to CU. IU does not receive any interference from BS in OTP because of severe pathloss attenuation in the BS-to-IU link. $\mathrm{p}\_{los}^{CU}$ and $\mathrm{p}\_{los}^{IU}$ are the LOS probabilities of the UAV-to-CU link and the UAV-to-IU link, respectively. Note that we assume that adjacent cells use frequency bands different from that of the cell of interest to avoid inter-cell interference, and that other interference received at CU is negligible compared with the dominant interference from the link between UAV and IU in OTP.

From (2), (4) and (5), the SINR at CU in both transmission phases and that at IU in OTP are time-invariant (i.e., $\psi\_{CU}^{no}(k\_{no}) = \psi\_{CU}^{no}$, $\psi\_{CU}^{o}(k\_o) = \psi\_{CU}^{o}$ and $\psi\_{IU}(k\_o) = \psi\_{IU}$, $\forall k\_{no}, k\_o$) owing to the pairwise power control and the channel characteristics of the air-to-ground LOS link. However, the SINR at UAV in NOTP (i.e., (3)) is time-varying for each time slot $k\_{no}$ because $P\_{BS,U}(k\_{no})$ and $P\_{BS,CU}(k\_{no})$ vary with the channel condition of the BS-to-CU link.
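The SINR expressions (2)–(5) can be evaluated with a short sketch such as the one below. The function and argument names are hypothetical; the air-to-ground channel gains (`g_bs_u`, `g_u_cu`, `g_u_iu`) are passed in as precomputed numbers, and the default ground-to-ground pathloss exponent is an illustrative assumption.

```python
def sinr_notp(h, rho, p_bs_max, d_bs_cu, g_bs_u, noise_cu, noise_u, beta_G=3.0):
    """SINRs in NOTP: Eq. (2) at CU and Eq. (3) at UAV for one time slot.

    h: small-scale fading gain of the BS-to-CU link in this slot.
    g_bs_u: air-to-ground gain of the BS-to-UAV link.
    """
    p_cu = rho * d_bs_cu ** beta_G / h      # pairwise power control
    p_u = p_bs_max - p_cu                   # remaining power for the UAV link
    psi_cu = rho / noise_cu                 # Eq. (2), independent of h
    psi_u = p_u * g_bs_u / (p_cu * g_bs_u + noise_u)   # Eq. (3)
    return psi_cu, psi_u


def sinr_otp(rho, p_u_max, g_u_cu, g_u_iu, noise_cu, noise_iu):
    """SINRs in OTP: Eq. (4) at CU and Eq. (5) at IU."""
    interference = p_u_max * g_u_cu         # I_{U,CU}
    psi_cu = rho / (interference + noise_cu)
    psi_iu = p_u_max * g_u_iu / noise_iu
    return psi_cu, psi_iu


if __name__ == "__main__":
    # Two fading realizations: the CU SINR stays fixed, the UAV SINR changes.
    for h in (0.5, 2.0):
        print(h, sinr_notp(h, 1e-9, 1.0, 100.0, 1e-6, 1e-10, 1e-10))
```

Running the example with two fading values illustrates the statement above: the SINR at CU does not change with `h`, whereas the SINR at UAV does.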

Based on the Shannon capacity theorem [19], the amount of received data at CU, $r\_{CU}^{\Sigma}(|\mathbf{K\_{no}}|)$, and at UAV, $r\_{U}^{\Sigma}(|\mathbf{K\_{no}}|)$, in NOTP ($\forall k\_{no} \in \mathbf{K\_{no}}$) can be obtained using (2) and (3) as

$$r\_{CU}^{\Sigma}(|\mathbf{K\_{no}}|) = \sum\_{k\_{no}=1}^{K\_{no}} f(\psi\_{CU}^{no}) = f(\psi\_{CU}^{no})\, K\_{no}, \tag{6}$$

$$r\_{U}^{\Sigma}(|\mathbf{K\_{no}}|) = \sum\_{k\_{no}=1}^{K\_{no}} f(\psi\_{U}(k\_{no})), \tag{7}$$

respectively, where $f(x) \triangleq \log\_2(1 + x)$. (6) follows because each time slot has a unit length and $\psi\_{CU}^{no}$ is time-invariant.

Similarly, the amount of received data at CU, $r\_{CU}^{\Sigma}(|\mathbf{K\_{o}}|)$, and at IU, $r\_{IU}^{\Sigma}(|\mathbf{K\_{o}}|)$, in OTP ($\forall k\_o \in \mathbf{K\_{o}}$) can be obtained using (4) and (5) as

$$r\_{CU}^{\Sigma}(|\mathbf{K\_{o}}|) = \sum\_{k\_{o}=1}^{K\_{o}} f(\psi\_{CU}^{o}) = f(\psi\_{CU}^{o})\, K\_{o}, \tag{8}$$

$$r\_{IU}^{\Sigma}(|\mathbf{K\_{o}}|) = \sum\_{k\_{o}=1}^{K\_{o}} f(\psi\_{IU}) = f(\psi\_{IU})\, K\_{o}. \tag{9}$$

For the overall time slots *n*, the average data rate of CU, *RCU* [bps/Hz], can be defined by (6) and (8) as

$$R\_{CU} = \frac{1}{n} \left(r\_{CU}^{\Sigma}(|\mathbf{K\_{no}}|) + r\_{CU}^{\Sigma}(|\mathbf{K\_{o}}|)\right), \tag{10}$$

and the total amount of received data at IU via relay transmission, *DIU* [bit/Hz], can be obtained by (7) and (9) as

$$D\_{IU} = \min\left(r\_{U}^{\Sigma}(|\mathbf{K\_{no}}|),\ r\_{IU}^{\Sigma}(|\mathbf{K\_{o}}|)\right), \tag{11}$$

where (11) follows because the amount of data transmitted through the forwarding link (i.e., UAV-to-IU link) cannot exceed the amount of data received at UAV via the backhaul link (i.e., BS-to-UAV link) in a relay transmission.
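As a small worked sketch of (6)–(11), the following Python functions compute the average data rate of CU and the amount of data relayed to IU from given SINR values. The function names are hypothetical, and a base-2 logarithm is assumed for $f(\cdot)$ so that the results are in bit/Hz.

```python
import math


def f(x):
    """Spectral efficiency per unit-length slot, log2(1 + x) (base 2 assumed)."""
    return math.log2(1.0 + x)


def average_rate_cu(psi_cu_no, psi_cu_o, K_no, K_o):
    """Average data rate of CU over n = K_no + K_o slots, Eqs. (6), (8), (10)."""
    n = K_no + K_o
    return (f(psi_cu_no) * K_no + f(psi_cu_o) * K_o) / n


def relayed_data_iu(psi_u_per_slot, psi_iu, K_o):
    """Data delivered to IU, Eq. (11): minimum of backhaul and forwarding data.

    psi_u_per_slot: SINR at UAV for each NOTP slot (length K_no), Eq. (7).
    """
    r_u = sum(f(p) for p in psi_u_per_slot)   # received at UAV in NOTP
    r_iu = f(psi_iu) * K_o                    # deliverable to IU in OTP, Eq. (9)
    return min(r_u, r_iu)


if __name__ == "__main__":
    print(average_rate_cu(3.0, 1.0, K_no=2, K_o=2))
    print(relayed_data_iu([3.0, 3.0], 1.0, K_o=3))
```

In the second call the backhaul carries 4 bit/Hz but the forwarding link only 3 bit/Hz, so the forwarding link is the bottleneck.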

#### *3.2. Problem Formulation: JUDTAP*

The throughput maximization of UEs in URN is equivalent to maximizing *RCU* while delivering the required amount of data to IU, *Dreq*, during the minimum number of overall time slots *n* with respect to UAV location **U** = {*xU*, *H*} and transmit time slots **K** = {*Kno*, *Ko*}. Hence, the multi-objective optimization problem, denoted as joint UAV deployment and transmit time allocation problem (**JUDTAP**), can be formulated as (12)

$$\textbf{JUDTAP: } \max\_{\textbf{U}, \textbf{K}} \tag{12}$$

$$\text{s.t.} \quad D\_{IU} \ge D\_{req}, \tag{12a}$$

$$n = K\_{no} + K\_{o}, \tag{12b}$$

$$K\_{no} \ge 1, \quad K\_{o} \ge 1, \tag{12c}$$

$$k\_{no} \in \mathbf{K\_{no}}, \quad k\_{o} \in \mathbf{K\_{o}}, \tag{12d}$$

$$x\_{CU} \le x\_{U} \le x\_{IU}, \quad H \ge 0, \tag{12e}$$

where the multi-objective function implies that the overall number of time slots *n* should be minimized before the average data rate *RCU* is maximized, as explained in Section 1. (12a) gives the requirement on the amount of data received at IU. (12b)–(12d) represent the constraints on the number of time slots in URN, and (12e) indicates the range in which UAV can be deployed.

The **JUDTAP** is a mixed-integer nonlinear programming problem [23], and its combinatorial nature imposes a heavy computational load in finding a global optimal solution (i.e., **Uopt**, **Kopt**). In addition, the mutual influence between the UAV location and the transmit periods in both transmission phases makes the problem more difficult to solve. For example, to maximize *RCU*, UAV should be located close to IU to reduce the interference from UAV to CU (i.e., *IU*,*CU* in (4)). However, this may increase *Kno*, and eventually *n*, to guarantee the data transmission over the BS-to-UAV link (i.e., to satisfy (12a)). Therefore, the **JUDTAP** cannot be solved by optimizing **U** and **K** independently, because they are closely coupled.

One approach to solving the **JUDTAP** is to update **U** and **K** iteratively. However, such a procedure is not practical and cannot guarantee convergence to a global optimal solution. Therefore, in this paper, we propose a fast and practical algorithm that finds **Uopt** and **Kopt** in a sequential manner. The details on each step will be presented in Sections 4 and 5.

#### *3.3. Analysis on Relay Transmission during Multiple Time Slots*

As explained in the previous section, the relay transmission during multiple transmit time slots makes it difficult to analyze the constraint on the BS-to-UAV link in (12a) (i.e., $r\_{U}^{\Sigma}(|\mathbf{K\_{no}}|) \ge D\_{req}$). More specifically, the time-varying small-scale fading in the BS-to-CU link changes $\psi\_U(k\_{no})$ in (3) and $r\_{U}^{\Sigma}(|\mathbf{K\_{no}}|)$ in (7) for each time slot $k\_{no}$, so it is challenging to find the optimal $\mathbf{U}$ and $\mathbf{K}$ that satisfy $r\_{U}^{\Sigma}(|\mathbf{K\_{no}}|) \ge D\_{req}$. To cope with this issue, we introduce the expected channel model for the ground-to-ground link (i.e., the BS-to-CU link in URN), because the effect of random fluctuation by small-scale fading over multiple time slots is negligible and it is impractical to adjust the UAV location within the short period of each time slot. Therefore, $\mathbb{E}[h\_{BS,CU}]\, d\_{BS,CU}^{-\beta\_G}$ is used to model the channel condition of the BS-to-CU link, where $\mathbb{E}[h\_{BS,CU}]$ represents the expectation of the small-scale fading $h\_{BS,CU}(k)$, $\forall k \in \mathbf{K\_{no}} \cup \mathbf{K\_{o}}$. The expected channel model affects the pairwise power control at BS and the SINR at UAV in (3) as follows.

The pairwise power control at BS is simplified to a fixed power control. In NOTP, BS allocates $\bar{P}\_{BS,CU} = \rho\, \mathbb{E}[h\_{BS,CU}]^{-1} d\_{BS,CU}^{\beta\_G}$ and $\bar{P}\_{BS,U} = P\_{BS}^{\max} - \bar{P}\_{BS,CU}$ to the BS-to-CU link and the BS-to-UAV link, respectively. Similarly, in OTP, BS allocates $\bar{P}\_{BS,CU} = \rho\, \mathbb{E}[h\_{BS,CU}]^{-1} d\_{BS,CU}^{\beta\_G}$ to the BS-to-CU link.

The time-varying SINR at UAV in NOTP (i.e., $\psi\_U(k\_{no})$, $\forall k\_{no}$ in (3)) can be replaced by a time-invariant $\bar{\psi}\_U$ owing to $\bar{P}\_{BS,CU}$ and $\bar{P}\_{BS,U}$. The amount of received data at UAV in (7) can then be simplified to $\bar{r}\_{U}^{\Sigma}(|\mathbf{K\_{no}}|) = f(\bar{\psi}\_U)\, K\_{no}$.

Therefore, the constraint (12a) can be expressed as

$$\begin{split} D\_{IU} &\ge D\_{req} \\ &\Leftrightarrow \frac{\bar{P}\_{BS,U}\, d\_{BS,U}^{-\beta\_A} (\mathrm{p}\_{los}^{BS} (1 - \varsigma) + \varsigma)}{\bar{P}\_{BS,CU}\, d\_{BS,U}^{-\beta\_A} (\mathrm{p}\_{los}^{BS} (1 - \varsigma) + \varsigma) + \sigma\_{U}^2} \ge 2^{\frac{D\_{req}}{K\_{no}}} - 1, \end{split} \tag{13}$$

$$\text{and}\quad P\_{U}^{\max}\frac{d\_{U,IU}^{-\beta\_A}}{\sigma\_{IU}^2}(\mathrm{p}\_{los}^{IU}(1-\varsigma)+\varsigma) \ge 2^{\frac{D\_{req}}{K\_o}} - 1, \tag{14}$$

where (13) and (14) are obtained from the time-invariant $\bar{r}\_{U}^{\Sigma}(|\mathbf{K\_{no}}|)$ and (9), respectively.
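Constraints (13) and (14) can also be read as lower bounds on the slot counts: an SINR of $\psi$ satisfies $\psi \ge 2^{D\_{req}/K} - 1$ exactly when $K \ge D\_{req} / \log\_2(1 + \psi)$. The sketch below illustrates this under the expected channel model. The function names and default values are hypothetical, and the unit mean of the fading gain follows from the $\exp(1)$ assumption.

```python
import math


def expected_sinr_uav(rho, d_bs_cu, p_bs_max, g_bs_u, noise_u,
                      beta_G=3.0, mean_h=1.0):
    """Time-invariant SINR at UAV under fixed power control.

    mean_h: E[h_BS,CU]; equals 1 for a unit-mean exponential fading gain.
    g_bs_u: air-to-ground gain of the BS-to-UAV link (precomputed).
    """
    p_bar_cu = rho * d_bs_cu ** beta_G / mean_h
    p_bar_u = p_bs_max - p_bar_cu
    return p_bar_u * g_bs_u / (p_bar_cu * g_bs_u + noise_u)


def min_slots(d_req, sinr):
    """Smallest integer K such that K * log2(1 + sinr) >= d_req."""
    return math.ceil(d_req / math.log2(1.0 + sinr))


if __name__ == "__main__":
    psi_bar_u = expected_sinr_uav(1e-9, 100.0, 1.0, 1e-6, 1e-10)
    print(psi_bar_u, min_slots(10.0, psi_bar_u))
```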

After rearranging the above two inequalities, (13) and (14) can be expressed with respect to the LOS probability as

$$F(\theta\_{U,BS}) \ge X\_{BS}(d\_{U,BS}, K\_{no}), \tag{15}$$

$$F(\theta\_{U,IU}) \ge X\_{IU}(d\_{U,IU}, K\_o), \tag{16}$$

where

$$X\_{BS}(d\_{U,BS}, K\_{no}) \triangleq \frac{(2^{\frac{D\_{req}}{K\_{no}}} - 1) \frac{\sigma\_{U}^2}{d\_{U,BS}^{-\beta\_A}(1 - \varsigma)}}{P\_{BS}^{\max} - 2^{\frac{D\_{req}}{K\_{no}}} \bar{P}\_{BS,CU}} - \frac{\varsigma}{1 - \varsigma}, \tag{17}$$

$$X\_{IU}(d\_{U,IU}, K\_o) \triangleq \frac{(2^{\frac{D\_{req}}{K\_o}} - 1) \frac{\sigma\_{IU}^2}{d\_{U,IU}^{-\beta\_A}(1-\varsigma)}}{P\_{U}^{\max}} - \frac{\varsigma}{1-\varsigma}. \tag{18}$$

Based on (15)–(18), a sequential algorithm for Steps 1 and 2 is derived in the following sections.
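The thresholds (17) and (18) can be sketched in Python as below. The function and argument names are hypothetical, and returning infinity when the denominator of (17) is not positive is an assumption of this sketch: in that case no LOS probability can satisfy (13), so the location is treated as infeasible.

```python
import math


def x_bs(d_u_bs, K_no, D_req, p_bs_max, p_bar_cu, noise_u, beta_A, varsigma):
    """Threshold X_BS(d_{U,BS}, K_no) of Eq. (17)."""
    t = 2.0 ** (D_req / K_no)
    denom = p_bs_max - t * p_bar_cu
    if denom <= 0.0:
        # Assumption of this sketch: (13) cannot hold for any LOS probability.
        return math.inf
    return ((t - 1.0) * noise_u * d_u_bs ** beta_A
            / ((1.0 - varsigma) * denom)) - varsigma / (1.0 - varsigma)


def x_iu(d_u_iu, K_o, D_req, p_u_max, noise_iu, beta_A, varsigma):
    """Threshold X_IU(d_{U,IU}, K_o) of Eq. (18)."""
    t = 2.0 ** (D_req / K_o)
    return ((t - 1.0) * noise_iu * d_u_iu ** beta_A
            / ((1.0 - varsigma) * p_u_max)) - varsigma / (1.0 - varsigma)


if __name__ == "__main__":
    # Both thresholds grow with distance, so farther locations need a higher
    # LOS probability to remain feasible.
    for d in (500.0, 20000.0, 40000.0):
        print(d, x_bs(d, 5, 10.0, 1.0, 0.01, 1e-10, 2.0, 0.1),
              x_iu(d, 5, 10.0, 0.1, 1e-10, 2.0, 0.1))
```

A location is feasible for a link when the LOS probability $F(\theta)$ of that link is at least the corresponding threshold, as stated in (15) and (16).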

#### **4. UAV Deployment**

In this section, we investigate the effect of UAV location on the network performance, and derive the optimal UAV location, $\mathbf{U}\_{opt}$. For this purpose, we assume that the transmit time allocation (i.e., $(K\_{no}, K\_o)$) is given and guarantees the existence of UAV locations that can provide a reliable relay transmission to IU. We use the distance and elevation angle in the $x$-$z$ plane to reflect the channel characteristics between UAV and ground nodes in Section 2.1. $\mathbf{\Theta} = \{\theta\_{U,v} \mid v \in \{BS, CU, IU\}\}$ and $\mathbf{D} = \{d\_{i,j} \mid i, j \in \{BS, U, IU, CU\}\}$ represent the sets of elevation angles and distances, respectively, and the UAV location $\mathbf{U} = \{x\_U, H\}$ in (12) can be expressed as $\mathbf{U} = \{d\_{U,v}, \theta\_{U,v}\}$.

#### *4.1. UAV Deployment Problem*

Based on (2) and (4), the maximization of multi-objective function for a given time allocation in **JUDTAP** is equivalent to the minimization of interference from UAV to CU (i.e., minimization of *IU*,*CU* in (4)). The constraint (12a) can be replaced by (15) and (16), and the constraints (12b)–(12d) can be omitted because the transmit time allocation is given.

Therefore, for a given time allocation, **JUDTAP** reduces to UAV deployment problem (**UDP**), which can be formulated as (19) with respect to **Θ** and **D**.

$$\textbf{UDP: } \min\_{\mathbf{\Theta}, \mathbf{D}} \quad d\_{U,CU}^{-\beta\_A} (F(\theta\_{U,CU})(1-\varsigma) + \varsigma) \tag{19}$$

$$\text{s.t.} \quad F(\theta\_{U,BS}) \ge X\_{BS}(d\_{U,BS}|K\_{no}), \tag{19a}$$

$$F(\theta\_{U,IU}) \ge X\_{IU}(d\_{U,IU}|K\_o), \tag{19b}$$

$$d\_{BS,CU} \le d\_{U,BS} \cos(\theta\_{U,BS}) \le d\_{BS,IU}. \tag{19c}$$

The objective function is given by $I\_{U,CU}/P\_U^{\max}$. (19a) and (19b) are the constraints of the BS-to-UAV link and the UAV-to-IU link, respectively, and are derived from (15)–(18) by replacing $X\_{BS}(d\_{U,BS}, K\_{no})$ and $X\_{IU}(d\_{U,IU}, K\_o)$ with $X\_{BS}(d\_{U,BS}|K\_{no})$ and $X\_{IU}(d\_{U,IU}|K\_o)$ because the time allocation is given. (19c) represents the range in which UAV can be deployed (i.e., (12e) of **JUDTAP**).
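To make the geometry behind (19) concrete, the sketch below converts a UAV position in the *x*-*z* plane into the distance and elevation angle of each link, evaluates the objective of (19), and checks the operation range (19c). All function names, the coordinate convention (BS, CU and IU on the *x*-axis, angles in degrees) and the default coefficients are assumptions made for this example.

```python
import math


def link_geometry(x_u, H, x_v):
    """Distance and elevation angle (degrees) between UAV (x_u, H) and node x_v."""
    dx = abs(x_u - x_v)
    return math.hypot(dx, H), math.degrees(math.atan2(H, dx))


def los_probability(theta_deg, B=0.16, C=9.61):
    """LOS probability F(theta) of Eq. (1); B and C are illustrative values."""
    return 1.0 / (1.0 + C * math.exp(-B * (theta_deg - C)))


def udp_objective(x_u, H, x_cu, beta_A=2.0, varsigma=0.1):
    """Objective of (19): normalized UAV-to-CU interference I_{U,CU} / P_U^max."""
    d, theta = link_geometry(x_u, H, x_cu)
    return d ** (-beta_A) * (los_probability(theta) * (1.0 - varsigma) + varsigma)


def in_operation_range(x_u, H, x_bs, x_cu, x_iu):
    """Constraint (19c): the horizontal projection of the BS-to-UAV distance
    lies between d_{BS,CU} and d_{BS,IU}."""
    d, theta = link_geometry(x_u, H, x_bs)
    projected = d * math.cos(math.radians(theta))
    return (x_cu - x_bs) <= projected <= (x_iu - x_bs)


if __name__ == "__main__":
    # Moving the UAV away from CU (towards IU) lowers the objective.
    for x_u in (400.0, 650.0, 900.0):
        print(x_u, udp_objective(x_u, 100.0, 300.0),
              in_operation_range(x_u, 100.0, 0.0, 300.0, 1000.0))
```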

The optimal solution of **UDP** (i.e., **Θopt** , **Dopt**) determines the optimal UAV location, **Uopt**, for a given time allocation. However, all the elements of **Θ** and **D** should be considered simultaneously to find **Uopt**, so no closed-form solution to **UDP** exists. Therefore, we propose UAV deployment (UD) algorithm, which updates the UAV location iteratively to reach **Uopt** based on search areas and directions. In the following section, we define search areas and directions for a given UAV location, and investigate them to update the UAV location toward **Uopt** .

#### *4.2. Search Areas and Directions*

For a given location of UAV $U$, we define *search areas* and *directions* using lines and circles as shown in Figure 2, where UAV and ground nodes are placed on the $x$-$z$ plane as explained in Section 2. UAVs on a line have the same elevation angle $\theta\_{U,v}$ from ground node $v$ (i.e., $v \in \{CU, BS, IU\}$), while those on a circle have an equal distance $d\_{U,v}$ from ground node $v$ to UAV. The line and circle inside the dashed rectangle (i.e., UAV operation range) in Figure 2b define the search areas considering the interference from UAV to CU, while those in Figure 2c,d represent the search directions based on each relay transmission link. All the search areas and directions for a given UAV location are integrated in Figure 2a. The search areas and directions change when a given UAV location is updated. Therefore, we investigate the search areas and directions for the given location of UAV $U$, $\mathbf{U}\_{U}$, to find the updated location of UAV $U'$, $\mathbf{U}\_{U'}$.

**Figure 2.** Search areas and directions for updating $\mathbf{U}\_{U}$ toward $\mathbf{U}\_{U'}$. (**a**) The integrated search areas and directions for $\mathbf{U}\_{U}$. (**b**) Four search areas for $\mathbf{U}\_{U'}$ based on the objective function of (19) (i.e., link between UAV and CU). (**c**) Four search directions for $\mathbf{U}\_{U'}$ based on the constraint (19a) (i.e., link between BS and UAV). (**d**) Four search directions for $\mathbf{U}\_{U'}$ based on the constraint (19b) (i.e., link between UAV and IU).

#### 4.2.1. Search Area

The search areas ➀–➃ in Figure 2b are divided by the line and circle based on the CU location and $\mathbf{U}\_{U}$. UAV $U$ should move towards the area where the interference from UAV $U'$ to CU (i.e., the objective function in (19)) decreases. When UAV $U$ moves into Area ➀, both $F(\theta\_{U,CU})$ and $d\_{U,CU}^{-\beta\_A}$ in (19) decrease because of smaller $\theta\_{U,CU}$ and longer $d\_{U,CU}$. Any UAV location within Area ➀ always reduces the UAV-to-CU interference; hence, Area ➀ is a potential search area for $\mathbf{U}\_{U'}$. On the other hand, all UAV locations in Area ➃ increase both $F(\theta\_{U,CU})$ and $d\_{U,CU}^{-\beta\_A}$. Therefore, they cannot decrease the objective function in (19), and Area ➃ is excluded from the potential search areas.

The search areas ➁ and ➂ are uncertain with respect to the interference from UAV $U'$ to CU. UAV locations within Area ➁ decrease $d\_{U,CU}^{-\beta\_A}$ but increase $F(\theta\_{U,CU})$, while those within Area ➂ do the opposite. However, when UAV $U'$ is within Area ➁, the pathloss attenuation dominates the LOS connection in the UAV-to-CU link. More specifically, $F(\theta\_{U',CU})$ is close to one because of large $\theta\_{U',CU}$ [20], but $d\_{U',CU}$ can be sufficiently large so that $d\_{U',CU}^{-\beta\_A}$ becomes the dominant factor in the objective function of (19). Therefore, UAV locations within Area ➁ can reduce the UAV-to-CU interference compared to the given UAV location $\mathbf{U}\_{U}$. On the other hand, UAV locations within Area ➂ cause more severe UAV-to-CU interference because of their proximity to CU; hence, Area ➂ cannot be a potential search area.

• Observation 1: The objective function in (19) can be decreased by moving UAV into Area ➀ or ➁ in Figure 2b.

Observation 1 is directly applicable to a feasible UAV location (one that satisfies the constraints (19a) and (19b)) to reduce the UAV-to-CU interference. On the other hand, when the UAV location cannot satisfy the constraints (i.e., an infeasible UAV location), Observation 1 and the channel condition of the relay transmission links should be considered simultaneously to find a feasible UAV location and to reduce the objective function in (19). In the following subsection, we examine the search directions to satisfy the constraints and decrease the objective function in (19) simultaneously.

#### 4.2.2. Search Directions

Although the potential search areas that can be used to find $\mathbf{U}\_{U'}$ from $\mathbf{U}\_{U}$ are described in Observation 1, there is no clue about where $\mathbf{U}\_{U'}$ lies within the potential area. Hence, the points on a line or a circle within the potential search area are used to determine $\mathbf{U}\_{U'}$. In particular, the search directions in Figure 2c,d are examined to move UAV $U$ to a feasible UAV location $\mathbf{U}\_{U'}$ when $\mathbf{U}\_{U}$ cannot satisfy the constraint (19a) or (19b).

From the constraints, there are four cases (i.e., $C\_1$, $\bar{C}\_1$, $C\_2$ and $\bar{C}\_2$) to be considered at $\mathbf{U}\_{U}$. $C\_1$ and $C\_2$ indicate that $\mathbf{U}\_{U}$ satisfies (19a) and (19b), respectively, while $\bar{C}\_1$ and $\bar{C}\_2$ indicate that it does not. Each case follows a different search direction, in Figure 2c for $C\_1$ and $\bar{C}\_1$ and in Figure 2d for $C\_2$ and $\bar{C}\_2$.

$C\_1$ and $C\_2$ indicate that $\mathbf{U}\_{U'}$ can be found based on Observation 1 to decrease the objective function in (19). In the case of $C\_1$, Direction ➀ or ➁ in Figure 2c should be selected because they are within the potential search areas ➀ and ➁ in Figure 2b (see Figure 2a). For the same reason, Direction ➀, ➁ or ➃ in Figure 2d should be selected in the case of $C\_2$.

$\bar{C}\_1$ and $\bar{C}\_2$ indicate that $\mathbf{U}\_{U}$ cannot satisfy the constraints (19a) and (19b) because of poor channel conditions in the BS-to-UAV and UAV-to-IU links, resulting in $F(\theta\_{U,BS}) < X\_{BS}(d\_{U,BS}|K\_{no})$ and $F(\theta\_{U,IU}) < X\_{IU}(d\_{U,IU}|K\_o)$, respectively. Therefore, to find a feasible UAV location $\mathbf{U}\_{U'}$, $\theta\_{U,v}$ should be increased along a circle or $d\_{U,v}$ should be decreased along a line in Figure 2c (when $v = BS$) and Figure 2d (when $v = IU$). This is because $F(\theta\_{U,v})$ ($v \in \{BS, IU\}$), $X\_{BS}(d\_{U,BS}|K\_{no})$ and $X\_{IU}(d\_{U,IU}|K\_o)$ are increasing functions of $\theta\_{U,v}$, $d\_{U,BS}$ and $d\_{U,IU}$, respectively.

In the case of $\bar{C}\_1$, the movement of UAV $U$ along Direction ➃ in Figure 2c decreases $d\_{U,BS}$, while that along Direction ➁ increases $\theta\_{U,BS}$. Direction ➃ always provides a feasible UAV location $\mathbf{U}\_{U'}$ that satisfies (19a), while Direction ➁ can find it only when $X\_{BS}(d\_{U,BS}|K\_{no}) \le 1$ because $\max F(\theta\_{U',BS}) = 1$. Similarly, in the case of $\bar{C}\_2$, the movement of UAV $U$ along Direction ➁ in Figure 2d increases $\theta\_{U,IU}$, while that along Direction ➃ in Figure 2d decreases $d\_{U,IU}$. Direction ➁ can find a feasible UAV location $\mathbf{U}\_{U'}$ that satisfies (19b) only when $X\_{IU}(d\_{U,IU}|K\_o) \le 1$.


#### 4.2.3. Combined Search Directions

Observations 2–5 should be integrated to consider the constraints (19a) and (19b) together. First, Observations 2 and 4 can be used to decrease the objective function in (19) for the case of $C\_1 \cap C\_2$, where $C\_1 \cap C\_2$ indicates that $\mathbf{U}\_{U}$ is a feasible UAV location and satisfies both (19a) and (19b). As in Figure 2, the movement along Direction ➀ in Figure 2c and Direction ➁ in Figure 2d decreases both $F(\theta\_{U,CU})$ and $d\_{U,CU}^{-\beta\_A}$ in (19). They have the same properties, but differ in moving along a line and a circle, respectively. Similarly, the movement along Direction ➁ in Figure 2c and Direction ➀ in Figure 2d decreases $d\_{U,CU}^{-\beta\_A}$ and achieves large $\theta\_{U,CU}$, resulting in $F(\theta\_{U,CU}) \approx 1$. Either of the directions that share the same properties can be selected for the movement towards $\mathbf{U}\_{opt}$. However, it is preferable to select a search direction along a line (i.e., Direction ➀ in Figure 2c and Direction ➀ in Figure 2d) to reduce computation time, which will be discussed in Section 4.3. Note that UAV locations along Direction ➃ in Figure 2d could violate the constraint (19a) because of insufficient UAV height and small $\theta\_{U,BS}$; therefore, it is not an option for $C\_1 \cap C\_2$.

• Observation 6 ($C\_1 \cap C\_2$): When $\mathbf{U}\_{U}$ satisfies both the constraints (19a) and (19b), $\mathbf{U}\_{U'}$ will be found along Direction ➀ in Figure 2c or Direction ➀ in Figure 2d.

When $\mathbf{U}\_{U}$ is an infeasible location, there are three cases (i.e., $\bar{C}\_1 \cap C\_2$, $C\_1 \cap \bar{C}\_2$ and $\bar{C}\_1 \cap \bar{C}\_2$) to be considered. However, it is clear that $\bar{C}\_1 \cap C\_2 \subset \bar{C}\_1$ and $C\_1 \cap \bar{C}\_2 \subset \bar{C}\_2$; therefore, Observations 3 and 5 provide the solutions for the first two cases, respectively.


Lastly, $\bar{C}\_1 \cap \bar{C}\_2$ indicates that $\mathbf{U}\_{U}$ satisfies neither of the constraints on the relay transmission links. Unfortunately, there is no solution based on Observations 3 and 5. For example, if UAV $U$ moves along Direction ➃ in Figure 2c to satisfy (19a) (which is opposite to Direction ➁ in Figure 2d suggested in Observation 5 for satisfying (19b)), it causes $d\_{U,IU} < d\_{U',IU}$ and $\theta\_{U,IU} > \theta\_{U',IU}$, thereby resulting in $F(\theta\_{U',IU}) < F(\theta\_{U,IU}) < X\_{IU}(d\_{U,IU}|K\_o) < X\_{IU}(d\_{U',IU}|K\_o)$ (i.e., (19b) is still not satisfied). Similarly, the other search directions in Observations 3 and 5 cannot simultaneously improve both relay transmission links, so we declare that no feasible UAV location exists for the case of $\bar{C}\_1 \cap \bar{C}\_2$. To deal with this issue, more transmit time slots should be allocated to the relay transmission, which will be discussed in Section 5.

• Observation 9 ($\bar{C}\_1 \cap \bar{C}\_2$): When $\mathbf{U}\_{U}$ satisfies neither constraint (19a) nor (19b), no feasible UAV location $\mathbf{U}\_{U'}$ exists without allocating more transmit time slots to the relay transmission.
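The case analysis of this subsection can be summarized as a small lookup, sketched below. The function name and the `(figure, direction)` encoding are hypothetical. The entries for the two partially infeasible cases are inferred from the statements that Direction ➃ in Figure 2c decreases $d\_{U,BS}$ and Direction ➃ in Figure 2d decreases $d\_{U,IU}$, and from how Section 4.3.1 uses them, so they should be read as an interpretation rather than a restatement of the Observations.

```python
def search_directions(satisfies_19a, satisfies_19b):
    """Map the case at the current location U_U to candidate search directions.

    Returns a list of (figure, direction) pairs, e.g. ("2c", 1) for
    Direction 1 in Figure 2c. An empty list means that no feasible location
    exists for the given time allocation.
    """
    if satisfies_19a and satisfies_19b:
        # C1 and C2 (Observation 6): move along a line to reduce interference.
        return [("2c", 1), ("2d", 1)]
    if not satisfies_19a and satisfies_19b:
        # (19a) violated: decrease d_{U,BS} along the line from BS.
        return [("2c", 4)]
    if satisfies_19a and not satisfies_19b:
        # (19b) violated: decrease d_{U,IU} along the line from IU.
        return [("2d", 4)]
    # Both violated (Observation 9): more transmit time slots are needed.
    return []


if __name__ == "__main__":
    for c1 in (True, False):
        for c2 in (True, False):
            print(c1, c2, search_directions(c1, c2))
```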

#### *4.3. UAV Deployment (UD) Algorithm*

In this section, we propose a novel UAV deployment (UD) algorithm for a given time allocation based on search directions. The constraints (19a) and (19b) are described graphically in Figure 3a as parabolic curves $C\_a$ and $C\_b$, which are drawn with equality in (19a) and (19b), respectively. The UAV locations inside $C\_a$ and $C\_b$ satisfy the constraints (19a) and (19b), respectively; therefore, the areas for $C\_1 \cap C\_2$, $C\_1 \cap \bar{C}\_2$, $\bar{C}\_1 \cap C\_2$, and $\bar{C}\_1 \cap \bar{C}\_2$ (i.e., $A\_{C\_1 \cap C\_2}$, $A\_{C\_1 \cap \bar{C}\_2}$, $A\_{\bar{C}\_1 \cap C\_2}$, and $A\_{\bar{C}\_1 \cap \bar{C}\_2}$) can be defined as in Figure 3a. In particular, $A\_{C\_1 \cap C\_2}$ (see the dashed area in Figure 3a) is of special interest in finding the optimal UAV location $\mathbf{U}\_{opt}$ because it indicates the feasible UAV locations and always includes $\mathbf{U}\_{opt}$. $\mathbf{U}\_{opt}$ will be determined to be on either $C\_a$ or $C\_b$ within $A\_{C\_1 \cap C\_2}$ (refer to Section 4.6 in [24]), especially near the upper point of intersection of $C\_a$ and $C\_b$, to minimize the interference between UAV and CU. Note that $A\_{C\_1 \cap C\_2}$ always exists because of the assumption at the beginning of Section 4 that the given transmit time allocation $(K\_{no}, K\_o)$ guarantees the existence of feasible UAV locations.

**Figure 3.** The UAV deployment (UD) algorithm. (**a**) Three areas for $\mathbf{U}\_{U}$ with respect to constraints (19a) and (19b). $\mathbf{U}\_{opt}$ is within $A\_{C\_1 \cap C\_2}$ (specifically, either $A\_{C\_1^{=} \cap C\_2}$ or $A\_{C\_1 \cap C\_2^{=}}$). (**b**) [Step 1] of the UD algorithm to determine $\mathbf{U}\_{U'} = \mathbf{U}\_{f}$ from $\mathbf{U}\_{U} = \mathbf{U}\_{ini}$. (**c**) [Step 2] of the UD algorithm to update $\mathbf{U}\_{f}$ until $\mathbf{U}\_{U'}$ is determined outside $A\_{C\_1 \cap C\_2}$. (**d**) [Step 3] of the UD algorithm to determine $\mathbf{U}\_{opt}$ and terminate the algorithm.

We introduce two more cases, $C\_1^{=}$ and $C\_2^{=}$, which indicate that the current UAV location satisfies the constraints (19a) and (19b) with equality, respectively. Hence, it is clear that curve $C\_a$ consists of $A\_{C\_1^{=} \cap C\_2}$ and $A\_{C\_1^{=} \cap \bar{C}\_2}$, whereas $C\_b$ is composed of $A\_{C\_1 \cap C\_2^{=}}$ and $A\_{\bar{C}\_1 \cap C\_2^{=}}$. In addition, $A\_{C\_1 \cap C\_2}$ includes $A\_{C\_1^{=} \cap C\_2}$ and $A\_{C\_1 \cap C\_2^{=}}$ (see Figure 3a).

The UD algorithm consists of three steps: [Step 1] finding a feasible UAV location $\mathbf{U}\_{f}$ from an initial UAV location $\mathbf{U}\_{ini}$, [Step 2] updating $\mathbf{U}\_{f}$ towards $\mathbf{U}\_{opt}$, and [Step 3] determining $\mathbf{U}\_{opt}$ and terminating the algorithm. Figure 3b–d represent the three steps, respectively, and the search directions along a line or a circle have the same properties as those in Figure 2.

#### 4.3.1. [Step 1] Finding **U***<sup>f</sup>* from **U***ini*

To utilize the Observations for search directions, it is essential to place an initial UAV at an arbitrary location. We suggest that the initial UAV location $\mathbf{U}_{ini} = \mathbf{U}_U$ be at the point of intersection of $C_b$ and the line in Figure 2d with $\theta_{U,IU} \approx 90°$ (e.g., '◆' in Figure 3b). This point belongs to $A_{\bar{C}_1 \cap C_2^{=}}$; hence, Direction ➃ on Observation 7 can be applied to find $\mathbf{U}_f = \mathbf{U}_{U'}$ by decreasing $d_{U,BS}$ along the line between BS and $\mathbf{U}_{ini}$. Since this line *always* passes through $A_{C_1 \cap C_2}$, the feasible UAV location $\mathbf{U}_f$ can be found within $A_{C_1 \cap C_2}$, specifically at the intersection of $A_{C_1^{=} \cap C_2}$ and the line (e.g., '●' in Figure 3b), to minimize interference from UAV to CU. Therefore, $\mathbf{U}_f = \mathbf{U}_{U'} = \{d_{U',BS}, \theta_{U,BS}\}$ can be obtained by directly calculating $d_{U',BS}$ from (19a) as

$$d_{U',BS} = X_{BS}^{-1}(F(\theta_{U,BS}) \,|\, K_{no}), \tag{20}$$

where $X_{BS}^{-1}(\cdot \,|\, K_{no})$ is the inverse function of $X_{BS}(\cdot \,|\, K_{no})$ and $\theta_{U,BS} = \theta_{U',BS}$ is the elevation angle between BS and $\mathbf{U}_{ini}$.

Alternatively, $\mathbf{U}_{ini}$ at the intersection of $C_a$ and the line in Figure 2c with $\theta_{U,BS} \approx 90°$ (e.g., '✚' in Figure 3b) can be considered, and it is within $A_{C_1^{=} \cap \bar{C}_2} \subset A_{C_1 \cap \bar{C}_2}$. Based on Direction ➃ on Observation 8, $\mathbf{U}_f = \mathbf{U}_{U'} = \{d_{U',IU}, \theta_{U,IU}\}$ can be determined to be on $C_b$, where

$$d_{U',IU} = X_{IU}^{-1}(F(\theta_{U,IU}) \,|\, K_o). \tag{21}$$

It is obtained by applying the inverse function of $X_{IU}(\cdot \,|\, K_o)$, namely $X_{IU}^{-1}(\cdot \,|\, K_o)$, to (19b) with $\theta_{U,IU} = \theta_{U',IU}$.
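Equations (20) and (21) both rely on evaluating an inverse function. Where a closed-form inverse is inconvenient, a bisection search is one possible fallback. The sketch below is illustrative only: it assumes the function being inverted is increasing in distance on the search interval, and `x_toy` is a hypothetical stand-in, not the paper's $X_{BS}$ or $X_{IU}$.

```python
from typing import Callable


def invert_monotone(x: Callable[[float], float], target: float,
                    lo: float, hi: float, tol: float = 1e-9) -> float:
    """Return d in [lo, hi] with x(d) close to target.

    Assumption: x is increasing on [lo, hi].
    """
    if not (x(lo) <= target <= x(hi)):
        raise ValueError("target is outside the range of x on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if x(mid) < target:
            lo = mid   # the solution lies to the right of mid
        else:
            hi = mid   # the solution lies at or to the left of mid
    return 0.5 * (lo + hi)


def x_toy(d: float) -> float:
    """Hypothetical power-law stand-in for X_BS(d | K_no)."""
    return d ** 3


d_new = invert_monotone(x_toy, target=8.0, lo=0.0, hi=10.0)
```

With the toy function, `d_new` is the distance whose cube equals the target, which plays the role of $d_{U',BS}$ in (20).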

#### 4.3.2. [Step 2] Updating $\mathbf{U}_f$

[Step 1] finds the feasible UAV location $\mathbf{U}_f$ (= $\mathbf{U}_U$ in this step) within $A_{C_1^{=} \cap C_2}$ on $C_a$, or within $A_{C_1 \cap C_2^{=}}$ on $C_b$. [Step 2] updates $\mathbf{U}_U$ iteratively towards **Uopt**, near the upper point of intersection of $C_a$ and $C_b$, as shown in Figure 3c. Since $A_{C_1^{=} \cap C_2}$ and $A_{C_1 \cap C_2^{=}}$ belong to $A_{C_1 \cap C_2}$, Observation 6 can be applied. When $\mathbf{U}_U$ is within $A_{C_1^{=} \cap C_2}$ on $C_a$ (e.g., '●' in Figure 3c), $\mathbf{U}_{U'} = \{d_{U',IU}, \theta_{U,IU}\}$ will be on $C_b$ by increasing $d_{U,IU}$ along Direction ➀ in Figure 2d, according to (21). On the other hand, when $\mathbf{U}_U$ is within $A_{C_1 \cap C_2^{=}}$ on $C_b$ (e.g., 'F' in Figure 3c), $\mathbf{U}_{U'} = \{d_{U',BS}, \theta_{U,BS}\}$ will be obtained on $C_a$ by increasing $d_{U,BS}$ along Direction ➀ in Figure 2c based on (20). The newly obtained $\mathbf{U}_{U'}$ becomes $\mathbf{U}_U$ for the next procedure to draw a line for search directions. These procedures iterate until $\mathbf{U}_{U'}$ is located on either $C_a$ or $C_b$ outside $A_{C_1 \cap C_2}$, but near the upper intersection point of $C_a$ and $C_b$ (e.g., '✚' in Figure 3c).

Note that search directions along the line (i.e., Direction ➀ in Figure 2c,d) are selected on Observation 6 rather than those along the circle (i.e., Direction ➁ in Figure 2c,d). Since (19a) is not a function of $\theta_{U,IU}$, Direction ➁ in Figure 2d cannot find $\mathbf{U}_{U'}$ within $A_{C_1^{=} \cap C_2}$ (e.g., '●' in Figure 3c) from $\mathbf{U}_U$ within $A_{C_1 \cap C_2^{=}}$ directly, but only numerically by searching $\theta_{U',IU}$ as

$$
\theta_{U',IU} = \theta_{U,IU} + \min \Delta_{\theta} \tag{22}
$$

$$
\text{s.t. } \mathbf{U}_{U'} \text{ satisfies (19a) with an equality,}
$$

where $\Delta_{\theta}$ is an increment of $\theta_{U,IU}$ and $d_{U',IU} = d_{U,IU}$. Similarly, Direction ➁ in Figure 2c requires numerically updating $\theta_{U,BS}$ to find $\mathbf{U}_{U'}$ within $A_{C_1 \cap C_2^{=}}$ (e.g., 'F' in Figure 3c) from $\mathbf{U}_U$ within $A_{C_1^{=} \cap C_2}$. These numerical updates increase the computation time of Step 2; therefore, it is preferable to select Direction ➀ in Figure 2c and Direction ➀ in Figure 2d on Observation 6, thereby determining $\mathbf{U}_{U'}$ from (20) or (21) directly.
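To see why the circular search in (22) costs more than the direct evaluation of (20) or (21), the following sketch walks the angle in fixed increments until the constraint residual reaches zero or changes sign. It is a rough illustration under stated assumptions: `residual` is a hypothetical callable returning the signed slack of (19a) at a given angle (zero on the curve), and the step size and angle limit are arbitrary choices, not values from the paper.

```python
import math
from typing import Callable, Optional


def search_angle(theta: float, residual: Callable[[float], float],
                 step: float = 1e-3,
                 theta_max: float = math.pi / 2) -> Optional[float]:
    """Smallest theta + k*step (k >= 1) at which residual hits zero or flips sign.

    Returns None if no crossing is found up to theta_max.
    """
    start_sign = math.copysign(1.0, residual(theta))
    k = 1
    while theta + k * step <= theta_max:
        candidate = theta + k * step
        r = residual(candidate)          # one constraint evaluation per step
        if r == 0.0 or math.copysign(1.0, r) != start_sign:
            return candidate
        k += 1
    return None
```

Each loop iteration evaluates the constraint once, so the cost grows with the angular distance to the curve divided by the step size, whereas (20) and (21) need a single evaluation.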

#### 4.3.3. [Step 3] Determining **Uopt**

[Step 2] places $\mathbf{U}_{U'}$ at '✚' in Figure 3c, which is outside $A_{C_1 \cap C_2}$. [Step 3] puts it back at a point on either $C_a$ or $C_b$ within $A_{C_1 \cap C_2}$, and then declares the optimal UAV location **Uopt**.

Observations 7 and 8 can be utilized for this step because $\mathbf{U}_{U'}$ from [Step 2] (= $\mathbf{U}_U$ in this step) is on either $C_a$ or $C_b$ outside $A_{C_1 \cap C_2}$, specifically in $A_{C_1^{=} \cap \bar{C}_2}$ or $A_{\bar{C}_1 \cap C_2^{=}}$. When $\mathbf{U}_U$ is within $A_{C_1^{=} \cap \bar{C}_2}$ (e.g., '✚' in Figure 3d), Direction ➃ on Observation 8 can be applied using (21) to put $\mathbf{U}_{U'}$ on $C_b$, while Direction ➃ on Observation 7 can be utilized to place $\mathbf{U}_{U'}$ on $C_a$ using (20) when $\mathbf{U}_U$ is within $A_{\bar{C}_1 \cap C_2^{=}}$ (e.g., '◆' in Figure 3d). If the newly obtained $\mathbf{U}_{U'}$ is within $A_{C_1 \cap C_2}$, more precisely $A_{C_1^{=} \cap C_2}$ or $A_{C_1 \cap C_2^{=}}$, the UD algorithm declares that it is **Uopt** and terminates. If not, it repeats [Step 3] until $\mathbf{U}_{U'}$ is within $A_{C_1^{=} \cap C_2}$ or $A_{C_1 \cap C_2^{=}}$. The details of the UD algorithm are summarized in Algorithm 1.

#### **Algorithm 1** UD algorithm.

```
Input: (K_no, K_o).
Output: Optimal UAV location U_opt.
Step 1. Find U_f from U_ini:
 1: initial UAV location is determined as U_ini = {d_U,IU, θ_U,IU}
    = {X_IU^-1(F(θ_U,IU) | K_o), θ_U,IU} with θ_U,IU ≈ 90°.
 2: update U_ini following Direction ➃ in Figure 2c based on (20).
    Then, U_f within A_{C1^= ∩ C2} is obtained, and go to Step 2.
Step 2. Update U_f towards U_opt:
 3: if U_f is within A_{C1^= ∩ C2} then
 4:     find U_U' following Direction ➀ in Figure 2d based on (21).
 5: else if U_f is within A_{C1 ∩ C2^=} then
 6:     find U_U' following Direction ➀ in Figure 2c based on (20).
 7: end if
 8: if U_U' is within A_{C1^= ∩ C̄2} then
 9:     U_U ← U_U', and go to Step 3.
10: else
11:     U_f ← U_U', and go to line 3.
12: end if
Step 3. Determine U_opt:
13: if U_U is within A_{C1^= ∩ C̄2} then
14:     find U_U' following Direction ➃ in Figure 2d based on (21).
15: else if U_U is within A_{C̄1 ∩ C2^=} then
16:     find U_U' following Direction ➃ in Figure 2c based on (20).
17: end if
18: if U_U' is within A_{C1^= ∩ C2} or A_{C1 ∩ C2^=} then
19:     U_opt ← U_U', and terminate the algorithm.
20: else
21:     go to line 13.
22: end if
```
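
The control flow of the UD algorithm can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: the geometry is hidden behind three hypothetical callbacks (`classify` returns the region of a location, `to_ca` applies (20) and `to_cb` applies (21)), and the Step 2 exit test follows the prose of Section 4.3.2, which stops as soon as the new location lies on either curve outside the feasible area.

```python
from enum import Enum, auto
from typing import Callable, TypeVar

P = TypeVar("P")  # any representation of a UAV location


class Region(Enum):
    CA_IN = auto()    # on C_a, inside the feasible area
    CB_IN = auto()    # on C_b, inside the feasible area
    CA_OUT = auto()   # on C_a, violating (19b)
    CB_OUT = auto()   # on C_b, violating (19a)


FEASIBLE = {Region.CA_IN, Region.CB_IN}


def ud_algorithm(u_ini: P,
                 classify: Callable[[P], Region],
                 to_ca: Callable[[P], P],
                 to_cb: Callable[[P], P],
                 max_iter: int = 100) -> P:
    # Step 1: move the initial location (on C_b) onto C_a via (20).
    u = to_ca(u_ini)
    if classify(u) is not Region.CA_IN:
        raise ValueError("no feasible UAV location for this (K_no, K_o)")

    # Step 2: alternate between the curves until the update leaves the feasible area.
    for _ in range(max_iter):
        u = to_cb(u) if classify(u) is Region.CA_IN else to_ca(u)
        if classify(u) not in FEASIBLE:
            break
    else:
        raise RuntimeError("Step 2 did not leave the feasible area")

    # Step 3: step back onto the other curve until the location is feasible again.
    for _ in range(max_iter):
        u = to_cb(u) if classify(u) is Region.CA_OUT else to_ca(u)
        if classify(u) in FEASIBLE:
            return u
    raise RuntimeError("Step 3 did not converge")
```

The `max_iter` guard is an addition for safety in the sketch; the listing itself has no iteration cap.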
#### **5. Optimal Number of Transmit Time Slots**

To maximize the multi-objective function of **JUDTAP** in (12), the overall number of time slots $n$ that guarantees a reliable relay transmission to IU should be minimized before the UD algorithm is performed. Hence, in this section, we propose the time slot determination (TSD) algorithm to determine an optimal pair of time slots $(K_{no}^{opt}, K_o^{opt})$, equivalently the minimum (optimal) number of overall time slots $n^{opt} = K_{no}^{opt} + K_o^{opt}$.

#### *5.1. Existence of Feasible UAV Locations*

In Section 4, the UD algorithm is proposed to find **Uopt** under the assumption that the given time allocation $(K_{no}, K_o)$ guarantees the existence of feasible UAV locations, equivalently of an area $A_{C_1 \cap C_2}$ that satisfies both constraints (19a) and (19b). However, when $K_{no}$ and $K_o$ are not large enough for reliable relay transmissions, $A_{C_1 \cap C_2}$ in Figure 3a does not exist and neither does **Uopt**. Therefore, it is critical to select a proper pair $(K_{no}, K_o)$, and we consider the minimum $K_{no}$ and $K_o$ to be optimal in terms of resource efficiency. In addition, scanning the entire UAV operation range (i.e., inside the dashed rectangle in Figure 2) to check the existence of $A_{C_1 \cap C_2}$ for each possible $(K_{no}, K_o)$ is impractical. Therefore, a time-efficient test for the existence of $A_{C_1 \cap C_2}$ should be considered. This can be realized by utilizing [Step 1] of the UD algorithm. If $\mathbf{U}_f$ within $A_{C_1^{=} \cap C_2}$ can be found from $\mathbf{U}_{ini}$, then $A_{C_1 \cap C_2}$ exists for $(K_{no}, K_o)$ and a reliable relay transmission is guaranteed. On the other hand, if [Step 1] cannot find $\mathbf{U}_f$ within $A_{C_1^{=} \cap C_2}$, $(K_{no}, K_o)$ is not enough to satisfy both constraints on relay transmission (12a), which results in the absence of $A_{C_1 \cap C_2}$. Hence, more time slots should be allocated for a reliable relay transmission.

#### *5.2. Time Slot Determination (TSD) Algorithm*

In this section, we propose a novel time slot determination (TSD) algorithm to derive the minimum number of overall time slots $n^{opt}$ along with $(K_{no}^{opt}, K_o^{opt})$ for a reliable relay transmission. First, it defines $l_{\min}$ and $l_{\max}$ under the assumption of $K_{no} = K_o$, and utilizes them to reduce the search range for $K_{no}^{opt}$ and $K_o^{opt}$. $l_{\min}$ is the minimum number of time slots for which $(K_{no}, K_o) = (l_{\min}, l_{\min})$ could provide $A_{C_1 \cap C_2}$, but does not necessarily guarantee it, whereas $l_{\max}$ is the number of time slots that does guarantee $A_{C_1 \cap C_2}$ for $(K_{no}, K_o) = (l_{\max}, l_{\max})$. $l_{\min}$ is determined by the upper bounds of $d_{U,BS}$ and $d_{U,IU}$, denoted as $d_{U,BS}^{\max}(K_{no})$ and $d_{U,IU}^{\max}(K_o)$, while $l_{\max}$ is obtained based on $l_{\min}$ and utilized to find $(K_{no}^{opt}, K_o^{opt})$ and $n^{opt}$.

#### 5.2.1. Determination of *lmin*

$l_{\min}$ aims at restricting the search range, thereby reducing the computation time for $(K_{no}^{opt}, K_o^{opt})$. It does not necessarily guarantee $A_{C_1 \cap C_2}$, but provides a lower bound for $l_{\max}$. It can be derived from the maximum distances of $d_{U,BS}$ and $d_{U,IU}$, namely $d_{U,BS}^{\max}(K_{no})$ and $d_{U,IU}^{\max}(K_o)$, for each $K_{no}$ and $K_o$. From (15) and (16), $d_{U,BS} \le X_{BS}^{-1}(F(\theta_{U,BS}), K_{no})$ and $d_{U,IU} \le X_{IU}^{-1}(F(\theta_{U,IU}), K_o)$ can be obtained, and the right-hand sides of both inequalities are increasing functions of $F(\theta_{U,BS})$ and $F(\theta_{U,IU})$, respectively. Hence, $d_{U,BS}^{\max}(K_{no})$ and $d_{U,IU}^{\max}(K_o)$ can be defined as

$$d_{U,BS}^{\max}(K_{no}) \triangleq X_{BS}^{-1}(F(\theta_{U,BS}), K_{no})\big|_{F(\theta_{U,BS})=1} = X_{BS}^{-1}(1, K_{no}),\tag{23}$$

$$d_{U,IU}^{\max}(K_o) \triangleq X_{IU}^{-1}(F(\theta_{U,IU}), K_o)\big|_{F(\theta_{U,IU})=1} = X_{IU}^{-1}(1, K_o),\tag{24}$$

where $F(\theta_{U,i}) = 1$, $i \in \{BS, IU\}$, because $\max F(\theta_{U,i}) = 1$. Note that $d_{U,BS}^{\max}(K_{no})$ and $d_{U,IU}^{\max}(K_o)$ are increasing functions of $K_{no}$ and $K_o$, respectively.

Using (23) and (24), $l_{\min}$ is defined as

$$l_{\min} = \underset{z \in \mathbb{Z}_{>0}}{\arg\min}\; d_{U,BS}^{\max}(z) + d_{U,IU}^{\max}(z) \ge d_{BS,IU},\tag{25}$$

where $z$ is a positive integer indicating $z = K_{no} = K_o$, and $d_{BS,IU}$ is the distance between BS and IU. The $l_{\min}$ given by (25) makes the two circles drawn at ground node $i$ with radius $d_{U,i}^{\max}(l_{\min})$, $i \in \{BS, IU\}$ (i.e., the blue and dashed green circles in Figure 2a), overlap each other. As mentioned above, feasible UAV locations could exist within the region where the two circles overlap, but this is not guaranteed because the derivation of $l_{\min}$ starts with the assumption of $F(\theta_{U,i}) = 1$, $i \in \{BS, IU\}$, in (23) and (24).
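Read as "the smallest positive integer $z$ whose two maximum distances together reach $d_{BS,IU}$", (25) amounts to a short linear scan. The sketch below assumes the two maximum-distance functions are supplied as callables; the linear toy functions and the `z_limit` cap are hypothetical and only serve to make the example self-contained.

```python
from typing import Callable


def find_l_min(d_max_bs: Callable[[int], float],
               d_max_iu: Callable[[int], float],
               d_bs_iu: float,
               z_limit: int = 10_000) -> int:
    """Smallest z >= 1 with d_max_bs(z) + d_max_iu(z) >= d_bs_iu, as in (25)."""
    for z in range(1, z_limit + 1):
        if d_max_bs(z) + d_max_iu(z) >= d_bs_iu:
            return z
    raise ValueError("no z up to z_limit makes the two circles overlap")


# Hypothetical toy model: each extra slot extends the reach by 150 m.
l_min = find_l_min(lambda z: 150.0 * z, lambda z: 150.0 * z, d_bs_iu=1000.0)
```

Because both maximum distances increase with the number of slots, the first $z$ that satisfies the inequality is also the minimum.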

#### 5.2.2. Determination of *lmax*

If a feasible UAV location exists for $(K_{no}, K_o) = (l_{\min}, l_{\min})$, $l_{\max}$ can be determined as $l_{\max} \triangleq l_{\min}$ by definition. Otherwise, $l_{\max}$ should be greater than $l_{\min}$ such that a feasible UAV location exists for $(K_{no}, K_o) = (l_{\max}, l_{\max})$. In addition, values of $K_{no}$ and $K_o$ that are both less than $l_{\min}$ do not need to be considered because they cannot provide $A_{C_1 \cap C_2}$. Hence, $l_{\max}$ can be obtained as

$$l_{\max} = \underset{l_{\min} + z}{\arg\min} \left( l_{\min} + z, l_{\min} + z \right), \tag{26}$$

$$\text{s.t.} \quad z \in \mathbb{Z}_{\ge 0} \text{ and}$$

$$A_{C_1 \cap C_2} \text{ exists for } (K_{no}, K_o) = (l_{\min} + z, l_{\min} + z),$$

where $z$ is a non-negative integer. If a feasible UAV location exists for $(K_{no}, K_o) = (l_{\min}, l_{\min})$, then $z = 0$; otherwise, $z > 0$. Note that $n^{opt} \le 2 l_{\max}$ because a feasible UAV location always exists for $(K_{no}, K_o) = (l_{\max}, l_{\max})$.
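Equation (26) can likewise be sketched as a scan along the diagonal $K_{no} = K_o$, starting at $l_{\min}$. The feasibility test is passed in as a hypothetical callable `feasible(k_no, k_o)`; in the paper this role is played by [Step 1] of the UD algorithm, which is not modelled here, and the toy rule used in the example is an assumption.

```python
from typing import Callable


def find_l_max(l_min: int,
               feasible: Callable[[int, int], bool],
               z_limit: int = 1_000) -> int:
    """Smallest l = l_min + z (z >= 0) such that (l, l) is feasible, as in (26)."""
    for z in range(z_limit + 1):
        l = l_min + z
        if feasible(l, l):
            return l
    raise ValueError("no feasible diagonal allocation found within z_limit")


# Hypothetical oracle: an allocation is feasible once both links get 4 slots.
def toy_feasible(k_no: int, k_o: int) -> bool:
    return k_no >= 4 and k_o >= 4


l_max = find_l_max(2, toy_feasible)
```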

#### 5.2.3. Determination of $(K_{no}^{opt}, K_o^{opt})$

Based on $l_{\max}$, the initial number of overall time slots is defined as $n_{ini} = 2 l_{\max}$. The TSD algorithm targets the minimum number of overall time slots, so $K_{no}$ and $K_o$ can be different even though $n_{ini}$ is derived from the assumption of $K_{no} = K_o$. In addition, for each number of overall time slots $n$, it is preferable to maximize $K_{no}$ to achieve the maximum $R_{CU}$, because the SINR at CU in NOTP (i.e., (2)) is larger than that in OTP (i.e., (4)). Therefore, $n^{opt}$ will be determined by any pair $(K_{no}, K_o)$ that leads to the minimum $K_{no} + K_o$. However, when $n^{opt}$ is given, $(K_{no}^{opt}, K_o^{opt})$ is preferred to be a pair $(K_{no}, K_o)$ with $K_{no} \ge K_o$, if such a pair exists.

Figure 4 represents a way to update $K_{no}$ and $K_o$ towards $K_{no}^{opt}$ and $K_o^{opt}$. The row and column of the table indicate values of $K_o$ and $K_{no}$. $K_o$ is upper bounded by $l_{\max}$ because $(K_{no}, K_o) = (l_{\max} - z, l_{\max} + z)$, $z \in \mathbb{Z}_{>0}$, for $n_{ini} = 2 l_{\max}$ cannot achieve $K_{no} \ge K_o$. Each element of the table denotes the sum of the row and column values (i.e., the number of overall time slots), and pairs of $K_{no}$ and $K_o$ resulting in $K_{no} + K_o > 2 l_{\max}$ are out of interest (i.e., the upper triangle in Figure 4). To find $(K_{no}^{opt}, K_o^{opt})$, an initial time allocation is set to $(K_{no}, K_o) = (l_{\max}, l_{\max})$ with $n^{opt} = n_{ini}$, and two rules for updating $(K_{no}, K_o)$ are defined as follows:

**Figure 4.** Part 2 on the time slot determination (TSD) algorithm. (**a**) Updating initial (*Kno*, *Ko*) = (*lmax*, *lmax*). (**b**) Updating rules for (*Kno*, *Ko*).

#### [Rule 1]

If a feasible UAV location exists for $(K_{no}, K_o) = (m, l)$, update $n^{opt}$ as $m + l$ and check the existence of a feasible UAV location for $(K_{no}, K_o) = (m, l - 1)$ and $(K_{no}, K_o) = (m - 1, l)$, obtained by downward and leftward movements in Figure 4 to decrease $K_o$ and $K_{no}$, respectively.

[Rule 1-1] If a feasible UAV location exists *only* for $(K_{no}, K_o) = (m - 1, l)$, update $n^{opt}$ as $m + l - 1$ and repeat [Rule 1] at $(K_{no}, K_o) = (m - 1, l)$.

[Rule 1-2] If a feasible UAV location exists for $(K_{no}, K_o) = (m, l - 1)$, update $n^{opt}$ as $m + l - 1$ and repeat [Rule 1] at $(K_{no}, K_o) = (m, l - 1)$.

#### [Rule 2]

If no feasible UAV location exists for $(K_{no}, K_o) = (m, l)$ and $m + l + 1 \le n^{opt}$, move right in Figure 4 to increase $K_{no}$ and check the existence of a feasible UAV location for $(K_{no}, K_o) = (m + 1, l)$.

[Rule 1] aims at checking the availability of a smaller $n^{opt}$, while [Rule 2] investigates the existence of $(K_{no}, K_o)$ with $K_{no} > K_o$ for the candidate $n^{opt}$. When feasible UAV locations exist for both $(K_{no}, K_o) = (m - 1, l)$ and $(K_{no}, K_o) = (m, l - 1)$ in [Rule 1], $(K_{no}, K_o) = (m, l - 1)$ in [Rule 1-2] is selected to maximize $K_{no}$ for $n^{opt} = m + l - 1$. The TSD algorithm terminates when $(K_{no}, K_o) = (m + 1, l)$ in [Rule 2] results in $m + l + 1 > n^{opt}$. The last updated $n^{opt}$ is the minimum (optimal) number of overall time slots, and the pair $(K_{no}, K_o)$ that leads to $n^{opt}$ becomes $(K_{no}^{opt}, K_o^{opt})$.

#### 5.2.4. TSD Algorithm

Details of the TSD algorithm are summarized in Algorithm 2. Part 1 determines $l_{\min}$ and $l_{\max}$ from (25) and (26), respectively. Part 2 derives $n^{opt}$ and $(K_{no}^{opt}, K_o^{opt})$ from $(K_{no}, K_o) = (l_{\max}, l_{\max})$ by the updating rules for $(K_{no}, K_o)$. Note that only a few iterations are required in Part 2 of the TSD algorithm. From (26), it is clear that $(K_{no}, K_o) = (l_{\max} - 1, l_{\max} - 1)$ with $n = 2 l_{\max} - 2$ does not provide a feasible UAV location. This results from the assumption of $K_{no} = K_o$; hence, $n = 2 l_{\max} - 2$ could still give a feasible UAV location when $K_{no}$ differs from $K_o$. However, there is little chance of obtaining a valid $(K_{no}, K_o)$ with $n = 2 l_{\max} - 2$ in such a case, because $K_{no}$ or $K_o$ may be too small to set up a reliable relay link between BS and UAV or between UAV and IU. In other words, one or two leftward or downward movements in Figure 4 may be enough to reach $n^{opt}$, and the same holds for a rightward movement to maximize $K_{no}$. Therefore, the TSD algorithm dramatically reduces the search time compared to an exhaustive algorithm, and hence delivers **Kopt** = $(K_{no}^{opt}, K_o^{opt})$ quickly.

#### **Algorithm 2** TSD algorithm.

**Part 1. Calculation of** $l_{\min}$ **and** $l_{\max}$.

1. Determine $l_{\min}$ using (25).
2. Determine $l_{\max}$ using (26).

**Part 2. Find** $(K_{no}^{opt}, K_o^{opt})$.

**Input**: $(l_{\max}, l_{\max})$.

**Output**: $(K_{no}^{opt}, K_o^{opt})$, $n^{opt}$.

Initialization: $(K_{no}, K_o) \leftarrow (l_{\max}, l_{\max})$, $n^{opt} \leftarrow 2 l_{\max}$.

3. **if** feasible UAV location exists for $(K_{no}, K_o)$ **then**
4. &emsp;**if** feasible UAV location exists for $(K_{no}, K_o - 1)$ **then**
5. &emsp;&emsp;$(K_{no}^{opt}, K_o^{opt}) \leftarrow (K_{no}, K_o - 1)$, $n^{opt} \leftarrow K_{no} + K_o - 1$.
6. &emsp;&emsp;$(K_{no}, K_o) \leftarrow (K_{no}, K_o - 1)$, and go to line 3.
7. &emsp;**else if** feasible UAV location exists for $(K_{no} - 1, K_o)$ **then**
8. &emsp;&emsp;$(K_{no}^{opt}, K_o^{opt}) \leftarrow (K_{no} - 1, K_o)$, $n^{opt} \leftarrow K_{no} + K_o - 1$.
9. &emsp;&emsp;$(K_{no}, K_o) \leftarrow (K_{no} - 1, K_o)$, and go to line 3.
10. &emsp;**end if**
11. **else**
12. &emsp;**if** $K_{no} + K_o + 1 \le n^{opt}$ **then**
13. &emsp;&emsp;$(K_{no}, K_o) \leftarrow (K_{no} + 1, K_o)$, and go to line 3.
14. &emsp;**else**
15. &emsp;&emsp;Terminate the algorithm.
16. &emsp;**end if**
17. **end if**
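
Part 2 of Algorithm 2 can be transcribed into Python roughly as follows. This is a sketch under assumptions rather than a reference implementation: `feasible` is a hypothetical oracle standing in for the [Step 1] feasibility check, the listing does not spell out what happens when neither neighbouring pair is feasible (the sketch stops there), and the guards that keep slot counts positive are an addition.

```python
from typing import Callable, Tuple


def tsd_part2(l_max: int,
              feasible: Callable[[int, int], bool]) -> Tuple[Tuple[int, int], int]:
    """Return ((K_no_opt, K_o_opt), n_opt) starting from (l_max, l_max)."""
    k_no, k_o = l_max, l_max
    n_opt = 2 * l_max
    k_opt = (k_no, k_o)  # assumption: the start pair is the fallback answer
    while True:
        if feasible(k_no, k_o):
            if k_o > 1 and feasible(k_no, k_o - 1):
                k_o -= 1            # lines 4-6: prefer dropping an OTP slot
            elif k_no > 1 and feasible(k_no - 1, k_o):
                k_no -= 1           # lines 7-9: otherwise drop a NOTP slot
            else:
                break               # assumption: neither neighbour feasible, stop
            k_opt, n_opt = (k_no, k_o), k_no + k_o
        elif k_no + k_o + 1 <= n_opt:
            k_no += 1               # lines 12-13: move right in Figure 4
        else:
            break                   # line 15: terminate
    return k_opt, n_opt


# Feasible pairs quoted for the Figure 5 example; every other pair is
# assumed infeasible for the purpose of this illustration.
figure5_pairs = {(4, 4), (4, 3), (3, 4)}
k_opt, n_opt = tsd_part2(4, lambda a, b: (a, b) in figure5_pairs)
```

With these inputs the sketch ends at $(K_{no}, K_o) = (4, 3)$ and $n^{opt} = 7$, which matches the allocation reported for Figure 5.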

#### **6. UAV Deployment and Time Allocation Algorithm**

The TSD and UD algorithms are presented to determine **Kopt** = $\{K_{no}^{opt}, K_o^{opt}\}$ and to decide **Uopt** for **Kopt**, respectively. The **JUDTAP** can be solved by the UAV deployment and transmit time allocation (UDTA) algorithm, which consists of the TSD and UD algorithms and runs them sequentially. Details of the UDTA algorithm are summarized in Algorithm 3. As mentioned in Sections 4 and 5, the TSD and UD algorithms reduce the search range for **Kopt** and the search area for **Uopt**, respectively, thereby requiring much less computation time than an exhaustive search algorithm. In the following subsection, the computational complexity is analyzed with respect to the total number of computations needed for searching and determining **Kopt** and **Uopt**.

#### **Algorithm 3** UDTA algorithm.


#### *Complexity Analysis*

To determine the number of UAV locations for an exhaustive search, we consider that a grid is superimposed over the operation range (i.e., the dashed rectangle in Figure 2a) with lines separated by $\Delta_d$ [m]. As a result, the number of UAV locations to be considered increases as $\Delta_d$ decreases. The total number of computations for an exhaustive search is derived as $O(\Delta_d) \triangleq \left\lceil \frac{S}{\Delta_d^2} \right\rceil \frac{n^{opt}(n^{opt}-1)}{2}$. Here, $\left\lceil \frac{S}{\Delta_d^2} \right\rceil$ is the number of UAV locations within the operation range of $S$ [m$^2$], where $\lceil \cdot \rceil$ denotes rounding up to the nearest integer, and $\frac{n^{opt}(n^{opt}-1)}{2}$ is the number of combinations of $(K_{no}, K_o)$ needed to find $(K_{no}^{opt}, K_o^{opt})$ from the initial $(K_{no}, K_o) = (1, 1)$. Hence, $O(\Delta_d)$ indicates that $\left\lceil \frac{S}{\Delta_d^2} \right\rceil$ locations need to be checked for feasibility for each time allocation $(K_{no}, K_o)$. On the other hand, the UDTA algorithm requires at most $l_{\min} + 2|l_{\max} - l_{\min}| + 2\left(2(l_{\max} - K_o^{opt}) + (l_{\max} - K_{no}^{opt})\right) + 10$ computations to find **Kopt** and **Uopt**. In the TSD algorithm, $l_{\min}$ computations are required in (25) to determine $l_{\min}$ from $(K_{no}, K_o) = (1, 1)$. To determine $l_{\max}$, $2|l_{\max} - l_{\min}|$ computations are required in (26) from $l_{\min}$, where only two UAV locations (i.e., the initial and first feasible locations in [Step 1] of the UD algorithm) are considered for each time allocation $(K_{no}, K_o)$. Similarly, there are at most $2(l_{\max} - K_o^{opt}) + (l_{\max} - K_{no}^{opt})$ combinations of $(K_{no}, K_o)$ to find $(K_{no}^{opt}, K_o^{opt})$ from $(l_{\max}, l_{\max})$, resulting in $2\left(2(l_{\max} - K_o^{opt}) + (l_{\max} - K_{no}^{opt})\right)$ computations. In the UD algorithm, the total number of computations in (20) or (21) is less than 10, a figure that is obtained from simulations and is reasonable given the dramatically reduced search area within $A_{C_1 \cap C_2}$ under the proposed algorithm. As a result, the UDTA algorithm requires far fewer UAV locations and $(K_{no}, K_o)$ combinations to be considered for **Kopt** and **Uopt** than the exhaustive search algorithm, thereby reducing computational time and effort significantly.
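
To give a feel for the two counts, the sketch below evaluates both expressions from this subsection. The operation range `area_m2 = 500_000` and `l_min = 3` are hypothetical values chosen for illustration; $l_{\max} = 4$, $(K_{no}^{opt}, K_o^{opt}) = (4, 3)$ and $n^{opt} = 7$ are taken from the Figure 5 example, and the constant 10 is the simulation-based bound quoted above.

```python
import math


def exhaustive_count(area_m2: float, delta_d: float, n_opt: int) -> int:
    """ceil(S / delta_d^2) grid points times n_opt(n_opt - 1)/2 allocations."""
    grid_points = math.ceil(area_m2 / delta_d ** 2)
    allocations = n_opt * (n_opt - 1) // 2
    return grid_points * allocations


def udta_count(l_min: int, l_max: int, k_no_opt: int, k_o_opt: int,
               ud_steps: int = 10) -> int:
    """Upper bound on UDTA computations as stated in the complexity analysis."""
    part1 = l_min + 2 * abs(l_max - l_min)
    part2 = 2 * (2 * (l_max - k_o_opt) + (l_max - k_no_opt))
    return part1 + part2 + ud_steps


exhaustive = exhaustive_count(area_m2=500_000, delta_d=1.0, n_opt=7)
udta = udta_count(l_min=3, l_max=4, k_no_opt=4, k_o_opt=3)
```

Under these illustrative inputs the exhaustive search needs 10,500,000 evaluations against 19 for the UDTA bound; the ratio depends directly on the assumed operation range and grid spacing.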

#### **7. Numerical Results**

In this section, we compare the optimal UAV location and transmit time allocations obtained by the proposed algorithms with those from an exhaustive search, and demonstrate that the UDTA algorithm achieves optimality while significantly reducing computational complexity. For simulations, we assume that $P_{BS}^{\max} = 30$ [dBm], $P_U^{\max} = 25$ [dBm], $\beta_A = 3$, $\beta_G = 2$, and $\varsigma = 20$ [dB] [25–28]. An urban environment is assumed with $B = 0.136$ and $C = 11.95$ [8]. $\lambda \triangleq \frac{d_{BS,IU}}{d_{BS,CU}}$ is the relative location of IU with respect to CU. In order to evaluate the optimality of the proposed algorithm, the throughput gap (%) is defined by the $R_{CU}$ difference from the exhaustive search with $\Delta_d = 1$, because $\Delta_d = 1$ is sufficiently small for the exhaustive search to find the globally optimal UAV location.

Figure 5 represents feasible combinations of $(K_{no}, K_o)$ and the formation of $A_{C_1 \cap C_2}$, and compares **Uopt** from the UD algorithm with that from the exhaustive search with $\Delta_d = 1$, where $x_{BS} = 0$, $x_{CU} = 300$ [m], $x_{IU} = 1000$ [m] and $D_{req} = 3$ [bit/Hz]. When $(K_{no}, K_o) = (4, 4)$, $A_{C_1 \cap C_2}$ starts to appear, but the TSD algorithm concludes $(K_{no}^{opt}, K_o^{opt}) = (4, 3)$, resulting in $n^{opt} = 7$, even though $(K_{no}, K_o) = (3, 4)$ also provides $A_{C_1 \cap C_2}$. This is because $(K_{no}^{opt}, K_o^{opt}) = (4, 3)$ maximizes $K_{no}^{opt}$ for the given $n^{opt}$. The TSD algorithm reduces the feasible UAV locations $A_{C_1 \cap C_2}$ significantly, and the UD algorithm successfully determines **Uopt** close to that from the exhaustive search with $\Delta_d = 1$.

**Figure 5.** Feasible combinations of (*Kno*, *Ko*) and comparison of **Uopt** from different algorithms.

Figure 6 represents $(K_{no}^{opt}, K_o^{opt})$, $n^{opt}$, UAV location, and throughput gap with respect to $D_{req}$ for the same $x_v$, $v \in \{BS, CU, IU\}$, as in Figure 5. The UDTA algorithm finds $(K_{no}^{opt}, K_o^{opt})$ and a UAV location close to the optimum with a negligible throughput gap. $n^{opt}$ increases as $D_{req}$ increases in order to set up a reliable relay connection. For a given $n^{opt}$, the UAV should be placed lower and closer to CU as $D_{req}$ increases. Even though this UAV movement increases the interference to CU, it is necessary for reliable relay transmission on the BS-to-UAV link. For example, Figure 6a shows that $n^{opt} = 2$ is required for $0.5 \le D_{req} \le 0.9$. When $D_{req} = 0.5$, the UAV can be placed very high and far from CU. As $D_{req}$ increases, however, the UAV moves towards BS to set up a reliable BS-to-UAV link. Lastly, the throughput gap between the proposed algorithm and the exhaustive search is less than 0.1 (%) over the entire range of $D_{req}$, which demonstrates that the UDTA algorithm successfully determines **Kopt** and **Uopt** with a negligible throughput gap from the exhaustive search with $\Delta_d = 1$.

**Figure 6.** Performance comparison of different algorithms and their throughput gap with respect to *Dreq*. (**a**) The number of transmit time slots. (**b**) UAV location and throughput gap.

URN with multiple transmit time slots ($n^{opt} > 2$) on each relay transmission link can reduce redundant usage of transmit time slots. For example, URN consisting of two time slots for a single relay transmission (i.e., single transmit time slot allocation to each relay transmission link) requires at least three repetitions of relay transmissions to provide $D_{req} = 1.9$ at IU, since $D_{req} = 0.9$ is the maximum data delivered to IU by URN with $n^{opt} = 2$, as shown in Figure 6a. Therefore, six time slots are required for URN that utilizes two transmit time slots for a single relay transmission, while URN that allocates multiple transmit time slots on each relay transmission link only requires four time slots for $D_{req} = 1.9$. Hence, multiple transmit time slots should be adopted in URN to utilize the transmit time slots efficiently.
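
The slot counts in this example follow from a short calculation, written out here as a worked check that uses only the values quoted above (0.9 as the maximum deliverable per two-slot relay transmission, and four slots for the multi-slot allocation):

$$\left\lceil \frac{1.9}{0.9} \right\rceil \times 2 = 3 \times 2 = 6 \ \text{time slots} \quad \text{versus} \quad n^{opt} = 4 \ \text{time slots}.$$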

Figure 7 represents $(K_{no}^{opt}, K_o^{opt})$, $n^{opt}$, UAV location, and throughput gap with respect to $\lambda$ for $x_{CU} = 300$ [m]. As $\lambda$ increases, IU moves away from BS; hence, a larger $n^{opt}$ is required in URN for a reliable relay transmission. In addition, the UAV height should be increased for large $\theta_{U,v}$ to set up a strong LOS connection between the UAV and ground nodes $v$. In order to guarantee the minimum number of transmit time slots, $K_{no}^{opt}$ can be smaller than $K_o^{opt}$, as explained in Section 5.2.3. For example, when $2.43 \le \lambda \le 2.83$, $n^{opt}$ is equal to 5 with $(K_{no}^{opt}, K_o^{opt}) = (3, 2)$ or $(2, 3)$. In particular, $(K_{no}^{opt}, K_o^{opt}) = (2, 3)$ is selected when $2.76 \le \lambda \le 2.83$ to achieve the minimum number of overall time slots; however, it requires the UAV to move towards BS for a reliable BS-to-UAV link due to the smaller $K_{no}^{opt}$. Similar to Figure 6, the UDTA algorithm achieves a negligible throughput gap, less than 0.2 (%), relative to the exhaustive search with $\Delta_d = 1$.

Figure 8 shows the throughput gap of exhaustive searches with respect to the number of computations. The throughput gap increases as $\Delta_d$ increases, because fewer UAV locations are considered when searching for **Kopt** and **Uopt** than with $\Delta_d = 1$. As mentioned above, the UDTA algorithm significantly reduces the computational time to find **Kopt** and **Uopt** due to the time-efficient determination of $(K_{no}^{opt}, K_o^{opt})$ based on $l_{\min}$ and $l_{\max}$, and the small $A_{C_1 \cap C_2}$ derived from $(K_{no}^{opt}, K_o^{opt})$. Therefore, it provides the optimal solution for **JUDTAP** and requires far fewer computations, $l_{\min} + 2|l_{\max} - l_{\min}| + 2\left(2(l_{\max} - K_o^{opt}) + (l_{\max} - K_{no}^{opt})\right) + 10 \ll O(1)$, with a negligible throughput gap relative to the exhaustive search even for $\Delta_d \le 1$.

**Figure 7.** Performance comparison of different algorithms and their throughput gap with respect to *λ*. (**a**) The number of transmit time slots. (**b**) UAV location and throughput gap.

**Figure 8.** Throughput gap of exhaustive searches of different ∆*<sup>d</sup>* .

#### **8. Conclusions**

In this paper, we have investigated URN with multiple transmit time slots, and proposed algorithms to maximize the throughput of UE in a cell while guaranteeing a reliable transmission to UE in its extended service area. The formulated multi-objective joint UAV deployment and transmit time allocation optimization problem (**JUDTAP**) is solved by TSD and UD algorithms to determine the optimal number of overall transmit time slots **Kopt** and optimal UAV location **Uopt** in a sequential manner. Simulation results demonstrate that **Kopt** and **Uopt** are critical to URN for a reliable relay transmission. **Kopt** and **Uopt** by the proposed algorithm match well those from exhaustive search, but with significantly reduced computation complexity to determine them over the exhaustive search. In addition, URN allocating multiple transmit time slots on relay transmission links is better than that utilizing two transmit time slots for a single relay transmission in terms of resource efficiency.

**Author Contributions:** Conceptualization, S.I.H. and J.B.; methodology, S.I.H. and J.B.; validation, S.I.H. and J.B.; formal analysis, S.I.H. and J.B.; investigation, J.B.; writing—original draft preparation, S.I.H.; writing—review and editing, S.I.H.; supervision, S.I.H.; project administration, S.I.H. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data is contained within the article.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

