Article

UAV Trajectory Design and Power Optimization for Terahertz Band-Integrated Sensing and Communications

1 School of Information and Electrical Engineering, Hebei University of Engineering, Handan 056038, China
2 Chongqing Engineering Research Center of Intelligent Sensing Technology and Microsystem, Chongqing 400065, China
3 Beijing Advanced Innovation Center for Future Internet Technology, Beijing University of Technology, Beijing 100124, China
4 Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
* Author to whom correspondence should be addressed.
Sensors 2023, 23(6), 3005; https://doi.org/10.3390/s23063005
Submission received: 14 February 2023 / Revised: 4 March 2023 / Accepted: 6 March 2023 / Published: 10 March 2023

Abstract

Sixth generation (6G) wireless networks require very low latency and ultra-high data rates, which have become the main challenges for future wireless communications. To balance the requirements of 6G against the extreme shortage of capacity in existing wireless networks, sensing-assisted communications in the terahertz (THz) band with unmanned aerial vehicles (UAVs) is proposed. In this scenario, the THz-UAV acts as an aerial base station that delivers information to users and transmits sensing signals to detect the THz channel and assist UAV communication. However, communication and sensing signals that use the same resources can interfere with each other. Therefore, we study a cooperative method in which sensing and communication signals co-exist in the same frequency and time allocation with reduced interference. We then formulate an optimization problem to minimize the total delay by jointly optimizing the UAV trajectory, frequency association, and transmission power of each user. The resulting problem is a non-convex mixed-integer optimization problem, which is challenging to solve. By resorting to the Lagrange multiplier method and the proximal policy optimization (PPO) method, we propose an overall alternating optimization algorithm that solves this problem iteratively. First, given the UAV location and frequency, the sub-problem of the sensing and communication transmission powers is transformed into a convex problem, which is solved by the Lagrange multiplier method. Second, in each iteration, for given sensing and communication transmission powers, we relax the discrete variable to a continuous variable and use the PPO algorithm to tackle the sub-problem of jointly optimizing the UAV location and frequency. The results show that the proposed algorithm reduces the delay and improves the transmission rate compared with the conventional greedy algorithm.

1. Introduction

With the emergence of applications such as holographic communication, sensory interconnection, three-dimensional immersive experiences, and the metaverse, terahertz (THz) band communication is envisioned as one of the key enabling technologies to satisfy their needs [1,2]. Specifically, the ultra-wide THz band, which ranges from 0.1 THz to 10 THz, promises to support applications with a high quality of service and terabit-per-second data rates [3]. Its very large bandwidth will enable new applications for future ultra-high data rate communication [4]. In addition to communication applications, the THz band will also enable high-resolution and high-accuracy sensing, such as radar, augmented human senses, and other scenarios [5]. Furthermore, THz networks can realize massive connectivity thanks to plentiful available spectrum resources, as more than 10 billion devices are expected to be connected in the coming years [6].
However, the realization of ultra-high-bandwidth terahertz communication faces three major technical challenges: first, generating high-speed terahertz signals of over 100 Gbps; second, processing high-speed terahertz signals in real time; and third, overcoming the high channel loss of terahertz signals. This paper focuses on the third challenge.
On the one hand, the THz band suffers from a high path loss: as carrier frequencies and communication distances increase, THz wave propagation experiences higher spreading losses and stronger non-line-of-sight path losses attributed to scattering, reflection, diffraction, and shadowing [7]. As a result, non-line-of-sight THz transmissions are rarely received at the receiving end. At the same time, line-of-sight transmission is almost nonexistent in cities because of the large number of obstructions. The deployment of UAVs has been regarded as a complement to existing cellular systems to achieve higher transmission efficiency and capacity [8]. Therefore, UAVs are needed as aerial base stations to provide line-of-sight transmission links for THz frequencies.
On the other hand, THz communications are strongly affected by the molecular absorption loss caused by water molecules in the atmosphere. The atmospheric water molecule content varies during the day, whereas the relevant papers, such as [9], traditionally assume a constant value, which causes errors when choosing channels. The error of such traditional path loss estimates is unacceptable in the THz band. However, THz-UAVs can detect real-time environmental changes through sensing signals and thus measure the terahertz channel parameters. Therefore, the performance of sensing systems should be taken into account when optimizing THz communication resources [10,11]. In short, integrated sensing and communications endows THz-UAV communication networks with the ability to perceive the physical world and thereby improve user information rates. Thus, it is highly necessary to achieve integrated sensing and communications for THz transmission. Table 1 summarizes the novelty of our paper and is discussed in Section 2.
Against this background, in this paper, we propose a THz band sensing-assisted UAV communication network to provide wireless communication for users. Particularly, we focus on downlink communications while jointly optimizing the UAV trajectory, frequency association, and power association. The main contributions of our work include:
Designing a new sensing and communication power optimization method that considers interference between sensing and communication signals in a THz sensing-assisted UAV communication network.
Formulating an optimization problem and proposing an efficient alternating optimization algorithm (Figure 1) to solve it. First, we use the Lagrangian dual decomposition method to obtain the sensing and communication powers with a fixed trajectory. Second, we use the proximal policy optimization (PPO) algorithm to jointly optimize the UAV location and frequency association with fixed sensing and communication powers.
Designing a PPO algorithm for optimizing the UAV trajectory and frequency association. The PPO algorithm uses a critic network with global information and an actor network with local information, which cooperate to explore the flight direction of the UAV and the frequency association.
The rest of this paper is organized as follows: The prior works are described in Section 2. In Section 3, the system model is described. In Section 4, the decomposition problem and the joint optimization design are presented. In Section 5, the simulation results are provided and discussed. Finally, this paper is concluded in Section 6.
Figure 1. Alternating optimization algorithm.

2. Prior Works

The study of UAVs is considered a new frontier field [12,13,14,15,16]. In [12], the authors designed an optimization problem to maximize the sum rate of a satellite and aerial integrated network. In [13], the authors aimed to maximize the energy efficiency of UAV-enabled communication by optimizing its trajectory. In [14], the authors designed a UAV-assisted wireless communication system with limited storage space and energy to realize multi-user communication. In [15], the authors proposed a new protocol for UAV-to-UAV and UAV-to-GCS communication. In [16], the authors gave a short overview of the possible threats, attacks, and countermeasures related to UAV communications.
To exploit THz band UAV wireless communication, some initial works have considered THz-enabled aerial communications [17,18,19]. In [17], the authors proposed a UAV-to-user THz sub-band association scheme to eliminate interference in THz transmission. They showed that terahertz frequencies could be used for communication, and derived extensions of the wireless charging window and the THz-transmitting window. In [18], the authors minimized the total delay of the uplink and downlink transmissions between the UAV and the users by jointly optimizing the location of the operating UAV and the bandwidth of the users, as well as minimizing the transmitting power of the users, thereby optimizing the performance of UAVs communicating at terahertz frequencies. In [19], the authors studied UAV-supported THz communications in which an intelligent reflecting surface (IRS) was deployed to assist the transmission. Their aim was to maximize the minimum average rate of all users, and they formulated and evaluated the resource optimization problem for terahertz UAVs.
Many works have been dedicated to integrated sensing and communications [20,21,22]. In [20], the authors provided a brief explanation of communication rate maximization theory. Their goals were to study the basic phenomenology of communications and to treat such systems in an information-theoretic context. In [21], the authors aimed to further investigate the achievable performance of spectrally overlapping radar and communication systems by conjugating the detection. In [22], the authors developed a new approach for producing joint radar communications performance bounds, that is, they studied the performance limits of combined communication and sensing.
There are growing research interests in power optimization [23,24,25,26]. In [23], the authors’ design objective was to minimize the total transmission power of both the satellite and BS with a limited onboard power resource. In [24], the authors designed an objective function to maximize the system secrecy energy efficiency under the constraint of the total transmission power budget. In [25], the authors investigated the energy minimization problem of a UAV-assisted data collection sensor network. In [26], the authors designed a function that maximized the sum rate in a satellite–terrestrial integrated network, aiming to satisfy the constraints of per-antenna transmission power and quality-of-service requirements of both satellite and cellular users.
Although there are many papers on UAV communications, most of these existing works [20,21,22,23,24,25,26] do not focus on integrated sensing and communication. Therefore, this area is well worth studying.
We have summarized the relevant work in Table 1.

3. System Model and Problem Formulation

3.1. System Model

Let us now consider a downlink from a THz UAV to $N$ users during time horizon $T$, as shown in Figure 2. We suppose that the user equipment is distributed according to a two-dimensional (2D) homogeneous Poisson point process (PPP) $\Phi_u$ with intensity $\lambda_u$. For ease of calculation, the time horizon $T$ is equally divided into $K+1$ time slots of length $\frac{T}{K+1}$. THz-UAVs use integrated sensing and communication to improve the performance of the system. Because sensing and communication signals share spectrum resources, it is challenging to achieve the critical trade-off between these two integrated functionalities. In order to reduce the interference between communication and sensing signals of the same frequency, at time slot 0 the UAV sends only sensing signals, which the users receive. During time slots 1 to $K+1$, the UAV sends both communication and sensing signals, and the users receive both.
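The user placement and slot division described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `sample_ppp_disc` and `slot_length` are hypothetical, the coverage region is taken to be a disc, and the intensity is interpreted as users per unit area.

```python
import math
import random

def sample_ppp_disc(intensity, radius, rng):
    """Sample a homogeneous 2D Poisson point process on a disc (assumed region)."""
    mean = intensity * math.pi * radius ** 2
    # Poisson count: number of unit-rate exponential arrivals before `mean`.
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    points = []
    for _ in range(count):
        # The square root on the radial draw makes points uniform over the area.
        r = radius * math.sqrt(rng.random())
        phi = 2.0 * math.pi * rng.random()
        points.append((r * math.cos(phi), r * math.sin(phi)))
    return points

def slot_length(T, K):
    """Length of each of the K + 1 equal slots in the horizon T."""
    return T / (K + 1)

rng = random.Random(0)                      # fixed seed for repeatability
users = sample_ppp_disc(intensity=0.2, radius=5.0, rng=rng)
tau = slot_length(10.0, 4)                  # hypothetical T = 10 s, K = 4
```

The exponential inter-arrival construction is used for the Poisson count because it stays numerically stable even when the mean number of users is large.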
Therefore, for N targets, the user signal received at time slot k can be expressed as:
$$z_k = \sum_{n=1}^{N} z_k^{n} = \sum_{n=1}^{N} \left[ h_k^{n} \left(P_k^{S,n}\right)^{\frac{1}{2}} S_k + h_k^{n} \left(P_k^{C,n}\right)^{\frac{1}{2}} C_k \right] + n_k, \tag{1}$$
where $S_k$ is the sensing signal, $C_k$ is the communication signal, $P_k^{S,n}$ and $P_k^{C,n}$ are the transmitting powers of the sensing and communication signals at time slot $k$, respectively, $h_k^{n}$ is the THz channel gain from the UAV to user $n$, and $n_k$ is the additive noise.
Without loss of generality, we assume that the UAV is moving with a constant speed denoted by $V$, and the location of the UAV is denoted by $L_k = (x_k, y_k, H)$ at time slot $k$. Here, the altitude, $H$, of the UAV is assumed to be constant. Therefore, the coordinates of the UAV at time slot $k$ should satisfy
$$x_k = x_{k-1} + V_k \cos \psi_{k-1}^{n}, \quad y_k = y_{k-1} + V_k \sin \psi_{k-1}^{n}, \tag{2}$$
where $\psi_{k-1}^{n} \in [0, 4\pi]$ is the direction of the UAV at time slot $k-1$ from the UAV to the user $n$.
The following trajectory constraints of the UAV should be satisfied [27]
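$$\|L_1 - L_0\|^2 \le \left(\frac{VT}{K+1}\right)^2, \quad \|L_k - L_{k-1}\|^2 \le \left(\frac{VT}{K+1}\right)^2, \quad \|L_K - L_{K-1}\|^2 \le \left(\frac{VT}{K+1}\right)^2, \tag{3}$$
where $L_0$ and $L_K$ are the initial location and final location, respectively.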
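As a rough illustration of the position update and the per-slot displacement constraint, the sketch below advances the UAV by one slot and checks the bound $\left(\frac{VT}{K+1}\right)^2$. The helper names and the numerical values are hypothetical, and the displacement per slot is assumed to be the speed multiplied by the slot length.

```python
import math

def next_position(x, y, speed, heading, dt):
    """Advance the UAV by one slot at constant speed along `heading` (radians).

    Assumption: the per-slot displacement is speed * dt.
    """
    return (x + speed * dt * math.cos(heading),
            y + speed * dt * math.sin(heading))

def step_is_feasible(prev, curr, speed, T, K, tol=1e-9):
    """Check ||L_k - L_{k-1}||^2 <= (V T / (K + 1))^2 in the horizontal plane."""
    dx, dy = curr[0] - prev[0], curr[1] - prev[1]
    return dx * dx + dy * dy <= (speed * T / (K + 1)) ** 2 + tol

T, K, V = 10.0, 4, 3.0            # hypothetical horizon (s), K, and speed (m/s)
dt = T / (K + 1)                  # slot length
p0 = (0.0, 0.0)
p1 = next_position(p0[0], p0[1], V, math.pi / 2, dt)
```

Because the altitude $H$ is constant, only the horizontal coordinates enter the distance check.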
Considering the LoS transmission, the path loss between the UAV and the user, n, can be written as [28]:
$$h_k^{n}\left(f_{k,i}^{n}, \varepsilon^{n}(f_{k,i}^{n}, \epsilon_k)\right) = H_k^{\mathrm{Spr}}\left(f_{k,i}^{n}\right) H_k^{\mathrm{Abs}}\left(f_{k,i}^{n}, \varepsilon^{n}(f_{k,i}^{n}, \epsilon_k)\right) e^{-j 2 \pi f_{k,i}^{n}}, \tag{4}$$
where $f_{k,i}^{n}$, $i \in \{f_1, f_2, \ldots, f_I\}$, is the carrier frequency adopted by the UAV for communicating with user $n$, and $\varepsilon^{n}(f_{k,i}^{n}, \epsilon_k)$ is the absorption coefficient, which depends on the carrier frequency $f_{k,i}^{n}$ and the number of water molecules in the atmosphere, $\epsilon_k$, at time slot $k$.
The free space direct ray or LoS channel transfer function, $H^{\mathrm{LoS}}$, consists of the spreading loss function, $H^{\mathrm{Spr}}$, and the molecular absorption loss function, $H^{\mathrm{Abs}}$. The transfer function due to the spreading loss is given by:
$$H_k^{\mathrm{Spr}}\left(f_{k,i}^{n}\right) = \frac{c}{4 \pi f_{k,i}^{n} d_k^{u,n}}. \tag{5}$$
The transfer function of the molecular absorption loss can be expressed as:
$$H^{\mathrm{Abs}}\left(f_{k,i}^{n}, \varepsilon^{n}(f_{k,i}^{n}, \epsilon_k)\right) = e^{-\varepsilon^{n}(f_{k,i}^{n}, \epsilon_k)\, d_k^{u,n}}, \tag{6}$$
where $c$ is the speed of light, $d_k^{u,n}$ is the distance between the UAV and user $n$ at time slot $k$, and the accuracy of $\varepsilon^{n}(f_{k,i}^{n}, \epsilon_k)$ is positively correlated with the sensing power. For the specific formula, please refer to [22].
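The two loss terms can be evaluated numerically as in the sketch below. It is a minimal illustration, assuming a decaying exponential for the absorption term and treating the absorption coefficient as a given number; the function names and the sample values are hypothetical and do not come from the paper.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s

def spreading_gain(freq_hz, dist_m):
    """Spreading-loss transfer function c / (4 * pi * f * d)."""
    return C_LIGHT / (4.0 * math.pi * freq_hz * dist_m)

def absorption_gain(abs_coeff, dist_m):
    """Molecular-absorption transfer function exp(-epsilon * d) (assumed sign)."""
    return math.exp(-abs_coeff * dist_m)

def los_gain_magnitude(freq_hz, dist_m, abs_coeff):
    """Magnitude of the LoS transfer function (the phase term has unit modulus)."""
    return spreading_gain(freq_hz, dist_m) * absorption_gain(abs_coeff, dist_m)

# Hypothetical example: 300 GHz carrier, 20 m link, absorption coefficient 0.01 1/m.
g = los_gain_magnitude(300e9, 20.0, 0.01)
```

A sensing-based estimate of the absorption coefficient would simply replace the fixed value passed as `abs_coeff`.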
The environmental parameters change slowly; therefore, we can use the sensing estimate from time slot $k-1$ to represent the value at time slot $k$. The communication signal of the user at time slot $k$ is the total signal received at time slot $k$, $z_k^{n}$, minus the estimated sensing signal at time slot $k$. Thus, the communication signal received by user $n$ at time slot $k$ can be determined by:
$$C_k^{n} = z_k^{n} - \tilde{h}_{k-1}^{n}\left(f_{k-1,i}^{n}, \varepsilon^{n}(f_{k-1,i}^{n}, \epsilon_{k-1})\right) \left(P_{k-1}^{S,n}\right)^{\frac{1}{2}} S_{k-1}, \tag{7}$$
where $\tilde{h}_{k-1}^{n}\left(f_{k-1,i}^{n}, \varepsilon^{n}(f_{k-1,i}^{n}, \epsilon_{k-1})\right)$ is the THz channel gain at frequency $f_{k,i}^{n}$, which is obtained from the sensing signals.
The THz-UAV needs to extract sensing signals to estimate $\varepsilon^{n}(f_{k,i}^{n}, \epsilon_k)$ and to assign a THz carrier to users. The accuracy of $\varepsilon^{n}(f_{k,i}^{n}, \epsilon_k)$ affects the THz carrier distribution. Similarly, at time slot $k$, the sensing signals received by the THz-UAV can be expressed as:
$$S_k^{n} = z_k^{n} - \tilde{h}_{k-1}^{n}\left(f_{k-1,i}^{n}, \varepsilon^{n}(f_{k-1,i}^{n}, \epsilon_{k-1})\right) \left(P_k^{C,n}\right)^{\frac{1}{2}} C_k. \tag{8}$$
As a result of sensing and communication signals sharing spectrum resources, the error between the real sensing signal at time k and the estimated sensing signal will interfere with communication signals. In addition, other users using the same THz carrier will also interfere with user n. Therefore, the SINR received at user n can be expressed as:
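$$\gamma_k^{n} = \frac{p_k^{C,n} \tilde{h}_{k-1}^{n}(\cdot)}{N_0 + p_k^{S,n} \tilde{h}_{k-1}^{n}(\cdot) - p_{k-1}^{S,n} \tilde{h}_{k-1}^{n}(\cdot) + \sum_{j=1,\, j \neq n}^{N} \left( p_k^{C,j} \tilde{h}_{k-1}^{j}(\cdot) + p_k^{S,j} \tilde{h}_{k-1}^{j}(\cdot) - p_{k-1}^{S,j} \tilde{h}_{k-1}^{j}(\cdot) \right)}, \tag{9}$$
where $N_0$ is the additive white Gaussian noise power at user $n$ using the $i$th carrier frequency of the THz band.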
Correspondingly, the achievable downlink rate of the UAV to user n can be written as [29]:
$$r_k^{n} = B \log\left(1 + \gamma_k^{n}\right), \tag{10}$$
where $B$ is the bandwidth of the UAV to user $n$, which is assumed to be equal for each user.
Thus, the delay of all the users at time slot k can be written as follows:
$$\Phi_k = \sum_{n=1}^{N} \frac{D_n}{B_u \log\left(1 + \gamma_k^{n}\right)}, \tag{11}$$
where $D_n$ is the amount of data required by user $n$.
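The chain from SINR to rate to delay can be traced with a small numerical sketch. The code below is illustrative only: the function names are hypothetical, the logarithm is assumed to be base 2, and the channel estimates are passed in as plain per-user numbers.

```python
import math

def sinr(p_c, p_s, p_s_prev, gain, n, noise):
    """SINR of user n; every list is indexed by user, `gain` holds channel estimates."""
    def residual(j):
        # Communication power of user j plus its uncancelled sensing power.
        return (p_c[j] + p_s[j] - p_s_prev[j]) * gain[j]
    own_sensing_error = (p_s[n] - p_s_prev[n]) * gain[n]
    interference = sum(residual(j) for j in range(len(p_c)) if j != n)
    return p_c[n] * gain[n] / (noise + own_sensing_error + interference)

def rate(bandwidth, gamma):
    """Achievable rate B * log2(1 + gamma) (base 2 is an assumption)."""
    return bandwidth * math.log2(1.0 + gamma)

def total_delay(data, bandwidth, gammas):
    """Sum over users of D_n / rate_n for one time slot."""
    return sum(d / rate(bandwidth, g) for d, g in zip(data, gammas))

# Hypothetical two-user example with unit gains and unit noise power.
g0 = sinr([1.0, 1.0], [0.5, 0.5], [0.5, 0.5], [1.0, 1.0], 0, 1.0)
```

When the sensing power is unchanged between consecutive slots, the sensing residual vanishes and only the other user's communication power interferes.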

3.2. Problem Formulation

Using the above setup, we aim to minimize the delay over the $K+1$ time slots by jointly optimizing the UAV trajectory, frequency association, and transmission power. This optimization problem is mathematically formulated as:
$$\min_{f_k^{n},\, L_k,\, p_k^{C,n},\, p_k^{S,n}} \sum_{k=1}^{K} \sum_{n=1}^{N} \frac{D_n}{B_u \log\left(1 + \gamma_k^{n}\right)} \tag{12}$$
subject to
$$C1: \sum_{i=1}^{I} f_{k,i}^{n} \le 1, \quad C2: \|L_1 - L_0\|^2 \le \left(\frac{VT}{K}\right)^2, \quad C3: \|L_k - L_{k-1}\|^2 \le \left(\frac{VT}{K}\right)^2, \quad C4: \|L_K - L_{K-1}\|^2 \le \left(\frac{VT}{K}\right)^2, \quad C5: \sum_{n=1}^{N} \left( p_k^{C,n} + p_k^{S,n} \right) \le P_k^{\max},$$
where constraint C1 ensures that each user is associated with at most one carrier frequency at each time slot $k$, C2–C4 ensure that the UAV does not exceed its maximum speed within the time horizon $T$, and C5 limits the total transmission power of the sensing and communication signals.
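A candidate solution can be screened against the association constraint C1 and the power budget C5 before its delay is evaluated. The following check is a sketch with hypothetical function names; the association is assumed to be stored as one 0/1 row per user.

```python
def satisfies_c1(assoc):
    """assoc[n][i] is 0 or 1; each user n may use at most one carrier (C1)."""
    return all(set(row) <= {0, 1} and sum(row) <= 1 for row in assoc)

def satisfies_c5(p_c, p_s, p_max):
    """Total communication plus sensing power must not exceed p_max (C5)."""
    return sum(p_c) + sum(p_s) <= p_max

# Hypothetical example: two users, three carriers, power budget of 4.
assoc_ok = [[1, 0, 0], [0, 0, 1]]
assoc_bad = [[1, 1, 0], [0, 0, 1]]
```

The speed constraints C2 to C4 are distance checks between consecutive UAV locations and can be screened in the same manner.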

4. Problem Decomposition and Joint Optimizing Design

4.1. Problem Decomposition

We note that solving problem (12) is challenging for the following reasons. First, the optimization variable $f_{k,i}^{n}$ for user $n$ at time slot $k$ is binary, and thereby the feasible set of problem (12) is non-convex. Second, the variables $L_k$ and $f_{k,i}^{n}$ are strongly coupled with the sensing power and communication power. Hence, problem (12) is a mixed-integer non-convex optimization problem, and in general there is no standard method for solving it efficiently.
To tackle the above challenges, we decompose the original problem (12) into two sub-problems by separating the power allocation optimization (P1) and the trajectory and frequency variables (P2).
We first consider the power variables $p_k^{C,n}$ and $p_k^{S,n}$ in (P1) by fixing the trajectory variable $L_k$ and the frequency variable $f_{k,i}^{n}$. Therefore, subproblem (P1) can be expressed as:
$$(\mathrm{P1}): \min_{p_k^{C,n},\, p_k^{S,n}} \frac{D_n}{B_u \log\left(1 + \gamma_k^{n}\right)} \quad \mathrm{s.t.} \quad C5. \tag{13}$$
We next consider the trajectory and frequency variables in (14) by fixing the UAV power allocation variables $p_k^{C,n}$ and $p_k^{S,n}$. Therefore, subproblem (P2) can be formulated as:
$$(\mathrm{P2}): \min_{L_k,\, f_{k,i}^{n}} \sum_{n=1}^{N} \frac{D_n}{B_u \log\left(1 + \gamma_k^{n}\right)} \quad \mathrm{s.t.} \quad C1\text{–}C4. \tag{14}$$
The two subproblems are optimized alternately over multiple iterations. In the $(j+1)$-th iteration ($j = 0, 1, 2, \cdots, j_{\max}$), we first optimize $p_k^{C,n}$ and $p_k^{S,n}$ in (P1) using the Lagrange multiplier method with the trajectory variable $L_k$ and the frequency variable $f_{k,i}^{n}$ fixed, and denote the solution by $p_k^{*C,n}$, $p_k^{*S,n}$. We then optimize the variables $L_k$ and $f_{k,i}^{n}$ in (P2) using the PPO algorithm, and denote the solution by $L_k^{j+1}$, $f_{k,i}^{n,j+1}$. Once the solution converges or the maximum number of iterations $j_{\max}$ is reached, the solution of (12) is obtained.

4.2. Joint Optimization Design

In this section, we will present the solution to the above two subproblems, and then propose a joint algorithm via separately optimizing the subproblems in an iterative way.

4.2.1. Joint Sensing and Communication Power

Before solving (P1), we first establish the convexity of this problem in Theorem 1 below.
Theorem 1.
Problem (P1) is convex. For the proof, please refer to Appendix A.
Since sub-problem (13) is convex, we choose the Lagrangian dual decomposition method to solve it and obtain the optimal solution $p_k^{*S,n}$ and $p_k^{*C,n}$. The Lagrangian function of (P1) is given by:
$$L\left(p_k^{S,n}, p_k^{C,n}, f_{k,i}^{n}, \chi, \eta, \vartheta\right) = \Phi_1 + \sum_{k=1}^{K} \eta_k \left( \sum_{n=1}^{N} \left( p_k^{C,n} + p_k^{S,n} \right) - P_k^{\max} \right), \tag{15}$$
where $\eta_k$ is the Lagrange multiplier associated with constraint C5.
Since (P1) is convex, it satisfies the Karush–Kuhn–Tucker (KKT) conditions, which can be specifically derived as:
$$\eta_k \left( \sum_{n=1}^{N} \left( p_k^{*C,n} + p_k^{*S,n} \right) - P_k^{\max} \right) = 0, \tag{16}$$
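$$\frac{\partial L(\cdots)}{\partial p_k^{C,n}} = \frac{B_u}{\ln 2} \cdot \frac{1}{\log\left(1 + r_k^{n}\right)} \cdot \frac{1}{1 + r_k^{n}}\, \lambda_1 + \sum_{k=1}^{K} \eta_k = 0, \tag{17}$$
$$\frac{\partial L(\cdots)}{\partial p_k^{S,n}} = \frac{B_u}{\ln 2} \sum_{k=1}^{K} \frac{p_k^{C,n} \tilde{h}_{k-1}^{n}(\cdot) + N_0 + p_k^{S,n} \tilde{h}_{k-1}^{n}(\cdot) - p_{k-1}^{S,n} \tilde{h}_{k-1}^{n}(\cdot) + \sum_{j=1,\, j \neq n}^{N} \left( p_k^{C,j} \tilde{h}_{k-1}^{j}(\cdot) + p_k^{S,j} \tilde{h}_{k-1}^{j}(\cdot) - p_{k-1}^{S,j} \tilde{h}_{k-1}^{j}(\cdot) \right)}{\left( N_0 + p_k^{S,n} \tilde{h}_{k-1}^{n}(\cdot) - p_{k-1}^{S,n} \tilde{h}_{k-1}^{n}(\cdot) + \sum_{j=1,\, j \neq n}^{N} \left( p_k^{C,j} \tilde{h}_{k-1}^{j}(\cdot) + p_k^{S,j} \tilde{h}_{k-1}^{j}(\cdot) - p_{k-1}^{S,j} \tilde{h}_{k-1}^{j}(\cdot) \right) \right)^{2}} \times \frac{1}{\log\left(1 + r_k^{n}\right)} + \sum_{k=1}^{K} \eta_k = 0. \tag{18}$$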
Case 1. If $\eta_k \neq 0$, the KKT condition (16) can be written as:
$$\eta_k \sum_{n=1}^{N} \left( P_k^{S,\max} + P_k^{C,\max} - p_k^{C,n} - p_k^{S,n} \right) = 0, \tag{19}$$
where $P_k^{C,\max}$ and $P_k^{S,\max}$ indicate the maximum communication and sensing powers for time slot $k$, respectively. Since $\eta_k \neq 0$, the solution $\acute{p}_k^{C,n}$ and $\acute{p}_k^{S,n}$ of (P1) can be written in closed form as $\acute{p}_k^{C,n} = P_k^{C,\max}$ and $\acute{p}_k^{S,n} = P_k^{S,\max}$.
Case 2. If $\eta_k = 0$, combining $\eta_k = 0$ with (17) and (18), the solution of (P1) can be written in closed form as:
$$p_k^{S,n} = \sum_{j=1,\, j \neq n}^{N} \frac{p_k^{C,j}\, h_k^{j}\left(f_{k,i}^{j}\right)}{h_k^{n}\left(f_{k,i}^{j}\right)}, \tag{20}$$
$$p_k^{C,n} = \frac{p_{k-1}^{S,n} \tilde{h}_{k-1}^{n}(\cdot) + \sum_{j=1,\, j \neq n}^{N} \left( p_k^{C,j} \tilde{h}_{k-1}^{j}(\cdot) + p_k^{S,j} \tilde{h}_{k-1}^{j}(\cdot) - p_{k-1}^{S,j} \tilde{h}_{k-1}^{j}(\cdot) \right)}{\tilde{h}_{k-1}^{n}(\cdot)}. \tag{21}$$
In summary, the optimal solution $p_k^{*C,n}$ and $p_k^{*S,n}$ of (P1) is the candidate that yields the smaller delay:
$$\left( p_k^{*C,n}, p_k^{*S,n} \right) = \arg\min \left\{ \Phi_k\left( p_k^{C,n} = \acute{p}_k^{C,n},\, p_k^{S,n} = \acute{p}_k^{S,n} \right),\ \Phi_k\left( p_k^{C,n},\, p_k^{S,n} \right) \right\}, \tag{22}$$
where the second candidate is evaluated at the Case 2 solution given by (20) and (21).
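The final selection step amounts to evaluating the delay at both candidate power pairs and keeping the better one. The sketch below illustrates this with a toy single-user delay model; `pick_power`, `toy_delay`, and the candidate values are hypothetical stand-ins, not the closed-form expressions above.

```python
import math

def pick_power(delay_fn, candidates):
    """Return the (p_c, p_s) candidate with the smallest delay."""
    return min(candidates, key=lambda pair: delay_fn(pair[0], pair[1]))

def toy_delay(p_c, p_s):
    """Toy delay: unit data over log2(1 + SINR), sensing power acting as interference."""
    return 1.0 / math.log2(1.0 + p_c / (1.0 + p_s))

boundary = (2.0, 1.0)   # stand-in for the Case 1 (maximum power) candidate
interior = (1.5, 0.0)   # stand-in for the Case 2 (stationary point) candidate
best = pick_power(toy_delay, [boundary, interior])
```

In this toy setting the interior candidate wins because the boundary candidate spends power on sensing that raises the interference term.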

4.2.2. Joint UAV Trajectory and Frequency Association

As shown in Figure 3, we pursue an intelligent UAV trajectory optimization aided by the PPO algorithm to reduce the system delay. The proposed PPO framework treats the UAV as a learning agent. The learning process of the PPO algorithm, in which the UAV interacts with the THz environment, can be expressed as:
$$(S, A, R, \gamma), \tag{23}$$
where $S$ is the state space, $A$ is the action space, and $R = S \times A \to \mathbb{R}$ is the infinite set of rewards, which contains the immediate rewards obtained when moving from one state to the next as a result of the actions taken by the agent. The state, action, and reward are defined as follows:
State: The state observed by the agent is determined by the combination of the sensing and communication transmission powers. Thus, we define the state of the UAV at time step $t$ as follows:
$$S_k(t) = \left\{ \sum_{n=1}^{N} p_k^{S,n,(t)},\ \sum_{n=1}^{N} p_k^{C,n,(t)} \right\}. \tag{24}$$
Action: The action is to choose a proper flight direction and a proper frequency association to obtain better rewards. We denote the action performed at time step $t$ by $a_k(t)$. Suppose that the probability of taking action $a_k$ in state $s_k$ at time step $t$ is $P_{\theta}(a_k(t) \mid s_k(t))$, where $P_{\theta}$ is the probability density function with parameter $\theta$. The action space comprises all possible actions at time step $t$, i.e., $A_k(t) = [0, 4\pi] \times f_{k,i}^{n,(t)}$.
Reward: The agent receives an immediate reward, denoted as $T_k(t) \triangleq T\{s_k(t), a_k(t)\} \in \mathbb{R}$, which describes its benefit from taking action $a_k(t)$. Thus, the reward function can be written as:
$$T_n = \sum_{k'=k}^{K} \eta_k^{k'} \Phi_{k'}, \tag{25}$$
where $\eta_k^{k'} \in [0, 1]$ is the discount rate, which determines the effect of future rewards on the current action. $\eta_k^{k'} \to 1$ means that the reward value of the future state has a great influence on the action state function, while $\eta_k^{k'} \to 0$ means that the reward value of the future state has little influence on the action state function.
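The role of the discount rate can be seen by accumulating per-slot delays with a geometric weight. This sketch assumes the weight for slot $j$ is the discount raised to the power $j - k$, which is one common reading of a discounted return; the function name is hypothetical and the sign convention (delay as a cost) is left to the caller.

```python
def discounted_return(delays, k, discount):
    """Sum of discount**(j - k) * delays[j] for j = k, ..., len(delays) - 1."""
    return sum(discount ** (j - k) * delays[j] for j in range(k, len(delays)))

delays = [1.0, 2.0, 4.0]                      # hypothetical per-slot delays
far_sighted = discounted_return(delays, 0, 1.0)
balanced = discounted_return(delays, 0, 0.5)
myopic = discounted_return(delays, 0, 0.0)    # only the current slot counts
```

A discount near 1 lets later slots weigh as much as the current one, whereas a discount near 0 reduces the return to the current slot alone.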
In the policy gradient algorithm, shown in Algorithm 1, the agent updates the policy by gradient ascent. In PPO, the old actor obtains its parameters by copying the current actor's parameters. To avoid introducing too much error, we introduce $ratio_k$ to limit the magnitude of each update. In other words, when calculating the rewards, limiting the ratio of the new policy to the old policy bounds how far the policy can change in a single step. As a result, it not only improves the stability of the PPO algorithm, but also reduces its complexity and improves the efficiency of the calculation. In this paper, the ratio of the new to the old policy of each agent is calculated as follows:
$$ratio_k = \frac{\theta_k}{\theta_{k-1}}, \quad k \in \{1, 2, \ldots, K\}. \tag{26}$$
Figure 3 describes the operation of the PPO algorithm. During training, a set of samples is chosen from the storage system to update the THz network parameters. The value network determines the choice of action through the reward values of these samples. The reward values in turn affect the sampling probability density functions. When the agent explores the THz network parameters, it selects an action at random, targeting a higher long-term reward; otherwise, it selects the action that yields the largest immediate reward. In order to improve the sampling efficiency, PPO adopts importance sampling to change the policy gradient algorithm from on-policy to off-policy. The update formula of the actor network is then:
$$\min_{\pi_{\theta_k}} \Phi_k^{2}(\theta_k) = \mathbb{E}_{s_t \sim P_{\theta_k}(\tau)} \left[ J^{\theta_k}(\theta_k) \right], \tag{27}$$
where $\tau = \{s_1, a_1, s_2, a_2, \ldots, s_K, a_K\}$ represents the trajectory of the agent in the entire episode.
PPO uses a clip function to directly limit the update range to $[1 - \varepsilon, 1 + \varepsilon]$. From Figure 4, this function of PPO can be written as follows:
$$J^{\theta_k}(\theta_k) \approx \sum_{(s_t, a_t)} \min\left\{ \frac{P_{\theta_k}(a_k \mid s_k)}{P_{\theta}(a_k \mid s_k)} A^{\theta_k}(s_t, a_t),\ \mathrm{clip}\left( \frac{P_{\theta_k}(a_k \mid s_k)}{P_{\theta}(a_k \mid s_k)}, 1 - \varepsilon, 1 + \varepsilon \right) A^{\theta_k}(s_t, a_t) \right\}, \tag{28}$$
where $\varepsilon$ is a hyperparameter that represents the maximum difference between $P_{\theta_k}$ and $P_{\theta}$. $P_{\theta_k}(\tau)$ interacts with the environment and $P_{\theta}(\tau)$ has already interacted with the environment. Furthermore, $A^{\theta_k}(s_t, a_t)$ represents the estimate of the advantage function at time step $t$ and can be written as:
$$A^{\theta_k}(s_k, a_k) \approx \frac{1}{J} \sum_{j=1}^{J} \left( \Phi_k^{2} - E\left[\Phi_k^{2}\right] \right), \tag{29}$$
where $J$ is the number of points sampled with probability $P_{\theta}(a_k \mid s_k)$, and $P_{\theta_k}(a_k \mid s_k)$ is the probability density function with the modified parameters $\theta_k$. Furthermore, the clip function can be written as:
$$\mathrm{clip}(x, x_{\min}, x_{\max}) = \begin{cases} x, & \text{if } x_{\min} \le x \le x_{\max}, \\ x_{\min}, & \text{if } x < x_{\min}, \\ x_{\max}, & \text{if } x > x_{\max}. \end{cases} \tag{30}$$
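The clip function and the clipped surrogate can be written down directly. The following is a minimal sketch of the standard PPO clipped objective over a handful of sampled state-action pairs; the function names are hypothetical and the advantages are computed here as sample values minus their mean, which is an assumed simplification.

```python
def clip(x, x_min, x_max):
    """Return x limited to the interval [x_min, x_max]."""
    return max(x_min, min(x, x_max))

def clipped_surrogate(new_probs, old_probs, advantages, eps):
    """Sum over samples of min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A)."""
    total = 0.0
    for p_new, p_old, adv in zip(new_probs, old_probs, advantages):
        ratio = p_new / p_old
        total += min(ratio * adv, clip(ratio, 1.0 - eps, 1.0 + eps) * adv)
    return total

def advantages_from_samples(values):
    """Advantage estimate: each sampled value minus the sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

gain_case = clipped_surrogate([0.6], [0.4], [1.0], 0.2)    # ratio 1.5, A > 0
loss_case = clipped_surrogate([0.2], [0.4], [-1.0], 0.2)   # ratio 0.5, A < 0
```

With a positive advantage the ratio is capped at $1 + \varepsilon$, and with a negative advantage it is floored at $1 - \varepsilon$, so a single update cannot move the policy arbitrarily far.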
The formula for updating the action probability, $P_{\theta_k}(\tau)$, can be written as:
$$\theta_{k+1} \leftarrow \theta_k + \eta_k^{k} \frac{\partial \Phi_k}{\partial \theta_k}. \tag{31}$$

4.3. Computational Complexity

Theorem 2.
The complexity of Algorithm 2 is given by $O(N + j_{\max} \cdot KN)$.
Proof. 
In Algorithm 2, the computationally most expensive part is solving the sub-problems in (P1) (line 2) and (P2) (line 3).
In line 2 of Algorithm 2, sub-problem (P1) is solved. Every user needs to calculate function (22). Since there are $N$ users, the computational complexity of the Lagrangian method is $O(N)$.
In line 3 of Algorithm 2, sub-problem (P2) is solved by Algorithm 1. The computationally most expensive part is lines 3 and 4 of Algorithm 1, where we need to calculate the probability density function parameter $\theta_k$ and the reward function. Thus, the computational complexity is $O(KN)$. We assume that the maximum number of iterations of Algorithm 1 is $j_{\max}$. Therefore, the total computational complexity of Algorithm 1 can be written as $O(j_{\max} \cdot KN)$.
To summarize, the overall computational complexity of Algorithm 2 is calculated as $O(N + j_{\max} \cdot KN)$. This concludes the proof.    □
Algorithm 1 The Proximal Policy Optimization Algorithm
1: for iteration = 1, 2, ..., $j_{\max}$ do
2:     for action = 1, 2, ..., $K$ do
3:         Run policy $\theta_k$ in environment for $K$ time steps according to (30)
4:         Compute advantage estimates $A^{\theta_k}$ according to (29)
5:     end for
6:     Optimize surrogate $\theta_k$
7:     $\theta_{k+1} \leftarrow \theta_k$
8:     Calculate $J^{\theta_k}(\theta_k)$ according to (28)
9: end for
Algorithm 2 The Proposed Alternating Optimization Algorithm to Solve Problem (12)
1: for iteration = 1, 2, ..., $j_{\max}$ do
2:     Solve problem (P1) for given $L_k$, $f_{k,i}^{n}$ and denote the optimal solution as $p_k^{*C,n}$, $p_k^{*S,n}$.
3:     Solve problem (P2) for given $p_{k,j}^{*C,n}$, $p_{k,j}^{*S,n}$ and denote the suboptimal solution as $L_{k,j+1}^{*}$, $f_{k,j+1}^{*n}$.
4:     j = j + 1;
5: end for
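The control flow of the alternating procedure can be sketched independently of the two solvers. In the code below, `solve_p1`, `solve_p2`, and `objective` are hypothetical callables standing in for the Lagrangian power step, the PPO step, and the delay; the toy problem at the end only demonstrates convergence of the loop and is not the paper's model.

```python
def alternating_optimization(solve_p1, solve_p2, objective, traj, freq,
                             max_iters=50, tol=1e-6):
    """Alternate between the power step and the trajectory/frequency step."""
    prev = float("inf")
    powers = None
    for _ in range(max_iters):
        powers = solve_p1(traj, freq)        # line 2: (P1) for fixed L, f
        traj, freq = solve_p2(powers)        # line 3: (P2) for fixed powers
        value = objective(powers, traj, freq)
        if abs(prev - value) < tol:          # stop once the objective settles
            break
        prev = value
    return powers, traj, freq

# Toy stand-in problem: minimise (p - x)^2 + (x - 2)^2 by alternating over p and x.
toy_p1 = lambda x, f: x                      # best p for fixed x
toy_p2 = lambda p: ((p + 2.0) / 2.0, 0)      # best x for fixed p; frequency unused
toy_obj = lambda p, x, f: (p - x) ** 2 + (x - 2.0) ** 2
p_opt, x_opt, _ = alternating_optimization(toy_p1, toy_p2, toy_obj, 0.0, 0,
                                           max_iters=200, tol=1e-12)
```

Stopping on a small change of the objective is one possible convergence test; a fixed iteration budget alone, as in Algorithm 2, works as well.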

5. Simulation Results

In this section, we numerically evaluate the performance of the overall alternating optimization algorithm of intelligent trajectory planning by implementing simulations in MATLAB. The radius of the UAV coverage area was set to 50 m. We set the bandwidth which is allocated to the UAV as 10 GHz. We adopted THz carrier frequencies of 300 GHz, 310 GHz, 320 GHz, 330 GHz, 340 GHz, and 350 GHz. The details of the relevant parameters are listed in Table 2.
To investigate the convergence behavior of the proposed algorithm, we start by illustrating the accumulated UAV communication rate versus the number of iterations when the user Poisson distribution parameter is $\lambda_u = 0.2$, $0.3$, or $0.4$ persons per meter, as shown in Figure 5. It is observed that the proposed algorithm provides a higher system sum rate than the greedy sampling algorithm. This is because the PPO algorithm considers the rewards from time $k+1$ to $K+1$, whereas the greedy algorithm obtains its result at time $k$ by mass sampling. Without considering the other possible cases, the greedy algorithm selects the locally optimal solution each time and performs no backtracking, so the optimal solution is rarely obtained. This highlights the importance of the PPO algorithm and explains why it gives a better sum rate for the system.
In Figure 6, we compare the system sum rate of the proposed algorithm and the greedy algorithm in the THz and Sub-6G frequency ranges under varying user distribution functions. The proposed algorithm provides a higher system sum rate than the greedy algorithm, because the greedy algorithm performs a large number of random samplings at time $k$, while the PPO algorithm considers not only the system performance at time $k$ but also the system performance from time $k$ to time $K+1$. The THz band also provides a higher system sum rate than the Sub-6G band. This is because the signal-to-noise ratio is much higher at terahertz frequencies than at Sub-6G frequencies: the high path loss of the THz channel results in low interference between users. This highlights the importance of an appropriate algorithm for the THz band.
In Figure 7, the relationship between the maximum communication power and sensing power is shown. It is observed that as the maximum communication power increases, the transmitting sensing power increases, but once the maximum value is reached, the sensing and communication powers start to decrease to maintain the same communication rate. This is because the communication and sensing signals share a spectrum. When the value of the maximum transmitted communication power increases, the THz-UAV increases the power of communication in order to obtain a higher information rate. As a result of the C5 constraint, the sensing power becomes smaller. The precision of sensing the terahertz channel will be affected by the decrease in sensing power. This will affect the allocation of the THz-UAV channel and cause the information rate to decrease. Therefore, there must be a maximum value of the sensing power to obtain the minimum delay.
In Figure 8, we show the relationship between the frequency (spectral) efficiency and the number of users in the THz and sub-6 GHz bands for user density parameters λ_u = 0.2 and λ_u = 0.3. As the number of users increases, the spectral efficiency increases, because the information rate improves considerably with the number of users. The figure also shows that, for the same number of users, a higher user density parameter gives a lower spectral efficiency. This is because a higher user density parameter leads to stronger interference between users, which reduces the information rate and therefore the spectral efficiency. Hence, the spectral efficiency of THz wireless communication is easily affected by the user density.
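The effect of inter-user interference on spectral efficiency can be sketched with the Shannon expression log2(1 + SINR). The function names and the numeric powers below are hypothetical, in linear units, and chosen only so that the arithmetic is easy to follow; they do not reproduce the system model or the simulation values of this paper.

```python
import math


def sinr(signal_power, interference_powers, noise_power):
    """Signal-to-interference-plus-noise ratio in linear units."""
    return signal_power / (noise_power + sum(interference_powers))


def spectral_efficiency(sinr_value):
    """Shannon spectral efficiency in bit/s/Hz for a given linear SINR."""
    return math.log2(1.0 + sinr_value)


# Hypothetical received powers (linear units, same scale for all terms).
no_interference = spectral_efficiency(sinr(3.0, [], 1.0))
with_interference = spectral_efficiency(sinr(3.0, [2.0], 1.0))
```

In this toy case a single interferer with received power 2.0 halves the spectral efficiency, from 2 bit/s/Hz to 1 bit/s/Hz, which mirrors the trend described for a higher user density parameter.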

6. Conclusions

This paper investigated the joint optimization of the UAV trajectory, frequency association, and transmission power, aiming to minimize the sum delay in the terahertz band. The sum delay minimization was formulated as a non-convex mixed-integer optimization problem, which was decomposed into two sub-problems solved alternately. The first sub-problem, solved by the Lagrange multiplier method, yields the sensing and communication powers. The second, solved by a PPO algorithm, yields the UAV trajectory and frequency association. Our results showed that the proposed algorithm achieved good performance, with a significant reduction in the sum delay compared with the greedy algorithm and the sub-6 GHz scenario, indicating its potential for practical design. However, the method has not yet been tested on a real UAV. Thus, there is a certain gap between theory and practice, which provides a direction for future research.

Author Contributions

Conceptualization, Y.G.; methodology, Y.G.; software, L.Z.; validation, H.X., L.Z. and E.S.; formal analysis, L.Z.; investigation, L.Z.; resources, L.Z.; data curation, L.Z.; writing—original draft preparation, Y.G.; writing—review and editing, Y.G. and L.Z.; visualization, L.Z.; supervision, L.Z., H.X. and E.S.; project administration, L.Z.; funding acquisition, L.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China under grant 61971032, in part by the Hebei Natural Science Foundation under grant F2022402001 and grant A2020402013, and in part by the Open Fund of Chongqing Engineering Research Center of Intelligent Sensing Technology and Microsystem under grant D2021337.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to legal restrictions.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Proof. 
The second-order derivatives of the objective function (13) with respect to $p_k^{C,n}$ and $p_k^{S,n}$ can be obtained, respectively, as

$$\frac{\partial^2 \Phi_k^1}{\partial^2 p_k^{C,n}} = \sum_{n=1}^{N} \frac{2 D_n}{B_u \ln 2}\, \lambda_1 \left(\frac{1}{1+\gamma_k^n}\right)^{2} \lambda_1\, \frac{1}{\log(1+\gamma_k^n)} \geq 0$$

where

$$\lambda_1 = \frac{h_k^n(f_{k,i}^n)}{N_0 + p_k^{S,n} h_k^n(f_{k,i}^n) - p_{k-1}^{S,n} h_{k-1}^n(f_{k,i}^n) + \sum_{j=1,\, j \neq n} \left[ p_k^{C,j} h_k^j(f_{k,i}^j) + p_k^{S,j} h_k^j(f_{k,i}^n) - p_{k-1}^{S,j} h_{k-1}^j(f_{k,i}^n) \right]},$$

and

$$\frac{\partial^2 \Phi_k^1}{\partial^2 p_k^{S,n}} = \sum_{n=1}^{N} \frac{2 D_n}{B_u \ln 2}\, \lambda_2 \left(\frac{1}{1+\gamma_k^n}\right)^{2} \left(\frac{h_k^n(f_{k,i}^n) + \lambda_2}{p_k^{C,n} h_k^n(f_{k,i}^n)}\right)^{2} \left(2 + \gamma_k^n + \frac{h_k^n(f_{k,i}^n) + \lambda_2}{p_k^{C,n} h_k^n(f_{k,i}^n)}\right) \geq 0$$

where

$$\lambda_2 = N_0 - p_{k-1}^{S,n} h_{k-1}^n(f_{k,i}^n) + \sum_{j=1,\, j \neq n} \left[ p_k^{C,j} h_k^j(f_{k,i}^j) + p_k^{S,j} h_k^j(f_{k,i}^n) - p_{k-1}^{S,j} h_{k-1}^j(f_{k,i}^n) \right].$$
Therefore, the problem (P1) is convex. This concludes the proof. □
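As an informal sanity check of the convexity claim, the following sketch considers a simplified single-user case in which the interference is held fixed, so that the SINR is gamma = lam * p for a constant lam. The delay is then D / (B * log2(1 + lam * p)). The function names, the parameter values, and this simplification are assumptions for illustration; the sketch does not reproduce the objective function (13) itself. It compares a central finite-difference estimate of the second derivative with the closed form obtained by differentiating twice.

```python
import math


def delay(p, data_bits=1e6, bandwidth_hz=1e9, lam=2.0):
    """Simplified single-user delay D / (B * log2(1 + lam * p))."""
    return data_bits / (bandwidth_hz * math.log2(1.0 + lam * p))


def delay_second_derivative(p, data_bits=1e6, bandwidth_hz=1e9, lam=2.0):
    """Closed-form second derivative of the simplified delay w.r.t. p."""
    gamma = lam * p
    g = math.log2(1.0 + gamma)
    prefactor = data_bits * lam**2 / (
        bandwidth_hz * math.log(2.0) * (1.0 + gamma) ** 2
    )
    return prefactor * (1.0 / g**2 + 2.0 / (math.log(2.0) * g**3))


def finite_difference_second_derivative(f, x, step=1e-3):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + step) - 2.0 * f(x) + f(x - step)) / step**2


# Hypothetical transmit powers (linear units) at which convexity is checked.
powers = [0.1, 0.5, 1.0, 2.0]
numeric = [finite_difference_second_derivative(delay, p) for p in powers]
analytic = [delay_second_derivative(p) for p in powers]
```

Both the numerical and the closed-form values are positive at every test point, which is what convexity in the transmit power requires for this simplified case.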

References

  1. Zhang, L.; Zhao, H.; Hou, S.; Zhao, Z.; Xu, H.; Zhang, R. A Survey on 5G Millimeter Wave Communications for UAV-Assisted Wireless Networks. IEEE Access 2019, 7, 117460–117504.
  2. Andrews, J.G.; Buzzi, S.; Choi, W.; Hanly, S.V.; Lozano, A.; Soong, A.C.K.; Zhang, J.C. What will 5G be? IEEE J. Sel. Areas Commun. 2014, 32, 1065–1082.
  3. Sarieddeen, H.; Saeed, N.; Al-Naffouri, T.Y.; Alouini, M.-S. Next generation terahertz communications: A rendezvous of sensing, imaging, and localization. IEEE Commun. Mag. 2020, 58, 69–75.
  4. Zhang, Z.; Xiao, Y.; Ma, Z.; Xiao, M.; Ding, Z.; Lei, X.; Karagiannidis, G.K.; Fan, P. 6G wireless networks: Vision, requirements, architecture, and key technologies. IEEE Veh. Technol. Mag. 2019, 14, 28–41.
  5. Liu, A.; Huang, Z.; Li, M.; Wan, Y.; Li, W.; Han, T.X.; Liu, C.; Du, R.; Tan, D.K.P.; Lu, J.; et al. A Survey on Fundamental Limits of Integrated Sensing and Communication. IEEE Commun. Surv. Tutor. 2022, 24, 994–1034.
  6. Liu, F.; Cui, Y.; Masouros, C.; Xu, J.; Han, T.X.; Eldar, Y.C.; Buzzi, S. Integrated Sensing and Communications: Toward Dual-Functional Wireless Networks for 6G and Beyond. IEEE J. Sel. Areas Commun. 2022, 40, 1728–1767.
  7. Zhang, J.; Fei, Z.; Wang, X.; Liu, P.; Huang, J.; Zheng, Z. Integrated Scheduling of Sensing, Communication, and Control for mmWave/THz Communications in Cellular Connected UAV Networks. IEEE J. Sel. Areas Commun. 2022, 40, 2103–2113.
  8. Zhang, L.; Wang, Y.; Min, M.; Guo, C.; Sharma, V.; Han, Z. Privacy-Aware Laser Wireless Power Transfer for Aerial Multi-Access Edge Computing: A Colonel Blotto Game Approach. IEEE Internet Things J. 2022, 15, 2327–4662.
  9. Wang, X.; Wang, P.; Ding, M.; Lin, Z.; Lin, F.; Vucetic, B.; Hanzo, L. Performance Analysis of Terahertz Unmanned Aerial Vehicular Networks. IEEE Trans. Veh. Technol. 2020, 69, 16330–16335.
  10. Griffiths, H.; Cohen, L.; Watts, S.; Mokole, E.; Baker, C.; Wicks, M.; Blunt, S. Radar spectrum engineering and management: Technical and regulatory issues. Proc. IEEE 2015, 103, 85–102.
  11. Roberton, M.; Brown, E.R. Integrated radar and communications based on chirped spread-spectrum techniques. IEEE MTT-S Int. Microw. Symp. Dig. 2013, 1, 611–614.
  12. Lin, Z.; Lin, M.; de Cola, T.; Wang, J.; Zhu, W.; Cheng, J. Supporting IoT With Rate-Splitting Multiple Access in Satellite and Aerial-Integrated Networks. IEEE Internet Things J. 2021, 8, 11123–11134.
  13. Yuan, Z.; Yang, Y.; Wang, D.; Ma, X. Energy-Efficient Trajectory Optimization for UAV-Enabled Cellular Communications Based on Physical-Layer Security. Aerospace 2022, 9, 50.
  14. Lan, T.; Qin, D.; Sun, G. Joint Optimization on Trajectory, Cache Placement, and Transmission Power for Minimum Mission Time in UAV-Aided Wireless Networks. ISPRS Int. J. Geo-Inf. 2021, 10, 426.
  15. Ko, Y.; Kim, J.; Duguma, D.G.; Astillo, P.V.; You, I.; Pau, G. Drone Secure Communication Protocol for Future Sensitive Applications in Military Zone. Sensors 2021, 21, 2057.
  16. Krichen, M.; Adoni, W.Y.H.; Mihoub, A.; Alzahrani, M.Y.; Nahhal, T. Security Challenges for Drone Communications: Possible Threats, Attacks and Countermeasures; SMARTTECH: Riyadh, Saudi Arabia, 2022; pp. 184–189.
  17. Li, Q.; Nayak, A.; Zhang, Y.; Yu, F.R. A Cooperative Recharging-Transmission Strategy In Powered UAV-Aided Terahertz Downlink Networks. IEEE Trans. Veh. Technol. 2022, 1939–9359.
  18. Xu, L.; Chen, M.; Chen, M.; Yang, Z.; Chaccour, C.; Saad, W.; Hong, C.S. Joint Location, Bandwidth and Power Optimization for THz-enabled UAV Communications. IEEE Commun. Lett. 2021, 25, 1984–1988.
  19. Raza, A.; Ijaz, U.; Ishfaq, M.K.; Ahmad, S.; Liaqat, M.; Anwar, F.; Iqbal, A.; Sharif, M.S. Intelligent reflecting surface-assisted terahertz communication towards B5G and 6G: State-of-the-art. Microw. Opt. Technol. Lett. 2022, 64, 858–866.
  20. Chiriyath, A.R.; Paul, B.; Bliss, D.W. Radar-Communications Convergence: Coexistence, Cooperation, and Co-Design. IEEE Trans. Cogn. Commun. Netw. 2017, 3, 1–12.
  21. Zheng, L.; Lops, M.; Wang, X.; Grossi, E. Joint Design of Overlaid Communication Systems and Pulsed Radars. IEEE Trans. Signal Process. 2018, 66, 139–154.
  22. Chiriyath, A.R.; Paul, B.; Jacyna, G.M.; Bliss, D.W. Inner Bounds on Performance of Radar and Communications Co-Existence. IEEE Trans. Signal Process. 2015, 64, 464–474.
  23. Lin, Z.; Niu, H.; An, K.; Wang, Y.; Zheng, G.; Chatzinotas, S.; Hu, Y. Refracting RIS-Aided Hybrid Satellite-Terrestrial Relay Networks: Joint Beamforming Design and Optimization. IEEE Trans. Aerosp. Electron. Syst. 2022, 58, 3717–3724.
  24. Lin, Z.; An, K.; Niu, H.; Hum, Y.; Hu, Y.; Chatzinotas, S.; Zheng, G.; Wang, J. SLNR-based Secure Energy Efficient Beamforming in Multibeam Satellite Systems. IEEE Trans. Aerosp. Electron. Syst. 2022, in press.
  25. Wang, Y.; Chen, M.; Pan, C.; Wang, K.; Pan, Y. Joint Optimization of UAV Trajectory and Sensor Uploading Powers for UAV-Assisted Data Collection in Wireless Sensor Networks. IEEE Internet Things J. 2022, 9, 11214–11226.
  26. Lin, Z.; Lin, M.; Wang, J.; de Cola, T.; Wang, J. Joint Beamforming and Power Allocation for Satellite-Terrestrial Integrated Networks With Non-Orthogonal Multiple Access. IEEE J. Sel. Top. Signal Process. 2019, 13, 657–670.
  27. Zhang, L.; Ma, X.; Zhuang, Z.; Xu, H.; Sharma, V.; Han, Z. Q-Learning Aided Intelligent Routing with Maximum Utility in Cognitive UAV Swarm for Emergency Communications. IEEE Trans. Veh. Technol. 2022, in press.
  28. Han, C.; Bicen, A.O.; Akyildiz, I.F. Multi-Ray Channel Modeling and Wideband Characterization for Wireless Communications in the Terahertz Band. IEEE Trans. Wirel. Commun. 2015, 14, 2402–2412.
  29. Zhang, L.; Zhang, H.; Guo, C.; Xu, H.; Song, L.; Han, Z. Satellite-Aerial Integrated Computing in Disasters: User Association and Offloading Decision. In Proceedings of the 2020 IEEE International Conference on Communications (ICC), Dublin, Ireland, 7–11 June 2020; pp. 554–559.
Figure 2. Illustration of a terahertz band integrated sensing and communications network.
Figure 3. PPO algorithm.
Figure 4. The value of J θ k ( θ k ) .
Figure 5. Number of iterations of the algorithms.
Figure 6. Relationship between rate and number of users.
Figure 7. Relationship between sensing and communication powers.
Figure 8. Relationship between frequency efficiency and number of users.
Table 1. Our novel contribution contrasted to the state-of-the-art in UAV communication research.
[12,13,14,15,16][17,18,19][20,21,22][23,24,25,26]Our Work
THz Frequency××
UAVs Communication××
Power Optimization××
Integrated Sensing and Communication×××
UAV Trajectory Design×
Table 2. Simulation parameters.
| Parameter | Value | Parameter | Value |
| --- | --- | --- | --- |
| Time, T | 20 s | A-BS height, H | 5 m |
| Time slot, K + 1 | 1 ms | A-BS speed, V | [0, 3] m/s |
| Noise power, N_0 | −20 dBm | Reference pressure, p_0 | 101.325 kPa |
| Reference temperature, T_STP | 20 | Maximum sensing transmission power, p_k^{S,n} | 30 dBm |
| Maximum communication transmission power, p_k^{C,n} | 30 dBm | Discount rate, η | 0.1 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Gao, Y.; Xue, H.; Zhang, L.; Sun, E. UAV Trajectory Design and Power Optimization for Terahertz Band-Integrated Sensing and Communications. Sensors 2023, 23, 3005. https://doi.org/10.3390/s23063005