Article

Biological Intelligence Inspired Trajectory Design for Energy Harvesting UAV Networks †

1
Beijing Laboratory of Advanced Information Network, Beijing University of Posts and Telecommunications, Beijing 100876, China
2
Beijing Key Laboratory of Network System Architecture and Convergence, Beijing University of Posts and Telecommunications, Beijing 100876, China
3
State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
*
Author to whom correspondence should be addressed.
This paper is an extended version of our paper published in Liu, X.; Chen, M.; Wang, S.; Saad, W.; Yin, C. Trajectory Design for Energy Harvesting UAV Networks: A Foraging Approach. In Proceedings of the 2020 IEEE Wireless Communications and Networking Conference (WCNC), Seoul, Republic of Korea, 25–28 May 2020; pp. 1–6.
Sensors 2023, 23(2), 863; https://doi.org/10.3390/s23020863
Submission received: 29 November 2022 / Revised: 3 January 2023 / Accepted: 8 January 2023 / Published: 11 January 2023
(This article belongs to the Special Issue Artificial Intelligence (AI)-Enabled 6G Communications)

Abstract

In this paper, the problem of trajectory design for energy harvesting unmanned aerial vehicles (UAVs) is studied. In the considered model, the UAV acts as a moving base station to serve the ground users, while collecting energy from the charging stations located at the center of a user group. For this purpose, the UAV must be examined and repaired regularly. In consequence, it is necessary to optimize the trajectory design of the UAV while jointly considering the maintenance costs, the reward of serving users, the energy management, and the user service time. To capture the relationship among these factors, we first model the completion of service and the harvested energy as the reward, and the energy consumption during the deployment as the cost. Then, the deployment profitability is defined as the ratio of the reward to the cost of the UAV trajectory. Based on this definition, the trajectory design problem is formulated as an optimization problem whose goal is to maximize the deployment profitability of the UAV. To solve this problem, a foraging-based algorithm is proposed to find the optimal trajectory so as to maximize the deployment profitability and minimize the average user service time. The proposed algorithm finds the optimal trajectory for the UAV with polynomial time complexity. Theoretical analysis shows that the proposed algorithm achieves the maximal deployment profitability. Simulation results show that, compared to the Q-learning algorithm, the proposed algorithm effectively reduces the operation time and the average user service time while achieving the maximal deployment profitability.

1. Introduction

1.1. Background and Motivation

Taking advantage of their mobility and low cost, unmanned aerial vehicles (UAVs) can provide swifter deployment and better communication channels for next-generation wireless communication systems [1]. In fact, UAVs have already been deployed in a wide range of fields [2], such as wireless power transfer, wireless sensor networks, and secure communications. However, limited on-board energy remains a key challenge in UAV-assisted wireless networks.
Because the on-board communication equipment consumes extra energy, the flight time of a UAV can be substantially reduced [3]. In UAV-assisted networks, the flight time of the UAVs determines the lifetime of the communication network. Optimizing the energy management of the UAVs can therefore effectively extend the lifetime of UAV-assisted networks. On the other hand, the extra payload carried by the UAVs increases the probability of UAV damage and malfunction [4]. This risk degrades the reliability of the UAV service and affects the accuracy and performance of the UAV-assisted wireless network, which makes regular repair necessary before UAV deployment and increases the cost of maintenance.
Motivated by the aforementioned factors, we focus on a wireless network in which a UAV provides service to ground users. In such a network, the UAV can harvest energy from charging stations to extend its flight time. By selecting the served users and designing the trajectory, the UAV can further optimize its energy consumption. In the studied scenario, the UAV trajectory is jointly evaluated by the maintenance costs, the energy management, the completion of users’ requests, and the user service time.

1.2. Related Work

The existing literature has studied a number of problems related to the energy management of UAVs for wireless communication systems, such as [5,6,7,8,9,10]. The authors in [5] derived a theoretical model of the propulsion energy consumption of UAVs, which first correlated the UAVs’ energy consumption with the varying flying speed, direction, and acceleration in UAV communications. The work in [6] investigated the energy trade-off between the communication power and the propulsion power, so as to find an energy-efficient design of the UAV trajectory. In [7], the authors studied the energy efficiency of a multi-UAV coverage deployment model using a game-theoretic framework and proposed a sub-optimal energy-efficient coverage deployment by decoupling the coverage maximization and power control. The authors in [8] studied the trade-off between energy consumption and completion time in a UAV-enabled wireless power communication network, so as to achieve better communication performance. The works in [5,6,7,8] only optimize the UAV energy consumption to save energy, but ignore energy replenishment, which can also extend the working time of a UAV so that it serves more users. The work in [9] studied the energy of solar-powered UAVs and considered solar energy harvesting during the UAV deployment, which enhances the UAV communication capacity. The authors in [10] introduced ground solar panels to recharge UAVs and discussed the relationship between the UAV battery level and the UAV coverage. With the ground solar panels as a supplementary energy source, the mission duration of UAVs can be extended. However, most of the existing works such as [5,6,7,8,9,10] solve the UAV energy management problems with optimization methods, which take too much time for UAVs to obtain the optimal policies to execute in practical environments.
A number of existing works [11,12,13] have studied the combination of low-complexity biological intelligence with UAV control. The work in [11] studied the collision-free trajectory problem by introducing swarm behaviors, which make the UAV aware of spatial-temporal constraints and eliminate collision conflicts. The authors in [12] exploited biological robustness to design a reliable multi-UAV network that adaptively resists node failures. In [13], the authors proposed a target searching scenario for multiple UAVs and coordinated the UAV behaviors as stigmergic and flocking behaviors. With this bio-inspired strategy, the UAVs can efficiently search for and sense potential targets. Motivated by the above works [11,12,13], we model the UAV deployment as a foraging process of bacteria searching for protein to extend their lifetime. In the proposed wireless network, the UAV works as a base station (BS) to serve users and searches for energy replenishment to extend its working time. In this case, the trajectory design problem of UAVs with energy management can be solved by an algorithm with low time complexity. Furthermore, the UAV deployment can be faster in practice.

1.3. Contributions

The main contribution of this paper is to optimize the trajectory of the UAV while jointly considering maintenance cost, the reward of serving users, the energy management during deployment, and the user service time. In this regard, our key contributions are summarized as follows:
  • We propose an energy harvesting UAV network, in which the UAV can serve ground users while collecting energy from the charging stations (CSs). To serve the ground users and collect energy, the UAV must be examined and repaired before deployment. In consequence, it is necessary to jointly consider the maintenance cost, the number of users that are served by the UAV, and the energy consumption and harvesting.
  • To capture the relationship among the maintenance cost, the number of users that are served by the UAV, and the energy consumption and harvesting, we model the completion of users’ data requests and the harvested energy as the reward, and the energy consumption as the cost. The deployment profitability is defined as the ratio of the reward achieved during the deployment to the cost of energy consumption. Given the concept of the deployment profitability, the trajectory design problem is decoupled into a decision-making problem of maximizing the deployment profitability and a queuing problem of minimizing the average user service time.
  • To solve this problem, we develop a foraging-based algorithm [14]. Compared to trajectory design algorithms such as successive convex approximation [15] and Q-learning [16,17], the proposed foraging algorithm is proved to design the UAV trajectory with the optimal deployment profitability and to minimize the average service time of served users. The time complexity of the proposed algorithm is also significantly reduced, to polynomial order.
Simulation results show that, in terms of the deployment profitability, the proposed algorithm yields up to a 20.2% gain compared to the Q-learning algorithm. In terms of the average user service time, based on the optimized deployment profitability, the proposed algorithm achieves 17.3% and 8.7% reductions compared to the worst-case benchmark and the Q-learning algorithm, respectively. The proposed algorithm also reduces the operation time effectively. To the best of our knowledge, this is the first work that uses foraging theory to analyze the profitability of UAV deployment and design the trajectory.

1.4. Organization

The rest of this paper is organized as follows. The system model and problem formulation are described in Section 2. The foraging-based algorithm is introduced in Section 3. In Section 4, numerical results are presented and analyzed. Finally, conclusions are drawn in Section 5.

2. System Model and Problem Formulation

We consider a downlink wireless network that consists of a rotary-wing UAV and a set $\mathcal{U}$ of $U$ users. The users are equally clustered into a set $\mathcal{G}$ of $G$ groups, as shown in Figure 1. Among these user groups, $C$ groups are equipped with CSs. The CSs, located at the centers of user groups, are laser transmitters that deliver energy by laser power to the UAV, which is equipped with photovoltaic receivers. The UAV, deployed at an initial position, works as a BS that serves the users according to each user’s data request $D_i$ and harvests energy from the CSs to extend its working time. For ease of reading, the main notations of this paper are summarized in Table 1.
In each time slot $\tau$, the UAV serves one group of users. In particular, providing service to a group of users consists of four steps: (1) flying to the center of group $j$, (2) harvesting energy to charge the battery if a CS exists, (3) providing downlink transmission to complete all the data requests in the given group, and (4) returning to the initial deployed position. Next, we first introduce the transmission model and the energy consumption model of the UAV. Then, we define the deployment profitability of the UAV to evaluate the service trajectory and formulate the problem of maximizing the deployment profitability. On this basis, we further formulate the problem of minimizing the average service time of served users.

2.1. Transmission Model

The size of the data requested by user $i$, located at $(x_i, y_i)$, is $D_i$, $i \in \mathcal{U}$. After flying to the center of group $j$, whose coordinates are $(m_j, n_j)$, $j \in \mathcal{G}$, the UAV first charges its battery if group $j$ owns a CS. Then, the UAV BS provides service to all users in group $j$ simultaneously.
The probabilistic UAV channel model is used to model the transmission link between the UAV and user $i$, with probabilistic line-of-sight (LoS) and non-line-of-sight (NLoS) links considered as in [18]. The LoS and NLoS channel gains of the UAV transmitting data to user $i$ are given by [19]:
$$g_{ij}^{\mathrm{LoS}} = d_{ij}^{-\alpha}, \tag{1}$$
$$g_{ij}^{\mathrm{NLoS}} = \eta\, d_{ij}^{-\alpha}, \tag{2}$$
where $d_{ij} = \sqrt{(x_i - m_j)^2 + (y_i - n_j)^2 + H^2}$ is the distance between user $i$ and the UAV hovering position at group $j$, $H$ is the altitude of the UAV, $\alpha$ is the path loss exponent of the UAV transmission link, and $\eta$ is an additional attenuation factor caused by the NLoS connection. The probability of the LoS link is given by [20]:
$$\gamma_{ij}^{\mathrm{LoS}} = \frac{1}{1 + X \exp\left(-Y\left[\phi_{ij} - X\right]\right)}, \tag{3}$$
where $X$ and $Y$ are constants that depend on the environment (rural, urban, dense urban, and others), and $\phi_{ij} = \frac{180}{\pi} \sin^{-1}\left(\frac{H}{d_{ij}}\right)$ is the elevation angle in degrees. The average channel gain from the UAV to user $i$ is given by [19]:
$$\bar{g}_{ij} = \gamma_{ij}^{\mathrm{LoS}} \times g_{ij}^{\mathrm{LoS}} + \gamma_{ij}^{\mathrm{NLoS}} \times g_{ij}^{\mathrm{NLoS}}, \tag{4}$$
where $\gamma_{ij}^{\mathrm{NLoS}} = 1 - \gamma_{ij}^{\mathrm{LoS}}$. Based on the Shannon equation, the downlink rate of user $i$ in group $j$ is expressed as:
$$c_{ij} = \rho_{ij} B \log_2\left(1 + \frac{P_{ij}^{T}\, \bar{g}_{ij}}{\sigma^2}\right), \tag{5}$$
where $B$ is the total bandwidth of the UAV downlink transmission, $\rho_{ij}$ is the bandwidth allocation coefficient of user $i$ in group $j$, $P_{ij}^{T}$ is the transmission power of the UAV serving user $i$ in group $j$, and $\sigma^2$ is the power of the Gaussian noise. The transmission time of each user $i$ is given by $t_{ij}^{T} = D_i / c_{ij}$, and the transmission time of group $j$ is defined as the maximal transmission time of the users in this group, which is given by:
$$t_j^{T} = \max_{i \in \mathcal{U}_j} t_{ij}^{T}, \tag{6}$$
where $\mathcal{U}_j$ is the set of users in group $j$. With the transmission model of the UAV serving users, we can further define the energy consumption model.
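The transmission model above can be traced numerically with a short sketch. The Python below is only an illustration of Eqs. (1) to (6); the function names and argument order are assumptions made for this example and do not come from the paper, and no parameter values from Table 2 are implied.

```python
import math

def average_channel_gain(user_xy, group_xy, H, alpha, eta, X, Y):
    """Average LoS/NLoS channel gain for one user, following Eqs. (1)-(4).

    user_xy and group_xy are (x, y) tuples; all names here are illustrative.
    """
    dx = user_xy[0] - group_xy[0]
    dy = user_xy[1] - group_xy[1]
    d = math.sqrt(dx * dx + dy * dy + H * H)            # distance d_ij
    phi = math.degrees(math.asin(H / d))                # elevation angle in degrees
    p_los = 1.0 / (1.0 + X * math.exp(-Y * (phi - X)))  # Eq. (3)
    g_los = d ** (-alpha)                               # Eq. (1)
    g_nlos = eta * g_los                                # Eq. (2)
    return p_los * g_los + (1.0 - p_los) * g_nlos       # Eq. (4)

def group_transmission_time(requests, rhos, powers, gains, B, sigma2):
    """Group transmission time of Eq. (6): the slowest user in the group."""
    times = []
    for D, rho, P, g in zip(requests, rhos, powers, gains):
        rate = rho * B * math.log2(1.0 + P * g / sigma2)  # Eq. (5)
        times.append(D / rate)                            # t_ij^T = D_i / c_ij
    return max(times)
```

Because all users of a group are served simultaneously, the group is finished only when its slowest user is, which is why the second function returns the maximum rather than the sum.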

2.2. Energy Consumption Model

In this model, the UAV can harvest energy from CSs. The UAV charges its battery to extend its working time via uplink wireless power transfer (WPT) [21]. The CS in group $j$ transmits energy with power $P_j^{E}$, where $P_j^{E} = 0$ implies that group $j$ does not have a CS. Since the CSs are located at the centers of user groups and the UAV also hovers over the centers of user groups, we assume that the WPT channel is LoS-dominated, so that the free-space path loss model is adopted. The path loss of the power transferred from the CS to the UAV is expressed as $h = \frac{\beta_0}{H^2}$, where $\beta_0$ denotes the power path loss at a reference distance. Hence, the power received by the UAV from the CS in group $j$ is given by:
$$P_j^{C} = h P_j^{E}. \tag{7}$$
The charging time in group $j$ is defined as $t_j^{C}$. The UAV cannot harvest energy while serving a user group $j$ without a CS, so the charging time in that case is $t_j^{C} = 0$. In consequence, the energy that is harvested by the UAV from a CS in group $j$ is given by:
$$E_j^{C} = P_j^{C} t_j^{C}. \tag{8}$$
The CSs are the only sources from which the UAV can charge its battery, so the total harvested energy of the UAV serving users in group $j$ is given by $E_j^{+} = E_j^{C}$.
The energy consumption of the UAV consists of two components: (1) energy consumption of UAV-user communication and (2) energy consumption of UAV movement. The energy consumption of UAV-user communication refers to the energy that the UAV uses to complete users’ data requests. The energy consumption of UAV movement consists of the propulsion energy that the UAV takes round-trip between the initial position and group centers, and the hovering energy supporting the UAV to provide service. Next, we formulate the model of the energy consumption of the UAV.
(1) Communication Energy: The transmission power and time of the UAV serving user $i$ in group $j$ are defined in the previous subsection as $P_{ij}^{T}$ and $t_{ij}^{T}$, respectively. Thus, the energy consumption of the UAV transmitting data to the users in group $j$ is given by:
$$E_j^{T} = \sum_{i \in \mathcal{U}_j} P_{ij}^{T} t_{ij}^{T}. \tag{9}$$
(2) Movement Energy: To serve a group of users, the UAV needs to fly to the center of group $j$, hover while charging and serving, and eventually return to the initial position. We assume that the horizontal velocity of the UAV is a constant $v$ during the movement. The one-way propulsion energy consumption from the initial position to the center of group $j$ is given by [5]:
$$E_j^{M} = \frac{d_j}{v}\left(c_1 v^3 + \frac{c_2}{v}\right), \tag{10}$$
where $d_j$ represents the distance from the initial position to group $j$, and $c_1$ and $c_2$ are the propulsion parameters related to the weight, wing area, and air density of the UAV. Similarly, the energy consumption of the UAV moving from the center of group $j$ back to the initial position is $E_j^{M}$.
The energy consumption of the UAV hovering at group $j$ is given by:
$$E_j^{H} = P^{H}\left(t_j^{C} + t_j^{T}\right), \tag{11}$$
where $P^{H}$ is the hovering power, which depends on the UAV weight, air density, and rotor disc area [22]. For the groups without CSs, the energy consumption of UAV hovering is $E_j^{H} = P^{H} t_j^{T}$, since the UAV does not spend time harvesting energy.
In consequence, the total energy that the UAV consumes for serving a group of users is given by:
$$E_j^{-} = E_j^{T} + 2E_j^{M} + E_j^{H}. \tag{12}$$
From (12), we can see that the transmission energy consumption, the propulsion energy consumption of the round-trip, and the hovering energy consumption are jointly considered.
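As a quick way to check the bookkeeping of the energy model, the harvested energy of Eqs. (7) and (8) and the consumed energy of Eqs. (9) to (12) can be sketched as two small functions. This is a minimal illustration under the stated model; the function and parameter names are hypothetical and chosen only for this example.

```python
def harvested_energy(P_E, t_C, beta0, H):
    """Energy harvested at one group: E_j^C = h * P_j^E * t_j^C, with h = beta0 / H^2."""
    h = beta0 / (H ** 2)          # free-space path loss of the WPT link
    return h * P_E * t_C          # Eqs. (7) and (8); zero when P_E == 0 or t_C == 0

def consumed_energy(powers, tx_times, d_j, v, c1, c2, P_H, t_C, t_T):
    """Total energy spent serving one group: transmission + round trip + hovering."""
    E_T = sum(P * t for P, t in zip(powers, tx_times))  # Eq. (9)
    E_M = (d_j / v) * (c1 * v ** 3 + c2 / v)            # Eq. (10), one way
    E_H = P_H * (t_C + t_T)                             # Eq. (11)
    return E_T + 2.0 * E_M + E_H                        # Eq. (12)
```

For a group without a CS, passing `t_C = 0` reproduces the special case in which the hovering energy reduces to the hovering power multiplied by the group transmission time.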

2.3. Problem Formulation

Next, we first introduce the notion of the deployment profitability and formulate the problem of maximizing the deployment profitability. Then, we formulate the problem of minimizing the average service time of served users on the basis of maximal deployment profitability.
For the service provider, deploying a UAV to serve the users in a certain area has a maintenance cost for examining and repairing the UAV, which is denoted by $Q$. The deployment profitability is used to capture the relationship among the maintenance cost, the number of users that are served, and the energy consumption and harvesting, and it is given by:
$$f(\boldsymbol{o}) = \frac{-Q + \sum_{j \in \mathcal{G}} o_j \left(q^{S} U_j + q^{C} E_j^{+}\right)}{\sum_{j \in \mathcal{G}} o_j\, q^{C} E_j^{-}}, \tag{13}$$
where $\boldsymbol{o} = \left[o_1, \ldots, o_j, \ldots, o_G\right]$ denotes the potential served groups, $U_j = |\mathcal{U}_j|$ is the number of users in group $j$ that are served by the UAV, and $|\cdot|$ is the operator that counts the elements in a set. In particular, $o_j = 1$ implies that the users of group $j$ will be served by the UAV. Otherwise, we have $o_j = 0$. $q^{S}$ is the income that the UAV gains by completing one user request. $q^{C}$ is the energy price per Joule. $q^{S} U_j$ represents the reward that the UAV earns by serving the users in group $j$. $q^{C} E_j^{+}$ is the reward that the UAV achieves by harvesting energy from the CS of group $j$. $q^{C} E_j^{-}$ is the energy cost of serving group $j$.
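For a given selection vector, the deployment profitability is a single ratio and can be evaluated directly. The sketch below is illustrative only; the function name and the list-based inputs are assumptions for this example, and the maintenance cost enters the numerator with a negative sign because it is a cost.

```python
def deployment_profitability(o, users, E_plus, E_minus, Q, q_S, q_C):
    """Evaluate the deployment profitability of Eq. (13) for a 0/1 selection list o.

    users[j]   : number of users U_j in group j
    E_plus[j]  : harvested energy at group j
    E_minus[j] : consumed energy for serving group j
    """
    reward = -Q + sum(
        oj * (q_S * U + q_C * Ep) for oj, U, Ep in zip(o, users, E_plus)
    )
    cost = sum(oj * q_C * Em for oj, Em in zip(o, E_minus))
    if cost == 0:
        # No group selected: the ratio is undefined in this sketch.
        raise ValueError("at least one group must be selected")
    return reward / cost
```

Evaluating this function for every binary vector is the exhaustive baseline that the foraging-based algorithm is designed to avoid.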
Having introduced the notion of deployment profitability in (13), the maximization problem can be formulated as:
$$\max_{\boldsymbol{o},\, P_{ij}^{T},\, \rho_{ij}} f(\boldsymbol{o}), \tag{14}$$
$$\mathrm{s.t.} \quad o_j \in \{0, 1\}, \tag{15}$$
$$\sum_{i \in \mathcal{U}_j} \rho_{ij} \le 1, \quad o_j = 1, \tag{16}$$
$$\sum_{i \in \mathcal{U}_j} P_{ij}^{T} \le P^{T}, \quad o_j = 1, \tag{17}$$
where constraint (15) means that $o_j$ is the indicator of a potential served group, and constraints (16) and (17) indicate that, for potential served groups, the sums of the allocated bandwidth and transmission power cannot exceed the total bandwidth $B$ and the UAV transmission power $P^{T}$, respectively. Problem (14) aims to select the potential served groups so that the UAV can achieve the maximal deployment profitability in any trajectory.
With the optimal group selection o * , we further define the total service delay [17] of each user so as to design the optimal trajectory for minimizing the average service time of served users.
For user $i$ in a group $j$ that is selected by (14), the total service delay includes not only the transmission delay but also the time spent waiting for the UAV to complete the former services. We define the UAV trajectory as $\boldsymbol{e} = \left[e_1, \ldots, e_\tau, \ldots, e_T\right]$, where $e_\tau = j$ indicates that the UAV flies to group $j$ at time slot $\tau$, $T = \|\boldsymbol{o}^*\|_0$ is the number of time slots in which the UAV completes the deployment with maximal profitability, and $\|\cdot\|_0$ is the 0-norm operator that counts the non-zero elements in a vector. Since the UAV serves one user group per time slot, the number of potential served groups is also represented by $T$. Given the UAV trajectory $\boldsymbol{e}$, we use $t_\tau^{F}(\boldsymbol{e})$ to indicate the total time at which the UAV returns to the initial position after serving group $e_\tau$. The total time of each time slot $\tau$ can be derived from the total time of the previous time slot $\tau - 1$, which is given by:
$$t_\tau^{F}(\boldsymbol{e}) = t_{\tau-1}^{F}(\boldsymbol{e}) + \frac{2 d_{e_\tau}}{v} + t_{e_\tau}^{C} + t_{e_\tau}^{T}, \tag{18}$$
where $t_0^{F}(\boldsymbol{e}) = 0$ indicates that the UAV has just been deployed in the serving area. Thus, for user $i$ in group $j$ served in time slot $\tau$, the total service time can be expressed by:
$$t_{ij}^{W}(\boldsymbol{e}) = t_\tau^{F}(\boldsymbol{e}) + \frac{d_j}{v} + t_j^{C} + t_{ij}^{T}, \tag{19}$$
which includes the waiting time during which the UAV completes the former services, the flight time of the UAV moving from the initial position to the center of group $j$, the transmission time of completing user $i$’s request, and the charging time of the UAV if a CS is in group $j$. On this basis, we can further formulate the problem of minimizing the average user service time as:
$$\min_{\boldsymbol{e}} \frac{\sum_{j \in \boldsymbol{o}^*} \sum_{i \in \mathcal{U}_j} t_{ij}^{W}(\boldsymbol{e})}{\sum_{j \in \boldsymbol{o}^*} U_j}, \tag{20}$$
$$\mathrm{s.t.} \quad j \in \boldsymbol{e}, \quad \forall\, o_j^* = 1, \tag{21}$$
where constraint (21) ensures that all the potential user groups are served by the UAV trajectory $\boldsymbol{e}$. Thus, by solving (20), we can design the optimal trajectory with minimized average user service time based on the optimal served groups with the maximized deployment profitability.
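The recursion in (18) and the objective in (20) are straightforward to evaluate for a candidate trajectory. The following sketch implements the formulas exactly as written above, including the use of the slot-$\tau$ total time in (19); the function names and the dictionary-free list layout are assumptions made for this illustration.

```python
def completion_times(trajectory, d, t_C, t_T, v):
    """Cumulative return times t_tau^F of Eq. (18), one entry per time slot.

    trajectory is a list of group indices; d, t_C, t_T are indexed by group.
    """
    times, elapsed = [], 0.0
    for j in trajectory:
        elapsed += 2.0 * d[j] / v + t_C[j] + t_T[j]  # round trip + charging + transmission
        times.append(elapsed)
    return times

def average_user_service_time(trajectory, d, t_C, t_T, t_user, v):
    """Objective (20), with each user's service time taken from Eq. (19) as written.

    t_user[j] lists the transmission times t_ij^T of the users in group j.
    """
    t_F = completion_times(trajectory, d, t_C, t_T, v)
    total, count = 0.0, 0
    for tau, j in enumerate(trajectory):
        for t_ij in t_user[j]:
            total += t_F[tau] + d[j] / v + t_C[j] + t_ij
            count += 1
    return total / count
```

Only the cumulative term depends on the visiting order, so comparing two trajectories over the same groups comes down to comparing their lists of completion times.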
Finding the optimal served groups in (14) by exhaustive search requires evaluating all possible combinations of the group selection $\boldsymbol{o}$. Since conventional optimization methods may not be practical for a future wireless network that consists of a large number of wireless devices, it is necessary to introduce a low-complexity algorithm to find the optimal group selection and design the UAV trajectory.

3. Foraging-Based Trajectory Design Algorithm

To solve the deployment profitability maximization problem in (14) and the average user service time minimization problem in (20), we propose an algorithm based on foraging theory [14]. Compared to existing algorithms for UAV trajectory design [17] such as Q-learning and double Q-learning, whose operation time depends on the number of actions and states of each agent, the foraging-based algorithm attains the maximum of the deployment profitability with polynomial time complexity. With the maximal deployment profitability, we can further design the UAV trajectory.
The proposed foraging-based algorithm can be divided into four parts: (a) calculating the reward of serving users, the energy consumption and harvesting for each group j; (b) ranking the ratio of reward to cost of serving each group j; (c) choosing potential served groups to achieve the maximal deployment profitability; and (d) designing the trajectory for minimizing the average user service time.
Next, we first introduce the components of the proposed foraging algorithm. Then, we explain how to use the proposed foraging-based algorithm to find the optimal trajectory for the UAV so as to maximize the deployment profitability and minimize the average service time of served users. Then, we analyze the time complexity of the foraging-based algorithm.

3.1. Components of Foraging-Based Algorithm

In (14), maximizing the deployment profitability can be treated as a decision-making problem. In particular, to solve this problem, we must select the user groups that the UAV will serve, which can be determined by foraging theory. The components of foraging theory map onto this problem as follows [14]:
(1) Forager: Given the defined system model, the UAV takes actions of selecting the potential served groups and designing the trajectory. During the serving process, the UAV can be regarded as the forager.
(2) Advantage-to-disadvantage function: The optimal behavior of a forager is to maximize the generic advantage-to-disadvantage (A2D) function. In the proposed problem, the behavior of the forager UAV is to maximize the deployment profitability, which depends on the potential served groups. To solve the maximization problem (14), we need to rewrite the deployment profitability (13) in the form of an A2D function. The A2D function of serving group $j$ is given by:
$$f(\boldsymbol{o}) = \frac{-Q + \sum_{j \in \mathcal{G}} o_j M_j}{\sum_{j \in \mathcal{G}} o_j N_j}, \tag{22}$$
where $M_j = q^{S} U_j + q^{C} E_j^{+}$ represents the reward after serving group $j$, and $N_j = q^{C} E_j^{-}$ represents the cost of the energy consumed in serving the users in group $j$. Equation (22) can also be written as:
$$f(\boldsymbol{o}) = \frac{o_j M_j + \hat{M}_j}{o_j N_j + \hat{N}_j}, \tag{23}$$
where $\hat{M}_j = \sum_{j' \in \mathcal{G},\, j' \ne j} o_{j'} M_{j'} - Q$ represents the reward that the UAV obtains by serving all the groups except group $j$, and $\hat{N}_j = \sum_{j' \in \mathcal{G},\, j' \ne j} o_{j'} N_{j'}$ indicates the cost of the energy consumed in serving all the groups except group $j$.
(3) Profitability of objects: The forager makes decisions based on the profitability of its objects. In the maximization problem (14), the objects that the forager UAV aims at are the user groups. For group $j$, the reward that the UAV can gain stems from providing service and harvesting energy. The cost is the energy consumption of providing service to the users in each group. Thus, the profitability of group $j$ can be defined as $p_j = M_j / N_j$. Similarly, $\hat{p}_j = \hat{M}_j / \hat{N}_j$ can be regarded as the alternative profitability of group $j$, which is the deployment profitability resulting from serving all the groups except group $j$. In the following subsection, we introduce the use of the profitability and the alternative profitability in the foraging-based algorithm to find the potential served groups and achieve the maximal deployment profitability.

3.2. Implementation of Foraging-Based Algorithm

In the studied model, we formulate the problem as: (1) maximizing the deployment profitability, and (2) minimizing the average service time of served users. From (14), we can see that the optimization variables are the group selection indicator $\boldsymbol{o}$, the power allocation $P_{ij}^{T}$, and the bandwidth allocation coefficient $\rho_{ij}$. In particular, the served group selection depends on the result of the power and bandwidth allocation. By optimizing the resource allocation of each group, the forager UAV can obtain the group information, including the service time and energy consumption of each group. We can therefore decouple the maximization problem into two parts: (a) resource allocation for each user group, and (b) group selection for achieving the maximal deployment profitability. The optimization of resource allocation is widely studied in the existing literature using convex optimization [23] or reinforcement learning [24] methods. In this work, we mainly focus on the foraging-based group selection. Thus, the maximization problem (14) can be simplified as:
$$\max_{\boldsymbol{o}} f(\boldsymbol{o}), \tag{24}$$
$$\mathrm{s.t.} \quad o_j \in \{0, 1\}, \tag{25}$$
where the resource allocation of power and bandwidth has already been solved by existing methods.
With the optimal resource allocation, the deployment profitability can be treated as a linear-fractional function of the vector $\boldsymbol{o}$. To obtain the maximum of the deployment profitability, we need to differentiate $f(\boldsymbol{o})$ with respect to $o_j$. Thus, we temporarily relax the served indicator so that $o_j$ can take any real value in $[0, 1]$, which makes the function $f(\boldsymbol{o})$ continuous in its domain. Later, we will show that the optimal solution of the served indicator $o_j$ must be either 1 or 0 even though the feasible domain of $o_j$ is relaxed. The partial derivative of $f(\boldsymbol{o})$ with respect to $o_j$ is given as follows:
$$\frac{\partial f}{\partial o_j} = \frac{M_j \hat{N}_j - N_j \hat{M}_j}{\left(o_j N_j + \hat{N}_j\right)^2}. \tag{26}$$
From (26), it is noted that if $M_j \hat{N}_j - N_j \hat{M}_j$ is negative, then $f(\boldsymbol{o})$ is maximized by choosing the smallest $o_j$. Alternatively, if $M_j \hat{N}_j - N_j \hat{M}_j$ is positive, then $f(\boldsymbol{o})$ is maximized by choosing the largest $o_j$. Thus, we can obtain a policy for the UAV selecting the user groups:
$$o_j = \begin{cases} 0, & \text{if } p_j \le \hat{p}_j, \\ 1, & \text{if } p_j > \hat{p}_j. \end{cases} \tag{27}$$
From (27), we can see that the UAV will select group $j$ to serve once the profitability of group $j$ is larger than the alternative profitability of group $j$. $p_j$ can be easily obtained from the reward and cost of group $j$ itself. However, calculating $\hat{p}_j$ requires the reward and cost of all other groups, which involves a large amount of computation.
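The step from the derivative to the selection rule can be spelled out with the quotient rule. The derivation below is a sketch that assumes $N_j > 0$ and $\hat{N}_j > 0$ (every group costs some energy and at least one other group is selected); these conditions are not stated explicitly in the text.

```latex
\begin{aligned}
\frac{\partial f}{\partial o_j}
  &= \frac{M_j\left(o_j N_j + \hat{N}_j\right) - N_j\left(o_j M_j + \hat{M}_j\right)}
          {\left(o_j N_j + \hat{N}_j\right)^2}
   = \frac{M_j \hat{N}_j - N_j \hat{M}_j}{\left(o_j N_j + \hat{N}_j\right)^2}, \\[4pt]
M_j \hat{N}_j - N_j \hat{M}_j > 0
  &\iff \frac{M_j}{N_j} > \frac{\hat{M}_j}{\hat{N}_j}
   \iff p_j > \hat{p}_j .
\end{aligned}
```

Since the numerator does not depend on $o_j$, the sign of the derivative is constant over $[0, 1]$, so $f$ is monotone in $o_j$ and its maximum is attained at an endpoint. This is why the relaxed problem still returns a binary indicator.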
In consequence, we introduce a sorting algorithm to solve the simplified problem (24) by calculating only the profitability $p_j$ of each group. The groups are ranked in descending order of profitability as $p_{s_1} > p_{s_2} > \cdots > p_{s_G}$, where $s_k = j$ indicates that the profitability of group $j$ is the $k$-th largest among all the groups. The group selection $\boldsymbol{o}$ is initialized to $\boldsymbol{0}$. Starting from the most profitable group, the UAV decides to serve the first $k$ groups to increase the deployment profitability, until the termination condition is satisfied. To find the termination condition of the group selection, we present the following result.
Theorem 1.
The deployment profitability $f(\boldsymbol{o}^*)$ achieves its maximum when the termination condition is satisfied, which is given by:
$$f(\boldsymbol{o}^*) = \frac{-Q + \sum_{j \in \mathcal{G}} o_j^* M_j}{\sum_{j \in \mathcal{G}} o_j^* N_j} > p_{s_{k+1}}. \tag{28}$$
Proof. 
Please refer to Appendix A.    □
From Theorem 1, we can see that the UAV keeps selecting the potential served group with the largest group profitability until the deployment profitability of the current served group selection $\boldsymbol{o}^*$ is larger than the group profitability of the $(k+1)$-th most profitable group. In other words, the current group selection $\boldsymbol{o}^*$ makes the deployment profitability $f(\boldsymbol{o}^*)$ greater than the profitability of any unselected group. Based on the above procedure, the foraging-based group selection algorithm for maximizing the deployment profitability performed by the UAV is summarized in Algorithm 1.
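The ranking-and-stop procedure described above can be sketched in a few lines of Python. This is an illustrative reading of the group selection step, not the authors' implementation: it assumes that the rewards `M`, the energy costs `N` (all positive), and the maintenance cost `Q` are already available from the resource allocation step, and `brute_force_best` is a hypothetical helper added only to cross-check the greedy result on small instances.

```python
from itertools import product

def foraging_group_selection(M, N, Q):
    """Greedy group selection: rank by p_j = M_j / N_j, stop by the rule of Theorem 1.

    Returns (o, f): the 0/1 selection list and its deployment profitability.
    """
    G = len(M)
    # Rank groups by profitability, most profitable first.
    order = sorted(range(G), key=lambda j: M[j] / N[j], reverse=True)
    o = [0] * G
    reward, cost = -Q, 0.0
    f = float("-inf")
    for k, j in enumerate(order):
        o[j] = 1
        reward += M[j]
        cost += N[j]
        f = reward / cost
        # Terminate once f exceeds the profitability of the next-ranked group.
        if k + 1 < G and f > M[order[k + 1]] / N[order[k + 1]]:
            break
    return o, f

def brute_force_best(M, N, Q):
    """Exhaustive maximum of f over all non-empty selections (for checking only)."""
    best = float("-inf")
    for o in product([0, 1], repeat=len(M)):
        cost = sum(oj * n for oj, n in zip(o, N))
        if cost > 0:
            reward = sum(oj * m for oj, m in zip(o, M)) - Q
            best = max(best, reward / cost)
    return best

if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the paper.
    M, N, Q = [12.0, 9.0, 4.0, 20.0], [3.0, 3.0, 4.0, 10.0], 6.0
    print(foraging_group_selection(M, N, Q), brute_force_best(M, N, Q))
```

The greedy loop touches each group at most once after sorting, whereas the exhaustive helper enumerates every binary vector, which is the gap in cost that motivates the foraging-based rule.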
With the optimal group selection $\boldsymbol{o}^*$, we can design the UAV trajectory so as to minimize the average service time of served users. From (19), we can see that the service time of each served user consists of a variable part and a fixed part, which can be expressed by:
$$t_{ij}^{W}(\boldsymbol{e}) = \underbrace{t_\tau^{F}(\boldsymbol{e})}_{\text{variable}} + \underbrace{\frac{d_j}{v} + t_j^{C} + t_{ij}^{T}}_{\text{fixed}}. \tag{29}$$
Algorithm 1 Foraging-based group selection
1: Input: User positions $(x_i, y_i)$, user requests $D_i$, group locations $(m_j, n_j)$, and charging power $P_j^{C}$
2: Init: UAV position, group selection $\boldsymbol{o} = \boldsymbol{0}$
3: Optimize the power allocation $P_{ij}^{T}$ and bandwidth allocation coefficient $\rho_{ij}$ for each group and each user
4: Calculate the profitability $p_j = M_j / N_j$ for each group
5: Rank the profitability from large to small
6: repeat
7:    Select the next potential served group $j$ from the remaining set with the largest profitability
8:    $o_j \leftarrow 1$
9:    Delete group $j$ from the group set $\mathcal{G}$
10: until (28) is satisfied
11: $\boldsymbol{o}^* \leftarrow \boldsymbol{o}$
12: Output: Optimal group selection $\boldsymbol{o}^*$ with maximal deployment profitability
For any trajectory $\boldsymbol{e}$, the fixed part of the user service time cannot be optimized. In this case, the minimization problem (20) can be simplified to minimizing the average total time over all the time slots $\tau$, which is given by:
$$\min_{\boldsymbol{e}} \frac{1}{T} \sum_{\tau \in \boldsymbol{e}} t_\tau^{F}(\boldsymbol{e}), \tag{30}$$
$$\mathrm{s.t.} \quad j \in \boldsymbol{e}, \quad \forall\, o_j^* = 1, \tag{31}$$
where $U_j$ cancels out since the users are equally clustered into groups. The simplified minimization problem (30) can be treated as a queuing problem over the potential served groups, in which completing the service of group $j$ takes a time of $\hat{t}_j = \frac{2 d_j}{v} + t_j^{C} + t_j^{T}$. Because the service time of an earlier group is added to the total time of every later slot, the group with the shortest service time $\hat{t}_{\min}$ should be served first so as to minimize the average total service time. In this case, we rank the service times of all the potential served groups selected by $\boldsymbol{o}^*$ in ascending order as $\hat{t}_{z_1} < \hat{t}_{z_2} < \cdots < \hat{t}_{z_T}$, where $z_k = j$ indicates that the service time of group $j$ is the $k$-th shortest among all the potential served groups. The UAV trajectory can be written as:
$$\boldsymbol{e} = \left[z_1, z_2, \ldots, z_T\right], \quad o_{z_k}^* = 1. \tag{32}$$
Based on the above procedure, the trajectory design algorithm for minimizing the average service time of all served users is summarized in Algorithm 2.
Algorithm 2 Trajectory design for minimizing the average service time
1: Input: Optimal served group selection $\boldsymbol{o}^{*}$
2: Init: UAV trajectory $\boldsymbol{e}$, time slot $\tau = 0$
3: Calculate the service time $\hat{t}_j$ of each potential served group $j$
4: Rank the service time from small to large
5: repeat
6:    $\tau \leftarrow \tau + 1$
7:    Select the next served group $j$ with the $\tau$-th shortest service time
8:    $\boldsymbol{e} \leftarrow (\boldsymbol{e}, j)$
9: until $\tau = T$
10: $\boldsymbol{e}^{*} \leftarrow \boldsymbol{e}$
11: Output: Optimal UAV trajectory $\boldsymbol{e}^{*}$ with minimal average user service time
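A minimal Python sketch of Algorithm 2 is given below for illustration; it is not the authors' MATLAB code. The function names, the dictionary layout and the toy distances and times are hypothetical, and the average total time is computed under the assumption that the total time after each served group is the running sum of the service times along the trajectory.

```python
from itertools import accumulate
from typing import Dict, List


def service_time(d_j: float, v: float, t_c: float, t_t: float) -> float:
    """Service time of one group: 2 * d_j / v + t_j^C + t_j^T."""
    return 2.0 * d_j / v + t_c + t_t


def design_trajectory(t_hat: Dict[int, float]) -> List[int]:
    """Order the selected groups by ascending service time."""
    return sorted(t_hat, key=lambda j: t_hat[j])


def average_total_time(order: List[int], t_hat: Dict[int, float]) -> float:
    """Mean of the cumulative finishing times along the trajectory (assumed model)."""
    finish_times = list(accumulate(t_hat[j] for j in order))
    return sum(finish_times) / len(finish_times)


# Toy data: group id -> service time, with v = 30 m/s (hypothetical values).
t_hat = {
    3: service_time(60.0, 30.0, 5.0, 4.0),   # 13.0 s
    7: service_time(150.0, 30.0, 0.0, 6.0),  # 16.0 s
    1: service_time(30.0, 30.0, 2.0, 3.0),   # 7.0 s
}
trajectory = design_trajectory(t_hat)
mean_time = average_total_time(trajectory, t_hat)
```

For these toy values the trajectory visits groups 1, 3 and 7 in that order, the finishing times are 7, 20 and 36 s, and their mean is 21 s.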

3.3. Complexity of Foraging-Based Algorithm

In the studied problem, the resource allocation subproblem is first solved by existing methods. The proposed algorithm focuses on the group selection and trajectory design parts, based on an optimized resource allocation policy. In these two parts, the intermediate variables are obtained by simple algebraic calculations, and traversing all groups once has a linear time complexity of $O(G)$. The complexity of ranking the profitability depends on the chosen sorting algorithm. In the proposed algorithm, we use QuickSort to rank both the profitability and the service time. The average time complexity of QuickSort is $O(n \log_2 n)$ [25], where $n$ is the number of elements to be sorted. In particular, the proposed algorithm first sorts the profitability of all $G$ groups to select the potential served groups. After that, it ranks the service time of the $T$ potential served groups to design the UAV trajectory. Overall, the total time complexity of the proposed algorithm is $O(G \log_2 G + T \log_2 T)$, which depends on the total number of groups and the number of potential served groups. Compared to the machine learning algorithms [26] used in wireless communications, whose operation time depends on the network scale and the learning parameters, the proposed foraging-based algorithm has a stable and lower theoretical complexity.
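To give a feeling for the size of the term $G \log_2 G + T \log_2 T$, the short Python sketch below evaluates it for given $G$ and $T$. It only evaluates the expression inside the big-O notation; constant factors and lower-order terms are ignored, and the function name and the example values are hypothetical.

```python
import math


def sorting_cost_estimate(G: int, T: int) -> float:
    """Evaluate G*log2(G) + T*log2(T), the dominant term of the two sorts."""

    def n_log_n(n: int) -> float:
        # Sorting zero or one element needs no comparisons.
        return n * math.log2(n) if n > 1 else 0.0

    return n_log_n(G) + n_log_n(T)


# Example: G = 20 groups in total, T = 8 selected groups (T is a made-up value).
estimate = sorting_cost_estimate(20, 8)
```

For $G = 20$ and $T = 8$ the estimate is about 110 elementary steps, which is consistent with sorting dominating a very short running time.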

4. Simulation Results

In our simulations, we consider a circular wireless network with a radius $R = 200$ m, in which a rotary-wing UAV is deployed to serve users and harvest energy. The UAV keeps an altitude $H = 100$ m and a horizontal speed $v = 30$ m/s during movement. The initial position of the UAV BS is set to the origin $(0, 0)$. $G$ groups with a radius $R_G = 20$ m are uniformly distributed in the network, and each group $j$ contains $U_j = 3$ users. Half of the groups are equipped with CSs, i.e., $C$ is equal to the integer part of $G/2$. To implement and verify the proposed foraging-based algorithm, we use MATLAB for the simulations. Unless stated otherwise, the parameters used in the simulations are listed in Table 2. The functionality of the proposed algorithm can be divided into two parts: (1) group selection, and (2) trajectory design. For the group selection part, the optimization of resource allocation is pre-processed by the methods in [23] and will not be discussed in the following results. We compare the deployment profitability with that of the Q-learning algorithm in [17]. For the trajectory design part, we first generate the optimal group selection policy with the maximal deployment profitability by the foraging-based algorithm. Then, while completing the service of the selected groups, we compare the average total service time with the worst-case scenario and the Q-learning algorithm. In our simulations, the worst-case scenario refers to the UAV trajectory that leads to the longest average total service time. All the statistical results are averaged over 500 independent runs.
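The network layout described above can be sketched in Python as follows. This is an illustration only, not the authors' MATLAB setup. The assumptions are labeled in the comments: group centers are drawn uniformly over the disc of radius $R$, users are drawn uniformly inside their group's disc, overlapping groups are not prevented, and the first $\lfloor G/2 \rfloor$ groups are the ones equipped with CSs. The function name and the dictionary keys are hypothetical.

```python
import math
import random
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def generate_layout(G: int, R: float = 200.0, R_G: float = 20.0,
                    users_per_group: int = 3, seed: int = 0) -> List[Dict]:
    """Draw G groups in a disc of radius R, each with users inside radius R_G."""
    rng = random.Random(seed)

    def point_in_disc(cx: float, cy: float, radius: float) -> Point:
        # The square root makes the samples uniform over the disc area.
        r = radius * math.sqrt(rng.random())
        theta = 2.0 * math.pi * rng.random()
        return (cx + r * math.cos(theta), cy + r * math.sin(theta))

    num_cs = G // 2  # integer part of G / 2
    groups = []
    for j in range(G):
        center = point_in_disc(0.0, 0.0, R)  # assumption: centers uniform in the network
        users = [point_in_disc(center[0], center[1], R_G)
                 for _ in range(users_per_group)]
        # Assumption: the first num_cs groups carry a charging station.
        groups.append({"center": center, "users": users, "has_cs": j < num_cs})
    return groups


layout = generate_layout(G=10)
```

Fixing the seed makes a single layout reproducible; averaging over independent runs, as in the paper, would correspond to calling the function with different seeds.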
Figure 2 shows how the deployment profitability changes as the number of groups varies. From Figure 2, we can see that, for both considered algorithms, the deployment profitability increases with the number of groups. This is because, as the number of user groups increases, both the number of user groups that can be served by the UAV and the energy that can be harvested by the UAV increase. In Figure 2, when the number of groups is $G = 20$, the foraging-based algorithm achieves a deployment profitability of 29.42, while the Q-learning algorithm achieves a deployment profitability of 24.52. The proposed foraging-based algorithm thus yields up to a 20.0% gain in deployment profitability compared to the Q-learning algorithm. This gain stems from the fact that the proposed algorithm finds the optimal potential served groups that maximize the deployment profitability, while the Q-learning algorithm may converge to a sub-optimal group selection with a lower deployment profitability.
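The reported gain can be reproduced from the two profitability values quoted for $G = 20$, taking the Q-learning value as the baseline:

```latex
\frac{29.42 - 24.52}{24.52} = \frac{4.90}{24.52} \approx 0.1998 \approx 20.0\%.
```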
Figure 3 shows how the deployment profitability changes as the number of users per group varies. In Figure 3, we can see that both considered algorithms achieve a lower deployment profitability as the number of users per group increases. This is because, for each group, more users also means more requested data. The UAV then needs to hover longer to complete the user requests and consumes more hovering energy. The increased energy consumption reduces the deployment profitability.
In Table 3, we show how the operation time of the algorithms changes as the number of groups varies. The operation time records how fast an algorithm can select the potential served users. In our simulations, the Q-learning algorithm converges after around 2000 iterations. From Table 3, we can see that the operation time of the Q-learning algorithm is more than ten thousand times larger than that of the foraging-based algorithm. This is because the foraging-based algorithm solves the proposed maximization problem with a time complexity of $O(G \log_2 G + T \log_2 T)$. Compared to Q-learning, whose operation time depends on the number of iterations, the foraging-based algorithm effectively shortens the time that the UAV spends on selecting the potential served groups. Table 3 also shows that the operation time of both algorithms increases with the number of groups. For the proposed foraging-based algorithm, having more groups increases the number of elements to sort, and sorting accounts for most of its operation time. For the Q-learning algorithm, having more groups increases the number of available actions, which lengthens each iteration.
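As a rough check of the "more than ten thousand times" statement, the ratios below are computed from the first three rows of Table 3. They rest on the assumption that the foraging-based entries carry negative exponents, i.e., $5.8 \times 10^{-5}$ s, $6.9 \times 10^{-5}$ s and $1.12 \times 10^{-4}$ s for 5, 10 and 15 groups.

```latex
\frac{0.725}{5.8 \times 10^{-5}} = 1.25 \times 10^{4}, \qquad
\frac{1.263}{6.9 \times 10^{-5}} \approx 1.83 \times 10^{4}, \qquad
\frac{1.779}{1.12 \times 10^{-4}} \approx 1.59 \times 10^{4}.
```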
Figure 4 shows how the average user service time changes as the number of groups varies. As the number of groups increases, the average user service time also increases. This is because a larger number of groups gives the UAV more options when selecting potential served groups so as to increase the deployment profitability. Serving more groups means that the users in the later served groups have to wait longer before the UAV arrives and provides service. From this figure, we can also see that the proposed foraging-based algorithm achieves a lower average user service time than the Q-learning algorithm. This is because the proposed algorithm designs the UAV trajectory with a greedy policy that solves the queuing problem of minimizing the queuing time, whereas the optimization process of the Q-learning algorithm may get stuck in a sub-optimal trajectory. In Figure 4, when the number of groups is $G = 20$, the foraging-based algorithm reduces the average user service time by up to 17.3% and 8.7% compared to the worst-case baseline and the Q-learning algorithm, respectively.
In Figure 5, we show how the average user service time changes as the number of users per group varies. The average user service time increases when there are more users in a group. This is because the UAV needs more time to complete the user requests when serving a group, so the service time of each group increases. The users in the later served groups then have to spend more time waiting for the UAV to complete the data transmission in the previous groups.
Figure 6 shows the users served by the UAV in an arbitrary case in which the UAV follows the optimal trajectory designed by the foraging-based algorithm. In this case, $G = 10$ groups are distributed in the wireless network and the UAV selects the potential served users with the maximal deployment profitability. From Figure 6, we can see that the UAV serves not only the groups with CSs and the groups near the initial position, but also groups without CSs and groups far from the initial position. This is because the UAV jointly considers the distance and the existence of CSs, which determine the consumed energy and the harvested energy, and in turn affect the profitability of serving a group.

5. Conclusions

In this paper, we have developed a novel framework to evaluate the deployment profitability of an energy harvesting UAV. The UAV gains reward by serving users in groups and harvesting energy from CSs, while its cost consists of the energy consumed for data transmission and movement. To maximize the deployment profitability, we have developed a novel algorithm based on foraging theory. The proposed foraging-based algorithm enables the UAV to find the optimal trajectory that achieves the maximal deployment profitability and minimizes the average service time of the served users. By ranking the profitability of each group and selecting groups in descending order of profitability, the UAV selects the potential served groups. The UAV trajectory is then designed by solving a queuing problem. Simulation results have shown that the proposed approach, with much lower computational complexity, yields significant gains in deployment profitability compared to a prior Q-learning algorithm. With the optimized deployment profitability, the proposed approach also reduces the average service time of the served users.

Author Contributions

Conceptualization, X.L. and C.Y.; methodology, X.L. and S.W.; software, X.L.; validation, X.L., S.W. and C.Y.; writing—original draft preparation, X.L.; writing—review and editing, X.L., S.W. and C.Y.; funding acquisition, C.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grants 61871041, 61671086 and 61629101, and in part by Beijing Natural Science Foundation and Municipal Education Committee Joint Funding Project under Grant KZ201911232046.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proof of Theorem 1

Theorem 1 presents the termination condition (28) under which the UAV stops selecting potential served groups so as to maximize the deployment profitability. To prove that the group selection $\boldsymbol{o}^{*}$ achieves the maximal deployment profitability when the inequality holds, we use proof by contradiction.
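Suppose that there exists an unselected group $j'$, i.e., $o_{j'}^{*} = 0$, whose profitability $p_{j'}$ is smaller than the deployment profitability of the group selection $\boldsymbol{o}^{*}$, so that:
$$p_{j'} = \frac{M_{j'}}{N_{j'}} < f(\boldsymbol{o}^{*}) = \frac{Q + \sum_{j \in \mathcal{G}} o_j^{*} M_j}{\sum_{j \in \mathcal{G}} o_j^{*} N_j}, \tag{A1}$$
and a group selection $\boldsymbol{o}'$, which is defined as:
$$o'_j = \begin{cases} 1, & o_j^{*} = 1 \text{ or } j = j', \\ 0, & \text{otherwise}. \end{cases} \tag{A2}$$
Assume that the deployment profitability $f(\boldsymbol{o}')$ is larger than the maximal deployment profitability $f(\boldsymbol{o}^{*})$. Then the following inequality holds:
$$f(\boldsymbol{o}') - f(\boldsymbol{o}^{*}) = \frac{Q + \sum_{j \in \mathcal{G}} o_j^{*} M_j + M_{j'}}{\sum_{j \in \mathcal{G}} o_j^{*} N_j + N_{j'}} - \frac{Q + \sum_{j \in \mathcal{G}} o_j^{*} M_j}{\sum_{j \in \mathcal{G}} o_j^{*} N_j} > 0. \tag{A3}$$
Since $N_j$ represents the energy consumption of serving the users in group $j$, which is a positive number, the denominators of both fractions are positive. Thus, (A3) can be rewritten as $\left( \sum_{j \in \mathcal{G}} o_j^{*} N_j \right) M_{j'} - \left( Q + \sum_{j \in \mathcal{G}} o_j^{*} M_j \right) N_{j'} > 0$. Then, we have:
$$p_{j'} = \frac{M_{j'}}{N_{j'}} > \frac{Q + \sum_{j \in \mathcal{G}} o_j^{*} M_j}{\sum_{j \in \mathcal{G}} o_j^{*} N_j} = f(\boldsymbol{o}^{*}). \tag{A4}$$
From (A4), we can see that the profitability of serving group $j'$ would actually be larger than the maximal deployment profitability of the group selection $\boldsymbol{o}^{*}$. Hence, inequality (A4) contradicts the condition in (28) that was assumed in (A1).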
Therefore, no group $j'$ outside the group selection $\boldsymbol{o}^{*}$ can increase the deployment profitability $f(\boldsymbol{o}^{*})$. Thus, the optimal group selection $\boldsymbol{o}^{*}$ achieves the maximal deployment profitability when the UAV selects the first $k$ groups with the largest group profitability.
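The algebraic step at the heart of the proof, namely that adding a group raises the deployment profitability exactly when that group's own profitability exceeds the current value, can be checked numerically. The Python sketch below is an illustration under stated assumptions: the deployment profitability is written as $(Q + \sum M_j)/\sum N_j$ over the selected groups, all $N_j$ are positive, and integer inputs are used with exact fractions so that ties are handled consistently. The function names and the toy numbers are hypothetical.

```python
from fractions import Fraction
from typing import List, Sequence


def deployment_profitability(selected: Sequence[int], M: Sequence[int],
                             N: Sequence[int], Q: int) -> Fraction:
    """f(o) = (Q + sum of M_j) / (sum of N_j) over the selected groups."""
    return Fraction(Q + sum(M[j] for j in selected),
                    sum(N[j] for j in selected))


def adding_group_helps(selected: Sequence[int], j_new: int, M: Sequence[int],
                       N: Sequence[int], Q: int) -> bool:
    """True if f increases when group j_new is added to the selection."""
    extended: List[int] = list(selected) + [j_new]
    return (deployment_profitability(extended, M, N, Q)
            > deployment_profitability(selected, M, N, Q))


def group_beats_current(selected: Sequence[int], j_new: int, M: Sequence[int],
                        N: Sequence[int], Q: int) -> bool:
    """True if p_{j_new} = M/N of the new group exceeds the current f."""
    return (Fraction(M[j_new], N[j_new])
            > deployment_profitability(selected, M, N, Q))


# Toy integer data (hypothetical): both tests agree for every candidate group.
M, N, Q = [12, 9, 4, 1], [2, 3, 2, 4], -10
agree = all(
    adding_group_helps([0, 1], j, M, N, Q) == group_beats_current([0, 1], j, M, N, Q)
    for j in (2, 3)
)
```

The two predicates return the same answer for any sign of `Q` as long as the denominators are positive, which is the equivalence used to pass from (A3) to (A4).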

References

  1. Saad, W.; Bennis, M.; Chen, M. A Vision of 6G Wireless Systems: Applications, Trends, Technologies, and Open Research Problems. IEEE Netw. 2020, 34, 134–142.
  2. Mozaffari, M.; Saad, W.; Bennis, M.; Nam, Y.; Debbah, M. A Tutorial on UAVs for Wireless Networks: Applications, Challenges, and Open Problems. IEEE Commun. Surv. Tutor. 2019, 21, 2334–2360.
  3. Gupta, L.; Jain, R.; Vaszkun, G. Survey of Important Issues in UAV Communication Networks. IEEE Commun. Surv. Tutor. 2016, 18, 1123–1152.
  4. Petritoli, E.; Leccese, F.; Ciani, L. Reliability Degradation, Preventive and Corrective Maintenance of UAV Systems. In Proceedings of the 2018 5th IEEE International Workshop on Metrology for AeroSpace (MetroAeroSpace), Rome, Italy, 20–22 June 2018; pp. 430–434.
  5. Zeng, Y.; Zhang, R. Energy-Efficient UAV Communication With Trajectory Optimization. IEEE Trans. Wirel. Commun. 2017, 16, 3747–3760.
  6. Yang, D.; Wu, Q.; Zeng, Y.; Zhang, R. Energy Tradeoff in Ground-to-UAV Communication via Trajectory Design. IEEE Trans. Veh. Technol. 2018, 67, 6721–6726.
  7. Lang, R.; Wang, J.; Chen, J.; Xu, Y.; Yang, Y.; Jiang, H.; Zhang, Y.; Xu, Y. Energy-Efficient Multi-UAV Coverage Deployment in UAV Networks: A Game-Theoretic Framework. China Commun. 2018, 15, 194–209.
  8. Wu, F.; Yang, D.; Xiao, L.; Cuthbert, L. Energy Consumption and Completion Time Tradeoff in Rotary-Wing UAV Enabled WPCN. IEEE Access 2019, 7, 79617–79635.
  9. Cong, J.; Li, B.; Guo, X.; Zhang, R. Energy Management Strategy based on Deep Q-network in the Solar-powered UAV Communications System. In Proceedings of the 2021 IEEE International Conference on Communications Workshops (ICC Workshops), Xiamen, China, 28–30 July 2021; pp. 1–6.
  10. Amorosi, L.; Chiaraviglio, L.; Galán-Jiménez, J. Optimal Energy Management of UAV-Based Cellular Networks Powered by Solar Panels and Batteries: Formulation and Solutions. IEEE Access 2019, 7, 53698–53717.
  11. Qing, W.; Chen, H.; Wang, X.; Yin, Y. Collision-free Trajectory Generation for UAV Swarm Formation Rendezvous. In Proceedings of the 2021 IEEE International Conference on Robotics and Biomimetics (ROBIO), Sanya, China, 27–31 December 2021; pp. 1861–1867.
  12. Hazra, K.; Shah, V.K.; Roy, S.; Deep, S.; Saha, S.; Nandi, S. Exploring Biological Robustness for Reliable Multi-UAV Networks. IEEE Trans. Netw. Serv. Manag. 2021, 18, 2776–2788.
  13. Alfeo, A.L.; Cimino, M.G.; De Francesco, N.; Lazzeri, A.; Lega, M.; Vaglini, G. Swarm Coordination of Mini-UAVs for Target Search using Imperfect Sensors. Intell. Decis. Technol. 2018, 12, 149–162.
  14. Pavlic, T.P.; Passino, K.M. Generalizing foraging theory for analysis and design. Int. J. Robot. Res. 2011, 30, 505–523.
  15. Feng, T.; Xie, L.; Yao, J.; Xu, J. UAV-Enabled Data Collection for Wireless Sensor Networks With Distributed Beamforming. IEEE Trans. Wirel. Commun. 2022, 21, 1347–1361.
  16. Chen, M.; Challita, U.; Saad, W.; Yin, C.; Debbah, M. Artificial Neural Networks-Based Machine Learning for Wireless Networks: A Tutorial. IEEE Commun. Surv. Tutor. 2019, 21, 3039–3071.
  17. Liu, X.; Chen, M.; Yin, C. Optimized Trajectory Design in UAV Based Cellular Networks for 3D Users: A Double Q-Learning Approach. J. Commun. Inf. Netw. 2019, 4, 24–31.
  18. Chen, M.; Mozaffari, M.; Saad, W.; Yin, C.; Debbah, M.; Hong, C.S. Caching in the Sky: Proactive Deployment of Cache-Enabled Unmanned Aerial Vehicles for Optimized Quality-of-Experience. IEEE J. Sel. Areas Commun. 2017, 35, 1046–1061.
  19. Al-Hourani, A.; Kandeepan, S.; Jamalipour, A. Modeling air-to-ground path loss for low altitude platforms in urban environments. In Proceedings of the IEEE Global Communications Conference, Austin, TX, USA, 8–12 December 2014.
  20. Al-Hourani, A.; Kandeepan, S.; Lardner, S. Optimal LAP altitude for maximum coverage. IEEE Wirel. Commun. Lett. 2014, 3, 569–572.
  21. Xu, J.; Zeng, Y.; Zhang, R. UAV-Enabled Wireless Power Transfer: Trajectory Design and Energy Optimization. IEEE Trans. Wirel. Commun. 2018, 17, 5092–5106.
  22. Zeng, Y.; Xu, J.; Zhang, R. Energy Minimization for Wireless Communication With Rotary-Wing UAV. IEEE Trans. Wirel. Commun. 2019, 18, 2329–2345.
  23. Xie, L.; Xu, J.; Zhang, R. Throughput Maximization for UAV-Enabled Wireless Powered Communication Networks. IEEE Internet Things J. 2019, 6, 1690–1703.
  24. Cui, J.; Liu, Y.; Nallanathan, A. Multi-Agent Reinforcement Learning-Based Resource Allocation for UAV Networks. IEEE Trans. Wirel. Commun. 2020, 19, 729–743.
  25. Thompson, C.D. The VLSI Complexity of Sorting. IEEE Trans. Comput. 1983, C-32, 1171–1184.
  26. Chen, M.; Yang, Z.; Saad, W.; Yin, C.; Poor, H.V.; Cui, S. A Joint Learning and Communications Framework for Federated Learning Over Wireless Networks. IEEE Trans. Wirel. Commun. 2021, 20, 269–283.
Figure 1. Architecture of an energy harvesting UAV network.
Figure 2. Deployment profitability as the number of groups varies.
Figure 3. Deployment profitability as the users per group vary.
Figure 4. Average user service time as the number of groups varies.
Figure 5. Average user service time as the users per group vary.
Figure 6. An illustrative example of the users served following the UAV trajectory designed by the proposed algorithm.
Table 1. List of notations.
| Notation | Description |
| --- | --- |
| $\mathcal{G}$ | Set of user groups |
| $\mathcal{U}$ | Set of users |
| $\mathcal{U}_j$ | Set of users in group $j$ |
| $D_i$ | Data request of user $i$ |
| $d_{ij}$ | Transmission distance between user $i$ and the UAV at group $j$ |
| $H$ | UAV altitude |
| $\bar{g}_{ij}$ | Average channel gain of probabilistic UAV channel |
| $c_{ij}$ | Downlink rate of user $i$ in group $j$ |
| $B$ | Bandwidth |
| $\rho_{ij}$ | Bandwidth allocation coefficient |
| $P_{ij}^{T}$ | Transmission power of serving user $i$ |
| $t_{ij}^{T}$ | Transmission time of user $i$ |
| $t_j^{T}$ | Transmission time of group $j$ |
| $E_j^{C}$ | Harvested energy while UAV serving group $j$ |
| $E_j^{T}$ | Transmission energy while UAV serving group $j$ |
| $E_j^{M}$ | Movement energy while UAV serving group $j$ |
| $E_j^{H}$ | Hovering energy while UAV serving group $j$ |
| $Q$ | Maintenance cost |
| $\boldsymbol{o}$ | Group selection indicator |
| $\boldsymbol{e}$ | UAV trajectory |
| $f(\boldsymbol{o})$ | Deployment profitability of UAV trajectory |
| $t_{\tau}^{F}(\boldsymbol{e})$ | Total time after UAV serving a group at time slot $\tau$ |
| $t_{ij}^{W}(\boldsymbol{e})$ | Total service time of user $i$ in group $j$ |
| $M_j$ | Reward of serving group $j$ |
| $N_j$ | Cost of serving group $j$ |
| $p_j$ | Profitability of group $j$ |
Table 2. System parameters.
| Parameter | Description | Value |
| --- | --- | --- |
| $H$ | UAV altitude | 100 m |
| $\alpha$ | Path loss exponent | 2 |
| $\eta$ | NLoS attenuation factor | 0.3 |
| $X$, $Y$ | Environment constants | 11.95, 0.136 |
| $\sigma^{2}$ | Noise power | −84 dBm |
| $B$ | Bandwidth | 1 MHz |
| $\beta_0$ | Channel power gain | −30 dB |
| $c_1$, $c_2$ | Propulsion parameters | $9.26 \times 10^{-4}$, 2250 |
| $P^{T}$ | Transmission power | 0.2 W |
| $Q$ | Maintenance cost | 10 |
| $q^{S}$ | Service reward | 2 |
| $q^{C}$ | Energy cost | $1.4 \times 10^{-3}$ |
Table 3. Operation time (s) as the number of groups varies.
| Number of Groups | Foraging-Based Algorithm | Q-Learning Algorithm |
| --- | --- | --- |
| 5 | $5.8 \times 10^{-5}$ | 0.725 |
| 10 | $6.9 \times 10^{-5}$ | 1.263 |
| 15 | $1.12 \times 10^{-4}$ | 1.779 |
| 20 | $1.14 \times 10^{-4}$ | 2.436 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Liu, X.; Wang, S.; Yin, C. Biological Intelligence Inspired Trajectory Design for Energy Harvesting UAV Networks. Sensors 2023, 23, 863. https://doi.org/10.3390/s23020863
