Article

Unmanned Aerial Vehicle Cooperative Data Dissemination Based on Graph Neural Networks

School of Information, North China University of Technology, No. 5 Jinyuanzhuang Road, Beijing 100144, China
*
Author to whom correspondence should be addressed.
Sensors 2024, 24(3), 887; https://doi.org/10.3390/s24030887
Submission received: 13 December 2023 / Revised: 31 December 2023 / Accepted: 8 January 2024 / Published: 30 January 2024
(This article belongs to the Section Vehicular Sensing)

Abstract

Unmanned Aerial Vehicles (UAVs) have critical applications in various real-world scenarios, including mapping unknown environments, military reconnaissance, and post-disaster search and rescue. In these scenarios where communication infrastructure is missing, UAVs will form an ad hoc network and perform tasks in a distributed manner. To efficiently carry out tasks, each UAV must acquire and share global status information and data from neighbors. Meanwhile, UAVs frequently operate in extreme conditions, including storms, lightning, and mountainous areas, which significantly degrade the quality of wireless communication. Additionally, the mobility of UAVs leads to dynamic changes in network topology. Therefore, we propose a method that utilizes graph neural networks (GNN) to learn cooperative data dissemination. This method leverages the network topology relationship and enables UAVs to learn a decision policy based on local data structure, ensuring that all UAVs can recover global information. We train the policy using reinforcement learning that enhances the effectiveness of each transmission. After repeated simulations, the results validate the effectiveness and generalization of the proposed method.

1. Introduction

UAV swarms have gained significant attention because they can perform complex tasks that a single UAV cannot. There are numerous practical applications, including unknown environment mapping [1], military reconnaissance [2,3], and search and rescue operations in post-disaster areas [4]. In these scenarios, communication infrastructure often cannot be installed. Thus, UAVs need to form an ad hoc network and operate in a distributed manner.
UAV data dissemination is a key component of UAV functionality in many important application scenarios, such as safe navigation of UAVs in unknown environments [5], intelligent transportation systems [6,7,8], and emergency communication services in post-disaster areas [9,10,11].
In this paper, we focus on a mapping scenario after geological disasters where UAVs play a crucial role in enhancing search and rescue efficiency. Figure 1 shows the UAVs performing data collection and data dissemination under the landslide scenario, with the pink solid circle representing data fragments (i.e., sub-maps), the pink arrow representing the current data fragments being collected by the UAV, and the blue arrow representing the communication between the UAVs.
Sub-maps are transmitted between UAVs through communication until each UAV obtains all the sub-maps. This paper aims to complete the dissemination of sub-maps in the shortest possible time; while exchanging data fragments, UAVs also exchange status information about other UAVs in the network with their neighbors.

2. Related Work

This section introduces recent work related to this paper. Section 2.1 covers the main application scenarios of data dissemination. Section 2.2 covers work that uses data encoding to improve the quality of data transmission. Section 2.3 introduces examples of using graphs to represent distributed network structures and the advantages of GNNs in processing graph-structured data.

2.1. Data Dissemination

Most research on data dissemination has been carried out in various vehicular ad hoc networks (VANETs) applications and requires roadside units (RSU) to feed data into the network. Yang et al. [12] propose a hybrid data dissemination model with both vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) disseminations in automatic driving scenarios. The RSU injects data to the vehicles and the data disseminates via the vehicle network. For managing the driving status of the platoon, Li et al. [13] propose a method in which a lead vehicle transmits driving information to following autonomous vehicles by using multi-hop data dissemination in intelligent transportation systems.
In recent years, there has been an increasing amount of literature on data dissemination algorithms in Internet of Things (IoT) applications regarding UAV-assisted communications [14]. To realize UAV-assisted edge computing resource scheduling for platooning vehicles, Liu et al. [15] use the Time Division Multiple Access (TDMA) protocol for communication between the UAV and the ground vehicles. Similarly, Shah et al. [16] propose a data dissemination technique using a time barrier mechanism to reduce the overhead of messages that can clutter the network. To improve the efficiency of data dissemination, Zhang et al. [17] propose a novel UAV-enabled scheduling protocol consisting of a proactive caching policy and a file-sharing strategy in V2V networks.
In this paper, there is no communication infrastructure such as RSU. Data collection and dissemination are all completed by UAVs. Therefore, the existing methods cannot be applied to the scenario proposed in this paper. We design a distributed communication algorithm for UAVs to realize data dissemination.

2.2. Data Coding

The performance of data dissemination can be significantly compromised by the limited bandwidth resources of UAVs. Recent studies have shown that data coding can improve bandwidth utilization. In [6], a scheduling strategy is proposed to provide efficient data dissemination with network coding and vehicular caching where infrastructures are unavailable. Ref. [18] considers the wireless network-coded video broadcast problem for users with multiple interfaces to minimize the number of transmission slots.
In this paper, we exploit packet coding to improve transmission quality.

2.3. Graph Representation and Graph Neural Networks

Graphs are widely utilized to illustrate UAV cooperative data dissemination. Many graph-based algorithms have been developed to facilitate UAV cooperative data dissemination. Research [19] indicates that neighbor selection based on graphs can enhance the performance of UAV cooperation. To address the broadcasting of live media streaming, Ref. [18] proposes a transmission strategy for multi-user mobile wireless networks. A significant step in this strategy involves finding a maximal connected subgraph within the network, giving priority to live media streaming dissemination. An investigation [20] into the coded cooperative data exchange (CCDE) problem in multi-channel multi-hop wireless networks adopts the time-expanded graph approach. The CCDE addresses the recovery of desired packets in a connected network [21] and has been proven NP-hard for general topologies [22]. Inspired by the time-expanded graph, Ref. [23] resolves the CCDE problem using a conflict graph.
GNNs have emerged as a powerful tool for learning representations of graph-structured data and performing various tasks on graphs. Graph Network (GN) blocks, a well-known spatial graph convolution method, define functions for relational reasoning over graph-structured representations. Graphs can express arbitrary relationships among entities, so the input to a GN, rather than a fixed architecture, determines how representations interact. Because graphs represent entities and their relations as sets, they are permutation-invariant, which makes GNs unaffected by the order of elements. The per-edge and per-node functions of a GN can be reused across all edges and nodes, allowing a single GN to operate on graphs of different sizes and shapes [24].
This paper focuses on the design of a graph neural network method to realize UAV cooperative data dissemination. The proposed method comprises the design of the graph structure and the communication protocol; the data generated by UAV interactions are then fed to the GNN as input, and the GNN is trained through reinforcement learning.
The main contributions of this work are summarized as follows:
  • The cooperative data dissemination problem is described in a distributed manner. We use graph structures to represent ad hoc networks and design the data structures of nodes and edges.
  • This work improves wireless transmission quality through data encoding. A wireless communication protocol is designed to avoid message collision and adopts the Signal-to-Interference-plus-Noise Ratio (SINR) to evaluate the communication quality.
  • We propose a distributed cooperative data dissemination method based on GNN. The method can adapt to the dynamic topology and enhance network efficiency and stability. We train the policy with a reward function that enhances the efficiency of each transmission and reduces the required number of time slots.
The remainder of this paper is organized as follows. Section 3 presents the system model and the cooperative data dissemination method is proposed in Section 4. We build the simulation and give a performance evaluation in Section 5. Finally, Section 6 concludes this work.

3. System Model

This section describes the system model of this work. It first performs scene modeling and symbolic expression, then introduces the input data structure in detail, and finally introduces the wireless communication model and data encoding method used in this work.

3.1. Scene Description

Let $N = \{1, 2, \ldots, n\}$ denote the UAV set. The packet set is represented as $B = \{b_1, b_2, \ldots, b_{|B|}\}$, where $|B|$ is the total number of packets. $\kappa^i$ is a vector recording which data packets, by serial number, have been acquired by UAV $i \in N$. It is a one-dimensional vector of length $|B|$ with elements $k_{i,m} \in \{0, 1\}$, each indicating whether packet $b_m \in B$ has been obtained by UAV $i$ [25]. Let $(i, j)$ represent a transmission link from UAV $i \in N$ to UAV $j \in N$.
For a better understanding, we illustrate an example in Figure 2. For simplicity, we choose four UAVs holding different packets. The packet vectors of the UAVs at time slot $t$ are assumed to be $\kappa_t^1 = [1, 0, 0, 1]$, $\kappa_t^2 = [0, 1, 1, 0]$, $\kappa_t^3 = [0, 0, 1, 0]$, and $\kappa_t^4 = [0, 0, 0, 1]$. As can be seen from this figure, UAV 1 has two neighbors, UAV 2 and UAV 4. A transmission from UAV 2 to UAV 1 lets UAV 1 obtain more packets than a transmission from UAV 4 to UAV 1. The number of packets gained in a single transmission therefore depends on the difference between the UAVs' packet vectors and on the network topology.
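The comparison in this example can be checked with a few lines of Python. This is only an illustrative sketch: the helper name `new_packets` and the dictionary layout are assumptions made here, not part of the paper.

```python
def new_packets(receiver, sender):
    """Count packets the sender holds that the receiver is still missing."""
    return sum(1 for r, s in zip(receiver, sender) if r == 0 and s == 1)


# Packet vectors of the four-UAV example (index 0 corresponds to packet b_1).
kappa = {
    1: [1, 0, 0, 1],
    2: [0, 1, 1, 0],
    3: [0, 0, 1, 0],
    4: [0, 0, 0, 1],
}

gain_from_2 = new_packets(kappa[1], kappa[2])  # b_2 and b_3 are new to UAV 1
gain_from_4 = new_packets(kappa[1], kappa[4])  # UAV 1 already holds b_4
print(gain_from_2, gain_from_4)
```

Under these vectors, a link from UAV 2 delivers two new packets to UAV 1, while a link from UAV 4 delivers none.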

3.2. Local Data Structure

The local data structure is the input of the proposed method. It is a combination of several feature vectors related to the transmission process. At time slot $t$, the local data structure of UAV $i \in N$ is formulated as $\{M_t^{i,j}, T_t^{i,j}, K_t^{i,j}, L_t^{i,j}, P_t^{i,j}\}_{j \in N}$. It represents UAV $i$'s knowledge about the network. $M_t^{i,j}$ is the status of UAV $j$ known to UAV $i$; each UAV maintains the state of the other UAVs in the network, including position and velocity. $T_t^{i,j}$ records the time slot at which the state of UAV $j$ was observed. $K_t^{i,j}$ is the packet vector of UAV $j$ as observed by UAV $i$ after a transmission from UAV $j$ to UAV $i$. These three vectors are the attributes of UAVs. When a transmission link is established, two further vectors record the changes in network topology: $L_t^{i,j}$ denotes the time slot at which transmission link $(i, j)$ occurs, and $P_t^{i,j}$ denotes the first relay UAV on the path from UAV $j$ to UAV $i$.
First, we randomly initialize the UAV attributes as the initial status of the multi-UAV system. Taking UAV $i$ as an example, its state is composed of the current position $\mathrm{pos}_t^i$ and velocity $\mathrm{vel}_t^i$, both of which are one-dimensional vectors that change over time. The status of UAV $i$ at time slot $t$ is represented as $M_t^{i,i}$ and the current time slot as $T_t^{i,i}$, so that:
$$M_t^{i,i} = [\mathrm{pos}_t^i; \mathrm{vel}_t^i], \quad T_t^{i,i} = t.$$
When transmission link $(j, i)$ occurs at time slot $t$, UAV $i$ observes and records the status of UAV $j$, denoted as $M_t^{i,j}$, the current time slot $T_t^{i,j}$, and the packet vector of UAV $j$ observed by UAV $i$, denoted as $K_t^{i,j}$. Then:
$$M_t^{i,j} = M_t^{j,j}, \quad T_t^{i,j} = T_t^{j,j}, \quad K_t^{i,j} = K_t^{j,j}.$$
When UAV $i$ receives the desired data packets, its packet vector $\kappa_t^i$ changes. In the example of Figure 2, $\kappa_t^1 = [1, 0, 0, 1]$ and $\kappa_t^2 = [0, 1, 1, 0]$. When transmission link $(2, 1)$ occurs, UAV 1 receives $b_2$ and $b_3$ from UAV 2. Thus, $\kappa_t^1$ is updated from $[1, 0, 0, 1]$ to $[1, 1, 1, 1]$. At time slot $t$, the packet vector that UAV $i$ observes for itself is denoted by $K_t^{i,i}$. The time slot at which UAV $i$ sends packets to UAV $j$ is denoted by $L_t^{i,j}$.
UAVs receive physical status information about other UAVs from their neighbors and record the trajectory of the information transmission. We define the parent reference $P_t^{i,j}$ to record the destination node on the path from UAV $j$ to UAV $i$; it is stored in UAV $i$. For example, if UAV $l$ directly sends packets and UAVs' status to UAV $i$, then $P_t^{i,l} = i$, that is, UAV $i$ itself is recorded as the parent reference for the direct transmission link $(l, i)$. When UAV $i$ instead receives UAV $l$'s status from UAV $j$, UAV $i$ records the same parent reference as UAV $j$.
$$K_t^{i,i} = \kappa_t^i, \quad L_t^{i,j} = t, \quad P_t^{i,l} = P_t^{j,l}.$$
UAV $i$ picks up information about other UAVs through its neighbors. When UAV $i$ receives UAV $l$'s status from UAV $j$, UAV $i$'s entries for UAV $l$ are made consistent with those of UAV $j$:
$$(M_t^{i,l} = M_t^{j,l}) \wedge (T_t^{i,l} = T_t^{j,l}) \wedge (K_t^{i,l} = K_t^{j,l}) \wedge (P_t^{i,l} = P_t^{j,l}), \quad \forall l \in N \setminus j.$$
By accumulating the knowledge of other UAVs in this way, UAV $i$ builds a connected subgraph consisting of communication links and UAVs, which is used for training.
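The update rules of this subsection can be sketched as a small table update. The `Entry` class, the `receive` function, and the rule that a UAV never overwrites its own entry are assumptions introduced here for illustration; the paper does not prescribe an implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Entry:
    """What one UAV knows about another UAV (hypothetical container)."""
    status: List[float]               # M: position and velocity
    obs_time: int                     # T: slot in which the status was observed
    packets: List[int]                # K: observed packet vector
    parent: Optional[int] = None      # P: parent reference
    link_time: Optional[int] = None   # L: slot of the last direct link


def receive(table_i: Dict[int, Entry], table_j: Dict[int, Entry],
            i: int, j: int, t: int) -> None:
    """Update UAV i's table after a successful transmission from UAV j at slot t."""
    # Second-hand knowledge: copy M, T, K and P for every other UAV l.
    for l, e in table_j.items():
        if l in (i, j):
            continue  # assumption: UAV i never overwrites its own entry
        old = table_i.get(l)
        table_i[l] = Entry(list(e.status), e.obs_time, list(e.packets),
                           e.parent, old.link_time if old else None)
    # First-hand knowledge about the sender: direct link, so P = i and L = t.
    own = table_j[j]
    table_i[j] = Entry(list(own.status), own.obs_time, list(own.packets),
                       parent=i, link_time=t)


# UAV 2 learned about UAV 3 directly (parent 2) at slot 4, then sends to UAV 1.
table_1 = {1: Entry([0.0, 0.0, 3.0, 0.0], 5, [1, 0, 0, 1])}
table_2 = {
    2: Entry([50.0, 0.0, 0.0, 3.0], 5, [0, 1, 1, 0]),
    3: Entry([90.0, 10.0, 3.0, 0.0], 4, [0, 0, 1, 0], parent=2, link_time=4),
}
receive(table_1, table_2, i=1, j=2, t=5)
```

After the call, UAV 1 holds a first-hand entry for UAV 2 and a second-hand entry for UAV 3 that keeps UAV 2's parent reference and observation time.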

3.3. Communication Model

This paper imposes quality-of-service requirements on wireless communications, so the establishment of a communication link is conditional [26,27]. We use the widely adopted SINR model to simulate wireless communications. The SINR threshold is set to a constant $\gamma$. At time slot $t$, UAV $i$ can send to UAV $j \in R_t^i$ only if the SINR value at the receiver, $r_t^{i,j}$, satisfies:
$$r_t^{i,j} = \frac{\theta_t^i \left(d_t^{i,j}\right)^{-\alpha}}{\eta + \sum_{k \in N \setminus i} \theta_t^k \left(d_t^{k,j}\right)^{-\alpha}} \geq \gamma,$$
where $d_t^{i,j}$ is the distance between UAV $i$ and UAV $j$ at time slot $t$, $\theta_t^i$ is the transmission power of UAV $i$, $\eta$ is the noise power, and $\alpha$ is the path-loss exponent. Let $R_t^i$ denote the set of UAVs that receive a transmission from UAV $i \in N$ at time slot $t$. We can express the probability $p$ of successful packet reception at UAV $j$ as:
$$p(j \in R_t^i) = \begin{cases} 1, & r_t^{i,j} \geq \gamma \\ 0, & r_t^{i,j} < \gamma \end{cases}$$
A fundamental limitation of ad hoc networks with a shared medium is that the UAV can only receive at most one transmission at a time slot. Two transmissions for the same destination will result in packet collisions and no successful decoding of the data at the receiver. Additionally, in an ad hoc network, all nodes will typically compete for the same medium and therefore be able to decode any packet transmission they come within range of, regardless of whether they are the intended recipients. We allow UAVs to eavesdrop on each other’s transmissions [28].
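The SINR condition can be illustrated numerically as follows. This is a minimal sketch under stated assumptions: all quantities are in linear units rather than dBm, only UAVs in the hypothetical `senders` set transmit in the slot, the receiver itself is excluded from the interference sum, and the positions, powers, and threshold are made-up values.

```python
import math
from typing import Dict, Set, Tuple


def sinr(tx: int, rx: int, senders: Set[int],
         pos: Dict[int, Tuple[float, float]], power: Dict[int, float],
         noise: float, alpha: float) -> float:
    """SINR at receiver rx for transmitter tx, with other senders interfering."""
    def received(k: int) -> float:
        d = math.dist(pos[k], pos[rx])       # Euclidean distance to the receiver
        return power[k] * d ** (-alpha)      # path-loss model: theta * d^(-alpha)

    interference = sum(received(k) for k in senders if k not in (tx, rx))
    return received(tx) / (noise + interference)


pos = {1: (0.0, 0.0), 2: (100.0, 0.0), 3: (300.0, 0.0)}
power = {1: 1.0, 2: 1.0, 3: 1.0}
gamma = 2.0                                   # hypothetical linear threshold

value = sinr(tx=1, rx=2, senders={1, 3}, pos=pos, power=power,
             noise=1e-6, alpha=2.0)
link_ok = value >= gamma                      # p(j in R) is 1 if True, else 0
print(round(value, 3), link_ok)
```

Here UAV 3 transmits at the same time as UAV 1, so its signal appears in the denominator and lowers the SINR seen by UAV 2.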

3.4. Network Coding Scheme

We will use the example presented in Figure 2 to demonstrate the advantages of the coding scheme. Due to the limitation of bandwidth resources of the UAVs, we use the network coding scheme to maximize the bandwidth efficiency.
Network coding can enhance transmission efficiency, improving throughput and reducing delay by sending a single coded packet that combines several original packets [29,30,31,32]. Let $W(i)$ denote the packets that UAV $i$ wants and $H(i)$ the packets that UAV $i$ has. $W(i)$ and $H(i)$ satisfy $W(i) \cap H(i) = \emptyset$ and $W(i) \cup H(i) = B$, $\forall i \in N$. The packets transmitted by UAVs are coded with the binary sum $\oplus$, and the corresponding coded packet is represented as $\bigoplus_{b_m \in H(i)} b_m$, $\forall i \in N$ [18]. To better understand the network coding scheme, we assume $B = \{b_1, b_2, b_3, b_4\}$, $H(1) = \{b_1, b_3\}$, $W(1) = \{b_2, b_4\}$, $H(2) = \{b_2, b_4\}$, and $W(2) = \{b_1, b_3\}$. Given a communication link $(2, 1)$, UAV 2 transmits $b_2 \oplus b_4$ to UAV 1 at time slot $t$; then $H(1) = \{b_1, b_2, b_3, b_4\}$ and $W(1) = \emptyset$. According to [33], after receiving the coded packet, the receiver can instantly decode the packets it wants.
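The binary sum used for coding is a byte-wise XOR. The sketch below only illustrates that operation and its basic inverse property, namely that a receiver holding all but one of the combined packets can recover the remaining one; the payloads and the helper `xor_bytes` are hypothetical, and packets are assumed to have equal length.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise binary sum of two equal-length packets."""
    if len(a) != len(b):
        raise ValueError("packets must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


b2 = b"\x12\x34\x56"            # made-up payload for packet b_2
b4 = b"\xab\xcd\xef"            # made-up payload for packet b_4

coded = xor_bytes(b2, b4)       # the single coded packet b_2 XOR b_4

# A receiver that already holds b_2 recovers b_4 by XOR-ing it out again.
recovered_b4 = xor_bytes(coded, b2)
print(recovered_b4 == b4)
```

The coded packet has the same size as one original packet, which is where the bandwidth saving comes from.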

4. Proposed Solution

The data structure addressed in this paper is a directed graph, and it does not require connectivity. However, traditional Graph Convolutional Networks (GCNs) have certain limitations when handling non-connected graphs. These limitations arise because the convolution operations in traditional GCNs are based on the Laplacian matrix, which necessitates a connected graph. Therefore, we use a highly adaptive spatial graph neural network algorithm, which has good local perception and scalability.
Section 4.1 introduces a spatial graph neural network and gives the architecture of the method proposed by this paper. Section 4.2 introduces the transmission–response protocol and describes the data update process in detail. Section 4.3 introduces the reinforcement learning algorithm used in this paper.

4.1. The Local Policy with Aggregation Graph Neural Networks

Let $S_t^i$ denote the set of receivers decided by the policy for UAV $i \in N$ at time slot $t$. Let $\pi$ denote the local policy that gives UAV $i$'s receiver set. The policy consumes the local data structure and outputs a set of receivers for each UAV. The remainder of this section describes the operation of $\pi$ in detail.
$$S_t^i = \pi\left(\{M_t^{i,j}, T_t^{i,j}, K_t^{i,j}, L_t^{i,j}, P_t^{i,j}\}_{j \in N}\right).$$

4.1.1. Definition of Graph

First, we use a graph $G_t^i = \{V_t^i, E_t^i, u\}$ to represent UAV $i$'s knowledge about the ad hoc network at time slot $t$. According to the system model, the node feature set is formed as $V_t^i = \{M_t^{i,j}, T_t^{i,j}, K_t^{i,j}, L_t^{i,j}\}_{j \in N}$. $(i, j)$ is a transmission link over which packets are successfully transmitted from UAV $i$ to UAV $j \in R_t^i$. The edge feature can then be represented as $(P_t^{i,j}, j)$, $\forall j \in N \setminus i$. The set of all directed edges in the graph is $E_t^i = \{(P_t^{i,j}, j)\}_{j \in N \setminus i}$.

4.1.2. Graph Network Block

To better utilize graph-structured data, we use the GN block as the main part of the policy function for reinforcement learning [24]. The input of the GN block is a graph that expresses UAVs as separate entities and their interactions as edges. The GN block treats nodes and edges as two sets, which means GNs are permutation invariant: the order of nodes and edges does not influence the output. The GN block uses the graph convolution operation with learnable coefficients, which weight the products of the graph signal with powers of the adjacency matrix [34,35]. To use the GN block, we must convert the local data structure into a graph signal that can be computed on, represented as vectors. We flatten the node feature $\{M_t^{i,j}, T_t^{i,j}, K_t^{i,j}, L_t^{i,j}\}$ into a one-dimensional vector $v_n$ and the edge feature $(P_t^{i,j}, j)$ into a one-dimensional vector $e_l$, where $n$ and $l$ are the indexes of nodes and directed edges. The local data structure is thus converted into the graph signal $G = \{\{v_n\}, \{e_l\}, u\}$, where $u$ is the current time slot $t$. Define $GN(\cdot)$ as a function of $G$ with three parts: $\varphi^v$ and $\varphi^e$ are the update functions applied to the original node and edge features, and $\rho^{e \to v}$ is an aggregation function applied to edge features. Applying the GN block transforms the original signal into $G' = \{\{v'_n\}, \{e'_l\}, u\}$:
$$e'_l = \varphi^e(e_l, v_{r_l}, v_{s_l}, u), \quad \bar{e}'_n = \rho^{e \to v}(E'_n),$$
$$v'_n = \varphi^v(\bar{e}'_n, v_n, u),$$
where $E'_n = \{(e'_l, r_l, s_l)\}_{r_l = n,\ l = 1:N^e}$, $N^e$ is the number of edges, and $s_l$ and $r_l$ are the sender and receiver nodes of edge $l$. The aggregation function $\rho^{e \to v}$ takes the set of transformed incident edge features $E'_n$ at node $n$ and generates the fixed-size latent vector $\bar{e}'_n$. The Aggregation GNN updates edge and node features with learnable non-linear functions:
$$\varphi^e(e_l, v_{r_l}, v_{s_l}, u) = NN_e([e_l, v_{r_l}, v_{s_l}, u]),$$
$$\varphi^v(\bar{e}'_n, v_n, u) = NN_v([\bar{e}'_n, v_n, u]),$$
where $NN_e$ and $NN_v$ are both Multi-layer Perceptrons (MLPs). Moreover, the aggregation function $\rho^{e \to v}$ needs to deal with varying numbers of unordered graph signals. Thus, we normalize the output as follows [36]:
$$\rho^{e \to v}(E'_n) = \frac{1}{|E'_n|} \sum_{e'_l \in E'_n} e'_l.$$
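One pass of the GN block (edge update, mean aggregation over incoming edges, node update) can be sketched in plain Python. The toy update functions below are simple sums standing in for the MLPs $NN_e$ and $NN_v$, and returning a zero vector for nodes without incoming edges is an assumption made here; neither is taken from the paper.

```python
from typing import Callable, List, Tuple

Vec = List[float]


def gn_block(nodes: List[Vec], edges: List[Vec], senders: List[int],
             receivers: List[int], u: float,
             phi_e: Callable[[Vec, Vec, Vec, float], Vec],
             phi_v: Callable[[Vec, Vec, float], Vec]) -> Tuple[List[Vec], List[Vec]]:
    """Apply one edge update, mean aggregation, and node update."""
    # Edge update: each edge sees its own feature plus receiver and sender nodes.
    new_edges = [phi_e(e, nodes[r], nodes[s], u)
                 for e, s, r in zip(edges, senders, receivers)]
    edge_dim = len(new_edges[0]) if new_edges else 0
    new_nodes = []
    for n, v in enumerate(nodes):
        incoming = [e for e, r in zip(new_edges, receivers) if r == n]
        if incoming:
            # Mean over incoming edges, independent of their order and count.
            agg = [sum(col) / len(incoming) for col in zip(*incoming)]
        else:
            agg = [0.0] * edge_dim   # assumption: no incoming edges gives zeros
        new_nodes.append(phi_v(agg, v, u))
    return new_nodes, new_edges


# Toy stand-ins for the learnable MLPs (illustrative only).
toy_phi_e = lambda e, v_r, v_s, u: [e[0] + v_r[0] + v_s[0] + u]
toy_phi_v = lambda agg, v, u: [agg[0] + v[0]]

nodes = [[1.0], [2.0], [3.0]]
edges = [[0.5], [1.5]]            # edge 0: node 0 -> 2, edge 1: node 1 -> 2
new_nodes, new_edges = gn_block(nodes, edges, senders=[0, 1], receivers=[2, 2],
                                u=0.0, phi_e=toy_phi_e, phi_v=toy_phi_v)
print(new_nodes, new_edges)
```

Because the aggregation is a mean over a set, swapping the two edges in the input leaves the updated node features unchanged, which is the permutation invariance discussed above.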

4.1.3. The Encoder-Process–Decoder Architecture

Inspired by [24,28], we add the encoder $f_{enc}$ and the decoder $f_{dec}$ layers on both sides of the GN layers to form the Encoder-Process–Decoder architecture illustrated in Figure 3. The linear output function $f_{out}$ takes a high-dimensional vector that concatenates the outputs of every GN stage [34,37] and outputs the required low-dimensional vector:
$$G' = f_{out}\left(\left[f_{dec}(f_{enc}(G)),\ f_{dec}(GN(f_{enc}(G))),\ \ldots\right]\right),$$
where $f_{out}$ computes the logarithm of the Boltzmann distribution and then generates a discrete distribution using the Gumbel-Softmax. At each time slot, each UAV samples a receiver of its transmission from this distribution. In particular, the number of GN operations determines the receptive field of the GNN, that is, how far packets can travel along edges in the network. Selecting an appropriate receptive field improves the performance of the method [38].
The receptive field refers to the specific region in the input space that a neuron or a group of neurons in a neural network is sensitive to. It is well-known that the receptive field is a critical factor for neural networks affecting performance. It determines which input signals influence the activation of the neuron or the response of the network. The receptive field can be conceptualized as a window through which the neuron or network “views” and processes information. The size and shape of the receptive field can vary depending on the architecture and parameters of the neural network.
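The Gumbel-Softmax sampling step mentioned for $f_{out}$ can be sketched as follows. The logits, the temperature `tau`, and the final arg-max selection of a receiver are illustrative assumptions; the paper does not state these details.

```python
import math
import random
from typing import List


def gumbel_softmax(logits: List[float], tau: float,
                   rng: random.Random) -> List[float]:
    """Return a relaxed one-hot sample over the candidate receivers."""
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)        # avoid log(0)
        g = -math.log(-math.log(u))         # standard Gumbel noise
        noisy.append((logit + g) / tau)
    top = max(noisy)                        # subtract the max for stability
    exps = [math.exp(z - top) for z in noisy]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)                      # fixed seed for repeatability
logits = [0.1, 2.0, -1.0, 0.5]              # hypothetical scores for 4 UAVs
probs = gumbel_softmax(logits, tau=1.0, rng=rng)
receiver = max(range(len(probs)), key=probs.__getitem__)
print(probs, receiver)
```

Lower temperatures push the sample closer to a one-hot vector, while higher temperatures make it more uniform.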

4.2. Transmission–Response Protocol Design

In this section, we design a communication protocol and introduce our method in detail. The protocol is divided into two main phases: a transmission phase and a response phase. In the transmission phase, the GNN outputs recipients for each UAV and packet transfer occurs. In the response phase, the recipients of the transmission can respond. The algorithm we designed is summarized as Algorithm 1.
Algorithm 1 Transmission–Response Protocol
Require: $N$, $B$, $\pi$, $t$, $\mathrm{pos}_t^i$, $\mathrm{vel}_t^i$, $\kappa_t^i$
  1: while $\exists\, i \in N,\ b_m \in B$ such that $\kappa_{i,m}$ is 0 do
  2:     // transmission phase begins
  3:     $M_t^{i,i} := [\mathrm{pos}_t^i; \mathrm{vel}_t^i]$, $T_t^{i,i} := t$, $K_t^{i,i} := \kappa_t^i$, $\forall i \in N$, $t \geq 0$
  4:     $S_t^i := \pi(\{M_t^{i,j}, T_t^{i,j}, K_t^{i,j}, L_t^{i,j}, P_t^{i,j}\})$, $\forall i \in N$
  5:     $p(j \in R_t^i)_{i \in N} = f_{SINR}\{S_t^i, \mathrm{pos}_t^i\}_{i \in N}$
  6:     UAVs update local data structures, $\forall j \in R_t^i$, $l \in N \setminus j$
  7:    if $\kappa_t^{j,m}$ is 0 and $\kappa_t^{i,m}$ is 1, $b_m \in B$ then
  8:       $(M_t^{j,l} := M_t^{i,l}) \wedge (T_t^{j,l} := T_t^{i,l}) \wedge (K_t^{j,l} := K_t^{i,l}) \wedge (P_t^{j,l} := P_t^{i,l})$
  9:       $(M_t^{j,i} := M_t^{i,i}) \wedge (T_t^{j,i} := t) \wedge (K_t^{j,i} := K_t^{i,i}) \wedge (S_t^{j,i} := j)$
10:    end if
11:     // response phase begins
12:     $(j \in R_t^i) \wedge (j \in S_t^i) \Rightarrow i \in \bar{S}_t^j$, $\forall j \in N$
13:     $p(i \in \bar{R}_t^j)_{j \in N} = f_{SINR}\{\bar{S}_t^j, \mathrm{pos}_t^j\}_{j \in N}$
14:     UAVs update local data structures, $\forall i \in \bar{R}_t^j$, $l \in N \setminus i$
15:    if $\kappa_t^{i,m}$ is 0 and $\kappa_t^{j,m}$ is 1, $b_m \in B$ then
16:       $(M_t^{i,l} := M_t^{j,l}) \wedge (T_t^{i,l} := T_t^{j,l}) \wedge (K_t^{i,l} := K_t^{j,l}) \wedge (P_t^{i,l} := P_t^{j,l})$
17:       $(M_t^{i,j} := M_t^{j,j}) \wedge (T_t^{i,j} := t) \wedge (K_t^{i,j} := K_t^{j,j}) \wedge (P_t^{i,j} := i) \wedge (L_t^{i,j} := t)$
18:    end if
19:     $t := t + 1$
20: end while
The communication time window $t$ consists of both a transmission and a response phase. During the first half of a time window, UAVs first update their local data structures with local observations. Then, the updated data structure is passed to the local communication policy, which outputs the intended recipient sets $S_t^i$, $\forall i \in N$. If the set consists only of the UAV itself, no transmission occurs. Otherwise, UAVs transmit to their intended recipients.
The receiver set $S_t^i$ that $\pi$ outputs for UAV $i$ may not be feasible under the wireless communication model. The function $f_{SINR}$ is defined as the communication model described in Equation (8) and evaluates whether transmission links can be successfully established. The local policy outputs the receivers of all UAVs. The communication model then calculates all the SINRs between UAVs and their corresponding receivers, and the output of $f_{SINR}$ is represented as $R_t^i$. The communication model is defined as:
$$p(j \in R_t^i)_{i \in N} = f_{SINR}\{S_t^i, \mathrm{pos}_t^i\}_{i \in N}.$$
For the transmission links that meet the SINR threshold, the recipients update their local data structures with the new information received and record the transmission link $(i, j)$, $\forall j \in R_t^i$.
After updating, the response phase is triggered. As shown in Figure 4, the receiver $j$ of the transmission phase has a potential recipient set $\bar{S}_t^j$. After the wireless communication model is evaluated, the recipient set $\bar{R}_t^j$ of UAV $j$ is determined. The recipients of the transmission phase then respond to the corresponding UAVs, and the data structures are updated again.
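The two-phase structure of one time slot can be sketched as below. This is a heavily simplified illustration, not the authors' implementation: each UAV names a single intended receiver, packets are exchanged uncoded as a union of packet vectors, status information is ignored, and `link_ok` is a hypothetical stand-in for $f_{SINR}$ (including any collision handling).

```python
from typing import Callable, Dict, List


def merge(own: List[int], other: List[int]) -> List[int]:
    """Element-wise OR of two packet vectors."""
    return [a | b for a, b in zip(own, other)]


def run_slot(kappa: Dict[int, List[int]], intended: Dict[int, int],
             link_ok: Callable[[int, int], bool]) -> None:
    """Run one transmission phase followed by one response phase, in place."""
    # Transmission phase: a UAV that selects itself stays silent.
    delivered = [(i, j) for i, j in intended.items()
                 if i != j and link_ok(i, j)]
    before = {i: list(v) for i, v in kappa.items()}   # state at phase start
    for i, j in delivered:
        kappa[j] = merge(kappa[j], before[i])
    # Response phase: each successful receiver answers its sender.
    before = {i: list(v) for i, v in kappa.items()}
    for i, j in delivered:
        if link_ok(j, i):
            kappa[i] = merge(kappa[i], before[j])


kappa = {1: [1, 0, 0, 1], 2: [0, 1, 1, 0], 3: [0, 0, 1, 0], 4: [0, 0, 0, 1]}
intended = {1: 1, 2: 1, 3: 3, 4: 4}       # only UAV 2 transmits, to UAV 1
run_slot(kappa, intended, link_ok=lambda i, j: True)
print(kappa)
```

With the packet vectors of Figure 2, one slot is enough for UAV 1 and UAV 2 to hold all four packets, because UAV 1 answers UAV 2 in the response phase.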

4.3. Reinforcement Learning

This paper uses the Proximal Policy Optimization (PPO) method in reinforcement learning to train the GNN [28,39,40], in which the policy function and the value function are the structures introduced in Section 4.1.3. $f_{enc}$, $f_{dec}$, and $GN$ are all three-layer MLPs with 64 hidden units, and the Rectified Linear Unit (ReLU) activation is used after the first two layers. The only difference between the policy function and the value function is the output part.
In the policy function, the output function $f_{out}$ uses the Boltzmann distribution to convert high-dimensional vectors into a low-dimensional output, then uses Gumbel-Softmax to output the action probability distribution. In the value function, the output is a scalar used to evaluate the overall value.
The state space of reinforcement learning is the local data structure known by each UAV. The action space is the UAV set because all UAVs are likely to be selected as receivers in the current time slot.

Reward Function Design

We aim to complete the mapping mission within a limited time frame. To minimize the time, the proportion of packets obtained in a single transmission slot is used as the reward. The more packets transmitted, the larger the step reward. To maximize cumulative rewards and reduce the total number of slots, a penalty is given if no packet is successfully transmitted in a single transmission slot. The maximum reward is given when all packets have been received by each UAV, to encourage full dissemination. Hence, the reward function is designed as follows:
$$\mathrm{Reward} = \begin{cases} x / X, & \text{if } x > 0 \text{ and } X_0 + \sum_{1}^{t} x < X \\ -1 / X, & \text{if } x = 0 \text{ and } X_0 + \sum_{1}^{t} x < X \\ 1, & \text{if } x = 0 \text{ and } X_0 + \sum_{1}^{t} x = X \text{ and } t < \lambda \end{cases}$$
where $x$ is the number of packets received by UAVs, $X$ is the total number of packets, and $X_0$ is the number of packets that all UAVs hold in the initial state. When $x > 0$, the UAVs earn a positive reward. $\lambda$ is a constant that denotes the number of steps in one episode, with $\lambda = 200$ in this paper.
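The piecewise reward can be written as a small function. This sketch assumes the penalty case is $-1/X$, as the description of a penalty suggests, and the argument names are introduced here. Cases that the piecewise definition does not cover (for example $x > 0$ in the slot that completes dissemination, or $t \geq \lambda$) return 0.0 purely as a placeholder.

```python
def step_reward(x: int, total: int, held: int, t: int, horizon: int = 200) -> float:
    """Step reward for one slot.

    x       -- packets received in this slot
    total   -- X, the total number of packets
    held    -- X_0 plus all packets received up to and including slot t
    t       -- current slot index
    horizon -- lambda, the episode length
    """
    if held < total:
        if x > 0:
            return x / total          # progress: share of packets gained
        return -1.0 / total           # assumed penalty for an idle slot
    if x == 0 and held == total and t < horizon:
        return 1.0                    # everything delivered within the episode
    return 0.0                        # placeholder for cases the text leaves open


print(step_reward(2, 10, 5, 3), step_reward(0, 10, 5, 3), step_reward(0, 10, 10, 3))
```

Because idle slots are penalized and progress is rewarded in proportion to the packets gained, a policy that maximizes the cumulative reward is pushed toward finishing in fewer slots.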

5. Performance Evaluation

To verify the performance of the proposed method, we design simulations in mobile scenarios. $\max_{i, j \in N,\ i \neq j} T_t^{i,j}$ represents the number of time slots required for each UAV to obtain all packets. All statistical simulation results are averaged over 50 independent runs. To compare the convergence speed of reinforcement learning and the number of time slots required under different parameters, we first adjust the settings of several parameters. Afterward, we conduct experiments under different UAV speeds, different UAV scales, and different data amounts; in most of these settings, our method requires fewer time slots than the baseline algorithms.

5.1. Simulation Setup

Before the simulation experiments, we fix the values of several parameters. We use a fixed UAV density to ensure a reasonable distance between UAVs under different swarm sizes. The UAV density is 40 UAVs per km² in this paper. For example, when the total number of UAVs is 20, the UAV activity area is 0.5 km². The maximum sensing and communication range for UAVs is 0.25 km. At time slot $t$, we assume the velocity of UAV $i$ is $\mathrm{vel}_t^i = 3$ m/s, and the maximum acceleration is set to 20 m/s² for all experiments. We assume the communication graph is connected (i.e., has positive algebraic connectivity) and each packet is held by at least one UAV. In Equation (7), we assume that the path-loss exponent is $\alpha = 2$, the additive white Gaussian noise is $\eta = 50$ dBm, and the SINR threshold is $\gamma = 1$ dBm. The GNN is trained using PPO with $2 \times 10^6$ observations. The Adam optimizer is used with step size $1 \times 10^{-4}$, decayed by a factor of 0.95 every 200 steps, and a batch size of 64. Unless noted otherwise, we use a receptive field of 4 across all the experiments below.
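Two of these settings are simple arithmetic and can be made explicit. The sketch below assumes the step size reads as $1 \times 10^{-4}$ and that the decay is applied in discrete stages every 200 steps; the function and constant names are introduced here for illustration.

```python
UAV_DENSITY_PER_KM2 = 40.0        # fixed density from the setup


def area_km2(num_uavs: int) -> float:
    """Activity area implied by the fixed UAV density."""
    return num_uavs / UAV_DENSITY_PER_KM2


def step_size(step: int, base: float = 1e-4, decay: float = 0.95,
              every: int = 200) -> float:
    """Staircase decay: multiply the step size by `decay` once per `every` steps."""
    return base * decay ** (step // every)


print(area_km2(20), area_km2(80))
print(step_size(0), step_size(200), step_size(1000))
```

For 20 UAVs this reproduces the 0.5 km² activity area quoted above, and the step size shrinks by 5% after each block of 200 steps.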

5.2. Baselines

Three commonly used communication protocols are chosen to be our baselines: Random Flooding, Round Robin, and Minimum Spanning Tree (MST).
Random Flooding with a certain probability [41] is widely used in wireless communications. To balance the network load, Round Robin is also used to handle distributed network data transmission [42]. In this work, a central UAV is selected as the base station, and its neighbors exchange information with it in each time slot. The MST baseline exploits the fact that an MST minimizes the total edge length required to connect all UAVs in the network. It requires that the global network topology be known; the minimum spanning tree is then calculated so that interconnected UAVs can communicate with each other.
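The tree used by the MST baseline can be computed, for instance, with Prim's algorithm. The sketch below is one possible way to do this and is not taken from the paper: it assumes globally known 2D positions, Euclidean edge weights, and a complete graph that ignores the communication range; the coordinates are made up.

```python
import heapq
import math
from typing import Dict, List, Tuple


def minimum_spanning_tree(pos: Dict[int, Tuple[float, float]]) -> List[Tuple[int, int]]:
    """Prim's algorithm on the complete graph over a non-empty set of UAVs."""
    ids = list(pos)
    start = ids[0]
    in_tree = {start}
    # Candidate edges (length, from, to) leaving the current tree.
    heap = [(math.dist(pos[start], pos[v]), start, v) for v in ids[1:]]
    heapq.heapify(heap)
    edges = []
    while heap and len(in_tree) < len(ids):
        _, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue                      # stale entry, v was added meanwhile
        in_tree.add(v)
        edges.append((u, v))
        for w in ids:
            if w not in in_tree:
                heapq.heappush(heap, (math.dist(pos[v], pos[w]), v, w))
    return edges


pos = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (1.0, 1.0), 4: (5.0, 5.0)}
tree = minimum_spanning_tree(pos)
print(tree)
```

The returned list contains one edge fewer than the number of UAVs, and communication in the baseline would then be restricted to these links.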

5.3. Simulation Results

To evaluate the convergence performance of our proposed cooperative data dissemination method in a mobile scenario with 20 UAVs disseminating 10 packets, Figure 5 shows the cumulative rewards with increasing training iterations under different receptive fields. The training curves are drawn to detail the statistical results of 10,200 episodes. During an episode, all UAVs run the algorithm independently and decide on receivers. This figure shows that the algorithm trains best as the receptive field increases to 4, where the cumulative reward is maximized and reinforcement learning converges fastest. From the perspective of convergence speed, the larger the receptive field is, the fewer episodes are required for convergence. This result is because as the receptive field increases, each UAV can aggregate more neighbors’ states and network topology information.
We next verify that our method requires fewer time slots than baselines. A boxplot illustrates the detail of the time slots needed under different GNN’s receptive fields in Figure 6a. The bars of the boxplot show the lowest, first quartiles, median, third quartiles, and highest values from bottom to top. This figure shows that, as the GNN receptive field grows below 4, the required number of time slots decreases. We conduct multiple experiments for each receptive field and the distribution of results is more centralized when the receptive field is 4. This is consistent with the results of the training curve. Figure 6b depicts the average time slots required with different receptive fields. Our proposed method can achieve 15% fewer time slots on average compared with the round-robin algorithm when the receptive field is 4.
Next, we investigate the effect of the transmission distance of UAVs. In Figure 7a, we first depict the performance of the proposed method by evaluating the time slots required at different transmission distances. The boxplot shows the distribution of the time slots required over multiple experiments with varying transmission distances. The required time slots decrease as the transmission distance increases. When the transmission distance is greater than 0.35 km, the time slots required across multiple simulations are more concentrated. When the transmission distance is 1 km, fewer than 10 time slots are needed to complete data dissemination. For comparison, we also show the required time slots of three baselines. From Figure 7b, we can see that when the transmission distance is greater than 0.25 km, our method outperforms the comparison algorithms. However, UAVs often operate in harsh environments, such as storms, lightning, and mountains, which greatly affect the efficiency of wireless communication and restrict the transmission distance. Therefore, the transmission distance is set to 0.25 km in this paper.
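One way to see why a longer transmission distance reduces the required time slots is to look at how many direct neighbors each UAV gains. The sketch below uses a simplified disk model, in which a link exists whenever two UAVs are within range. This is an assumption made for illustration only: it ignores the SINR threshold and interference, and the grid layout and function names are hypothetical.

```python
import math

def neighbor_lists(positions, tx_range_km):
    """Map each UAV index to the indices of UAVs within tx_range_km.

    Assumption: simple disk model, no interference or SINR check.
    """
    n = len(positions)
    return {
        i: [j for j in range(n)
            if j != i and math.dist(positions[i], positions[j]) <= tx_range_km]
        for i in range(n)
    }

def mean_degree(positions, tx_range_km):
    """Average number of direct neighbors per UAV."""
    nbrs = neighbor_lists(positions, tx_range_km)
    return sum(len(v) for v in nbrs.values()) / len(nbrs)

# Hypothetical positions (km): a 3 x 3 grid with 0.2 km spacing
grid = [(0.2 * x, 0.2 * y) for x in range(3) for y in range(3)]
degree_short = mean_degree(grid, 0.25)  # only grid-adjacent UAVs are in range
degree_long = mean_degree(grid, 1.0)    # every UAV reaches every other UAV
```

With the longer range every UAV can reach all eight others directly, so fewer relays, and hence fewer time slots, are needed to spread the packets.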
Under the mobile scenario, we analyze the average time slots under different velocities. We consider a larger UAV scale: the number of UAVs is set to 40 and the number of packets to five. As can be seen from Figure 8, our method adapts well to the mobile scenario and consumes fewer time slots than the baselines, with lower fluctuation. This is because the GNN is permutation invariant: the order of nodes and edges does not affect the result.
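The permutation invariance mentioned above can be illustrated with a minimal sketch of an edge-to-node aggregation step. It assumes an element-wise sum as the aggregation function; the aggregator actually used in the paper is not specified in this section, and the feature values are hypothetical.

```python
import random

def aggregate_edges(edge_features):
    """Element-wise sum of the feature vectors of a node's incoming edges."""
    return [sum(column) for column in zip(*edge_features)]

# Hypothetical integer features of three edges arriving at one UAV
incoming = [[1, 0, 2], [0, 3, 1], [4, 1, 0]]

# Reorder the edges, as could happen when neighbors change position
shuffled = incoming[:]
random.Random(0).shuffle(shuffled)

original_result = aggregate_edges(incoming)
shuffled_result = aggregate_edges(shuffled)
order_independent = original_result == shuffled_result
```

Because the sum does not depend on the order of its operands, the aggregated feature is the same however the neighbors are indexed, which is the property that lets the policy tolerate topology changes caused by mobility.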
Next, we evaluate the generalization of our proposed method with different numbers of UAVs and packets. First, we train a model on 20 UAVs and test it with the number of UAVs varying from 10 to 80. The number of packets is set to five. From Figure 9a, we can see that our method requires fewer time slots and performs better than all baseline algorithms when the number of UAVs is larger than 20. When the number of UAVs is more than 40, the GN Block improves greatly on Round Robin. This indicates that our method will perform well when it is extended to larger numbers of UAVs. Then, we evaluate the required time slots under different numbers of packets, with the number of UAVs set to 20. From Figure 9b, we can see that our method outperforms all the baseline algorithms. These figures demonstrate the effectiveness of our algorithm in mobile scenarios.
We can observe from Figure 10 that, when the total number of data packets to be transmitted is five, our method is more effective than the reinforcement learning method across repeated trials at different UAV scales. The experimental results demonstrate that the GNN helps reduce the total data dissemination time.

6. Conclusions

In this paper, we propose a cooperative data dissemination method for the mapping task in search and rescue scenarios, together with a decision policy based on GNN. The policy determines which UAVs will communicate with each other. A wireless communication protocol is designed to constrain data forwarding. The policy is trained by reinforcement learning with a reward function designed according to the completion progress of data dissemination. Simulations show that our method outperforms all the baseline algorithms in mobile scenarios and generalizes well to different numbers of UAVs and packets. The proposed method can achieve rapid data dissemination in various distributed networks, including multi-smart vehicle space exploration, mobile user live broadcast data transmission, and scenarios such as IoV security and formation. In future work, strategies for increasing the depth of the GNN could be adopted to further improve performance.

Author Contributions

Conceptualization, N.X. and Y.Z. (Ye Zhang); methodology, N.X. and Y.Z. (Ye Zhang); software, Y.Z. (Ye Zhang); validation, Y.Z. (Ye Zhang); formal analysis, N.X. and Y.Z. (Ye Zhang); investigation, Y.Z. (Ye Zhang); resources, N.X. and Y.W.; data curation, Y.Z. (Ye Zhang); writing—original draft preparation, Y.Z. (Ye Zhang); writing—review and editing, N.X.; visualization, Y.Z. (Ye Zhang); supervision, N.X., Y.Z. (Ye Zhang), Y.W. and Y.Z. (Yang Zhou); project administration, N.X.; funding acquisition, N.X. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the R&D Program of Beijing Municipal Education Commission (KM202310009001), in part by the Scientific Research Foundation of North China University of Technology, and in part by the National Key Research and Development Program of China under Grant 2020YFC0811004.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
Symbol | Definition
$N$ | the UAV set
$B$ | the packet set
$|B|$ | the number of packets in packet set $B$
$b_m$ | the $m$th packet
$\kappa_t^i$ | the packet vector of UAV $i$ with element $k_t^{i,m} \in \{0, 1\}$ at time slot $t$, $b_m \in B$
$(i, j)$ | a transmission link from UAV $i \in N$ to UAV $j \in N$
$d_t^{i,j}$ | the distance between UAV $i \in N$ and $j \in N$ at time slot $t$
$\eta$ | the noise effect
$\alpha$ | the path-loss exponent
$\theta_t^i$ | the transmission power of UAV $i \in N$
$\gamma$ | the SINR threshold
$r_t^{i,j}$ | the SINR value of the transmission link $(i, j)$ at time slot $t$
$H(i)$ | the packet set UAV $i \in N$ has
$W(i)$ | the packet set UAV $i \in N$ wants
$\mathrm{pos}_t^i$ | the position of UAV $i \in N$ at time slot $t$
$\mathrm{vel}_t^i$ | the velocity of UAV $i \in N$ at time slot $t$
$M_t^{i,j}$ | the state of UAV $j \in N$ known by UAV $i \in N$ at time slot $t$
$T_t^{i,j}$ | the time slot for the observation of the state of UAV $j$
$P_t^{i,j}$ | the first relay on the way from UAV $j \in N$ to UAV $i \in N$ at time slot $t$
$L_t^{i,j}$ | the time slot when $(i, j)$ occurs at time slot $t$
$K_t^{i,j}$ | the packet vector of UAV $j \in N$ observed by UAV $i \in N$ at time slot $t$
$G$ | graph
$V$ | node
$E$ | edge
$\pi$ | the policy function
$v_n$ | the $n$th node attributes
$e_l$ | the $l$th edge attributes
$u$ | the global attributes
$\phi^v$ | the update function of node attributes
$\phi^e$ | the update function of edge attributes
$\rho^{e \to v}$ | the aggregation function which aggregates the edge features to the node
$R_t^i$ | the receiver set of UAV $i \in N$ at time slot $t$
$S_t^i$ | the set of the receiver set decided by the policy for UAV $i \in N$ at time slot $t$
$X_0$ | the number of packets that all UAVs have in the initial state
$X$ | the total number of packets that all UAVs should have
$x$ | the number of packets received by UAVs in a single slot

References

  1. Liu, Y.; Shi, P.; Lim, C.C. Collision-Free Formation Control for Multi-Agent Systems with Dynamic Mapping. IEEE Trans. Circuits Syst. II Exp. Briefs 2020, 67, 1984–1988.
  2. Shames, I.; Fidan, B.; Anderson, B.D. Close Target Reconnaissance with Guaranteed Collision Avoidance. Int. J. Robust Nonlin. Control 2011, 21, 1823–1840.
  3. Xing, N.; Zong, Q.; Dou, L.; Tian, B.; Wang, Q. A Game Theoretic Approach for Mobility Prediction Clustering in Unmanned Aerial Vehicle Networks. IEEE Trans. Veh. Technol. 2019, 68, 9963–9973.
  4. Briñón-Arranz, L.; Renzaglia, A.; Schenato, L. Multi-Robot Symmetric Formations for Gradient and Hessian Estimation with Application to Source Seeking. IEEE Trans. Robot. 2019, 35, 782–789.
  5. Huang, F.; Li, G.; Tian, S.; Chen, J.; Fan, G.; Chang, J. Safe navigation for UAV-enabled data dissemination by deep reinforcement learning in unknown environments. China Commun. 2022, 19, 202–217.
  6. Xiao, K.; Feng, K.; Dong, A.; Mei, Z. Efficient Data Dissemination Strategy for UAV in UAV-Assisted VANETs. IEEE Access 2023, 11, 40809–40819.
  7. Lu, R.; Zhang, R.; Cheng, X.; Yang, L. UAV-Assisted Data Dissemination with Proactive Caching and File Sharing in V2X Networks. In Proceedings of the 2019 IEEE Global Communications Conference (GLOBECOM), Big Island, HI, USA, 9–13 December 2019; pp. 1–6.
  8. Zhang, R.; Zeng, F.; Cheng, X.; Yang, L. UAV-Aided Data Dissemination Protocol with Dynamic Trajectory Scheduling in VANETs. In Proceedings of the ICC 2019—2019 IEEE International Conference on Communications (ICC), Shanghai, China, 21–23 May 2019; pp. 1–6.
  9. Yang, H.; Ruby, R.; Pham, Q.V.; Wu, K. Aiding a Disaster Spot via Multi-UAV-Based IoT Networks: Energy and Mission Completion Time-Aware Trajectory Optimization. IEEE Internet Things J. 2022, 9, 5853–5867.
  10. Perera, T.D.P.; Panic, S.; Jayakody, D.N.K.; Muthuchidambaranathan, P. UAV-assisted Data Collection in Wireless Powered Sensor Networks over Multiple Fading Channels. In Proceedings of the IEEE INFOCOM 2020—IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Toronto, ON, Canada, 6–9 July 2020; pp. 647–652.
  11. Mukhopadhyay, A.; Ganguly, D. FANET based Emergency Healthcare Data Dissemination. In Proceedings of the 2020 2nd International Conference on Inventive Research in Computing Applications (ICIRCA), Coimbatore, India, 15–17 July 2020; pp. 170–175.
  12. Yang, L.; Zhang, L.; He, Z.; Cao, J.; Wu, W. Efficient Hybrid Data Dissemination for Edge-Assisted Automated Driving. IEEE Internet Things J. 2019, 7, 148–159.
  13. Li, K.; Ni, W.; Tovar, E.; Guizani, M. Optimal Rate-Adaptive Data Dissemination in Vehicular Platoons. IEEE Trans. Intell. Transport. Syst. 2019, 21, 4241–4251.
  14. Zhou, J.; Tian, D.; Sheng, Z.; Duan, X.; Shen, X. Joint Mobility, Communication and Computation Optimization for UAVs in Air-Ground Cooperative Networks. IEEE Trans. Veh. Technol. 2021, 70, 2493–2507.
  15. Liu, Y.; Zhou, J.; Tian, D.; Sheng, Z.; Duan, X.; Qu, G.; Leung, V.C.M. Joint Communication and Computation Resource Scheduling of a UAV-Assisted Mobile Edge Computing System for Platooning Vehicles. IEEE Trans. Intell. Transp. Syst. 2022, 23, 8435–8450.
  16. Shah, S.S.; Malik, A.W.; Rahman, A.U.; Iqbal, S.; Khan, S.U. Time Barrier-Based Emergency Message Dissemination in Vehicular Ad-hoc Networks. IEEE Access 2019, 7, 16494–16503.
  17. Zhang, R.; Lu, R.; Cheng, X.; Wang, N.; Yang, L. A UAV-Enabled Data Dissemination Protocol with Proactive Caching and File Sharing in V2X Networks. IEEE Trans. Commun. 2021, 69, 3930–3942.
  18. Zhan, C.; Wen, Z.; Wang, X.; Zhu, L. Device-to-Device Assisted Wireless Video Delivery with Network Coding. Ad Hoc Netw. 2018, 69, 76–85.
  19. Shao, H.; Pan, L.; Mesbahi, M.; Xi, Y.; Li, D. Distributed Neighbor Selection in Multi-Agent Networks. IEEE Trans. Automat. 2023, 68, 6711–6726.
  20. Luo, G.; Liu, Z.; Li, J.; Yang, F. Understanding Cooperative Data Exchange Problem in Multi-Hop Wireless Network. IEEE Wirel. Commun. Lett. 2020, 9, 2054–2058.
  21. Courtade, T.A.; Wesel, R.D. Coded Cooperative Data Exchange in Multihop Networks. IEEE Trans. Inf. Theory 2013, 60, 1136–1158.
  22. Gonen, M.; Langberg, M. Coded Cooperative Data Exchange Problem for General Topologies. IEEE Trans. Inf. Theory 2015, 61, 5656–5669.
  23. Luo, G.; Wang, X.; Li, J.; Yang, F. Coded Cooperative Data Exchange in Multichannel Multihop Wireless Networks. IEEE Internet Things J. 2020, 7, 3013–3025.
  24. Battaglia, P.W.; Hamrick, J.B.; Bapst, V.; Sanchez-Gonzalez, A.; Zambaldi, V.; Malinowski, M.; Tacchetti, A.; Raposo, D.; Santoro, A.; Faulkner, R.; et al. Relational Inductive Biases, Deep Learning, and Graph Networks. arXiv 2018, arXiv:1806.01261.
  25. Xu, X.; Gao, S.; Tao, M. Distributed Online Caching for High-Definition Maps in Autonomous Driving Systems. IEEE Wirel. Commun. Lett. 2021, 10, 1390–1394.
  26. Wang, J.; Liu, K.; Xiao, K.; Chen, C.; Wu, W.; Lee, V.C.; Son, S.H. Dynamic Clustering and Cooperative Scheduling for Vehicle-to-Vehicle Communication in Bidirectional Road Scenarios. IEEE Trans. Intell. Transp. Syst. 2017, 19, 1913–1924.
  27. Capone, A.; Chen, L.; Gualandi, S.; Yuan, D. A New Computational Approach for Maximum Link Activation in Wireless Networks under the SINR Model. IEEE Trans. Wirel. Commun. 2011, 10, 1368–1372.
  28. Tolstaya, E.; Butler, L.; Mox, D.; Paulos, J.; Kumar, V.; Ribeiro, A. Learning Connectivity for Data Distribution in Robot Teams. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Prague, Czech Republic, 27 September–1 October 2021; pp. 413–420.
  29. Fu, A.; Sadeghi, P.; Médard, M. Dynamic Rate Adaptation for Improved Throughput and Delay in Wireless Network Coded Broadcast. IEEE/ACM Trans. Netw. 2013, 22, 1715–1728.
  30. Wang, P.; Mao, G.; Lin, Z.; Ge, X.; Anderson, B.D. Network Coding Based Wireless Broadcast with Performance Guarantee. IEEE Trans. Wirel. Commun. 2014, 14, 532–544.
  31. Thomos, N.; Kurdoglu, E.; Frossard, P.; Van der Schaar, M. Adaptive Prioritized Random Linear Coding and Scheduling for Layered Data Delivery from Multiple Servers. IEEE Trans. Multimed. 2015, 17, 893–906.
  32. Vukobratović, D.; Khirallah, C.; Stanković, V.; Thompson, J.S. Random Network Coding for Multimedia Delivery Services in LTE/LTE-Advanced. IEEE Trans. Multimedia 2013, 16, 277–282.
  33. Le, A.; Tehrani, A.S.; Dimakis, A.G.; Markopoulou, A. Instantly Decodable Network Codes for Real-Time Applications. In Proceedings of the International Symposium on Network Coding (NetCod), Calgary, AB, Canada, 7–9 June 2013; pp. 1–6.
  34. Gama, F.; Marques, A.G.; Leus, G.; Ribeiro, A. Convolutional Neural Network Architectures for Signals Supported on Graphs. IEEE Trans. Signal Process. 2018, 67, 1034–1049.
  35. Kipf, T.N.; Welling, M. Semi-Supervised Classification with Graph Convolutional Networks. arXiv 2016, arXiv:1609.02907.
  36. Tolstaya, E.; Gama, F.; Paulos, J.; Pappas, G.; Kumar, V.; Ribeiro, A. Learning Decentralized Controllers for Robot Swarms with Graph Neural Networks. In Proceedings of the 2020 Conference on Robot Learning, Virtual, 16–18 November 2020; pp. 671–682.
  37. Tolstaya, E.; Paulos, J.; Kumar, V.; Ribeiro, A. Multi-Robot Coverage and Exploration using Spatial Graph Neural Networks. In Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems, Prague, Czech Republic, 27 September–1 October 2021; pp. 8944–8950.
  38. Zhuang, J.; Dong, Y.; Bai, H.; Zuo, P.; Cheng, J. Auto-Selecting Receptive Field Network for Visual Tracking. IEEE Access 2019, 7, 157449–157458.
  39. Schulman, J.; Wolski, F.; Dhariwal, P.; Radford, A.; Klimov, O. Proximal Policy Optimization Algorithms. arXiv 2017, arXiv:1707.06347.
  40. Raffin, A.; Hill, A.; Gleave, A.; Kanervisto, A.; Ernestus, M.; Dormann, N. Stable-Baselines3: Reliable Reinforcement Learning Implementations. J. Mach. Learn. Res. 2021, 22, 1–8.
  41. Ni, S.Y.; Tseng, Y.C.; Chen, Y.S.; Sheu, J.P. The broadcast storm problem in a mobile ad hoc network. Wirel. Netw. 2002, 8, 153–167.
  42. Miao, G.; Zander, J.; Sung, K.W.; Ben Slimane, S. Fundamentals of Mobile Data Networks; Cambridge University Press: Cambridge, UK, 2016.
Figure 1. UAVs collect and disseminate data in disaster areas without communication infrastructure.
Figure 2. An example of UAV cooperative data dissemination with 4 UAVs disseminating 4 packets.
Figure 3. The Encoder–Process–Decoder architecture.
Figure 4. In the transmission phase, the communication link $(i, j)$, $j \in S_t^i$, is established. When the response phase is triggered, UAV $j$ responds to UAV $i \in \overline{S_t^j}$.
Figure 5. The convergence curves under different receptive fields.
Figure 6. (a) The boxplot of the total time slots required by GN Block under different receptive fields; (b) The required time slots in the mobile scenario under different GNN receptive fields.
Figure 7. (a) The boxplot of the total time slots required by GN Block under different transmission distances; (b) The required time slots under different transmission distances.
Figure 8. The required slots in the mobile scenario under different velocities.
Figure 9. (a) The required slots vs. the number of UAVs; (b) the required slots vs. the number of packets.
Figure 10. Comparison of the number of time slots required by the GN block and by reinforcement learning under different UAV scales.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
