This section provides a comprehensive examination of recent research findings. It is divided into two parts: we first review the relevant papers and then discuss them. The discussion covers the navigation methods presented and reviews the datasets used for the autonomous navigation of 5G drones.
2.1. Review of Papers
Previous studies have introduced potential solutions employing machine learning, genetic algorithms, and traditional search methods. The resulting navigation strategies can be categorized into three distinct types: mapless, online map, and offline map approaches. An offline map indicates that the drone has complete information about the environment, including the start and end points and obstacle locations, prior to autonomous navigation [
27]. Online map building, also known as simultaneous localization and mapping (SLAM), is the process of creating a map during navigation without any prior information. Mapless strategies do not need any prior knowledge of the environment; instead, they rely on the ability to observe and navigate the surroundings without a map [
27]. Numerous studies have been conducted on the application of RL in this field. In [
28], the authors introduced a reinforcement learning framework that enabled a drone to accomplish multiple subgoals in the path-planning context. Experiments took place in a two-dimensional grid setting without using any sensors, and the drone’s objective was simply to reach the target point. Lee et al. [
29] proposed enhancing the path optimization of surveillance drones during flight using a reinforcement learning technique, without specifying an obstacle avoidance mechanism. The proposed method was tested only in a two-dimensional grid environment. Cui et al. [
30] developed a dual-layered Q-learning methodology to address the issue of planning paths for drones. They also employed the B-spline technique for path smoothing, which improved the planned path’s performance (a minimal sketch of this smoothing step is given after this paragraph). They used a 30 × 30 cell grid arrangement in MATLAB to validate the efficacy of the suggested methodology. Arshad et al. [
25] used the Udacity and collision sequence datasets to train CNNs, which provided two outputs: the forward velocity and steering angle of UAVs. Such UAVs utilize a front camera, and the proposed technique teaches them to fly by mimicking the movements of automobiles and bikers. The performance was measured using the collision probability and steering angle, with an accuracy of 96.26%. Darwish and Arie [
31] used a hybrid approach that combines a CNN and an LSTM to track and move toward a dynamic target and to predict Q and V values. They used a sequence of four grayscale depth and RGB images for tracking and navigation. They utilized AirSim and Unreal Engine for the simulation and OpenAI Gym for training and control. They encountered challenges during this simulation because the background could conceal the small target drone. The proposal by Artizzu et al. [
32] involves using an omnidirectional camera to perceive the drone’s entire surroundings and an actor–critic network to control the drone, trained on three image types: depth, RGB, and segmentation. The experiments demonstrated the method’s ability to navigate in two different virtual forest environments.
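To make the B-spline smoothing step reported in [30] concrete, the following minimal Python sketch smooths a hypothetical grid-planner path with SciPy. The waypoints, smoothing factor, and sampling density are illustrative assumptions rather than values taken from the original paper.

```python
# Minimal sketch: cubic B-spline smoothing of a grid-planner path (SciPy).
# The waypoints below stand in for a path found by Q-learning on a 30 x 30 grid.
import numpy as np
from scipy.interpolate import splprep, splev

waypoints = np.array([[0, 0], [3, 1], [5, 4], [8, 5], [12, 9], [15, 10]], dtype=float)

# Fit a cubic (k=3) parametric B-spline; larger s trades fidelity for smoothness.
tck, _ = splprep([waypoints[:, 0], waypoints[:, 1]], k=3, s=2.0)

# Sample the smooth path densely for the flight controller to track.
u = np.linspace(0.0, 1.0, 100)
x_s, y_s = splev(u, tck)
smooth_path = np.column_stack([x_s, y_s])
print(smooth_path.shape)  # (100, 2)
```

The smoothed curve removes the sharp grid-aligned turns of the raw path, which is what makes it easier for a quadrotor controller to follow.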
Recently, Yu et al. [
16] presented a hybrid approach that combines a VAE with an LSTM to apply their strategy at a variable speed rather than at a fixed speed. This hybrid approach uses depth images captured from a stereo camera. Their approach is based on a latent representation of depth images, similar to the work of Kulkarni et al. [
33], who proposed using depth cameras to navigate cluttered environments and variational autoencoders for collision prediction. This method provides collision scores, which enable safe autonomous flight. They emphasized the method’s advantage in handling real sensor-data errors, as opposed to methods that rely solely on simulation training. In addition, Kulkarni et al. [
1] also relied on variational autoencoders, combining a VAE with deep reinforcement learning for aerial robots to generate velocity and yaw-rate commands by converting a depth image into a collision image. Zhang et al. [
15] proposed combining a CNN with an autoencoder; the input of this network is depth images from an RGB camera. The network outputs steering commands in the form of angular velocities to avoid obstacles. They mentioned that additional sensors could be integrated to enhance the efficiency of their method. Similarly to Yu et al. [
16], González et al. [
34] proposed a variable speed instead of a fixed speed. They presented a differential evolution algorithm for path planning that covers a specific area of an environment. This algorithm yields the steering angle of the path that incurs the lowest distance cost. The drone’s velocity increases when it is far from obstacles and decreases when it is near one.
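As an illustration of the variable-speed idea shared by [16,34], the sketch below maps the distance to the nearest detected obstacle to a commanded speed. The function `commanded_speed`, the linear interpolation, and every threshold value are assumptions chosen for illustration, not the rules used in those papers.

```python
# Minimal sketch of a distance-dependent speed rule: fly faster far from
# obstacles and slow down near them. All thresholds are illustrative.
def commanded_speed(nearest_obstacle_dist: float,
                    v_min: float = 0.5,   # m/s, assumed minimum speed near obstacles
                    v_max: float = 5.0,   # m/s, assumed maximum cruise speed
                    d_safe: float = 1.0,  # m, below this distance fly at v_min
                    d_free: float = 8.0   # m, above this distance fly at v_max
                    ) -> float:
    if nearest_obstacle_dist <= d_safe:
        return v_min
    if nearest_obstacle_dist >= d_free:
        return v_max
    # Linear interpolation between the cautious and free-flight regimes
    ratio = (nearest_obstacle_dist - d_safe) / (d_free - d_safe)
    return v_min + ratio * (v_max - v_min)

print(commanded_speed(0.7), commanded_speed(4.0), commanded_speed(12.0))
```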
Drones in swarms have become a topic of interest to researchers, including Wang et al. [
18], who proposed implementing reinforcement learning to prevent collisions between multiple agents in a decentralized environment where communication is absent. Each drone’s observation includes the positions and velocities of its neighbors, and steering commands are the only actions. However, the experiment did not specify the exact sensors used. Mikkelsen et al. [
35] introduced a decentralized method, positing that robots communicate wirelessly to share data within a certain distance. This study aimed to develop a distributed algorithm that determines the velocities of individual robots within a swarm while maintaining formation, communication, and collision avoidance. The experiment was conducted on a two-dimensional grid. Ma et al. [
14] presented a framework for controlling the formation of a UAV swarm using vision-based techniques. The authors proposed a hierarchical swarm with a centralized framework. The algorithm architecture consists of four main modules: detection, tracking, localization, and formation control. Deep learning enables the system to determine its position without relying on the Global Navigation Satellite System (GNSS). The lead UAV takes an image as input and outputs a velocity command, and a broadcaster distributes the locations to the remaining UAVs. Schilling et al. [
36] aimed to establish a safe path for drones during their Internet of Things (IoT) operations. They proposed that vision-based drone swarms rely on Voronoi diagrams (VDs), where each drone detects nearby drones. In the diagram, every point represents a drone treated as a dynamic obstacle, and the plane is partitioned into constrained cells that form the VD. Because all edges of a VD-derived path remain far from obstacles, the resulting paths are safe and reliable. Each drone is equipped with an omnidirectional camera.
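The following sketch illustrates the Voronoi-diagram idea behind [36]: each detected drone is a point treated as a dynamic obstacle, and the ego drone plans within its own Voronoi cell, whose edges stay equidistant from the neighbors. The positions are illustrative, and the sketch only extracts the cell; the full method in [36] additionally handles detection and control.

```python
# Minimal sketch: extract the ego drone's Voronoi cell from detected neighbors.
import numpy as np
from scipy.spatial import Voronoi

# Hypothetical 2D positions (meters): the ego drone first, then its neighbors.
positions = np.array([[0.0, 0.0],
                      [4.0, 1.0],
                      [1.0, 5.0],
                      [-3.0, 2.0],
                      [-1.0, -4.0]])

vor = Voronoi(positions)

# Vertices of the ego drone's cell; index -1 marks an unbounded edge.
ego_region = vor.regions[vor.point_region[0]]
cell_vertices = [vor.vertices[i] for i in ego_region if i != -1]

# Paths kept well inside this cell remain far from all neighbors, which is
# what makes VD-derived paths safe.
print(cell_vertices)
```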
Several methods are based on point cloud construction to represent obstacles in the field of view. Chen et al. [
37] proposed using a quadrotor with a depth camera to capture a point cloud of obstacles within its field of view. They then used this point cloud to construct the map. They pointed out that the quadrotor’s limited field of view can make it challenging to achieve completely safe flights in dynamic environments, potentially leaving some dynamic obstacles undetected. Xu et al. [
38] proposed using an RGB camera to generate depth images, on which a model is trained to detect obstacles and convert them into point clouds. The point clouds are then classified as dynamic or static and tracked. The authors asserted that the main limitation of their algorithm’s performance is the sensor’s field of view, and that future enhancements could be achieved by using sensor fusion in a multiple-camera system.
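As a concrete illustration of the point-cloud construction that methods such as [37,38] start from, the sketch below back-projects a depth image into a 3D point cloud with a standard pinhole camera model. The camera intrinsics and the synthetic depth image are assumptions for illustration; the original works obtain depth from their own sensors and networks.

```python
# Minimal sketch: back-project a depth image into a point cloud (camera frame).
import numpy as np

fx, fy = 320.0, 320.0      # assumed focal lengths (pixels)
cx, cy = 320.0, 240.0      # assumed principal point (pixels)

depth = np.full((480, 640), 5.0, dtype=np.float32)  # synthetic 5 m flat depth image

v, u = np.indices(depth.shape)   # pixel row (v) and column (u) coordinates
z = depth
x = (u - cx) * z / fx
y = (v - cy) * z / fy
points = np.stack([x, y, z], axis=-1).reshape(-1, 3)  # N x 3 point cloud

# Drop invalid returns (zero depth) before mapping or obstacle classification.
points = points[points[:, 2] > 0]
print(points.shape)
```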
Several previous studies have fused multiple sensors. Yue et al. [
39] proposed using a camera and a laser to sense the target distance while deep reinforcement learning autonomously navigates a drone in an unknown environment. These sensors only make the drone aware of the environment in front of it. The authors noted that, due to its limited field of view, a UAV cannot perceive the direction of motion of dynamic obstacles throughout the global range. Doukhi et al. [
40] utilized the deep Q-network algorithm to learn a collision-free policy that selects the best of three movement commands: right, left, and forward (a minimal sketch of this discrete-action setup is given after this paragraph). They used LiDAR data and a depth camera as input for the algorithm. However, this method does not address navigation problems in three-dimensional space. Jevtic et al. [
41] presented a reinforcement learning approach for flying mobile robots to avoid collisions. LiDAR data were used to identify obstacles in the mobile robot’s surroundings. They defined three actions: forward movement, diagonal left movement, and diagonal right movement. Experiments were conducted in a two-dimensional environment using MATLAB Simulink R2022a. Xie et al. [
12] introduced a deep reinforcement learning algorithm that relies on a camera and a distance sensor to capture environmental information from in front of a drone and plan its path in a complex and dynamic three-dimensional environment. CNNs derive features from the images, while recurrent neural networks (RNNs) preserve the trajectory history. They performed 3D simulations on the virtual robot experimentation platform (V-REP); the experiments did not include dynamic objects. Chronis et al. [
13] used a reinforcement learning approach to enable autonomous drone navigation using four distance sensors: one on the drone’s belly and three on its front. These sensors only provide the drone with awareness of the environment in front of it. Based on this awareness, the drone moves by rotating about its Z-axis in both clockwise and counterclockwise directions, adjusting its speed and altitude accordingly. Experiments were conducted in static environments using Microsoft’s AirSim. They mentioned that cameras would enable them to leverage additional information from the environment beyond what these sensors provide.
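The sketch below illustrates the kind of discrete-action deep Q-network setup described for [40], where a network scores three movement commands (left, forward, right) from a fused sensor observation and an epsilon-greedy rule selects the action. The network architecture, observation dimensionality, and epsilon value are assumptions for illustration and are not taken from the original paper; training is omitted.

```python
# Minimal sketch: Q-network over three movement commands with epsilon-greedy selection.
import random
import torch
import torch.nn as nn

ACTIONS = ["left", "forward", "right"]

class QNetwork(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)  # one Q-value per action

def select_action(q_net: QNetwork, obs: torch.Tensor, epsilon: float = 0.1) -> str:
    """Epsilon-greedy selection over the three movement commands."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)                # explore
    with torch.no_grad():
        q_values = q_net(obs)
    return ACTIONS[int(q_values.argmax().item())]    # exploit

# Example: a flattened observation combining LiDAR ranges and depth-image features.
obs_dim = 64
q_net = QNetwork(obs_dim)
observation = torch.rand(obs_dim)
print(select_action(q_net, observation))
```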
Several papers have pointed out the limited field of view of currently proposed methods for path planning and obstacle avoidance. In [
12], the authors mentioned that different sensors provide different types of information; therefore, using multiple sensors in reinforcement learning to make more reasonable decisions is an important research direction [
12] (a minimal sketch of such a fused observation is given at the end of this paragraph). In future work, the authors of [
13] intend to repeat the same series of experiments but integrate alternative sensors, such as cameras, to take advantage of additional data from the environment. The experiments in [
38] showed that the sensor’s field of view was the main bottleneck affecting their algorithm’s performance. Consequently, they proposed that sensor fusion in multiple-camera systems could be used to improve future performance. In [
37], the authors mentioned that a significant factor affecting robustness was the unsatisfactory perception performance caused by a limited field of view in an environment containing numerous dynamic obstacles. Hence, they proposed investigating perception-aware planning in future work to better predict the status of dynamic obstacles and enhance the robustness of the system. The experiments in [
39] showed that a UAV cannot perceive the motion direction of dynamic obstacles across the entire global range due to its limited field of view. In future research, the authors plan to better integrate multi-sensory data to emulate the autonomous perceptual capabilities of animal systems. Examining the robustness of the proposed algorithm for exiting U-shaped obstacles poses an additional challenge that warrants further study [
20]. Researchers need to find even better ways to integrate these sensors so that drones can better sense their surroundings and make quick decisions to prevent them from colliding with other objects [
42].
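As a minimal illustration of the multi-sensor direction advocated in [12,13,38], the sketch below concatenates readings from several distance sensors pointing in different directions (including upward) with a compact embedding of a depth image into a single observation vector for a reinforcement learning policy. The function `fuse_observation`, the sensor layout, the normalization constant, and the embedding size are all illustrative assumptions rather than a design from any of the cited works.

```python
# Minimal sketch: fusing directional range sensors with a depth-image embedding
# into one RL observation vector. All values below are synthetic.
import numpy as np

MAX_RANGE = 20.0  # m, assumed maximum sensor range used for normalization

def fuse_observation(ranges: dict, depth_embedding: np.ndarray) -> np.ndarray:
    """Concatenate normalized range readings (front/left/right/back/up) with a
    compact depth-image embedding produced, e.g., by a CNN or autoencoder."""
    order = ["front", "left", "right", "back", "up"]
    range_part = np.array([min(ranges[d], MAX_RANGE) / MAX_RANGE for d in order],
                          dtype=np.float32)
    return np.concatenate([range_part, depth_embedding.astype(np.float32)])

# Synthetic example: five range readings (meters) plus a 32-dimensional embedding.
ranges = {"front": 6.2, "left": 3.8, "right": 11.5, "back": 20.0, "up": 2.4}
depth_embedding = np.random.rand(32)
obs = fuse_observation(ranges, depth_embedding)
print(obs.shape)  # (37,)
```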
The four tables below compare autonomous navigation methods for 5G drones, highlighting the obstacles they avoid, testing environments, sensor fusion, and field of view.
Table 1 shows that the strategies in [
19,
35,
37,
38] are ineffective in dynamic environments due to their reliance on online and offline map strategies.
Table 2 shows that only [
12] presented a way to escape U-shaped obstacles by recalling previous actions. However, this solution is insufficient for complicated U-shaped obstacles. A limitation of the studies [
14,
18] is that drones can only avoid collisions with other drones in the same swarm by transmitting data via a communication link; therefore, drones without a communication link cannot avoid collisions. The drawback of most studies is that their methods do not avoid dynamic objects, as shown in
Table 2. Meanwhile, the approaches that did avoid them suffered from limited fields of view.
Table 3 illustrates the sensors utilized in each method. In [
20], the authors identified sensors that require high computational time; therefore, in
Table 3, we indicate these sensors: stereo cameras, monocular cameras, omnidirectional cameras, and LiDAR. Certain sensors, such as stereo, monocular, omnidirectional, and RGB cameras, are sensitive to light and weather conditions, unlike radar and LiDAR, which are not affected by these factors [
20,
43].
In
Table 3, there are three approaches [
14,
18,
35] that rely on communication links to avoid collisions with other drones. However, this approach may not work well because not all drones can communicate with one another. Additionally, static obstacles, such as buildings, and dynamic objects, such as enemy drones, cannot be communicated with.
Table 4 illustrates the limited fields of view of the drones in all the studies. Most studies focused only on perceiving the environment in front of the drone and ignored the other directions. We note that no method proposed a way to perceive the environment above the drone. Although [
32,
36] perceived the environment from four sides, they still suffered from a limited field of view because they did not perceive the environment above the drone. In addition, these studies used an omnidirectional camera, so they incurred a high computational time.