**1. Introduction**

Drones are actively used in a variety of fields, including recreation, commerce, security, crisis management, and mapping [1,2]. They are also used in combination with other platforms such as satellites in resource management, agriculture, and environmental protection [3–5]. However, the negligent and malicious use of these flying vehicles poses a great threat to public safety in sensitive areas such as government buildings, power plants, and refineries [6,7]. For this reason, it is important to recognize drones, both to prevent them from entering critical infrastructure and to ensure security in large venues such as stadiums [8].

In this study, the recognition of four types of drones was investigated. Conventional drone detection technologies include the use of various sensors such as radar (radio detection and ranging) [9], lidar (light detection and ranging) [10], acoustic sensors [11], and thermal sensors [12]. In these methods, the presence or absence of a drone in the scene is first checked, and then the drone type recognition process is performed [13,14].

**Citation:** Dadrass Javan, F.; Samadzadegan, F.; Gholamshahi, M.; Ashatari Mahini, F. A Modified YOLOv4 Deep Learning Network for Vision-Based UAV Recognition. *Drones* **2022**, *6*, 160. https://doi.org/10.3390/drones6070160

Academic Editors: Daobo Wang and Zain Anwar Ali

Received: 3 June 2022 Accepted: 24 June 2022 Published: 27 June 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

However, the application of these types of sensors has always been associated with problems such as higher costs and higher energy consumption [15]. In contrast, visible images do not have these problems and are widely used for object recognition and semantic segmentation due to their high resolution [16]. On the other hand, the use of visible images also introduces problems such as light changes within the imagery, the presence of occluded areas, and crowded backgrounds, which necessitates an efficient and comprehensive recognition method.

Recent advances in deep convolutional neural networks and the availability of improved hardware make it possible to use visual information to recognize objects with higher accuracy and speed [17]. Unlike conventional drone detection technologies, deep learning networks perform drone detection and recognition simultaneously: by classifying inputs into several classes, these networks determine the presence, absence, image location, and type of drone class [18]. Among neural networks, the convolutional neural network (CNN) is one of the most important representatives for image recognition and classification. In this network, the input data enters the convolutional layers. The convolution operation is then performed using the network kernels to find similarities. Finally, feature extraction is performed using the resulting feature maps [19]. There are different types of convolutional neural networks available, such as R-CNN (Region-based CNN) [20], SPPNet (Spatial Pyramid Pooling Network) [21], and Faster-RCNN [22]. In these networks, due to the application of convolutional operations, more features are extracted than in conventional object detection methods, and better speed and accuracy are achieved in recognizing objects. The extracted features are essentially descriptors of objects, and as the number of these features increases, object recognition is performed with higher accuracy. In these networks, the proposed regions are first defined using region proposal networks (RPNs) [23]. Then, convolutional filters are applied to these regions, and the extracted features are obtained as the result of the convolution operation [22]. In other deep learning methods, such as SSD (Single Shot MultiBox Detector) [24] and YOLO [25], the whole image is processed in a single pass, which results in higher accuracy and speed in object recognition compared to the basic methods [25].
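As a concrete illustration of the convolution operation described above, the following minimal NumPy sketch slides a hand-crafted edge kernel over a toy image to produce a feature map. This is an illustrative example only, not the networks' implementation; real CNNs learn their kernels during training.

```python
import numpy as np

def conv2d(image, kernel):
    # Slide the kernel over the image and sum elementwise products,
    # producing a feature map ("valid" correlation, no padding).
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge kernel responds strongly where intensity changes from
# left to right, e.g. an object silhouette against a bright sky.
image = np.array([[0, 0, 9, 9],
                  [0, 0, 9, 9],
                  [0, 0, 9, 9],
                  [0, 0, 9, 9]], dtype=float)
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)
fmap = conv2d(image, kernel)
print(fmap.shape)  # (2, 2)
```

Stacking many such learned kernels, with nonlinearities and pooling between layers, yields the hierarchical feature maps that the detectors above build on.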
The reason for the higher speed of these methods is that their architecture is simpler than that of region-based methods. The YOLO network is a CNN-based method for detecting and recognizing objects. It predicts bounding box coordinates and class probabilities for these boxes while considering the whole image. The fourth version of the YOLO network, YOLOv4, performs better than previous versions in terms of speed and accuracy [26]. However, the YOLOv4 deep learning network may not be able to overcome some challenges, such as the small size of the drone in different images [16]. In this study, this network could not recognize the drone in some of the challenging images. These challenges include confusing some drones with birds due to their small size, and the presence of drones in crowded backgrounds and hidden areas. Therefore, the YOLOv4 deep learning network was modified to better overcome the challenges of recognizing flying drones. The change in the architecture of this network is the main innovation of this article. In addition, four types of drones were recognized: multirotors, fixed-wings, helicopters, and VTOLs (Vertical Take-Off and Landing). Given the need to recognize each type of UAV in different applications, the study of this topic can be considered another innovation of this paper.
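The grid-based prediction described above can be sketched as follows: each grid cell predicts box offsets relative to its own position and a preset anchor size. The 13×13 grid, 416-pixel input, and anchor dimensions below are illustrative values in the style of YOLO, not parameters taken from the authors' network.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_box(tx, ty, tw, th, cell_x, cell_y,
               anchor_w, anchor_h, grid_size, img_size):
    # YOLO-style decoding for one cell: sigmoid keeps the centre inside
    # the cell; exp scales the anchor to the predicted box size.
    stride = img_size / grid_size
    cx = (cell_x + sigmoid(tx)) * stride   # box centre x, pixels
    cy = (cell_y + sigmoid(ty)) * stride   # box centre y, pixels
    bw = anchor_w * np.exp(tw)             # box width, pixels
    bh = anchor_h * np.exp(th)             # box height, pixels
    return cx, cy, bw, bh

# Raw predictions of 0 place the box centre in the middle of the chosen
# cell, at exactly the anchor's size.
cx, cy, bw, bh = decode_box(0, 0, 0, 0, cell_x=6, cell_y=6,
                            anchor_w=90, anchor_h=45,
                            grid_size=13, img_size=416)
print(cx, cy, bw, bh)  # 208.0 208.0 90.0 45.0
```

Class probabilities are predicted per box in the same pass, which is why the whole image is evaluated only once.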

#### *1.1. Challenges in Drone Recognition*

Drone recognition is always fraught with challenges. Some of the most important challenges are discussed below.

#### 1.1.1. Confusion of Drones and Birds

Due to their physical characteristics, drones can easily be confused with birds by the human eye. This problem is more challenging when drones are used in maritime areas, due to the presence of more birds. The similarity between drones and birds, and their distinction from each other, is shown in Figure 1.


**Figure 1.** Challenges related to confusion with birds in drone recognition.

#### 1.1.2. Crowded Background



As it appears from Figure 2, the presence of drones in areas with crowded backgrounds and similar environments has made them more difficult to recognize due to the inability to isolate the background. A crowded background refers to conditions such as the existence of clouds, dust, fog, and fire in the sky.

**Figure 2.** Challenges related to a crowded background in drone recognition.

#### 1.1.3. Small Drone Size

The small size of drones makes them difficult to see at longer distances and difficult to quickly and accurately recognize, or causes them to be mistaken for birds. Furthermore, the presence of a swarm of UAVs at different scales makes the recognition process more challenging. Figure 3 illustrates some examples of the presence of small drones at different scales.

**Figure 3.** Challenges related to the presence of small drones at different scales.

Drone recognition is always fraught with challenges. For this reason, it is necessary to use a fast, accurate, robust, and efficient method to overcome the challenges and to correctly recognize drones.

#### **2. State of the Art Work**
Recently, the use of drones has become increasingly popular, and they have been applied to various scientific and commercial purposes in different fields such as photogrammetry, surveying, agriculture, and natural disaster management [27]. There are different types of drones, each used for a specific purpose; in terms of design technology, application, and physical characteristics, they can be divided into four types: multirotor, helicopter, VTOL, and fixed-wing [28–31]. They can also be divided into two scenarios in terms of their manner of operation in the environment. In the first scenario, drones operate individually, while in the second scenario they fly in combination with others, normally known as a swarm of UAVs [32–35].

Because of the enormous potential applications of each type of UAV in meeting the needs of society, the possibility of their misuse has become a major concern for communities. Over the past decade, much research has focused on finding efficient and accurate techniques for the recognition of different types of UAVs [12,17,36]. However, drone recognition is sometimes difficult because drones normally fly in challenging environments. Therefore, the recognition of UAVs requires advanced techniques that can recognize them as they fly individually or in swarm mode.

#### **3. Related Works**

Due to the increasing development of deep neural networks in visual applications, these networks are also widely used for the recognition of objects in visible images [36–38]. In 2019, Nalamati et al. used a collection of visible images to detect small drones and solve their detection challenges. In this work, different CNN-based architectures were used, such as SSD [24], Faster-RCNN with ResNet-101 [22], and Faster-RCNN with Inception-v2 [22]. Based on the results, Faster-RCNN with ResNet-101 performed best in training and testing [39]. In 2019, Unlu et al. built an independent drone detection system using the YOLOv3 deep learning network. One of the advantages of this system is its cost-effectiveness due to its limited need for GPU memory. This system can detect drones of a small size and at a minimal distance, but it cannot recognize the types of drones [40]. In 2020, Mahdavi et al. detected a drone using a fisheye camera, applying three classification methods: convolutional neural network (CNN), support vector machine (SVM), and nearest-neighbor. The results showed that CNN, SVM, and nearest-neighbor achieved total accuracies of 95%, 88%, and 80%, respectively. Compared with the other classifiers under the same experimental conditions, the accuracy of the convolutional neural network classifier was satisfactory. In this study, only the detection of drones, without considering their types and challenges, was investigated [41]. In 2020, Behera et al. detected and classified drones in RGB images using the YOLOv3 network, achieving a mAP of 74% after 150 epochs. In this article, only drones at various distances were detected, and the issue of drone recognition and its distinction from birds was not discussed [42]. In 2020, Shi et al. proposed a detection process for low-altitude drones based on the YOLOv4 deep learning network. They then compared the YOLOv4 detection result with the YOLOv3 and SSD networks.
In this study, the YOLOv4 network performed better than the YOLOv3 and SSD networks in detecting, recognizing, and identifying three types of drones in terms of mAP and detection speed, achieving 89% mAP [43]. In 2021, Tan Wei Xun et al. detected and tracked a drone using the YOLOv3 deep learning network. In their study, the NVIDIA Jetson TX2 was used to detect drones in real time. The results show that the proposed YOLOv3 network detects drones of three sizes, small, medium, and large, with an average confidence score of 88% and confidence scores between 60% and 100% [44]. In 2021, Isaac-Medina et al. detected and tracked drones using a set of visible and thermal images and four deep learning network architectures: Faster RCNN, SSD, YOLOv3, and DETR (DEtection TRansformer). Based on the results, all the studied networks were able to detect a small drone at a far distance, but the YOLOv3 network generally achieved better accuracy (up to 0.986 mAP), while Faster RCNN performed better in detecting small drones (up to 0.77 mAP) [45]. In 2021, Singha et al. developed an automatic drone detection system using YOLOv4. They used a dataset of drones and birds to detect drones, and then evaluated the model on two types of drone videos. The results obtained in this study for detecting two types of multirotor drones are: mAP 74.36%, F1-score 0.79, recall 0.68, and precision 0.95 [46]. In 2021, Liu et al. examined four object detection methods, YOLOv3, YOLOv4, RetinaNet, and FCOS (Fully Convolutional One-stage Object Detector), on visible image data. To achieve high accuracy in drone detection, a pruned YOLOv4 model was used to build a sparser, flatter network. This approach improved the detection of small and high-speed drones.
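The detection metrics quoted in these studies are related in a checkable way: the F1-score is the harmonic mean of precision and recall, which reproduces the 0.79 reported by Singha et al. [46] from their precision of 0.95 and recall of 0.68.

```python
def f1_score(precision, recall):
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

# Values reported by Singha et al. [46]: precision 0.95, recall 0.68.
f1 = f1_score(0.95, 0.68)
print(round(f1, 2))  # 0.79
```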
The pruned YOLOv4, with a pruning rate of 0.8 and 24 layers pruned, achieved a mAP of 90.5%, an accuracy of 22.8%, a recall of 12.7%, and a processing speed of 60%. However, the challenges of crowded backgrounds, hidden areas, and surveys of multiple drone types have not yet been addressed [16]. In 2022, Samadzadegan et al. detected and recognized drones using a YOLOv4 deep network in visible images [47]. This network can recognize multirotors and helicopters directly, and it can differentiate between drones and birds with a mAP of 84%, an IoU of 81%, and an accuracy of 83%. In that paper, the challenges related to recognition were well addressed, but the method is limited to detecting and recognizing only two drone types, multirotor and helicopter [47].
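IoU (Intersection over Union), reported above alongside mAP, measures the overlap between a predicted bounding box and its ground-truth box. A minimal sketch for axis-aligned boxes, given as an illustration rather than any particular paper's implementation:

```python
def iou(box_a, box_b):
    # Boxes as (x_min, y_min, x_max, y_max) in pixels.
    xa = max(box_a[0], box_b[0])
    ya = max(box_a[1], box_b[1])
    xb = min(box_a[2], box_b[2])
    yb = min(box_a[3], box_b[3])
    inter = max(0, xb - xa) * max(0, yb - ya)   # intersection area
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# A 100x100 prediction shifted 10 px from the ground truth.
print(iou((0, 0, 100, 100), (10, 10, 110, 110)))  # roughly 0.68
```

mAP builds on this quantity: a detection counts as correct only when its IoU with a ground-truth box exceeds a chosen threshold.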



In this study, to achieve higher accuracy in solving the challenges of drone type recognition in visible images, a modified YOLOv4 network is proposed. Drone recognition challenges include the drone's far distance from the camera, a crowded background, unpredictable movements, and the drone's resemblance to birds. As an independent approach, the proposed modified YOLOv4 deep learning network architecture is capable of recognizing birds and four types of drones: multirotors, fixed-wings, helicopters, and VTOLs. To show the improved results of the new model, its performance is also compared with the base YOLOv4 network.

#### **4. Methodology**

In this study, a modified network based on the latest version of the YOLO network is proposed. The steps to recognize bird species and four different types of drones are presented in Figure 4. In the first step, the input data was prepared to be ready to enter the proposed network. In the second step, the model was trained to recognize the drones, and the weight file used in the testing phase was generated. In the third step, the network was tested to observe how it worked; and, in the last step, the proposed deep learning network was evaluated using evaluation metrics.

**Figure 4.** Recognition process using implemented modified network.

