Article

Enhanced Detection of Small Unmanned Aerial System Using Noise Suppression Super-Resolution Detector for Effective Airspace Surveillance

Department of Electrical Engineering, Soonchunhyang University, Asan 31538, Republic of Korea
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(6), 3076; https://doi.org/10.3390/app15063076
Submission received: 21 January 2025 / Revised: 3 March 2025 / Accepted: 10 March 2025 / Published: 12 March 2025

Abstract

Small unmanned aerial systems have become increasingly prevalent in various fields, including agriculture, logistics, and the public sector, but concerns over misuse, such as military intrusions and terrorist attacks, highlight the necessity for effective aerial surveillance. Although conventional radar systems can cover large areas, they face challenges in accurately identifying small drones. In contrast, vision sensors offer high-resolution identification but encounter challenges in long-range detection and real-time processing. To address these limitations, this study proposes a vision sensor-based detection framework, termed the noise suppression super-resolution detector (NSSRD). To ensure the reliability and real-time capability of small drone detection, NSSRD integrates image segmentation, noise suppression, super-resolution transformation, and efficient detection processes. NSSRD divides the surveillance area into uniform sections, applies a bilateral filter to suppress noise before passing the images to an object detection model, and uses a region of interest selection process to reduce the detection area and computational load. The experimental results demonstrate that NSSRD outperforms existing models, achieving a 24% improvement in the true positive rate and a 25% increase in recall at an altitude of 40 m, validating its superior performance.

1. Introduction

The proliferation of small unmanned aerial systems is becoming increasingly evident in both commercial and personal markets, with these systems playing pivotal roles in various sectors, including agriculture, logistics, and construction. The integration of artificial intelligence into drones has been particularly impactful, enhancing their efficiency through autonomous flight, real-time data analysis, and route optimization. As these technologies advance, small drones are becoming more accessible to individual users due to their increasingly diverse functionalities and affordability. Consequently, the use of drones is now prevalent in various fields beyond recreational and photographic applications, including public sector operations such as environmental monitoring and disaster response, with further expansion in their scope of use anticipated [1,2].
However, the increasing use of drones has also heightened the need for safety measures and regulations. Governments worldwide are enforcing stricter regulations on drones, such as the establishment of no-fly zones and registration systems, to address concerns over potential misuse. The malicious use of drones, such as crossing national borders or targeting military installations, poses significant risks, including difficulties in target identification, military failures, and civilian casualties. Such misuse can constitute a violation of international law and humanitarian principles, particularly in densely populated areas or restricted zones. Furthermore, drones are increasingly being exploited for illegal activities such as smuggling and terrorism [3,4]. In addition, UAVs are playing a pivotal role in package distribution, where effective detection technologies ensure accurate delivery in urban and remote areas. As UAVs continue to be deployed in these fields, the demand for reliable detection technologies is expected to grow, making the development of efficient and robust detection systems a critical area of research. Consequently, the development of aerial surveillance systems has become essential for detecting and preventing drone-related threats. These systems aim to identify and respond promptly to potential misuse, thereby protecting civilian facilities and military installations while minimizing damage.
Technological approaches to drone surveillance involve various sensor-based methods. Radar, acoustic, and vision sensors are commonly employed as primary surveillance instruments, each with its own advantages and limitations. In military and security applications, radar systems are used to detect and track the flight paths of aircraft. Radar can cover large areas and detect objects at long distances, even when they are not visually observable, making it an effective surveillance tool. However, the accuracy of radar can be compromised by interference from other electromagnetic signals, and radar may struggle to detect small drones with low radar cross-sections (RCSs) [5,6]. To address these issues, extensive research has been conducted. One such study, by Wu et al. [7], proposed integrating features such as micro-Doppler effects caused by drone movements and rotations, RCS variations reflecting the size and shape of objects, and motion analysis to improve situational awareness. This fusion technology has contributed to reducing false detections and improving drone identification performance. In addition, Fan et al. [8] used a linear frequency-modulated continuous wave radar that uses high-frequency bandwidths to provide high-resolution range information at short distances and reduce noise through frequency modulation, thereby enabling precise detection of small drones. In a similar vein, Gong et al. [9] analyzed the Doppler signal-to-clutter ratio generated by drone movements to differentiate weak drone signals from environmental noise. This approach improved detection accuracy in complex environments. However, when small drones fly close to the ground, their low altitude can cause radar signals to scatter or be absorbed by surrounding terrain and obstacles. This can lead to situations where conventional detection methods struggle to detect drones effectively. In addition, various electromagnetic noises and clutter in the environment may obscure drone signals, further complicating detection.
Another method for aerial surveillance involves using acoustic characteristics to detect drones. Kim et al. [10] proposed a technique that identifies the presence of drones by visualizing the spectrum of acoustic signals generated during flight using fast Fourier transform and recognizing specific frequency patterns. This method leverages the unique frequency characteristics of drone flight noise for detection. Seo et al. [11] transformed acoustic signals into spectrograms using short-time Fourier transform and trained convolutional neural networks (CNNs) to effectively distinguish between drone and non-drone signals. This approach uses the time-frequency information of acoustic signals and the robust recognition capabilities of CNNs to improve drone identification accuracy in diverse environments. However, acoustic sensor-based detection methods are highly susceptible to external factors such as noise, wind, and weather conditions, which can significantly degrade detection performance. In addition, drones at long distances generate weaker acoustic signals, making detection challenging. Variations in flight patterns or environmental conditions can also affect the acoustic signal characteristics, posing additional limitations.
Vision sensors have been employed to detect low-altitude drones by providing high-resolution imagery, enabling the identification of fine details, real-time monitoring, and flight path analysis. However, unlike radar, vision sensors face challenges in identifying small drones from long distances due to reduced image resolution. To address these issues, research has focused on improving detection performance through image grid partitioning and resolution enhancement. Akyon et al. [12] divided the image into grids and used the you only look once (YOLO) object detection model to independently process each grid area, making it advantageous for detecting small drones at long distances through image magnification. Zhang et al. [13] introduced an adaptive slicing technique to dynamically divide images into various sizes and allocate more computational resources to critical areas, thereby improving drone detection efficiency. They also developed hyper inference to accurately deduce drone characteristics and locations. However, these methods typically require high-performance hardware for handling high-dimensional data, limiting their real-time processing capabilities. Bashir et al. [14] addressed these challenges by enhancing feature extraction using residual feature aggregation to mitigate gradient vanishing issues even in deeper neural networks. They also employed generative adversarial networks to model diverse scenarios for detecting small drones under various environmental conditions, thereby improving the detection effectiveness and accuracy. Zhao [15] utilized the fast super-resolution CNN (FSRCNN) and SRCNN to enhance the recognition of fine details in small drones and proposed a method to reduce false detections using a single-shot multibox detector. Yoon et al. [16] improved the detection performance for small objects by adaptively learning various object characteristics and proposed a method to detect small objects effectively using SRCNN. Although these studies present diverse approaches to improve detection performance for small objects or drones, challenges such as artificial noise or distortions introduced during resolution enhancement and high computational demands remain unresolved, limiting their real-time detection capabilities.
To overcome these challenges and meet the requirements of real-time aerial surveillance, this study proposes a vision sensor-based detection framework called the noise suppression super-resolution detector (NSSRD). Small drone detection requires wide surveillance coverage and high accuracy under diverse environmental conditions. The noise and computational overhead associated with resolution enhancement processes pose significant constraints on real-time processing. NSSRD addresses these issues by integrating image partitioning, noise suppression, high-resolution conversion, and efficient detection processes.
NSSRD first partitions the surveillance area into N grids, and each grid image is passed as input to the YOLO object detection model. To address the potential degradation of detection performance caused by amplified noise near small drones during image scaling, NSSRD applies bilateral filtering as a preprocessing step to effectively reduce the noise before feeding the images into YOLO. Furthermore, to improve the detection speed and reduce unnecessary computations, NSSRD selectively focuses on a single region of interest (ROI) instead of processing all N-grid images. The selected ROI image is then enhanced to high resolution using SRCNN, providing clearer details about the drone. The enhanced image is subsequently processed by YOLO, achieving both high performance and real-time detection.
The proposed NSSRD was evaluated through comparative experiments with various state-of-the-art detection models under different drone altitude and drone-to-sensor distance conditions (10–50 m and 10–70 m, respectively). The results demonstrated an improvement of up to 24% in true positive (TP) rates and a 25% increase in recall at a 40 m altitude compared to existing models, validating the performance and practicality of the proposed framework. The main contributions of this study can be summarized as follows:
- A detection model integrating image partitioning, noise suppression, and super-resolution image enhancement was developed to improve detection performance for small drones, while preserving as much of the critical silhouette information of the drone as possible for accurate detection.
- A preprocessing method using image partitioning and bilateral filtering was proposed to suppress noise and distortion occurring during image magnification, thereby improving the reliability of drone detection.
- An ROI exploration process based on Laplacian filtering was introduced to reduce unnecessary computations in image partitioning, thereby enhancing overall computational efficiency and detection speed, and meeting real-time processing requirements.
- The performance and adaptability of the proposed model were validated in real-world scenarios with various drone altitudes, drone-to-sensor distance conditions, and drone sizes.
The remainder of this paper is organized as follows: Section 2 presents a detailed step-by-step description of the proposed small drone detection architecture. Section 3 discusses the experimental process and presents the results. Section 4 concludes the paper with a summary of findings and outlines potential directions for future research.

2. Materials and Methods

The proposed NSSRD framework, designed for the effective detection of low-altitude small drones, is composed of three main modules: grid partitioning, ROI image extraction, and enhanced object detection (Figure 1).
First, the NSSRD framework divides the input image into N equal segments to efficiently detect small objects. Subsequently, through the ROI selection stage, only one segment, which is most likely to contain the object, is selectively extracted from the N segments, thereby minimizing unnecessary computations. This approach increases the processing speed and computational efficiency. The extracted segment undergoes a bilateral filtering process before being transformed into a high-resolution image to mitigate potential issues arising during resolution enhancement. This step suppresses noise that may be restored around the object during high-resolution transformation, thereby minimizing its negative impact on detection performance. The noise-suppressed image is then transformed into a high-resolution version using SRCNN, thereby maximizing the detail of the drone shape and improving detection accuracy. Finally, the transformed high-resolution image is passed through an object detector, ensuring successful object detection. The processing steps of the proposed NSSRD model are detailed in Algorithm 1.
Algorithm 1. Noise Suppression Super-Resolution Detection (NSSRD)
Input: input image I, image width W, image height H, number of divisions along each axis D, binarization threshold T, number of ROI images S
Require: pretrained detector YOLO, weights (w1, w2, w3), biases (b1, b2, b3)
Output: detections B_box
  • Split input image I into D × D grid images I(i,j)^grd
  • for i = 1 : D                                        // Loop for ROI image extraction
  •    for j = 1 : D
  •       I(i,j)^gry ← convert grid image I(i,j)^grd to grayscale
  •       I(i,j)^lap ← apply Laplacian filter to I(i,j)^gry
  •       if I(i,j)^lap(x, y) ≥ T, I(i,j)^bin(x, y) ← 1; otherwise I(i,j)^bin(x, y) ← 0   // Binarization
  •    end
  • end
  • if max(I(i,j)^bin) == 1, I^roi ← corresponding RGB grid I(i,j)^grd            // Select ROI
  • I^bil ← apply bilateral filter to I^roi
  • for i = 1 : S                                        // Loop for resolution enhancement
  •    I_i^sr1 ← max(0, w1 ∗ I_i^bil + b1)
  •    I_i^sr2 ← max(0, w2 ∗ I_i^sr1 + b2)
  •    I_i^sr ← w3 ∗ I_i^sr2 + b3
  • end
  • B_box ← YOLO(I^sr)                                        // Object detection
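To make the control flow of Algorithm 1 concrete, the following is a minimal Python sketch of the NSSRD pipeline. The function and parameter names (nssrd_detect, srcnn_enhance, the threshold T = 40) are illustrative assumptions rather than the authors' implementation; srcnn_enhance stands in for the SRCNN stage sketched in Section 2.3.2, and yolo_model for a YOLO-v8 detector callable.

```python
import cv2
import numpy as np

def nssrd_detect(image, yolo_model, srcnn_enhance, D=3, T=40):
    """Partition -> ROI selection -> bilateral filtering -> super-resolution -> detection."""
    H, W = image.shape[:2]
    rois = []
    for i in range(D):                        # Loop for ROI image extraction
        for j in range(D):
            grid = image[H * i // D:H * (i + 1) // D, W * j // D:W * (j + 1) // D]
            gray = cv2.cvtColor(grid, cv2.COLOR_BGR2GRAY)
            lap = cv2.Laplacian(gray, cv2.CV_64F)
            if (np.abs(lap) >= T).any():      # binarized map contains a 1 -> select ROI
                rois.append(grid)
    detections = []
    for roi in rois:                          # Loop for resolution enhancement
        denoised = cv2.bilateralFilter(roi, 9, 75, 75)
        upscaled = cv2.resize(denoised, (W, H), interpolation=cv2.INTER_CUBIC)
        detections.append(yolo_model(srcnn_enhance(upscaled)))  # Object detection
    return detections
```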

2.1. Grid Partitioning

To enhance the detection of small objects, the proposed NSSRD model employs grid partitioning. Large objects are easier to detect due to sufficient image detail and resolution [17]. In contrast, small objects, defined by a limited number of pixels, pose a challenge because detectors struggle to learn meaningful features and accurately identify them. To address this issue, the image is divided into smaller grids for detection. The partitioned images are resized to match the input size of the detector as follows:
$$I_{(i,j)}^{grd} = I\left[\frac{W(i-1)}{D} : \frac{Wi}{D},\ \frac{H(j-1)}{D} : \frac{Hj}{D}\right]. \tag{1}$$
Here, W and H represent the number of pixels along the x-axis and y-axis of the image, respectively; when W and H are equal, the sub-images maintain a square shape. D represents the number of divisions along both the horizontal and vertical axes, resulting in D² square sub-images.
This process ensures that the partitioned images retain a square shape when input into the detector, preventing any loss of object information due to aspect ratio distortion. Such distortion can alter object features and degrade detection performance, making its prevention critical. When the original image (Figure 2a) is partitioned according to Equation (1), it results in square sub-images as shown in Figure 2b. This grid partitioning approach enhances the clarity of small objects, improving both detection accuracy and reliability.
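As an illustration, the slicing of Equation (1) can be written directly with NumPy indexing. This is a minimal sketch assuming a square input image; partition_image is a hypothetical helper name, not the authors' code.

```python
import numpy as np

def partition_image(I: np.ndarray, D: int) -> list:
    """Split an H x W (x C) image into D x D equal sub-images, per Equation (1)."""
    H, W = I.shape[:2]
    grids = []
    for i in range(1, D + 1):
        for j in range(1, D + 1):
            grids.append(I[H * (i - 1) // D: H * i // D,
                           W * (j - 1) // D: W * j // D])
    return grids

# With the setup described in Section 3, a 600 x 600 frame and D = 3
# yield nine 200 x 200 square sub-images.
```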

2.2. ROI Image Extraction

Failure to promptly detect drones in high-risk areas, such as secure zones or conflict regions, can lead to significant damage. Therefore, minimizing unnecessary computations is essential to ensure the speed and efficiency of drone detection [18]. If all N partitioned images were analyzed, the computational cost would increase N-fold compared to detecting a single image. However, selectively processing only images containing objects can significantly reduce the detection time, thereby improving detection efficiency. To achieve this, a Laplacian filter is first applied to all partitioned images I^grd to enhance noise and object boundaries within the images. This enhancement allows images containing only the background to be distinguished from those containing objects. In the next step, a threshold T is set based on the filtered images I^lap, and a binarization process is performed. Finally, only images containing pixels with a value of 1 in the binarized images I^bin are extracted. These images are then matched with their original RGB counterparts to select the final images containing objects, I^roi. By processing only the images with objects among the N partitioned images, unnecessary computations are significantly reduced. Figure 3 illustrates the ROI extraction process using the Laplacian filter.

2.2.1. Edge Detection

In the process of object detection, it is critical to prevent the omission of objects. To this end, filters that effectively detect object boundaries are employed, with representative methods, including the Sobel filter, Canny edge filter, and Laplacian filter [19,20,21]. The Sobel filter detects edges through first-order derivative operations and is simple and efficient. However, its edge detection accuracy tends to be lower than that of second-order derivative-based methods such as the Laplacian filter. The Canny edge filter, a robust edge detection technique, suppresses noise and selectively detects significant edges but has limitations due to potential information loss depending on the threshold settings. In contrast, the Laplacian filter, based on second-order derivatives, does not rely on specific thresholds and emphasizes edge information while retaining maximum boundary details.
RGB images consist of three channels, namely, red, green, and blue, each with independent intensity values. The Laplacian filter operates as a single-channel operator and requires independent filtering for each channel when applied to RGB images. However, boundaries calculated separately for each channel may overlap and mix, leading to ambiguous boundaries. To address this, grayscale conversion is used to focus on luminance information, thereby allowing the Laplacian filter to detect more distinct boundaries. Grayscale conversion is achieved through a weighted combination as follows:
$$I_{(i,j)}^{gry} = 0.299\, I_{(i,j)}^{grd}[R] + 0.587\, I_{(i,j)}^{grd}[G] + 0.114\, I_{(i,j)}^{grd}[B]. \tag{2}$$
Here, I_(i,j)^gry denotes the pixel value of the image converted to grayscale, and I_(i,j)^grd[R], I_(i,j)^grd[G], and I_(i,j)^grd[B] denote the pixel values of the R, G, and B channels, respectively.
The Laplacian filter, based on second-order derivatives, calculates the rate of change in image brightness and effectively emphasizes abrupt intensity variations near edges. The Laplacian filter is defined as follows:
$$I_{(i,j)}^{lap} = \frac{\partial^2 I_{(i,j)}^{gry}}{\partial x^2} + \frac{\partial^2 I_{(i,j)}^{gry}}{\partial y^2}. \tag{3}$$
Here, ∂²I_(i,j)^gry/∂x² and ∂²I_(i,j)^gry/∂y² denote the second-order derivatives along the x-axis and y-axis, respectively, which calculate the curvature of the pixel values. This curvature calculation results in higher values at boundaries where there is a sharp change in image brightness, thereby enabling a clearer emphasis on the object edges.
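A compact sketch of this edge-emphasis step using OpenCV is shown below; cv2.cvtColor applies the same Rec. 601 luminance weights as Equation (2), and cv2.Laplacian implements the second-derivative operator of Equation (3). The kernel size is an illustrative choice.

```python
import cv2

def laplacian_edges(grid_bgr):
    """Grayscale conversion (Equation (2)) followed by the Laplacian of Equation (3)."""
    gray = cv2.cvtColor(grid_bgr, cv2.COLOR_BGR2GRAY)  # 0.299 R + 0.587 G + 0.114 B
    lap = cv2.Laplacian(gray, cv2.CV_64F, ksize=3)     # second-derivative edge response
    return cv2.convertScaleAbs(lap)                    # absolute magnitude as 8-bit image
```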

2.2.2. Binarization and RGB Image Matching

The resulting image after applying the Laplacian filter expresses the boundaries between the background and the object with various pixel values in a single channel. To distinguish these boundaries, a binarization process is performed. The binarization converts the background to 0 and the object boundaries to 1 based on a specific threshold T, which is typically set as the maximum pixel value of the background image, as follows:
$$I_{(i,j)}^{bin}(x, y) = \begin{cases} 1, & \text{if } I_{(i,j)}^{lap}(x, y) \ge T \\ 0, & \text{if } I_{(i,j)}^{lap}(x, y) < T. \end{cases} \tag{4}$$
Here, I_(i,j)^lap denotes the resulting image after applying the Laplacian filter. In the binarized image, a pixel value of 1 indicates the presence of an object in the region. The binarized image, which is defined as the ROI, is represented as I^roi ∈ R^(1×S), where S represents the number of regions with pixel values of 1 in the partitioned image. The corresponding data from the original RGB image are selected according to I^roi; thus, the final object-containing ROI I^roi is extracted. This process is performed by using the binarized image as a mask and matching it with the original RGB image.
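The following sketch combines Equation (4) with the mask-based matching step. select_roi_grids is a hypothetical helper name, and the threshold T is assumed to have been chosen from the background response as described above.

```python
import numpy as np

def select_roi_grids(lap_images, rgb_grids, T):
    """Binarize each Laplacian map (Equation (4)) and keep the matching RGB grids."""
    rois = []
    for lap, rgb in zip(lap_images, rgb_grids):
        binary = (lap >= T).astype(np.uint8)  # 1 at boundary pixels, 0 for background
        if binary.max() == 1:                 # any boundary pixel marks an object candidate
            rois.append(rgb)
    return rois
```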

2.3. Enhanced Object Detection

Previous studies on small object detection using vision sensors have primarily focused on enhancing the resolution of input images to detect objects more accurately [22,23,24]. However, simply increasing the resolution can result in the amplification of noise around the object, which negatively impacts detection performance. This issue leads to false positives or missed detections, particularly in complex backgrounds or low-resolution environments. To solve this problem, an effective noise suppression technique is proposed before enhancing the image resolution. Specifically, a bilateral filter is applied before enhancing the resolution to suppress noise around the object. The bilateral filter uses weights based on the spatial distance and intensity differences between pixels, effectively eliminating unnecessary noise while preserving the fine details of the object shape. The denoised image is then enhanced using the SRCNN, thereby improving detection performance through an object detection model, YOLO.

2.3.1. Noise Suppression

A factor affecting object detection is noise interference. In particular, when detecting small objects, noise blurs the boundaries, degrading detection performance. This is because the boundary information of small objects is relatively weak compared to that of large objects, and the stronger the noise, the lower the detection accuracy [25,26]. To address this issue, a bilateral filter is applied to effectively suppress noise while preserving object boundaries.
Commonly used noise removal filters tend to lose not only the noise but also object boundary information. However, bilateral filters perform filtering by simultaneously considering spatial distance and pixel intensity differences. This allows the object boundaries to be preserved while other regions are blurred, effectively removing noise. A bilateral filter is applied to an image as follows:
$$I^{bil}(x_c, y_c) = \frac{1}{K_n} \sum_{k,l} I^{roi}(k, l)\, \exp\!\left(-\frac{(x_c - k)^2 + (y_c - l)^2}{2\sigma_s^2}\right) \exp\!\left(-\frac{\left(I^{roi}(x_c, y_c) - I^{roi}(k, l)\right)^2}{2\sigma_r^2}\right). \tag{5}$$
Here, K_n denotes the normalization constant, which prevents distortion while maintaining the brightness and color of the image after filtering. (x_c, y_c) denotes the pixel coordinates of the center of the kernel created in the I^roi image, and (k, l) denotes the coordinates of other pixels within the kernel. σ_s² and σ_r² denote the Gaussian kernel variances for spatial distance and pixel intensity differences, respectively. The bilateral filter combines these spatial distance and color differences to remove noise around the object while preserving the boundaries.
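In practice, this filter is available directly in OpenCV; the sketch below shows the denoising step with illustrative parameter values, where the kernel diameter d and the two sigmas correspond to the spatial and range terms of Equation (5).

```python
import cv2

def suppress_noise(roi_bgr, d=9, sigma_color=75.0, sigma_space=75.0):
    """Edge-preserving denoising; the sigmas play the roles of sigma_r and sigma_s in Equation (5)."""
    return cv2.bilateralFilter(roi_bgr, d, sigma_color, sigma_space)
```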
Figure 4 compares the results before and after the application of the bilateral filter. Figure 4a shows the segmented ROI image. Figure 4b shows the result after applying the Laplacian filter to the original segmented image to examine the noise effect that occurs during image magnification, where the object boundaries are emphasized but noise is also amplified. In contrast, Figure 4c shows the result after applying the bilateral filter to suppress noise, followed by the Laplacian filter. Here, the object boundaries are preserved, and background noise is significantly reduced.

2.3.2. Image Enhancement

The image I^bil, where noise has been suppressed using the bilateral filter, is then processed using SRCNN to convert the low-resolution image into a high-resolution image, enabling a more accurate and detailed capture of the object features. Consequently, SRCNN helps effectively detect various types of drones based on distance and shape. As shown in Figure 5, SRCNN consists of three layers: patch extraction and representation, nonlinear mapping, and reconstruction [14,15].
The first layer of SRCNN extracts small regions (patches) from the low-resolution ROI image I^bil, which has undergone bilateral filtering, and converts them into high-dimensional vectors to create feature maps. The operation is defined as follows:
$$I_i^{sr1} = \max\left(0,\ w_1 * I_i^{bil} + b_1\right). \tag{6}$$
Here, w_1 and b_1 denote the filter weights and bias of the first layer, respectively, ∗ denotes the convolution operation, and max(0, ·) denotes the rectified linear unit activation function. The feature map generated by the first layer is the result of a linear transformation based on the low-resolution image, which is unsuitable for expressing complex nonlinear patterns. To address this issue, the second layer performs nonlinear mapping to extract the complex features required for generating the high-resolution image as follows:
$$I_i^{sr2} = \max\left(0,\ w_2 * I_i^{sr1} + b_2\right). \tag{7}$$
In the third layer, the high-dimensional vectors extracted from the previous two layers are combined to reconstruct the final high-resolution image as follows:
$$I_i^{sr} = w_3 * I_i^{sr2} + b_3. \tag{8}$$
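These three layers map directly onto three convolutions. Below is a minimal PyTorch sketch using the 9-1-5 kernel sizes and 64/32 filter counts of the original SRCNN, assuming the input has already been bicubically upscaled to the target resolution; the layer names are illustrative, not from the paper.

```python
import torch
import torch.nn as nn

class SRCNN(nn.Module):
    def __init__(self, channels: int = 3):
        super().__init__()
        self.patch_extract = nn.Conv2d(channels, 64, kernel_size=9, padding=4)  # Eq. (6)
        self.nonlinear_map = nn.Conv2d(64, 32, kernel_size=1)                   # Eq. (7)
        self.reconstruct = nn.Conv2d(32, channels, kernel_size=5, padding=2)    # Eq. (8)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = torch.relu(self.patch_extract(x))   # max(0, w1 * I_bil + b1)
        x = torch.relu(self.nonlinear_map(x))   # max(0, w2 * I_sr1 + b2)
        return self.reconstruct(x)              # w3 * I_sr2 + b3
```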
Figure 6 compares the results of using SRCNN for the resolution enhancement of small drones. Figure 6a shows the original low-resolution image, and Figure 6b shows the high-resolution image obtained after applying SRCNN. Compared with the original image, the enhanced image restores the detailed information around the object and sharpens the boundaries, offering a higher-quality image.

2.3.3. Object Detection

The advancement of deep learning technology has significantly improved image recognition accuracy based on CNNs. A common technique used in CNNs is pooling, which reduces the size of the input feature map, thereby reducing computational load and enhancing model generalizability. However, this process can result in the loss of small features, which can be problematic when fine details are crucial. For example, max pooling selects only the largest value from each region, potentially ignoring small features, whereas average pooling takes the average value of the region, leading to the dilution of detailed information [27]. To address these limitations and achieve fast and accurate object detection, this study used the YOLO-v8 model. YOLO-v8 combines feature pyramid networks and path aggregation networks to merge detailed features and abstract information bi-directionally, thereby enabling more accurate object detection [28].
YOLO [29] is a single neural network-based model for object detection that divides the input image into A×A grids and predicts B predefined bounding boxes for each grid cell. If the predicted bounding box overlaps with the actual bounding box by a certain threshold, the box is considered to contain an object. The probability that the predicted bounding box corresponds to one of the detectable objects is calculated as follows [29]:
$$P_{CLS} = P_{obj} \times IOU_{thr} \times P_{cls|obj}. \tag{9}$$
Here, P_obj denotes the probability that the predicted bounding box contains an object, IOU_thr denotes the intersection over union (IoU) ratio, which represents the overlapping area between the predicted and actual bounding boxes, and P_(cls|obj) denotes the probability that a specific object is classified among the objects of interest. Ultimately, the bounding box with the highest P_CLS value among the B predicted bounding boxes is selected as the bounding box for the detected object.
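The geometric quantity in Equation (9) can be computed as follows. This is a generic sketch of the IoU and score computation with boxes given as (x1, y1, x2, y2) corner coordinates; it illustrates the scoring rule rather than reproducing the YOLO implementation.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def class_confidence(p_obj, iou_thr, p_cls_given_obj):
    """Score of Equation (9); the box with the highest score is kept."""
    return p_obj * iou_thr * p_cls_given_obj
```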

3. Experimental Results

To evaluate the performance of the proposed NSSRD framework, various tests were conducted by varying the drone altitude and the drone distance from sensors to generate diverse datasets. This approach aimed to validate the superiority of the proposed small drone detection model through comparisons with existing state-of-the-art models with similar structures.

3.1. Experimental Setup

The proposed model was implemented using an NVIDIA GeForce GTX 1080 Ti graphics card (NVIDIA, Santa Clara, CA, USA) and an Intel Core i7-8700 CPU (Intel, Santa Clara, CA, USA). Drone flight videos were collected using DJI Mavic Pro 2 and DJI Mavic Air 2. The dimensions of the DJI Mavic Pro 2 and DJI Mavic Air 2 were 183 × 253 × 77 mm (L × W × H) and 322 × 242 × 84 mm (L × W × H), respectively. Drone flights were conducted at altitudes ranging from 10 to 50 m in 10 m intervals. The distance between the vision sensor and the drone was set from 10 to 70 m. For each drone altitude and distance configuration, 200 images were randomly collected, resulting in 1400 images per altitude for evaluation. Drone distances beyond this range were excluded from the testing scope because visual drone identification is difficult at such distances.
The original collected images had a resolution of 600 × 600 pixels and were divided into nine equal parts, each with a resolution of 200 × 200 pixels. When the ROI-extracted images were processed through SRCNN, they were upscaled to the original resolution of 600 × 600 pixels. These SRCNN-enhanced images were used as inputs for the YOLO-v8 detection model, which required resizing to the detector's input resolution of 640 × 640 pixels, potentially causing pixel-level loss. To minimize such loss, the divided image size was set to 200 × 200 pixels. The primary goal of NSSRD was to ensure that no small drones were missed, making recall the key performance evaluation metric. Recall, which is calculated as TP/(TP + FN), represents the proportion of correctly detected objects among all actual objects. Here, TP and FN denote true positives and false negatives, respectively [28].
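Under this setup, the metric reduces to a simple count-based computation; the sketch below (with made-up numbers purely for illustration, not experimental results) shows how recall is obtained from TP and FN.

```python
def recall(tp: int, fn: int) -> float:
    """Recall = TP / (TP + FN): the share of actual drones that were detected."""
    return tp / (tp + fn)

# Hypothetical illustration: if 1120 of the 1400 drones at one altitude are
# detected and 280 are missed, recall(1120, 280) returns 0.8.
```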

3.2. Comparative Evaluation of Detection Performance

The performance of the proposed model was evaluated through comparisons with several models based on YOLO-v8, which shares a similar structure. The training parameters for YOLO-v8 were set as follows: the initial learning rate was 0.01 and gradually decreased over time. The momentum was set to 0.9, and the IoU threshold for non-maximum suppression (NMS) was set to 0.7. The models from the following studies were included for comparison: Reis et al. [30], who used only YOLO-v8 in object detection; Zhang et al. [13], who proposed a method for detecting drones by segmenting images and enlarging object sizes; Bae et al. [31], who applied a bilateral filter to segmented images to suppress noise around objects, thereby enhancing detection performance; and Yoon et al. [16], who improved detection performance by segmenting images and applying SRCNN for small object detection.
Figure 7 compares the detection performance of NSSRD and the other models based on the drone flight altitude. Figure 7a shows that NSSRD outperformed the other similar models across all altitudes in terms of the drone detection rate. Notably, at an altitude of 40 m, NSSRD recorded at least a 20% higher detection rate than the other models. Figure 7b shows that NSSRD achieved a lower false detection rate than the other models at all altitudes. Notably, at an altitude of 40 m, NSSRD demonstrated superior performance with a reduction of more than 20% compared to the model of Bae et al. [31]. Figure 7c shows the recall values of the models, indicating the ratio of correctly detected drones to all actual drones. NSSRD exhibited higher recall values at all altitudes, confirming that it effectively detects drones. The model of Bae et al. [31] exhibited performance similar to that of NSSRD at an altitude of 10 m; however, as the altitude increased, the performance gap widened, with a difference of more than 20% at 40 m. The models of Zhang et al. [13] and Yoon et al. [16], which use image segmentation for precise detection, demonstrated recall values 10–20% lower than those of NSSRD, indicating vulnerability in detecting small drones. This is likely because these models are specialized for environments where the sensor-to-object distance is short or where objects are less affected by noise, such as on the horizon of the sea. However, this study focused on drone detection in flight environments with noise, such as clouds; thus, the existing models may have produced false alarms and missed detections in such environments.
Figure 8 shows the detection results of NSSRD and existing models at a drone altitude and distance of 20 and 10 m, respectively. The top-left corner of each subfigure presents a magnified view of the flying drone, and the top-right corner visualizes the noise around the drone using a Laplacian filter. In these images, the white bounding boxes represent the ground truth, the green boxes indicate correctly detected drones, and the red boxes denote false alarms.
As shown in Figure 8a, the model of Reis et al. [30] successfully detected the drone because the relatively short distance preserved the drone features and boundaries. However, as shown in Figure 8b,c, the models of Zhang et al. [13] and Bae et al. [31], which rely on image segmentation followed by YOLO-based detection, failed to detect the drone. The segmentation process of these models introduced distortions when resizing the segmented images to fit the input dimensions of YOLO, thereby leading to the loss of critical object information. Similarly, Yoon et al. [16] attempted to minimize information loss by enhancing the image resolution using SRCNN after segmentation. However, as shown in Figure 8d, this approach inadvertently restored the noise around the object, resulting in detection failure. In contrast, NSSRD demonstrated superior performance, as shown in Figure 8e. By combining SRCNN for image resolution enhancement with a bilateral filter to suppress noise, NSSRD effectively minimized object information loss and performed accurate drone detection. These findings highlight the advantages of NSSRD over the comparison models in detecting drones under similar conditions.
Figure 9 shows the drone detection results at an altitude of 20 m and a distance of 60 m. As the distance between the sensor and the object increased, the boundaries of the object became blurred, resulting in failed detections by all existing models except NSSRD. For the models of Reis et al. [30], Zhang et al. [13], and Bae et al. [31], the magnified images revealed blurred resolution and indistinct boundaries, leading to detection failure. Even when a Laplacian filter was applied, the object boundaries remained ambiguous, highlighting the reasons for their inability to detect the drone. Yoon et al. [16] attempted to improve the object resolution using SRCNN; however, this process also amplified the noise surrounding the object, preventing accurate detection. In contrast, NSSRD successfully addressed these challenges by applying a bilateral filter, effectively removing the noise around the object that hindered detection in the model of Yoon et al. [16]. This approach allowed NSSRD to restore clearer object boundaries and achieve accurate drone detection, demonstrating its robustness under challenging conditions.
Figure 10 shows the drone detection results at an altitude of 40 m and a distance of 60 m in an environment affected by clouds. Similar to the detection results at a 20 m altitude and a 60 m distance, all existing models except NSSRD failed to detect the drone. In particular, Bae et al. [31] attempted to remove noise around the object using a bilateral filter; however, this resulted in excessive object information loss compared with the other models. This also led to a blurring of the drone shape; thus, false alarms occurred in regions that did not contain the object. In contrast, NSSRD combined a bilateral filter with SRCNN, thereby effectively suppressing unnecessary noise around the object, enhancing the resolution, and enabling the accurate restoration of the object boundaries. These results provide strong evidence that NSSRD can minimize the impact of noise while accurately detecting drones.

3.3. Comparative Evaluation of Inference Speed

To verify the real-time processing capability of the proposed NSSRD model, the number of floating-point operations (FLOPs) was calculated for each model. The FLOPs metric represents the number of floating-point operations a model performs to process a single input. The FLOP results are summarized in Table 1. In the comparative experiment, the model of Reis et al. [30], which uses the simplest structure with YOLO, exhibited a FLOP value of 28G. In contrast, the FLOP value of the model of Yoon et al. [16], which added preprocessing and super-resolution image enhancement modules, increased to 307G. However, despite including noise suppression and super-resolution image enhancement functions similar to the model of Yoon et al. [16], NSSRD exhibited a significantly lower FLOP value of 34G. This is approximately a 20% increase compared to that of the model of Reis et al. [30] but is still within an acceptable range for real-time detection. Notably, the FLOP value of NSSRD was more than seven times lower than those of the other models, excluding the model of Reis et al. [30], demonstrating balanced performance in terms of both detection accuracy and computational efficiency. Thus, NSSRD not only maintains a low computational load but also achieves superior detection performance compared to existing models, as confirmed in previous experiments. This demonstrates the potential of the proposed model for real-time application and its high efficiency.

4. Conclusions

This study proposed a detection framework, NSSRD, to enhance the performance of long-range small drone detection while meeting the real-time airspace monitoring requirements. The proposed model introduces a noise suppression preprocessing structure based on bilateral filtering to effectively suppress the noise generated during image segmentation and enlargement, aiming to improve small drone detection performance. Additionally, an efficient detection process integrating super-resolution image enhancement was developed to preserve the critical silhouette information of the drone as much as possible, thereby improving both the reliability and efficiency of drone detection. Furthermore, by introducing an ROI exploration process based on Laplacian filtering, the model selects only images containing drones from the segmented images, reducing unnecessary computations and enhancing overall computational efficiency and detection speed. Experimental results demonstrated that NSSRD outperformed existing related models under various environmental conditions and altitudes, with a notable improvement of up to 24% in the TP rate at an altitude of 40 m. Additionally, by reducing computational load and improving detection speed, the proposed model design satisfies real-time processing requirements.
The key contribution of this study is the presentation of an integrated approach that enhances the reliability and efficiency of drone detection, thereby advancing real-time drone monitoring and detection technologies. Future research could further expand the applicability of NSSRD through the development of additional algorithms and hardware optimizations aimed at improving detection performance under various environmental conditions, such as complex backgrounds, low light conditions, or adverse weather. In addition, integrating path prediction and threat assessment algorithms after drone detection can lead to the development of a more sophisticated airspace monitoring system.

Author Contributions

J.Y. and J.C. participated in the discussion of the work described in this paper. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by a National Research Foundation of Korea (NRF) grant funded by the Korean government (MOE) (No. 2021R1I1A3055973) and the Soonchunhyang University Research Fund.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

The authors thank the editor and anonymous reviewers for their helpful comments and valuable suggestions.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Seidaliyeva, U.; Alduraibi, M.; Ilipbayeva, L.; Almagambetov, A. Detection of loaded and unloaded UAV using deep neural network. In Proceedings of the IEEE International Conference on Robotic Computing (IRC), Taichung, Taiwan, 9–11 November 2020. [Google Scholar]
  2. Liu, B.; Luo, H. An improved YOLOv5 for multi-rotor UAV detection. Electronics 2022, 11, 2330. [Google Scholar] [CrossRef]
  3. Long, T.; Ozger, M.; Cetinkaya, O.; Akan, O.B. Energy neutral Internet of drones. IEEE Commun. Mag. 2018, 56, 22–28. [Google Scholar] [CrossRef]
  4. Samaras, S.; Diamantidou, E.; Ataloglou, D.; Sakellariou, N.; Vafeiadis, A.; Magoulianitis, V.; Lalas, A.; Dimou, A.; Zarpalas, D.; Votis, K.; et al. Deep learning on multi-sensor data for counter UAV applications: A systematic review. Sensors 2019, 19, 4837. [Google Scholar] [CrossRef] [PubMed]
  5. Park, S.; Kim, Y.; Lee, K.; Smith, A.H.; Dietz, J.E.; Matson, E.T. Accessible real-time surveillance radar system for object detection. Sensors 2020, 20, 2215. [Google Scholar] [CrossRef] [PubMed]
  6. Gao, F.; Yi, J.; Wan, X.; Liu, Y.; Ke, H. Experimental research of multistatic passive radar with a single antenna for drone detection. IEEE Access 2018, 6, 33542–33551. [Google Scholar]
  7. Wu, Q.; Chen, J.; Lu, Y.; Zhang, Y. A complete automatic target recognition system of low altitude, small RCS, and slow speed (LSS) targets based on multi-dimensional feature fusion. Sensors 2019, 19, 5048. [Google Scholar] [CrossRef]
  8. Fan, S.; Wu, Z.; Xu, W.; Zhu, J.; Tu, G. Micro-Doppler signature detection and recognition of UAVs based on OMP algorithm. Sensors 2023, 23, 7922. [Google Scholar] [CrossRef] [PubMed]
  9. Gong, J.; Yan, J.; Hu, H.; Kong, D.; Li, D. Improved radar detection of small drones using Doppler signal-to-clutter ratio (DSCR) detector. Drones 2023, 7, 316. [Google Scholar] [CrossRef]
  10. Kim, J.; Park, C.; Ahn, J.; Ko, Y.; Park, J.; Gallagher, J.C. Real-time UAV sound detection and analysis system. In Proceedings of the IEEE Sensors Applications Symposium (SAS), Glassboro, NJ, USA, 13–15 March 2017; pp. 1–5. [Google Scholar]
  11. Seo, Y.; Jang, B.; Im, S. Drone detection using convolutional neural networks with acoustic STFT features. In Proceedings of the IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Auckland, New Zealand, 27–30 November 2018. [Google Scholar]
  12. Akyon, F.C.; Altinuc, S.O.; Temizel, A. Slicing-aided hyper inference and fine-tuning for small object detection. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Bordeaux, France, 16–19 October 2022; pp. 966–970. [Google Scholar]
  13. Zhang, H.; Hao, C.; Song, W.; Jiang, B.; Li, B. Adaptive slicing-aided hyper inference for small object detection in high-resolution remote sensing images. Remote Sens. 2023, 15, 124. [Google Scholar] [CrossRef]
  14. Bashir, S.M.A.; Wang, Y. Small object detection in remote sensing images with residual feature aggregation-based super-resolution and object detector network. Remote Sens. 2021, 13, 1854. [Google Scholar] [CrossRef]
  15. Zhao, X.; Li, W.; Zhang, Y.; Feng, Z. Residual super-resolution single shot network for low-resolution object detection. IEEE Access 2018, 6, 47780–47793. [Google Scholar] [CrossRef]
  16. Yoon, S.; Jalal, A.; Cho, J. MODAN: Multifocal object detection associative network for maritime horizon surveillance. J. Mar. Sci. Eng. 2023, 11, 1890. [Google Scholar] [CrossRef]
  17. Yang, X.; Song, Y.; Zhou, Y.; Liao, Y.; Yang, J.; Huang, J.; Bai, Y. An efficient detection framework for aerial imagery based on uniform slicing window. Remote Sens. 2023, 15, 4122. [Google Scholar] [CrossRef]
  18. Al-E’mari, S.; Sanjalawe, Y.; Alqudah, H. Integrating enhanced security protocols with moving object detection: A YOLO-based approach for real-time surveillance. In Proceedings of the International Conference on Cyber Resilience (ICCR), Dubai, United Arab Emirates, 26–28 February 2024; pp. 1–6. [Google Scholar]
  19. Xuan, L.; Hong, Z. An improved Canny edge detection algorithm. In Proceedings of the IEEE International Conference on Software Engineering and Service Science (ICSESS), Beijing, China, 24–26 November 2017; pp. 275–278. [Google Scholar]
  20. Zhang, J.Y.; Chen, Y.; Huang, X.X. Edge detection of images based on improved Sobel operator and genetic algorithms. In Proceedings of the International Conference on Image Analysis and Signal Processing, Linhai, China, 11–12 April 2009; pp. 31–35. [Google Scholar]
  21. Stanković, I.; Brajović, M.; Stanković, L.; Daković, M. Laplacian Filter in Reconstruction of Images using Gradient-Based Algorithm. In Proceedings of the 29th Telecommunications Forum (TELFOR), Belgrade, Serbia, 23–24 November 2021. [Google Scholar]
  22. Gao, Y.; Wang, Y.; Zhang, Y.; Li, Z.; Chen, C.; Feng, H. Feature super-resolution fusion with cross-scale distillation for small-object detection in optical remote sensing images. IEEE Geosci. Remote Sens. Lett. 2024, 21, 6008105. [Google Scholar] [CrossRef]
  23. Zou, H.; Gao, Y.; Guo, X.; Zheng, M. Small Object Detection Based on Super-Resolution Enhanced Detection Network. In Proceedings of the International Conference on Computer Information Science and Artificial Intelligence (CISAI), Kunming, China, 17–19 September 2021. [Google Scholar]
  24. Rabbi, J.; Ray, N.; Schubert, M.; Chowdhury, S.; Chao, D. Small-Object Detection in Remote Sensing Images with End-to-End Edge-Enhanced GAN and Object Detector Network. Remote Sens. 2020, 12, 1432. [Google Scholar] [CrossRef]
  25. Liu, G.; Wang, M.; Liu, L.; Liu, Y.; Jiang, Y. A noise-aware framework for blind image super-resolution. In Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), Taipei, Taiwan, 18–22 July 2022; pp. 1–6. [Google Scholar]
  26. Lee, G.; Hong, S.; Cho, D. Self-Supervised Feature Enhancement Networks for Small Object Detection in Noisy Images. IEEE Signal Process. Lett. 2021, 28, 1026–1030. [Google Scholar] [CrossRef]
  27. Zafar, A.; Aamir, M.; Mohd, N.; Arshad, A.; Riaz, S.; Alruban, A.; Dutta, A.; Almotairi, S. A comparison of pooling methods for convolutional neural networks. Appl. Sci. 2022, 12, 8643. [Google Scholar] [CrossRef]
  28. Wang, F.; Wang, H.; Qin, Z.; Tang, J. UAV Target Detection Algorithm Based on Improved YOLOv8. IEEE Access 2023, 11, 116534–116544. [Google Scholar] [CrossRef]
  29. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016. [Google Scholar]
  30. Reis, D.; Jordan, K.; Hong, J.; Daoudi, A. Real-time flying object detection with YOLOv8. arXiv 2024, arXiv:2305.09972. [Google Scholar]
  31. Bae, T.W. Small target detection using bilateral filter and temporal cross product in infrared images. Infrared Phys. Technol. 2011, 54, 403–411. [Google Scholar] [CrossRef]
Figure 1. Block diagram of the proposed NSSRD.
Figure 2. Partitioning results of image containing drone: (a) original image; (b) partitioned image.
Figure 3. ROI image extraction process using Laplacian filter.
Figure 4. Comparison before and after the application of bilateral filter: (a) original segmented image; (b) image after applying the Laplacian filter; (c) image after applying bilateral filter followed by the Laplacian filter.
Figure 5. Image processing flow of SRCNN.
Figure 6. Example of resolution enhancement for small drones using SRCNN: (a) original image; (b) enhanced-resolution image.
Figure 7. Drone detection performance at different flight altitudes: (a) TP; (b) FN; (c) recall [13,16,30,31].
Figure 8. Drone detection results at altitude of 20 m and distance of 10 m: (a) Reis et al. [30]; (b) Zhang et al. [13]; (c) Bae et al. [31]; (d) Yoon et al. [16]; (e) NSSRD.
Figure 9. Drone detection results at altitude of 20 m and distance of 60 m: (a) Reis et al. [30]; (b) Zhang et al. [13]; (c) Bae et al. [31]; (d) Yoon et al. [16]; (e) NSSRD.
Figure 10. Drone detection results at altitude of 40 m and distance of 60 m in environments affected by clouds: (a) Reis et al. [30]; (b) Zhang et al. [13]; (c) Bae et al. [31]; (d) Yoon et al. [16]; (e) NSSRD.
Table 1. FLOPs per model. (Module usage per model follows the descriptions in Section 3.2.)

| Model | Grid Partitioning | Preprocessing (Bilateral Filter) | Resolution Enhancement (SRCNN) | Detection (YOLO-v8) | FLOPs |
|---|---|---|---|---|---|
| Reis et al. [30] | | | | ✓ | 28G |
| Zhang et al. [13] | ✓ | | | ✓ | 252G |
| Bae et al. [31] | ✓ | ✓ | | ✓ | 259G |
| Yoon et al. [16] | ✓ | | ✓ | ✓ | 307G |
| NSSRD | ✓ | ✓ | ✓ | ✓ | 34G |