Real-Time Fire Detection: Integrating Lightweight Deep Learning Models on Drones with Edge Computing

Titu, Md Fahim Shahoriar; Pavel, Mahir Afser; Michael, Goh Kah Ong; Babar, Hisham; Aman, Umama; Khan, Riasat

doi:10.3390/drones8090483

Open AccessArticle

Real-Time Fire Detection: Integrating Lightweight Deep Learning Models on Drones with Edge Computing

by

Md Fahim Shahoriar Titu

¹

,

Mahir Afser Pavel

¹

,

Goh Kah Ong Michael

^2,*

,

Hisham Babar

¹

,

Umama Aman

¹

and

Riasat Khan

^1,*

¹

Electrical and Computer Engineering, North South University, Dhaka 1229, Bangladesh

²

Faculty of Information Science & Technology, Multimedia University, Melaka 75450, Malaysia

^*

Authors to whom correspondence should be addressed.

Drones 2024, 8(9), 483; https://doi.org/10.3390/drones8090483

Submission received: 1 July 2024 / Revised: 1 August 2024 / Accepted: 3 August 2024 / Published: 13 September 2024

(This article belongs to the Special Issue Drones for Wildfire and Prescribed Fire Science)

Download

Browse Figures

Versions Notes

Abstract

:

Fire accidents are life-threatening catastrophes leading to losses of life, financial damage, climate change, and ecological destruction. Promptly and efficiently detecting and extinguishing fires is essential to reduce the loss of lives and damage. This study uses drone, edge computing, and artificial intelligence (AI) techniques, presenting novel methods for real-time fire detection. This proposed work utilizes a comprehensive dataset of 7187 fire images and advanced deep learning models, e.g., Detection Transformer (DETR), Detectron2, You Only Look Once YOLOv8, and Autodistill-based knowledge distillation techniques to improve the model performance. The knowledge distillation approach has been implemented with the YOLOv8m (medium) as the teacher (base) model. The distilled (student) frameworks are developed employing the YOLOv8n (Nano) and DETR techniques. The YOLOv8n attains the best performance with 95.21% detection accuracy and 0.985 F1 score. A powerful hardware setup, including a Raspberry Pi 5 microcontroller, Pi camera module 3, and a DJI F450 custom-built drone, has been constructed. The distilled YOLOv8n model has been deployed in the proposed hardware setup for real-time fire identification. The YOLOv8n model achieves 89.23% accuracy and an approximate frame rate of 8 for the conducted live experiments. Integrating deep learning techniques with drone and edge devices demonstrates the proposed system’s effectiveness and potential for practical applications in fire hazard mitigation.

Keywords:

artificial intelligence; deep learning; edge computing; knowledge distillation; Raspberry Pi 5; real-time fire detection; you only look once

1. Introduction

Fire is a specific chemical reaction produced by combusting wood or paper, which emits light and heat [1]. It is a double-edged sword, offering benefits and drawbacks for civilization and the environment [2]. While fire has provided significant advantages and shaped ecosystems over time, human activities have altered its role, resulting in positive and negative impacts on society and the natural world. For civilization, fire is a vital natural asset contributing to human well-being by providing warmth, light, and protection [3]. Economically, fire’s effects can be both beneficial and detrimental: it can revitalize ecosystems, and yet uncontrolled fires can cause severe environmental problems leading to climate change and global warming.

Fire incidents, which cause significant damage to life and property, are a problem worldwide. Various factors, including electrical faults, human negligence, and natural causes can trigger fires. These disasters not only result in losses of life but also have severe financial repercussions. Globally, fires have a significant influence on lives and economies. Over the past 40 years, there have been approximately 2805 fatalities and over 8000 injuries, poignantly affecting approximately 7 million people. Fires also have influence global trends. A fire accident in a specific industry or domain can thus have a knock-on effect on the global market supply chain, as can be seen, for example, in the “Amazon forest fire” calamity.

Modern megacities, characterized by skyscrapers, dense urban forests, and industrial zones, are particularly susceptible to rapid fire outbreaks [4]. The conflagration-based life-threatening catastrophe poses severe threats to community security and societal progress. This scenario necessitates the development of automated fire detection and prevention systems utilizing advanced sensing equipment, drone technology, and artificial intelligence (AI) techniques [5]. Traditionally, fire safety relied on direct observation, followed by implementing basic alarm systems. Contemporary advancements have led to the integration of sprinkler systems with smoke and heat detectors, computer vision, and drone-based automatic fire detection [6]. The modern automatic fire detection system minimizes the danger of injury, fatalities, and property damage. Furthermore, it allows for higher accuracies, rapid discoveries, and quick responses from local authorities.

Artificial intelligence (AI) helps computers operate in a way that emulates human abilities. AI edge systems [7] refer to AI applications managed by machines capable of performing various tasks in the physical world. These systems work via a multi-step approach. In this process, AI systems learn from past incidents and improve over time. The utilization of AI continues to grow across a wide range of applications, significantly affecting daily tasks, jobs, and organizational operations. The primary reason for the comprehensive adoption of AI is its ability to perform lessons rapidly and accurately, often surpassing human capabilities. AI robots are also employed in hazardous jobs, such as defusing bomb, reducing hazards to human life. Therefore, the use of AI is expected to grow, driven by advancements in task management, resource allocation, and cost efficiency.

Conventional fire extinguishing techniques with sensors that detect heat or smoke have certain limitations. These sensors can be susceptible, leading to a limited detection range, slow detection, and false alarms triggered by small-scale sources like cigarette smoke or candle flames. This unreliability necessitates a more accurate and prompt fire detection system. Automatic fire detection using AI addresses this need by providing noncontact and effective sensing, non-human intervention, and the precise and rapid identification of fire incidents. An effective fire detection system can significantly reduce the loss of human life, environmental factors, and property by enabling a prompt response.

The emerging system for accurate real-time fire detection using AI incorporates modern techniques for effective fire identification. This research aims to achieve deep learning as well as edge device- and drone-based automatic fire detection using images of various fire scenarios, including indoor, outdoor, forest, and natural fires. The major contributions of the proposed automatic fire detection study are illustrated below.

A significant contribution of this work is the development of a comprehensive dataset comprising 7187 images by combining multiple open source fire events datasets. The labeling process is carried out precisely using advanced tools, LabelImg and Roboflow API, involving human labeling, sorting, renaming, and carefully classifying each picture. Data augmentation techniques have been employed to increase the size and diversity of the dataset.
Advanced deep learning models such as Detection Transformer (DETR), Detectron2, YOLOv8, and Autodistill-built knowledge distillation techniques have been applied for automatic fire detection. YOLOv8m has been used as the teacher (base) model and YOLOv8n and DETR are employed as the distilled student models for the knowledge distillation approach. Various metrics, precision, recall, mean average precision (mAP), intersection over union (IoU), loss metrics, and overall accuracy are demonstrated to verify the reliability and effectiveness of the applied models.
The proposed automatic fire detection system has been assembled into a robust hardware setup comprising a Raspberry Pi 5 microcontroller, a DJI F450 drone, and a Raspberry Pi camera module 3. The lightweight YOLOv8n technique has been deployed into the microcontroller and drone system for real-time fire detection.
The novelty of this work is integrating a lightweight knowledge distillation-based deep learning technique with a Raspberry Pi 5 edge device and a drone for instant real-time fire detection utilizing a comprehensive fire events dataset.

Section 2 discusses related articles in fire detection and computer vision on edge devices. Section 3 elaborates on the proposed system, detailing its software and hardware components, the created dataset, and the applied deep learning models for automatic fire detection. Section 4 reviews the simulation and hardware results of the proposed fire extinguishing system. Finally, the conclusions and potential future enhancements for this system in fire detection and monitoring are illustrated in Section 5.

2. Related Works

Significant efforts have been initiated to mitigate the damage and consequences of fire accidents, which lead to the destruction of human habitats and natural ecosystems, environmental pollution, soil erosion, and other adverse effects. Fire hazard surveillance systems encompass traditional watchtower-based human supervision, sensor-built heat, smoke, and fumes detection, and recent advancements in artificial intelligence for automatic identification. The efficiency of earlier methods is limited due to their confined detection range, lower accuracy, elevated false alarm rates, and slow response times. In the following paragraphs, related articles on fire accident detection systems are briefly discussed.

2.1. Sensor-Based Fire Detection

Park et al. [8] developed a fuzzy logic system to enhance the reliability of Internet of Things (IoT)-based fire detection systems by recognizing fire signal patterns. The authors analyzed the characteristics of fire signals and created a fuzzy logic system capable of identifying such patterns, thereby reducing false alarms and enabling the early detection of fires. The system comprises several components, including flame, smoke, and temperature sensors; multi-sensor nodes; wireless communication modules; a server; a control room interface; an Internet network; a CCTV system; and speakers for audio alerts. Baek et al. [9] employed automatic algorithms, sensor network configurations, and data analysis methods for fire detection. The authors utilized real-life fire sensor data from the NIST repository to evaluate the implemented system. Sensors were installed in a manufactured home, and data were collected during real-time fire scenarios. The system’s performance was compared with existing methods using MATLAB to detect fires across different scenarios.

2.2. Computer Vision and AI-Based Fire Detection without Embedded Deployment

Biswas et al. [10] presented a novel deep learning model to detect fire and smoke. Leveraging an open source dataset, the authors applied the Inception-V3 model by integrating a novel optimization function. The improved Inception-V3 technique attained approximately 96% accuracy and 0.96 specificity with comparatively fewer epochs and reduced the computational cost. Wang and others [11] introduced a decoder-free fully transformer (DFFT) approach for early smoke and fire prediction, aiming to enhance the detection accuracy. This study combined publicly available datasets with custom-curated samples. Various baseline models were implemented and evaluated on these datasets. The applied DFFT model accomplished outstanding efficiencies in the detection task with mAP coefficients of 87.40% and 81.12% on the respective smoke and fire datasets.

Shamta et al. [12] designed a forest fire surveillance system, employing deep learning techniques and a quad-rotor drone. The authors used the YOLOv8 and a combined CNN-RCNN model for fire detection and image classification, respectively. The applied YOLOv8 model attained a 0.96 mAP score for fire detection, and the CNN-RCNN framework achieved 96% classification accuracy. Avazov et al. [13] introduced novel fire detection techniques for aquatic transport vehicles utilizing the YOLOv7 technique and a dataset comprising more than 4.6k images with extensive data augmentation modalities. Various YOLOv7 models, including YOLOv7, YOLOv7-W6, YOLOv7-tiny, YOLOv7-X, YOLOv7-E6, and YOLOv7-D6, were implemented across multiple tasks. The YOLOv7 technique was able to attain a superior performance at 0.81 mAP with 50% IoU and 0.93 F1 coefficient.

Sathishkumar and others [14] utilized multiple pre-trained deep learning-based convolution neural networks (CNN) with an additional (learning without forgetting) LwF technique. The VGG16, InceptionV3, and Xception models with learning without forgetting technique obtained accuracies of 95.46%, 97.01%, and 98.72%, respectively. Saydirasulovich and their members [15] devised an advanced fire detection method with YOLOv6, YOLOv3, and Faster R-CNN deep learning algorithms with a private dataset of 4000 samples. The applied YOLOv6 performed best, with a 0.43 F1 score and 39.50% mAP. Geng and their co-authors [16] developed the FocalNext Network, an efficient algorithm for overcoming noise in feature extraction, complexity, and deployment on resource-constrained devices. FM-VOC dataset with 18,644 images was labeled manually. YOLOFM notably improved the baseline network’s accuracy, recall, F1 score, mean average precision at 50% intersection over union (mAP50), and mean average precision from 50% to 95% (mAP50-95) by 3.1%, 3.9%, 3.0%, 2.2%, and 7.9%, respectively.

2.3. Drone-Based Fire Detection

Nusrat et al. [17] designed an uncrewed aerial vehicle that can fight fire employing Pixhawk PX4 (for controlling the drone), Arduino Nano R3 (for managing multiple sensors and is linked to the NodeMCU) and NodeMCU ESP8266 (for data processing). The authors conducted exhaustive real-time experiments in an open space in Dhaka, Bangladesh, with the devised drone integrated with a first-person view (FPV) camera and multiple gas sensors. The hardware drone effectively monitored gas concentrations and extinguished various sources of fires. Choutri et al. [18] constructed a cost-effective drone with a Pixhawk embedded device for automatic fire detection and the corresponding location’s position identification. The authors developed a combined sample of fire event images from three open source datasets. The applied YOLO-NS model effectively classified fire images with an F1 score and a mAP of 0.68 and 80%, respectively. Manoj and Valliyammai [19] investigated the efficiency of different pretrained neural network models for automated forest fire detection. The authors devised a multi-agent-based robotic system to track the location of the fire. The GoogleNet model achieved the best classification output with 96% accuracy and a 0.97 F1 score.

After reviewing the literature on automatic fire detection systems, it is evident that various state-of-the-art artificial intelligence techniques and complex sensing devices have been employed. Table 1 summarizes the major advantages and disadvantages of related works on automatic fire detection. However, a limited number of these studies successfully deployed such classifiers on memory-constrained edge devices and integrated them into efficient drone or robotic systems for real-time fire incident identification. While some research explored similar directions, our work aims to bridge this gap by implementing lightweight knowledge distillation-based deep learning models that are particularly optimized for deployment on drone-integrated edge devices, ensuring high accuracy and operational efficiency in resource-constrained environments.

3. Materials and Methods

3.1. Dataset

3.1.1. Dataset Collection

A significant contribution of this study is developing a comprehensive fire dataset combining multiple public repositories [20,21]. The proposed dataset contains various fire-related images of two categories (fire and no fire): forest and outdoor, vehicle, industrial, accidental, and residential fires. Additionally, we incorporated drone-view images of fire-related scenes, considering the importance of aerial perspectives. This selection was inspired by a previous study [22], highlighting the requirement for non-fire (standard) images for comparison. Figure 1 shows some sample images of the proposed fire dataset employed in this research. Consequently, the dataset comprises 7187 images of various fire incidents, with 3738 and 3449 images for the fire and no fire categories, respectively.

Table 2 provides a detailed breakdown of the comprehensive fire dataset used in this work, including scene descriptions, image features, the number of images, target quantity, and target sizes for various fire categories.

3.1.2. Dataset Preprocessing

Upon completing the data collection phase, the preprocessing stage has been performed to render the data suitable for system integration. This stage encompasses data cleaning, normalization, annotation, splitting (8:1:1), and augmentation processes. We first consolidated all fire-related images and manually sifted through them to eliminate poor-quality or distorted images. Following this meticulous screening process, our dataset was refined to 7187 images, comprising 3738 fire images and 3449 non-fire images. Table 3 presents the number of training images of fire and no fire categories before and after augmentation.

In this work, only the fire-related images are labeled accordingly. The annotation process is facilitated by Roboflow software (Version 1.1.37), which is renowned for its dataset annotation and labeling capabilities. The labeled datasets of fire samples are shown in Figure 2. Image annotation is performed through a bound box or segmentation mask. After completing the annotation, all annotated images are manually checked to ensure that no image is incorrectly or partially annotated.

The dataset was stratified into three segments for model training purposes: 80% designated for the training set, 10% for the validation set, and 10% allocated to the test set. After the annotation phase, we proceeded with data augmentation to increase the diversity of the training dataset. The employed augmentation techniques and corresponding parameters are detailed in Table 4. Following augmentation, the total count of labeled fire images reached 7607, while non-fire images were 5866.

3.2. Applied Deep Learning Models

In this work, various deep learning models, such as YOLOv8, Detection Transformer (DETR), and knowledge transformer, have been used for automatic fire identification. For knowledge distillation, YOLOv8m has been employed as the base (teacher) and the YOLOv8n and DETR are applied as the lightweight student models with reduced parameters and faster inference time to deploy in the Raspberry Pi 5B edge device for real-time fire detection.

3.2.1. Detectron2

For object detection, segmentation and other visual recognition tasks, Detectron2 is a powerful and faithful platform [23]. This model was developed by the Facebook (meta) AI Research team and succeeded in the original detection and camouflage test. Detectron2 provides a diverse collection of baseline results and pretrained models in the Model Zoo, including Faster R-CNN, Mask R-CNN, RetinaNet, and Cascade R-CNN. Widely embraced in computer vision research and production applications, Detectron2 continues to make its mark in the field. It has become trendy in computer vision research and has some practical applications. The Detectron2 deep learning module efficiently detects the targeted class with the following design features. Backbone network: This component extracts features from input images. Task-specific heads: These heads handle tasks like bounding box regression and mask prediction. Detectron2 also boasts panoptic segmentation capabilities.

3.2.2. DETR

Detection transformer (DETR) is a deep learning model for object detection [24]. It was designed especially for natural language processing problems. However, its main component was to address the object detection problem uniquely and effectively. DETR can process parallel ways that increase the inference time. DETR has a self-attention mechanism that can capture the relationship between objects and spatial contexts, but it requires more computational resources. The applied DETR model’s number of parameters is summarized in Table 5.

3.2.3. YOLOv8

You Only Look Once (YOLO) is a groundbreaking object detection algorithm in computer vision models [25]. Unlike traditional object detection methods, YOLO processes the entire image into one pass, making it incredibly fast. It achieves this efficiency by dividing the input images into a grid and directly predicting the bounding boxes and class probabilities from the grid cells. Despite its speed, YOLO has a high detection accuracy, which makes it widely adopted in various industries and research domains. YOLO has applications in both object detection and image segmentation tasks. In this study, two variants of YOLOv8 are used in the knowledge distillation technique, i.e., YOLOv8m and YOLOv8n as the baseline (teacher) and target (student) models, respectively. The implemented YOLOv8 model’s summary is described in Table 6.

3.2.4. Knowledge Distillation with Autodistill

Knowledge distillation [26] is an optimization technique that transforms knowledge from a large, heavy model to a small, tiny model. The goal is to create a simple model capable of giving a closer performance to the large model. It requires two or more models, i.e., teacher (base) and student. At first, the teacher model must be trained on the dataset to gain knowledge. It achieves very high accuracy but is also computationally expensive and memory-intensive. The teacher model generates soft targets used in the distillation process. The student model learns to mimic the teacher’s prediction, comprising a merged loss function. It has a standard cross-entropy loss or student loss with the hard label and a distillation loss with the generated soft targets to guide the student model learning process. The results obtained from the student model are similar to the teacher model but with fewer parameters, making it efficient and suitable for deployment on resource-constrained devices. Model compression, upgraded inference speed, and reduced memory usage are the benefits of the knowledge distillation process.

Autodistill is an optimization technique that uses image datasets with large models to create datasets and transfers them to smaller, lightweight edge models with reduced parameters and a faster inference time for computer vision tasks [27]. Labeling images is one of the most time-consuming tasks for detecting multiple objects. The Autodistill process involves base (teacher) and target (student) models. The base model takes unlabeled data to create a dataset by using ontology. The ontology prompts the base model with captions and describes the dataset that the target model will predict. After the base model generates a dataset, target models consume it and become distilled models. The output models will become ready for deployment in memory-constrained edge devices. Distilling the knowledge into the target models illustrates smaller and faster models with the visibility of the training data. Consequently, training with zero annotated images reduces the dependency on manual labeling.

Figure 3 shows the Autodistill process and selected models. In this work, YOLOv8m is used as the base model to label the zero annotated images. At first, a pretrained YOLOv8m model has been used to annotate images. The base model generates a bounding box-labeled dataset for target models. After processing the dataset, the target models YOLOv8n and DETR became the distilled model in this work. Finally, the YOLOv8n distilled model has been deployed in the proposed Raspberry Pi 5 edge deceive for fire detection.

3.3. Tuneable Hyperparameters

Hyperparameters are settings and configurations that can control the training state of the model. Finding the appropriate set of parameters may impact the performance and efficiency of the models. These hyperparameters have been set by employing the ray tune optimization tool before the training begins and guides the learning process. It also aids the model to prevent overfitting and increase generalization. Table 7 lists the hyperparameters used in the Autodistill process of the applied knowledge distillation technique.

3.4. Hardware Implementation of the Proposed Fire Detection System

Hardware Equipment for the Proposed Drone Design

Drone: In this work, a DJI F450 custom-built delicate drone frame constructed with lightweight and enhanced material has been used.
1.
Pixhawk: The Pixhawk is a versatile, open source flight controller widely used for various applications such as copters, planes, and uncrewed aerial vehicles. The critical feature that highlights the flight controller is its robust sensors and processors that can reduce instability and have multiple interfaces for GPS and I2C devices. Due to the built-in accelerometers, gyroscopes, and barometers, it can maintain stability and give precise flight control. In this research, the proportional–integral–derivative (PID) control system has been implemented to help the proposed drone maintain stability by adjusting the control output from the remote control. The adjustment is based on the difference between the desired and actual performance. Through the PID control system, various weather and flight conditions cannot reduce the drone’s flight performance. Hence, the Pixhawk ensures robustness and reliability in numerous flight circumstances.
2.
GPS module: A NEO 7M global positioning system (GPS) module has been employed for the drone’s positioning and wireless communication. With the help of this module, the drone can be operated in different modes, i.e., manual, altitude, and loitering. The altitude hold mode has been used to provide more accessible commands and the better control of the designed drone.
3.
30A ESC: The 30A Electronic Speed Controller (ESC) uses the power supply to give the motors precise instructions to accelerate, spin, and change directions accordingly.
4.
BLDC Motors: This work uses four 1000 kV Brushless Direct Current (BLDC) motors to hover the drone accordingly. These motors are placed on top of each arm, and after installing, ensures that the screws do not touch motor coils. Each motor draws around 2A while suspended in the air, so it draws 8A when hovered around.
5.
Radio telemetry: It is a popular wireless communication module used in many applications such as IoT, sensor networks, and RC systems. The module is low-cost and consumes little power. The module has a receiver, which enables two-way communication with the drone. Using the module, data transmission over a long distance becomes easy. In this research, the low-power consumption NRF24L01 module has been used because it is suitable for the Li-Po battery. Many software can be used to cooperate with this module, e.g., ArduPilot (Version 3.4.0) and QCGround Control (Version v4.2.3). The whole operation can be monitored instantaneously using these software tools. The NRF24L01 telemetry kit is an open source software enabling users to plan their flights and show real-time data. It can show the drone’s altitude, ground speed, and vertical speed.
6.
FlySky FS-i6: In this work, the remote controller FlySky FS-i6 is used. It is known for its reliability, range, and ease of use. One key feature is its fail-safe function, which adds a layer of safety for the user. This controller consists of six channels in the receiver, and the first four channels are used for primary communication, while channels five and six act as auxiliary components. Each channel of the receiver will be for respective motors. Channel 1 controls the drone’s flight, which will regulate the longitudinal motion of the drone. Channel 2 manages the elevator, which maintains the lateral axis in the air. Channel 3 regulates the drone’s throttle, and channel 4 is the rudder that controls the drone’s vertical movement. There is a connection between the flight controller and the receiver with an antenna.
After carefully setting up and assembling the drone, the calibration stage of the flight controller is up next before initiating its flight. In this stage, the drone has been placed on a flat surface, and then all the different values are set to zero using a trim key. Afterward, the receiver and transmitter are similarly bound, where the red light glows solid. The glowing red indicates that the calibration system is ready for the test flight.
Raspberry Pi 5B: Raspberry Pi is a series of small, affordable single-board computers. It features a compact design, such as CPU, memory, I/O ports, and other supporting peripheral devices that run on a Linus-based operating system. In this research, Raspberry Pi 5B 8 GB has been used to deploy computer vision models for instantaneous fire detection. In addition, a Pi camera module 3 has been utilized to allow the system to capture real-time images and videos.
Pi Camera: In this proposed hardware system, a Raspberry Pi camera module 3 is used as it is highly compatible with the Raspberry Pi model 5B. The camera module uses a 12-megapixel Sony IMX708 sensor. Consequently, it improves low-light performance, autofocus, and support for HDR.

Figure 4 shows the proposed hardware design of the proposed fire detection drone with various components. The entire system, including the DJI F450 quadcopter drone, Raspberry Pi and Pi camera, costs approximately USD 600 to build.

The overall hardware setup and connection diagram of the proposed automatic deep learning-based fire extinguishing drone have been presented in detail in Figure 5 and Figure 6, respectively. The DJI F450 drone is powered by the 3S 2200 mAh Li-Po battery, which would provide a flight time of approximately 12 min, judging the individual current draw of each motor at peak thrust (10 A). The whole connection is booted via an XT60 connector, which would handle a 60 A current maximum. The flight controller draws the battery power via the power supply module, operating on only 5 V. Next, the radio receiver draws the necessary voltage from the flight controller via a common power bus. The power supply module has two objectives, i.e., to provide power and monitor the battery voltage and current. There is a built-in PCB power distribution board in the drone’s chassis. The PCB helps to supply the battery power to all the ESCs that run on 12 V. Through the help of ESCs, all the BLDC motors run with a nominal voltage of approximately 11.3 V. Lastly, the Raspberry Pi 5B module consumes battery power. This portion of power goes through the LM2596 DC–DC converter. The converter decreases the voltage significantly so the Pi module can work smoothly.

Figure 7 shows how the drone is connected with various components. The Pixhawk flight controller helps the GPS module and telemetry kit to share a connection. The location of the drone can be set using the GPS module, and the telemetry kit allows the user to have a two-way communication system with the flight controller. The Raspberry Pi is connected to the Pi camera, which records the data in the storage. Finally, the captured data are transmitted to the deployed deep learning model (YOLOv8n), which enables real-time fire detection.

Figure 8 illustrates the proposed automatic fire detection system’s working progressions, which involve the following tasks. The proposed automatic fire detection system begins with defining research objectives and assembling a dataset of fire incident images labeled and augmented using tools like LabelImg and Roboflow API. Advanced deep learning models such as DETR, Detectron2, and YOLOv8 are selected and enhanced through knowledge distillation, followed by training on Google Colab Pro. The trained model, YOLOv8n, is then integrated into a Raspberry Pi 5 microcontroller and drone system for real-time fire detection. The system processes video feeds from the drone, detecting fires and sending notifications to aid in prompt hazard management.

4. Results and Discussion

This section thoroughly evaluates the proposed AI and drone-based fire detection system. To create a comprehensive dataset, we collected various fire-related images from related articles and public repositories, including vehicle, forest, accident and indoor fire incidents. The dataset is classified into target classes for image classification and divided into training, validation and testing sets with a ratio of 8:1:1. Training and evaluation strategies for the applied YOLOv8, DETR, and Detectron2 models are summarized below.

The loss function is essential for identifying errors or discrepancies in the model’s learning process. YOLOv8 combines binary cross-entropy loss for classification and complete intersection over union loss for bounding box regression. DETR uses a set-based global loss that enforces unique predictions via bipartite matching, including classification and bounding box losses. Detectron2 combines classification loss, bounding box regression loss, and mask loss, for instance, segmentation.
The optimization balances the prediction relation against the loss function to determine the best input weights. In this work, Adam and AdamW optimizers are used to train the YOLOv8 and DETR models, respectively.
YOLOv8 and DETR techniques are trained for 50 epochs. Each model has been configured with a batch size of 16.
Various evaluation metrics, like precision, recall, mean average precision, and intersection over union, are used to assess the model’s performance for classification accuracy in YOLOv8, Detectron2, and DETR.
Annotated images in COCO JSON and TXT formats are used to train each model. Finally, the distilled YOLOv8n model has been implemented on a Raspberry Pi 5 8 GB, mounted on a drone for real-time fire detection.

4.1. Performance of the Undistilled Models (without Knowledge Distillation)

The performances of the undistilled models, YOLOv8, DETR, and Detectron2, are described in the following paragraphs.

4.1.1. Detectron2: Undistilled

Figure 9 illustrates the classification accuracy of the Detectron2 model with the change of training iterations. The accuracy shows a sharp increase during the initial iterations, followed by a more gradual improvement and stabilization as the number of iterations increases. The training and validation accuracy closely track each other, indicating good generalization performance of the applied Detectron2 model. Towards the end of the training process, the model reaches its maximum accuracy, close to 0.94.

The classification losses vs. training iterations for Detectron2 are depicted in Figure 10. This figure indicates the loss values’ fluctuation ranging from 0.15 to 0.55. This graph highlights the Detectron2 model’s ability to reduce classification error over time, signifying improved performance in fire detection tasks.

4.1.2. YOLOv8n: Undistilled

Figure 11 depicts precision, recall, and F1 score vs. confidence of the YOLOv8n baseline model. A confidence level of 0.934 is ideal for fire detection systems as it ensures perfect accuracy for all classes, minimizes false positives and enables reliable prediction for timely response to potential fire hazards. With a confidence level of 0 and a recall of 0.89 for all classes, this curve indicates that this model can identify most fire events. Therefore, this balanced approach is crucial for effective fire detection and response.

Various performance metrics, e.g., training and validation losses, precision, recall, and mAP of the YOLOv8n model, are illustrated in Figure 12. The metrics demonstrate a consistent decrease in losses over the training epochs and an improvement in precision, recall, and mAP, indicating improved model performance and accuracy.

Table 8 summarizes the performance of the YOLOv8n model after training 50 epochs. The model achieves an accuracy of 82%, with a precision of 0.934, recall of 0.89, and F1 score of 0.91. These results suggest that the YOLOv8n model is effective for automatic fire detection.

4.1.3. DETR: Undistilled

Figure 13 illustrates different training and validation losses of the DETR model with the change in the number of iterations. According to this figure, the bounding box and cross-entropy validation and training losses depict a steady decline, indicating an improvement in model performance.

The summary of the DETR model’s performance for the proposed fire detection system is presented in Table 9. The model has achieved a training cross-entropy loss of 0.602 and a validation cross-entropy loss of 0.914, indicating a reasonable fit to the training fire scenario samples. The training and validation bounding box losses are 0.055 and 0.073, respectively, demonstrating effective localization performance. Additionally, the model obtains an average precision of 41.5%, an average recall of 78.2%, and an accuracy of 73%, reflecting a balanced detection capability.

4.2. Performances of the Knowledge Distillation Model

In this work, knowledge distillation models employing the Autodistill process composed of base (teacher) and target (student) models are applied for automatic fire detection. In the knowledge distillation model for fire event detection, YOLOv8m is chosen as the teacher model due to its higher accuracy and moderate size, providing reliable and precise detection capabilities. YOLOv8n and DETR are selected as the student model because of their smaller size and faster inference speed, making them suitable for deployment in the employed Raspberry Pi 5 resource-constrained environments.

4.2.1. YOLOv8m: Teacher

Figure 14 presents the precision and recall vs. confidences of the YOLOv8m base model. The precision–confidence curve indicates that, as the confidence threshold increases from 0.0 to 1.0, the precision rises from near 0.0 to approximately 1.0 for the fire class. At lower confidence thresholds, precision is lower due to a higher number of false positives. As the threshold rises, the model becomes more selective in improving precision. Notably, at a confidence threshold of 0.925, precision reaches 1.00. Similarly, Figure 14a shows how recall decreases as the confidence threshold increases for the fire class. At the lowest confidence threshold (0.0), recall is highest, indicating that the model identifies 93% of fire instances. As the confidence threshold approaches 1.0, recall steadily drops to near 0.

Figure 15 refers to the training and validation loss curves and metrics of the YOLOv8m base model. The YOLOv8m model for the fire detection task shows strong performance with an mAP coefficient of 91.81% at a 50% IoU threshold. This model is highly effective at correctly detecting and localizing fires. The precision of 92.5% reflects the accuracy of the fire detections, meaning that a high proportion of the predicted fires are actual fires. The recall of 93% indicates the model’s ability to detect most of the fires. Finally, the applied YOLOv8m teacher (base) model attains an accuracy of 94%, showing a good balance between precision and recall.

Table 10 presents the performance summary of the YOLOv8m teacher model for the proposed fire events detection, demonstrating an impressive accuracy of 94.11% and a precision of 92.5%. Additionally, the model achieves a recall of 93%, an F1 score of 0.927, and a mean average precision (mAP) of 91.81%, indicating robust detection capabilities.

4.2.2. YOLOv8n: Student

Figure 16 presents the precision and recall evaluation of the distilled YOLOv8n student model. The model demonstrates the enhanced performance compared to the base YOLOv8m and undistilled YOLOv8n models. The precision–confidence curve for the distilled YOLOv8n model shows that, as the confidence threshold increases from 0.0 to 1.0, precision significantly improves, achieving a value of 1.00 at a confidence threshold of 0.989 for all classes. Conversely, at lower confidence thresholds, precision is lower due to a higher number of false positives. As the threshold increases, the YOLOv8n model becomes more selective, significantly improving precision.

Figure 17 illustrates the performance metrics of the YOLOv8n student model across different epochs, showing a consistent decrease in training and validation losses (box, classification, and DFL losses) as the epochs progress. Concurrently, the precision, recall, mAP50, and mAP50-95 metrics notably improve, indicating enhanced model accuracy and effectiveness over time.

4.2.3. DETR: Student

Figure 18 illustrate the evaluation of the distilled DETR model, demonstrating significant improvements over the baseline model. The horizontal axis represents the total number of steps, while the vertical axis denotes metric or loss values. During training, graphs show fluctuations due to the trial-and-error nature of model training and numerous peaks indicating the model moments of challenge in classifying objects. Despite these fluctuations, there is an overall decrease in loss, indicating that the model is learning to predict object locations more accurately over time. However, the general downward trend in this graph suggests that the model is gradually improving its classification abilities, overcoming initial challenges, and becoming more adept at distinguishing between different objects. In the evaluation phase, graphs (a) reflect the model’s accuracy in predicting object locations and measuring the model’s success in classifying objects. While the model evaluated images, the loss decreased consistently, suggesting improved localization capabilities. These graphs show a sharp initial decline, indicating a rapid improvement in classification accuracy. As the model continues, the loss stabilizes at much lower values than the initial ones, demonstrating the model’s enhanced proficiency in classification by the end of the evaluation phase.

Validation and training losses across the number of iterations for the DETR student model applied to the proposed fire detection task are depicted in Figure 18. Figure 18a,b show the validation bounding box and cross-entropy losses, respectively, both demonstrating a downward trend, indicating improved model performance with more iterations. As expected, the training bounding box and cross-entropy losses show gradually decreasing fluctuations, reflecting the model’s learning process and convergence over time.

Table 11 shows the applied distilled student models’ performance in detecting fire images. The YOLOv8n model achieves a remarkable mean average precision of 93.31% at an IoU threshold of 50%, demonstrating excellent accuracy in detecting and localizing fires. With a precision of 98.9%, the model accurately identifies the majority of actual fires in its detection while boasting a recall rate of 98%, showcasing its proficiency in detecting most fires present. The overall accuracy of 95.21% reflects a well-balanced performance between precision and recall. Conversely, the DETR model achieves a satisfactory accuracy (81.10%) and F1 score (0.806) and a degraded mAP of 71.4% at the same IoU threshold.

Figure 19 illustrates the accuracy improvements achieved through model distillation for both DETR and YOLOv8 models. The baseline (undistilled) DETR model achieves an accuracy of 73%, which increases to 81% after distillation, employing the advanced Autodistill technique. Similarly, the baseline YOLOv8 model shows an accuracy of 82%, which significantly improves to 95.21% with the application of distillation techniques. This circumstance demonstrates the effectiveness of the applied knowledge distillation techniques in enhancing the performance of the proposed fire detection task.

4.3. Proposed Drone and Raspberry Pi-Integrated Hardware Device for Real-Time Fire Detection

Finally, the applied AI-based deep learning models and Raspberry Pi 5 edge computing integrated drone have been tested in real-time for instantaneous fire event detection. The applied YOLOv8n model has been deployed to the embedded device because of its superior performance. The experiments were performed in the Mirpur area of Dhaka, Bangladesh, in June 2024. The drone maintained an average height of 6.90 m (22.64 feet) and a flight time of approximately 10 min. Figure 20 depicts the various stages of the operation: the drone in flight, the fire on the ground, the ArduPilot control interface, and the live telemetry data showcasing the drone’s altitude, speed, and positioning over the target area.

The proposed AI-based fire detection drone system demonstrates an exceptional performance in real-time testing, as illustrated in Table 12. Out of 65 attempts, it correctly detects fires in 38 instances and accurately identifies the absence of fires (no fire) in 20 cases. Notably, there are five false positive detections where the system incorrectly identified a fire. This impeccable performance results in an impressive accuracy rate of 89.23%. The system’s ability to accurately detect fires while minimizing false alarms underscores its reliability and effectiveness in real-world scenarios. Overall, the proposed drone integrated with Raspberry Pi 5 and Pi camera module 3 is highly dependable for fire detection tasks, offering a valuable tool for ensuring safety and a prompt response in various environments.

Figure 21 demonstrates the real-time fire detection capability of the proposed drone system during live experiments. Frame rate, expressed in frames per second (FPS), illustrates the frequency at which consecutive images (frames) are captured or displayed in a video. The YOLOv8n model, deployed on a Raspberry Pi 5 with a Pi camera module 3, successfully identifies and labels multiple instances of fire with an average frame rate of 8 FPS, i.e., eight distinct frames are processed every second. The confidence levels of the drone change with its altitude and the presence of multiple fire sources.

Table 13 offers a comparative overview of the proposed AI-based fire detection system with similar existing works. Few previous studies implemented fire identification systems with embedded devices. To the best of our knowledge, this is the first time that a knowledge distillation-based AI technique has been integrated with a drone and a Raspberry Pi 5 edge device for instantaneous fire detection in this work.

4.4. Limitations of the Proposed System

The proposed drone system for real-time fire detection using AI and edge devices confers some limitations. The dataset employed in this work encompassed a wide range of fire images. Generalizing the findings to other environments may be challenging due to variations in fire scenarios, infrastructure, and environmental contexts. The detection performance of the implemented deep learning models could be affected by changes in weather conditions, lighting, and seasonal variations. Due to the use of a Raspberry Pi 5 edge device, the continuous maintenance of a 27-watt power supply and the average image resolution can be considered the limitations of the proposed system.

5. Conclusions

Efficient and prompt fire detection is necessary to reduce economic and environmental losses. This work implements an automatic fire detection system using advanced computer vision techniques on an edge device. This system employs a combined dataset of 7187 images of various fire scenarios. Various cutting-edge deep learning models have been applied, e.g., DETR, Detectron2, and YOLOv8, in the PyTorch framework. Next, the knowledge distillation approach is implemented with YOLOv8m as the teacher (base) model and YOLOv8n and DETR as distilled student models. Additionally, this study proposes a low-cost implementation of an advanced embedded system built with a Raspberry Pi 5B and Pi camera integrated into drones. The lightweight YOLOv8n model has been deployed into the Raspberry Pi edge device for instantaneous fire identification. Real experiments have been performed to investigate the effectiveness of the proposed AI and edge device-based fire detection system. Comprehensive testing and data collection were conducted to assess the model’s accuracy and overall performance. The results of the proposed AI-based automatic fire detection study illustrate the potential to assist relevant authorities in promptly identifying and mitigating fire hazards.

In the future, the proposed model can be deployed into a string smartphone application platform to monitor real-time fire identification. Heterogeneous sensors comprising images, temperature, gas, and flame can be integrated with the AI-based device for robust fire detection. Open-world learning techniques can be implemented to reduce false positives and improve the system’s robustness.

Author Contributions

Conceptualization, M.F.S.T. and R.K.; Methodology, M.F.S.T.; Software, M.F.S.T. and M.A.P.; Validation, M.F.S.T. and M.A.P.; Investigation, H.B. and U.A.; Resources, G.K.O.M.; Data curation, M.A.P.; Writing—original draft, M.F.S.T. and M.A.P.; Writing—review & editing, G.K.O.M. and R.K.; Visualization, H.B. and U.A.; Supervision, R.K.; Funding acquisition, G.K.O.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by North South University, Dhaka, Bangladesh, grant number CTRG-22SEPS-03, and Multimedia University, Melaka, Malaysia, IR Fund MMUI/220041, in collaboration with the Postdoctoral Research Fellow Program.

Data Availability Statement

Data will be available upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Prétrel, H.; Vaux, S. Fire-induced flows for complex fire scenarios in a mechanically ventilated two-storey structure. J. Fire Sci. 2024, 07349041241256796. [Google Scholar] [CrossRef]
Scott, A.C.; Chaloner, W.G.; Belcher, C.M.; Roos, C.I. The interaction of fire and humankind: Introduction. Philos. Trans. R. Soc. B Biol. Sci. 2016, 371, 1–8. [Google Scholar] [CrossRef]
Morgan, D.R. World on fire: Two scenarios of the destruction of human civilization and possible extinction of the human race. Futures 2009, 41, 683–693. [Google Scholar] [CrossRef]
Shokouhi, M.; Nasiriani, K.; Cheraghi, Z.; Ardalan, A.; Khankeh, H.R.; Fallahzadeh, H.; Khorasani-Zavareh, D. Preventive measures for fire-related injuries and their risk factors in residential buildings: A systematic review. J. Inj. Violence Res. 2019, 11, 1–14. [Google Scholar] [CrossRef] [PubMed]
Partheepan, S.; Sanati, F.; Hassan, J. Autonomous Unmanned Aerial Vehicles in Bushfire Management: Challenges and Opportunities. Drones 2023, 7, 47. [Google Scholar] [CrossRef]
Khan, F.; Xu, Z.; Sun, J.; Khan, F.M.; Ahmed, A.; Zhao, Y. Recent Advances in Sensors for Fire Detection. Sensors 2022, 22, 3310. [Google Scholar] [CrossRef] [PubMed]
Singh, R.; Gill, S.S. Edge AI: A survey. Internet Things Cyber-Phys. Syst. 2023, 3, 71–92. [Google Scholar] [CrossRef]
Park, S.; Kim, D.; Kim, S. Recognition of IoT-based fire-detection system fire-signal patterns applying fuzzy logic. Heliyon 2023, 9, e12964. [Google Scholar] [CrossRef] [PubMed]
Baek, J.; Alhindi, T.J.; Jeong, Y.S.; Jeong, M.K.; Seo, S.; Kang, J.; Shim, W.; Heo, Y. Real-time fire detection system based on dynamic time warping of multichannel sensor networks. Fire Saf. J. 2021, 123, 103364. [Google Scholar] [CrossRef]
Biswas, A.; Ghosh, S.K.; Ghosh, A. Early Fire Detection and Alert System using Modified Inception-v3 under Deep Learning Framework. Procedia Comput. Sci. 2023, 218, 2243–2252. [Google Scholar] [CrossRef]
Wang, X.; Li, M.; Gao, M.; Liu, Q.; Li, Z.; Kou, L. Early smoke and flame detection based on transformer. J. Saf. Sci. Resil. 2023, 4, 294–304. [Google Scholar] [CrossRef]
Shamta, I.; Demir, B.E. Development of a deep learning-based surveillance system for forest fire detection and monitoring using UAV. PLoS ONE 2024, 19, e0299058. [Google Scholar] [CrossRef] [PubMed]
Avazov, K.; Jamil, M.K.; Muminov, B.; Abdusalomov, A.B.; Cho, Y.I. Fire Detection and Notification Method in Ship Areas Using Deep Learning and Computer Vision Approaches. Sensors 2023, 23, 7078. [Google Scholar] [CrossRef] [PubMed]
Sathishkumar, V.E.; Cho, J.; Subramanian, M.; Naren, O.S. Forest fire and smoke detection using deep learning-based learning without forgetting. Fire Ecol. 2023, 19, 9. [Google Scholar] [CrossRef]
Norkobil Saydirasulovich, S.; Abdusalomov, A.; Jamil, M.K.; Nasimov, R.; Kozhamzharova, D.; Cho, Y.I. A YOLOv6-Based Improved Fire Detection Approach for Smart City Environments. Sensors 2023, 23, 3161. [Google Scholar] [CrossRef] [PubMed]
Geng, X.; Su, Y.; Cao, X.; Li, H.; Liu, L. YOLOFM: An improved fire and smoke object detection algorithm based on YOLOv5n. Sci. Rep. 2024, 14, 4543. [Google Scholar] [CrossRef] [PubMed]
Jahan, N.; Niloy, T.B.M.; Silvi, J.F.; Hasan, M.; Nashia, I.J.; Khan, R. Development of an IoT-based firefighting drone for enhanced safety and efficiency in fire suppression. Meas. Control 2024, 2024, 00202940241238674. [Google Scholar] [CrossRef]
Choutri, K.; Lagha, M.; Meshoul, S.; Batouche, M.; Bouzidi, F.; Charef, W. Fire Detection and Geo-Localization Using UAV′s Aerial Images and Yolo-Based Models. Appl. Sci. 2023, 13, 11548. [Google Scholar] [CrossRef]
Manoj, S.; Valliyammai, C. Drone network for early warning of forest fire and dynamic fire quenching plan generation. EURASIP J. Wirel. Commun. Netw. 2023, 2023, 112. [Google Scholar] [CrossRef]
Cortez, P.; Morais, A. Forest Fires. UCI Machine Learning Repository. 2008. Available online: https://archive.ics.uci.edu/dataset/162/forest+fires (accessed on 1 June 2024).
Khan, A.; Hassan, B. Dataset for Forest Fire Detection. Mendeley Data. 2020. Available online: https://data.mendeley.com/datasets/gjmr63rz2r/1 (accessed on 1 June 2024).
Davis, M.; Shekaramiz, M. Desert/Forest Fire Detection Using Machine/Deep Learning Techniques. Fire 2023, 6, 418. [Google Scholar] [CrossRef]
Chincholi, F.; Koestler, H. Detectron2 for Lesion Detection in Diabetic Retinopathy. Algorithms 2023, 16, 147. [Google Scholar] [CrossRef]
Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-End Object Detection with Transformers. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; Springer: Cham, Switzerland, 2020; pp. 213–229. [Google Scholar]
Alruwaili, M.; Atta, M.N.; Siddiqi, M.H.; Khan, A.; Khan, A.; Alhwaiti, Y.; Alanazi, S. Deep Learning-Based YOLO Models for the Detection of People With Disabilities. IEEE Access 2024, 12, 2543–2566. [Google Scholar] [CrossRef]
Zhou, Q.; Wang, H.; Tang, Y.; Wang, Y. Defect Detection Method Based on Knowledge Distillation. IEEE Access 2023, 11, 35866–35873. [Google Scholar] [CrossRef]
Abluton, A. Knowledge Distillation for a Domain-Adaptive Visual Recommender System. In Proceedings of the International Conference of the Italian Association for Artificial Intelligence; Springer: Berlin/Heidelberg, Germany, 2023. [Google Scholar]
Dampage, U.; Bandaranayake, L.; Wanasinghe, R.; Kottahachchi, K.; Jayasanka, B. Forest fire detection system using wireless sensor networks and machine learning. Sci. Rep. 2022, 12, 46. [Google Scholar] [CrossRef] [PubMed]
Deng, L.; Chen, Q.; He, Y.; Sui, X.; Liu, Q.; Hu, L. Fire detection with infrared images using cascaded neural network. J. Algorithms Comput. Technol. 2019, 13. [Google Scholar] [CrossRef]
Talaat, F.M.; ZainEldin, H. An improved fire detection approach based on YOLO-v8 for smart cities. Neural Comput. Appl. 2023, 35, 20939–20954. [Google Scholar] [CrossRef]
Zheng, H.; Duan, J.; Dong, Y.; Liu, Y. Real-time fire detection algorithms running on small embedded devices based on MobileNetV3 and YOLOv4. Fire Ecol. 2023, 19, 31. [Google Scholar] [CrossRef]

Figure 1. Sample images of the employed fire detection dataset: (a) fire and (b) no fire.

Figure 2. Image annotation using Roboflow framework.

Figure 3. Architecture of the proposed Autodistill-based knowledge distillation technique.

Figure 4. Hardware design of the proposed fire detection drone with various components.

Figure 5. Overall hardware setup of the proposed drone integrated fire detection system.

Figure 6. Connection diagram of the proposed hardware system.

Figure 7. Circuit diagram of the drone network.

Figure 8. The working progressions of the proposed automatic fire detection system.

Figure 9. Classification accuracy of the Detectron2 model vs. training iterations.

Figure 10. Classification loss of the Detectron2 model vs. training iterations.

Figure 11. Evaluation results of YOLOv8n model.

Figure 12. Various performance metrics vs. epochs of the YOLOv8n model.

Figure 13. Various validation and training losses vs. number of iterations of DETR model.

Figure 14. Evaluation results of the YOLOv8m base (teacher) model.

Figure 15. Various performance metrics vs. epochs of the YOLOv8m base (teacher) model.

Figure 16. Evaluation results of YOLOv8m student model.

Figure 17. Various performance metrics vs. epochs of the YOLOv8n student model.

Figure 18. Various validation and training losses vs. number of iterations of DETR student model.

Figure 19. Accuracies of the baseline (undistilled) and distilled models.

Figure 20. Real-time experimental setup of the proposed fire detection drone.

Figure 21. Fire detection of the proposed drone in live experiments.

Table 1. Summary of related works on fire detection systems.

Study	Techniques	Advantages	Disadvantages
[8]	Fuzzy logic with IoT sensors	Reduces false alarms, early detection	High cost and complexity due to multiple sensors
[9]	Automatic algorithms with sensor networks	Better performance using real-life data	Extensive sensor networks needed, potential for sensor malfunction
[10]	Inception-V3 deep learning model	High accuracy, lower computational cost	Requires powerful computational resources
[11]	Decoder-free fully transformer (DFFT)	High-detection efficiency	High computational requirements, complex models
[12]	YOLOv8 and CNN-RCNN models with drones	High accuracy, rapid response	Limited drone battery life, flight stability issues
[13]	YOLOv7 models for aquatic vehicles	Robust, high accuracy	Challenges with waterproofing, marine environment stability
[14]	Pre-trained CNNs with LwF technique	High accuracies	Potential overfitting, need for continual updates
[15]	YOLOv6, YOLOv3, Faster R-CNN	Best performance with YOLOv6	Limited reproducibility with private dataset
[16]	FocalNext network	Improved accuracy, reduced noise	Balancing complexity with resource constraints
[17]	Drone with Pixhawk PX4 and sensors	Effective real-time monitoring and fire extinguishing	Limited flight duration, payload capacity
[18]	Pixhawk embedded drone with YOLO-NS	Cost-effective	Performance can be improved
[19]	GoogleNet for forest fire detection	High classification accuracy	Performance in diverse environments needs evaluation

Table 2. Detailed features of the comprehensive fire dataset used in this work.

Category	Scene Description	Image Features	# of Images	Target Quantity	Target Size (pixels)
Forest and outdoor fires	Forest fires in various seasons, bushfires, outdoor campfires	Various resolutions, daylight, and night images	1000	1–50 per image	50 × 50 to 500 × 500
Vehicle fires	Fires in cars, buses, trucks, both in motion and stationary	Various resolutions, close-up and distant shots	700	1–20 per image	30 × 30 to 300 × 300
Industrial fires	Factory fires, warehouse fires, chemical plant fires	Various resolutions, indoor and outdoor images	900	1–100 per image	50 × 50 to 500 × 500
Accidental fires	Household fires, kitchen fires, electrical fires	Various resolutions, different room settings	800	1–50 per image	30 × 30 to 300 × 300
Residential fires	Apartment and house fires, building fires	Various resolutions, interior and exterior views	700	1–20 per image	50 × 50 to 400 × 400
Drone-view fires	Aerial images of various fire scenes	High resolution, different altitudes, and angles	638	1–50 per image	100 × 100 to 1000 × 1000
No fire (standard)	Images without fire for comparison, various scenes like forests, vehicles, industrial areas, residential areas	Various resolutions	3449	0	N/A

Table 3. Number of training images for each category of the employed dataset.

Class	# of Initial Images	# of Augmented Images
Fire	2990	7607
No Fire	2760	5866
Total	5750	13,473

Table 4. Applied augmentation approaches and corresponding parameters.

Augmentation Technique	Parameters
Flip	Horizontal and vertical
Saturation	Between −20% and +20%
Random noise addition	Up to 0.5% of pixels

Table 5. Number of parameters of the employed DETR model.

Characteristic	Value
Trainable parameters	41.3 M
Non-trainable parameters	222 K
Total parameters	41.5 M
Model size	166.01 MB

Table 6. Summary of the applied YOLOv8 Model.

Characteristic	YOLOv8m (Medium)	YOLOv8n (Nano)
Layers	295	225
Parameters	25.8 M	3.01 M
Gradients	25,856,883	3,011,027
GFLOPs	79.1	8.2
Size (MB)	103.2	32.8

Table 7. Hyperparameters for the applied YOLOv8- and DETR-based knowledge distillation models.

Hyperparameter	YOLOv8	DETR
Epochs	50	50
Image size (pixels)	640	640
Batch size	32	32
Optimizer	AdamW	AdamW
Intersection over union (IoU)	0.7	0.8
Learning rate (Lr)	-	1 × 10⁻⁴
Initial learning rate (Lr0)	0.01	-
Final learning rate (Lrf)	0.01	-
Learning rate for backbone (Lr backbone)	-	1 × 10⁻⁵
Momentum	0.937	-
Weight decay	0.0005	1 × 10⁻⁴
Warm-up epochs	3.0	-
Warm-up momentum	0.8	-
Warm-up bias learning rate	0.1	-

Table 8. YOLOv8 model’s summary.

Accuracy	Precision	Recall	F1 Score
82%	93.4%	89%	0.91

Table 9. Results summary of DETR model.

Metric	Value
Training cross-entropy loss	0.602
Validation cross-entropy loss	0.914
Training bounding box loss	0.055
Validation bounding box loss	0.073
Average precision	41.5%
Average recall	78.2%
Accuracy	73%

Table 10. YOLOv8m teacher model’s performance summary.

Accuracy	Precision	Recall	F1 Score	mAP
94.11%	92.5%	93%	0.927	91.81%

Table 11. YOLOv8n and DETR student models’ evaluation metrics.

Student Model	Accuracy	Precision	Recall	F1 Score	mAP
YOLOv8n	95.21%	98.90%	98%	0.985	93.31%
DETR	81.10%	71.40%	92.50%	0.806	71.40%

Table 12. Real-time detection performance of the proposed fire detection drone.

Attempt	Fire	Wrong Detection	No Fire	Wrong Detection
65	38	5	20	2

Table 13. Comparison of the proposed fire detection system with other similar works.

Ref	Dataset: # of Samples	Best Model	Hardware Implementation	Metrics
[8]	Alarm scenarios: 233	Fuzzy logic	IoT-based fire detectors	Accuracy: 86.27%
[9]	Fire scenarios: 14	KNN	Temperature, CO and ionization sensors	False alarm rate: 7.30%
[28]	Private: 7000 samples	Multiple Regression	Various sensors with SIM800L	Accuracy: 90%
[29]	Private: 5000 images	Cascaded CNN	No	Accuracy: 95.30%
[30]	26,520 images	YOLOv8	No	F1 score: 0.96 mAP: 97.50%
[16]	Private: 18,644 images	YOLOFM	No	F1 score: 0.94 mAP: 97.50%
[31]	Public: 226 images	MobileNetV3-YOLOv4	Firefighting robot with Jetson Xavier	Accuracy: 95.14%
[18]	Public: 12,000 images	YOLO-NAS	Drone with Pixhawk	F1 score: 0.68
[13]	Public: 1586 images	YOLOv7	No	F1 score: 0.93 mAP: 81%
This work	Merged various public datasets: 7187 images	Knowledge distillation (YOLOv8n)	Drone with Raspberry Pi 5 and Pi camera module 3	Accuracy: 95.21% F1 score: 0.98

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Titu, M.F.S.; Pavel, M.A.; Michael, G.K.O.; Babar, H.; Aman, U.; Khan, R. Real-Time Fire Detection: Integrating Lightweight Deep Learning Models on Drones with Edge Computing. Drones 2024, 8, 483. https://doi.org/10.3390/drones8090483

AMA Style

Titu MFS, Pavel MA, Michael GKO, Babar H, Aman U, Khan R. Real-Time Fire Detection: Integrating Lightweight Deep Learning Models on Drones with Edge Computing. Drones. 2024; 8(9):483. https://doi.org/10.3390/drones8090483

Chicago/Turabian Style

Titu, Md Fahim Shahoriar, Mahir Afser Pavel, Goh Kah Ong Michael, Hisham Babar, Umama Aman, and Riasat Khan. 2024. "Real-Time Fire Detection: Integrating Lightweight Deep Learning Models on Drones with Edge Computing" Drones 8, no. 9: 483. https://doi.org/10.3390/drones8090483

APA Style

Titu, M. F. S., Pavel, M. A., Michael, G. K. O., Babar, H., Aman, U., & Khan, R. (2024). Real-Time Fire Detection: Integrating Lightweight Deep Learning Models on Drones with Edge Computing. Drones, 8(9), 483. https://doi.org/10.3390/drones8090483

Article Menu

Real-Time Fire Detection: Integrating Lightweight Deep Learning Models on Drones with Edge Computing

Abstract

1. Introduction

2. Related Works

2.1. Sensor-Based Fire Detection

2.2. Computer Vision and AI-Based Fire Detection without Embedded Deployment

2.3. Drone-Based Fire Detection

3. Materials and Methods

3.1. Dataset

3.1.1. Dataset Collection

3.1.2. Dataset Preprocessing

3.2. Applied Deep Learning Models

3.2.1. Detectron2

3.2.2. DETR

3.2.3. YOLOv8

3.2.4. Knowledge Distillation with Autodistill

3.3. Tuneable Hyperparameters

3.4. Hardware Implementation of the Proposed Fire Detection System

Hardware Equipment for the Proposed Drone Design

4. Results and Discussion

4.1. Performance of the Undistilled Models (without Knowledge Distillation)

4.1.1. Detectron2: Undistilled

4.1.2. YOLOv8n: Undistilled

4.1.3. DETR: Undistilled

4.2. Performances of the Knowledge Distillation Model

4.2.1. YOLOv8m: Teacher

4.2.2. YOLOv8n: Student

4.2.3. DETR: Student

4.3. Proposed Drone and Raspberry Pi-Integrated Hardware Device for Real-Time Fire Detection

4.4. Limitations of the Proposed System

5. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI