1. Introduction
Fire is a chemical reaction, the rapid oxidation of a combustible material such as wood or paper, that emits light and heat [1]. It is a double-edged sword, offering benefits and drawbacks for civilization and the environment [2]. While fire has provided significant advantages and shaped ecosystems over time, human activities have altered its role, resulting in both positive and negative impacts on society and the natural world. For civilization, fire is a vital natural asset contributing to human well-being by providing warmth, light, and protection [3]. Economically, fire’s effects can be both beneficial and detrimental: it can revitalize ecosystems, yet uncontrolled fires can cause severe environmental problems that contribute to climate change and global warming.
Fire incidents, which cause significant damage to life and property, are a problem worldwide. Various factors, including electrical faults, human negligence, and natural causes, can trigger fires. These disasters not only result in losses of life but also have severe financial repercussions. Globally, fires significantly affect lives and economies: over the past 40 years, there have been approximately 2805 fatalities and over 8000 injuries, with approximately 7 million people affected overall. Fires also influence global trends; a fire accident in a specific industry or domain can have a knock-on effect on the global market supply chain, as seen, for example, in the Amazon forest fire calamity.
Modern megacities, characterized by skyscrapers, dense urban forests, and industrial zones, are particularly susceptible to rapid fire outbreaks [4]. Such conflagration-based, life-threatening catastrophes pose severe threats to community security and societal progress. This scenario necessitates the development of automated fire detection and prevention systems utilizing advanced sensing equipment, drone technology, and artificial intelligence (AI) techniques [5]. Traditionally, fire safety relied on direct observation, followed by the implementation of basic alarm systems. Contemporary advancements have led to the integration of sprinkler systems with smoke and heat detectors, computer vision, and drone-based automatic fire detection [6]. Modern automatic fire detection systems minimize the danger of injury, fatalities, and property damage. Furthermore, they allow for higher accuracy, rapid discovery, and quick responses from local authorities.
Artificial intelligence (AI) helps computers operate in a way that emulates human abilities. AI edge systems [7] refer to AI applications managed by machines capable of performing various tasks in the physical world. These systems work via a multi-step approach in which they learn from past incidents and improve over time. The utilization of AI continues to grow across a wide range of applications, significantly affecting daily tasks, jobs, and organizational operations. The primary reason for the widespread adoption of AI is its ability to perform tasks rapidly and accurately, often surpassing human capabilities. AI robots are also employed in hazardous jobs, such as defusing bombs, reducing hazards to human life. Therefore, the use of AI is expected to grow, driven by advancements in task management, resource allocation, and cost efficiency.
Conventional fire detection techniques based on sensors that detect heat or smoke have certain limitations. These sensors can be unreliable, with a limited detection range, slow detection, and false alarms triggered by small-scale sources such as cigarette smoke or candle flames. This unreliability necessitates a more accurate and prompt fire detection system. Automatic fire detection using AI addresses this need by providing noncontact and effective sensing without human intervention, along with the precise and rapid identification of fire incidents. An effective fire detection system can significantly reduce the loss of human life, damage to the environment, and property losses by enabling a prompt response.
The emerging approach to accurate real-time fire detection using AI incorporates modern techniques for effective fire identification. This research aims to achieve deep learning-, edge device-, and drone-based automatic fire detection using images of various fire scenarios, including indoor, outdoor, forest, and natural fires. The major contributions of the proposed automatic fire detection study are summarized below.
A significant contribution of this work is the development of a comprehensive dataset comprising 7187 images, built by combining multiple open-source fire event datasets. The labeling process was carried out precisely using the LabelImg tool and the Roboflow API, involving manual labeling, sorting, renaming, and the careful classification of each image. Data augmentation techniques were employed to increase the size and diversity of the dataset.
Advanced deep learning models, including Detection Transformer (DETR), Detectron2, YOLOv8, and knowledge distillation techniques built with Autodistill, have been applied for automatic fire detection. YOLOv8m serves as the teacher (base) model, while YOLOv8n and DETR are employed as the distilled student models in the knowledge distillation approach. Various metrics, including precision, recall, mean average precision (mAP), intersection over union (IoU), loss metrics, and overall accuracy, are reported to verify the reliability and effectiveness of the applied models.
The proposed automatic fire detection system has been assembled into a robust hardware setup comprising a Raspberry Pi 5 microcontroller, a DJI F450 drone, and a Raspberry Pi camera module 3. The lightweight YOLOv8n technique has been deployed into the microcontroller and drone system for real-time fire detection.
The novelty of this work is integrating a lightweight knowledge distillation-based deep learning technique with a Raspberry Pi 5 edge device and a drone for instant real-time fire detection utilizing a comprehensive fire events dataset.
Section 2 discusses related articles in fire detection and computer vision on edge devices.
Section 3 elaborates on the proposed system, detailing its software and hardware components, the created dataset, and the applied deep learning models for automatic fire detection.
Section 4 reviews the simulation and hardware results of the proposed fire detection system. Finally, the conclusions and potential future enhancements for this system in fire detection and monitoring are presented in Section 5.
2. Related Works
Significant efforts have been initiated to mitigate the damage and consequences of fire accidents, which lead to the destruction of human habitats and natural ecosystems, environmental pollution, soil erosion, and other adverse effects. Fire hazard surveillance systems encompass traditional watchtower-based human supervision; sensor-based heat, smoke, and fume detection; and recent advancements in artificial intelligence for automatic identification. The efficiency of the earlier methods is limited due to their confined detection range, lower accuracy, elevated false alarm rates, and slow response times. In the following paragraphs, related articles on fire accident detection systems are briefly discussed.
2.1. Sensor-Based Fire Detection
Park et al. [8] developed a fuzzy logic system to enhance the reliability of Internet of Things (IoT)-based fire detection systems by recognizing fire signal patterns. The authors analyzed the characteristics of fire signals and created a fuzzy logic system capable of identifying such patterns, thereby reducing false alarms and enabling the early detection of fires. The system comprises several components, including flame, smoke, and temperature sensors; multi-sensor nodes; wireless communication modules; a server; a control room interface; an Internet network; a CCTV system; and speakers for audio alerts. Baek et al. [9] employed automatic algorithms, sensor network configurations, and data analysis methods for fire detection. The authors utilized real-life fire sensor data from the NIST repository to evaluate the implemented system. Sensors were installed in a manufactured home, and data were collected during real-time fire scenarios. The system’s performance was compared with existing methods using MATLAB to detect fires across different scenarios.
2.2. Computer Vision and AI-Based Fire Detection without Embedded Deployment
Biswas et al. [10] presented a novel deep learning model to detect fire and smoke. Leveraging an open-source dataset, the authors applied the Inception-V3 model by integrating a novel optimization function. The improved Inception-V3 technique attained approximately 96% accuracy and 0.96 specificity with comparatively fewer epochs and a reduced computational cost. Wang et al. [11] introduced a decoder-free fully transformer (DFFT) approach for early smoke and fire prediction, aiming to enhance detection accuracy. This study combined publicly available datasets with custom-curated samples. Various baseline models were implemented and evaluated on these datasets. The applied DFFT model accomplished outstanding efficiency in the detection task with mAP coefficients of 87.40% and 81.12% on the respective smoke and fire datasets.
Shamta et al. [12] designed a forest fire surveillance system employing deep learning techniques and a quad-rotor drone. The authors used YOLOv8 and a combined CNN-RCNN model for fire detection and image classification, respectively. The applied YOLOv8 model attained a 0.96 mAP score for fire detection, and the CNN-RCNN framework achieved 96% classification accuracy. Avazov et al. [13] introduced novel fire detection techniques for aquatic transport vehicles utilizing the YOLOv7 technique and a dataset comprising more than 4.6k images with extensive data augmentation modalities. Various YOLOv7 models, including YOLOv7, YOLOv7-W6, YOLOv7-tiny, YOLOv7-X, YOLOv7-E6, and YOLOv7-D6, were implemented across multiple tasks. The YOLOv7 technique attained a superior performance of 0.81 mAP at 50% IoU and a 0.93 F1 coefficient.
Sathishkumar et al. [14] utilized multiple pre-trained deep learning-based convolutional neural networks (CNNs) with an additional learning without forgetting (LwF) technique. The VGG16, InceptionV3, and Xception models with the LwF technique obtained accuracies of 95.46%, 97.01%, and 98.72%, respectively. Saydirasulovich et al. [15] devised an advanced fire detection method with the YOLOv6, YOLOv3, and Faster R-CNN deep learning algorithms on a private dataset of 4000 samples. The applied YOLOv6 performed best, with a 0.43 F1 score and 39.50% mAP. Geng et al. [16] developed the FocalNext network, an efficient algorithm for overcoming noise in feature extraction, complexity, and deployment on resource-constrained devices. The FM-VOC dataset, comprising 18,644 images, was labeled manually. YOLOFM notably improved the baseline network’s accuracy, recall, F1 score, mean average precision at 50% intersection over union (mAP50), and mean average precision from 50% to 95% (mAP50-95) by 3.1%, 3.9%, 3.0%, 2.2%, and 7.9%, respectively.
2.3. Drone-Based Fire Detection
Nusrat et al. [17] designed an uncrewed aerial vehicle capable of fighting fires, employing a Pixhawk PX4 (for controlling the drone), an Arduino Nano R3 (for managing multiple sensors, linked to the NodeMCU), and a NodeMCU ESP8266 (for data processing). The authors conducted exhaustive real-time experiments in an open space in Dhaka, Bangladesh, with the devised drone integrated with a first-person view (FPV) camera and multiple gas sensors. The hardware drone effectively monitored gas concentrations and extinguished various sources of fire. Choutri et al. [18] constructed a cost-effective drone with a Pixhawk embedded device for automatic fire detection and identification of the corresponding fire location. The authors developed a combined sample of fire event images from three open-source datasets. The applied YOLO-NS model effectively classified fire images with an F1 score of 0.68 and a mAP of 80%. Manoj and Valliyammai [19] investigated the efficiency of different pretrained neural network models for automated forest fire detection. The authors devised a multi-agent-based robotic system to track the location of the fire. The GoogleNet model achieved the best classification output with 96% accuracy and a 0.97 F1 score.
After reviewing the literature on automatic fire detection systems, it is evident that various state-of-the-art artificial intelligence techniques and complex sensing devices have been employed.
Table 1 summarizes the major advantages and disadvantages of related works on automatic fire detection. However, a limited number of these studies successfully deployed such classifiers on memory-constrained edge devices and integrated them into efficient drone or robotic systems for real-time fire incident identification. While some research explored similar directions, our work aims to bridge this gap by implementing lightweight knowledge distillation-based deep learning models that are particularly optimized for deployment on drone-integrated edge devices, ensuring high accuracy and operational efficiency in resource-constrained environments.
4. Results and Discussion
This section thoroughly evaluates the proposed AI and drone-based fire detection system. To create a comprehensive dataset, we collected various fire-related images from related articles and public repositories, covering vehicle, forest, accident, and indoor fire incidents. The dataset is classified into target classes for image classification and divided into training, validation, and testing sets with a ratio of 8:1:1. The training and evaluation strategies for the applied YOLOv8, DETR, and Detectron2 models are summarized below.
The loss function is essential for quantifying errors or discrepancies in the model’s learning process. YOLOv8 combines binary cross-entropy loss for classification with complete intersection over union (CIoU) loss for bounding box regression. DETR uses a set-based global loss that enforces unique predictions via bipartite matching, including classification and bounding box losses. Detectron2 combines classification loss, bounding box regression loss, and a mask loss for instance segmentation.
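As an illustration of the classification term, a minimal binary cross-entropy computation is sketched below in plain Python. This is an illustrative sketch only, not the authors' implementation; the actual YOLOv8 and DETR losses also include box regression and matching terms described above.

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over paired labels and predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

# A confident correct prediction yields a small loss,
# a confident wrong prediction a large one.
low_loss = binary_cross_entropy([1], [0.99])
high_loss = binary_cross_entropy([1], [0.01])
```

This asymmetry is what drives the steady decrease in the classification loss curves reported later for all three models.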
The optimizer iteratively adjusts the model weights to minimize the loss function. In this work, the Adam and AdamW optimizers are used to train the YOLOv8 and DETR models, respectively.
YOLOv8 and DETR techniques are trained for 50 epochs. Each model has been configured with a batch size of 16.
Various evaluation metrics, such as precision, recall, mean average precision, and intersection over union, are used to assess the classification performance of the YOLOv8, Detectron2, and DETR models.
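Intersection over union underlies both the localization losses and the mAP50/mAP50-95 metrics used throughout this evaluation. A minimal sketch of the computation for axis-aligned boxes (illustrative, not taken from the paper's codebase):

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two half-overlapping unit squares: intersection 0.5, union 1.5, IoU = 1/3.
score = iou((0, 0, 1, 1), (0.5, 0, 1.5, 1))
```

A detection is counted as a true positive for mAP50 when its IoU with a ground-truth box is at least 0.5; mAP50-95 averages this over thresholds from 0.5 to 0.95.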
Annotated images in COCO JSON and TXT formats are used to train each model. Finally, the distilled YOLOv8n model has been implemented on a Raspberry Pi 5 8 GB, mounted on a drone for real-time fire detection.
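The 8:1:1 train/validation/test split described earlier can be sketched in plain Python as follows. This is a generic illustration operating on image identifiers; the actual pipeline works on annotated image files and a fixed seed is assumed here for reproducibility.

```python
import random

def split_dataset(items, ratios=(0.8, 0.1, 0.1), seed=42):
    """Shuffle items and split them into train/val/test subsets by ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * ratios[0])
    n_val = int(len(items) * ratios[1])
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

# Splitting the 7187 image IDs 8:1:1 gives 5749 / 718 / 720 samples.
train, val, test = split_dataset(range(7187))
```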
4.1. Performance of the Undistilled Models (without Knowledge Distillation)
The performances of the undistilled models, YOLOv8, DETR, and Detectron2, are described in the following paragraphs.
4.1.1. Detectron2: Undistilled
Figure 9 illustrates the classification accuracy of the Detectron2 model with the change of training iterations. The accuracy shows a sharp increase during the initial iterations, followed by a more gradual improvement and stabilization as the number of iterations increases. The training and validation accuracy closely track each other, indicating good generalization performance of the applied Detectron2 model. Towards the end of the training process, the model reaches its maximum accuracy, close to 0.94.
The classification losses vs. training iterations for Detectron2 are depicted in Figure 10, which shows the loss values fluctuating between 0.15 and 0.55. The graph highlights the Detectron2 model’s ability to reduce classification error over time, signifying improved performance in fire detection tasks.
4.1.2. YOLOv8n: Undistilled
Figure 11 depicts precision, recall, and F1 score vs. confidence for the YOLOv8n baseline model. A confidence threshold of 0.934 is well suited to fire detection systems, as it yields maximum precision for all classes, minimizing false positives and enabling reliable predictions for a timely response to potential fire hazards. At a confidence threshold of 0, the recall for all classes is 0.89, indicating that the model can identify most fire events. This balance between precision and recall is crucial for effective fire detection and response.
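The precision/recall trade-off against the confidence threshold shown in these curves can be reproduced with a simple threshold sweep. The scores and labels below are toy values for illustration only:

```python
def precision_recall_at(scores, labels, threshold):
    """Precision and recall when detections scoring >= threshold count as 'fire'."""
    tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
    fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
    fn = sum(1 for s, l in zip(scores, labels) if s < threshold and l == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

scores = [0.95, 0.90, 0.80, 0.60, 0.40, 0.30]  # toy detection confidences
labels = [1, 1, 0, 1, 0, 1]                    # 1 = real fire, 0 = not fire

p_low, r_low = precision_recall_at(scores, labels, 0.1)    # everything flagged
p_high, r_high = precision_recall_at(scores, labels, 0.85)  # only confident hits
```

Raising the threshold trades recall for precision, which is exactly the behavior of the confidence curves in the figure: maximum precision at a high threshold, maximum recall at threshold zero.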
Various performance metrics, e.g., training and validation losses, precision, recall, and mAP of the YOLOv8n model, are illustrated in Figure 12. The metrics demonstrate a consistent decrease in losses over the training epochs and an improvement in precision, recall, and mAP, indicating improved model performance and accuracy.
Table 8 summarizes the performance of the YOLOv8n model after training 50 epochs. The model achieves an accuracy of 82%, with a precision of 0.934, recall of 0.89, and F1 score of 0.91. These results suggest that the YOLOv8n model is effective for automatic fire detection.
4.1.3. DETR: Undistilled
Figure 13 illustrates different training and validation losses of the DETR model with the change in the number of iterations. According to this figure, the bounding box and cross-entropy validation and training losses depict a steady decline, indicating an improvement in model performance.
The summary of the DETR model’s performance for the proposed fire detection system is presented in Table 9. The model achieved a training cross-entropy loss of 0.602 and a validation cross-entropy loss of 0.914, indicating a reasonable fit to the training fire scenario samples. The training and validation bounding box losses are 0.055 and 0.073, respectively, demonstrating effective localization performance. Additionally, the model obtains an average precision of 41.5%, an average recall of 78.2%, and an accuracy of 73%, reflecting a balanced detection capability.
4.2. Performances of the Knowledge Distillation Model
In this work, knowledge distillation models employing the Autodistill process, composed of base (teacher) and target (student) models, are applied for automatic fire detection. In the knowledge distillation model for fire event detection, YOLOv8m is chosen as the teacher model due to its higher accuracy and moderate size, providing reliable and precise detection capabilities. YOLOv8n and DETR are selected as the student models because of their smaller sizes and faster inference speeds, making them suitable for deployment in resource-constrained environments such as the employed Raspberry Pi 5.
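Autodistill performs distillation by having the teacher auto-label data on which the student is then trained. The classical soft-target formulation that knowledge distillation builds on can be sketched as follows; this is a generic illustration in plain Python, not the authors' pipeline:

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by the temperature."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_target_loss(teacher_logits, student_logits, temperature=3.0):
    """KL divergence between temperature-softened teacher and student outputs."""
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return sum(ti * math.log(ti / si) for ti, si in zip(t, s))

teacher = [4.0, 1.0, 0.2]    # confident teacher prediction for the "fire" class
aligned = [3.8, 1.1, 0.3]    # student output close to the teacher's
diverged = [0.2, 4.0, 1.0]   # student output disagreeing with the teacher

loss_aligned = soft_target_loss(teacher, aligned)
loss_diverged = soft_target_loss(teacher, diverged)
```

Minimizing this divergence pushes the small student toward the large teacher's output distribution, which is why the distilled YOLOv8n and DETR models improve over their undistilled counterparts.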
4.2.1. YOLOv8m: Teacher
Figure 14 presents the precision and recall vs. confidence of the YOLOv8m base model. The precision–confidence curve indicates that, as the confidence threshold increases from 0.0 to 1.0, the precision rises from near 0.0 to approximately 1.0 for the fire class. At lower confidence thresholds, precision is lower due to a higher number of false positives. As the threshold rises, the model becomes more selective, improving precision. Notably, at a confidence threshold of 0.925, precision reaches 1.00. Similarly, Figure 14a shows how recall decreases as the confidence threshold increases for the fire class. At the lowest confidence threshold (0.0), recall is at its highest, indicating that the model identifies 93% of fire instances. As the confidence threshold approaches 1.0, recall steadily drops to near 0.
Figure 15 refers to the training and validation loss curves and metrics of the YOLOv8m base model. The YOLOv8m model for the fire detection task shows strong performance with an mAP coefficient of 91.81% at a 50% IoU threshold. This model is highly effective at correctly detecting and localizing fires. The precision of 92.5% reflects the accuracy of the fire detections, meaning that a high proportion of the predicted fires are actual fires. The recall of 93% indicates the model’s ability to detect most of the fires. Finally, the applied YOLOv8m teacher (base) model attains an accuracy of 94%, showing a good balance between precision and recall.
Table 10 presents the performance summary of the YOLOv8m teacher model for the proposed fire events detection, demonstrating an impressive accuracy of 94.11% and a precision of 92.5%. Additionally, the model achieves a recall of 93%, an F1 score of 0.927, and a mean average precision (mAP) of 91.81%, indicating robust detection capabilities.
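The reported F1 score is the harmonic mean of precision and recall, and can be verified directly from the table's figures:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Teacher (YOLOv8m) figures from Table 10: P = 0.925, R = 0.93.
f1 = f1_score(0.925, 0.93)  # rounds to the reported 0.927
```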
4.2.2. YOLOv8n: Student
Figure 16 presents the precision and recall evaluation of the distilled YOLOv8n student model. The model demonstrates enhanced performance compared to the base YOLOv8m and undistilled YOLOv8n models. The precision–confidence curve for the distilled YOLOv8n model shows that, as the confidence threshold increases from 0.0 to 1.0, precision significantly improves, reaching 1.00 at a confidence threshold of 0.989 for all classes. Conversely, at lower confidence thresholds, precision is lower due to a higher number of false positives. As the threshold increases, the YOLOv8n model becomes more selective, significantly improving precision.
Figure 17 illustrates the performance metrics of the YOLOv8n student model across different epochs, showing a consistent decrease in training and validation losses (box, classification, and DFL losses) as the epochs progress. Concurrently, the precision, recall, mAP50, and mAP50-95 metrics notably improve, indicating enhanced model accuracy and effectiveness over time.
4.2.3. DETR: Student
Figure 18 illustrates the evaluation of the distilled DETR model, demonstrating significant improvements over the baseline model. The horizontal axis represents the total number of steps, while the vertical axis denotes the metric or loss values. During training, the curves fluctuate due to the trial-and-error nature of model training, with numerous peaks indicating moments where the model struggles to classify objects. Despite these fluctuations, the overall decrease in loss indicates that the model is learning to predict object locations more accurately over time. The general downward trend also suggests that the model gradually improves its classification ability, overcoming initial challenges and becoming more adept at distinguishing between different objects. In the evaluation phase, the curves in Figure 18a reflect the model’s accuracy in predicting object locations and its success in classifying objects. As the model evaluated images, the loss decreased consistently, suggesting improved localization capabilities. These curves show a sharp initial decline, indicating a rapid improvement in classification accuracy, after which the loss stabilizes at values much lower than the initial ones, demonstrating the model’s enhanced classification proficiency by the end of the evaluation phase.
Validation and training losses across the number of iterations for the DETR student model applied to the proposed fire detection task are depicted in Figure 18. Figure 18a,b show the validation bounding box and cross-entropy losses, respectively, both demonstrating a downward trend that indicates improved model performance with more iterations. As expected, the training bounding box and cross-entropy losses show gradually decreasing fluctuations, reflecting the model’s learning process and convergence over time.
Table 11 shows the applied distilled student models’ performance in detecting fire images. The YOLOv8n model achieves a remarkable mean average precision of 93.31% at an IoU threshold of 50%, demonstrating excellent accuracy in detecting and localizing fires. With a precision of 98.9%, the vast majority of the model’s detections correspond to actual fires, while its recall of 98% shows that it detects most of the fires present. The overall accuracy of 95.21% reflects a well-balanced performance between precision and recall. Conversely, the DETR model achieves a satisfactory accuracy (81.10%) and F1 score (0.806) but a lower mAP of 71.4% at the same IoU threshold.
Figure 19 illustrates the accuracy improvements achieved through model distillation for both the DETR and YOLOv8 models. The baseline (undistilled) DETR model achieves an accuracy of 73%, which increases to 81% after distillation employing the Autodistill technique. Similarly, the baseline YOLOv8 model shows an accuracy of 82%, which improves significantly to 95.21% with the application of distillation. This demonstrates the effectiveness of the applied knowledge distillation techniques in enhancing the performance of the proposed fire detection task.
4.3. Proposed Drone and Raspberry Pi-Integrated Hardware Device for Real-Time Fire Detection
Finally, the applied AI-based deep learning models and Raspberry Pi 5 edge computing integrated drone have been tested in real-time for instantaneous fire event detection. The applied YOLOv8n model has been deployed to the embedded device because of its superior performance. The experiments were performed in the Mirpur area of Dhaka, Bangladesh, in June 2024. The drone maintained an average height of 6.90 m (22.64 feet) and a flight time of approximately 10 min.
Figure 20 depicts the various stages of the operation: the drone in flight, the fire on the ground, the ArduPilot control interface, and the live telemetry data showcasing the drone’s altitude, speed, and positioning over the target area.
The proposed AI-based fire detection drone system demonstrates strong performance in real-time testing, as illustrated in Table 12. Out of 65 attempts, it correctly detects fires in 38 instances and accurately identifies the absence of fire in 20 cases, while five false positive detections occur in which the system incorrectly identifies a fire. This corresponds to an accuracy of 89.23%. The system’s ability to accurately detect fires while keeping false alarms low underscores its reliability and effectiveness in real-world scenarios. Overall, the proposed drone, integrated with a Raspberry Pi 5 and Pi camera module 3, is highly dependable for fire detection tasks, offering a valuable tool for ensuring safety and a prompt response in various environments.
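The reported real-time accuracy follows directly from the trial counts: correct decisions (fires detected plus absences identified) divided by total attempts.

```python
# Real-time trial counts reported in Table 12.
true_positives = 38   # fires correctly detected
true_negatives = 20   # absence of fire correctly identified
total_attempts = 65   # the remaining attempts are misclassifications

accuracy = (true_positives + true_negatives) / total_attempts  # 58 / 65
```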
Figure 21 demonstrates the real-time fire detection capability of the proposed drone system during live experiments. The frame rate, expressed in frames per second (FPS), is the frequency at which consecutive images (frames) are captured or displayed in a video. The YOLOv8n model, deployed on a Raspberry Pi 5 with a Pi camera module 3, successfully identifies and labels multiple instances of fire at an average frame rate of 8 FPS, i.e., eight distinct frames are processed every second. The detection confidence levels change with the drone’s altitude and the presence of multiple fire sources.
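The reported 8 FPS figure corresponds to the simple throughput measurement sketched below. Here `process_frame` is a hypothetical stand-in for camera capture plus YOLOv8n inference; 8 FPS would correspond to roughly 125 ms of work per frame on the Raspberry Pi 5.

```python
import time

def measure_fps(process_frame, n_frames=16):
    """Average frames per second over n_frames calls to process_frame."""
    start = time.perf_counter()
    for _ in range(n_frames):
        process_frame()
    elapsed = time.perf_counter() - start
    return n_frames / elapsed

# Stand-in workload of ~5 ms per frame, so the measured rate stays below 200 FPS.
fps = measure_fps(lambda: time.sleep(0.005))
```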
Table 13 offers a comparative overview of the proposed AI-based fire detection system against similar existing works. Few previous studies have implemented fire identification systems on embedded devices. To the best of our knowledge, this work is the first to integrate a knowledge distillation-based AI technique with a drone and a Raspberry Pi 5 edge device for instantaneous fire detection.
4.4. Limitations of the Proposed System
The proposed drone system for real-time fire detection using AI and edge devices has some limitations. Although the dataset employed in this work encompasses a wide range of fire images, generalizing the findings to other environments may be challenging due to variations in fire scenarios, infrastructure, and environmental contexts. The detection performance of the implemented deep learning models could be affected by changes in weather conditions, lighting, and seasonal variations. Owing to the use of a Raspberry Pi 5 edge device, the need to continuously maintain a 27-watt power supply and the modest image resolution can also be considered limitations of the proposed system.
5. Conclusions
Efficient and prompt fire detection is necessary to reduce economic and environmental losses. This work implements an automatic fire detection system using advanced computer vision techniques on an edge device. The system employs a combined dataset of 7187 images of various fire scenarios. Various cutting-edge deep learning models have been applied, e.g., DETR, Detectron2, and YOLOv8, in the PyTorch framework. Next, the knowledge distillation approach is implemented with YOLOv8m as the teacher (base) model and YOLOv8n and DETR as the distilled student models. Additionally, this study proposes a low-cost implementation of an advanced embedded system built with a Raspberry Pi 5 and a Pi camera integrated into a drone. The lightweight YOLOv8n model has been deployed on the Raspberry Pi edge device for instantaneous fire identification. Real-world experiments have been performed to investigate the effectiveness of the proposed AI and edge device-based fire detection system. Comprehensive testing and data collection were conducted to assess the model’s accuracy and overall performance. The results of the proposed AI-based automatic fire detection study illustrate its potential to assist relevant authorities in promptly identifying and mitigating fire hazards.
In the future, the proposed model can be deployed on a smartphone application platform to monitor real-time fire identification. Heterogeneous sensors measuring images, temperature, gas, and flame can be integrated with the AI-based device for robust fire detection. Open-world learning techniques can be implemented to reduce false positives and improve the system’s robustness.