1. Introduction
Fire and smoke detection systems increasingly rely on machine learning and deep learning, but they have limitations. They typically employ convolutional neural networks (CNNs) to analyze images and videos for fire, yet fog and changes in illumination are often misinterpreted and cause many false alarms. System performance depends on the quality and diversity of the data, and adversaries can trick the models. Many fire detection methods use Faster R-CNN, YOLO, or SSD to identify fire-related objects in video frames [1,2,3,4]. These models are trained to spot flames and smoke. The RGB values of captured frames can be affected by chromatic aberration, an optical distortion that causes color fringing [5]. High-quality equipment is expensive and impractical in remote regions. In addition, scalability is limited by computational resources. Other approaches use color channels to analyze fire events in still images [6]. To differentiate fire from its surroundings, researchers use the RGB, HSV, or YUV color spaces. Work to improve dataset diversity, algorithm robustness, and hardware cost is underway [7]. The use of thermal imaging and environmental sensors to improve accuracy is also being considered. These issues must be addressed to improve the reliability and accessibility of fire detection systems.
Fire and smoke detection systems use machine learning and deep learning to issue early alarms and reduce fire damage [8,9,10,11]. However, these systems are limited. Smoke and flame detection in images and video commonly uses CNNs, and although this technique is promising, it has drawbacks. In 2023, Zhang et al. developed a deep learning-based CNN approach for battery thermal imaging. Their CNN model correctly identified issues relevant to smoke detection systems, such as hotspots and thermal irregularities [12]. These algorithms can misinterpret steam, fog, or rapidly changing lighting conditions as smoke or flames, causing many false alarms.
Deep learning-based fire and smoke detection systems can improve public safety. The main goal is a robust deep neural network that effectively identifies fire and smoke in real-time video or image feeds. Such a system seeks to identify fires early, reduce false alarms, and shorten reaction times. By using an intelligent method to distinguish real dangers from steam or dust, it aims to save lives, protect property, and improve community safety. To detect threats quickly and reduce false positives, the system must be optimized for varied environmental conditions while balancing sensitivity and specificity. Beyond technological innovation and professional progress, using deep learning to provide accurate, timely, and actionable fire protection information benefits society. The overall goal of a deep learning-based fire and smoke detection system is to improve fire safety in varied situations through intelligent and proactive solutions. The main problem is therefore to create a robust deep learning model that quickly and effectively detects fire and smoke in real-time video or image feeds. The principal requirement is a deep neural network architecture that can assess camera or sensor data, discriminate fire and smoke events from steam or dust, and quickly inform authorities or workers. The system’s performance under different illumination, camera angles, and camera characteristics must also be optimized. Implementation requires balancing sensitivity and specificity so that serious threats are detected and false positives are reduced. By providing accurate, fast, and actionable information to safeguard lives and property, the fire and smoke detection system reduces fire risk and improves safety.
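The trade-off between sensitivity and specificity mentioned above can be illustrated with a small sketch. The scores, labels, and thresholds below are made-up values chosen only for illustration, and the function name is hypothetical; nothing here is taken from the evaluated system.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity) for a given alarm threshold.

    scores: per-frame fire confidences in [0, 1] (illustrative values).
    labels: 1 if the frame truly contains fire, 0 otherwise.
    """
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        alarm = score >= threshold
        if alarm and label == 1:
            tp += 1          # fire correctly flagged
        elif alarm and label == 0:
            fp += 1          # false alarm (e.g. steam or dust)
        elif not alarm and label == 0:
            tn += 1          # non-fire correctly ignored
        else:
            fn += 1          # missed fire
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Made-up example: three fire frames followed by three steam/dust frames.
scores = [0.95, 0.80, 0.40, 0.60, 0.30, 0.10]
labels = [1, 1, 1, 0, 0, 0]

for threshold in (0.25, 0.50, 0.75):
    sens, spec = sensitivity_specificity(scores, labels, threshold)
    print(f"threshold={threshold:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Lowering the threshold catches more true fires at the cost of more false alarms, and raising it does the opposite, which is the balance the text refers to.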
Figure 1 shows the functionalities and methodologies used in fire and smoke detection systems. The functional aspects mentioned in existing work include image segmentation, a counter for image frames, and the conversion of real-time video into image frames. The methodologies used in the referenced papers in this domain include YOLOv5, EfficientNet-B3, EfficientNet-B2, and ResNet-50, which belong to the CNN family and are used for precise object detection, with the EfficientNet models scaling the depth, width, and resolution of the network. The confidence percentage, which represents the model’s level of certainty in a particular prediction or classification, helps to evaluate the obtained result. The existing frameworks, which classify the image and optimize frames per second, achieved promising results. These aspects determine the outcome of a smoke and fire detection system based on machine learning techniques.
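As an illustration of the confidence percentage, the sketch below assumes that a classifier outputs raw class scores (logits) and that confidence is taken as the largest softmax probability. This is a common convention, not necessarily how the referenced systems compute it; the class names and logit values are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_confidence(logits, class_names):
    """Return (predicted class name, confidence in percent)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], 100.0 * probs[best]


# Hypothetical class list and logits, for illustration only.
CLASS_NAMES = ["fire", "smoke", "neutral"]
label, confidence = predict_with_confidence([2.0, 0.5, 0.1], CLASS_NAMES)
print(f"{label}: {confidence:.1f}%")
```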
The image-based detection module uses an embedded counter program to classify fire occurrences from .jpg, .png, or .jpeg images. As long as fire occurrences are recognized, the counter gives firefighters real-time information. Fire detection uses image brightness, color index, and irregularity analysis to identify brightness anomalies. Faster R-CNN or YOLO algorithms improve object detection and shape analysis by finding fire-related objects in images or video streams. Image segmentation and object detection increase the system’s precision by defining the boundaries of fire-related objects. Fire detection using Support Vector Machines and Random Forests shows accuracy and reliability issues when trained on image and video data. Relational learning uses relevance clues to find unvisited relational spaces, creating heat combat anchors in firefighting zones. Deep learning and convolutional neural networks (CNNs) increase fire detection accuracy and speed. The frame rate, measured in frames per second (FPS), is critical in smoke detection; hence, the system optimizes computational resources for real-time applications by using lightweight architectures such as MobileNet and EfficientNet on embedded devices.
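A minimal sketch of a brightness and color rule combined with a frame counter is given below. It is written in the spirit of classical color-based detectors and is not the module described above: the thresholds, the function names, and the representation of a frame as nested lists of RGB tuples are all assumptions made for illustration.

```python
import colorsys


def is_fire_colored(r, g, b, min_red=180, min_saturation=0.4, min_value=0.6):
    """Heuristic: flame pixels tend to be bright, saturated, and ordered R > G > B.

    r, g, b are 0-255 integers. All thresholds are illustrative assumptions.
    """
    if not (r >= min_red and r > g > b):
        return False
    _, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return s >= min_saturation and v >= min_value


def fire_pixel_ratio(frame):
    """frame: list of rows, each a list of (r, g, b) tuples."""
    total = sum(len(row) for row in frame)
    if total == 0:
        return 0.0
    hits = sum(1 for row in frame for (r, g, b) in row if is_fire_colored(r, g, b))
    return hits / total


def count_fire_frames(frames, min_ratio=0.05):
    """Counter: number of frames whose fire-colored pixel ratio reaches min_ratio."""
    return sum(1 for frame in frames if fire_pixel_ratio(frame) >= min_ratio)


# Two tiny synthetic 2x2 frames: one with an orange pixel, one uniformly gray.
ORANGE, GRAY = (255, 120, 20), (200, 200, 200)
frames = [
    [[ORANGE, GRAY], [GRAY, GRAY]],
    [[GRAY, GRAY], [GRAY, GRAY]],
]
print(count_fire_frames(frames))
```

A rule of this kind is cheap but easily fooled by sunsets or orange lighting, which is one reason the text turns to learned detectors.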
2. Related Work
Current fire and smoke detection systems often leverage machine learning and deep learning algorithms to enhance their capabilities, striving to provide early warnings and minimize fire-related damage [13]. Nonetheless, these systems exhibit their own set of constraints and shortcomings. A prevalent approach is to use convolutional neural networks (CNNs) for real-time image and video analysis to detect smoke and flames. Furthermore, a hybrid system can be developed by integrating smoke detection methods with the current work on the intelligent management of fire disasters [14]. While this method shows promise, it is not devoid of drawbacks. Notably, false alarms are a significant concern, as these algorithms can misinterpret non-fire events, such as steam, fog, or rapidly changing lighting conditions, as smoke or flames. These false positives can lead to unnecessary panic and response costs. Additionally, the system’s performance heavily relies on the quality of input data and the model’s training dataset. If the dataset lacks diversity or fails to represent real-world scenarios, the system may struggle to generalize and accurately detect fires. Furthermore, the models can be susceptible to adversarial attacks, where minor modifications to input images can deceive the algorithms into making incorrect predictions.
Another limitation is the need for high-quality hardware, including cameras and sensors, which can be expensive to install and maintain. In certain scenarios, deploying these systems may be impractical, particularly in remote or resource-constrained areas. Poor performance on samples with large amounts of smoke may arise because the classifier learned during training that fire is often accompanied by smoke [15]. Moreover, the computational resources required for real-time fire and smoke detection can be substantial, limiting the scalability and accessibility of these systems, particularly in regions with limited access to powerful computing infrastructure. Despite these challenges, ongoing research in the field is dedicated to addressing these issues. Researchers are actively working to improve dataset diversity, enhance algorithm robustness, and reduce hardware costs. A fire detection and notification system was developed for blind and visually impaired (BVI) people using deep CNN models and an improved YOLOv4 object detector. The proposed fire detection system was trained using a custom indoor fire image dataset [16]. Additionally, the integration of multiple data sources, such as thermal imaging and environmental sensors, is being explored to boost the accuracy and reliability of fire detection systems. In conclusion, current fire and smoke detection systems based on machine learning and deep learning algorithms have made significant advancements in enhancing fire safety. Nonetheless, they continue to face challenges related to false alarms, dataset quality, hardware prerequisites, and computational resources. It is essential to sustain research and development efforts to alleviate these limitations and make these systems more dependable and accessible for widespread use.
3. Proposed System
The proposed fire and smoke detection system achieves 99% accuracy by combining the EfficientNet and YOLOv5 deep learning models. EfficientNet, a family of convolutional neural networks, strikes a balance between model accuracy and computational efficiency through a systematic approach to scaling network depth, width, and resolution. This approach ensures that the model learns a broad spectrum of features, from low-level details to high-level abstractions. A key advantage of EfficientNet is its compound scaling method, which scales depth, width, and resolution jointly through a single compound coefficient, optimizing accuracy while limiting computational overhead. YOLOv5, a member of the YOLO (“You Only Look Once”) family, is known for its real-time object detection capabilities and high precision. It uses a backbone network based on CSPDarknet53 and incorporates PANet (Path Aggregation Network) to capture multi-scale features effectively. This enhances the model’s accuracy in detecting fire and smoke of various sizes and in various contexts. YOLOv5 employs an anchor-based object detection approach with anchors of different sizes and aspect ratios, enabling it to identify fire and smoke accurately in diverse settings.
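The compound scaling idea can be sketched as follows. The base coefficients (1.2, 1.1, 1.15) are the values reported in the original EfficientNet paper for its baseline search; the function name is hypothetical, and the released B-variants use rounded, hand-adjusted settings that may differ from what this formula yields.

```python
# Base coefficients from the EfficientNet paper (alpha * beta^2 * gamma^2 is about 2).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi):
    """Return (depth, width, resolution, flops) multipliers for coefficient phi.

    A single coefficient phi scales all three dimensions together:
    depth ~ ALPHA**phi, width ~ BETA**phi, resolution ~ GAMMA**phi.
    FLOPs grow roughly as (ALPHA * BETA**2 * GAMMA**2) ** phi, about 2**phi.
    """
    depth = ALPHA ** phi
    width = BETA ** phi
    resolution = GAMMA ** phi
    flops = (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi
    return depth, width, resolution, flops


for phi in range(4):
    d, w, r, f = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, resolution x{r:.2f}, FLOPs x{f:.2f}")
```

Because all three dimensions grow from one coefficient, moving to a larger variant trades a predictable increase in computation for accuracy.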
Figure 2 shows the process by which detection takes place. The first step is image/video channel recognition, which identifies the challenges and the methods best suited to handle them. The next step is image segmentation, which detects the area of the fire in the image or video. The third step is a counter, which counts these segments and reports how many fires there are. One of the referenced papers describes an efficient way to do so. The remaining components concern dataset maintenance and updating, which follow standard database update rules (ACID) and principles.
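One simple way to realize a segment counter is to count connected regions in a binary fire mask. The sketch below assumes that segmentation has already produced such a mask (1 for fire pixels, 0 otherwise) and uses 4-connectivity; the function name and the `min_area` parameter are hypothetical, and this is not necessarily the method used in the referenced paper.

```python
from collections import deque


def count_fire_regions(mask, min_area=1):
    """Count 4-connected regions of truthy cells in a 2D list, ignoring tiny ones."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited fire pixel.
            area = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                area += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:
                regions += 1
    return regions


# Illustrative mask with two three-pixel regions and one isolated pixel.
mask = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 1, 1],
    [0, 1, 0, 0, 0],
]
print(count_fire_regions(mask), count_fire_regions(mask, min_area=2))
```

The `min_area` filter is a cheap way to discard isolated noisy pixels before reporting a fire count.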
EfficientNet and YOLOv5 rely on techniques such as depth-wise separable convolutions, batch normalization, and anchor boxes. These strategies are essential for optimizing image analysis, minimizing computational redundancy, and improving feature learning. They lead to a more effective model structure and reduce the cost of computationally intensive post-processing procedures such as non-maximum suppression. This not only speeds up inference but also enhances the overall efficiency of the fire and smoke detection system, as shown in Figure 3.
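For reference, non-maximum suppression, the post-processing step named above, can be sketched in a few lines. This is a generic greedy version written for illustration, not the implementation used by any particular YOLOv5 release; boxes are assumed to be `(x1, y1, x2, y2)` tuples, and the threshold of 0.5 is an arbitrary choice.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep  # indices into boxes, in descending score order


# Two heavily overlapping detections of one flame, plus a separate detection.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
print(non_max_suppression(boxes, scores))
```

Each candidate is compared against every box already kept, so the cost grows with the number of overlapping detections the network emits.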
4. Results and Discussion
The system reports the number of fire objects in each image frame processed by the deep learning algorithms, the intensity of each detected fire object, and the exact location of a fire that might otherwise be overlooked. To make the system usable anywhere, a web-based GUI framework can be adopted. Most devices, including CCTV cameras, are now connected to a LAN or the Internet, so the application can be run from any browser or in a JavaScript engine such as V8. It therefore becomes platform-independent and suitable for use anywhere, unlike conventional GUIs, which are device-specific. This also cuts costs and removes the need for semiconductor-based and conventional sensors. To enhance fire and smoke detection systems, efforts are directed towards refining models with diverse datasets, expanding their use in real-world environments, including smart homes and industrial settings, and broadening their applications to environmental and health monitoring. Additionally, research focuses on enhancing security against adversarial attacks and integrating these systems with emergency response for faster reactions in critical situations. Reducing costs is also a priority, making these systems more accessible to various users and applications. The image and surveillance video data are taken from the DFire Dataset repository on GitHub.
Figure 4 illustrates the performance of several YOLO versions, namely YOLOv2, YOLOv3, YOLOv4, YOLOv5, YOLOv6, YOLOv7, and YOLOv8, across essential metrics. The results show comparable performance, particularly in precision, and YOLOv5 performs similarly to YOLOv7 but with a slightly higher average precision. The performance of YOLOv6 and YOLOv8 is also shown, giving an overview of how the different YOLO versions fare on key evaluation metrics and aiding the selection of the most suitable version for specific object detection requirements.
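To make the average precision metric concrete, the sketch below computes it from a ranked list of detections using all-point interpolation, which is one common convention. The text does not state which variant was used for Figure 4, and the detections shown are invented for illustration; the function name is hypothetical.

```python
def average_precision(detections, num_ground_truth):
    """Area under the precision-recall curve (all-point interpolation).

    detections: list of (confidence, is_true_positive) pairs for one class.
    num_ground_truth: number of annotated objects of that class.
    """
    if num_ground_truth == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    precisions, recalls = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)
    # Replace each precision by the best precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum rectangle areas between consecutive recall values.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Invented example: four detections ranked by confidence, four annotated fires.
detections = [(0.9, True), (0.8, False), (0.7, True), (0.6, True)]
print(average_precision(detections, num_ground_truth=4))
```

Unlike plain precision, this metric rewards detectors that rank true fires above false alarms across all confidence thresholds.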
5. Conclusions
This study detected fire and smoke with high accuracy using the EfficientNet and YOLOv5 models. The models balance computational economy and precision well. EfficientNet’s systematic scaling and YOLOv5’s real-time object detection help the system capture features at several scales, boosting its fire and smoke detection capabilities. YOLOv5’s anchor-based object detection improves localization and decreases false alarms. By avoiding redundancy, advanced image processing methods reduce computation time and the post-processing burden. False alarms, dataset quality, and the need for high-quality hardware remain challenges. This work emphasizes the need for ongoing research to overcome these constraints. It suggests integrating new data sources and diversifying datasets to make fire and smoke detection systems more accessible and reliable.