Article

FireXplainNet: Optimizing Convolution Block Architecture for Enhanced Wildfire Detection and Interpretability

Department of Software, Sangmyung University, Cheonan 31066, Republic of Korea
* Author to whom correspondence should be addressed.
Electronics 2024, 13(10), 1881; https://doi.org/10.3390/electronics13101881
Submission received: 29 February 2024 / Revised: 16 April 2024 / Accepted: 8 May 2024 / Published: 11 May 2024

Abstract
The early detection of wildfires is a crucial challenge in environmental monitoring, pivotal for effective disaster management and ecological conservation. Traditional detection methods often fail to detect fires accurately and in a timely manner, resulting in significant adverse consequences. This paper presents FireXplainNet, a Convolutional Neural Network (CNN)-based model designed specifically to address these limitations through enhanced efficiency and precision in wildfire detection. We optimized data input via specialized preprocessing techniques, significantly improving detection accuracy on both the Wildfire Image and FLAME datasets. A distinctive feature of our approach is the integration of Local Interpretable Model-agnostic Explanations (LIME), which facilitates a deeper understanding of and trust in the model’s predictive capabilities. Additionally, we have delved into optimizing pretrained models through transfer learning, enriching our analysis and offering insights into the comparative effectiveness of FireXplainNet. The model achieved an accuracy of 87.32% on the FLAME dataset and 98.70% on the Wildfire Image dataset, with average inference times of 0.221 and 0.168 s, respectively. These performance metrics are critical for the application of real-time fire detection systems, underscoring the potential of FireXplainNet in environmental monitoring and disaster management strategies.

1. Introduction

Recent years have seen an alarming increase in the number of wildfires worldwide, highlighting the need for advanced and effective fire detection and management systems. Forests, which cover more than 30% of the Earth’s land area [1], are increasingly vulnerable to these intense wildfires due to factors such as climate change [2] and human activities [3]. In 2022, there were an alarming 66,255 wildfires that burned approximately 7.5 million acres of land in the United States alone [4]. These fires, whether caused by natural phenomena such as lightning and volcanic eruptions or by human-caused factors such as agricultural burning [5], have devastating effects. They lead to habitat destruction, loss of biodiversity, and significant deterioration in air quality, affecting vulnerable populations such as infants, pregnant women, the elderly, and those with chronic health conditions [6,7,8]. The Food and Agriculture Organization of the United Nations highlights the severity of these fires, noting their contribution to greenhouse gas emissions and the release of particulate matter, a major contributor to air pollution [9].
Traditional wildfire detection methods, such as satellite imagery [10,11] and ground-based sensor systems [12,13], are increasingly inadequate in the face of these growing threats. While these traditional methods provide tangible data, they often face limitations due to the complex and unpredictable nature of wildfires. Ground-based sensor systems, for example, struggle with spatial constraints and inadequate resolution for comprehensive early detection in large, densely forested areas [14,15,16]. In addition, their effectiveness is often hampered by delayed response times. Vision-based systems, on the other hand, require manual feature extraction and threshold parameter adjustments, relying heavily on expert knowledge. This dependency restricts them to detecting only the basic characteristics of flames and reduces their adaptability to the dynamic conditions of wildfires [17,18].
In response to these shortcomings, deep learning (DL) models, particularly Convolutional Neural Networks (CNNs), have emerged as a promising solution. These data-driven approaches eliminate the need for manual feature extraction and, unlike traditional methods, are able to learn and refine decision processes in real time [19,20]. By analyzing complex visual data from diverse sources, such as Unmanned Aerial Vehicles (UAVs) [21,22], Closed-Circuit Television (CCTV) cameras [23,24], and satellites [11,25], DL models have revolutionized the field of fire detection. These models significantly enhance the accuracy and reliability of detection systems, marking a paradigm shift from conventional methods to more advanced, data-oriented solutions [26]. However, despite their advancements, DL models typically require high computational costs, longer inference times, and large model sizes, which can be impractical for real-time applications. These challenges highlight the need for more efficient models, designed and tailored specifically for the detection of wildfires.
Moreover, a critical and often underaddressed aspect of AI-based fire detection systems is the lack of explainability and interpretability in these models. The integration of Explainable AI (XAI) into fire detection systems addresses this critical issue by making the output of the AI model understandable and transparent to users. It allows for a better understanding of why certain areas are flagged as potential fire zones, thereby not only improving model performance but also enabling more precise and reliable early detection of fires. This aspect is particularly crucial in modern wildfire management strategies, offering a more insightful and user-friendly approach to handling the complexities of wildfire detection and response [27].
Addressing these challenges, our study presents a novel approach to wildfire detection by employing a relatively simple DL architecture designed for efficiency in resource-constrained environments. The proposed model maintains a balance between performance and computational efficiency, making it suitable for real-world applications. Furthermore, we integrate Explainable AI (XAI) to enhance the interpretability of our model predictions. Our significant contributions not only include the development of a streamlined and interpretable model but also demonstrate that even less complex architectures can indeed compete with and, in some respects, outperform the multi-layered models typically employed in commercial settings. The main contributions of our paper include:
  • We present FireXplainNet, which features a lightweight design with reduced trainable parameters, enhancing computational efficiency. The model is built on a sequential CNN framework, known for its simplicity and adaptability in image classification tasks. This design choice underlines our commitment to creating a model that is both practical and effective for real-world applications, particularly on devices with limited resources.
  • The operational efficiency of FireXplainNet shows lower memory requirements and faster inference times, making it not only accurate but also practical for real-time applications. Unlike general-purpose, pretrained models with complex structures, FireXplainNet’s problem-specific, simplified architecture ensures enhanced performance with minimum resources, which is crucial for rapid response scenarios. This balance of accuracy and efficiency is vital in fire detection scenarios, where a rapid response is crucial. The performance of the model is validated on challenging datasets [28,29], demonstrating its robustness and reliability.
  • We conducted a thorough comparative analysis of FireXplainNet against established pretrained models (https://keras.io/api/applications/ (accessed on 24 January 2024)) using transfer learning. This analysis underscores the superior accuracy of our model in identifying fire incidents, reinforcing its potential for real-time application in fire surveillance systems. Our model not only stands out in accuracy but also excels in precision, recall, and F1 score, establishing its efficacy in diverse and complex fire scenarios.
  • This research incorporates Local Interpretable Model-agnostic Explanations (LIME) [30] to enhance the interpretability of our model. This integration facilitates a deeper understanding of the model’s decision-making process by highlighting critical features responsible for triggering wildfire detection. This level of interpretability enhances the transparency of the AI model and builds trust in its predictions, which is crucial for real-world fire detection applications.

2. Related Work

The evolution of wildfire detection methodologies, particularly with the integration of Artificial Intelligence (AI) and deep learning models, represents a significant stride in addressing the inherent complexities and unpredictability of wildfires. This section presents a review of contemporary advancements in fire detection methodologies.
Meena et al. [31] introduced a Region-Based Convolutional Neural Network (RCNN) that integrates with aerial surveillance for forest fire detection, achieving a commendable 97% accuracy at 20 epochs. Despite its high performance, the RCNN model requires substantial computational resources, which could hinder its scalability and practical deployment in real-world settings. Similarly, Shees et al. [32] developed FireNet-v2, a lightweight CNN optimized for real-time Internet of Things (IoT) applications, achieving 98.43% accuracy. While the model excels in environments where quick detection is paramount, its effectiveness across varied and complex forest ecosystems remains less explored, which could impact its generalizability and reliability in diverse operational scenarios.
In contrast, Pan et al. [33] proposed a computationally efficient pruned deep CNN model, achieving an accuracy of 93.36% and a detection rate of 91.60% through the integration of transfer learning and a window-based analysis approach. Although it addresses the need for faster processing in real-time fire detection systems, its low accuracy compared to more sophisticated models raises concerns about its reliability and effectiveness in diverse forest environments. Meanwhile, Muhammad et al. [34] tailored a CNN architecture inspired by SqueezeNet for CCTV surveillance systems, focusing on reducing computational demands by using smaller convolutional kernels and excluding dense, fully connected layers. This model achieves a balance between efficiency and accuracy, with an accuracy of 94.50%. However, the specific architectural choices aimed at minimizing the computational load might limit the model’s performance in detecting more subtle or varied fire characteristics in complex environments.
Several studies have also explored the generation of synthetic training data to overcome the limitation of scarce datasets. Zhang et al. [35] and Park et al. [36] utilized novel data augmentation techniques to enrich training sets for more effective model training. While these methods enhance model robustness, the synthetic data might not perfectly replicate the complex dynamics of real forest fires, potentially leading to inaccuracies in practical applications. Chaoxia et al. [37] developed an improved Faster R-CNN method for flame detection, incorporating a novel color-guided anchoring strategy and global information integration to enhance flame detection accuracy, while Barmpoutis et al. [38] combined Faster R-CNN with Linear Dynamic Systems (LDSs) for the precise localization of fire regions.
Additionally, in comparative studies, Li et al. [39] highlighted the strengths and weaknesses of various detection models, noting that YOLOv3 outperformed other models like Faster-RCNN and SSD in terms of speed (28 fps) and precision (83.7%). However, its generalization to different fire types remains an area for further improvement. Zhao et al. [40] proposed the Fire-YOLO algorithm, an enhanced version of YOLO specifically for fire detection, focusing on identifying small targets in varied scenarios, although the adaptation to broader types of fires is not fully explored.
Zhang et al. [41] proposed the FT-ResNet50, a modified version of the standard ResNet50, incorporating a Mish activation function and focal loss for wildfire detection in drone images. This model achieved a 79.48% accuracy, demonstrating improvements over the traditional VGG16 and ResNet50 models. However, while the FT-ResNet50 shows promise, the sub-80% accuracy indicates potential limitations in handling the variability and complexity of real-world fire scenarios. Guan et al. [42] introduced the DSA-ResNet (Dual Semantic Attention ResNet) for wildfire image classification based on aerial images. This model enhances detection accuracy by integrating wildfire characteristics across various convolutional layers through an added attention module, achieving 93.65% accuracy. While this represents a substantial advancement, the implementation complexity and computational demands of attention mechanisms in real-time applications warrant further exploration.
Pundir et al. [43] developed a dual deep learning framework that combines image-based and motion-based features for efficient smoke detection using Deep Convolutional Neural Networks (DCNNs). This approach emphasizes the importance of integrating various image analysis techniques for enhanced feature extraction, although the dual-feature integration might increase the computational overhead, impacting real-time deployment. Dutta et al. [44] presented a combined architecture of a separable convolution neural network and digital image processing aimed at the early detection of small-scale forest burns. This model seeks to facilitate early intervention in catastrophic events, though its effectiveness in large-scale and diverse forest conditions remains to be thoroughly assessed.
Hong et al. [45] introduced FireCNN, a novel neural network model designed for active fire detection in remote sensing images. Utilizing multiscale convolution and residual acceptance, this model focuses on efficient feature extraction. However, the scalability and adaptability of FireCNN in different environmental settings need to be validated in further studies.
Furthermore, Treneska et al. [46] explored the use of fine-tuned CNNs (VGG16, VGG19, Inception, ResNet50, and Xception) for the classification of wildfires, with ResNet50 showing superior performance with an 88.01% accuracy rate and a fast test time of 0.12 s in the FLAME dataset. Similarly, Khan et al. [47] employed the VGG19 model to recognize wildfires in aerial images, achieving a 95% accuracy rate through the use of transfer learning. These studies highlight the potential of fine-tuned CNNs in achieving high accuracy and speed, yet the variability in performance across different models and datasets underscores the need for careful model selection and tuning based on specific application requirements.
The existing literature on wildfire detection is often lacking in computational efficiency [29,39,40,42,46,48,49,50], with limited focus on model interpretability [32,33,36,37] and comprehensive comparative analysis [39,40,46,48,51]. Our FireXplainNet addresses these shortcomings by offering a lightweight and efficient design, integrating interpretability through LIME [30], and conducting extensive comparative evaluations, enhancing both practicality and transparency in diverse fire scenarios.

3. Methodology

3.1. Dataset Collection

In this study, we utilized two diverse datasets: the FLAME dataset [29] obtained from IEEE Dataport and the Wildfire Detection Image dataset [28] obtained from Kaggle. Both datasets comprise two distinct classes, Fire and No Fire, and provide a diverse range of real-world fire scenarios, like varying fire intensities and environmental conditions, for model training and validation.
The FLAME dataset [29], comprising a vast collection of 47,992 images, is divided into 31,501 for training, 7874 for validation, and 8617 for testing. The dataset features 30,155 (62.8%) images in the Fire category and 17,837 (37.2%) images labeled as No Fire. Specifically, the training subset contains 25,018 Fire and 14,357 No-Fire images, while the testing subset includes 5137 Fire and 3480 No-Fire images for model training and evaluation.
Conversely, the Wildfire Image dataset [28], though more concise with 1900 images, offers an equally pivotal resource for model development. It is thoughtfully partitioned into 1465 training images, 367 validation images, and 68 testing images, with an even split of Fire and No-Fire images (950 each), ensuring a balanced dataset. The training portion includes 928 Fire and 904 No-Fire images, while the testing segment houses 22 Fire and 46 No-Fire images.
These datasets (as shown in Table 1) are critical to the development of an interpretable convolutional block-based architecture for efficient wildfire detection. These diverse and comprehensive datasets provide a robust foundation for training and testing to enhance wildfire detection capabilities.

3.2. Data Preprocessing

The datasets utilized in our study have different image resolutions, with dimensions of 250 × 250 pixels for the Kaggle dataset and 254 × 254 pixels for the FLAME dataset. We standardized the resolution of the images across both datasets to a uniform size of 254 × 254 pixels. This step was critical in streamlining the training process by ensuring consistency in the input data and minimizing computational complexity. Moreover, to address variations in illumination and contrast across the datasets, we normalized pixel intensities to the [0, 1] range. This ensures consistent brightness and contrast levels across the data, thereby improving the accuracy and adaptability of our model to diverse conditions.
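As a concrete illustration, the minimal sketch below shows how this resizing and normalization could be implemented in a TensorFlow/Keras input pipeline; the function and variable names are illustrative assumptions rather than the authors' actual code.

```python
# Minimal preprocessing sketch, assuming a TensorFlow/Keras input pipeline
# (function and variable names are illustrative, not taken from the paper's code).
import tensorflow as tf

IMG_SIZE = (254, 254)  # uniform resolution used for both datasets

def preprocess(image_path, label):
    """Load an RGB image, resize it to 254x254, and scale pixel values to [0, 1]."""
    raw = tf.io.read_file(image_path)
    img = tf.image.decode_image(raw, channels=3, expand_animations=False)
    img = tf.image.resize(img, IMG_SIZE)        # standardize resolution
    img = tf.cast(img, tf.float32) / 255.0      # normalize intensities to [0, 1]
    return img, label
```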

3.3. FireXplainNet

This model is designed to be lightweight, characterized by a significantly reduced number of trainable parameters (1.9 M), a reduced computation time, and an improved accuracy. In pursuit of these objectives, a sequential Convolutional Neural Network (CNN) is utilized as the foundational architecture. The choice of a sequential CNN stems from its simplicity and widespread preference in image classification tasks due to its straightforward implementation and adaptability.
The proposed model, FireXplainNet, is adeptly designed to process input images ($I_{in}$) with dimensions of 254 × 254 pixels, encompassing three color channels, indicative of the RGB color model. This configuration is particularly advantageous for the model’s primary task of fire detection. The three-channel architecture of the RGB color model is instrumental in capturing distinct fire-related features and offering a comprehensive view of the fire’s characteristics. Such a multi-channel approach enables the model to perform robust feature extraction, employ diverse data augmentation strategies, and enhance overall model generalization capabilities. Furthermore, the consistency of the RGB format with common imaging standards aids in the seamless integration and interpretation of fire detection outputs, making the model more accessible and trustworthy for users in practical applications. Additionally, the variation in intensities across the RGB channels plays a crucial role in improving edge detection capabilities, which is vital for accurately delineating fire boundaries. The input configuration for the model is mathematically represented as $I_{in} \in \mathbb{R}^{254 \times 254 \times 3}$.
Convolutional layers are essential for processing structured grid data, such as the images encountered in fire detection scenarios. These layers work by convolving each input map of layer $(n-1)$ with a two-dimensional filter $(F_{n}^{x} \times F_{n}^{y})$, where $x$ and $y$ denote the spatial dimensions of the filter. Each convolutional layer is composed of neurons with trainable biases and weights, allowing the model to iteratively extract relevant features from the input images. During the feedforward process, the filter dimensions ($F^{x}$ and $F^{y}$) traverse the input maps to generate an output map for the $n$th layer, which is achieved by summing the convolutional responses from the previous $(n-1)$ layers.
The convolutional process to abstract complex features from input images can be encapsulated by the following mathematical representation:
$$I_{out} = \mathrm{ReLU}\left( \sum_{i=1}^{n-1} I_{in} \ast \alpha_{n-1}^{i} \times \omega_{n}^{ij} + \beta_{n}^{j} \right) \qquad (1)$$
where $\ast$ signifies the convolution operation, $\alpha_{n-1}^{i}$ denotes the activation of the filter from the previous layers, $\omega_{n}^{ij}$ represents the weight connecting input and output maps, and $\beta_{n}^{j}$ is the associated bias. The dimensions of $I_{in}$ and $I_{out}$ remain consistent, underscoring the model’s efficiency in handling data.
The architecture of FireXplainNet (as shown in Figure 1) places a significant emphasis on convolutional blocks, which are designed to efficiently extract features from the input. These blocks contain layers that progressively employ filters of varying sizes, ranging from 32 to 512, to ensure the detection of increasingly complex patterns indicative of fire presence. This structured approach to feature extraction is critical for identifying subtle and overt indicators of fire in images. To support stable learning and optimize processing, batch normalization is implemented alongside convolutional operations, facilitating enhanced model training convergence. Additionally, max-pooling layers within the model serve to reduce the spatial dimensions of feature maps, thereby decreasing computational demands while preserving essential feature information, further exemplifying the model’s design efficiency and effectiveness in fire detection tasks.
Furthermore, to optimize the balance between the computational intensity and the effectiveness of the model, we fine-tune parameters such as kernel size and output dimensions with precision. The integration of the Rectified Linear Unit (ReLU) activation function and dropout as a regularization mechanism has been instrumental in enhancing the accuracy of FireXplainNet. FireXplainNet employs the sequence of increasing convolutional filters (32, 64, 128, 256) that are precisely engineered for the layered and nuanced detection of wildfires. This progression allows for the initial capture of general fire features, progressing to more detailed aspects, essential for accurate wildfire identification. The subsequent combination of GlobalAveragePooling2D and BatchNormalization is tailored to enhance the network’s processing efficiency, crucial for real-time fire scenario analysis, while the structured dropout layers at each stage significantly reduce overfitting, ensuring reliable performance in varied environmental conditions. The detailed architecture is presented in Figure 2.
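For illustration, a minimal Keras sketch consistent with this description is given below; the kernel sizes, dropout rates, and output head are assumptions, since the text specifies only the overall block structure (convolution, batch normalization, max pooling, and dropout, followed by global average pooling) and the filter progression.

```python
# Illustrative Keras sketch of a FireXplainNet-style convolutional block architecture.
# Kernel sizes, dropout rates, and the classification head are assumptions; only the
# block structure and filter progression follow the description in the text.
from tensorflow.keras import layers, models

def conv_block(filters, dropout_rate=0.2):
    """One convolutional block: convolution + batch normalization + pooling + dropout."""
    return [
        layers.Conv2D(filters, (3, 3), padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(dropout_rate),
    ]

model = models.Sequential(
    [layers.Input(shape=(254, 254, 3))]                        # I_in in R^(254 x 254 x 3)
    + conv_block(32) + conv_block(64) + conv_block(128) + conv_block(256)
    + [
        layers.GlobalAveragePooling2D(),
        layers.BatchNormalization(),
        layers.Dropout(0.3),
        layers.Dense(2, activation="sigmoid"),                 # Fire / No Fire
    ]
)
model.summary()  # the design is intended to keep the trainable parameter count small
```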

3.4. Explainability of FireXplainNet Using LIME

In the domain of wildfire detection using deep learning models, explainability is a critical aspect. This research incorporates Local Interpretable Model-agnostic Explanations (LIME) [30] to enhance the interpretability of complex models. LIME elucidates the decision-making process of the model by creating an interpretable model around its predictions. This is achieved by perturbing the input data and observing the variations in predictions. This approach allows for the identification of key features that impact the proposed model decisions for specific instances.
We employ the LIME method to enhance the explainability of our complex model f. LIME’s distinctive approach is encapsulated in the following formulation:
$$\xi(x) = \operatorname*{argmin}_{g \in G} \; \mathcal{L}(f, g, \pi_{x}) + \Omega(g) \qquad (2)$$
In Equation (2), $\xi(x)$ represents the explanation model specifically designed for an instance $x$. The model $f$ signifies the complex base model used in our study, while $g$ denotes an interpretable model selected from a set $G$. The expression $\operatorname{argmin}_{g \in G}$ denotes the optimization process that identifies the optimal $g$ minimizing the objective function. This function is composed of two parts: $\mathcal{L}(f, g, \pi_{x})$, which measures the fidelity between the predictions of $f$ and $g$ for perturbed samples near $x$, and $\Omega(g)$, which quantifies the complexity of $g$. This formulation is central to the LIME approach, aiming to generate a model $g$ that not only closely approximates $f$ in the vicinity of $x$ but also remains interpretable, bridging the gap between complex predictive models and understandable explanations.
Applying LIME in our wildfire detection framework enables the discernment of influential features in an image, such as specific color patterns or textures that are indicative of the presence of the fire. This interpretability not only increases the transparency of the AI model but also reinforces trust in its predictions. This is paramount in applications such as wildfire detection, where the rationale behind predictions can substantially influence decision-making and response strategies.
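A hedged sketch of how such an explanation can be produced for a single image is shown below, assuming the open-source `lime` package, scikit-image, and a trained Keras classifier named `model`; the sampling and segmentation settings are illustrative.

```python
# Sketch of generating a LIME explanation for one image (assumes the `lime` package,
# scikit-image, and a trained classifier `model`; parameter values are illustrative).
import numpy as np
from lime import lime_image
from skimage.segmentation import mark_boundaries

explainer = lime_image.LimeImageExplainer()

def predict_fn(batch):
    """Return class probabilities for a batch of perturbed images, as LIME expects."""
    return model.predict(np.asarray(batch), verbose=0)

explanation = explainer.explain_instance(
    image.astype("double"),   # one 254x254x3 image with values in [0, 1]
    predict_fn,
    top_labels=1,
    num_samples=1000,         # number of perturbed samples generated around the instance
)
temp, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True, num_features=5, hide_rest=False
)
overlay = mark_boundaries(temp, mask)  # highlights regions driving the fire prediction
```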

3.5. Employing Transfer Learning with Pretrained Models

In this study, the performance of the FireXplainNet model is evaluated against eight pretrained models: VGG16 [48], VGG19 [48], MobileNetv2 [52], EfficientNetB7 [53], ResNet50v2 [54], Xception [55], DenseNet121 [56], and Inception [57]. These models, despite their maturity, remain benchmarks in image classification, offering a relevant and rigorous standard for assessing FireXplainNet’s performance in wildfire detection. This comparison ensures a comprehensive evaluation against established methods in the field. We opted for pretrained models that have already been trained on the extensive ImageNet dataset. ImageNet is a comprehensive labeled dataset that encompasses over 14 million images across 1000 categories. To ensure an equitable comparison, we employed transfer learning to adapt these pretrained models to our specific dataset, facilitating a direct accuracy comparison with FireXplainNet. Furthermore, we implemented fine-tuning on these models, enhancing their accuracy to optimal levels. The methodology of the transfer learning approach used in this study is depicted in Figure 3.
In our transfer learning implementation, we froze all existing layers of the models except for the last two, which were retrained with our dataset. We introduced a batch normalization layer before the fully connected layer in each model. This was followed by a flattening step and the addition of a dense layer with 128 units and ReLU activation. Subsequently, we included batch normalization and a dropout layer with a rate of 0.4. Another dense layer followed, with a dropout rate of 0.3, culminating in a final dense layer with two units using the sigmoid activation function. This final layer was designed to output probabilities for the two classes. The sigmoid function was chosen for its ability to maintain a constant total probability of “1” across the classes, as described by [58].
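The sketch below illustrates this transfer-learning head on one example backbone (VGG16); it approximates the description above, and the unit count of the second dense layer, which the text does not specify, is an assumption.

```python
# Illustrative transfer-learning setup following the description above, with VGG16 as
# the example backbone. The unit count of the second dense layer is an assumption.
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

base = VGG16(weights="imagenet", include_top=False, input_shape=(254, 254, 3))
for layer in base.layers[:-2]:     # freeze all layers except the last two
    layer.trainable = False

model = models.Sequential([
    base,
    layers.BatchNormalization(),   # batch normalization before the fully connected head
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.BatchNormalization(),
    layers.Dropout(0.4),
    layers.Dense(64, activation="relu"),    # second dense layer (unit count assumed)
    layers.Dropout(0.3),
    layers.Dense(2, activation="sigmoid"),  # probabilities for the Fire / No Fire classes
])
```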

4. Results

4.1. Key Performance Indicators (KPIs)

The selection of robust evaluation metrics is imperative in the field of fire detection, as it critically determines the model’s proficiency in distinguishing between fire and non-fire scenarios, which is essential for reliable alarm or alert systems. For our fire detection model, we have chosen key performance indicators (KPIs), including accuracy, precision, recall, F1 score, and specificity.
Accuracy, as defined in Equation (3), measures the proportion of correctly identified samples to the total number of samples, providing an overall assessment of the model’s effectiveness.
$$\mathrm{Accuracy}\;(A) = \frac{TP + TN}{TP + TN + FP + FN} \qquad (3)$$
Precision, shown in Equation (4), evaluates the exactness of the model in identifying fire, indicating the reliability of positive detections.
$$\mathrm{Precision}\;(P) = \frac{TP}{TP + FP} \qquad (4)$$
Recall, detailed in Equation (5), quantifies the model’s capability to correctly identify all fire instances, aiming to minimize false negatives.
$$\mathrm{Recall}\;(R) = \frac{TP}{TP + FN} \qquad (5)$$
The F1 score, expressed in Equation (6), is a harmonic mean of precision and recall, reflecting the model’s balanced performance in accuracy and completeness.
$$\mathrm{F1\text{-}score} = \frac{2 \times P \times R}{P + R} \qquad (6)$$
Specificity, as formulated in Equation (7), assesses the model’s ability to correctly classify non-fire instances, which is essential for reducing false positives.
$$\mathrm{Specificity} = \frac{TN}{TN + FP} \qquad (7)$$
These KPIs collectively establish a thorough evaluation framework, providing insights into the model’s diagnostic effectiveness in fire detection.
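As a brief illustration, the KPIs in Equations (3)–(7) can be computed directly from a binary confusion matrix; the sketch below assumes scikit-learn and binary label arrays `y_true` and `y_pred` (1 = Fire), which are placeholders.

```python
# Computing the KPIs of Equations (3)-(7) from a binary confusion matrix
# (assumes scikit-learn; y_true and y_pred are binary arrays with 1 = Fire).
from sklearn.metrics import confusion_matrix

tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

accuracy    = (tp + tn) / (tp + tn + fp + fn)                  # Equation (3)
precision   = tp / (tp + fp)                                   # Equation (4)
recall      = tp / (tp + fn)                                   # Equation (5)
f1_score    = 2 * precision * recall / (precision + recall)    # Equation (6)
specificity = tn / (tn + fp)                                   # Equation (7)
```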

4.2. Performance Evaluation on the Wildfire Image Dataset

In our experiments, we trained the proposed model for 20 epochs to optimize the balance between learning effectiveness and computational efficiency. We utilized the Adam optimizer with a learning rate of 0.0001, carefully selected to ensure steady convergence and avoid common training issues, such as getting stuck in local minima or overshooting optimal weights. Figure 4 shows the training and validation accuracy of FireXplainNet on the Wildfire Image dataset.
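A minimal compile-and-fit sketch matching these settings (20 epochs, Adam with a learning rate of 0.0001) is shown below; the loss function, batch construction, and the `train_ds`/`val_ds` dataset objects are assumptions, as they are not detailed in the paper.

```python
# Training-configuration sketch matching the stated settings (Adam, lr = 0.0001, 20 epochs).
# The loss choice and the train_ds / val_ds dataset objects are assumptions.
from tensorflow.keras.optimizers import Adam

model.compile(
    optimizer=Adam(learning_rate=1e-4),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
history = model.fit(train_ds, validation_data=val_ds, epochs=20)
```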
A comprehensive experiment on the Wildfire Image dataset [28] was conducted to evaluate the effectiveness of our proposed model. This study involved comparing FireXplainNet’s performance with several state-of-the-art (SOTA) models, including VGG16 [48], VGG19 [48], MobileNetv2 [52], EfficientNetB7 [53], ResNet50v2 [54], Xception [55], DenseNet121 [56], and Inception [57]. The results demonstrate our model’s exceptional capability and underscore its suitability for real-time fire detection applications. A detailed comparison of our model against SOTA benchmark models, utilizing performance metrics from the test dataset, is presented in Table 2.
In accuracy, our proposed model, FireXplainNet, achieved 98.70%, surpassing other state-of-the-art (SOTA) models: VGG16 (97.13%), VGG19 (93.54%), MobileNetv2 (95.61%), EfficientNetB7 (77.13%), ResNet50v2 (94.66%), Xception (96.67%), DenseNet121 (96.32%), and Inception (97.87%). Specifically, our model outperformed these by margins of 1.57, 5.16, 3.09, 21.57, 4.04, 2.03, 2.38, and 0.83 percentage points, respectively.
As for precision, FireXplainNet attained 98.67%, which is higher than VGG16 (97.10%), VGG19 (93.11%), MobileNetv2 (95.25%), EfficientNetB7 (82.39%), ResNet50v2 (94.05%), Xception (96.70%), DenseNet121 (94.26%), and Inception (97.85%). The proposed model demonstrates a superior ability to correctly identify fire instances, minimizing false positives. It exceeds the precision of the aforementioned models by margins of 1.57%, 5.56%, 3.42%, 16.28%, 4.62%, 1.97%, 4.41%, and 0.82% respectively.
In recall, FireXplainNet achieved 98.69%, which is higher than all compared models. For the F1 score, FireXplainNet also excelled with 98.7%, outperforming all others. These metrics, coupled with the model’s high specificity rate of 98.80%, establish its efficiency and reliability in fire detection scenarios on the Wildfire Image dataset.

4.3. Performance Evaluation on the FLAME Dataset

For a comprehensive evaluation, our model was compared against a suite of state-of-the-art (SOTA) counterparts, including VGG16 [48], VGG19 [48], MobileNetv2 [52], EfficientNetB7 [53], ResNet50v2 [54], Xception [55], DenseNet121 [56], and Inception [57]. This comparative analysis, as delineated in Table 3, underscores the robustness and superior accuracy of our proposed solution in the identification of fire incidents, thus strengthening its potential for real-time application in fire surveillance systems. Figure 5 shows the training and validation accuracy of FireXplainNet on the FLAME dataset.
Our model, FireXplainNet, exhibited exemplary performance in accuracy, achieving 87.32%, a critical metric in the realm of fire detection, where timely and precise identification is paramount. This performance surpasses that of its counterparts, notably outstripping models like VGG16 (63.98%) by 23.34%, MobileNetv2 (65.83%) by 21.49%, and EfficientNetB7 (42.09%) by a significant 45.23%. Such margins of superiority highlight our model’s adeptness at discerning fire scenarios from diverse and complex backdrops, a nontrivial feat, given the erratic nature of fire and smoke patterns.
Precision, a key determinant in reducing false alarms—a common challenge in fire detection systems—stood at an impressive 85.28% for our model. This metric is particularly crucial, as false alarms can lead to unnecessary resource deployment and panic. Our model outperforms other models like VGG16 (62.36%), MobileNetv2 (81.44%), and EfficientNetB7 (72.95%), emphasizing its reliability and accuracy in fire identification.
The recall rate of FireXplainNet, at 87.01%, further cements its ability to correctly identify fire incidents, a critical aspect in preventing the oversight of actual fire occurrences. The model’s F1 score, harmonizing precision and recall, manifests at 86.50%, demonstrating its balanced and reliable performance in fire detection scenarios.
The proposed CNN model not only stands out in accuracy but also excels in precision, recall, and F1 score. This multi-faceted superiority, coupled with its operational efficiency, makes it a promising candidate for real-time fire detection. Its ability to perform exceptionally well on a challenging dataset, reflective of real-world complexities, positions it as a robust and reliable tool in the arsenal against fire outbreaks.

4.4. Interpretative Analysis of Feature Visualizations

LIME has been adopted to elucidate the decision-making process of FireXplainNet in the fire detection task. The integration of LIME facilitates a deeper understanding of the model by highlighting the critical features responsible for triggering wildfire detections. LIME’s methodology of creating local, interpretable models around specific predictions allows for a detailed examination of the influence exerted by various features on the output of the model. As an instance-centric approach, it offers tailored explanations for each prediction, rendering it highly effective for nuanced interpretations. In our setting, LIME distinguishes between true fire signatures and misleading fire-like features by analyzing key image attributes, such as color variations and texture patterns. This focused analysis minimizes false positives and heightens detection accuracy. Moreover, the instance-specific explanations generated by LIME reveal which particular image features predominantly influence FireXplainNet’s predictions, providing actionable insights for model refinement, especially in the variable and challenging visual scenarios of wildfires. The efficacy of this method is exemplified through the visualizations shown in Figure 6, which delineate the feature-wise contributions to the predictions made by our proposed FireXplainNet model.

5. Discussion

In this section, we delve into the comparative analysis of FireXplainNet’s performance against established models, such as VGG16, VGG19, MobileNetv2, EfficientNetB7, ResNet50v2, Xception, DenseNet121, and Inception, across two distinct datasets: the Wildfire Image dataset and the FLAME dataset. The analysis spans both predictive performance metrics (accuracy, precision, recall, and F1 score) and operational metrics (specificity, memory usage, elapsed time, and average inference time), providing a comprehensive evaluation of FireXplainNet’s efficacy and efficiency.
FireXplainNet achieved an accuracy of 87.32% (as shown in Table 3), significantly outperforming DenseNet121, Xception, and VGG19, which achieved accuracy rates of 82.24%, 79.99%, and 74.01%, respectively. This superior accuracy highlights its robustness across different fire scenarios. A detailed comparative analysis of operational metrics across various models on the FLAME dataset, as presented in Table 4, underscores the efficiency of FireXplainNet. It achieves a sensitivity of 88.87% and a specificity of 88.01%, significantly outperforming other models; for instance, Xception and DenseNet121 reach sensitivity rates of only 75.51% and 73.75% and specificity rates of 79.60% and 83.97%, respectively. This margin underscores FireXplainNet’s ability to accurately detect fire incidents and distinguish between fire and non-fire images, a critical attribute for minimizing false alarms and ensuring the reliability of fire detection systems.
Furthermore, in terms of computational efficiency, FireXplainNet outperforms VGG16, VGG19, and MobileNetv2 by using less memory (5.11 MBytes compared to 5.75, 6.47, and 7.44 MBytes, respectively) and less elapsed time (1787.29 s compared to 3173.20, 3610.79, and 1975.76 s, respectively). It also boasts the lowest average inference time of 0.221 s, demonstrating superior processing speed compared to other models. These metrics are particularly vital for real-time fire detection applications, where a swift response and minimal computational load are paramount.
On the Wildfire Image dataset, the proposed model, FireXplainNet, achieved a precision of 98.67% (as shown in Table 2), outperforming VGG16, VGG19, and EfficientNetB7 by margins of 1.57%, 5.56%, and 16.28%, respectively. This high precision highlights FireXplainNet’s ability to effectively minimize false positives and prevent false alarms. In terms of recall, FireXplainNet also demonstrated superior performance, outperforming MobileNetv2, ResNet50v2, Xception, DenseNet121, and Inception by margins of 6.79%, 5.82%, 2.04%, 3.79%, and 0.81%, respectively.
Moreover, when assessing the operational performance of the model on the Wildfire Image dataset (as shown in Table 5), FireXplainNet again stands out, demonstrating outstanding efficiency and effectiveness. It achieved a sensitivity of 99.98%, the highest among all models tested, and a specificity of 99.14%. Its memory usage is also significantly lower than that of other models, such as VGG19, Xception, and DenseNet121, which require 14.19, 19.36, and 21.55 MBytes, respectively. Furthermore, FireXplainNet’s processing time of 108.23 s and average inference time of 0.168 s are substantially lower than those of other models like ResNet50v2, which takes 176.44 s for processing and 0.190 s for inference, and Inception, which requires 192.88 s and 0.198 s, respectively. These metrics, particularly the remarkable sensitivity, highlight FireXplainNet’s operational efficiency and suitability for deployment in resource-constrained environments, making it highly effective for real-time fire detection applications.
FireXplainNet’s balanced performance across both datasets, in terms of predictive accuracy, precision, and operational efficiency, highlights its reliability for fire detection. These attributes are crucial in scenarios where detection speed and accuracy are critical to the successful mitigation and management of fire incidents. The robustness and adaptability of the model make it a promising solution for integration into advanced fire detection systems, including satellite and aerial surveillance platforms, where its efficiency and speed could significantly enhance early detection capabilities and response strategies.
Our decision to utilize real-world datasets without direct experimentation is driven by several factors. First, the availability of comprehensive and varied datasets allows for a broad evaluation of the proposed model capabilities across different scenarios, which is crucial for assessing its generalizability and robustness in fire detection tasks. Moreover, conducting physical experiments involving fire can pose significant safety, ethical, and logistical challenges. By leveraging existing datasets, we aimed to mitigate these concerns while still providing meaningful insights into the model performance.
However, we acknowledge that in the context of fire prevention, factors such as detection speed, reliability under different conditions, and false positive rates are equally important. The emphasis on accuracy is intended to demonstrate the improved adaptability of FireXplainNet compared to other models, such as VGG16, VGG19, and MobileNetv2. A more comprehensive evaluation, considering additional performance metrics, would provide further insight into FireXplainNet’s applicability to real-world fire prevention needs.
The use of transfer learning in our study is motivated by the objective to critically evaluate the adaptability and efficiency of pretrained models for the task of fire detection—a critical issue with significant societal impacts. By utilizing these pretrained models, we aimed to illustrate the constraints faced by these models when repurposed for the task of fire detection, with significant inference and computational costs. Contrary to expectations of their high performance in other domains, our findings suggest that the direct application of these models to real-world scenarios of fire detection tasks may not always result in the anticipated efficiency or cost-effectiveness.

6. Conclusions

In this paper, a convolutional block-based architecture, FireXplainNet, is proposed specifically for efficient fire detection. This block-based convolutional architecture allows our model to perform more nuanced and detailed feature extraction from images, enabling it to identify a wide array of fire characteristics under diverse conditions. Analyzing performance across various challenging datasets, including the FLAME and WildFire datasets, our model consistently outperformed standard SOTA models in accuracy, precision, recall, F1 score, and operational parameters. This superiority not only highlights its effectiveness in fire identification but also underscores its potential for practical application in diverse and complex fire scenarios.
Although the model demonstrates promising outcomes in controlled experimental environments, it still requires a comprehensive evaluation in real-world scenarios. Future efforts will be directed toward testing the model with datasets encompassing a variety of environmental conditions, aiming to evaluate its practical effectiveness in real-time fire detection applications. The ultimate objective is to offer a reliable and efficient tool for fire surveillance, contributing to better disaster management and improved safety measures. Additionally, in line with these objectives, we plan to conduct comprehensive ablation studies on FireXplainNet. This will specifically involve assessing the impact of pretraining on the model’s wildfire detection performance and comparing the pretrained model against a non-pretrained version. Key metrics such as detection accuracy and computational efficiency will be meticulously evaluated to substantiate the pretraining’s contribution to enhancing the model’s robustness and operational reliability in diverse and challenging fire scenarios. In the future, we aim to extend the validation of our results by incorporating additional datasets and enhancing FireXplainNet to include multi-class detection capabilities, such as distinguishing between smoke, combined smoke and fire, and fog. By expanding these features, we aim to comprehensively evaluate our model’s effectiveness and applicability in complex, real-world fire detection scenarios.

Author Contributions

Conceptualization, M.A.K. and H.P.; methodology, M.A.K. and H.P.; software, M.A.K.; validation, M.A.K.; formal analysis, M.A.K. and H.P.; investigation, M.A.K. and H.P.; resources, H.P.; data curation, M.A.K. and H.P.; writing—original draft preparation, M.A.K.; writing—review and editing, M.A.K. and H.P.; visualization, M.A.K. and H.P.; supervision, H.P.; project administration, H.P.; funding acquisition, H.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by a 2022 Research Grant from Sangmyung University, Republic of Korea.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets utilized in this study comprise Aerial imagery pile burn detection using deep learning: The FLAME dataset [29] and Baris Dincer’s Wildfire Detection Image data [28] (https://www.kaggle.com/datasets/brsdincer/wildfire-detection-image-data (accessed on 24 January 2024)).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. World Health Organization. Wildfires. 2022. Available online: https://www.who.int/health-topics/wildfires#tab=tab_1 (accessed on 24 January 2024).
  2. World Health Organization. Climate Change. 2024. Available online: https://www.who.int/health-topics/climate-change#tab=tab_1 (accessed on 24 January 2024).
  3. Centers for Disease Control and Prevention. Wildfires. 2020. Available online: https://www.cdc.gov/climateandhealth/effects/wildfires.htm (accessed on 24 January 2024).
  4. National Centers for Environmental Information (NCEI). Monthly Fire Reports. 2022. Available online: https://www.ncei.noaa.gov/access/monitoring/monthly-report/fire/202213 (accessed on 9 May 2024).
  5. Lecina-Diaz, J.; Chas-Amil, M.L.; Aquilué, N.; Sil, Â.; Brotons, L.; Regos, A.; Touza, J. Incorporating fire-smartness into agricultural policies reduces suppression costs and ecosystem services damages from wildfires. J. Environ. Manag. 2023, 337, 117707. [Google Scholar] [CrossRef]
  6. Driscoll, D.A.; Armenteras, D.; Bennett, A.F.; Brotons, L.; Clarke, M.F.; Doherty, T.S.; Haslem, A.; Kelly, L.T.; Sato, C.F.; Sitters, H.; et al. How fire interacts with habitat loss and fragmentation. Biol. Rev. 2021, 96, 976–998. [Google Scholar] [CrossRef] [PubMed]
  7. Jaffe, D.A.; O’Neill, S.M.; Larkin, N.K.; Holder, A.L.; Peterson, D.L.; Halofsky, J.E.; Rappold, A.G. Wildfire and prescribed burning impacts on air quality in the United States. J. Air Waste Manag. Assoc. 2020, 70, 583–615. [Google Scholar] [CrossRef] [PubMed]
  8. Zhang, Y.; Tingting, Y.; Huang, W.; Yu, P.; Chen, G.; Xu, R.; Song, J.; Guo, Y.; Li, S. Health Impacts of Wildfire Smoke on Children and Adolescents: A Systematic Review and Meta-analysis. Curr. Environ. Health Rep. 2023, 11, 46–60. [Google Scholar] [CrossRef] [PubMed]
  9. Food and Agriculture Organization of the United Nations. State of the World’s Forests. 2020. Available online: https://www.fao.org/state-of-forests (accessed on 24 January 2024).
  10. Khryashchev, V.; Larionov, R. Wildfire segmentation on satellite images using deep learning. In Proceedings of the 2020 Moscow Workshop on Electronic and Networking Technologies (MWENT), Moscow, Russia, 1–13 March 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–5. [Google Scholar]
  11. Rashkovetsky, D.; Mauracher, F.; Langer, M.; Schmitt, M. Wildfire detection from multisensor satellite imagery using deep semantic segmentation. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 7001–7016. [Google Scholar] [CrossRef]
  12. Castagna, J.; Senatore, A.; Pellis, G.; Vitullo, M.; Bencardino, M.; Mendicino, G. Uncertainty assessment of remote sensing-and ground-based methods to estimate wildfire emissions: A case study in Calabria region (Italy). Air Qual. Atmos. Health 2023, 16, 705–717. [Google Scholar] [CrossRef]
  13. Baijnath-Rodino, J.A.; Martinez, A.; York, R.A.; Foufoula-Georgiou, E.; AghaKouchak, A.; Banerjee, T. Quantifying the effectiveness of shaded fuel breaks from ground-based, aerial, and spaceborne observations. For. Ecol. Manag. 2023, 543, 121142. [Google Scholar] [CrossRef]
  14. Li, J.; Yan, B.; Zhang, M.; Zhang, J.; Jin, B.; Wang, Y.; Wang, D. Long-range Raman distributed fiber temperature sensor with early warning model for fire detection and prevention. IEEE Sens. J. 2019, 19, 3711–3717. [Google Scholar] [CrossRef]
  15. Qiu, X.; Wei, Y.; Li, N.; Guo, A.; Zhang, E.; Li, C.; Peng, Y.; Wei, J.; Zang, Z. Development of an early warning fire detection system based on a laser spectroscopic carbon monoxide sensor using a 32-bit system-on-chip. Infrared Phys. Technol. 2019, 96, 44–51. [Google Scholar] [CrossRef]
  16. Rjoub, D.; Alsharoa, A.; Masadeh, A. Early wildfire detection using UAVs integrated with air quality and LiDAR sensors. In Proceedings of the 2022 IEEE 96th Vehicular Technology Conference (VTC2022-Fall), London, UK, 26–29 September 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–5. [Google Scholar]
  17. Celik, T.; Demirel, H. Fire detection in video sequences using a generic color model. Fire Saf. J. 2009, 44, 147–158. [Google Scholar] [CrossRef]
  18. Chen, T.H.; Wu, P.H.; Chiou, Y.C. An early fire-detection method based on image processing. In Proceedings of the 2004 International Conference on Image Processing, Singapore, 24–27 October 2004; IEEE: Piscataway, NJ, USA, 2004; Volume 3, pp. 1707–1710. [Google Scholar]
  19. Dunnings, A.J.; Breckon, T.P. Experimentally defined convolutional neural network architecture variants for non-temporal real-time fire detection. In Proceedings of the 2018 25th IEEE International Conference on Image Processing (ICIP), Athens, Greece, 7–10 October 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1558–1562. [Google Scholar]
  20. Muhammad, K.; Khan, S.; Elhoseny, M.; Ahmed, S.H.; Baik, S.W. Efficient fire detection for uncertain surveillance environment. IEEE Trans. Ind. Inform. 2019, 15, 3113–3122. [Google Scholar] [CrossRef]
  21. Saadat, M.N.; Husen, M.N. An application framework for forest fire and haze detection with data acquisition using unmanned aerial vehicle. In Proceedings of the 12th International Conference on Ubiquitous Information Management and Communication, Langkawi, Malaysia, 5–7 January 2018; pp. 1–7. [Google Scholar]
  22. Dang-Ngoc, H.; Nguyen-Trung, H. Evaluation of forest fire detection model using video captured by UAVs. In Proceedings of the 2019 19th International Symposium on Communications and Information Technologies (ISCIT), Ho Chi Minh City, Vietnam, 25–27 September 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 513–518. [Google Scholar]
  23. Ahn, Y.; Choi, H.; Kim, B.S. Development of early fire detection model for buildings using computer vision-based CCTV. J. Build. Eng. 2023, 65, 105647. [Google Scholar] [CrossRef]
  24. Park, M.; Jeon, Y.; Bak, J.; Park, S. Forest-fire response system using deep-learning-based approaches with CCTV images and weather data. IEEE Access 2022, 10, 66061–66071. [Google Scholar]
  25. Thangavel, K.; Spiller, D.; Sabatini, R.; Amici, S.; Sasidharan, S.T.; Fayek, H.; Marzocca, P. Autonomous Satellite Wildfire Detection Using Hyperspectral Imagery and Neural Networks: A Case Study on Australian Wildfire. Remote Sens. 2023, 15, 720. [Google Scholar] [CrossRef]
  26. Sathyakala, G.; Kirthika, V.; Aishwarya, B. Computer vision based fire detection with a video alert system. In Proceedings of the 2018 International Conference on Communication and Signal Processing (ICCSP), Chennai, India, 3–5 April 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 725–727. [Google Scholar]
  27. Ghali, R.; Akhloufi, M.A.; Mseddi, W.S. Deep learning and transformer approaches for UAV-based wildfire detection and segmentation. Sensors 2022, 22, 1977. [Google Scholar] [CrossRef] [PubMed]
  28. Dincer, B. Wildfire Detection Image Data. 2021. Available online: https://www.kaggle.com/datasets/brsdincer/wildfire-detection-image-data (accessed on 24 January 2024).
  29. Shamsoshoara, A.; Afghah, F.; Razi, A.; Zheng, L.; Fulé, P.Z.; Blasch, E. Aerial imagery pile burn detection using deep learning: The FLAME dataset. Comput. Netw. 2021, 193, 108001. [Google Scholar] [CrossRef]
  30. Ribeiro, M.T.; Singh, S.; Guestrin, C. “Why should i trust you?” Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1135–1144. [Google Scholar]
  31. Meena, U.; Munjal, G.; Sachdeva, S.; Garg, P.; Dagar, D.; Gangal, A. RCNN Architecture for Forest Fire Detection. In Proceedings of the 2023 13th International Conference on Cloud Computing, Data Science & Engineering (Confluence), Noida, India, 19–20 January 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 699–704. [Google Scholar]
  32. Shees, A.; Ansari, M.S.; Varshney, A.; Asghar, M.N.; Kanwal, N. FireNet-v2: Improved Lightweight Fire Detection Model for Real-Time IoT Applications. Procedia Comput. Sci. 2023, 218, 2233–2242. [Google Scholar] [CrossRef]
  33. Pan, H.; Badawi, D.; Cetin, A.E. Computationally efficient wildfire detection method using a deep convolutional network pruned via fourier analysis. Sensors 2020, 20, 2891. [Google Scholar] [CrossRef] [PubMed]
  34. Muhammad, K.; Ahmad, J.; Lv, Z.; Bellavista, P.; Yang, P.; Baik, S.W. Efficient deep CNN-based fire detection and localization in video surveillance applications. IEEE Trans. Syst. Man Cybern. Syst. 2018, 49, 1419–1434. [Google Scholar] [CrossRef]
  35. Zhang, Q.X.; Lin, G.H.; Zhang, Y.M.; Xu, G.; Wang, J.J. Wildland forest fire smoke detection based on faster R-CNN using synthetic smoke images. Procedia Eng. 2018, 211, 441–446. [Google Scholar] [CrossRef]
  36. Park, M.; Tran, D.Q.; Bak, J.; Park, S. Advanced wildfire detection using generative adversarial network-based augmented datasets and weakly supervised object localization. Int. J. Appl. Earth Obs. Geoinf. 2022, 114, 103052. [Google Scholar] [CrossRef]
  37. Chaoxia, C.; Shang, W.; Zhang, F. Information-guided flame detection based on faster R-CNN. IEEE Access 2020, 8, 58923–58932. [Google Scholar] [CrossRef]
  38. Barmpoutis, P.; Dimitropoulos, K.; Kaza, K.; Grammalidis, N. Fire detection from images using faster R-CNN and multidimensional texture analysis. In Proceedings of the ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 12–17 May 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 8301–8305. [Google Scholar]
  39. Li, P.; Zhao, W. Image fire detection algorithms based on convolutional neural networks. Case Stud. Therm. Eng. 2020, 19, 100625. [Google Scholar] [CrossRef]
  40. Zhao, L.; Zhi, L.; Zhao, C.; Zheng, W. Fire-YOLO: A small target object detection method for fire inspection. Sustainability 2022, 14, 4930. [Google Scholar] [CrossRef]
  41. Zhang, L.; Wang, M.; Fu, Y.; Ding, Y. A Forest Fire Recognition Method Using UAV Images Based on Transfer Learning. Forests 2022, 13, 975. [Google Scholar] [CrossRef]
  42. Guan, Z.; Miao, X.; Mu, Y.; Sun, Q.; Ye, Q.; Gao, D. Forest fire segmentation from Aerial Imagery data Using an improved instance segmentation model. Remote Sens. 2022, 14, 3159. [Google Scholar] [CrossRef]
  43. Pundir, A.S.; Raman, B. Dual deep learning model for image based smoke detection. Fire Technol. 2019, 55, 2419–2442. [Google Scholar] [CrossRef]
  44. Dutta, S.; Ghosh, S. Forest fire detection using combined architecture of separable convolution and image processing. In Proceedings of the 2021 1st International Conference on Artificial Intelligence and Data Analytics (CAIDA), Riyadh, Saudi Arabia, 6–7 April 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 36–41. [Google Scholar]
  45. Hong, Z.; Tang, Z.; Pan, H.; Zhang, Y.; Zheng, Z.; Zhou, R.; Ma, Z.; Zhang, Y.; Han, Y.; Wang, J.; et al. Active fire detection using a novel convolutional neural network based on Himawari-8 satellite images. Front. Environ. Sci. 2022, 10, 794028. [Google Scholar] [CrossRef]
  46. Treneska, S.; Stojkoska, B.R. Wildfire detection from UAV collected images using transfer learning. In Proceedings of the 18th International Conference on Informatics and Information Technologies, Skopje, North Macedonia, 19–20 April 2021; pp. 6–7. [Google Scholar]
  47. Khan, A.; Hassan, B.; Khan, S.; Ahmed, R.; Abuassba, A. DeepFire: A novel dataset and deep transfer learning benchmark for forest fire detection. Mob. Inf. Syst. 2022, 2022, 5358359. [Google Scholar] [CrossRef]
  48. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
  49. Sousa, M.J.; Moutinho, A.; Almeida, M. Wildfire detection using transfer learning on augmented datasets. Expert Syst. Appl. 2020, 142, 112975. [Google Scholar] [CrossRef]
  50. Tang, Y.; Feng, H.; Chen, J.; Chen, Y. ForestResNet: A deep learning algorithm for forest image classification. J. Phys. Conf. Ser. 2021, 2024, 012053. [Google Scholar] [CrossRef]
  51. Ahmad, K.; Khan, M.S.; Ahmed, F.; Driss, M.; Boulila, W.; Alazeb, A.; Alsulami, M.; Alshehri, M.S.; Ghadi, Y.Y.; Ahmad, J. FireXnet: An explainable AI-based tailored deep learning model for wildfire detection on resource-constrained devices. Fire Ecol. 2023, 19, 54. [Google Scholar] [CrossRef]
  52. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861. [Google Scholar]
  53. Tan, M.; Le, Q. Efficientnet: Rethinking model scaling for convolutional neural networks. In Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; pp. 6105–6114. [Google Scholar]
  54. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  55. Chollet, F. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1251–1258. [Google Scholar]
  56. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708. [Google Scholar]
  57. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826. [Google Scholar]
  58. Umair, M.; Khan, M.S.; Ahmed, F.; Baothman, F.; Alqahtani, F.; Alian, M.; Ahmad, J. Detection of COVID-19 using transfer learning and grad-cam visualization on indigenously collected X-ray dataset. Sensors 2021, 21, 5813. [Google Scholar] [CrossRef]
Figure 1. Block diagram of the FireXplainNet architecture.
Figure 2. Proposed architecture of FireXplainNet.
Figure 3. Overview of the transfer learning approach.
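As a rough illustration of the transfer-learning approach outlined in Figure 3, the sketch below freezes an ImageNet-pretrained VGG16 backbone and trains a small binary fire/no-fire head on top. The head width, dropout rate, and optimizer settings are illustrative assumptions, not the exact configuration reported in the paper.

```python
# Minimal transfer-learning sketch (assumed 224x224 RGB inputs, binary fire / no-fire output).
import tensorflow as tf
from tensorflow.keras import layers, models

# Pretrained convolutional backbone, frozen so that only the new head is trained.
base = tf.keras.applications.VGG16(include_top=False, weights="imagenet",
                                   input_shape=(224, 224, 3))
base.trainable = False

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation="relu"),   # illustrative head size
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),  # fire vs. no fire
])

model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=20)
```

Swapping `VGG16` for `MobileNetV2`, `ResNet50V2`, `Xception`, `DenseNet121`, `InceptionV3`, or `EfficientNetB7` from `tf.keras.applications` yields the other pretrained baselines compared in Tables 2–5.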
Figure 4. FireXplainNet training and validation accuracy on Wildfire Image dataset [28].
Figure 5. FireXplainNet training and validation accuracy on FLAME dataset.
Figure 6. An example of LIME depicting the impact of individual features on a model prediction.
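Figure 6 shows LIME highlighting the superpixel regions that most influenced a single prediction. The sketch below, assuming a trained Keras classifier `model` (sigmoid fire/no-fire output) and a single RGB image `image` with 0–255 pixel values, shows one common way to generate such an explanation with the `lime` package; the perturbation-sample and feature counts are illustrative choices, not the settings used in the paper.

```python
# Minimal LIME sketch for an image classifier (model and image are assumed to exist).
import numpy as np
from lime import lime_image
from skimage.segmentation import mark_boundaries

def classifier_fn(images):
    # LIME passes a batch of perturbed images; return per-class probabilities.
    probs = model.predict(np.asarray(images), verbose=0)  # shape (n, 1), sigmoid output
    return np.hstack([1.0 - probs, probs])                # columns: [no fire, fire]

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(image.astype("double"), classifier_fn,
                                         top_labels=2, hide_color=0, num_samples=1000)

# Overlay the superpixels that contributed most to the top predicted class.
temp, mask = explanation.get_image_and_mask(explanation.top_labels[0],
                                            positive_only=True, num_features=5,
                                            hide_rest=False)
overlay = mark_boundaries(temp / 255.0, mask)  # assumes 0-255 inputs; visualize with matplotlib
```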
Table 1. Overview of FLAME and Wildfire Image datasets.
| Dataset | Total Images | Classes | Training Set | Validation Set | Testing Set | Label Distribution |
| FLAME dataset [29] | 47,992 | Fire, No Fire | 31,501 | 7874 | 8617 | Fire: 62.80%; No Fire: 37.20% |
| Wildfire Image dataset [28] | 1900 | Fire, No Fire | 1465 | 367 | 68 | Fire: 50.00%; No Fire: 50.00% |
| Merged dataset | 49,892 | Fire, No Fire | 32,966 | 8241 | 8685 | Fire: 62.34%; No Fire: 37.66% |
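As a rough sketch of how the splits in Table 1 could be produced, the snippet below loads a directory of fire/no-fire images and divides it approximately 66/16/18 at the batch level, matching the FLAME proportions; the folder layout, image size, and split mechanics are assumptions for illustration rather than the paper's preprocessing pipeline.

```python
# Hypothetical directory-based loader; data_dir/Fire and data_dir/No Fire subfolders assumed.
import tensorflow as tf

data_dir = "flame_dataset"  # hypothetical path
full_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir, label_mode="binary", image_size=(224, 224),
    batch_size=32, shuffle=True, seed=42)

n = full_ds.cardinality().numpy()        # number of batches
train_ds = full_ds.take(int(0.66 * n))   # ~66% training (cf. 31,501 / 47,992)
rest_ds  = full_ds.skip(int(0.66 * n))
val_ds   = rest_ds.take(int(0.16 * n))   # ~16% validation
test_ds  = rest_ds.skip(int(0.16 * n))   # remaining ~18% testing

# Scale pixel values to [0, 1] before feeding the network.
rescale = tf.keras.layers.Rescaling(1.0 / 255)
train_ds = train_ds.map(lambda x, y: (rescale(x), y))
```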
Table 2. Comparative performance metrics for fire detection on Wildfire Image dataset [28].
| Model | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%) |
| VGG16 [48] | 97.13 | 97.10 | 97.12 | 97.13 |
| VGG19 [48] | 93.54 | 93.11 | 92.87 | 93.40 |
| MobileNetv2 [52] | 95.61 | 95.25 | 91.90 | 92.21 |
| EfficientNetB7 [53] | 77.13 | 82.39 | 78.56 | 72.54 |
| ResNet50v2 [54] | 94.66 | 94.05 | 92.87 | 91.98 |
| Xception [55] | 96.67 | 96.70 | 96.65 | 96.67 |
| DenseNet121 [56] | 96.32 | 94.26 | 94.90 | 93.65 |
| Inception [57] | 97.87 | 97.85 | 97.88 | 97.86 |
| FireXplainNet | 98.70 | 98.67 | 98.69 | 98.70 |
Table 3. Comparative performance metrics for fire detection on FLAME dataset.
| Model | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%) |
| VGG16 [48] | 63.98 | 62.36 | 62.03 | 62.59 |
| VGG19 [48] | 74.01 | 73.54 | 72.52 | 70.15 |
| MobileNetv2 [52] | 65.83 | 81.44 | 65.83 | 64.06 |
| EfficientNetB7 [53] | 42.09 | 72.95 | 42.09 | 27.00 |
| ResNet50v2 [54] | 69.30 | 72.12 | 69.30 | 69.56 |
| Xception [55] | 79.99 | 80.18 | 79.99 | 80.06 |
| DenseNet121 [56] | 82.24 | 84.94 | 82.24 | 82.40 |
| Inception [57] | 76.24 | 72.71 | 74.24 | 71.08 |
| FireXplainNet | 87.32 | 85.28 | 87.01 | 86.50 |
Table 4. Comparison of operational metrics across models on the FLAME dataset.
| Model | Sensitivity (%) | Specificity (%) | Memory Usage (MB) | Elapsed Time (s) | Average Inference Time (s) |
| VGG16 [48] | 61.11 | 54.99 | 5.75 | 3173.20 | 0.236 |
| VGG19 [48] | 52.14 | 71.69 | 6.47 | 3610.79 | 0.244 |
| MobileNetv2 [52] | 82.84 | 71.34 | 7.44 | 1975.76 | 0.235 |
| EfficientNetB7 [53] | 50.01 | 51.39 | 8.72 | 2712.52 | 0.287 |
| ResNet50v2 [54] | 65.63 | 70.70 | 10.62 | 2938.89 | 0.246 |
| Xception [55] | 75.51 | 79.60 | 11.66 | 3709.46 | 0.256 |
| DenseNet121 [56] | 73.75 | 83.97 | 13.37 | 3745.86 | 0.329 |
| Inception [57] | 77.04 | 78.63 | 15.06 | 2999.20 | 0.258 |
| FireXplainNet | 88.87 | 88.01 | 5.11 | 1787.29 | 0.221 |
Table 5. Comparison of operational metrics across models on the Wildfire Image dataset [28].
| Model | Sensitivity (%) | Specificity (%) | Memory Usage (MB) | Elapsed Time (s) | Average Inference Time (s) |
| VGG16 [48] | 97.72 | 97.73 | 12.58 | 172.95 | 0.177 |
| VGG19 [48] | 95.55 | 95.55 | 14.19 | 185.39 | 0.171 |
| MobileNetv2 [52] | 99.02 | 97.17 | 15.11 | 128.72 | 0.181 |
| EfficientNetB7 [53] | 65.90 | 65.91 | 17.22 | 178.30 | 0.193 |
| ResNet50v2 [54] | 96.64 | 96.64 | 18.65 | 176.44 | 0.190 |
| Xception [55] | 99.97 | 96.21 | 19.36 | 211.18 | 0.187 |
| DenseNet121 [56] | 97.67 | 98.11 | 21.55 | 267.05 | 0.206 |
| Inception [57] | 88.76 | 97.09 | 24.40 | 192.88 | 0.198 |
| FireXplainNet | 99.98 | 99.14 | 9.45 | 108.23 | 0.168 |