1. Introduction
As a natural hazard, wildfires are one of the most important drivers of ecosystem evolution in the Earth system on a global scale [1,2,3]. Recently, the frequency of wildfires has increased significantly around the world due to climate change and human activities [4,5]. Wildfires can affect the environment in several ways, for example through soil erosion, increased flood risk, and habitat degradation for wildlife [6,7]. Furthermore, wildfires generate a wide range of pollutants, including greenhouse gases (i.e., methane and carbon dioxide) [8].
Burned area mapping (BAM) is useful for predicting fire behavior, estimating the biomass burned, supporting compensation claims from insurance companies, and estimating the greenhouse gases emitted [9,10]. As a result, the generation of reliable and accurate burned area maps is necessary for management and planning in support of decision making. BAM by traditional methods (e.g., field surveys) is a major challenge; these methods have limitations, such as the wide areas to be covered and the lack of direct access to the region of interest, which lead to large time and financial costs [10].
The Earth observation satellite fleet has steadily grown over the last few decades [11]. The diversity of Earth observation datasets means that remote sensing (RS) is now recognized as a key tool for providing valuable information about the Earth at low cost and within short time frames on a global scale [12]. The upcoming series of RS sensors (e.g., Landsat-9, PRecursore IperSpettrale della Missione Applicativa (PRISMA), Sentinel-5) provides improvements in spatial, temporal, and spectral detail, and RS has become a routine tool with an extensive range of applications [13,14]. The most common applications of RS include classification [15,16] and the detection of targets [17,18] and changes [19,20].
The diversity of RS Earth observation imagery and its free availability have made the monitoring of changes following disasters a hot topic for research [21,22,23,24,25,26,27,28]. Indeed, many BAM products are now available on a global scale, and they differ in terms of spatial resolution and the reliability of the burned areas mapped. Based on spatial resolution, recent BAM methods can be categorized into two main groups: (1) methods based on coarse spatial resolution satellite sensors and (2) methods based on fine spatial resolution sensors.
Burned area mapping based on low- and medium-resolution satellite imagery is common in the RS community. In recent years, many studies have performed BAM based on the Moderate Resolution Imaging Spectroradiometer (MODIS), Sentinel-3, the Medium Resolution Imaging Spectrometer (MERIS), and the Visible/Infrared Imager Radiometer Suite (VIIRS) [29,30,31]. However, while these sensors have a high temporal resolution, they suffer from low spatial resolution, so accurate BAM for small areas is a major challenge due to mixed pixels. Furthermore, the complex diversity of scenes can cause the spectral response of a burned pixel to be mixed with that of other materials. In addition, these methods rely on rule-based classification and manual feature extraction, so that extracting suitable features and finding optimum threshold values are time consuming.
Recently, with the arrival of a new series of cloud computing platforms (e.g., Google Earth Engine, Microsoft Azure), BAM using fine-resolution datasets has attracted increasing attention. The capacity of cloud computing platforms has created a great opportunity for BAM based on high-resolution datasets and advanced machine-learning methods for accurate mapping. Based on the structure of the algorithm, these methods can be categorized into two main groups: (1) BAM by conventional machine-learning methods and (2) BAM via deep-learning-based frameworks.
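As a minimal sketch of what such a cloud-based workflow might look like, the following Python snippet uses the Google Earth Engine API to build pre- and post-fire Sentinel-2 median composites; the dates and area of interest are placeholder assumptions and do not correspond to the datasets used in this study.

```python
# Minimal sketch of cloud-based pre/post-fire compositing with the
# Google Earth Engine Python API. Dates and the area of interest are
# illustrative placeholders, not the data used in this study.
import ee

ee.Initialize()

aoi = ee.Geometry.Rectangle([21.5, 38.0, 22.5, 38.8])  # hypothetical area of interest

def s2_median(start, end):
    """Median composite of Sentinel-2 surface reflectance for a given period."""
    return (ee.ImageCollection('COPERNICUS/S2_SR')
            .filterBounds(aoi)
            .filterDate(start, end)
            .filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE', 20))
            .median()
            .clip(aoi))

pre_fire = s2_median('2021-07-01', '2021-07-31')
post_fire = s2_median('2021-08-15', '2021-09-15')
```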
Burned area mapping based on conventional machine-learning methods typically extracts spectral and spatial features and then delineates the burned areas with a classifier [10]. For instance, Donezar, et al. [32] designed a BAM framework based on a multitemporal change-detection method and time series synthetic aperture radar (SAR) imagery. They used an object-based image analysis method for classification of the SAR imagery, and the Shuttle Radar Topography Mission (SRTM) digital elevation model to enhance their BAM results. Additionally, Xulu, et al. [33] proposed a BAM method based on the differenced normalized burn ratio and Sentinel-2 imagery in the cloud-based Google Earth Engine, using a random forest classifier for the BAM. They reported an overall accuracy close to 97% for the detection of burned areas. Moreover, Seydi, Akhoondzadeh, Amani and Mahdavi [10] evaluated the performance of a statistical machine-learning method for BAM using the Google Earth Engine and pre/post-fire Sentinel-2 imagery. They also evaluated potential spectral and spatial texture features using a Harris hawks optimization algorithm for the BAM, and reported an accuracy of 92% with the random forest classifier on the validation dataset. Liu, et al. [34] proposed a new index for BAM based on bi-temporal Landsat-8 imagery and an automatic thresholding method. They evaluated the efficiency of their proposed method in different areas, and their BAM results showed that the method had high efficiency.
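To make the index-plus-threshold family of approaches concrete, the sketch below computes a differenced normalized burn ratio (dNBR) from pre- and post-fire NIR/SWIR bands and derives an automatic threshold with Otsu's method; it is an illustrative example only, not the exact index or thresholding rule used in the studies cited above.

```python
# Minimal sketch of index-based BAM: differenced normalized burn ratio
# (dNBR) followed by an automatic (Otsu) threshold. Band arrays are
# assumed to be co-registered reflectance rasters loaded elsewhere.
import numpy as np
from skimage.filters import threshold_otsu

def nbr(nir: np.ndarray, swir: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Normalized burn ratio: (NIR - SWIR) / (NIR + SWIR)."""
    return (nir - swir) / (nir + swir + eps)

def burned_mask(nir_pre, swir_pre, nir_post, swir_post) -> np.ndarray:
    """Binary burned/unburned mask from the pre/post-fire dNBR."""
    dnbr = nbr(nir_pre, swir_pre) - nbr(nir_post, swir_post)
    t = threshold_otsu(dnbr)      # data-driven threshold
    return dnbr > t               # burned pixels show higher dNBR
```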
Recently, deep-learning-based approaches have been applied increasingly to the mapping of RS imagery, with promising results. These methods can extract high-level features from the raw data automatically through convolution layers, which has motivated their use for BAM; BAM based on deep learning has become a hot topic of research, with many methods being proposed. For instance, Nolde, et al. [35] designed a large-scale, near-real-time burned area monitoring system based on a morphological active contour approach. The framework comprises several steps: (1) generation of a normalized difference vegetation index (NDVI) for pre/post-fire conditions; (2) determination of the region of interest based on active fires, anomaly detection, and region-growing methodologies; (3) extraction of an accurate burned area perimeter based on morphological snakes; (4) confidence evaluation based on a burn area index; and (5) tracking. The result was an accuracy of 76% when evaluated against reference data. Knopp, et al. [36] carried out BAM by deep-learning-based semantic segmentation using mono-temporal Sentinel-2 imagery. They used the U-Net architecture, and a binary change map was obtained by thresholding the U-Net probability map. They reported difficulties of the segmentation model in some areas, such as agricultural fields, rocky coastlines, and lake shores. de Bem, et al. [37] investigated the effect of patch size on burned area classification with deep-learning methods using Landsat-8 OLI imagery. Three deep-learning methods were investigated: a simple convolutional neural network (CNN), U-Net, and Res-U-Net. Their results showed that Res-U-Net had high efficiency with a patch size of 256 × 256. Hu, Ban and Nascetti [26] evaluated the potential of deep-learning methods for BAM based on uni-temporal multispectral Sentinel-2 and Landsat-8 datasets. Their study showed that deep-learning methods have a high potential for BAM in comparison with machine-learning methods. Ban, et al. [38] examined the capacity of time series SAR imagery for BAM with a deep-learning method. Their CNN-based framework was developed to automatically detect burned areas by investigating backscatter variations in the time series of Sentinel-1 SAR imagery, and they reported an accuracy of <95% for BAM. Zhang, et al. [39] proposed a deep-learning framework for mapping burned areas based on the fusion of Sentinel-1 and Sentinel-2 imagery. They investigated two scenarios for training the deep-learning method: (1) continuous joint training with all historical data and (2) learning-without-forgetting based on new incoming data alone, and reported that the second scenario achieved an overall accuracy close to 90%. Zhang, Ban and Nascetti [39] presented a deep-learning-based BAM framework using the fusion of optical and radar datasets. They proposed a CNN with two convolution layers, max-pooling, and two fully connected layers, and showed that increasing the complexity of the network raised the computing requirements while the burned area detection results were not enhanced. Farasin, et al. [40] presented an automatic framework for evaluating the damage severity level based on a supervised deep-learning method and post-fire Sentinel-2 satellite imagery. They used a double-step U-Net architecture for two tasks: classification, which generated binary damage maps, and regression, which generated damage severity levels. Lestari, et al. [41] compared statistical machine-learning methods and a CNN classifier for BAM using optical and SAR imagery. Their results showed that the CNN method had high efficiency in comparison with the other machine-learning methods with texture features, and that the fusion of optical and SAR imagery can enhance BAM results. Belenguer-Plomer, et al. [42] developed a CNN-based BAM method that combines active and passive datasets: Sentinel-1 and Sentinel-2 data were used by the CNN algorithm to generate burned areas. Their CNN architecture included two convolution layers, a max-pooling layer, and two fully connected layers, and their BAM results showed that combining Sentinel-1 and Sentinel-2 imagery can provide improvements.
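For orientation, a patch-based CNN of roughly the scale described in [39,42] (two convolution layers, max-pooling, and two fully connected layers) could be sketched as follows; the layer widths, patch size, and band count are assumptions for illustration and do not reproduce the configurations of the cited works.

```python
# Illustrative patch-based CNN of the scale described in the text:
# two convolution layers, max-pooling, and two fully connected layers.
# Channel counts and the 16x16 patch size are assumed values.
import torch
import torch.nn as nn

class SmallBurnCNN(nn.Module):
    def __init__(self, in_bands: int = 12, patch: int = 16):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_bands, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        flat = 64 * (patch // 2) ** 2
        self.classifier = nn.Sequential(
            nn.Linear(flat, 128), nn.ReLU(),
            nn.Linear(128, 2),    # burned / unburned scores
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(x.flatten(1))

# Example: a batch of 8 patches with 12 bands, 16x16 pixels each -> logits of shape (8, 2).
logits = SmallBurnCNN()(torch.randn(8, 12, 16, 16))
```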
Although many research efforts have proposed algorithms for BAM and applied them to fine-resolution optical and SAR RS imagery, many limitations remain: (1) Semantic-segmentation-based methods (e.g., U-Net, DeepLabV3+, and SegNet) have provided promising results, but they need large labeled datasets, and finding large amounts of sample data with specific patch sizes (i.e., 512 × 512 or 1024 × 1024) for small areas is a major challenge. (2) The performance of statistical classification methods such as random forest or support vector machine classifiers depends on the setting of the input features, while the selection and extraction of informative manual features can be a time-consuming process. (3) Some studies have focused only on spectral features for BAM, although the efficiency of spatial features for BAM has been shown in many studies; furthermore, unsupervised thresholding of spectral indices is not always effective due to the complexity of the background and ecosystem characteristics, and to topographic effects on surface reflectance [43,44]. (4) Shallow feature representation methods have been shown not to be applicable in complex areas, especially for BAM tasks. (5) Many methods have focused on time series Sentinel-1 imagery; however, the preprocessing and processing of SAR imagery are difficult due to noise.
To overcome these problems, the present study presents a novel framework for BAM with heterogeneous datasets that has many advantages compared to other state-of-the-art methods. The proposed method is applied in an end-to-end manner, without additional processing, based on a deep morphological network. It is a change-detection method that uses pre/post-fire datasets through deep Siamese morphological operators. Additionally, this study takes advantage of a hyperspectral dataset and evaluates its efficiency in comparison with a multispectral dataset. The proposed framework is adaptive to the type of dataset, whereby the pre-event dataset is Sentinel-2 imagery while the post-event dataset can be either Sentinel-2 or hyperspectral PRISMA data.
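A schematic view of such a Siamese change-detection arrangement, with separate branches for a pre-event Sentinel-2 input and a post-event PRISMA input, might look like the following sketch; the branch design, band counts, and layer widths are illustrative assumptions and do not reproduce the exact DSMNN-Net layers.

```python
# Schematic Siamese change-detection skeleton for heterogeneous inputs:
# a pre-event Sentinel-2 branch and a post-event PRISMA branch are mapped
# to a shared feature space before fusion. Band counts (13 and 230) and
# layer widths are illustrative assumptions.
import torch
import torch.nn as nn

def branch(in_bands: int, feat: int = 64) -> nn.Module:
    """Per-sensor encoder mapping a patch to `feat` feature maps."""
    return nn.Sequential(
        nn.Conv2d(in_bands, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, feat, 3, padding=1), nn.ReLU(),
    )

class SiameseChangeNet(nn.Module):
    def __init__(self, pre_bands: int = 13, post_bands: int = 230):
        super().__init__()
        self.pre_branch = branch(pre_bands)
        self.post_branch = branch(post_bands)
        self.head = nn.Sequential(
            nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 2, 1),    # burned / unburned scores per pixel
        )

    def forward(self, pre: torch.Tensor, post: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.pre_branch(pre), self.post_branch(post)], dim=1)
        return self.head(fused)

# Pre-event Sentinel-2 patch (13 bands) and post-event PRISMA patch (230 bands).
out = SiameseChangeNet()(torch.randn(4, 13, 16, 16), torch.randn(4, 230, 16, 16))
```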
The main contributions of this study are as follows: (1) BAM is performed based on deep morphological layers for the first time; (2) the hyperspectral PRISMA dataset is exploited for accurate BAM for the first time; and (3) the performance of multispectral and hyperspectral datasets for BAM is evaluated and the results are compared with state-of-the-art methods.
This paper is organized as follows: Section 2 provides the details of the DSMNN-Net for BAM. Section 3 introduces the study areas and the datasets. The evaluation results for the study areas are provided in Section 4, and the experimental results are discussed in Section 5.
5. Discussion
This study focused on BAM with deep-learning methods using bi-temporal multispectral and hyperspectral imagery. BAM has mainly been performed using low-resolution satellite imagery (e.g., MODIS, VIIRS, and Sentinel-3). Although BAM based on these sensors has provided promising results, the mapping of small burned areas remains the most important challenge: these methods cover wide areas but do not provide suitable results for small burned areas. Furthermore, there are some global-scale burned area products based on MODIS satellite imagery. Many studies have evaluated the accuracy of BAM based on the MODIS collections, which has been reported as <80%, whereas for both study areas here, the DSMNN-Net provided an accuracy of >98% in terms of the OA index (Table 4 and Table 5).
Most BAM based on high-resolution imagery (e.g., Landsat-8, Sentinel-2) relies on the normalized burn ratio index. Although this index has provided promising results for BAM, burned areas depend strongly on environmental features and fire behavior, which makes it hard to discriminate burned areas from the background. This issue reduces the efficiency of BAM methods that rely on thresholding the normalized burn ratio indices. Furthermore, some global-scale high-resolution burned area products have been obtained on this basis; consequently, these products do not support accurate BAM in practical, real-world burned area estimation. Similarly, some unsupervised thresholding methods have been used, but due to the complexity of the background and noise conditions, the selection of suitable thresholds is another limitation of these methods.
Many BAM methods have been proposed based on machine-learning algorithms such as random forest, K-nearest neighbor, and support vector machine. While these methods have provided acceptable results for BAM, they rely on handcrafted feature extraction. Manual feature extraction and the subsequent selection of suitable features is a time-consuming process, which becomes critical when the study area is very large and the number of features is high. Additionally, these methods mainly focus on spectral features and ignore the potential of spatial features, although the value of spatial features has been shown in many studies on machine-learning-based BAM. Deep-learning-based methods can automatically extract deep features that combine spectral and spatial information. This study used such deep-feature extraction for BAM based on convolution and morphological layers.
To further demonstrate the effectiveness of the DSMNN-Net, we visualized the feature maps of different layers to inspect their internal operation and behavior. Figure 11 illustrates the feature maps extracted from some layers of the proposed DSMNN-Net for random pixels. The first layers show shallow features, while the middle layers focus on the general structure around the central pixel.
The efficiency of deep features and handcrafted features can be seen in Table 6. Based on these data, the DSMNN-Net provided greater accuracy compared to other state-of-the-art methods for BAM using the Sentinel-2 imagery.
Hyperspectral imagery has a much richer spectral content than multispectral imagery. This advantage helps in detecting burned areas against a highly complex background, and it is the main reason behind the robust results provided by the proposed method when using the hyperspectral dataset for BAM. Burned pixels can be highly similar to some unburned pixels; to clarify this point, we present spectral signatures of burned and unburned pixels in different areas. Figure 12 illustrates the similarities of the spectral signatures for the two main classes. Based on these data, burned and unburned pixels behave similarly in the 0.45 to 0.8 μm range, while in other spectral regions there are some differences in reflectance. Therefore, hyperspectral imagery, combined with the pre-event dataset, can be useful for BAM.
Additionally, a suitable deep-feature extraction framework is very important in deep-learning-based methods. Among the three deep-learning-based methods compared, the method proposed here provided the best performance in all scenarios and for both study areas. This result originates from the architecture used to extract deep features. The DSMNN-Net extracts deep features through its kernel convolution types and morphological layers. First, the DSMNN-Net uses multiscale 3D kernel convolutions that exploit the relations among the spectral bands during deep-feature extraction. Based on Figure 12, there are some differences among the spectral signatures within the same class, and these differences are greater for some spectral bands; furthermore, there is some overlap between the two classes in certain bands. Therefore, using 3D convolution can enhance the efficiency of the network, because it considers both the relations between spectral bands and the relations between the central pixel and its neighboring pixels. This is the most important factor in exploiting the full spectral information content for BAM. The morphological layers are then used to explore nonlinear characteristics of the input dataset during mapping. Thus, the DSMNN-Net can extract high-level and informative deep features based on the proposed architecture; as a result, accurate BAM is possible with the proposed method. Additionally, the diversity of objects and the complexity of unburned areas make detecting the changes relevant to BAM challenging. Solving this challenge usually requires increasing the depth of the network, which increases the number of parameters and the need for training data and time. Here, the DSMNN-Net instead uses morphological operations to handle the complexity of the background; these operators are nonlinear and increase the efficiency of the network for BAM.
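To illustrate the two building blocks discussed above, the sketch below pairs a 3D convolution over the spectral dimension with a learnable grey-scale dilation (max-plus) layer as one possible form of a morphological layer; the kernel sizes and the unfold-based implementation are assumptions for illustration, not the exact DSMNN-Net operators.

```python
# Illustrative building blocks: a 3D convolution over (bands, H, W) and a
# learnable grey-scale dilation (max-plus) layer as one possible
# "morphological layer". Kernel sizes and widths are assumed values.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Spectral3DBlock(nn.Module):
    """3D convolution over (bands, height, width) to exploit band-to-band relations."""
    def __init__(self, out_maps: int = 8):
        super().__init__()
        self.conv = nn.Conv3d(1, out_maps, kernel_size=(7, 3, 3), padding=(3, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, bands, H, W) -> add a unit channel axis for Conv3d
        y = torch.relu(self.conv(x.unsqueeze(1)))   # (batch, out_maps, bands, H, W)
        return y.flatten(1, 2)                      # fold feature maps and bands into channels

class Dilation2d(nn.Module):
    """Learnable grey-scale dilation (max-plus) over k x k neighbourhoods."""
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        self.k = kernel_size
        # One flat structuring element per channel, learned during training.
        self.weight = nn.Parameter(torch.zeros(channels, kernel_size * kernel_size))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Gather k x k neighbourhoods: (b, c, k*k, h*w)
        patches = F.unfold(x, self.k, padding=self.k // 2).view(b, c, self.k * self.k, h * w)
        # Max-plus: add the structuring element, then take the neighbourhood maximum.
        out = (patches + self.weight.view(1, c, -1, 1)).max(dim=2).values
        return out.view(b, c, h, w)

# Example: a 13-band 16x16 patch passes through the spectral block and a dilation layer.
x = torch.randn(4, 13, 16, 16)
feats = Spectral3DBlock(out_maps=8)(x)             # -> (4, 8 * 13, 16, 16)
dilated = Dilation2d(channels=feats.shape[1])(feats)
```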
Recently, some semantic-segmentation-based methods (e.g., based on the U-Net architecture) have been proposed. While these methods can provide considerably improved results for BAM, they require large amounts of labeled data of a specific patch size (i.e., 512 × 512 or 256 × 256). Obtaining a large amount of labeled data for such an application is very difficult and time consuming. Furthermore, these models are more complex due to their higher number of parameters and need more time for training. The DSMNN-Net used close to 45,000 labeled pixels for the BAM, and obtaining this amount of sample data is easy relative to the extent of the study areas.
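As a brief illustration of why pixel-level samples are easier to collect than full segmentation masks, the following sketch extracts small patches centred on individually labeled pixels; the patch size and array layout are assumed for illustration.

```python
# Sketch of patch-based sampling: small windows are cut around individually
# labelled pixels, so point labels suffice instead of full masks.
# The 9x9 patch size and (bands, H, W) layout are illustrative assumptions.
import numpy as np

def extract_patches(image: np.ndarray, pixels: list[tuple[int, int]], half: int = 4):
    """Return patches of shape (bands, 2*half+1, 2*half+1) centred on each labelled pixel."""
    padded = np.pad(image, ((0, 0), (half, half), (half, half)), mode='reflect')
    patches = [padded[:, r:r + 2 * half + 1, c:c + 2 * half + 1] for r, c in pixels]
    return np.stack(patches)

# Example: 3 labelled pixels in a 13-band scene -> samples of shape (3, 13, 9, 9).
cube = np.random.rand(13, 100, 100)
samples = extract_patches(cube, [(10, 20), (55, 60), (90, 5)])
```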
One of the most common challenges for BAM based on change-detection methods is the detection of nontarget changes. For example, the second study area contains some nontarget change areas whose changes originated from variations in the water level of the lakes. As a consequence, some methods labeled these areas as burned although they are unburned. This challenge is most evident in the BAM produced by the CNN method proposed in [42]. The sample data should cover more of the background areas, and the method proposed here controlled this issue in the BAM.
The method proposed here adapts to heterogeneous datasets in the mapping of burned areas. Pixel-based change-detection methods can easily be applied to bi-temporal multispectral/hyperspectral datasets, but they are difficult to apply to heterogeneous datasets. In other words, some methods (e.g., image differencing algorithms) compare the first and second images of a bi-temporal dataset pixel by pixel for BAM, which is very difficult for heterogeneous datasets due to differences in the number of spectral bands and the content of the datasets. The proposed DSMNN-Net can be applied to heterogeneous datasets without any additional processing (e.g., dimensionality reduction). This advantage will also help when BAM is applied in a near real-time manner. It is worth noting that the proposed DSMNN-Net was applied to a pre-event multispectral and a post-event hyperspectral dataset, while bi-temporal pre- and post-event hyperspectral datasets could further improve the BAM results.