### *3.2. Sentinel-2 Images*

Sentinel-2 is a European Space Agency Earth observation mission that provides continuity to services that depend on multispectral, high-spatial-resolution observations over the entire land surface of the Earth. The mission consists of two satellites, Sentinel-2A and Sentinel-2B, which complement the existing Landsat and SPOT missions and have enhanced data availability for the remote sensing community. A single satellite has a temporal resolution of 10 days, while the two satellites together provide a revisit time of 5 days [10]. The main Sentinel-2 sensor, the multispectral instrument, is based on the push-broom principle and acquires 13 spectral bands with broad spectral coverage.

This study used the Level-2A product, which provides surface reflectance data, as the input to the BAM. Furthermore, the Sentinel-2 dataset had to be resampled to the spatial resolution of the hyperspectral dataset (30 m), as sketched below.
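
A minimal sketch of this resampling step is given below, assuming a single GeoTIFF band stack and average resampling with rasterio; the file names and the choice of resampling method are illustrative assumptions, as the exact procedure is not specified here.

```python
# Sketch: resample a Sentinel-2 Level-2A band stack to the 30 m PRISMA grid.
import rasterio
from rasterio.enums import Resampling

TARGET_RES = 30.0  # m, PRISMA ground sampling distance

with rasterio.open("S2_L2A_stack.tif") as src:            # hypothetical input file
    scale = src.res[0] / TARGET_RES                        # e.g. 10 m -> 30 m gives 1/3
    out_h = int(src.height * scale)
    out_w = int(src.width * scale)
    # Average resampling aggregates the finer Sentinel-2 pixels into 30 m cells.
    data = src.read(
        out_shape=(src.count, out_h, out_w),
        resampling=Resampling.average,
    )
    transform = src.transform * src.transform.scale(
        src.width / out_w, src.height / out_h
    )
    profile = src.profile
    profile.update(height=out_h, width=out_w, transform=transform)

with rasterio.open("S2_L2A_30m.tif", "w", **profile) as dst:  # hypothetical output
    dst.write(data)
```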

### *3.3. PRISMA Images*

PRISMA is a medium-resolution hyperspectral imaging mission of the Italian Space Agency that was launched in March 2019 [74]. The PRISMA sensor is a spaceborne system that acquires hyperspectral data continuously, with a repeat orbital cycle of approximately 29 days [75]. PRISMA images the Earth's surface in 240 contiguous spectral bands (66 visible to near-infrared, plus 174 short-wave infrared) in a push-broom scanning mode, covering wavelengths between 400 and 2500 nm at a spatial resolution of 30 m. The high-dimensional spectral bands make it possible to analyze complex land-cover objects [14]. We chose the Level-2D product for the BAM, which is already preprocessed (i.e., atmospheric correction, geolocation, and orthorectification) [14]. The PRISMA hyperspectral dataset is freely available at http://prisma.asi.it/missionselect/, accessed on 16 November 2021. After removing the noisy and no-data bands, 169 spectral bands were retained for the subsequent analysis, as in the sketch below.
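
The band-screening step might look like the following minimal sketch; the rejection criterion (all no-data, or a coefficient of variation above a threshold) is a stand-in assumption for illustration, not the exact procedure used in this study.

```python
# Sketch: discard no-data and noisy PRISMA bands before analysis.
import numpy as np

def drop_bad_bands(cube, nodata=0, noise_cv=5.0):
    """cube: (bands, rows, cols) reflectance array.
    Removes bands that are entirely no-data or whose coefficient of
    variation exceeds a noise threshold (an assumed stand-in criterion)."""
    keep = []
    for b in range(cube.shape[0]):
        band = cube[b].astype(np.float64)
        valid = band[band != nodata]
        if valid.size == 0:
            continue                                  # all no-data: drop
        cv = valid.std() / (abs(valid.mean()) + 1e-9)
        if cv < noise_cv:
            keep.append(b)                            # keep well-behaved bands
    return cube[keep], keep

# clean_cube, kept = drop_bad_bands(prisma_cube)      # e.g. 240 -> 169 bands kept
```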

### **4. Experiments and Results**

Because a supervised learning method is used, sample data must be gathered to estimate the burned area. The quality and quantity of the sample data play a key role in BAM. In this study, the number of samples was kept at a sufficient level for the two classes (i.e., burned areas and unburned areas). Figure 6 shows the spatial distribution of the sample data.


**Figure 6.** The spatial distribution of the sample data in the two study areas for the burned area mapping. (**a**) Sample data for the first study area. (**b**) Sample data for the second study area.

In addition, Table 2 reports the number of samples for the two classes in each study area.

**Table 2.** The number of samples used for mapping of burned area in the two study areas.


#### *4.1. Parameter Setting*

The DSMNN-Net has hyperparameters that need to be set. These hyperparameters were set manually by trial and error, and their optimum values were as follows: the input patch sizes for the Sentinel-2 and PRISMA sensors were 11 × 11 × 13 and 11 × 11 × 169, respectively; the number of epochs was 500; the weights of the convolution layers were initialized with the He-normal initializer [76], while the morphological layers were initialized with random values; the number of neurons in the fully connected layer was 900; the initial learning rate was 10^−4; and the minibatch size was 550. It is worth noting that all of the hyperparameters were kept constant during the process for all of the CNN methods, and the two other methods were set to the same values. Additionally, the selection of some of these parameters was constrained by the hardware (e.g., increasing the minibatch size quickly filled the RAM of the system). Moreover, initializing the weights with the He-normal initializer increased the speed of network convergence compared to random initialization.
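
A minimal PyTorch sketch of this configuration is given below; the optimizer choice (Adam) and the commented usage lines are assumptions, and the DSMNN-Net architecture itself is not reproduced.

```python
# Sketch: training configuration corresponding to Section 4.1.
import torch
import torch.nn as nn

PATCH_S2      = (11, 11, 13)    # Sentinel-2 input patch (rows, cols, bands)
PATCH_PRISMA  = (11, 11, 169)   # PRISMA input patch
EPOCHS        = 500
BATCH_SIZE    = 550
LEARNING_RATE = 1e-4
FC_UNITS      = 900             # neurons in the fully connected layer

def he_normal_conv_init(module):
    """He-normal (Kaiming) initialization for convolution layers [76];
    other layers (e.g., the morphological ones) keep random initialization."""
    if isinstance(module, (nn.Conv2d, nn.Conv3d)):
        nn.init.kaiming_normal_(module.weight, nonlinearity="relu")
        if module.bias is not None:
            nn.init.zeros_(module.bias)

# Hypothetical usage (network definition not shown; Adam is an assumed choice):
# model = DSMNNNet(...)
# model.apply(he_normal_conv_init)
# optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)
```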

### *4.2. Results*

The results of the BAM for the two study areas are considered in this section. They were investigated according to the two main scenarios listed in Table 3.



#### 4.2.1. First Study Area

Figure 7 shows the results of the BAM based on the post/pre-event Sentinel-2 imagery. Based on these results, the DSMNN-Net differs from the other BAM methods. Most methods detected the burned areas, with differences seen in the detail. For example, there are some missed detection areas in the results of the two CNN-based methods (center of the scene), while the method proposed here detected these areas well.

Figure 8 shows the mapping results for the heterogeneous dataset provided by the various methods. As shown in Figure 8, all of the methods performed better than in the first scenario (S#1), as the small missed detection areas were significantly reduced. The results of the DSMNN-Net fit the ground truth better, while the results of the other methods contain many false pixels, in particular for the Siamese network (Figure 8a). The main differences among these results are evident at the edges of the burned areas.


**Figure 7.** Visual comparisons of the results for the burned area mapping based on the post/pre-event Sentinel-2 imagery for the first study area. (**a**) Deep Siamese network. (**b**) Using the CNN method proposed in [42]. (**c**) Using the method proposed in the present study. (**d**) Ground truth map.

The numerical results of the BAM for the first study area are given in Table 4. Based on these data, the accuracies of all of the methods in both scenarios were >87% in all terms. The accuracy of the algorithms when combining the hyperspectral dataset with the multispectral dataset was significantly better than with the multispectral dataset alone: the accuracy of the BAM results based on the fusion of the PRISMA and Sentinel-2 imagery was >94% in terms of the OA index. The results of the BAM based on only the Sentinel-2 imagery were very close to each other, but they differed considerably in the second scenario (S#2). The method proposed here provided an accuracy of >97% in terms of the OA, Recall, and F1-score indices. Furthermore, it provided the highest score for the KC index in the second scenario.


**Figure 8.** Visual comparisons of the results for the burned area mapping based on pre-event Sentinel-2 and post-event PRISMA imagery for the first study area. (**a**) Deep Siamese network. (**b**) Using the CNN method proposed in [42]. (**c**) Using the DSMNN-Net. (**d**) Ground truth map.

**Table 4.** Accuracy assessment for the burned area mapping for the first study area. S#1, pre/post-event Sentinel-2 imagery; S#2, pre-event Sentinel-2 imagery and post-event PRISMA imagery.


OA, overall accuracy; IOU, intersection over union; KC, kappa coefficient.
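
For reference, the indices reported in Tables 4 and 5 can be computed from a binary (burned/unburned) confusion matrix as in the following minimal sketch.

```python
# Sketch: accuracy indices for burned area mapping from a binary confusion matrix.
import numpy as np

def bam_metrics(y_true, y_pred):
    """y_true, y_pred: flat arrays of 0 (unburned) / 1 (burned)."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    n = tp + tn + fp + fn

    oa = (tp + tn) / n                          # overall accuracy
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    iou = tp / (tp + fp + fn)                   # intersection over union
    # Kappa coefficient: agreement beyond chance.
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kc = (oa - pe) / (1 - pe)
    return {"OA": oa, "Recall": recall, "F1": f1, "IOU": iou, "KC": kc}
```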

#### 4.2.2. Second Study Area

Figure 9 illustrates the results of the BAM based on the bi-temporal multispectral (Sentinel-2) dataset for the second study area. Based on these results, there are some differences among the algorithms in the details. Figure 9a shows the performance of the deep Siamese network: it produces few false pixels, although many missed detection pixels can be seen in its result. In contrast, the fewest missed pixels are seen in the BAM by the method proposed by [42] in Figure 9b, although it shows many false pixels. The result of the BAM by the DSMNN-Net is shown in Figure 9c, which combines low numbers of false pixels and missed pixels in the mapping.

The results of the BAM based on the Sentinel-2 and PRISMA sensors for the second study area are presented in Figure 10. Compared with the results for the multispectral dataset alone, there are some improvements in the details of the mapping. These improvements are more evident in the results of the DSMNN-Net, as some false pixels were classified correctly.

**Figure 9.** Visual comparisons of the results for the burned area mapping based on the post/pre-event Sentinel-2 imagery for the second study area. (**a**) Deep Siamese network. (**b**) Using the CNN method proposed in [42]. (**c**) Using the method proposed in the present study. (**d**) Ground truth map.

**Figure 10.** Visual comparisons of the results for the burned area mapping based on the pre-event Sentinel-2 and post-event PRISMA imagery for the second study area. (**a**) Deep Siamese network. (**b**) Using the CNN method proposed in [42]. (**c**) Using the DSMNN-Net. (**d**) Ground truth map.

Table 5 presents the quantitative performance comparison of the methods for the second study area for the BAM. Based on these data, there are some enhancements in the results of the BAM in all terms for this study area. The enhancement is more evident for the Recall, F1-score, IOU, and KC indices. For example, the difference between the Recall of the Siamese network in the first and second scenarios is >20%, while for the other indices this improvement is >3%. The improvement of the BAM by the CNN method proposed by [42] was slight. The DSMNN-Net shows improvements that are more evident for the KC, IOU, F1-score, and Recall indices; these indices show greater improvement for the method proposed here in the detection of burned pixels, while there is some improvement in the detection of nonburned pixels.



OA, overall accuracy; IOU, intersection over union; KC, kappa coefficient.

#### **5. Discussion**

This study focused on BAM using deep-learning methods based on bi-temporal multispectral and hyperspectral imagery. BAM has mainly been applied using low-resolution satellite imagery (e.g., MODIS, VIIRS, and Sentinel-3). Although BAM based on these sensors has provided promising results, the mapping of small burned areas remains the most important challenge: these methods suit large coverage areas but do not provide suitable results for small areas. Furthermore, there are some global-scale burned area products based on MODIS satellite imagery. Many studies have evaluated the accuracy of BAM based on the MODIS collections, which has been reported to be <80%, while for both study areas here, the DSMNN-Net provided an accuracy of >98% in terms of the OA index (Tables 4 and 5).

Most BAM based on high-resolution imagery (e.g., Landsat-8, Sentinel-2) relies mainly on the normalized burn ratio (NBR) index. Although this index has provided some promising results for BAM, the dependence of burned areas on environmental features and fire behavior makes it hard to discriminate burned areas from the background. This issue reduces the efficiency of such BAM methods, because the NBR indices must be thresholded. Furthermore, some global-scale, high-resolution burned area products have been obtained on this basis; these products therefore do not support accurate BAM in practical, real-world burned area estimation. Similarly, some unsupervised thresholding methods have been used, but due to the complexity of the background and noise conditions, the selection of suitable thresholds is another limitation of these methods.
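
For context, a minimal sketch of the NBR thresholding approach discussed above is given below; the Sentinel-2 band choice (B8 as NIR, B12 as SWIR) is standard, but the threshold value is an illustrative assumption rather than a recommendation of this study.

```python
# Sketch: differenced normalized burn ratio (dNBR) thresholding for BAM.
import numpy as np

def nbr(nir, swir):
    """Normalized burn ratio from NIR and SWIR reflectance arrays."""
    return (nir - swir) / (nir + swir + 1e-9)

def dnbr_burn_mask(nir_pre, swir_pre, nir_post, swir_post, threshold=0.27):
    """Pixels whose pre-to-post NBR drop exceeds the (assumed) threshold
    are flagged as burned."""
    dnbr = nbr(nir_pre, swir_pre) - nbr(nir_post, swir_post)
    return dnbr > threshold
```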

Many BAM methods have been proposed based on machine-learning methods, such as random forest, K-nearest neighbor, and support vector machine. While these methods have provided acceptable results for BAM, they rely on handcrafted feature extraction. Manual feature extraction and the subsequent selection of suitable features is a time-consuming process, which needs to be considered when the study area is very large and the number of features is high. Additionally, these methods mainly focus on spectral features and ignore the potential of spatial features, whose value has been shown in many studies of BAM based on machine-learning methods. Deep-learning-based methods can automatically extract deep features that combine spectral and spatial information. This study used this deep-feature extraction approach for BAM based on convolution and morphological layers.
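
As an illustration of such a handcrafted, per-pixel baseline, a minimal random forest sketch is given below; the feature design (stacked pre/post-event reflectances) and classifier settings are illustrative assumptions, not those of the compared studies.

```python
# Sketch: per-pixel machine-learning baseline with handcrafted spectral features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def rf_baseline(X_train, y_train, X_test):
    """X_*: (n_pixels, n_features) stacked pre/post-event reflectances
    (handcrafted spectral features only, no spatial context);
    y_train: 0 = unburned, 1 = burned."""
    clf = RandomForestClassifier(n_estimators=200, n_jobs=-1, random_state=0)
    clf.fit(X_train, y_train)
    return clf.predict(X_test)
```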

To further show the effectiveness of the DSMNN-Net, we visualized the feature maps in the different layers to look inside their internal operation and behavior. Figure 11 illustrates the feature maps extracted from some layers of the proposed DSMNN-Net for random pixels. The first layers show shallow features, while the middle layers focus on the general structure around the central pixel.

**Figure 11.** Visualization of feature maps in the DSMNN-Net for a random pixel.

The efficiency of the deep features compared with handcrafted features can be seen in Table 6. Based on these data, the DSMNN-Net provided greater accuracy than other state-of-the-art methods for BAM using the Sentinel-2 imagery.


**Table 6.** Comparison of performance of the DSMNN-Net with other burned area mapping methods.

OA, overall accuracy.

Hyperspectral imagery has a high content of spectral information compared with multispectral imagery. This advantage helps to detect burned areas against a highly complex background; thus, the main reason behind the robust results provided by the method proposed here is the use of the hyperspectral dataset for the BAM. The burned pixels are highly similar to some unburned pixels, and to clarify this point, we present some spectral signatures of burned and unburned pixels in different areas. Figure 12 illustrates the similarities of the spectral signatures of the two main classes. Based on these data, the burned and unburned pixels behave similarly in the 0.45 to 0.8 μm range, while in other parts of the spectrum there are some differences in reflectance. Therefore, hyperspectral imagery and the combination with pre-event datasets can be useful for BAM.

Additionally, a suitable deep-feature extraction framework is very important in deep-learning-based methods. Among the three deep-learning-based methods, the method proposed here provided the best performance in all of the scenarios and for both study areas. This originates from the architecture of the deep-learning methods in the extraction of deep features. The DSMNN-Net extracts the deep features based on the type of kernel convolution and the morphological layers. First, the DSMNN-Net uses multiscale 3D kernel convolutions that investigate the relations among the spectral bands during deep-feature extraction. Based on Figure 12, there are some differences among the spectral signatures of the same classes, and these differences are greater for some spectral bands; furthermore, there is some overlap between the two classes in certain spectral bands. Therefore, using 3D convolution can enhance the efficiency of the network, because it considers both the relations between the spectral bands and the relations between the central pixel and its neighboring pixels. This advantage is the most important factor in exploiting the full spectral information content for the BAM. Then, the morphological layers are used to explore nonlinear characteristics of the input dataset in the mapping. Thus, the DSMNN-Net can extract high-level and informative deep features based on the proposed architecture; as a result, accurate BAM is possible with the proposed method. Additionally, the diversity of objects and the complexity of unburned areas make the BAM challenging. Solving this challenge mainly requires increasing the depth of the network, which increases the number of parameters and the need for more training data and time. Here, the DSMNN-Net instead uses morphological operations to investigate the complexity of the background; the morphological operators apply nonlinear operations that increase the efficiency of the network for the BAM. A sketch of these two building blocks is given below.
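
The following minimal PyTorch sketch shows these two building blocks, a multiscale 3D convolution block and a learnable morphological dilation layer; it is not the authors' implementation, and the layer widths and kernel sizes are illustrative assumptions.

```python
# Sketch: multiscale 3D convolution block and learnable morphological dilation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleConv3D(nn.Module):
    """Parallel 3D convolutions with different kernel sizes, concatenated,
    so spectral-spatial relations are captured at several scales."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv3d(in_ch, out_ch, kernel_size=k, padding=k // 2)
            for k in (1, 3, 5)                      # illustrative kernel sizes
        ])

    def forward(self, x):                           # x: (B, C, bands, H, W)
        return torch.cat([F.relu(b(x)) for b in self.branches], dim=1)

class MorphDilation2D(nn.Module):
    """Learnable grayscale dilation: max over a window of (input + weight),
    a nonlinear operation, in contrast to the linear convolution."""
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        self.k = kernel_size
        # Structuring element initialized randomly / at zero per channel.
        self.weight = nn.Parameter(torch.zeros(channels, kernel_size, kernel_size))

    def forward(self, x):                           # x: (B, C, H, W)
        B, C, H, W = x.shape
        pad = self.k // 2
        patches = F.unfold(x, self.k, padding=pad)          # (B, C*k*k, H*W)
        patches = patches.view(B, C, self.k * self.k, H * W)
        w = self.weight.view(1, C, self.k * self.k, 1)
        return (patches + w).max(dim=2).values.view(B, C, H, W)
```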

**Figure 12.** Spectral signatures among the burned and unburned pixels.

Recently, some semantic-segmentation-based methods (U-Net architectures) have been proposed. While these methods can provide considerably improved results for BAM, they require large amounts of labeled data of a specific size (i.e., 512 × 512 or 256 × 256). Obtaining such a large labeled dataset for this application is very difficult and time consuming. Furthermore, these models are more complex due to the higher number of parameters and the longer time needed to train the network. The DSMNN-Net used close to 45,000 pixels for the BAM, and obtaining this amount of sample data is easy given the extent of the study areas; the patch-based sampling is sketched below.
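
A minimal sketch of this patch-based sampling (an 11 × 11 window of all bands around each labeled pixel) is given below; the data layout and padding choice are assumptions for illustration.

```python
# Sketch: extract fixed-size patches around labeled pixels for training.
import numpy as np

def extract_patches(cube, labels, patch=11):
    """cube: (bands, rows, cols) image; labels: dict {(row, col): 0 or 1}.
    Returns patches of shape (n, patch, patch, bands) and their labels."""
    half = patch // 2
    # Reflect-pad so patches at the image border are still well defined.
    padded = np.pad(cube, ((0, 0), (half, half), (half, half)), mode="reflect")
    X, y = [], []
    for (r, c), lab in labels.items():
        window = padded[:, r:r + patch, c:c + patch]     # offsets absorb padding
        X.append(np.transpose(window, (1, 2, 0)))        # -> (patch, patch, bands)
        y.append(lab)
    return np.asarray(X), np.asarray(y)
```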

One of the most common challenges for BAM based on change detection methods is the detection of nontarget changes. For example, the second study area contains some nontarget change areas whose changes originate from changes in the water level of lakes. The methods therefore considered these as burned areas, while they are nonburned areas. This challenge is more evident in the BAM by the CNN method proposed in [42]; the sample data should cover more of the background areas, although the method proposed here controlled this issue in the BAM.

The method proposed here adapts to heterogeneous datasets in the mapping of burned areas. Pixel-based change detection methods can easily be applied to bi-temporal multispectral/hyperspectral datasets, but they are difficult to apply to a heterogeneous dataset. In other words, some methods (e.g., the image differencing algorithm) compare the two acquisition times of the bi-temporal dataset pixel by pixel for BAM, which is very difficult for heterogeneous datasets due to the differences in the number of spectral bands and in the content of the datasets. The proposed DSMNN-Net can be applied to heterogeneous datasets without any additional processing (e.g., dimensionality reduction). These advantages will also help when BAM is applied in a near-real-time manner. It is worth noting that the proposed DSMNN-Net was applied to pre-event multispectral and post-event hyperspectral datasets, while bi-temporal pre-event and post-event hyperspectral datasets could further improve the results of the BAM.

#### **6. Conclusions**

Accurate and timely BAM is the most important factor in wildfire damage assessment and management. In this study, a novel framework based on a deep-learning method (the DSMNN-Net) and the use of bi-temporal multispectral and hyperspectral datasets was proposed. We evaluated the performance of the DSMNN-Net for two study areas in two scenarios: (1) BAM based on bi-temporal Sentinel-2 datasets, and (2) BAM based on pre-event Sentinel-2 and post-event PRISMA datasets. Furthermore, the results of the BAM were compared with other state-of-the-art methods, both visually and numerically. The results show that the method proposed here is highly efficient in comparison with the other BAM methods. Additionally, the use of hyperspectral datasets can improve the performance of BAM based on deep-learning methods. The experimental results of this study illustrate that the DSMNN-Net has several advantages: (1) it provides high accuracy for BAM; (2) it has a high sensitivity for BAM in areas with complex backgrounds; (3) it adapts to heterogeneous datasets for BAM (multispectral and hyperspectral); and (4) it can be applied in an end-to-end framework without any additional processing.

**Author Contributions:** Conceptualization, S.T.S. and M.H.; methodology, S.T.S.; writing—original draft preparation, S.T.S.; writing—review and editing, S.T.S., M.H. and J.C.; visualization, S.T.S.; supervision, M.H. and J.C.; funding acquisition, J.C. All authors have read and agreed to the published version of the manuscript.

**Funding:** This study received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Publicly available datasets were analyzed in this study. These datasets can be found at https://scihub.copernicus.eu/, accessed on 16 November 2021, and http://prisma.asi.it/missionselect/, accessed on 16 November 2021.

**Acknowledgments:** The authors would like to thank the European Space Agency and the Italian Space Agency for providing the datasets. We thank the anonymous reviewers for their valuable comments on our manuscript. This research was partially supported by the AXA fund for research and by MIAI @ Grenoble Alpes, (ANR-19-P3IA-0003).

**Conflicts of Interest:** The authors declare that they have no conflict of interest.

#### **References**

