**1. Introduction**

As a natural hazard, wildfires represent one of the most important reasons for the evolution of ecosystems in the Earth's system on a global scale [1–3]. Recently, the frequency of occurrence of wildfires has increased significantly due to climate change and human activities around the world [4,5]. Wildfires can be influenced by the environment from different aspects, such as soil erosion, increasing flood risk, and habitat degradation for wildlife [6,7]. Furthermore, wildfires generate a wide range of pollutants, including greenhouse gases (i.e., methane and carbon dioxide) [8].

Burned area mapping (BAM) can be useful to predict the behavior of a fire, to define the burning biomass, for compensation from insurance companies, and for estimation of greenhouse gases emitted [9,10]. As result, the generation of reliable and accurate burned area maps is necessary for their management and planning in the support of decision

**Citation:** Seydi, S.T.; Hasanlou, M.; Chanussot, J. DSMNN-Net: A Deep Siamese Morphological Neural Network Model for Burned Area Mapping Using Multispectral Sentinel-2 and Hyperspectral PRISMA Images. *Remote Sens.* **2021**, *13*, 5138. https://doi.org/10.3390/ rs13245138

Academic Editors: Saverio Romeo and Paolo Mazzanti

Received: 14 November 2021 Accepted: 15 December 2021 Published: 17 December 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

making. BAM by traditional methods (e.g., field surveys) is a major challenge, and these methods have some limitations, such as the wide areas to be covered and the lack of direct access to the region of interest, which leads to large time and financial costs [10].

The Earth observation satellite fleet has steadily grown over the last few decades [11]. The diversity of Earth observation datasets means that remote sensing (RS) is now known as a key tool in the provision of valuable information about the Earth that is available at low cost and time needs on a global scale [12]. Currently, the upcoming new series of RS sensors (e.g., Landsat-9, *PRecursore IperSpettrale della Missione Applicativa* (PRISMA), Sentinel-5) provides improvements in terms of spatial, temporal, and spectral detail, with RS now becoming a routine tool with an extensive range of applications [13,14]. The most common applications of RS include classification [15,16] and detection of targets [17,18] and changes [19,20].

The diversity of RS Earth observation imagery and its free availability has meant that monitoring of changes following disasters has turned into a hot topic for research [21–28]. Indeed, we are witnessing many BAM products on a global scale that differ in terms of spatial resolution and reliability of the burned areas mapped. Based on spatial resolution, the recent BAM methods can be categorized into two main groups: (1) coarse spatial resolution satellite sensors and (2) fine spatial resolution sensors.

Burned area mapping based on the low and medium resolution of satellite imagery is common in the RS community. In recent years, many studies have used BAM based on Moderate Resolution Imaging Spectroradiometer (MODIS), Sentinel-3, Medium Resolution Imaging Spectrometer (MERIS), and Visible/Infrared Imager Radiometer Suite (VIIRS) [29–31]. However, while these sensors have a high temporal resolution, they suffer from low spatial resolution. Accurate BAM for small areas is a major challenge due to the mixing of pixels. Furthermore, the complex diversity of scenes can result in spectrum gains in one burned pixel to be mixed with some other material. Furthermore, these are based on ruleset classification and manual feature extraction such that the extraction of suitable features and the finding of optimum threshold values are time consuming.

Recently, with the arrival of a new series of cloud computing platforms (e.g., Google Earth Engine, Microsoft Azure), BAM using fine-resolution datasets has been considered by researchers. The capacity of cloud computing platforms has created a great opportunity for BAM based on high-resolution datasets and advanced machine-learning-based methods for accurate mapping. Based on the structure of the algorithm, we can categorize these methods into two main categories: (1) BAM by conventional machine-learning methods and (2) BAM via deep-learning-based frameworks.

Burned area mapping based on conventional machine-learning-based methods can be used to extract spectral and spatial features, and then to define the burned areas according to a classifier [10]. For instance, Donezar, et al. [32] designed a BAM framework based on the multitemporal change-detection method and time series synthetic aperture radar (SAR) imagery. They used an object-based image analysis method for classification of the SAR imagery. They also used the Shuttle Radar Topography Mission (SRTM) for digital elevation models to enhance their BAM results. Additionally, Xulu, et al. [33] considered a BAM method based on differenced normalized burned ratios and Sentinel-2 imagery in the cloud-based Google Earth engine. A random forest classifier method was used for the BAM. They reported an overall accuracy close to 97% for detection of burned areas. Moreover, Seydi, Akhoondzadeh, Amani and Mahdavi [10] evaluated the performance of a statistical machine-learning method for BAM using the Google Earth Engine and pre/post-fire Sentinel-2 imagery. Furthermore, they evaluated the potential spectral and spatial texture features using a Harris hawks optimization algorithm for the BAM. They reported an accuracy of 92% by the random forest classifier on the validation dataset. Liu, et al. [34] proposed a new index for BAM for bi-temporal Landsat-8 imagery and an automatic thresholding method. They evaluated the efficiency of their proposed method in different areas. Their BAM results showed that their presented method had high efficiency.

Recently, deep-learning-based approaches have been applied increasingly for mapping RS imagery, with promising results obtained. These methods can extract high-level features from the raw data automatically, by convolution layers. This advantage of deeplearning-based methods has resulted in their use for BAM. BAM based on deep-learningbased methods has become a hot topic of research, with many methods being proposed. For instance, Nolde, et al. [35] designed large-scale burned area monitoring in near-realtime based on the morphological active contour approach. This framework was applied through several steps: (1) generation of a normalized difference vegetation index (NDVI) for pre/post-fire; (2) determination of the region of interest based on active fires, anomaly detection, and region-growing methodologies; (3) accurate shape of the burned area perimeter extraction based on morphological snakes; (4) confidence evaluation based on a burn area index; and (5) tracking. The result was accuracy of 76% by evaluation with reference data. Knopp, et al. [36] carried out BAM by deep-learning-based semantic segmentation based on mono-temporal Sentinel-2 imagery. They used the U-Net architecture for their BAM. A binary change map was obtained based on the thresholding of the U-Net probability map. They reported the difficulty of the segmentation model in some areas, such as agriculture fields, rocky coastlines, and lake shores. de Bem, et al. [37] investigated the effects of patch sizes on the result of burned area classification with deep-learning methods using Landsat-8 OLI. Here, three different deep-learning methods were investigated: simple convolutional neural network (CNN), U-Net, and Res-U-Net. Their results showed that Res-U-Net had high efficiency, with a patch-size of 256 × 266. Hu, Ban and Nascetti [26] evaluated the potential of deep learning methods for BAM based on the unitemporal multispectral Sentinel-2 and Landsat-8 datasets. Their study showed that deep-learning methods have a high potential for BAM in comparison to machine-learning methods. Ban, et al. [38] experimented with the capacity of time series SAR imagery for BAM by a deep-learning method. To this end, their deep-learning framework was based on CNN and was developed to automatically detect burned areas by investigating backscatter variations in the time series of Sentinel-1 SAR imagery. They reported accuracy of <95% for BAM. Zhang, et al. [39] proposed a deep-learning framework for mapping burned areas based on fusion Sentinel-1 and Sentinel-2 imagery. Furthermore, they investigated two scenarios for training the deep-learning method: (1) continuous joint training with all historical data and (2) learning-without-forgetting based on new incoming data alone. They reported that the second scenario for BAM showed accuracy close to 90%, in terms of overall accuracy. Zhang, Ban and Nascetti [39] presented a deep-learning-based BAM framework by fusion of optical and radar datasets. They proposed a deep-learning framework based on CNN, with two convolution layers, max-pooling, and two fully connected layers. They showed an increase in the complexity of the network that resulted in rising computing needs, while the results for the burned area detection were not enhanced. Farasin, et al. [40] presented an automatic framework for evaluation of the damage severity level based on a supervised deep-learning method and post-fire Sentinel-2 satellite imagery. They used double-step U-Net architecture for two tasks (classification and regression). The classification generated binary damage maps and the regression was used to generate damage severity levels. Lestari, et al. [41] increased the efficiency of statistical machine-learning methods and a CNN classifier for BAM using optical and SAR imagery. Their BAM results showed that the CNN method has high efficiency in comparison with other machine-learning methods with texture features. Furthermore, the fusion of optical and SAR imagery can enhance the results of BAM. Belenguer-Plomer, et al. [42] developed a CNN-based BAM method by combining active and passive datasets. Sentinel-1 and Sentinel-2 were used by the CNN algorithm to generate burned areas. Their proposed CNN architecture included two convolution layers, a max-pooling layer, and two fully connected layers. The results of BAM have shown that combining Sentinel-1 and Sentinel-2 imagery can provide improvements.

Although many research efforts have proposed several algorithms for BAM and applied them to fine-resolution optical and SAR RS imagery, many limitations remain: (1) Semantic segmentation based methods (e.g., U-Net DeeplabV3+ and Seg-Net) have

provided promising results, but they need large numbers of labeled datasets, and finding large amounts of sample data with specific sizes (i.e., 512 × 512 or 1024 × 1024) for small areas is a major challenge. (2) The performance of statistical classification methods such as random forest or support vector machine classifiers depend on the setting of the input features, while the selection and extraction of the informative manual features can be a time-consuming process. (3) Some studies have focused on only spectral features for BAM, while the efficiency of spatial features in BAM has been shown in many studies; furthermore, an unsupervised thresholding manner on the spectral index is not always effective due to the complexity of the background and ecosystem characteristics, and to the topographic effects on the surface reflectance [43,44]. (4) Shallow feature representation methods have been shown not to be applicable in complex areas, especially for BAM tasks. (5) More methods have focused on time series Sentinel-1 imagery; however, preprocessing and processing of SAR imagery is very difficult due to noise conditions.

To overcome these problems, the present study presents a novel framework for BAM with heterogeneous datasets that has many advantages compared to other state-of-theart methods. The method proposed here is applied in an end-to-end manner without additional processes, based on a deep morphological network. This method is based on change detection that uses pre/post-fire datasets based on deep Siamese morphological operators. Additionally, the efficiency of the hyperspectral dataset in comparison with the multispectral dataset shows that this study takes advantage of the hyperspectral dataset. The proposed framework is additive with the type of datasets, whereby the pre-event dataset is Sentinel-2 imagery while the post-event dataset can be either Sentinel-2 or hyperspectral PRISMA datasets.

The main contributions of this study follow: (1) BAM is based on deep morphological layers for the first time; (2) it takes advantage of the hyperspectral PRISMA sensor dataset for accurate BAM for the first time; and (3) it includes evaluation of the performance of the multispectral and hyperspectral dataset in BAM and comparison of the results with state-of-the-art methods.

This paper is outlined as follows: Section 2 provides the details of the DSMNN-Net for BAM. Section 3 introduces the study areas and the datasets. The evaluation results of this study area are provided in Section 4, and the experimentation results are discussed in Section 5.

#### **2. Methodology**

The proposed framework is conducted in three steps, according to the flowchart in Figure 1. The first step is image preparation, and in this step, some preprocessing (i.e., registration) is applied. The second step is the training of the proposed network to tune the network parameters based on reference sample data. The training and validation datasets are exploited in the training process to optimize the model parameters, while the testing dataset is used to evaluate model hyperparameters. The third step is burned area map generation and accuracy assessment of the result of the BAM.

### *2.1. Proposed Deep Learning Architecture*

The proposed DSMNN-Net architecture for the detection of burned areas is illustrated in Figure 2. Accordingly, the framework has two parts: (1) two streams of deep-featureextraction models and (2) classification. The deep-feature-extraction task is conducted in a double-stream manner, such that these streams are for post-event and pre-event datasets, respectively. Then, the deep features are transformed to the next task, which is the classification. The classification task included two fully connected layers and a softmax layer for making a decision. More details of the DSMNN-Net are explained in the next subsection.
