#### *4.2. Validation Product*

The reference mapping generated by ICNF proves to be quite efficient for producing geospatial data, providing an open-access database rich in accurate information on burned areas throughout the national territory. Here we list some advantages and limitations of the product, based on a visual comparison after the classification process. First, we emphasize the thoroughness with which the ICNF product delimits the edges of the burned area, given the complex landscape in which the study area lies. This was evident for both classifiers applied to the MODIS data, which, as expected, showed a high frequency of omission and commission pixels at the edges of the burned area and in urban areas, as shown in Figure 7. However, according to Mouillot et al. [45], OE and CE found at the edges of burned areas cannot be strictly regarded as omissions or false alarms, and a given level of CE and OE is acceptable as long as both are similar. The spatial resolution of the validation product provided by ICNF also had a decisive influence on the errors shown in Figure 7: because this product is based on the 10 m resolution bands of MSI [129], the lowest error estimates were obtained for the ASTER sensor (15 m), whose level of spatial detail is close to that of the reference.

Even without the blue and SWIR bands, ASTER showed the highest accuracy parameters with both classifiers, whereas the good results found for OLI and MSI can be attributed to the use of these bands. The good performance of ASTER is therefore attributed to its 15 m GSD, although the proximity of its acquisition date to the date of the reference mapping must also be taken into account.

#### *4.3. kNN and RF Classifiers*

The RF algorithm presented the highest classification quality values among all of the sensors and greater stability with respect to changes in the data when assigning the burned and unburned classes. This result is consistent with the low complexity of its application and its low cost in time and memory. Despite the variations found in the OOB error with the number of trees, this parameter may not be very relevant in binary classifications, since the use of two classes reduces the voting options of each set of trees in the data set. In general, the classification error of this algorithm depends on the strength of the individual trees and the correlation between any two trees in the forest. Strength can be interpreted as a performance measure for each tree. Increasing the correlation increases the error rate of the forest, whereas increasing the strength of the individual trees decreases it, since a tree with a low error rate is a strong classifier. On the other hand, reducing the number of randomly selected attributes reduces both correlation and strength [85,130]. In our study, we selected 400 trees. In several studies of burned areas using the RF classifier, the number of trees used commonly ranges from 100 to 1500 [83,84,131].
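The dependence on strength and correlation can be stated compactly. As a sketch, the upper bound from Breiman's original random forest formulation is reproduced here; the symbols $PE^{*}$ (generalization error of the forest), $\bar{\rho}$ (mean correlation between trees), and $s$ (strength of the individual trees, assumed positive) are notation introduced only for this illustration:

$$
PE^{*} \le \frac{\bar{\rho}\,\left(1 - s^{2}\right)}{s^{2}}
$$

The bound decreases as $s$ grows and increases with $\bar{\rho}$, which matches the qualitative behavior described for the forest error rate.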

In contrast, kNN, even with accuracy values very close to those of RF, mainly in the AUC parameter, depends directly on the k parameter, processing time, and memory. For a given k value, more training samples are needed to improve performance, at the cost of more time and storage memory. In this study, the k value was selected based on the RMSE. Therefore, the disparities found in the mapping quality of this classifier can be attributed to other parameters not tested here. The values are consistent with the studies conducted by Meng et al. [132]. The value of k may not produce significant differences in the final mapping result; however, it directly influences the processing time. It is worth mentioning that, for k = 5, the processing time was 0.56 min, while, for k = 20, it was approximately 70 min. Such processing times depend significantly on the resources of the computer used and are common for kNN classifications, depending on the size and composition of the data set [133]. Blanzieri and Melgani [134] show that the best values of k were found empirically below k = 5 using SAR data, which could be explained by the image filtering applied to the true soil homogeneity. This indicates that the decrease in k is associated with the registration of optically active elements in the images. This statement is also related to the location of the pixels to be classified in relation to the training samples. When the k neighboring pixels are close enough, the prediction will naturally tend toward the value of the sample pixel set, resulting in a decrease in time and error.
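The RMSE-based choice of k can be illustrated with a minimal sketch. It is not the implementation used in this study: the function names, the candidate k values, and the (red, NIR) reflectance samples are all hypothetical, and a plain majority vote with Euclidean distance is assumed.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training samples nearest to x (Euclidean distance)."""
    nearest = sorted((math.dist(p, x), label) for p, label in zip(train_X, train_y))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def rmse(y_true, y_pred):
    """Root-mean-square error between reference and predicted 0/1 labels."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def select_k(train_X, train_y, val_X, val_y, candidates=(1, 3, 5)):
    """Return the candidate k with the lowest validation RMSE, plus all scores."""
    scores = {}
    for k in candidates:
        preds = [knn_predict(train_X, train_y, x, k) for x in val_X]
        scores[k] = rmse(val_y, preds)
    return min(scores, key=scores.get), scores

# Hypothetical (red, NIR) reflectances; 1 = burned, 0 = unburned.
# The last training sample is a deliberately mislabeled pixel.
train_X = [(0.05, 0.10), (0.06, 0.12), (0.07, 0.11),
           (0.04, 0.35), (0.05, 0.40), (0.06, 0.38),
           (0.06, 0.13)]
train_y = [1, 1, 1, 0, 0, 0, 0]
val_X = [(0.06, 0.13), (0.05, 0.37)]
val_y = [1, 0]

best_k, scores = select_k(train_X, train_y, val_X, val_y)
print(best_k, scores)
```

In this toy data set a single mislabeled neighbor misleads k = 1, while larger k values outvote it. The selected value here has no bearing on the k = 5 reported in the study. Because every prediction scans all training samples, the run time of this brute-force version grows with the size of the training set.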

#### *4.4. Accuracy Analysis*

In general, all of the classifications show low CE, occurring more frequently within the perimeter affected by the fire, although, for ASTER, there is a significant presence of misclassified mixed pixels and CE outside the burned area. This behavior may be related to the confusion of the classifiers in distinguishing between burned areas and dark soils with little vegetation. This was quite evident in the classification with ASTER images, since, for this sensor, only three bands (green, red, and NIR) were used, that is, fewer resources for feature detection, which also favored an increase in false-alarm pixels in relation to the other sensors for both classification algorithms, as also found in [59]. The results obtained with the OLI and MSI sensors support this spectral interpretation, because the NIR and SWIR bands used in the classification probably contributed to the low CE. These channels strongly reflect changes in the vegetation state and show high separability between burned and unburned areas, as shown in Table 3 and in several works [46,135–137] that also used this region of the spectrum for the separation of burned areas with satisfactory results. According to Lambin et al. [138], reflectance generally decreases in the NIR range after the fire event due to the removal of water-retaining vegetation by the fire. The decrease in brightness is more substantial than in the visible, which makes the NIR range more suitable for discriminating burned areas. The low CE for OLI, MSI, and ASTER can be attributed to the higher spatial resolution, since this condition improves the performance of classification algorithms mainly in places with a homogeneous and more compact distribution of the burned area [31,139]. For the classifications performed with MODIS, the largest CE of the data series were observed, with pixels well distributed throughout the affected perimeter. In this case, the low spatial resolution of this sensor was the main cause of the errors, causing a high frequency of underestimated pixels inside and outside the burned area.
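Band separability between burned and unburned samples can be sketched as follows. The M index (absolute difference of class means divided by the sum of class standard deviations) is assumed here as one commonly used measure; it is not confirmed to be the index behind Table 3, and the reflectance values are hypothetical.

```python
from statistics import mean, stdev

def m_index(burned, unburned):
    """M = |mean_b - mean_u| / (sd_b + sd_u); values above 1 suggest good separability."""
    return abs(mean(burned) - mean(unburned)) / (stdev(burned) + stdev(unburned))

# Hypothetical reflectance samples, for illustration only.
nir_burned, nir_unburned = [0.10, 0.12, 0.11, 0.09], [0.35, 0.40, 0.38, 0.37]
green_burned, green_unburned = [0.05, 0.06, 0.07, 0.06], [0.06, 0.07, 0.05, 0.08]

m_nir = m_index(nir_burned, nir_unburned)
m_green = m_index(green_burned, green_unburned)
print(round(m_nir, 2), round(m_green, 2))
```

With these illustrative samples the NIR band separates the two classes clearly, while the green band does not, because the class means almost coincide relative to their spread.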

OE, represented by burned pixels mistakenly classified as unburned, presented significant and well-distributed values on the maps, particularly in the eastern sector of the burned area for both classifiers. These errors are related to the high frequency of pixels belonging to small urban centers located within the burned area, which were correctly classified as unburned; however, because of spectrally mixed pixels at the edges of these features, many burned pixels were omitted in the classification. This problem was also found in [46,140], which reported moderate performance in mapping burned areas in optically complex locations, caused by classification ambiguity and spectral mixing.

In the western sector, the same problem occurred, but more frequently, because, in addition to the housing polygons, agricultural areas also caused confusion in the classifiers. This directly influenced the spatial distribution of the missing pixels in burned areas, where both of the sensors presented areas between 17 and 18 km<sup>2</sup> for the kNN classifier and between 16 and 17 km<sup>2</sup> for RF, that is, a high frequency of pixels incorrectly classified as unburned areas.

For ASTER images, the classifications presented the smallest OE, with 8.73 km<sup>2</sup> of missing pixels for kNN and 8.19 km<sup>2</sup> for RF. These values were expected, since this sensor has the best spatial resolution of the set of images and, consequently, fewer spectral mixing problems, even using only the green, red, and NIR bands. In addition, the use of ASTER images limited the overestimation of the burned areas owing to the pixel size, which is better suited to classifying unburned areas located within the investigated fire polygon [141].
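The area figures relate to pixel counts through the ground sampling distance. The following sketch assumes square pixels at the native sensor GSD; the function name is hypothetical, and the pixel count is not taken from the study but chosen so that the result matches the 8.73 km<sup>2</sup> reported for kNN.

```python
def pixels_to_km2(n_pixels, gsd_m):
    """Area in km^2 covered by n_pixels square pixels with side gsd_m (meters)."""
    return n_pixels * gsd_m ** 2 / 1_000_000

# Hypothetical count of omitted pixels at the 15 m ASTER GSD.
area = pixels_to_km2(38_800, 15)
print(area)
```

Each 15 m pixel covers 225 m<sup>2</sup>, so an omission area of this size corresponds to tens of thousands of ASTER pixels under this assumption.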

In contrast, despite the lower spatial resolution of MODIS, there was a moderate frequency of missing pixels within the burned area when compared to the other sensors, which decreased the sensor's OE to ~13–14 km<sup>2</sup>. This is more evident at the upper border, as shown in Figure 9g,h, the transition zone between burned and unburned areas, which is more susceptible to errors caused by low spatial resolution. Another influencing factor is the MODIS image compositing process, which selects the best pixel within the eight-day period. The result is an image of moderate quality, since some information is lost.

For the classifications generated with the higher spatial resolution sensors (OLI, MSI, and ASTER), errors were found in different land-use elements of the study area. For example, in the products generated from the OLI and MSI scenes, we detected a high frequency of OE on the main highways that cross the area affected by the fire, especially the highways N2 and N244 (Figure 9a,b,k,l), thus showing the limitation of the ICNF product in the detection of burned areas in these features. These errors were also found in the ASTER images, but more frequently in the areas of pasture and agriculture, with approximately 0.06–0.1 km<sup>2</sup> (Figure 9g,h). However, the reference product proved to be advantageous in areas of soil degradation, which the kNN and RF classifiers erroneously classified as burned areas, as shown in Figure 9i,j.

Finally, for all sensors, the kNN and RF classifiers were not efficient in differentiating water bodies from burned areas, causing several CE pixels, as shown in Figure 9c–f. This result is in accordance with Roy et al. [74], Palomino-Ángel et al. [142], and Shimabukuro et al. [143], who reported classification errors in burned areas caused by the spectral similarity with water bodies.

#### *4.5. OA and Algorithm Errors*

Overall, the classifications present good estimates of OA and DC. The OA values also reflect the correct classification of unburned areas; for this reason, OA should be interpreted with care and not used as the only thematic quality parameter [61]. The high DC values, summarized in Table 5, show good agreement with the reference data for the burned class, even considering the sensitivity of this parameter to false alarms and missing pixels shown in the maps of Figure 7.
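The relationship between these parameters can be made concrete with a minimal sketch. It assumes the usual confusion-matrix definitions of OA, CE, OE, and the Dice coefficient for a binary burned/unburned map; the function name and the pixel counts are hypothetical.

```python
def accuracy_metrics(tp, fp, fn, tn):
    """Burned-area accuracy measures from confusion-matrix pixel counts.

    tp: burned in both map and reference   fp: burned in map only
    fn: burned in reference only           tn: unburned in both
    """
    total = tp + fp + fn + tn
    return {
        "OA": (tp + tn) / total,           # overall accuracy
        "CE": fp / (tp + fp),              # commission error (false alarms)
        "OE": fn / (tp + fn),              # omission error (missed burned pixels)
        "DC": 2 * tp / (2 * tp + fp + fn)  # Dice coefficient
    }

# Hypothetical counts, for illustration only.
m = accuracy_metrics(tp=80, fp=5, fn=20, tn=895)
print({name: round(value, 3) for name, value in m.items()})
```

In this example OA reaches 0.975 even though one in five burned pixels is omitted, because the many correctly classified unburned pixels dominate the total. DC ignores those pixels, which is why it is the more informative measure for the burned class.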

Although Tanase et al. [144], in studies of burned areas in Tropical Africa, suggested that temporally short sampling units may underestimate the accuracy of the detection of burned areas, Schroeder et al. [145] showed, in their studies in the Brazilian Amazon, that the imaging date must be as close as possible to that of the spatial reference data, which may have intensified the OE or increased areas with different time on the hour scale. Change detection methods based on the application of temporal metrics to assess sudden variations in the pixel signature of moderate and coarse resolution sensors are gaining importance as better-quality satellite data sets become available [146,147].

In general, ASTER presented the highest values of OA and DC in relation to the other sensors, possibly because its spatial resolution has a greater influence in detecting the details of fire scars. The MODIS sensor showed the lowest values of OA and DC of all the sensors, although these values were still high. These data are important, as they show that even the low spatial accuracy of MODIS in relation to the reference map, together with OE and CE greater than 10%, did not drastically decrease the estimates of OA and DC: for both classifiers and all sensors, the maps were considered excellent according to Cohen's classification [117]. The same behavior was seen in Lanorte et al. [141], who showed, in applications of ASTER and MODIS sensors in burned areas in southern Italy, that these data were efficient in allowing the detection of burned areas and discriminating the severity of the fire.

The OLI and MSI sensors did not show significant variations in OA and DC, with MSI displaying the best results, which is attributed to the low OE obtained in the classification. An identical result was found in [71,148], which reiterated that the classification provided by Sentinel-2 is more accurate than that of Landsat-8 because of the higher spatial resolution of Sentinel-2 images. Because of this, the burned areas obtained with the classification process on Landsat-8 may have been overestimated. Other studies following this approach also found similar OA values, for example, 90% in Axel [149], 79.2% in Liu et al. [75], 95% in Libonati et al. [61], 94.7% in Zhang et al. [150], 99% in Alonso-Cañas and Chuvieco [46], and 96% in Roy et al. [74].

It is worth noting that both classifiers require the modeler to choose numerous parameters, which lead to different performances. In general, the classifiers based on kNN and RF delivered high-quality classifications of burned areas, with AUC values above 0.88, DC above 76%, and OA above 89%, in addition to the ability to process data efficiently and enable parallel training of the same samples in different orbital data sets.

Therefore, the results show a statistically significant ROC curve with an AUC varying between 0.88 and 0.94 for both algorithms, showing that, even in the case of supervised classifications, approximately 90% of the burned areas were well classified by the algorithms in the different sensors. This result agrees with the initial study by Chou et al. [151], where the classification improvement was significant when accounting for spatial autocorrelation in logistic fire probability models in Southern California. Likewise, Siljander [152] found AUC values on the order of 0.86–0.94, indicating that the fire classification models that accounted for the spatial distribution of the affected areas were superior in estimating burned area on a regional scale when compared with global-scale burned area products. In addition, Dlamini [153] found high precision with AUCs of 0.94 and 0.97 in Bayesian network models for active fire and burned area data in ASTER images, respectively. The author also stressed the validity of the Bayesian networks and noted that probability estimation based on burned area data can estimate the fire risk slightly better than estimation based on active fire data.
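For readers less familiar with the AUC, the following sketch computes it as a rank statistic: the fraction of (burned, unburned) pixel pairs in which the burned pixel receives the higher classifier score, with ties counted as one half. This is a generic formulation, not the procedure used in this study, and the scores and labels are hypothetical.

```python
def auc_rank(scores, labels):
    """AUC as the probability that a burned pixel outscores an unburned one.

    Ties count as half a win (Mann-Whitney U statistic divided by n_pos * n_neg).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical burned-class scores and reference labels (1 = burned).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
auc = auc_rank(scores, labels)
print(round(auc, 3))
```

Under this reading, an AUC between 0.88 and 0.94 means that a randomly chosen burned pixel receives a higher score than a randomly chosen unburned pixel in roughly 88–94% of such pairs.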

#### **5. Conclusions**

Based on kNN and RF classifiers and using Landsat-8, Sentinel-2, and Terra imagery, a methodology is proposed for assessing their performance in the classification of burned areas in a forest fire that occurred in central Portugal. The main conclusions are as follows:

(i) Lower separability is observed for the visible and SWIR bands in all sensors, particularly in the green range, and high separability for the NIR region.

(ii) For the kNN classification algorithm, k = 5 was found to be the best parameter. Likewise, for RF, 400 trees were selected as the optimal value.

(iii) No significant differences were found between the burned areas obtained with each classifier for each sensor.

(iv) When compared with ICNF validation data, the lowest errors in the total burned area were found in the classifications performed with ASTER and the largest with MODIS.

(v) Contrary to expectation, the classification performed with OLI had greater precision but lower accuracy when compared to MSI. In general, high precision and accuracy were found in the classifications.

(vi) The lowest CE (<5%) were found in the classifications carried out with kNN and RF for OLI, MSI, and ASTER, and large CE, on the order of 15%, for MODIS, with a significant presence of CE outside the burned areas for ASTER. Regarding OE, significant and well-distributed values were found for all sensors (8–20%), particularly in the eastern sector of the burned area, with the lowest values for ASTER.

(vii) The classification that was based on kNN and RF for the different sensors mapped the burned area with a very high accuracy (OA > 89% and DC > 0.8). The results show a statistically significant ROC curve with an AUC varying between 0.88 and 0.94 for both classifiers, showing that, even in the case of supervised classifications, approximately 90% of the burned areas were well classified by the algorithms in the different sensors.

The visible, intermediate, and SWIR bands showed low separability values, consistent with the results of Pereira et al. [118], who stated that the spectral changes induced by fire in the SWIR are similar to those in the visible range, since the burned areas are generally more reflective than green vegetation, but darker than vegetation, predominantly in savannas during the dry season. It is important to note that the SWIR band has the advantage of low interference from atmospheric scattering during the scene recording process. Following this premise, there may be no significant reduction in the spectral contrast of the surface in the images, consequently resulting in increased separability indices. However, this behavior was not observed in our experiments.

This methodology can be useful for mapping burned areas in regions of native vegetation and for improving methods for monitoring burned areas in Portugal, in addition to assisting fire management in the region and estimating the impacts of fire. The availability of detailed information on the spatial and temporal distributions of burned areas is currently crucial. Therefore, the applied method makes it possible to survey fire scars using geospatial data with the greatest possible accuracy, assisting in the maintenance of an information bank that serves not only the management of the territory, but also the comparison with related future events.

In general, the errors found in both kNN and RF classifiers can also be related to the creation of very heterogeneous objects, even in a region with a predominance of sparse vegetation. Despite the similar results of OE and CE and the differences in the processing of each algorithm, it was shown that spectral resolution and, especially, spatial resolution are more important factors in the classification of burned areas. OE and CE are directly linked to the burned areas used as reference mapping, as product incompatibility can generate low generalization capacity and, consequently, OE and CE close to 100%, as found in Lizundia-Laiola et al. [154].

Finally, this study opens up the possibility of using multiple Earth Observation data to assess environmental disturbances, increasing the range of possibilities for implementing these data when, for example, a scene or a specific band is unavailable for a given period or cloud cover is a problem.

**Author Contributions:** Conceptualization, A.d.P.P., J.A.d.S.J., A.M.R.-A. and R.F.F.H.; Data curation, A.d.P.P., J.A.d.S.J., A.M.R.-A. and R.F.F.H.; Formal analysis, A.d.P.P., J.A.d.S.J., A.M.R.-A. and R.F.F.H.; Investigation, A.d.P.P., J.A.d.S.J., A.M.R.-A. and R.F.F.H.; Methodology, A.d.P.P., J.A.d.S.J., A.M.R.-A. and R.F.F.H.; Supervision, A.d.P.P., J.A.d.S.J., A.M.R.-A. and R.F.F.H.; Validation, A.d.P.P., J.A.d.S.J., A.M.R.-A. and R.F.F.H.; Writing—original draft, A.d.P.P., J.A.d.S.J., A.M.R.-A. and R.F.F.H.; Writing review & editing, A.d.P.P., J.A.d.S.J., A.M.R.-A. and R.F.F.H. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Acknowledgments:** Research was supported by PAIUJA-2019/2020 and CEACTEMA from University of Jaén (Spain), and RNM-282 research group from the Junta de Andalucía (Spain). Special thanks to the four anonymous reviewers for their insightful comments.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

