#### 2.2.2. Training Samples

The initialization of a supervised classification process requires composite images and training samples (polygons). The sample polygons selected in the composite images were used to obtain the class descriptors for the burned and unburned areas. The training areas were randomly collected, with 30 polygons covering an area of 65 km<sup>2</sup> for each class, respecting the class separation limits defined by the ICNF burned area product.

#### 2.2.3. Separability Analysis

The purpose of the separability analysis was to evaluate the spectral separability of burned and unburned areas in all of the bands used in the classification and, for instance, to inform the decision of which bands have greater discriminative power in supervised classification algorithms. The separability of each pair of classes can be quantitatively measured by the average distance between the class density distributions (or histograms) of the values of each band [75]. The Jeffries–Matusita (JM) distance is one of the most widely used criteria in remote sensing for pattern recognition and feature selection. In comparison with other separability indices, the JM distance has been suggested as more reliable for separability measures, and also more suitable for less homogeneous classes [102]. Therefore, we chose the JM distance to quantify the separability between the burned and unburned vegetation. It is calculated according to Equation (1), as [103]:

$$JM\_{ij} = 2\left(1 - e^{-B\_{ij}}\right) \tag{1}$$

where *B<sub>ij</sub>* is the Bhattacharyya distance between classes *i* and *j*, given by Equation (2), as:

$$B\_{ij} = \frac{1}{8} \left(\mu\_j - \mu\_i\right)^T \left[\frac{\Sigma\_i + \Sigma\_j}{2}\right]^{-1} \left(\mu\_j - \mu\_i\right) + \frac{1}{2} \ln\left(\frac{\left|\frac{\Sigma\_i + \Sigma\_j}{2}\right|}{\sqrt{\left|\Sigma\_i\right|\left|\Sigma\_j\right|}}\right)\tag{2}$$

For classes *i* and *j*, *μ* is the mean vector of the reflectance values and Σ is the variance-covariance matrix. Previous research has shown that the JM distance can provide a more accurate classification than other distance measures, such as the Euclidean distance or divergence [104]. It ranges between 0 (completely inseparable) and 2 (completely separable) [102].
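The two equations above can be sketched directly in code. The following is a minimal illustration, not the authors' implementation: `x_i` and `x_j` are hypothetical arrays of per-band reflectance samples for the two classes, and the JM distance is returned on its 0–2 scale.

```python
import numpy as np

def jm_distance(x_i, x_j):
    """Jeffries-Matusita distance between two classes of band samples.

    x_i, x_j: (n_samples, n_bands) arrays of reflectance values for
    classes i and j (illustrative inputs, not the paper's data).
    """
    mu_i, mu_j = x_i.mean(axis=0), x_j.mean(axis=0)
    cov_i = np.cov(x_i, rowvar=False)
    cov_j = np.cov(x_j, rowvar=False)
    cov_m = (cov_i + cov_j) / 2.0

    diff = (mu_j - mu_i).reshape(-1, 1)
    # Bhattacharyya distance, Equation (2)
    b = (diff.T @ np.linalg.inv(cov_m) @ diff / 8.0
         + 0.5 * np.log(np.linalg.det(cov_m)
                        / np.sqrt(np.linalg.det(cov_i) * np.linalg.det(cov_j))))
    # JM distance, Equation (1): 0 = inseparable, 2 = fully separable
    return float(2.0 * (1.0 - np.exp(-b)))
```

Two well-separated Gaussian clouds give a JM value close to 2, while identical samples give a value close to 0.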

#### 2.2.4. kNN Classifier

The kNN algorithm proposed by Aha et al. [105] is an instance-based learning method that classifies elements based on the k closest training samples in the feature space. The parameter k plays an important role in spatial prediction, being the main adjustment parameter of the kNN algorithm. kNN is a common classification tool in remote sensing data mining applications [63,105], and it is widely used for mapping burned areas [106,107]. kNN is a non-parametric MLA, which makes no assumptions regarding the underlying data set. This is important when classifying processes of change in the territory, such as floods and fires, for which there is little or no prior knowledge of the data distribution. In kNN, a pixel whose class is unknown is assigned to the class described by its spectrally closest neighbors, whose class identities are known. Figure 3 shows the scheme of the kNN algorithm.

**Figure 3.** k-Nearest Neighbor (kNN) classification scheme (**a**–**c**).

Initially, the parameter k, which represents the number of closest neighbors, must be selected. In the case of k = 5 in a binary problem, the five closest points are identified by the Euclidean distance between the point to be classified and all points in the data set. Through these k nearest neighbors, it is possible to determine which class the point is most similar to, and the unknown point is classified accordingly. The parameter k plays an important role in the performance of kNN, being its main adjustment parameter. In this study, we tested different k values (5 to 20) to select the ideal parameter for the kNN classifier, based on the lowest estimate of the Root Mean Square Error (RMSE) over different subsets of data. However, previous studies, such as Cariou et al. [108] and Noi and Kappas [63], revealed that this is not the only criterion for selecting an appropriate k value, because small and large k values each have characteristics that suit different cases. We used the SNAP (Sentinel Application Platform, ESA) software for this classification.
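The k-selection procedure described above can be sketched as follows. This is an illustrative reimplementation with scikit-learn on synthetic two-band samples (the study itself used SNAP); the band means and labels are invented for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the composite-image training samples:
# two bands, label 1 = burned, 0 = unburned (illustrative data only).
rng = np.random.default_rng(42)
burned = rng.normal([0.35, 0.10], 0.05, size=(300, 2))
unburned = rng.normal([0.15, 0.30], 0.05, size=(300, 2))
X = np.vstack([burned, unburned])
y = np.array([1] * 300 + [0] * 300)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

# Sweep k = 5..20 and keep the value with the lowest RMSE,
# mirroring the selection criterion described above.
best_k, best_rmse = None, np.inf
for k in range(5, 21):
    pred = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr).predict(X_te)
    rmse = float(np.sqrt(np.mean((pred - y_te) ** 2)))
    if rmse < best_rmse:
        best_k, best_rmse = k, rmse
print(best_k, round(best_rmse, 3))
```

On real imagery the RMSE would be estimated over several data subsets rather than a single hold-out split.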

#### 2.2.5. RF Classifier

The RF algorithm is based on the creation of several decision trees, combining them to obtain a more accurate and stable prediction. According to Rodríguez-Galiano et al. [109], the RF algorithm has advantages in the remote sensing field, as it generates an internal unbiased estimate of the generalization error, represented by the Out of Bag (OOB) error, which is a way of validating the RF model. It is also relatively robust to outliers and noise, in addition to being computationally lighter than other tree ensemble methods. The RF is trained using bootstrap aggregation, where each new tree is fitted on a bootstrap sample of the training observations. The OOB error is the average error for each observation, calculated using predictions from the trees that do not contain it in their respective bootstrap sample. This allows the RF classifier to be adjusted and validated while being trained [110].

The Information Gain Ratio criterion [111] and the Gini Index [112] are the attribute selection measures most frequently used to induce decision trees. We chose the Gini Index, which measures the impurity of an attribute with respect to the classes: for a given training set T, it expresses the probability that a randomly selected case (pixel) would be incorrectly assigned to a class.

In this work, the RF classification was tested with 10 to 400 trees for the image set composed for each sensor. About one-third of the training samples, left out of each bootstrap sample, was used to estimate the error associated with the predictions, the above-mentioned OOB error. In RF, the parameter MTRY controls the number of variables available to split at each node of a tree [113]. In this study, the default value provided by the SNAP software was used.
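The tree sweep and OOB estimate described above can be sketched as follows. This is an illustrative scikit-learn version on the same kind of synthetic two-band samples (the study used SNAP); `max_features` is left at its default, which in scikit-learn is the square root of the number of features.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Illustrative two-band samples (not the paper's data).
rng = np.random.default_rng(7)
X = np.vstack([rng.normal([0.35, 0.10], 0.06, size=(300, 2)),
               rng.normal([0.15, 0.30], 0.06, size=(300, 2))])
y = np.array([1] * 300 + [0] * 300)

# Sweep the number of trees and track the OOB error, mirroring the
# 10-to-400-tree test described above.
for ntree in (10, 50, 100, 200, 400):
    rf = RandomForestClassifier(n_estimators=ntree, oob_score=True,
                                random_state=0).fit(X, y)
    oob_error = 1.0 - rf.oob_score_  # OOB error = 1 - OOB accuracy
    print(ntree, round(oob_error, 4))
```

Because the OOB error is computed during training, no separate validation split is needed for this step.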

#### 2.2.6. Validation and Accuracy Analysis

The validation of remote sensing data is generally based on measurements obtained in field campaigns, which serve as an in situ reference. In many cases, the validation process is carried out with remote sensing products provided by official institutions or by sensors with high spectral or spatial resolutions. In this work, the validation product used as a reference was the 2019 annual burned area atlas provided by the National Institute for Conservation of Nature and Forests (ICNF) of Portugal.

The data were made available on the website http://www.icnf.pt/ (accessed on 14 February 2021) in ESRI shapefile format, covering the entire national territory through polygons representing the areas affected by fires, coupled with information such as area, date, duration, and the cause that started the fire. The national mapping of the burned areas, compiled from all geospatial files, comes from semi-automatic classification processes using Landsat, Sentinel, or other satellite images [114].

The quality of a given thematic map that is derived from remote sensing data is generally assessed by systematic comparison with other maps also derived from remote sensing [6]. Quality assessment is generally carried out based on verification measures derived from confusion matrices [115]. The choice of validation methods and objectives must be guided by the end use of the products. The cross-tabulation approach is the most common way to assess thematic accuracy. In this context, the comparison and analysis of the quality of the burned area maps that were obtained by the kNN and RF classifications in the different tested sensors were carried out.

The burned area polygon obtained from the ICNF map was used as the spatial reference in this study. The pixel-based analysis was based on a confusion matrix (Table 2). Following the terminology presented by Fawcett [116], the reference data (true class) will be referred to as positive or negative (burned or unburned). If the instance is positive (burned) and classified as positive (burned), it is counted as a true positive (TP); if it is classified as negative (unburned), it is counted as a false negative (FN). On the other hand, if the instance is negative (unburned) and it is classified as negative (unburned), it is counted as a true negative (TN); if it is classified as positive (burned), it is counted as a false positive (FP) (Table 2).


**Table 2.** Confusion matrix between the reference product and the burned/unburned classified areas.

| Reference \ Classified | Burned | Unburned | Total |
|---|---|---|---|
| Burned | TP | FN | TP + FN |
| Unburned | FP | TN | FP + TN |
| Total | TP + FP | FN + TN | TP + FN + FP + TN |

The confusion matrices aim to determine the probability of detection of burned areas across the different fraction sizes of this area at the study site. This accounts for the error inherent to the burned areas due to the differences between the reference product and the resolutions of the sensor images. According to Cohen [117], the classification methods are evaluated using statistical parameters such as the Omission Error (OE), Commission Error (CE), Overall Accuracy (OA), and Dice Coefficient (DC).

OE is related to the producer's accuracy, that is, when a pixel is classified as unburned while it is really burned. CE is related to the user's accuracy, that is, when a pixel is attributed to the burned class when it does not really belong to it. OA is defined as the fraction of pixels correctly classified as burned or unburned [61]. Finally, DC is a measure of similarity between the classified and reference maps in terms of the number of common burned pixels.

OE and CE vary on a scale of 0–100%, where the lowest values indicate the best estimates. For OA and DC, on the contrary, the largest values indicate the best estimates.
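From the confusion-matrix counts in Table 2, the four measures can be computed directly. The sketch below uses the standard formulas consistent with the definitions above (OE and CE in percent, DC in 0–1); the counts passed in are hypothetical, for illustration only.

```python
def burned_area_metrics(tp, fn, fp, tn):
    """Accuracy measures from confusion-matrix counts (Table 2)."""
    oe = 100.0 * fn / (tp + fn)                   # omission error: burned missed
    ce = 100.0 * fp / (tp + fp)                   # commission error: falsely burned
    oa = 100.0 * (tp + tn) / (tp + fn + fp + tn)  # overall accuracy
    dc = 2.0 * tp / (2.0 * tp + fp + fn)          # Dice coefficient
    return oe, ce, oa, dc

# Hypothetical pixel counts, for illustration only.
oe, ce, oa, dc = burned_area_metrics(tp=900, fn=100, fp=50, tn=950)
print(round(oe, 1), round(ce, 2), round(oa, 1), round(dc, 3))
# → 10.0 5.26 92.5 0.923
```

Note that OE and CE describe the burned class only, whereas OA also rewards correct unburned pixels, which is why a map can show a high OA while still missing a sizable fraction of the burned area.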

#### 2.2.7. ROC Curve and AUC

The ROC curve has been used in burned area studies to verify the overall performance of classifiers and models. The ROC curve and a useful statistic calculated from it, the area under the curve (AUC), are mainly used to compare diagnostic tests and act as a performance measure for binary classification. The AUC value, as in Equation (3), shows the success rate of the model through the analysis of the training data set and its prediction rate for the tested data set.

$$\text{AUC} = \frac{\sum \text{TP} + \sum \text{TN}}{\text{M} + \text{N}} \tag{3}$$

where M and N are the total number of pixels in the burned and unburned areas, respectively. An AUC value close to 1 indicates better performance: a value of 1 indicates a perfect model, while a value close to 0 indicates a poorly performing model. Between these extremes, the model performance is classified as excellent (0.9–1), very good (0.8–0.9), good (0.7–0.8), medium (0.6–0.7), and poor (0–0.6).
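Equation (3) and the interpretation bands above can be expressed as a short sketch. Note that Equation (3) is a simplification built from the final TP/TN counts; a full ROC analysis would sweep the classifier's decision threshold. The counts below are hypothetical.

```python
def auc_eq3(tp, tn, m, n):
    """AUC as defined in Equation (3): correctly classified burned (TP)
    and unburned (TN) pixels over the class totals M and N."""
    return (tp + tn) / (m + n)

def rate_performance(auc):
    """Interpretation bands quoted in the text."""
    if auc >= 0.9:
        return "excellent"
    if auc >= 0.8:
        return "very good"
    if auc >= 0.7:
        return "good"
    if auc >= 0.6:
        return "medium"
    return "poor"

# Hypothetical counts: 900 of 1000 burned and 950 of 1000 unburned correct.
auc = auc_eq3(tp=900, tn=950, m=1000, n=1000)
print(round(auc, 3), rate_performance(auc))  # → 0.925 excellent
```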

### **3. Results**

#### *3.1. Spectral Separability Analysis*

Table 3 summarizes the JM separability values at the study site, where the burned and unburned pixels were analyzed for each spectral band used between the OLI, MSI, ASTER, and MODIS sensors.

**Table 3.** Jeffries–Matusita (JM) separability values and band for the OLI, MSI, ASTER, and MODIS sensor bands used in the classification.


In general, lower separability is observed for the visible bands in all sensors, mainly for bands B1 and B2 of ASTER, and especially in the green range for OLI, MSI, and ASTER, where the bands presented low separability values, with the exception of MODIS, which presented slightly greater separability in this range.

The near infrared (NIR) is the spectral region where the signal of recent fire scars is strongest, being generally considered the best spectral region for detecting and mapping burned areas [118] and, therefore, of crucial contribution to digital image classification processes. This is seen in the results of Table 3, with high separability values for all sensors, even with some existing spectral and spatial resolution disparities. In addition, the results reflect the spectral resolution of the sensors, where the narrower near-infrared range of MODIS and OLI (Table 1) ensured greater separability, in contrast with the broader range of MSI and ASTER, even with a slight difference in the spatial resolution and pre-processing methods.

In the visible–NIR transition bands of the MSI sensor, there was high separability for bands B6 = 1.82 and B7 = 1.83. Band B5, however, presented low separability (0.45), because it is closer to the red band than bands B6 and B7.

The short-wavelength infrared (SWIR) bands showed low JM separability values.

#### *3.2. kNN Training*

In this study, we tested different k values (5 to 20) to select the ideal kNN classifier parameter for each set of images. The lowest RMSE value was used as the criterion to select the best k parameter. From Figure 4 we can see that, after the tests, the k parameter was set to 5: the lower the value of k, the higher the accuracy of the classification.

**Figure 4.** Evaluation of the performance of the kNN classifier with RMSE in relation to k value.

#### *3.3. RF Training*

Figure 5 shows the distribution of OOB errors for numbers of trees ranging from 10 to 400. It is observed that, for the same number of trees, the classification error between the sensors does not change significantly. However, as the number of trees increases, the error decreases considerably. In this study, we used the number of trees with the lowest OOB error; 400 proved to be the best value. One of the advantages of the RF classifier is its favorable processing time, which was verified in this work: the classification performed with 10 trees took 10 s, while with 400 trees it took two minutes, a reasonably acceptable interval.

**Figure 5.** Evaluation of the performance of the RF classifier with the Out of Bag (OOB) error in relation to the number of trees (ntree).

#### *3.4. Burned Area Analysis*

Figure 6 and Table 4 show the pixel distribution and the size of the burned area for the classifications provided by the different sensors with both the kNN and RF algorithms. The finer spatial resolution of OLI, MSI, and ASTER yielded a burned area with greater spatial detail, but with a lower density of features. In turn, the map generated by MODIS presented, as expected, a burned area with less detail at the edges and a high distribution of overestimated features within the burned area. When comparing the classifiers, the maps showed no significant visual differences, with variations in the burned areas ranging between 0.36 and 1.43 km<sup>2</sup>, the smallest difference being for MODIS (0.36 km<sup>2</sup>) and the largest for MSI (1.43 km<sup>2</sup>). However, they presented important errors in the total burned area when compared to the ICNF reference map. These errors are not constant, ranging between 4.3% and 51.1% (Table 4), the difference being sensitive to the technical specifications of the images.

**Figure 6.** Spatial distribution of the burned area for the Random Forest (RF) and kNN classifiers in the OLI, MSI, ASTER, and MODIS classifications.

**Table 4.** Size of the burned area obtained with each classifier for each sensor, the size of the burned area in the reference map (ICNF), the differences between the areas obtained with each classifier, and the errors compared with the reference map, together with the percentage that they represent with respect to the reference area.

