**1. Introduction**

Floods are among the most common and disastrous natural events worldwide, causing loss of human life, damage to heritage and the environment, and economic costs [1]. These events are becoming progressively more recurrent and catastrophic, not only because of extreme weather related to climate change but also because of the growing number of exposed elements resulting from socio-economic activities that settle in fluvial spaces [2]. Thus, to implement flood risk management measures, it is necessary to understand the magnitude, severity, and frequency of past events [3]. Flood disaster assessment must therefore determine the extent of a flood event, which depends on timely and effective monitoring at a regional scale [4]. However, this diagnosis is hampered by the insufficient coverage and density of meteorological and gauging stations [5].

Taking advantage of the accelerated development of remote sensing techniques, massive satellite data have become a practical source for flood monitoring [6]. Several studies have been carried out for this purpose, applying different methods [7,8]. Nevertheless, identifying the surfaces covered by water is the main task in flood diagnosis, and hence it is necessary to implement remote sensing techniques that provide images with high spatial resolution and high acquisition frequency [4].

In this regard, the European Space Agency (ESA) has developed Earth observation missions under the Copernicus program for land monitoring, including the Sentinel-2 mission, which supports global satellite data with a wide swath width and multispectral bands [9]. This mission provides freely available multispectral images (MSI) with a 5-day revisit frequency and a 10 m spatial resolution for visible bands, making Sentinel-2 imagery an excellent resource for surface water mapping [7,10,11] and one of the main satellite data sources used to detect flood extents [12,13].

**Citation:** Lombana, L.; Martínez-Graña, A. A Flood Mapping Method for Land Use Management in Small-Size Water Bodies: Validation of Spectral Indexes and a Machine Learning Technique. *Agronomy* **2022**, *12*, 1280. https://doi.org/10.3390/agronomy12061280

Academic Editor: Francisco Manzano-Agugliaro

Received: 24 February 2022 Accepted: 25 May 2022 Published: 26 May 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In recent decades, different methodologies have been developed for the digital processing of satellite images, reducing costs, time, and effort [11]. These methodologies involve several techniques, which can be classified into four main classes: single-band, spectral index, machine learning, and spectral unmixing based methods [7].

Among the techniques applied, water spectral indexes are the most widely used methods to classify surfaces into water and non-water using multispectral images. These indices enhance the contrast between pixels of these two categories by exploiting the information of different bands [8,14]. In this regard, the near-infrared (NIR) band is usually used to separate water from non-water pixels because water shows high absorption and low reflectance in this band [15,16].

Water indices have been progressively improved to achieve good performance in the delineation of water bodies. In 1996, McFeeters developed the Normalized Difference Water Index (NDWI) [17], which uses the green and NIR bands to obtain an index representing water with positive values and non-water with negative values. However, this index was found to be inefficient at suppressing built-up features, so the identified water surfaces can be mixed with built-up land noise. Thus, the modified Normalized Difference Water Index (mNDWI) was implemented by using the shortwave-infrared (SWIR) band instead of the NIR band [18]. In addition, considering the classification issues caused by shadow noise, the AWE indexes (AWEIsh and AWEInsh) were developed [19]. These use different band combinations and weightings, including the blue, green, NIR, and SWIR bands, where water pixel values are above 0 and non-water pixel values are below 0. It was determined that AWEInsh is appropriate for images where shadows do not represent a problem, suppressing dark surfaces in urban areas, whereas AWEIsh successfully removes shadow noise. Likewise, other studies have used vegetation indices to extract water features [20].
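For reference, the two normalized-difference indices described above can be written as follows. The symbol ρ (surface reflectance in the named band) is notation assumed here for illustration, not notation taken from the cited works:

$$\text{NDWI} = \frac{\rho_{\text{Green}} - \rho_{\text{NIR}}}{\rho_{\text{Green}} + \rho_{\text{NIR}}}, \qquad \text{mNDWI} = \frac{\rho_{\text{Green}} - \rho_{\text{SWIR}}}{\rho_{\text{Green}} + \rho_{\text{SWIR}}}$$

For non-negative reflectances, both indices lie between −1 and 1, which is why 0 is the natural default boundary between water and non-water.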

Although the spectral water indexes normally represent water and non-water pixels with values above or below 0, it is necessary to determine a suitable threshold value depending on the characteristics of the water surfaces in each multispectral image [11]. Different studies have established optimal threshold values by visual interpretation of frequency histograms; however, this approach is impractical, especially when analyzing large multispectral images [8]. Therefore, automatic histogram-based thresholding algorithms have been applied for this purpose, with Otsu's method being the most widely used algorithm to improve the classification accuracy [11,15]. Otsu's method selects the threshold that maximizes the between-class variance in order to separate objects from background [10,21].
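The maximum between-class variance criterion can be illustrated with a short NumPy sketch. It is a generic, histogram-based version of Otsu's method written for illustration, not the implementation used in the cited studies; the function name `otsu_threshold` and the `bins` parameter are assumptions.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Return the histogram split that maximizes the between-class variance."""
    values = np.asarray(values, dtype=float).ravel()
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    weights = hist / hist.sum()

    w0 = np.cumsum(weights)                  # probability of the lower class
    w1 = 1.0 - w0                            # probability of the upper class
    cum_mean = np.cumsum(weights * centers)  # cumulative first moment
    total_mean = cum_mean[-1]

    # Between-class variance for every candidate split; splits that leave
    # one class empty are skipped.
    valid = (w0 > 0) & (w1 > 0)
    between = np.zeros_like(w0)
    between[valid] = (total_mean * w0[valid] - cum_mean[valid]) ** 2 / (
        w0[valid] * w1[valid]
    )
    # The threshold is the upper edge of the last bin assigned to the lower class.
    return edges[np.argmax(between) + 1]
```

For a water index such as NDWI, pixels with values above the returned threshold would then be labeled as water.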

Surface water mapping methodologies based on machine learning have also been applied [7,22], including classification algorithms, which can be categorized into supervised and unsupervised methods depending on the human training needed [7]. Unsupervised methods have been demonstrated to be effective in differentiating water from non-water pixels [12,23]. One family of methods is clustering, where points with similar features are automatically grouped into the same cluster, such as K-means clustering, mean-shift clustering, and expectation–maximization (EM) clustering [7,12].

Although many methods have been developed to improve water surface mapping, there are still many challenges in this subject, including the issues related to the delineation accuracy considering the resolution of remotely sensed images and the water bodies' sizes. In this case, when water surfaces have sizes similar to the spatial resolution of the satellite images used, they are hard to detect and map [7,10,24].

In an effort to address part of these challenges, some studies have improved the spatial resolution of remotely sensed images by using sharpening algorithms; however, these have been applied to large water bodies, for instance, Poyang Lake in China, Aras River in Turkey, or Po River in Italy [6,10,25,26]. Some of them used high-resolution panchromatic bands to sharpen the bands with lower resolution and improve the delineation accuracy [25]. Nevertheless, while many satellite images include a panchromatic band, some do not, such as the Sentinel-2 images. For these images, other sharpening algorithms have been applied [11,26]. These methods can be classified into three categories: pan-sharpening per band, inverting an explicit imaging model, and machine learning methods [27]. The pan-sharpening per band method increases the spatial resolution of each band independently, mixing information from a high-resolution band [11,25,28]. However, the complexity of these processes can restrict the efficiency of water body mapping [10]. In contrast, the model-based methods obtain a high-resolution image by minimizing residual errors in a single optimization for all bands simultaneously [27]. Accordingly, Brodu [29] developed an algorithm for super-resolving multiresolution images, showing good results in agricultural lands with large uniform areas. It separates band-dependent spectral information from information that is common across all bands ("geometry of scene elements"), super-resolving the low-resolution bands while preserving their overall reflectance [29]. Although this method proved efficient in improving the spatial resolution of Sentinel-2 images, it has not been directly tested to enhance flood mapping.

Considering the above, and taking into consideration the challenges in surface water mapping and the diversity of methods and techniques developed around this issue, flood mapping is considered an open research topic because no single method has been found suitable for all data sets and all conditions [7].

Therefore, by combining the advantages of Sentinel-2 multispectral images (MSI), image pre-processing techniques, spectral indexes, and machine learning methods, the present study developed a combined automated methodology for flood mapping that targets one of the biggest challenges: small-sized water bodies. Accordingly, this study involved four main phases (Figure 1): (1) improve the resolution of Sentinel-2 images by applying and comparing a resampling and a super-resolution algorithm; (2) assess and compare the effectiveness of seven spectral indexes in highlighting flood surfaces; (3) apply and compare different flood extent mapping methods, including 14 thresholding algorithms and a machine learning method for unsupervised classification; and (4) evaluate the performance of the flood mapping methods used, in order to define the most accurate combined method for flood mapping in a small-sized water body.

**Figure 1.** Methodology of the study.


The water body selected to validate the methodology framework was the Carrión River in the Duero basin, located in Palencia, Spain (Figure 2). It has a channel length of 197 km, a drainage area of 3368 km<sup>2</sup>, an average width of 40 m, an average river level of 0.5 m, an average discharge of 12 m<sup>3</sup>/s, and an annual maximum of 38 m<sup>3</sup>/s [30]. This river presents a marked lateral migration with a wide alluvial plain, where an active and/or abandoned drainage network can be identified [31]. Throughout its alluvial plain, agricultural activities and urban land uses can be found, so different elements can be exposed to flood risks [32]. Thus, although the Carrión River can be classified as a narrow water body, it presents recurrent flood events due to its morphometric characteristics, fluvial dynamics, occupations, land cover, and land uses.

**Figure 2.** Location of the study area and spectral indices' images obtained from Sentinel-2: (**a**) General localization. (**b**) Studied River. (**c**) AWEInsh. (**d**) AWEIsh. (**e**) MNDWI. (**f**) NDPI. (**g**) NDVI. (**h**) NDWI. (**i**) SAVI.

#### **2. Methods**

#### *2.1. Data Source Selection and Pre-Processing*

The imagery selected for the study was the high-resolution optical images provided by the Sentinel-2 satellite mission under the Copernicus Earth Observation program led by the European Commission and operated by the European Space Agency (ESA) (https://earth.esa.int/web/sentinel/home, accessed on 10 July 2021).


The study period and areas along the Carrión River were selected to assess post-flood events, considering several criteria. The study dates were selected on the basis of hydrological year analyses. The period selected corresponded to the 2019–2020 wet year, when the Duero basin presented rainfalls 105% higher than the 1981–2020 average value, with November and December being the most important months [33]. The study area in the basin was selected so that the water and non-water surfaces would be diverse and include principal channels, reservoirs, vegetation, buildings, bare land, and fields. Accordingly, Level-2A S2 images were downloaded employing the Sentinel-2 Toolbox for ArcGIS 10.8, using a maximum cloud percentage of 20%, between 1 November 2019 and 31 December 2019. The detailed description of the image obtained is shown in Table 1.



The Level-2A images downloaded were pre-processed using the Sentinel Application Platform (SNAP), developed by ESA (https://earth.esa.int/eogateway/tools/snap, accessed on 10 July 2021). This process consisted of three different steps. First, to work over the areas belonging to the Carrión River, image subsetting was carried out to obtain a raster of 3132 × 6540 pixels. Sentinel-2A data have multiple bands that include four 10 m visible and near-infrared bands, six 20 m vegetation red edge and shortwave-infrared bands, and three 60 m coastal aerosol, water vapor, and SWIR-Cirrus bands [34].

Because the bands have different resolutions, they must be brought to a common pixel size while keeping a high image quality. To obtain all required bands at a 10 m resolution, two methods were applied and compared. First, a resampling algorithm was used, implementing a tool available in the Raster-Geometry subset of tools in SNAP, taking the Blue band as reference data and using a nearest-neighbor algorithm. Second, a super-resolution method was applied using the Sen2Res tool in SNAP [29]. This method was developed for multispectral and multiresolution images, such as Sentinel-2; it obtains information from the pixels that have the highest resolution and reproduces these details in all other bands, so that an image with the best resolution can be obtained [29]. The method works well for uniform areas such as agricultural lands. Finally, a land use band combination was composited (using bands 11, 8A, and 4), and the two methods applied were compared by visual interpretation.
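To make the difference between the two approaches concrete, the following minimal sketch shows what nearest-neighbor resampling from 20 m to 10 m does to a band. It is an illustration with NumPy, not SNAP's resampling operator; the function name `upsample_nearest` and the sample reflectance values are hypothetical.

```python
import numpy as np

def upsample_nearest(band, factor=2):
    """Replicate every pixel `factor` times along both image axes."""
    band = np.asarray(band)
    return np.repeat(np.repeat(band, factor, axis=0), factor, axis=1)

# Hypothetical 2 x 2 patch of a 20 m SWIR band (reflectance values).
swir_20m = np.array([[0.10, 0.30],
                     [0.20, 0.40]])

# Each 20 m pixel becomes a 2 x 2 block of identical 10 m pixels.
swir_10m = upsample_nearest(swir_20m, factor=2)
```

The output has the 10 m pixel size but carries no new spatial detail, since every block simply repeats one original value. This is the limitation that a super-resolution method tries to overcome by borrowing geometric detail from the native 10 m bands.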

#### *2.2. Processing of Sentinel-2 Data: Spectral Indices*

For the study, a set of seven spectral indices was applied to highlight surfaces covered by water before applying flood extent mapping methods. These indices use ratios or weighted combinations of multiple bands, which are selected depending on the spectral characteristics of the target element studied [7] (a further description of the algorithms can be found in Table 2). In this case, the indexes were calculated with the Raster Math tool available in SNAP.
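As an illustration of this kind of band arithmetic, the sketch below computes four of the water indices from Sentinel-2 band arrays with NumPy. It is not the SNAP workflow used in the study. The formulas are the commonly published forms of NDWI, MNDWI, AWEInsh, and AWEIsh, and the mapping to bands B2 (blue), B3 (green), B8 (NIR), B11 (SWIR1), and B12 (SWIR2) is an assumption that should be checked against Table 2. Inputs are assumed to be surface reflectance scaled to the 0–1 range.

```python
import numpy as np

def normalized_difference(a, b):
    """Compute (a - b) / (a + b), returning 0 where the denominator is 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b
    out = np.zeros_like(denom)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out

def water_indices(b2, b3, b8, b11, b12):
    """Return a dict of water indices from reflectance arrays (0-1 range)."""
    b2, b3, b8, b11, b12 = (
        np.asarray(band, dtype=float) for band in (b2, b3, b8, b11, b12)
    )
    return {
        "NDWI": normalized_difference(b3, b8),    # green vs. NIR
        "MNDWI": normalized_difference(b3, b11),  # green vs. SWIR1
        "AWEInsh": 4.0 * (b3 - b11) - (0.25 * b8 + 2.75 * b12),
        "AWEIsh": b2 + 2.5 * b3 - 1.5 * (b8 + b11) - 0.25 * b12,
    }
```

With typical reflectances, an open-water pixel yields positive values and a vegetated pixel negative values for all four indices, which matches the sign convention described for them.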


**Table 2.** Summary of indexes used in the study.

\* B: Band.

#### *2.3. Flood Extent Mapping*

Flood extent mapping consists of the detection and delineation of surface water using remote sensing images, a subject on which multiple studies have been carried out [7,8,14,38]. In this study, two different methods were evaluated on the calculated indices: unsupervised classification and Automatic Threshold Determination.

The unsupervised classification was performed using the expectation–maximization (EM) algorithm, which requires the number of clusters to be established in advance and randomly initializes their center points [7]. In the study, eight classes were defined according to land cover. For that, the EM cluster analysis tool supported in SNAP was applied to each spectral index.
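The idea behind EM clustering can be sketched for a single index band as a one-dimensional Gaussian mixture. This is a simplified illustration, not the SNAP EM cluster analysis tool. To keep the example reproducible, it initializes the cluster centers at data quantiles instead of the random initialization described above, and the function name `em_gmm_1d` and its parameters are assumptions.

```python
import numpy as np

def em_gmm_1d(x, k=2, n_iter=50):
    """Fit a k-component 1-D Gaussian mixture with EM and label each value."""
    x = np.asarray(x, dtype=float).ravel()
    # Assumption: quantile-based initial centers (the text describes random ones).
    means = np.quantile(x, (np.arange(k) + 0.5) / k)
    variances = np.full(k, x.var() + 1e-6)
    weights = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: posterior probability of each cluster for each value.
        diff = x[:, None] - means[None, :]
        log_pdf = -0.5 * (diff ** 2 / variances + np.log(2 * np.pi * variances))
        log_resp = np.log(weights) + log_pdf
        log_resp -= log_resp.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_resp)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: update mixture weights, centers, and variances.
        nk = resp.sum(axis=0) + 1e-12
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
        weights = nk / x.size

    return means, variances, weights, resp.argmax(axis=1)
```

With `k=8`, as in the eight classes used in the study, the resulting clusters would still need to be interpreted and grouped into water and non-water afterwards.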

On the other hand, the Automatic Threshold Determination, a single band method, consisted of the application of image binarization algorithms to classify pixels into different categories, in this case, water and non-water surfaces [7]. A total of 14 automatic thresholding methods were evaluated for each spectral index applied using the ImageJ software [39]. The algorithms used are presented in Table 3.
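As an example of how such a binarization algorithm works, the sketch below implements the basic iterative form of the Isodata method, one of the algorithms evaluated, and then applies the threshold to an index image. It is written for illustration only: the ImageJ implementation may differ in detail, and the function names `isodata_threshold` and `binarize` are hypothetical.

```python
import numpy as np

def isodata_threshold(values, tol=1e-6, max_iter=1000):
    """Iterate t = (mean below t + mean above t) / 2 until it stabilizes."""
    values = np.asarray(values, dtype=float).ravel()
    t = values.mean()  # starting guess
    for _ in range(max_iter):
        low = values[values <= t]
        high = values[values > t]
        if low.size == 0 or high.size == 0:
            break  # one class is empty, so no better split can be found
        new_t = 0.5 * (low.mean() + high.mean())
        if abs(new_t - t) < tol:
            t = new_t
            break
        t = new_t
    return t

def binarize(index_image, threshold):
    """Return a boolean mask that is True where the index exceeds the threshold."""
    return np.asarray(index_image) > threshold
```

For water indices such as NDWI or AWEInsh, the `True` pixels would correspond to water. For indices in which water takes the lower values, the comparison would need to be reversed.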


**Table 3.** Thresholding methods applied in the study.

#### *2.4. Accuracy Assessment of the Flood Mapping Methods*

Regarding the assessment of the extraction of the flooded areas, the quantitative accuracy index was applied [10]. This widely used method involves the generation of random water and non-water sample points where water extraction results are verified. Thus, for the study, this process included four steps:

a. Generation of 200 water and non-water sample points over the study area by applying the Create Random Points tool in ArcGIS.

b. Once the points were generated, high-resolution images were employed to verify and adjust the random sample points. In this case, the images were obtained from the Spanish National Geographic Institute, IGN, provided by National Aerial Orthophotography Plan (PNOA 2017). Additionally, scatter plots were plotted to verify the reflectance of the water sample points, which is expected to be low if the Red and NIR-1 bands are contrasted.

c. Subsequently, it was identified whether each of the random points corresponded to surfaces classified as water or non-water according to each flood mapping method applied, so the total number of points correctly identified was calculated.

d. The accuracy of each method was evaluated by applying four assessment indices. For this purpose, a confusion matrix-based approach was developed (Table 4). Thus, by comparing the extracted water and non-water points with the reference data, four types of pixels were obtained: true positive (TP), the number of correctly extracted water pixels; false negative (FN), the number of undetected water pixels; false positive (FP), the number of incorrectly extracted water pixels; and true negative (TN), the number of correctly rejected non-water pixels [25]. Based on the four types of pixels, the producer accuracy, user accuracy, overall accuracy, and Kappa coefficient were applied according to Equations (1)–(4), respectively [10].

$$\text{Producer's accuracy} = \frac{TP}{TP + FN} \tag{1}$$

$$\text{User's accuracy} = \frac{TP}{TP + FP} \tag{2}$$

$$\text{Overall accuracy} = \frac{TP + TN}{T} \tag{3}$$

$$\text{Kappa coefficient} = \frac{T(TP + TN) - \left( (TP + FP)(TP + FN) + (FN + TN)(FP + TN) \right)}{T^2 - \left( (TP + FP)(TP + FN) + (FN + TN)(FP + TN) \right)} \tag{4}$$

where T is the total number of sample points (T = TP + FN + FP + TN).
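Equations (1)–(4) can be evaluated directly from the four confusion matrix counts. The following sketch is a plain Python transcription of those equations; the function name `accuracy_metrics` and the example counts are hypothetical and are not results from the study.

```python
def accuracy_metrics(tp, fn, fp, tn):
    """Evaluate Equations (1)-(4) from confusion matrix counts."""
    total = tp + fn + fp + tn
    # Term shared by the numerator and denominator of the Kappa coefficient.
    chance = (tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)
    return {
        "producer_accuracy": tp / (tp + fn),
        "user_accuracy": tp / (tp + fp),
        "overall_accuracy": (tp + tn) / total,
        "kappa": (total * (tp + tn) - chance) / (total ** 2 - chance),
    }

# Hypothetical counts for 200 sample points (100 water, 100 non-water).
metrics = accuracy_metrics(tp=90, fn=10, fp=5, tn=95)
```

For these example counts, the overall accuracy is 0.925 and the Kappa coefficient is 0.85.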

**Table 4.** Confusion matrix-based approach applied in the study.


#### *2.5. Validation of the Index's Effectiveness*

By comparing the mean value of the index values in water and non-water points, the ability of each index to distinguish flood surfaces can be assessed [10], so the formula of the Contrast Value (CV) was applied (Equation (5)).

$$CV = E - K \tag{5}$$

where E and K denote the mean value of each index for the water and non-water surfaces, respectively.
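Equation (5) amounts to a difference of two means over the sample points. A minimal sketch follows, assuming the index values at the sample points and a boolean water/non-water label for each point are available as arrays; the function name `contrast_value` and the sample values are hypothetical.

```python
import numpy as np

def contrast_value(index_values, is_water):
    """Mean index value at water points (E) minus mean at non-water points (K)."""
    index_values = np.asarray(index_values, dtype=float)
    is_water = np.asarray(is_water, dtype=bool)
    return index_values[is_water].mean() - index_values[~is_water].mean()

# Hypothetical index values at four sample points (two water, two non-water).
cv = contrast_value([0.1, 0.0, -0.9, -1.1], [True, True, False, False])
```

Here E = 0.05 and K = −1.0, so the contrast value is 1.05. A larger value means that the index separates the two classes more clearly.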

#### **3. Results**

#### *3.1. Image Processing*

After pre-processing the Sentinel-2 images, an atmospherically corrected subset image was obtained. By applying the resampling and super-resolution methods, two different images were produced in which bands 11 and 12 were brought to a 10 m resolution (Figure 3). As can be seen, although two images with the same pixel size were obtained, the image produced by the super-resolution method shows better results than the one produced by the resampling method. This is because the resampling method resamples each band independently, while super-resolution obtains information from other bands that have higher resolution [29]. In this sense, this method provided the high-resolution bands that are necessary to calculate the spectral indexes, in this case the shortwave-infrared bands (band 11 and band 12), going from 20 m to 10 m resolution. Consequently, the following steps in the proposed methodology were applied to the super-resolution image obtained.

**Figure 3.** Sentinel-2 images using composite bands: (**a**) Raw image. (**b**) Resampled image. (**c**) Super-resolution image.

#### *3.2. Application of the Spectral Indices and Effectiveness Validation*

A total of seven indices were applied to the Sentinel-2 image to identify flooded areas, as shown in Figure 4. Comparing the results by visual interpretation with a false-color composite image created to enhance water surfaces (Figure 4b) indicates that the AWEInsh and AWEIsh indices reflect flooded areas accurately, although with diffuse boundaries. The MNDWI index does not fully brighten flooded areas, but it delimits the zones with sharper boundaries, as does the NDWI index, although with lesser precision. The NDPI, NDVI, and SAVI indices do not seem to represent flooded areas well. However, the effectiveness of each index and the application of flood extent mapping methods are evaluated later through quantitative methods.

Histograms of each spectral index were also obtained (Figure 5). As shown, the AWEInsh and AWEIsh indices show a marked unimodal distribution, while MNDWI, NDWI, and NDPI tend to represent a bimodal distribution. The other indices present several peaks, so they show multimodal distributions.

In order to assess the effectiveness of each spectral index used, the mean and contrast values were calculated for the water and non-water random points previously generated. Accordingly, boxplots were made to compare the distribution of the data (Figure 6). As shown, for the water points, the mean values of all indexes are closer to zero (between −0.23 and 0.05) than the values obtained for non-water points (which are between −1.08 and 0.37). The AWEIsh, MNDWI, and NDWI indices showed similar mean values for both water and non-water points. The same occurred with the NDVI and SAVI indices. Accordingly, the highest contrast value was calculated for the AWEInsh index, followed by the MNDWI and NDPI indices. This initially suggests that the AWEInsh, MNDWI, and NDPI spectral indices are more effective in highlighting water surfaces in Sentinel-2 images.

#### *3.3. Flood Extent Mapping Performance*

As could be inferred from the visual interpretation and contrast values, AWEInsh highlights water surfaces more brightly than the other indexes. However, it is necessary to generate flood maps by different methods to develop spatial and quantitative analysis, as was previously detailed. Firstly, the threshold values were calculated for each index, as shown in Table 5. As can be seen, different thresholding methods provide the same results, such as the AWEInsh index with the Huang, Li, and Otsu algorithms, and the AWEIsh index with the Huang, Isodata, and Percentile methods. In these cases, the flood maps produced are the same.

**Figure 4.** Flooded areas detected by each index used in the study area. (**a**) Real color image. (**b**) Composite image (R: 8, G: 11, B: 4). (**c**) AWEInsh. (**d**) AWEIsh. (**e**) MNDWI. (**f**) NDPI. (**g**) NDVI. (**h**) NDWI. (**i**) SAVI.

Each threshold value was used to classify pixels into water and non-water surfaces; therefore, thematic maps from a single-band image were created by using discrete binary colors. Figure 7 shows the performance of each spectral index with the different thresholding methods and the unsupervised classification. It should be noted that the Minimum, Yen, and Shanbhag thresholding methods did not work to classify pixels for any of the indices used.

As seen in Figure 7, the mean and percentile algorithms extracted water bodies worse than the other methods for all indices. In addition, the maximum entropy and intermode methods only worked for the MNDWI and NDPI indexes. In contrast, the unsupervised classification method (EM Cluster) illustrated the water surfaces better than the thresholding methods in most cases.

Furthermore, the methods of Huang, Isodata, and Li and Tam only worked for the AWEInsh index. Thus, it seems that the best performances are obtained by classifying the AWEInsh index pixels with different methods.

Although the accuracy of each method can be judged by visual interpretation, it is important to analyze the results of the assessment methods implemented for this purpose. Thus, the overall accuracy and the Kappa coefficient values are summarized in Figure 8 and plotted in Figure 9 (the detailed results obtained by applying the producer accuracy, user accuracy, overall accuracy, and Kappa coefficient can be found in Appendix A).


**Figure 5.** Histograms obtained from different spectral indices. (**a**) AWEInsh. (**b**) AWEIsh. (**c**) MNDWI. (**d**) NDPI. (**e**) NDVI. (**f**) NDWI. (**g**) SAVI.


**Figure 6.** Box plots classified into water (W) and non-water (NW) categories and contrast values obtained for each index.

**Table 5.** Thresholding values obtained from each method.

| Thresholding method | AWEInsh | AWEIsh | MNDWI | NDPI | NDVI | NDWI | SAVI |
|---|---|---|---|---|---|---|---|
| Huang and Wang's (Hu) | −0.88 | −0.44 | −0.53 | 0.33 | 0.57 | −0.52 | 0.26 |
| Intermode (Int) | −3.92 | −1.37 | 0.1 | −0.21 | −0.05 | 0.05 | 0.05 |
| Isodata (Iso) | −0.93 | −0.44 | −0.48 | 0.27 | 0.5 | −0.52 | 0.26 |
| Li and Tam (Li) | −0.88 | −0.39 | −0.48 | 0.29 | 0.48 | −0.52 | 0.25 |
| Maximum entropy (Me) | 0.91 | 0.38 | −0.03 | −0.11 | 0.05 | −0.11 | 0.04 |
| Mean | −1.08 | −0.41 | −0.5 | 0.31 | 0.45 | −0.52 | 0.24 |
| Minimum (Min) | −6.31 | −2.26 | 0.39 | −0.47 | −0.25 | 0.36 | −0 |
| Moment-preserving (Mp) | −0.93 | −0.27 | −0.33 | 0.17 | 0.43 | −0.4 | 0.25 |
| Otsu (Ot) | −0.88 | −0.41 | −0.19 | 0.29 | 0.5 | −0.5 | 0.26 |
| Percentile (p-tile) (Per) | −1.13 | −0.44 | −0.52 | 0.32 | 0.43 | −0.52 | 0.23 |
| Renyi's entropy (Ren) | 0.61 | 0.38 | −0.05 | −0.11 | 0.07 | −0.12 | 0.05 |
| Shanbhag's (Sh) | 2.66 | 0.08 | 0.46 | −0.4 | 0.49 | 0.01 | 0.26 |

**Figure 7.** Flood-areas extraction performances generated using all indexes by each mapping method (refer to Table 5).


**Figure 8.** Accuracy values obtained for each index and mapping method: (**a**) Overall accuracy. (**b**) Kappa coefficient.


**Figure 9.** Accuracy values plotted for each index and mapping method by Overall and Kappa algorithms.

As can be seen, the AWE indexes showed the highest accuracy in categorizing water and non-water pixels. In these cases, the most effective methods used were the EM unsupervised classification (which yields an overall accuracy of 0.94 and a Kappa coefficient of 0.88 in both cases) and the moment-preserving thresholding method (reaching overall and Kappa values of 0.94 and 0.89, respectively).

However, thresholding methods such as Huang and Wang, Li and Tam, Isodata, and Otsu also yielded good results for the AWEInsh index, with overall and Kappa values over 0.88 and 0.75, respectively.

The NDVI, NDWI, and SAVI indices yielded low accuracy in most cases, with the NDVI showing the worst performance. Nevertheless, the NDWI index reached an overall accuracy of 0.85 and a Kappa coefficient of 0.69 with the Triangle thresholding method.

#### **4. Discussion**

The super-resolution method developed by Brodu [29] showed good results in improving the resolution of Sentinel-2 images in the area of interest, confirming its efficiency when applied to uniform areas representative of agricultural lands. This algorithm simulates a panchromatic sharpening, which can extract geometric information and use it to separate the low-resolution pixels while preserving their overall reflectance [29], so it is effective for super-resolving the bands needed to calculate water spectral indexes, especially the SWIR bands [7,11,19,26].

Once the Sentinel-2 bands had the same high resolution (10 m), it was possible to apply the seven spectral indexes. Regarding the histograms obtained for each index (Figure 5), previous studies have shown similar results for MNDWI, NDWI, and AWEIsh histograms, where peak values usually represent land or water surfaces [15]. In this regard, different authors have demonstrated that index histograms with bimodal and multimodal distributions are appropriate for implementing thresholding methods [8]; however, this study was successful in using automatic thresholding algorithms in index histograms with a unimodal distribution, such as the case of the AWE indices. Additionally, it is important to emphasize that in all histograms, except for the SAVI, the maximum peaks were close to the index mean values calculated for non-water points.
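As an illustration of how an automatic thresholding algorithm derives a cut-off from an index histogram, the following minimal Python sketch implements Otsu's criterion (maximizing the between-class variance). It is not the ImageJ implementation used in this study; the function name, the number of bins, and the sample values are assumptions made for the example, and pixels above the threshold are assumed to be water.

```python
def otsu_threshold(values, bins=256):
    """Return the threshold that maximizes the between-class variance."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("all values are identical; no threshold exists")
    width = (hi - lo) / bins

    # Histogram of the index values.
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]

    total = len(values)
    total_sum = sum(c * h for c, h in zip(centers, hist))
    best_score, best_threshold = -1.0, lo
    count_low, sum_low = 0, 0.0
    for i in range(bins - 1):
        count_low += hist[i]
        sum_low += centers[i] * hist[i]
        count_high = total - count_low
        if count_low == 0 or count_high == 0:
            continue
        mean_low = sum_low / count_low
        mean_high = (total_sum - sum_low) / count_high
        # Between-class variance, up to a constant factor.
        score = count_low * count_high * (mean_low - mean_high) ** 2
        if score > best_score:
            best_score = score
            best_threshold = lo + (i + 1) * width  # upper edge of bin i
    return best_threshold


# Hypothetical index values: many land pixels, few water pixels.
index_values = [-0.5] * 50 + [-0.4] * 50 + [0.3] * 20 + [0.4] * 20
threshold = otsu_threshold(index_values)
water_mask = [v > threshold for v in index_values]
```

Because the criterion only compares the two class means and sizes on each side of a candidate threshold, it does not strictly require two visible histogram peaks, although a small water class can pull the threshold toward the dominant land mode.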

By analyzing the contrast values, the AWEInsh, MNDWI, and NDPI represented the maximum values, suggesting that they may be more effective in highlighting water surfaces. However, as verified in the flood mapping performances, the AWE indexes showed the most accurate results, considering the overall and Kappa coefficients; on the contrary, the MNDWI and NDPI did not present satisfactory results, obtaining accuracy values below 0.84. On the other hand, spectral indexes such as NDWI and MNDWI, which are usually used in water surface identification, were not useful for detecting flood areas in the analyzed small-size water body [10]. Consequently, as revealed, the AWE indexes were the most suitable for extracting flood areas in the study zone. These spectral indices were designed to identify water features: AWEIsh was designed to remove shadow pixels, and AWEInsh for areas with an urban background [19]. Although the method was built for Landsat images, it works effectively for the Sentinel-2 image used in this study, as reported by other authors [14,15].
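The two AWE indexes can be sketched in Python as follows, using the commonly published coefficients of the index. The mapping to Sentinel-2 bands (B2 blue, B3 green, B8 NIR, B11 SWIR1, B12 SWIR2) and the reflectance values are assumptions made for this example, not values from the study.

```python
def awei_nsh(green, nir, swir1, swir2):
    """AWEInsh = 4 * (green - swir1) - (0.25 * nir + 2.75 * swir2)."""
    return 4.0 * (green - swir1) - (0.25 * nir + 2.75 * swir2)


def awei_sh(blue, green, nir, swir1, swir2):
    """AWEIsh = blue + 2.5 * green - 1.5 * (nir + swir1) - 0.25 * swir2."""
    return blue + 2.5 * green - 1.5 * (nir + swir1) - 0.25 * swir2


# Hypothetical surface reflectances (0-1) for one water and one land pixel.
water = {"blue": 0.06, "green": 0.08, "nir": 0.03, "swir1": 0.01, "swir2": 0.01}
land = {"blue": 0.04, "green": 0.08, "nir": 0.35, "swir1": 0.20, "swir2": 0.10}

for name, px in (("water", water), ("land", land)):
    nsh = awei_nsh(px["green"], px["nir"], px["swir1"], px["swir2"])
    sh = awei_sh(px["blue"], px["green"], px["nir"], px["swir1"], px["swir2"])
    print(name, round(nsh, 4), round(sh, 4))
```

The functions use only arithmetic operators, so they should also accept array-like band rasters that support element-wise arithmetic. Both indexes depend on the SWIR bands, which is why the 20 m bands need to be brought to 10 m before the indexes are computed.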

As for flood mapping methods, among the automatic thresholding algorithms employed, the moment-preserving thresholding method showed the most accurate results, followed by the Huang and Wang, Li and Tam, and Otsu algorithms, which also presented good outputs. However, the machine learning method for unsupervised classification (the EM cluster) showed the best results in all cases, matched only by the moment-preserving thresholding algorithm. Consequently, according to the visual interpretation and the accuracy values obtained, flood mapping methods with overall and Kappa values over 0.88 and 0.75, respectively, can be considered effective for the identification of flooded areas.
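The principle behind EM clustering can be illustrated with a minimal Python sketch that fits a two-component Gaussian mixture to the values of a single index. This is a simplified stand-in, not the SNAP cluster analysis used in this study; the function names, the choice of two clusters, the initialization, and the sample values are all assumptions made for the example.

```python
import math


def _gaussian(v, mean, var):
    return math.exp(-((v - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def _responsibilities(values, mu, var, weight):
    """E-step: probability that each value belongs to each cluster."""
    resp = []
    for v in values:
        p0 = weight[0] * _gaussian(v, mu[0], var[0])
        p1 = weight[1] * _gaussian(v, mu[1], var[1])
        total = p0 + p1
        if total == 0.0:
            # Both densities underflowed: fall back to the nearest mean.
            r0 = 1.0 if abs(v - mu[0]) <= abs(v - mu[1]) else 0.0
        else:
            r0 = p0 / total
        resp.append((r0, 1.0 - r0))
    return resp


def em_two_clusters(values, iterations=50, min_var=1e-6):
    """Return (labels, means); label 1 marks the cluster with the higher mean."""
    lo, hi = min(values), max(values)
    mu = [lo, hi]  # start the two means at the extremes
    spread = max(((hi - lo) / 4.0) ** 2, min_var)
    var = [spread, spread]
    weight = [0.5, 0.5]
    for _ in range(iterations):
        resp = _responsibilities(values, mu, var, weight)
        for k in range(2):  # M-step: re-estimate each cluster
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            mu[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var[k] = max(
                sum(r[k] * (v - mu[k]) ** 2 for r, v in zip(resp, values)) / nk,
                min_var,
            )
            weight[k] = nk / len(values)
    resp = _responsibilities(values, mu, var, weight)
    water = 1 if mu[1] >= mu[0] else 0
    labels = [1 if r[water] > 0.5 else 0 for r in resp]
    return labels, mu


# Hypothetical index values: 50 land pixels followed by 20 water pixels.
land_values = [-0.6 + 0.004 * i for i in range(50)]
water_values = [0.15 + 0.005 * i for i in range(20)]
labels, means = em_two_clusters(land_values + water_values)
```

Unlike a single threshold, the mixture estimates a mean, a variance, and a weight for each class, so no training samples are needed.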

In this regard, the recommended combination of spectral indexes, thresholding algorithms, and/or unsupervised methods for flood mapping in small rivers is as follows: (a) Super-resolve the Sentinel-2 images by applying the super-resolution method developed by Brodu [29] to obtain 10-m-resolution bands, including the SWIR bands. (b) Calculate the AWEInsh index and apply the Huang and Wang, Li and Tam, Otsu, and EM cluster methods. (c) Calculate the AWEIsh index and use the moment-preserving algorithm and EM cluster method for pixel categorization. (d) Assess the accuracy of the flood mapping methods by using validation points and a derived confusion matrix.

The proposed method relies on simple, freely accessible data and tools such as Sentinel-2 MSI, SNAP (with its Sen2Res tool, Raster Math, and cluster analysis packages), and ImageJ software. It holds the advantages of free access to information and an unsupervised algorithm that does not depend on training data efficiency [7], which makes the methodology a practical process for mapping floods in narrow rivers. However, further studies could evaluate the use of other machine learning techniques that keep the advantages of unsupervised classification [23].

Furthermore, the accuracy assessment results suggest that the method developed in this study is not only simple but also effective in narrow rivers. Nevertheless, one limitation is that the effect of mixed spectra in the Sentinel-2 image pixels was not analyzed, so further studies should include this kind of evaluation [7]. In this regard, these analyses should include other accuracy assessment methods that compare the flood maps with Very High Resolution (VHR) imagery acquired on the exact date of the flood event. This may involve new methods to quantitatively evaluate the extraction accuracy taking into account the spatial position [15]. In addition, further investigations could evaluate the proposed methodology using SAR images.

#### **5. Conclusions**

The combined automated method for water mapping developed in the present study contributes to improving the detection of surfaces covered by water during flood events in narrow rivers. This methodology exploits the advantages of Sentinel-2 MSI data by applying image pre-processing techniques and simple methods such as single-band techniques, which include thresholding algorithms, combined with spectral indexes and unsupervised machine learning techniques. It can produce accurate results in highlighting flood areas in small-size water bodies whose width is similar to the pixel size.

The AWE indexes proved effective for water surface detection in narrow rivers when combined with mapping methods such as the Huang and Wang, Li and Tam, Otsu, and moment-preserving thresholding algorithms and the EM cluster classification. In this regard, the super-resolution method developed by Brodu was efficient in super-resolving the SWIR bands, enhancing the application of the water spectral indexes.

The proposed method is a practical procedure for mapping floods in small-size water bodies because it relies on free access data and software, such as Sentinel-2 MSI and SNAP, and holds the advantages of unsupervised algorithms.

Future studies can evaluate this combined methodology in settings other than agricultural lands, for instance, agricultural areas with settlements, or water mixed with other vegetation conditions. Further analyses could consider the spectral mixture in pixels, different machine learning techniques based on unsupervised classification, and the application of other quantitative methods to accuracy assessment in flood mapping.

**Author Contributions:** Conceptualization, L.L.; methodology L.L. and A.M.-G.; software, L.L.; validation, L.L. and A.M.-G.; formal analysis, L.L.; investigation, L.L. and A.M.-G.; resources, L.L. and A.M.-G.; data curation, L.L. and A.M.-G.; writing—original draft preparation, L.L.; writing review and editing, L.L.; visualization, L.L.; supervision, L.L. and A.M.-G.; project administration, A.M.-G.; funding acquisition, A.M.-G. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Acknowledgments:** This work has been supported by the Regional Government of Castilla y León 2014–2020, European Social Fund Operational Programme, and the GEAPAGE research group (Environmental Geomorphology and Geological Heritage) of the University of Salamanca.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**

Results obtained in the accuracy assessment of the flood mapping methods by applying the producer accuracy, user accuracy, overall accuracy, and Kappa coefficient.

#### **References**

