#### *4.6. Model Performance and Results Validation*

#### 4.6.1. Statistical Measures

In the first stage, model performance is assessed by computing the following statistical metrics: sensitivity, specificity, accuracy and the kappa index. These indices are computed using the following relations:

$$k = \frac{p_o - p_e}{1 - p_e} \tag{13}$$

$$Sensitivity = \frac{TP}{TP + FN} \tag{14}$$

$$Specificity = \frac{TN}{FP + TN} \tag{15}$$

$$Accuracy = \frac{TP + TN}{TP + FP + TN + FN} \tag{16}$$

where *TP* (True Positive) and *TN* (True Negative) are the numbers of correctly classified points, *FP* (False Positive) and *FN* (False Negative) are the numbers of erroneously classified points, *k* is the kappa coefficient, *p<sub>o</sub>* is the number of initially established torrential pixels, and *p<sub>e</sub>* is the number of predicted torrential pixels.
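As an illustration, the four metrics can be computed from the confusion-matrix counts as in the minimal sketch below. It is not the authors' implementation: the function name and the counts are hypothetical, and the kappa index is computed under the common Cohen's kappa convention, an assumption here, in which the observed agreement equals the accuracy and the chance agreement is derived from the row and column totals.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, accuracy and Cohen's kappa.

    Assumes the standard Cohen's kappa convention: p_o is the observed
    agreement and p_e is the agreement expected by chance.
    """
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)          # Equation (14)
    specificity = tn / (fp + tn)          # Equation (15)
    accuracy = (tp + tn) / total          # Equation (16)

    # Observed agreement: share of points on the confusion-matrix diagonal.
    p_o = accuracy
    # Chance agreement: product of predicted and actual marginals per class.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    kappa = (p_o - p_e) / (1 - p_e)       # Equation (13)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "kappa": kappa,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = classification_metrics(tp=90, fp=10, tn=85, fn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts the chance agreement is exactly 0.5, so the kappa value is simply twice the amount by which the accuracy exceeds 0.5.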

#### 4.6.2. ROC Curve

The second stage of results validation applies the ROC curve and the Area Under the Curve (AUC) to measure model performance. An AUC close to 1 indicates a well-performing model, while values near 0 indicate a weak predictive ability [83,84]. The Success Rate is the form of the ROC curve constructed with the training sample, while the Prediction Rate is the variant constructed with the validation sample. The AUC values are determined using the following formula:

$$AUC = \frac{\left(\sum TP + \sum TN\right)}{\left(P + N\right)}\tag{17}$$

where *P* is the number of points with torrential phenomena and *N* is the number of non-torrential points.
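For readers who want to reproduce an AUC value from raw model outputs, the sketch below uses the common rank-based definition: the probability that a randomly chosen torrential point receives a higher score than a randomly chosen non-torrential point, with ties counted as one half. This equals the area under the ROC curve, but it is an assumption about how the value could be obtained, not a transcription of Equation (17) or of the authors' software. The function name, labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for torrential points, 0 for non-torrential points.
    scores: model output (for example, an FFPI value) for each point.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative point")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical sample: three torrential and three non-torrential points.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

Applying the same function to the training sample would give a Success Rate value, and applying it to the validation sample would give a Prediction Rate value.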

#### **5. Results**

#### *5.1. Feature Selection Using LSVM*

According to the results obtained with the Weka software, the application of LSVM yielded the following scores: slope (0.659), profile curvature (0.476), land use (0.429), TPI (0.394), TWI (0.362), convergence index (0.338), hydrological soil group (0.283), SPI (0.253), lithology (0.231) and aspect (0.162) (Figure 5).

#### *5.2. FR and WOE Coefficients*

The values of the FR and WOE coefficients are listed in Table 1. The largest FR coefficient (7.295) was obtained by the TWI class between 14.6 and 24.6, followed by the slope class between 15° and 25° (3.925), SPI values lower than 50 (3.205), the built-up areas land use category (2.715) and the TPI class between −1 and 1.3 (1.695) (Figure 6). In terms of WOE weights, the highest score was assigned to the built-up areas land use category (3.96), followed by the TWI class between 14.6 and 24.6 (2.67), the slope class between 15° and 25° (2.48), SPI values lower than 50 (1.88) and the TPI class between −1 and 1.3 (1.39).

**Figure 6.** Distribution of FR and WOE coefficients within the classes of flash-flood predictors.

In order to be used as input in ADT and DLNN models, the FR and WOE values were normalized between 0 and 1.
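The text does not state which rescaling was used. A minimal sketch, assuming a plain min-max normalization, is given below; the function name and the coefficient values are hypothetical and are not taken from Table 1.

```python
def min_max_normalize(values):
    """Rescale a list of coefficients linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All coefficients are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical FR coefficients for the classes of one predictor.
fr_values = [0.2, 1.0, 2.6, 4.2]
fr_normalized = min_max_normalize(fr_values)

# Hypothetical WOE coefficients; negative weights are handled the same way.
woe_values = [-1.5, 0.0, 0.5, 2.5]
woe_normalized = min_max_normalize(woe_values)

print(fr_normalized)
print(woe_normalized)
```

Under this assumption the smallest coefficient maps to 0 and the largest to 1, which keeps FR and WOE inputs on the same scale even though WOE weights can be negative.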


**Table 1.** FR and WOE coefficients.

#### *5.3. Models Performance Assessment*

The hardware and software configuration used for the computational modelling is presented in Table 2.

**Table 2.** Hardware and software environmental configuration used for modelling.


Before the final mapping of flash-flood potential, the models' performance must be evaluated to verify their reliability. For the training dataset, the DLNN-WOE ensemble achieved the highest accuracy (0.985), followed by DLNN-FR (0.982), ADT-FR (0.923) and ADT-WOE (0.92). For the validation sample, the highest accuracy was achieved by DLNN-WOE (0.92), followed by DLNN-FR (0.903), ADT-WOE (0.896) and ADT-FR (0.878) (Table 3).

**Table 3.** Statistical metrics used to evaluate the models' performance.


#### *5.4. Results of Machine Learning Ensembles*

#### 5.4.1. DLNN-FR and DLNN-WOE Results

The DLNN-based ensembles were trained with the maximum number of epochs set to 100 (Figure 7).

**Figure 7.** DLNN-based ensemble running outputs: (**a**) training and validation loss of DLNN-FR; (**b**) training and validation accuracy of DLNN-FR; (**c**) training and validation loss of DLNN-WOE; (**d**) training and validation accuracy of DLNN-WOE.

Figure 7 shows how loss and accuracy vary with the number of epochs for both the training and validation samples. For the DLNN-FR model, the best performance was achieved with the following parameters: two hidden layers; a maximum of 100 hidden neurons in each hidden layer; a dropout rate of 0.3; a batch size of 5; and a validation split of 0.3. The same number of hidden layers and neurons was used for DLNN-WOE, with the other parameters set as follows: a dropout rate of 0.4; a batch size of 4; and a validation split of 0.2. The architecture of the DLNN-based ensembles is represented in Figure 8.

**Figure 8.** Deep Learning Neural Network architecture.

The next step in the flash-flood susceptibility computation is the derivation of the flash-flood predictors' importance. For DLNN-FR, the highest importance was assigned to slope (0.2). Land use ranked second (0.143), followed by profile curvature (0.12), TWI (0.109), hydrological soil group (0.097), lithology (0.094), TPI (0.08), SPI (0.067), convergence index (0.061) and aspect (0.029) (Figure 9). For DLNN-WOE, the most important factor was also slope (0.235), followed by land use (0.149), SPI (0.089), hydrological soil group (0.086), TPI (0.086), TWI (0.082), lithology (0.074), convergence index (0.072), profile curvature (0.064) and aspect (0.063).

**Figure 9.** Flash-flood predictors importance.

The weights of the flash-flood predictors were used in ArcGIS map algebra to derive the flash-flood potential index values. All Flash-Flood Potential Index (FFPI) results, with values between 0 and 1, were reclassified into five classes using the Natural Breaks method. For FFPI<sub>DLNN-FR</sub>, the very low flash-flood potential values cover around 7.5% of the study area and range between 0 and 0.42 (Figure 10a). The low flash-flood potential occupies around 15.6% of the Bâsca Chiojdului river catchment, with values ranging from 0.43 to 0.55. These values are mainly found in the southern half of the area. The medium flash-flood potential covers 30.28% of the entire territory (Figure 11) and is characterized by FFPI<sub>DLNN-FR</sub> values between 0.56 and 0.66. These values are uniformly distributed across the study zone. The high and very high flash-flood potential occurs in areas with FFPI<sub>DLNN-FR</sub> higher than 0.67 and covers approximately 46.57% of the research area. These classes are mainly present in the northern half of the Bâsca Chiojdului river basin.
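The map algebra step can be pictured as a per-pixel weighted sum. The sketch below assumes, without confirmation from the text, that the FFPI of a pixel is the sum of each predictor's normalized coefficient multiplied by its importance weight. The weights are the DLNN-FR importance values reported above; the pixel values, the function names and the merged "high or very high" label are hypothetical. The Natural Breaks algorithm itself is not reimplemented; the class limits are simply the DLNN-FR limits quoted in the text, and the limit separating the high and very high classes is not stated, so those two classes are kept together.

```python
# DLNN-FR predictor importance values as reported in the text (they sum to 1).
WEIGHTS_DLNN_FR = {
    "slope": 0.2,
    "land_use": 0.143,
    "profile_curvature": 0.12,
    "twi": 0.109,
    "hydrological_soil_group": 0.097,
    "lithology": 0.094,
    "tpi": 0.08,
    "spi": 0.067,
    "convergence_index": 0.061,
    "aspect": 0.029,
}

# Upper limits of the very low, low and medium classes quoted for DLNN-FR.
UPPER_BOUNDS = (0.42, 0.55, 0.66)
CLASS_LABELS = ("very low", "low", "medium", "high or very high")


def ffpi(pixel, weights):
    """Weighted sum of normalized predictor coefficients for one pixel."""
    missing = set(weights) - set(pixel)
    if missing:
        raise KeyError(f"missing predictors: {sorted(missing)}")
    return sum(weights[name] * pixel[name] for name in weights)


def classify(value, upper_bounds=UPPER_BOUNDS, labels=CLASS_LABELS):
    """Return the label of the first class whose upper limit covers value."""
    for bound, label in zip(upper_bounds, labels):
        if value <= bound:
            return label
    return labels[-1]


# Hypothetical pixel: every predictor has a normalized coefficient of 0.5.
pixel = {name: 0.5 for name in WEIGHTS_DLNN_FR}
value = ffpi(pixel, WEIGHTS_DLNN_FR)
print(f"FFPI = {value:.3f} -> {classify(value)}")
```

Because the weights sum to 1 and the inputs lie between 0 and 1, the resulting index also stays between 0 and 1, which is consistent with the FFPI range described in the text.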

**Figure 10.** Flash Flood Potential Index (**a**) DLNN-FR; (**b**) DLNN-WOE; (**c**) ADT-FR; (**d**) ADT-WOE.

**Figure 11.** Flash-Flood Potential Index (FFPI) classes weights.

For FFPI<sub>DLNN-WOE</sub>, the very low flash-flood potential covers 7% of the study perimeter, while the low values cover 13.41% of the total territory. Ranging from 0.59 to 0.68 (Figure 10b), the medium flash-flood potential spans approximately 28.76% of the Bâsca Chiojdului river catchment. High and very high flash-flood susceptibility has FFPI<sub>DLNN-WOE</sub> values higher than 0.69 and is spread over more than 50% of the research zone. It should be noted that the areas delineated through DLNN-WOE have a lower degree of fragmentation than the areas delineated by DLNN-FR.

#### 5.4.2. ADT-FR and ADT-WOE Results

A trial procedure was applied to determine the parameters associated with the highest accuracy of ADT-FR and ADT-WOE for both the training and validation samples. For ADT-FR, the highest accuracies (0.923 for training and 0.878 for validation) were achieved after 23 iterations, while for ADT-WOE the best accuracies (0.92 for training and 0.896 for validation) were obtained after 28 iterations (Table 4). Once the best parameters were determined, the optimally pruned decision trees were constructed (Figure 12a,b) and the importance of the flash-flood predictors was calculated.

**Table 4.** The optimal parameters of the ADT based ensembles.


**Figure 12.** Optimally pruned decision tree structure for the ADT-based ensembles: (**a**) ADT-FR; (**b**) ADT-WOE.

For ADT-FR, the highest importance was assigned to slope (0.191). Land use ranked second (0.134), followed by hydrological soil group (0.131), lithology (0.125), profile curvature (0.108), convergence index (0.102), TWI (0.091), SPI (0.07), TPI (0.034) and aspect (0.013) (Figure 9). For ADT-WOE, the most important factor was also slope (0.198), followed by land use (0.156), hydrological soil group (0.123), lithology (0.117), profile curvature (0.096), TWI (0.086), convergence index (0.075), SPI (0.066), aspect (0.051) and TPI (0.032).

As in the case of the previous two ensembles, FFPI<sub>ADT-FR</sub> and FFPI<sub>ADT-WOE</sub> were calculated. For FFPI<sub>ADT-FR</sub>, the very low flash-flood potential covers around 3.26% of the study area and has values between 0 and 0.39 (Figure 10c). The low flash-flood potential occupies around 11.74% of the Bâsca Chiojdului river catchment and has values ranging from 0.4 to 0.58. The medium flash-flood potential covers 25.63% of the entire territory and has values between 0.59 and 0.7 (Figure 10c). The high and very high flash-flood potentials occur in areas with FFPI<sub>ADT-FR</sub> higher than 0.71 and cover approximately 59.38% of the research area. For FFPI<sub>ADT-WOE</sub>, the very low flash-flood potential covers 6.64% of the entire study perimeter, while the low values are spread over 13.87% of the total territory. With values from 0.59 to 0.69 (Figure 10d), the medium flash-flood potential occurs over 30.34% of the Bâsca Chiojdului river catchment. The high and very high flash-flood potential classes have values higher than 0.7 and account for almost 50% of the study zone.

#### *5.5. Results Validation Using ROC Curve*

The FFPI results provided by each ensemble model were validated using the ROC curve method. For the Success Rate, the highest performance was achieved by FFPI<sub>DLNN-WOE</sub> with an AUC of 0.96, followed by FFPI<sub>DLNN-FR</sub> (AUC = 0.942), FFPI<sub>ADT-WOE</sub> (AUC = 0.94) and FFPI<sub>ADT-FR</sub> (AUC = 0.919) (Figure 13a). For the Prediction Rate, FFPI<sub>DLNN-WOE</sub> again achieved the highest performance (AUC = 0.921), followed by FFPI<sub>DLNN-FR</sub> (AUC = 0.92), FFPI<sub>ADT-WOE</sub> (AUC = 0.909) and FFPI<sub>ADT-FR</sub> (AUC = 0.879).

**Figure 13.** Receiver Operating Characteristic (ROC) curve (**a**) Success Rate; (**b**) Prediction Rate.

#### **6. Discussions**

With the advancement of technology, there are more and more possibilities to monitor the dangerous phenomena that occur on the Earth's surface. In this regard, it is worth noting the rapid advance of techniques for observing the terrestrial surface by means of remote sensing sensors, with which the surfaces affected by natural hazards can be observed.

The present paper used images taken with these sensors to identify the areas already affected by torrential runoff. Accurate identification of these areas is essential for obtaining highly accurate results that can be used by the competent authorities in risk assessment and in adopting the most appropriate measures to reduce future damage caused by these hazards. By analyzing the images provided by remote sensing sensors for the Bâsca Chiojdului river basin, areas affected by torrential runoff with a total area of 34 km², representing about 10% of the entire study area, were identified. To make use of the delimited surfaces, a sample of 481 points affected by torrential phenomena, expressed in relief microforms such as ravines, was generated. To ensure the correctness of the modelling results, another sample of 481 points was generated from the areas where torrential phenomena did not take place; the entire dataset was then divided into training and validation data. The values of 10 flash-flood predictors were also used as input data. It should be noted that remote sensing sensors also played a crucial role in generating 8 of the 10 flash-flood predictors. All morphometric parameters were derived from the digital terrain model taken from the SRTM database (30 m), which was acquired using radar techniques. In addition, the land use, taken from the Corine Land Cover 2018 database, was generated by the supervised classification of images provided by remote sensing sensors.

Data on the presence of the phenomena and the values of the main predictors of flash-flood genesis were used in two state-of-the-art machine learning models, Deep Learning Neural Networks and Alternating Decision Trees. These two models are recommended because of the very good results they provided in previous studies on the estimation of susceptibility to natural hazards [79,80]. For a higher degree of objectivity, the training sample was processed by assigning coefficients using two bivariate statistical methods, Frequency Ratio and Weights of Evidence. This approach has proven to be very useful in previous studies [46,56] where the initial data were processed with bivariate statistics algorithms.

The combination of DLNN with WOE proved to be the most efficient: the accuracy achieved during the training process exceeded 98%, while the ROC curve applied to the final product, FFPI<sub>DLNN-WOE</sub>, showed a maximum AUC of 0.96. This AUC exceeds the value obtained by Costache et al. [38], who applied the hybrid combination of Multilayer Perceptron (MLP) and Statistical Index to the FFPI calculation for the same study area and obtained a maximum AUC of 0.94. These results confirm the findings from the literature according to which DLNN, whose architecture includes several hidden layers, is able to surpass the performance of MLP, whose architecture includes a single hidden layer [57]. Moreover, the MLP performance from the previous study was also exceeded by the DLNN-FR ensemble model, with an AUC of 0.942. Overall, in the Bâsca Chiojdului basin, the models assigned between 46.57% (DLNN-FR) and 59.38% (ADT-FR) of the area to the high and very high flash-flood potential classes.

#### **7. Conclusions**

In light of the continuous increase in the frequency of flash-flood events, the present work proposed a workflow through which the areas susceptible to flash floods are identified based on remote sensing and GIS data used in Deep Learning and Alternating Decision Tree ensembles. Using 481 torrential and 481 non-torrential locations along with 10 flash-flood predictors, the Flash-Flood Potential Index was determined across the Bâsca Chiojdului river basin. With the FR and WOE coefficients as input data, the FFPI was computed using the following four ensembles: DLNN-FR, DLNN-WOE, ADT-FR and ADT-WOE. As expected, slope angle and land use were the most important flash-flood predictors. The highest accuracy was achieved by the DLNN-WOE model, with a training accuracy of 0.985 and a Success Rate AUC of 0.96. The largest percentage (59.38%) of high and very high FFPI classes was revealed by the ADT-FR ensemble.

The main novelty of this study is represented by the application for the first time in the literature of the four ensemble models for determining flash-flood potential index values.

This work is of real importance for the governmental authorities which can use the results in order to improve the measures taken to mitigate the negative effects of flash-flood hazards within the study area.

**Author Contributions:** Conceptualization, R.C., A.A. (Alireza Arabameri) and Q.B.P.; data curation, R.C., A.A. (Alireza Arabameri), Q.B.P., B.T.P., M.P. and A.A. (Aman Arora); methodology, R.C., A.A. (Alireza Arabameri), Q.B.P., B.T.P., M.P., A.A. (Aman Arora), N.T.T.L. and I.C.; writing (original draft), R.C., A.A. (Alireza Arabameri), Q.B.P., B.T.P., M.P. and A.A. (Aman Arora); writing (review and editing), R.C., A.A. (Alireza Arabameri), Q.B.P., B.T.P., M.P., A.A. (Aman Arora), T.B., N.T.T.L. and I.C. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was partly funded by the Austrian Science Fund (FWF) through the Doctoral College GIScience (DK W 1237-N23) at the University of Salzburg.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data available on request.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

