#### *2.1. Data and Structure of Artificial Neural Networks*

Artificial neural networks are algorithms that map input features to a series of outputs. Through a structure of input, output and intermediate hidden layers, they can learn the relationships between input and output data [22]. A feedforward neural network, which processes and transmits data through such a layered structure [23], is applied in this work for modeling the study area. One of the most widely used ANN types is the multilayer perceptron (MLP) [24]. The MLP consists of highly interconnected neurons organized in layers to process information. The neurons in one layer are fully connected to each neuron in the next layer, and each connection is assigned a weight. Each neuron collects the values from the previous layer by summing the outputs of the previous neurons multiplied by the weights on the corresponding input arcs and storing the result. An activation function transfers the results from the hidden layers to the output layer, and a loss function measures the fit of the neural network to a set of input–output data pairs.
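The weighted-sum-plus-activation computation described above can be sketched as follows. The input size matches the seven inflows and the two hidden layers of 10 nodes used in this study, but the output size and the random weights are illustrative placeholders, not trained values:

```python
import numpy as np

def sigmoid(x):
    """Sigmoid activation, the function used in this study."""
    return 1.0 / (1.0 + np.exp(-x))

def mlp_forward(x, weights, biases):
    """Forward pass through a fully connected MLP: each layer sums the
    previous layer's outputs times the connection weights, adds a bias
    and applies the sigmoid activation."""
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)
    return a

# Illustrative network: 7 inflow inputs, two hidden layers of 10 nodes,
# and a small placeholder output layer (the real output layer holds one
# value per raster pixel). Weights are random, untrained placeholders.
rng = np.random.default_rng(0)
sizes = [7, 10, 10, 4]
weights = [0.1 * rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]
y = mlp_forward(rng.standard_normal(7), weights, biases)
```

Because the sigmoid squashes every neuron's output into (0, 1), the targets (here, water depths) must be scaled accordingly before training.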

In this case study, the input layer collects the seven inflows to the urban area of Kulmbach, given as hourly discharge intensities. The output layer is fed with the hourly raster inundation map with a resolution of 4 m × 4 m from the event database. Between the input and output layers, the ANN has two hidden layers with 10 nodes per layer. One hundred twenty synthetic events from the event database are used for network training (see Section 3.2); afterward, the other 60 events in the event database are used for model validation. This corresponds to 2/3 of the data for training and 1/3 for testing, a split often found in the literature [25]. Finally, the model is applied to forecast three historical events. The widely applied sigmoid function is chosen as the activation function for the neural network [26].

Due to the high resolution of the map (4 m by 4 m), the weight matrix between the last hidden layer and the output layer would contain more than 30 million parameters and require about 1 GB of RAM. Optimizing these weights is very time-consuming, even with the latest optimization techniques [27]. Hence, a "divide and conquer" strategy is used to enable the calculation on a single PC; in principle, the results of the two strategies should be the same. To reduce the training time and the memory requirements, the study area is divided into 50 × 50 squared grids (see Figure 1). A similar splitting idea has been applied in a former study [19]. Each grid has four independent ANNs, one per forecast interval (3 h, 6 h, 9 h and 12 h); in total, 10,000 ANNs are trained to produce multistep forecasts. The training time is further reduced by parallelization: the estimated time for training all networks in parallel on four cores is 6 h. Alternative network structures, such as a convolutional neural network (CNN), could not be applied in this study, as the network size would require a very large number of hyperparameters, which was beyond the memory capacity of a personal computer for forecasting purposes [28].

**Figure 1.** The feedforward neural network setup in the forecast study. The input layer is fed with the discharge inflows of certain time interval windows. The output layer generates the flood inundation for that interval. Resilient backpropagation is applied for training this network.
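The "divide and conquer" splitting of the study area can be sketched as follows. The raster dimensions are illustrative placeholders; edge cells are zero-padded so the raster divides evenly into a 50 × 50 arrangement of sub-grids, each of which would receive its own small ANN:

```python
import numpy as np

def split_into_subgrids(raster, n=50):
    """Split a 2-D depth raster into an n x n arrangement of sub-grids
    (n*n tiles in total), zero-padding the edges so tiles divide evenly.
    Returns an array of shape (n, n, tile_rows, tile_cols)."""
    tile_r = -(-raster.shape[0] // n)   # ceil division for tile height
    tile_c = -(-raster.shape[1] // n)   # ceil division for tile width
    padded = np.zeros((tile_r * n, tile_c * n), dtype=raster.dtype)
    padded[:raster.shape[0], :raster.shape[1]] = raster
    return padded.reshape(n, tile_r, n, tile_c).swapaxes(1, 2)

# Illustrative raster of 4 m x 4 m cells (dimensions are placeholders).
depth_map = np.arange(1200 * 900, dtype=float).reshape(1200, 900)
tiles = split_into_subgrids(depth_map)
print(tiles.shape)  # (50, 50, 24, 18)
```

Each tile's ANN then only has to map the seven inflow inputs to its own small block of pixels, which keeps every individual weight matrix small enough for a single PC.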

#### *2.2. Hyperparameter Tuning in ANN*

To optimize the weights in an ANN, resilient backpropagation is a widely applied and effective algorithm [29]. According to Shamim et al. and Panda et al. [23,30], backpropagation neural networks outperform other methods in flood forecasting studies owing to their greater efficiency and robustness. Berkhahn et al. [19] compared training algorithms for hyperparameter tuning and showed that resilient backpropagation is more efficient than both standard backpropagation and Levenberg–Marquardt for maximum flood inundation prediction. The process has two stages: the training stage gathers information from the flood event database, changing the weights between layers to minimize the error on the output layer; the recalling stage generates the forecast for the remaining events in the database to test the model.

Formulas (1) and (2) show the scheme of resilient backpropagation. To calculate the update of the network weight *wij* from the ith neuron to the jth neuron, a gradient descent scheme is applied, in which the weight update depends on the sign of the partial derivative of the loss function *L* of the model. The loss function *L* is the mean square error (MSE). The iteration stops once the loss function reaches its minimum (chosen as 10<sup>−6</sup> in this case).

$$\Delta\_{ij}(t) = \begin{cases} \eta^{+} \cdot \Delta\_{ij}(t-1), & \frac{\partial L}{\partial w\_{ij}}(t) \cdot \frac{\partial L}{\partial w\_{ij}}(t-1) > 0 \\ \eta^{-} \cdot \Delta\_{ij}(t-1), & \frac{\partial L}{\partial w\_{ij}}(t) \cdot \frac{\partial L}{\partial w\_{ij}}(t-1) < 0 \\ \Delta\_{ij}(t-1), & \text{otherwise} \end{cases} \tag{1}$$


$$w\_{ij}(t) = \begin{cases} w\_{ij}(t-1) - \Delta\_{ij}(t), & \frac{\partial L}{\partial w\_{ij}}(t) > 0 \\ w\_{ij}(t-1) + \Delta\_{ij}(t), & \frac{\partial L}{\partial w\_{ij}}(t) < 0 \\ w\_{ij}(t-1), & \text{otherwise} \end{cases} \tag{2}$$

The learning rate scales the speed of each weight-update iteration. The larger factor η<sup>+</sup> is applied when the error gradient keeps the same sign in neighboring iterations, and the smaller factor η<sup>−</sup> when the sign changes, fulfilling 0 < η<sup>−</sup> < 1 < η<sup>+</sup>. In our study, these were set constant and equal to η<sup>−</sup> = 0.5 and η<sup>+</sup> = 1.2. The deep learning toolbox of MATLAB version R2017a is used to produce the forecasts.
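The update rule of Formulas (1) and (2) can be sketched element-wise as follows. The step-size bounds and the toy quadratic loss in the usage example are illustrative assumptions, not values from the study:

```python
import numpy as np

ETA_PLUS, ETA_MINUS = 1.2, 0.5     # factors used in the study
STEP_MIN, STEP_MAX = 1e-6, 50.0    # illustrative step-size bounds

def rprop_update(w, grad, prev_grad, step):
    """One resilient backpropagation (Rprop) update, element-wise.
    The step size grows by ETA_PLUS while the gradient keeps its sign,
    shrinks by ETA_MINUS when the sign flips (Formula 1); the weight
    then moves opposite to the gradient sign by the step size alone
    (Formula 2)."""
    sign_change = grad * prev_grad
    step = np.where(sign_change > 0, np.minimum(step * ETA_PLUS, STEP_MAX), step)
    step = np.where(sign_change < 0, np.maximum(step * ETA_MINUS, STEP_MIN), step)
    w = w - np.sign(grad) * step
    return w, step

# Toy usage: minimize L(w) = w^2, whose gradient is 2w.
w, step, prev_grad = np.array([5.0]), np.array([0.1]), np.array([0.0])
for _ in range(100):
    grad = 2.0 * w
    w, step = rprop_update(w, grad, prev_grad, step)
    prev_grad = grad
print(w)  # close to 0
```

Because only the sign of the gradient is used, Rprop is insensitive to the magnitude of the error derivative, which is one reason it is robust for networks with sigmoid activations whose gradients can become very small.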


#### *2.3. Prediction of the First Interval of Flood Events*

The ANN model is trained with the first 120 events in the synthetic flood event database (see Section 3.2). The time series of each event (starting from time 0) is extracted for training. The input inflow discharges are extracted from time 0 to X h, and the respective output inundation maps at X h are used as the output layer, where X takes the values 3, 6, 9 and 12. The 3 h, 6 h, 9 h and 12 h intervals of the flood events are thus used to train four networks with the corresponding forecast lead times (see Figure 2). The ANN models only consider the inflow values from the initial time step (blue bars of the events in Figure 2), not those from previous time steps. This is similar to the approach in the FloodEvac framework, which successfully produced forecasts based on the selection of pre-recorded flood maps [31].

**Figure 2.** Training of artificial neural networks (ANN) forecast model. Four ANN models for 3 h, 6 h, 9 h, 12 h first interval predictions are set up in this work, trained with the discharges from each synthetic flood event. After this, the models are to predict the corresponding first intervals for other events.

After the training, the models are tested on the first-interval forecast for the remaining 60 events in the synthetic database.
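The extraction of training pairs for the four interval networks can be sketched as follows. The event dictionary layout (`'inflows'`, `'maps'`) is an assumed illustrative format, not the actual database schema:

```python
import numpy as np

def build_training_pairs(events, horizons=(3, 6, 9, 12)):
    """Extract one (input, target) pair per event and per forecast horizon.

    `events` is assumed to be a list of dicts with
      'inflows': array of shape (T_hours, 7)    - hourly discharge, 7 inflows
      'maps':    array of shape (T_hours, H, W) - hourly inundation rasters.
    For horizon X, the input is the inflow series from hour 0 to X and
    the target is the inundation map at hour X, matching the four-network
    setup described above."""
    pairs = {x: ([], []) for x in horizons}
    for ev in events:
        for x in horizons:
            inputs, targets = pairs[x]
            inputs.append(ev["inflows"][: x + 1].ravel())  # hours 0..X
            targets.append(ev["maps"][x].ravel())
    return {x: (np.array(i), np.array(t)) for x, (i, t) in pairs.items()}

# Illustrative single event of 13 h with a tiny 4 x 4 raster.
events = [{"inflows": np.arange(13 * 7, dtype=float).reshape(13, 7),
           "maps": np.zeros((13, 4, 4))}]
pairs = build_training_pairs(events)
```

Each horizon thus yields its own (input, target) matrices, one row per event, which are fed to the corresponding one of the four networks.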

#### *2.4. Real-Time Forecasting for Sequential Multistep Forecast Intervals*

In this work, the flood forecast starts when a certain discharge forecast threshold is exceeded. If this start point occurs later, at time x, the beginning of the prediction is shifted to time x accordingly. If all the discharge inflows fall below the forecast threshold, the forecast is stopped. With this setup, the forecast can run in continuous mode.

The ANN receives the discharge inputs of the corresponding interval, just as in a real-time forecast. After the forecast for a given step is complete, the procedure is repeated one hour ahead: the same ANN model is applied again, now starting one hour later and taking the discharge inputs from the next time interval. Repeating this many times enables the continuous mode of flood forecasting. In this case study, the real-time forecast is performed with the ANN models trained to forecast at multiple steps of 1–5 h. The forecast from time 0 and the shifts of the forecast intervals by one hour and two hours are shown in Figure 3.

**Figure 3.** Shift of ANN forecast models for multistep forecast intervals. The yellow color shows the forecast of the first interval (forecast interval same as the training interval, i.e., starting at time 0). The green color shows the original 3 h forecast network applied 1 h later, forecasting from 1–4 h. The orange color shows the original 3 h forecast network applied 2 h later, forecasting from 2–5 h. The black box shows the general case of applying the original X h forecast network S h later, forecasting from S h to (X + S) h.

For an easier interpretation of the different forecast groups, we name each forecast an "X h + S" forecast: "X h" indicates the forecast interval of X hours, and "+ S" indicates the shift of the forecast start time by S hours.
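The "X h + S" shifting scheme can be sketched as follows. Here `models` is an assumed mapping from forecast interval to a trained network, represented in the demo by a plain function:

```python
def rolling_forecast(inflow_series, models, horizon=3, n_shifts=3):
    """Sketch of the 'X h + S' rolling forecast: the same X-hour model is
    re-applied every hour on the inflow window shifted S hours forward,
    so a 3 h network started at shifts 0, 1, 2 covers forecasts out to
    5 h.  `models[horizon]` is assumed to map an inflow window to an
    inundation forecast."""
    forecasts = {}
    for s in range(n_shifts):                        # S = 0, 1, 2, ... hours
        window = inflow_series[s : s + horizon + 1]  # hours S .. S + X
        forecasts[f"{horizon} h + {s}"] = models[horizon](window)
    return forecasts

# Demo with a placeholder "network" that just sums its input window.
demo_models = {3: sum}
fc = rolling_forecast(list(range(10)), demo_models, horizon=3, n_shifts=3)
# fc holds the "3 h + 0", "3 h + 1" and "3 h + 2" forecasts.
```

The key point is that no retraining is needed for the shifted forecasts: the network trained on the first interval is simply fed a later slice of the discharge series.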

#### *2.5. Model Evaluation*

The root-mean-square error (RMSE) is applied to assess the ANN forecast performance in the study area. The forecasts of the ANN are compared against the inundation maps produced by the 2D dynamic model (see Section 3.2). Hence, the 2D dynamic model results are treated as the observed values in order to enable the evaluation of the ANN. All the events in the database have been processed by the FloodEvac tool [31] and validated [32]. As the ANN training is conducted within each grid, the RMSE is also evaluated for each grid.

$$\text{RMSE} = \sqrt{\frac{1}{n}\sum\_{i=1}^{n}\left(T\_i - S\_i\right)^2}, \tag{3}$$

where

*T<sub>i</sub>* is the predicted value, the water depth from the ANN model in our case;

*S<sub>i</sub>* is the observed value, the water depth from the hydraulic model (HEC-RAS) in our case.

To assess the overall performance of the model, the average RMSE over all events in the training and testing datasets is also calculated.
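The per-grid evaluation of Equation (3) can be sketched as follows (the array layout, one row per sub-grid with its pixels flattened, is an illustrative assumption):

```python
import numpy as np

def grid_rmse(pred, obs):
    """RMSE between predicted (ANN) and reference (2-D hydraulic) water
    depths, computed per sub-grid; `pred` and `obs` are assumed to have
    shape (n_grids, n_pixels), one flattened sub-grid per row."""
    return np.sqrt(np.mean((pred - obs) ** 2, axis=1))

# Two toy sub-grids of two pixels each: a perfect one and one that is
# uniformly 1 m too deep.
pred = np.array([[0.2, 0.4], [1.0, 1.0]])
obs  = np.array([[0.2, 0.4], [0.0, 0.0]])
print(grid_rmse(pred, obs))  # [0. 1.]
```

Averaging these per-grid values over all events gives the aggregate accuracy reported for the training and testing datasets.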

To quantify the forecast of the inundation extent growth, the following indices are used to measure the correspondence between the ANN model and the hydraulic model: the probability of detection (POD), the false alarm ratio (FAR) and the critical success index (CSI) [33].

$$\text{POD} = \frac{\text{hits}}{\text{hits} + \text{misses}}, \tag{4}$$

$$\text{FAR} = \frac{\text{false alarms}}{\text{hits} + \text{false alarms}}, \tag{5}$$

$$\text{CSI} = \frac{\text{hits}}{\text{hits} + \text{misses} + \text{false alarms}}. \tag{6}$$

A pixel with a water depth under 10 cm is defined as a dry pixel, while one over 10 cm is a wet pixel. Hits count the pixels that are wet in both the ANN forecast and the hydraulic simulation. Misses count the pixels predicted dry by the ANN model but simulated wet by the hydraulic model. False alarms count the pixels predicted wet by the ANN model but simulated dry by the hydraulic model.
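The counting of hits, misses and false alarms from the two depth maps, together with Equations (4)–(6), can be sketched as:

```python
import numpy as np

WET_THRESHOLD = 0.10  # m; pixels deeper than 10 cm count as wet

def contingency_scores(ann_depth, hyd_depth, thr=WET_THRESHOLD):
    """POD, FAR and CSI from binary wet/dry maps of the ANN forecast and
    the hydraulic simulation, following Equations (4)-(6)."""
    ann_wet = ann_depth > thr
    hyd_wet = hyd_depth > thr
    hits = np.sum(ann_wet & hyd_wet)            # wet in both
    misses = np.sum(~ann_wet & hyd_wet)         # dry in ANN, wet in hydraulic
    false_alarms = np.sum(ann_wet & ~hyd_wet)   # wet in ANN, dry in hydraulic
    pod = hits / (hits + misses)
    far = false_alarms / (hits + false_alarms)
    csi = hits / (hits + misses + false_alarms)
    return pod, far, csi

# Toy depth maps of four pixels: 2 hits, 1 miss, 1 false alarm.
ann = np.array([0.5, 0.5, 0.0, 0.5])
hyd = np.array([0.5, 0.5, 0.5, 0.0])
pod, far, csi = contingency_scores(ann, hyd)
```

A perfect forecast gives POD = 1, FAR = 0 and CSI = 1; CSI is the strictest of the three, as it penalizes both misses and false alarms.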

#### **3. Study Area and Database**

#### *3.1. Study Area*

The study area of Kulmbach lies by the River Main in Bavaria, Germany. The White Main divides the city into north and south parts. Seven streams, namely the Red Main, White Main, Dobrach, Schorgast, Mühlbach, Kohlenbach and Kinzelsbach, flow into this area. The city of Kulmbach has a population of 25,866 inhabitants in an area of 92.77 km<sup>2</sup>. An extreme flood event hit the city on 28 May 2006, and a flood mitigation plan was prepared by local stakeholders to mitigate future events. In the ANN model, the above seven streams are taken as the input boundary conditions. The goal of the ANN modeling is to replace the hydraulic processes within the marked study area to enable fast real-time forecasts (see Figure 4).

**Figure 4.** Map of the study area. It shows the location of Kulmbach in Germany. The blue curves represent the river network. The shaded region is the study area with its topography represented. On the marked boundary, the red points represent the seven inflows on the boundary (three rivers and four smaller streams).

#### *3.2. HEC-RAS and Synthetic Event Database*

The synthetic database is generated with the 2D hydraulic model Hydrologic Engineering Center–River Analysis System (HEC-RAS, Davis, CA, USA) for different precipitation durations, intensities and distributions [31]. Each event in the database contains a discharge hydrograph and an inundation map. The database contains 180 synthetic events whose discharge hydrographs are generated by the hydrologic Large Area Runoff Simulation Model (LARSIM) [34]. The events of the final database cover a wide range of return periods, from the one-year return period up to 1.5 times the 100-year return period. The 2D hydrodynamic model HEC-RAS is used to produce the flood inundation maps. In the end, the 180 hydrographs and their corresponding inundation maps form the synthetic event database. The tool automating these procedures is the FloodEvac tool, and the model has been validated [32]. All events have a high temporal resolution of 15 min, and the inundation maps are projected to a high spatial resolution (4 m by 4 m).
