**Using High-Density LiDAR Data and 2D Streamflow Hydraulic Modeling to Improve Urban Flood Hazard Maps: A HEC-RAS Multi-Scenario Approach**

**Alin Mihu-Pintilie 1,\* , Cătălin Ioan Cîmpianu 2, Cristian Constantin Stoleriu 2, Martín Núñez Pérez <sup>3</sup> and Larisa Elena Paveluc 2,4**


Received: 31 July 2019; Accepted: 1 September 2019; Published: 3 September 2019

**Abstract:** The ability to extract streamflow hydraulic settings using geoinformatic techniques, especially in high populated territories like urban and peri-urban areas, is an important aspect of any disaster management plan and flood mitigation effort. 1D and 2D hydraulic models, generated based on DEMs with high accuracy (e.g., Light Detection and Ranging (LiDAR)) and processed in geographic information systems (GIS) modeling software (e.g., HEC-RAS), can improve urban flood hazard maps. In this study, we present a small-scale conceptual approach using HEC-RAS multi-scenario methodology based on remote sensing (RS), LiDAR data, and 2D hydraulic modeling for the urban and peri-urban area of Bacău City (Bistri¸ta River, NE Romania). In order to test the flood mitigation capacity of Bacău 1 reservoir (rB1) and Bacău 2 reservoir (rB2), four 2D streamflow hydraulic scenarios (s1–s4) based on average discharge and calculated discharge (s1–s4) data for rB1 spillway gate (Sw1) and for its hydro-power plant (H-pp) were computed. Compared with the large-scale flood hazard data provided by regional authorities, the 2D HEC-RAS multi-scenario provided a more realistic perspective about the possible flood threats in the study area and has shown to be a valuable asset in the improvement process of the official flood hazard maps.

**Keywords:** Light Detection and Ranging (LiDAR); HEC-RAS; 2D modeling; flood hazard; urban and peri-urban area

#### **1. Introduction**

#### *1.1. State-of-the-Art*

In the last decades, with climate change and global warming, the associated natural disasters have reached more disastrous and catastrophic scales [1–6]. Many natural disaster patterns have diversified and modified under the pressure of climate change [7–9]. Among them, the intensification of the hydrological cycle has made an unprecedented impact on the magnitude, spatial extend, duration and frequency of hydro-meteorological disaster events [10–12]. Supporting this statement, many scientific publications support the fact that the occurrence of climate-related disasters (e.g., mainly floods, severe storms, cyclones, typhoons, droughts) have significantly increased under the abrupt changes in hydrological climatic conditions, the ecosystems resilience to transitional (wet–dry), and other related disturbances [13–15]. According to the Emergency Events Database (EM-DAT), in the last 50 years, climate related disasters have increased trifold between 1970 and 2017, from an average of 90 events in the period 1970–1989, to an average of more than 300 events in the period that followed after the year 2000 [16,17]. Among them, flooding phenomena are the most widespread, frequent, and costly natural disaster for the human societies [18,19], and is the most common natural hazard and the third most damaging hazard globally after storms and earthquakes [20,21].

In this context, North East of Romania is no exception. Crossed by the Siret and Prut rivers, which together constitute the biggest river basins in the country according to their surface, this territory is the most vulnerable area in terms of global climate change and modification of climate patterns. Modifications in the thermal, pluviometric and hydrological regimes have been noticed [22]. In the last 30 years, historical flood events were recorded in this area for the entire territory of Romania [23]. Floods have become a constant threat (especially after the year 2000) as, every two years, a major flood event was documented (e.g., 2004, 2005, 2006, 2008, and 2010) [24–27].

Besides climate change phenomenon implications, the high frequency of floods events is also determined by a whole range of socio-economic factors such as: Land use practices (e.g., changing floodplain functionalities), the development of socio-economic activities in flood-prone areas, changing living standards (e.g., urbanization—construction in high flood risk areas, urban sprawl), and poor regulations reflected by the incapacity of the responsible authorities to implement efficient planning policies [28–32]. The anthropic intervention on land has profoundly modified the natural standard behavior, and the changes on land use and land management are affecting, in particular, the river hydrology that determines the flood hazard [33]. This has ended up being a continuous increased trend in terms of vulnerability and exposure to disasters and flood hazard [34,35]. However, according to [36], the population has, voluntarily and under the pressure of modern society, exposed the environment to the possibility of flooding.

In Romania, there is a lack of proper flood defense measures against floods within many river basins. In general, the current flood defense infrastructure and hydro-technical constructions were built-up in the 1970s and very few improvements have been made in the years after. In the same time, over the past decades, the river regularization measures, the intensification of deforestation in the Carpathian Mountains, and housing development on the floodplains have changed the flood regime in the Siret and Prut river basins and their sub-basins. This has also the case for one of the most modified watercourses in the Siret river basin: Bistri¸ta River. Ten reservoirs, including canals, dams, collectors, transfer flows, protection works, and banks have been developed on the main course of the Bistri¸ta River, in order to mitigate the flood effects, and for the production of electricity using hydro-power plants [22,26,27]. Although no significant flood event has occurred on the Bistri¸ta River due to this considerable protection, the recent changes in pluviometric and hydrological regimes due to climate change, anthropic intervention on land, and reservoir clogging prove to be real challenges in the future flood hazard assessment of this area [22].

According to [37], human societies have always tried to reduce flood impact and have always sought protection from natural disasters by settling in safe areas or by building defense infrastructures. Throughout human history, there has been a constant preoccupation to understand, assess, and predict flood events and their impact [38]. Many ways were introduced in order to deal with flooding phenomena. In this context, aspects like streamflow hydraulic modeling, flood hazard assessment, and flood risk management have received increasing attention in the second half of the 20th century and has become a fundamental issue in the beginning of 21st century. Flood risk management can include: Hazard assessments, exposure assessments, vulnerability assessments, and risk assessments [39,40]. Flood hazard assessment is the most important step in the development of effective flood risk management; the flood hazard maps are used for danger estimation and for taking preventive conservation measures for risk mitigation. In this way, valuable information for different decision makers regarding spatial planning, and the design of infrastructure and emergency response preparation, become available.

In order to conduct flood hazard assessments and propose flood risk management measures, it is important to know what kind of comprehensive tools are available to be used in the analysis of flood susceptibility [41]. Generally, in flood hazard assessment, field-observed data from gauging stations are used. The existence of long records of accurate river discharge measurements is requested. However, these observations can have several uncertainties and the data are not always available [39,42]. Earth Observation (EO) datasets (e.g., space borne, aerial images, and satellite images) together with geographic information systems (GIS) can be used to determine the extent of flood areas and for the production of flood hazard and risk maps [43]. Combined with the use of remote sensing (RS) and GIS, EO datasets provide a favorable environment for relevant information processing in order to obtain the spatial extent of flood hazard areas and flood mapping [44–47]. Even if the benefits are high, this technique offers valuable information only for specific flood events, and this aspect is perceived as a disadvantage because future flood details and impact cannot be investigated [42,45].

Another available tool used in the determination of flood hazard areas is streamflow hydraulic modeling. The usage of this technique witnessed a considerable development in the recent years. As flood events have increased significantly over the last decades under the current unpredictable climate change behavior and human activities, the development of flood inundation models have become a necessity [39]. Streamflow hydraulic modeling is a useful and efficient tool in the simulation of flood events, identification of vulnerable areas, and estimation of spatially-distributed variables such as flow velocity or depth [48]. Since 1970, considerable efforts have been made in order to improve the capabilities and functionalities of this technique. Recent computational advances in hydraulic modeling offer new opportunities to support decision-making and adaptation [49]. The applicability of streamflow models have proved efficient in flood hazard and risk mapping studies [50,51], real time flood forecasting [52], and in remodeling past flood events [53]. Also, the availability of Digital Elevation Models (DEMs) [54] based on high-density Light Detection and Ranging (LiDAR) data have considerably improved the accuracy of flood parameters [47]. Depending on the selected mathematical method and scale, the complexity of models can be classified in simple interpolation methods (1D) and spatially detailed models which solve the water equations in 2D [55]. The use of a 1D simple model is recommended for channels, as long as the water remains in the steep slopes of the channel and no significant water spread is expected. 1D modeling is the perfect solution in cases of flood propagation along the main river [56,57]. Two-dimensional models are recommended in cases when the water is expected to overtop and the flow direction may change, spreading across a large area [42,58]. However, there are many works which make comparisons between 1D and 2D models [56,58–60].

#### *1.2. Case Study: Aims and Objectives*

In this study, we developed a method for flood vulnerability assessment under real (average discharge) and mathematical (calculated discharge) hydrological data based on HEC-RAS, LiDAR data, and 2D hydraulic modeling. Different from other studies [23–27,47], which provide flood hazard maps based on hydrological data calibrated at river basin scale [23,24,26,27], we developed for the first time in the study area flood hazard maps adapted to local environment settings and calculated discharge. In this context, four 2D streamflow hydraulic scenarios (s1–s4) were computed in order to test the flood mitigation capacity of hydro-technical constructions located downstream on the Bistri¸ta River (North East Romania) (Figure 1a). The scenarios were based on different discharge releases (average discharge; calculated discharge: s1–s4) from the Bacău 1 reservoir (rB1), also known as the Lilieci storage reservoir, and from its hydro-power plant (H-pp) (Figure 1b). The average and calculated discharges were correlated with the official operating regulations of rB1. The results provided a more realistic perspective about the possible flood threats located downstream: rB1, an area which overlays with the urban and peri-urban area of Bacău City; and rB2, with the Bacău 2 reservoir (Figure 1c).

**Figure 1.** (**a**) Geographic location of the study area in North East Romania; (**b**) Elevation map and, (**c**) Digital Elevation Model (DEM)-derived slope map of the urban and peri-urban areas of Bacău City. Data collected by the National Administration Romanian Waters (NARW) in 2014; DEM derived from ground point's elevation data (0.5 point density) collected using Light Detection and Ranging (LiDAR) techniques from aircraft. The abbreviations within the (**b**,**c**) maps show the locations of the study site boundaries used in the 2D hydraulic modeling for the Bistri¸ta River, Bacău 1 (rB1) and Bacău 2 (rB2) reservoirs, Bacău 1 spillway (Sw1) and Bacău 2 spillway (Sw2) gates, and the Bacău 1 hydro-power plant (H-pp). All spatial data was defined and projected in the Romanian national projection (STEREO 70).

#### **2. Methodology**

#### *2.1. Selected Site for 2D Streamflow Modeling*

The study area selected for 2D streamflow hydraulic modeling and for urban food hazard assessment covers 29.73 km<sup>2</sup> in the urban and peri-urban area of Bacău City (Bacău County, North East Romania) (Figure 1a). The town has a population of 196,883 inhabitants according to the 2016 census, making it the 12th largest city in Romania. The considered Bistri¸ta River reach extends 2 km upstream from the confluence of the Bistri¸ta and Siret rivers (Figure 1b,c). Due to more than 10 large reservoirs and other complex hydro-technical works (e.g., dams, water gates, hydro-power plants, channels, headraces, flood protection dikes) which equip the downstream sector of the Bistri¸ta watershed, the predictive flood models are difficult to achieve through classical methods. According to [22], even if the Bacău City was included on the settlements list potentially affected by floods in North East Romania, the flood hazard maps provided by the authorities would not highlight this state. For this

reason, in the last 10 years, new neighborhood areas have been built-up in the floodplain of Bistri¸ta River or in the proximity of the rB1 and rB2 reservoirs.

#### *2.2. Data Acquisition*

Figure 2 summarizes the workflow chart followed in this study with the hydrological, LiDAR, and built-up data obtaining process, and the key steps for HEC-RAS 2D streamflow and flood modeling.

**Figure 2.** Workflow chart of 2D streamflow and flood hydraulic modeling process using hydrological data (Sw1) and LiDAR ground point elevation (0.5–3 m spatial density) within the urban and peri-urban area of Bacău City: (**a**) Generate the LiDAR based DEM with 0.5 m spatial resolution based on LiDAR ground point elevation; (**b**) manual digitalization of buildings using a high resolution orthophotos and built-up data integration in DEM; (**c**) generate the four 2D multi-scenario (s1–s4) using HEC-RAS v.5.0.3, where Qavg. was generated based on average discharge at Sw1 for 2D streamflow accuracy, and s1–s4 was computed using calculated discharge at Sw1 for 2D flood multi-scenario development;

(**d**) generate the flood pattern for each computed scenario and export the individual layers (e.g., flood extent, flood velocity, flood depth); (**e**) flood hazard assessment based on the built-up data and flood depth classification according to the Ministry of Land Infrastructure and Transport (MLIT) [42].

2.2.1. Development of DEM

The DEM data used in this study comprehends airborne LiDAR technology. LiDAR DEMs display the terrain with a high degree of horizontal and vertical accuracy, a feature that is essential to terrain-related applications such as streamflow and flood modeling [54,61,62]. The data was obtained from the National Administration Romanian Waters (NARW)–Siret Water Basin Administration (SWBA) and consisted of raw ground point elevation data, at a spatial density between 0.5 and 3 m. More than 7.833 <sup>×</sup> 10<sup>6</sup> points covering almost 100 km<sup>2</sup> was considered in the DEM computation (Figure 2a). The final product (DEM) was obtained by means of *natural neighbor* interpolation technique [63]. The *natural neighbor* interpolation technique was preferred as it finds the closest subset of input samples to a query point and applies weights to them based on proportionate areas in order to interpolate a value [64]. To capture as many topographical details as possible, the DEM was computed at 0.5 m spatial resolution (Figure 2a).

As the considered study area overlaps a highly urbanized space, crossed by one of the most developed hydro-technical rivers in the country, an overview regarding the current situation of the urban built-up areas was necessary. Initially the open source database offered by OpenStreetMap [65] was consulted. As the available database proved incomplete, the last step in built-up data acquisition consisted of the manual digitalization of the missing built-up areas using a high resolution orthophotos digital image (0.5 m resolution), dating from 2012. In this way, the minimum and maximum polygonal areas of the improved built-up vector was situated between 10 m<sup>2</sup> and 19,534 m<sup>2</sup> (Figure 2b). Furthermore, in order to assure the hydraulic streamflow accuracy, the obtained vector was integrated into the DEM. The new data was rasterized and then added to the already obtained LiDAR DEM, taking into consideration an average height of 10 m. The built-up areas affected by the inundation were identified using the built-up vector obtained in the previous steps. All residential and industrial built-up areas that intersected or shared a common spatial extend with the inundation extend, were considered impacted (Figure 2b).

#### 2.2.2. Hydrological Data

The hydrological data used in this study was extracted from the Lilieci reservoir (rB1 and H-pp) official operating rules dated from 2012 and offered by SWBA (Table 1). According to Romanian regulations STAS—4273/61 and STAS 4273/82, the Lilieci reservoir (rB1) is classified after the height of the dam and its storage capacity volume, in the third importance category—having the flow sizing rate Q2% (50-years recurrence interval) = 510 m3/s, and Q0.5% (500-years recurrence interval) verification flow rate of 1670 m3/s. Nevertheless, the rB1 spillway gates (Sw1) and levees were designed to transit flow rates that correspond the second importance category—Q1% (100-years recurrence interval) = 1140 m3/s and Q0.1% (1000-years recurrence interval) = 2100 m3/s. The H-pp is located at approximately 1726 m from the Sw1, on a specially designed channel and it is equipped with two Kaplan turbines with a 180 m3/s maximum flow rate.

**Table 1.** The volume of water contained (million m3) in the rB1 reservoir according to specific water surface elevations (m) and related scenarios (s1–s4) developed in the study.


#### *2.3. 2D Hydraulic Modelling*

In relation to watercourses (e.g., streams, rivers, channels), the hydraulic models describe water movement through space in three directions [66]: A downstream direction along the river channel, a lateral direction (e.g., whenever the water begins to spill out overland), and a vertical direction which practically defines the height of the flood. Standard flood modeling practices include 1D modeling (upstream to downstream direction), 2D modeling (downstream and lateral directions), hybrid 1D–2D modeling and 3D numerical models, along with hydrograph design, specified ground roughness, and accurate digital elevation data [39,67]. For the present study, a 2D approach was adopted for a number of reasons (Figure 2c):


• Two-dimensional flood propagation modeling offers additional information regarding some characteristics of the flood, such as flow velocity and water trend propagation.

The 2D functionality builds flood models in a more accurate way, considering the flow variability in time, and both spatial dimensions, the *x* and *y*, making it more suitable for case studies that consider wide floodplains, urban environments, dam/levee breach situations, where the water is expected to spread over an open, unconstrained area, in multiple directions [68]. The release of the HEC-RAS software (5.0 version), in 2016, integrated 2D unsteady flow capabilities, offering the possibility to analyse water propagation over a predefined surface, which is found in the form of a digital elevations model [69,70]. HEC-RAS is a well-known software, capable of modeling a flood inundation event [39,42]. The scenarios proposed by this study were simulated using the open source HEC-RAS software (5.0.3 version), developed by U.S. Army Corps of Engineers (USAGE). Starting with HEC-RAS software version 5.0, two-dimensional unsteady water flow modeling can be performed. The program 2D flow modeling algorithm solves either shallow water equations, also called bidimensional Saint Venant equations (Equation (1)), or the 2D diffusion wave equations (Equations (2) and (3)).

$$\frac{\partial \check{\zeta}}{\partial t} + \frac{\partial p}{\partial x} + \frac{\partial q}{\partial y} = 0 \tag{1}$$

$$\frac{\partial p}{\partial t} + \frac{\partial}{\partial \mathbf{x}} \left(\frac{p^2}{h}\right) + \frac{\partial}{\partial y} \left(\frac{pq}{h}\right) = -\frac{n^2 pg\sqrt{p^2 + q^2}}{h^2} - gh\frac{\partial \zeta}{\partial \mathbf{x}} + pf + \frac{\partial}{\rho \partial \mathbf{x}} (h\tau\_{\mathbf{x}\mathbf{x}}) + \frac{\partial}{\rho \partial y} (h\tau\_{xy}) \tag{2}$$

$$\frac{\partial\eta}{\partial t} + \frac{\partial}{\partial y}\left(\frac{q^2}{h}\right) + \frac{\partial}{\partial \mathbf{x}}\left(\frac{p\eta}{h}\right) = -\frac{n^2 q \mathbf{g} \sqrt{p^2 + q^2}}{h^2} - gh \frac{\partial \zeta}{\partial y} + qf + \frac{\partial}{\rho \partial y}\left(h\tau\_{yy}\right) + \frac{\partial}{\rho \partial \mathbf{x}}\left(h\tau\_{xy}\right) \tag{3}$$

where, *h* is the water depth (m), *p* and *q* are the specific flow in the *x* and *y* directions (m2s−1), ζ is the surface elevation (m), *g* is the acceleration due to gravity (ms−2), *n* is the Manning's Roughness coefficient (m−1/<sup>3</sup> s), ρ is the water density (kg m−3), τ*xx*, τ*yy*, and τ*xy* are the components of the effective shear stress and *f* is the Coriolis (s<sup>−</sup>1). When the diffusive wave is selected, the inertial terms of the momentum equations are neglected: ∂*p*/∂*t* + ∂/∂*x p*2/*h* + ∂/∂*y* (*pq*/*h*) = 0 (Equation (2)); ∂*q*/∂*t* + ∂/∂*y q*2/*h* + ∂/∂*x* (*pq*/*h*) = 0 (Equation (3)) [42,45,71].

Due to their faster computational time and greater stability properties, the 2D diffusion wave equations were preferred in the present study [70]. In this case, the floodplain flow was approximated as a two-dimensional diffusion wave where water can flow in any direction based on the defined topography and the resistance to flow determined by the type of land use [72,73]. A 2D model consists of a 2D computational grid which groups/discretizes the river and its adjacent areas into a collection of individual and connected cells, called grid cells (2D flow cells), that are used to characterize the underlying topography [56,74]. The 2D flow area is the region (the boundary selected by the user) where the water flow will be modeled.

A close polygonal mesh with a computation point spacing of 5 m was built in order to define the spacing between the computational grid-cells. This managed to capture all the LiDAR terrain characteristics. In this way, more than 2,000,000 grid cells were generated; therefore, no break line was needed to be introduced (e.g., top of the levees, along the main channels). The computational mesh controlled the water flow throughout the 2D flow designed area and from cell to cell. Considering the underlying terrain and the generated computational mesh, the software developed detailed elevation–volume relationships and detailed hydraulic property curves for each cell face (elevation vs. wetted perimeter, area, and roughness) [70]. In this way, the water elevation values were calculated for each centroid of the grid. The Manning's Roughness coefficients, specific to each landcover class was set according to [75] (Figure 3).

**Figure 3.** Landcover map within the study area and Manning's Roughness coefficient *n* (m−1/<sup>3</sup> s) value for each class according to [75].

After the geometric data is created and the roughness coefficient is set, the next step consists of establishing the boundary conditions. This means that flood data along the boundaries of the 2D flow area is assigned. Flow hydrographs and normal depth boundary conditions representing the average riverbed slope are used in order to bring water into the 2D flow area. Four flood scenarios were simulated and, for each scenario, two hydrographs and two normal depths boundary conditions were introduced. The hydrographs were 24 h long with values recorded each hour. The upstream boundary conditions (the hydrographs representing imposed flow condition) were located in two specific locations downstream of Lilieci dam: At the hydropower plant of Lilieci reservoir (on the controlled channel), and on the natural river bed (where water from the reservoir is released). The imposed water level boundary condition was located at the downstream extremes of the channel and at the end of storage lake control channel (in the limits of DEM). An energy slope of 10−<sup>4</sup> mm−<sup>1</sup> was used.

In order to ensure the stability of the model, the time steps were estimated according to the Courant–Friedrichs–Lewy condition [70] (Equation (4)).

$$\mathcal{C} = \frac{V\Delta T}{\Delta \mathbf{x}} \le 1.0 \text{ (with } \mathcal{C}\_{\text{max}} = 3.0\text{) or } \Delta T \le \frac{\Delta \mathbf{x}}{V} \text{ (with } \mathcal{C} = 1.0\text{)}\tag{4}$$

where, *C* is the Courant number, *V* is the flood wave velocity (m/s), Δ*T* the computational time step (s), and Δ*x* the average cell size (m) [67]. According to Equation (4), a time step of 10 s was selected. To run a 24 h simulation at a time step of 10 s, the 2D models took between ~3 h (for Qavg. scenario generated

based on average discharge) and ~12 h (for each s1–s4). The flood inundation (depth), flood velocity, and other related results were obtained for each hour according to the hydrograph output interval.

#### *2.4. Streamflow and Flood Simulation Accuracy*

For accuracy purposes, a 2D water flow test simulation was computed. The data needed for this flow accuracy assessment test consisted of average discharge values at the Lilieci reservoir (rB1) for the year 2012 and, respectively, an orthophoto digital image (Figure 4a) and a Landsat 5 satellite image (Figure 4b) showing the water extend in the area acquired in the same date. The results of this simulation showed a close agreement between the water extend obtained from the model computation and the available orthophoto digital image (Figure 4c). The performance assessment was realized by comparing these three sets of spatial data. A difference of 5% (16 ha) was recorded between the orthophoto digital image water extend and the 2D average discharge simulation result. A second comparison was carried out using the Landsat 5 satellite image, acquired on 26 August 2012. The water extend were extracted according to [23] and compared with the 2D average discharge simulation result. In this case a difference of 9% (29 ha) was observed. Taking into consideration the presence of vegetation which could have masked the presence of water in the digital orthophoto images and the coarse resolution of Landsat scene, the accuracy of the flood extent assessment was judged satisfactory.

**Figure 4.** Streamflow 2D accuracy: (**a**) orthophoto (digital image) for 2012, (**b**) Landsat 5 (satellite image) for 2012, and (**c**) results of 2D average discharge simulation result. In all images was highlighted the water extend within inflow area (fan-delta) of the rB2 reservoir for comparison.

#### *2.5. Multi-Scenario Development*

The flood scenarios were based on hydrological data extracted from the Lilieci reservoir (rB1) official operating rules. Upstream boundary conditions for all scenarios were developed on a simulation of water flow over a period of 24 h, at a temporal resolution of 1 h. The average discharge scenario, spatially displays the main channel at a water release of Q = 34 m3/s located at the Lilieci hydro-power plant (H-pp) and a Q = 2.8 m3/s flow located at the reservoir main gates (Sw1), where the water is released directly in the main river bed (in order to assure the river servitude discharge). Scenario 1 (s1) takes into consideration the average discharge at H-pp of the Lilieci reservoir (Q = 34 m3/s) and the maximum water release in the case of the full opening of one gate (Sw1) of the same storage lake (Q = 490 m3/s). The maximum values were used in the simulation for a period of 1 h, whereas for

the remaining 23 h, the flow rates were set to their average values (Q = 2.8 m3/s). Scenario 2 (s2) considers water flow modeling of Q = 34 m3/s at the H-pp and Q = 980 m3/s at the Sw1 when two gates are fully opened. The distribution of time periods was the same as for s1. For scenario 3 (s3), the same parameters were used as for s1 and s2, with the exception that a water flow of Q = 1470 m3/s is assumed when all 3 gates of the Sw1 are opened. The last scenario 4 (s4) simulated the water flow when all 4 gates of the Sw1 are fully open at a discharge rate of Q = 1960 m3/s. The other parameters remained unchanged. In Table 2 are the simulation parameters for the 4 modeled scenarios.


**Table 2.** Streamflow features and flood multi-scenario developed in this study.

<sup>1</sup> Sw1: Bacău 1 spillway gates; <sup>2</sup> H-pp: Hydro-power plant; <sup>3</sup> Q: Discharge (m3/s); <sup>4</sup> Q: Percentage of total spillway gates discharge capacity.

#### **3. Results**

#### *3.1. Flood Pattern*

The 2D modeling was performed using the HEC-RAS 5.0.3 version which offered various output possibilities in terms of detailed animation and mapping of flood characteristics within the RAS mapper feature. Following the calculation of the 4 scenarios given in Table 1, individual layers of the flooded areas along Bistri¸ta River and Bacău City regarding the flood extent, velocity, water surface elevation and depth, were exported.

#### 3.1.1. Flood Extent

The flood extent layer consists of an *Inundation Boundary* shapefile vector layer capturing the areas affected by the flood during the whole 24 h simulation period (Figure 5). According to s1, an area of 6.27 km<sup>2</sup> within the Bistri¸ta floodplain, the inflow area (fan-delta) of rB2 and downstream sector of rB2 reservoir, are potentially affected by floods. Also, six buildings located within the s1 flood extent are potentially affected. In the s2 simulation, the total flood extent occupies 8.66 km<sup>2</sup> and 173 buildings in the peri-urban area of Bacău City which are located in the flood zone. In the third scenario (s3), the water release from Sw1 and H-pp spills over protection dikes and floods the built-up area of Bacău City. The total flood extent increases to 11.46 km<sup>2</sup> (built-up area—1.09 km2) and 2461 of buildings (e.g., block with flats, houses and yards, other residential buildings, industrial units) are potentially affected by the inundation waters. The last scenario (s4) indicates the most catastrophic situation where total flood extent is 12.82 km2, from which 1.5 km2 are built-up area, and 3780 residential buildings are vulnerable to the flood hazard (Table 3). Overall, in the first two scenarios (s1 and s2), the flood extent affected only the built-up areas within the Bistri¸ta floodplain, and in the last two scenarios (s3 and s4) the flood extent occupied the peri-urban and urban areas.

**Table 3.** Flood extent area, built-up area, and number of buildings potentially affected by floods computed for each scenario (s1–s4).


**Figure 5.** Flood extent derived from the HEC-RAS 2D multi-scenario generated based on LiDAR data and average discharge; s1–s4 was computed using calculated discharge (see Table 2).

#### 3.1.2. Flood Velocity

Another important result of the 2D modeling process is flood velocity. This is computed by recording the maximum velocity for each cell in the computational mesh regardless the time when it was measured in the whole 24 h simulation period (Figure 6). According to all computed scenarios (s1–s4), the 0.01–1 m/s velocity class has the highest frequency (88.3%), which corresponds with the average velocity recorded downstream rB2 (0.6 m/s). In the first two scenarios (s1 and s2), all vulnerable buildings are potentially affected by the 0.01–1 m/s flood velocity class, except three buildings located in front of Sw1 or along the natural watercourse of Bistri¸ta River. In the case of s3, only 9% of total affected buildings are threatened by 2 m/s flood velocity, and in case of s4, only 14% of total built-up area is potentially affected by high velocity of water (1–5 m/s). However, due to floodplain roughness and hydro-technical works within study area, the high flood velocities (>5 m/s) are registered only at the Sw1 and Sw2 gates, and through the narrow sections between rB1 and rB2 river sector (Table 4). Overall, in each computed scenario, the flood velocity is not the main threat for built-up area, but the river morphology (e.g., river bed, banks, alluvial deposits), and the hydro-technical works conservation stage (e.g., dams, channel, bridges) can alter through time. For these reasons, the maintenance of the flood mitigation equipment within urban and peri-urban areas must be a priority for the competent authorities.

**Table 4.** Number of buildings potentially affected by floods vs. flood velocity (m/s) computed for each scenario (s1–s4).


#### 3.1.3. Flood Depth

The maximum flood depth maps were generated by the 2D model by taking into consideration the maximum depth for each cell, no matter the time when that maximum depth was registered during the whole 24 h simulation period (Figure 7). According to s1 and s2, 95.5% of vulnerable buildings are potentially affected by a flood depth which does not exceed 1 m depth. In case of s3, when water levels exceed the height of protection dikes and inundates the urban and peri-urban area of Bacău City, 1174 buildings will be affected by floods which do not exceed 1 m in depth; 39.4% of the vulnerable built-up areas are potentially affected by floods with a depth between 1–2 m, and 12.7% by floods with a depth between 2–3 m. In the same scenario (s3), only 4 buildings are located in areas with water that exceeds 3 m. In case of s4, the most catastrophic scenario taken into account in this study, 34.7% of buildings would be affected by floods with <1 m depth, 30.9% by floods with depths between 1–2 m, 25.9% by floods with depths between 2–3 m, 7.9% by floods with depths between 3–4 m, and the rest of 0.4% (18 buildings) by floods that exceed >5 m depth (Table 5).

**Table 5.** Number of buildings potentially affected by floods vs. flood depth (m) computed for each scenario (s1–s4).


**Figure 6.** Flood velocity derived from the HEC-RAS 2D multi-scenario generated based on LiDAR data and average discharge; s1–s4 was computed using calculated discharge (see Table 2).

**Figure 7.** Flood depth derived from HEC-RAS 2D multi-scenario generated based on LiDAR data and average discharge; s1–s4 was computed using calculated discharge (see Table 2).

#### *3.2. Flood Hazard Assessment*

Usually, the flood hazard assessment is based on quantifiable variables like flood extent, water velocity or water depth, and indicate the vulnerability of built-up areas to hydrological events with possible destructive impact. In this study, we provided the flood hazard assessment using only the flood extent and flood depth resulting from the four HEC-RAS 2D scenarios (s1–s4). The flood velocity was not taken into account because is a constant variable in all four flood scenarios (88.3% from total built-up area are affected by the 0.01–1 m/s velocity class). To generate the flood hazard categories, the water depth for each flood extent was classified according to the Japanese criteria of the Ministry of Land Infrastructure and Transport (MLIT) [42]. The criteria suggest five flood hazard categories: H1—very low hazard (flood depth < 0.5 m); H2—low hazard (flood depth between 0.5–1 m); H3—medium hazard (flood depth between 1–2 m); H4—high hazard (flood depth between 2–5 m); H5—extreme hazard (flood depth > 5 m) (Figure 8). The flood hazard classification methodology and hazard description based on flood depth is detailed in Table 6.


According to s1, even if all 5 hazard classes are encountered, all vulnerable buildings are situated in the very low class of hazard (H1). In case of s2, 86.7% of total vulnerable buildings are very low affected by flood (H1), 8.75% are low affected by flood (H2), and the rest of 4.6% are located in the medium exposed area to flood hazards (H3). In s3 situation, the number of potentially affected buildings is significantly increased: H1—674 buildings; H2—500 buildings; H3—971 buildings; and H4—316 buildings. Also, this is the moment when the floods will affect the urban area of Bacău City. According to s4, more the 34.3% of total flood extent are located in the high (H4) and extreme (H5) hazard classes, the number of potentially affected buildings exceed the all NARW scenarios based on recurrence flood interval probabilities: H1—694 buildings; H2—618 buildings; H3—1171 buildings; H4—1279 buildings; and H5—18 buildings (Table 7; Figure 9a,b).

**Table 7.** Number of buildings potentially affected by floods vs. flood hazard classes computed for each scenario (s1–s4).


<sup>1</sup> Flood hazard classes according to [42] (see Table 6).

**Figure 8.** Flood hazard maps based on flood depth classification according to the MLIT [42] (see Table 5). Data derived from HEC-RAS 2D multi-scenario generated based on LiDAR data and average discharge; s1–s4 was computed using calculated discharge (see Table 2).

**Figure 9.** The distributions of flood hazard classes within (**a**) flood extent area, and (**b**) built-up area, according to MLIT [42] criteria (see Table 6). The statistics apply for all 2D multi-scenarios (s1–s4) in each chart.

#### **4. Discussion**

Floods can cause a tremendous amount of damage, especially in the highly regulated river sectors overlaid with highly urbanized areas along the floodplain. For this reason, the multiple viewpoints, methods, assumptions, and future possibilities adapted to new trends (social, economic, natural) and determined by different factors, play an important role in the flood hazard management and establishing flood-vulnerable areas [39]. Even in a highly regulated river system, flood hazards exist; properties and lives are in danger. A flash flood, dense precipitation, or an error in the discharge flow at the gates of the reservoirs can turn into a catastrophic flood event. For this reason, a good preparation regarding this topic is always welcomed. As remote sensing (RS) techniques continue to improve and the availability of data increases, more RS data will be integrated and used in flood modeling. In this context, according to [54], 2D LiDAR DEM based flood simulations provide the best results. The horizontal, vertical accuracy is better on flat terrain (such as wide floodplains) and the integration of vegetation and building heights offer the perfect support for flood hydraulic simulations and accurate water flow propagation.

Due to the unavailability of LiDAR surface elevation points, we were not able to perform the simulation on a Digital Surface Model (DSM with vegetation and building integration). We computed the DEM and manually integrated the building heights. Manually integration of building heights (in the case of data unavailability) can be a solution. Prior to this step, we wanted to check if any open source satellite imagery could be used in order to extract the built-up areas. Initially, a Landsat 8 satellite image dated 11 August 2018 was downloaded using the US Geological Survey (USGS) website [76], pre-processed and processed—including radiometric and atmospheric calibration, maximum likelihood classification (MLC), classification accuracy assessment—in order to extract the areas containing artificial surfaces, residential, commercial, industrial, and transportation infrastructure. A full methodology of the previous mentioned steps can be consulted in the works of [77]. In this way a full perspective regarding the current situation of the urban development throughout the floodplain can be obtained. However, even though this method is practical and can be used in areas where no built-up data is available, the results proved to be at a coarse resolution and impracticable for the level of study we proposed. Therefore we decided to extract the built-up areas at a higher resolution. We manually digitized the missing built-up areas using an orthophotos digital image, dating from 2012. The association of open source imagery (Landsat, Sentinel) and LiDAR data offered unsuitable results.

The 2D flow models and LiDAR DEMs prove to be the perfect combination when it comes to the flood hazard assessment and accurate inundation delineation. Due to its accurate representation of the complex hydraulic conditions that can be found in floodplains (e.g., channels, confluences, bridges, water reservoirs, roads, etc.), 2D hydraulic models can capture the hydraulic behavior of the river in a more accurate way. Water propagation, extent, velocity, and elevation can be reproduced

under real and hypothetical flow data and boundary conditions input. In this way we can simulate for different purposes the past flood events, using registered hydrographs at gauging stations, or we can imagine a theoretical hypothesis and, in this way, improve flood hazard and risk maps. This method is the best option in case of flood map creation due to possible reservoir dam failure at various dam discharges. On the other hand, the unavailability of high resolution DEMs can end up underestimating the flood hazard and its dangers. A precise representation of reality is mandatory for accurate flood mapping process.

Our study was focused on a highly regulated and well-equipped hydro-technical river sector which overlays the urban area of Bacău. This proved to be an impediment in the validation process of the 2D hydraulic modeling, as no notable flood event was recorded here in the recent years. Controlled water discharges from the reservoirs located upstream protect the areas downstream. For this reason, a considerable development of built-up areas and urban sprawl was noticed downstream, on both sides of the riverbed. Having only the average official discharges for the year 2012 available at the H-pp and Sw1 to model, we still managed to model the flood hazard in the area based on the official operating rules of the dam Sw1 and taking into consideration the reservoir maximum water retention capacity (Figure 10a).

**Figure 10.** Comparison between flood extent maps within the urban and peri-urban area of Bacău City: (**a**) 2D HEC-RAS multi-scenario based on average discharge and calculated discharge (s1–s4) developed in these study, and (**b**) official flood extent maps (NARW) [78] based on large-scale hydrological and spatial data, calculated for 10% (10-year), 5% (20-year), 1% (100-year), and 0.1% (1000-year) recurrence intervals.

Furthermore, we managed to compare the 2D flood extent results with the official available hazard and risk maps available in [78], which was realized under the Directive 2007/60/EC of the European Parliament and Council [79,80] on the assessment and management of flood risks (Figure 10b). The comparison results showed similarities even when the input material for their computation were at different scales and resolution. According to the National Administration "Romanian Waters", the hazard and risk maps for Bistri¸ta River were realized for the entire catchment at a spatial resolution between 5 and 10 m, while our analysis used a local 0.5 m input DEM where all the topographical details and hydraulic conditions were taken into consideration. Additionally, in a highly regulated river sector where several reservoirs are present, such as in our study area, the flood hazard is governed by their operating rules and management. Any mistake in the manipulation of water discharge can prove catastrophic for the areas downstream.

#### **5. Conclusions**

The 2D hydraulic modeling using HEC-RAS (v 5.0.3) and high-density LiDAR data applied for streamflow simulation within the urban and peri-urban area of Bacău City (North East Romania), produced sufficiently accurate information regarding flood hazard vulnerability. Based on multi-scenario development using different discharge releases from the rB1 (average discharge; calculated discharge: s1–s4) and its hydro-power plant (H-pp), the following concluding remarks can be summarized:


Overall, the 2D hydraulic modeling multi-scenario results can be exported into a set of flood hazard parameters such as flood depth, flood extent, flood velocity, or water surface elevation; and can answer real questions regarding the flood hazard threat at local level. Developing streamflow scenarios on a small-scale level is a very important aspect for any flood mitigation effort, especially in the urban areas located along main rivers, because large-scale analysis (river basin analysis) understates flood risk perception. Due to the adaptability of the 2D streamflow hydraulic model proposed in this study, the method can become a valuable asset in flood mitigation. However, a LiDAR DEM based 2D flood simulation is essential for every urbanized environment in the context of climate change and modern society development pressure and trends.

**Author Contributions:** Conceptualization, A.M.-P. and C.I.C.; methodology, A.M.-P., C.I.C. and M.N.P.; software, A.M.-P., C.I.C., C.C.S. and M.N.P.; validation, A.M.-P., C.I.C., C.C.S. and M.N.P.; formal analysis, A.M.-P. and C.I.C.; investigation, A.M.-P., C.I.C., C.C.S. and L.E.P.; resources, A.M.-P. and L.E.P.; data curation, A.M.-P. and C.I.C.; writing—original draft preparation, A.M.-P. and C.I.C.; writing—review and editing, A.M.-P.; supervision, A.M.-P.; project administration, A.M.-P.; funding acquisition, A.M.-P.

**Funding:** This work was funded by the Ministry of Research and Innovation of Romania within Program 1—Development of the national RD system, Subprogram 1.2—Institutional Performance—RDI excellence funding projects, Contract no.34PFE/19.10.2018 (beneficiary: A.M.-P.).

**Acknowledgments:** The authors would like to express their gratitude to the employees of the Romanian Waters Agency Bucharest, Siret Water Basin Administration Bacău, who kindly provided a significant part of the LiDAR data used in the present study. All data was processed in the Geoarchaeology Laboratory (Coordinator: A.M.-P.) of Institute for Interdisciplinary Research, Science Research Department, "Alexandru Ioan Cuza" University of Ia¸si (UAIC), Romania. Our thanks go to the all anonymous reviewers, who helped us in improving the manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest. The founding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results.

#### **Abbreviations**

The following abbreviations are used in this manuscript:


#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **How to Account for the Human Motion to Improve Flood Risk Assessment in Urban Areas**

#### **Gabriele Bernardini \* and Enrico Quagliarini**

Department of Construction, Civil Engineering and Architecture, Università Politecnica delle Marche, via Brecce Bianche, 60131 Ancona, Italy; e.quagliarini@staff.univpm.it

**\*** Correspondence: g.bernardini@univpm.it; Tel.: +39-071-220-4246

Received: 23 March 2020; Accepted: 1 May 2020; Published: 7 May 2020

**Abstract:** Floods are critical disasters affecting urban areas and their users. Interactions with floodwater spreading and built environment features influence the users' reaction to the emergency, especially during immediate disaster phases (i.e., evacuation). Recent studies tried to define simulation models to evaluate such exposure-related criticalities, assess individuals' flood risk, and propose risk-mitigation strategies aimed at supporting the community's proper response. Although they generally include safety issues (e.g., human body stability), such tools usually adopt a simplified approach to individuals' motion representation in floodwaters, i.e., using input from non-specialized databases and models. This study provides general modelling approaches to estimate evacuation speed variations depending on individual's excitement (walking, running), floodwaters depths and individuals' features (age, gender, height, average speed on dry surfaces). The proposed models prefer a normalized evacuation speeds approach in respect of minimum motion constraint conditions to extend their applicability depending on the individuals' characteristics. Speed data from previous experiments are organized using linear regression models. Results confirm how individuals' speed reduces when depth and age increase. The most significant models are discussed to be implemented in evacuation simulation models to describe the evacuees' motion in floodwaters with different confidence degree levels and then assess the community's flood risk and risk-reduction strategies effectiveness.

**Keywords:** flood risk assessment; flood evacuation; evacuation modelling; behavioral design; urban built environment at risk; human motion in floodwaters

#### **1. Introduction**

Floods have been provoked over 93,000 victims worldwide in the years 2000–2018, being the second most disruptive natural disaster (after earthquakes) affecting our communities' safety (Source: EMDAT (2019): OFDA/CRED International Disaster Database, Université catholique de Louvain, Brussels, Belgium, https://www.emdat.be/, last access: 16 December 2019). Only in Italy, in the years 2014–2018, there were over 70 dead or missing persons, over 30 wounded, and over 21,000 individuals who had to evacuate and were homeless because of such kind of events (Source: http://polaris.irpi.cnr.it/report/last-report/, last access: 16 December 2019). In this general context, urban areas are the riskiest scenarios because of the combination between the built environment (in the following, BE) features (defined as a network of buildings, infrastructures and open spaces) and the high density of hosted exposed inhabitants [1–3]. Previous works demonstrated the significant importance of the interactions between the individuals and the flood-affected BE during the immediate response phases, and in particular during the emergency evacuation process [4–6]. In such conditions, the hosted population can attempt to move (drive or walk) in floodwaters-affected scenarios towards "safer" areas, with the aim to reduce threats [7–12]. Most of the flood-related fatalities occur in outdoor spaces during such activities and so it is essential to understand the underlying phenomena [13].

This is particularly evident in the following relevant scenarios [10,13–21]:


The following cases can be combined and represent only an exemplification of the main conditions based on the literature review.

In such scenarios, individual's behaviors could lead to additional risks, too [6,10,22].

In this way, a brief overview on the current topics concerning a better risk assessment, counting for people moving in floodwaters and a proper effectiveness analysis of risk mitigation strategies, is reported in the next sub-sections, to have a general reference framework:


In particular, this study deals with this last issue, to improve the development of pedestrians' evacuation simulators in the context of flood disasters.

#### *1.1. BE Risk-A*ff*ecting Factors and the Population-Related Contribute*

Risk-increasing factors concerning the urban areas can be distinguished between those related to the BE itself (by including physical and management-related issues) and those related to the hosted population.

About the BE itself, they can be mainly related (but not limited) to [1,5,23–27]:

• Location of settlement on floodplains;


From the inhabitants' perspective, the general high population density in urban BE (in terms of inhabitants' number and localization on the urban layout) is matched with additional exposure-related factors such as social-economic factors, individuals' risk perception and awareness/preparedness and behavioral issues, which can lead people to additional dangers in emergency scenarios [5,10,19,28–30].

In such a framework, the two main elements characterizing the floodwaters and their direct effects on evacuees are their depth *D* (m) and speed *Vf* (m/s) [7–11]. In critical *D-Vf* conditions, fatalities can be mainly due to: (a) interaction with debris dragged by floodwaters, with the possibility of minor injuries; (b) stability loss (e.g., for pedestrians, they can be due to possible buoyancy or body failure/dragging phenomena), with the consequent possibility of drowning and death. Similar problems can be related to both people moving on foot or by (motor) vehicles [7,8].

Besides, some human behavior in floods could increase risks during the evacuation process [4,6, 10,13,17,29,31–33]:


All these behaviors could lead people to interact with critical floodwater levels, by leading people to be exposed to more critical floodwater conditions because of, e.g., not being able to choose the proper evacuation direction or spend time in dangerous areas.

#### *1.2. Simulation Tools to Risk-Assessment and Risk-Mitigation Strategies Evaluation*

Simulation models are effective evaluation tools for safety planners and designers for coordinating preparedness-oriented and response-oriented efforts of local Authorities and First Responders. Anyway, they should be developed by pursuing a behavioral point of view, to include the man-floodwater and the man-BE interactions as focus assessment elements [1,5,13,21,34].

As for other kinds of disasters [35–37], evacuation simulators could support this process by jointly representing behavioral issues, floodwater spreading and BE modifications over time and space. Most of the current simulators generally deal with a territorial/full urban scale, by mainly using motor vehicles or public transportation as evacuation means [5,15,26,38–40]. Such kind of model could introduce simplifications in the evacuation rules by adopting "macroscopic" models (e.g., density-related models for motion estimation). This choice is reasonable if considering the widescale application but can limit the simulator effectiveness while it locally evaluates threats for the population, especially in case of local interactions with the floodwaters (i.e., for pedestrian evacuation). Meanwhile, the number of simulators dealing with pedestrian evacuation seems to be still limited [13,21,34]. In view of the above, "microscopic" approaches should be preferred [10,34,41]. These approaches assign motion rules to each of the simulated individuals to derive the overall evacuation phenomena and then estimate the safety level for the population [21]. Hence, they allow evaluating risks at the typical scale (dimension and configuration) of each element composing the BE (i.e., streets and other open spaces, buildings) [13].

#### *1.3. Modelling Pedestrians' Evacuation in Floodwater: Current Approaches and Limitations*

The "microscopic" approach defines the overall motion phenomena by overlapping the effects of different behavioral rules assigned to each simulated evacuee [10,41]. To this aim, it is important to analyze the behavioral rules of an "isolated" pedestrian, that is, an individual moving into the floodwater by himself/herself (free-flowing movement conditions). Additional interaction behaviors could be then overlapped to this, to define the overall motion rules. Furthermore, individuals' variations in behavior could be randomly assigned, by allowing to describe different evacuees' attitude in the evacuation process. In this view, besides the definition of qualitative behavioral rules in flood evacuation [6,10,17], the determination of motion quantities plays a pivotal role for the following reasons.

Previous studies demonstrate how differences between general-purpose and flood-related motion quantities exist, especially about the evacuation speed, which is affected by the levels of the floodwaters [10,33,42,43]. Efforts to assess the individual's evacuation speed *vi* (m/s) have been recently performed through real-world event analysis [10] or laboratory experiments in open channels [33,44] or pools [42]. Nevertheless, only a limited number of works succeeds in overlaying the main constraints due to [33,36,42]: (a) the limited number of individuals, also in relation to the different age classes; (b) the effects of individuals' excitement conditions (i.e., walking versus running).

This is the object of the present paper: *D* is here considered as the main floodwater evacuation affecting parameter in both walking and running conditions, as well as, its correlations with individuals' gender and age.

#### **2. Phases, Materials, and Methods**

This work is organized into three main phases:


3. Modelling of the normalized speed variation depending on *D* in respect to the minimum constraint and maximum excitement conditions for the considered database, according to methods in Section 2.4. Such model delves into specific experimental conditions and sample features, to generalize speed estimation data regardless of dry surface motion conditions data.

#### *2.1. Materials: Original Input Database Characterization*

The database by Bernardini et al. [44] is used as the input reference for the model definition (Raw data are available as supplementary files). More than 200 individuals were involved in laboratory experiments by moving in walking and running excitement conditions along a 24-m-long path into an open channel (rectangular section of 1 × 1 m; concrete horizontal bottom surface; individuals wearing fishermen's suits). A total of 555 samples compose the overall database.

The volunteers' sample was defined according to the Italian National age statistics and included individuals from 8 to 83 years old (males: 56%, females: 44%). Tested floodwaters *D* were equal to 0.2, 0.3, 0.4, 0.5, 0.6 and 0.7 m, by considering still water (*Vf* = 0 m/s) to better focus on the effects of depth on *vi*. For each individual involved in the test, the original database includes: (a) the motion speeds *vi* (m/s) referred to the specific tested floodwaters and excitement conditions; (b) the individuals' features (i.e., sex, gender, height, mass, body mass index); (c) the number of performed tests. Each individual performed the test once in walking and running conditions, for a given *D*, and 31 individuals performed the test for all the *D* values.

#### *2.2. General Methods for Model Definition and Characterization of the Selected Database*

A preliminary analysis to assess the relative position of the knee height in respect of the floodwater depth has been performed, to evidence if the original database could be affected by the individuals' height *H* (m). In fact, previous works on human body kinematics suggest that differences in individuals' motion could exist for intermediate *D* values (e.g., 0.4 to 0.6 cm) in case the knee is outside or inside the water level. In particular, a physiological constraint (i.e., "the difficulty to articulate the motion of the legs") can be induced while moving the knee and the part of the above leg inside the floodwaters [44]. In general terms, the knee height *K* (m) can be calculated as 0.285 × *H* [44,45]. The distribution of *K-D* ((cm) or (m)) in respect to *D* tested values is assessed and shown by a boxplot comparison, while linear regression is provided to evidence the general trend of the sample.

Then, the database has been mainly characterized by the comparison between walking and running conditions through *vw,exp-vr,exp* (m/s) pairs analysis depending on:


Linear regression models are tested by reporting the R<sup>2</sup> values to focus on the direct correlation between the two assessed variables. Then, simple models to be integrated into simulators can be traced. According to previous works on the original database [44], all the models have been evaluated through the Bisquare regression method, which allows finding the regression that fits the bulk of the data and minimizes the effect of outliers.

For each of this analysis, the *vw,ex* = *vr,exp* line (no differences between walking and running conditions) is shown to confirm if the above regressions are over this ratio.

#### *2.3. Methods for Speed Variation Modelling Depending on D and Dry Surface Motion Conditions*

The analysis methods described in this section are applied to the whole original sample, regardless of the number of tests performed by the volunteers. The database significant dimension is similar to those of previous experimental activities chosen as reference works [43,46].

Previous studies tried to derive correlations between the individuals' speeds *vi* (m/s) and *D* in absolute terms (m/s) [33,34,44], but only a limited number of works tried to define such correlations in respect of free-flowing and dry surface motion conditions [33], that is, the no-constraint conditions for human body motion also in evacuation conditions. The use of such normalized speed could instead improve the generalization of simulation models by allowing to overlap normalized speed variations and the main motion-affecting individuals' parameters (e.g., age, gender) [43,46]. Thus, this second approach has been followed, and the individuals' speeds *vi* (m/s) have been expressed in relation to maximum or ideal evacuation speeds and so in non-dimensional terms (-).

The input data are then organized to trace the variation of speed in walking and running conditions depending on the dry surface motion conditions according to literature values [43]. For each database element, the average motion speed on dry surfaces (m/s) is calculated as a function of the related individuals' age, according to the correlation curves given by previous studies on wide databases (see the curves in Figure 1). Hence, two values of dry surfaces average motion speed depending on age are provided for each individual: *vid,a*, according to the curve *a* [46] (see Figure 1a) and *vid,b*, according to the curve *b* [43] (see Figure 1b). These two curves offer different age-related values, and in particular, the curve *b* [43] can be selected to consider precautionary conditions in normal motion since it generally offers speed values lower than the one of the curve *a* [46].

**Figure 1.** Reference curves for speed normalization depending on the individuals' age according to: (**a**) the curve *a*, to derive *vid,a* [46] and (**b**) to the curve *b*, to derive *vid,b* [43].

Then, Equation (1) is applied to derive the individual's normalized speed *v\** (-) according to the two considered curves, in running (*v\*a,r* and *v\*b,r*) and walking (*v\*a,w* and *v\*b,w*) conditions:

$$
\upsilon^\* \iota\_{a,r} = \upsilon\_{\text{t,exp}} \psi\_{\text{id},a\star} \upsilon^\* \iota\_{\text{b},r} = \upsilon\_{\text{t,exp}} \psi\_{\text{id},b\star} \upsilon^\* \iota\_{a,w} = \upsilon\_{\text{w,exp}} \psi\_{\text{id},a\star} \upsilon^\* \iota\_{b,w} = \upsilon\_{\text{w,exp}} \psi\_{\text{id},b} \tag{1}
$$

These values are then organized according to the following modelling approaches, to retrieve according to linear regression models for the prediction of the output in both walking and running conditions:


levels. Male and female samples are shown in the regression model but considered together in the regression model to focus on the main factor influencing human behavior, which is the individuals' height. It is expected that *v\** should be equal to or higher than 1 when the knee is placed outside the floodwaters, especially in running conditions.

In addition, a dimensional age-evacuation speed (m/s) model has been tested for the different *D* values according to a polynomial (4th degree) regression model, which considered the general literature reference curves trend [43,46]. This model allows tracing the differences between the motion curves in no-constraint conditions versus *D*-constraints conditions. Male and female samples are considered together in the regression model. This choice allows: (a) focusing on the main factor influencing the human behavior, which is the individuals' age and (b) comparing this study curves with the ones of reference works. The regression models are tested for the whole sample (average behavior of the tested population) and for the maximum values of the experimental pairs (maximum limit behavior). Meanwhile, minimum values are not considered due to their similarities regardless of the specific *D* values, as demonstrated in the original input database assessment [44]. It is assumed that lower *D* curves could produce speed values higher than the references curves, especially for running conditions.

#### *2.4. Methods for Normalized Speed Variation Modelling Depending on D, Minimum Constraint, and Maximum Excitement Conditions for the Considered Database*

The analysis reported in this section concerns the database subsample including the individuals who performed all the experiments for each *D*. It includes 186 tests performed by 31 individuals (age characterization: mean age of 32 years, standard deviation of 9 years, range from 21 to 66 years; height characterization: mean height of 1.71 m, standard deviation of 0.07 m, range from 1.55 m to 1.83 m; 16 females and 15 males). The analysis of such data is considered to trace the variation of speed in walking and running conditions depending on the maximum excitement conditions (running) with minimum floodwater experimental constraint (*D* = 0.2 m). In fact, it is possible to compare the motion behaviors of considered volunteers for each of the tested conditions.

For each individual, the ideal normalized speed with respect to the maximum excitement-minimum experimental constraint conditions (-) is calculated as in Equation (2):

$$
\upsilon\_{i,r,0.2} = \upsilon\_{r,\text{exp}} \psi\_{r,0.2\_{\varDelta}} \upsilon\_{i,w,0.2} = \upsilon\_{w,\text{exp}} \psi\_{r,0.2} \tag{2}
$$

where *vr,*0.2 (m/s) is the individuals' running speed for *D* = 0.2 m. For this reason, all the *vi,r,*0.2 for *D* = 0.2 m will be equal to 1. The distribution of *vi,r,*0.2 (cm) in respect to *D* tested values is assessed and shown by a boxplot comparison. Linear regression is provided to evidence the general trend of the sample for separated running and walking excitement conditions. Due to the smaller sample dimension, it is assumed that the regressions are evaluated by collecting males and females in a unique sample.

In addition, the linear regressions are also tested for the sub-sample of individuals who moved with the knee inside and outside the floodwater, according to the same procedure. To this end, Equation (3) has been applied to separately retrieve the normalized speed value in running *vi,r,max* and walking *vi,w,max* excitement conditions:

$$
\omega\_{i,r,max} = \omega\_{r,\text{exp}} \delta \upsilon\_{r,\text{max}}, \\
\upsilon\_{i,\text{w},\text{max}} = \sigma\_{\text{w},\text{exp}} \delta \upsilon\_{\text{w},\text{max}} \tag{3}
$$

where *vr,max* and *vw,max* are respectively the maximum running and maximum walking speed. This normalization procedure allows to consider the effective *D* value for which the individual had the maximum motion speed. Hence, the boxplot distribution representation can have *vi,r,max* and *vi,w,max* values equal to 1 also for *D* > 0.2 m.

#### **3. Results**

#### *3.1. Characterization of the Selected Database*

Figure 2 demonstrates that the overall sample can be characterized by a significant linear trend in *K-D* (cm) values. The main classes affected by the possibility to have the knee inside or outside of the floodwater levels are *D* = 0.4 m and 0.5 m. Hence, the results for these classes can be effectively affected by the relative position between the knee and the floodwater surface. Lower *D* classes have many outliers with respect to the others because of the values connected to the child, which have a lower height.

**Figure 2.** Database characterization depending on *K-D* versus the tested *D* values. The boxplot graph includes possible outliers (shown by "+"). The linear regression and the 99% boundaries are shown.

Figure 3 confirms how the male sample individuals seem to move faster in running conditions than the female sample individuals [43]. Anyway, differences between the two linear regression models are slight (about 10% in respect of the male regression coefficient). R<sup>2</sup> values show a moderate correlation trend for both the gender-related sample.

**Figure 3.** Database characterization depending on *vw,exp* and *vr,exp* pairs, for each individual, by distinguishing females (continuous red line) and males (dashed blue line) linear regressions and related samples.

Figure 4 confirms how the increase of *D* implies a slighter difference between running and walking excitement conditions, with a general moderate correlation trend (compare R2 values for the regression models). *D* equal to 0.4 m and 0.5 m evidences small differences in regression trends (5% in respect of *D* = 0.5 m) because the related sample is composed by individuals with knees outside and inside the floodwater level, also according to Figure 2 results. Meanwhile, *D* equal to 0.5 m and 0.6 m seems to additionally provoke the same conditions in *vw,exp* and *vr,exp* pairs, confirming previous works outcomes [44]. In such conditions, the knee and a part of the above leg are generally placed inside the water, by involving a similar physiological and kinematic constraint in human body motion.

**Figure 4.** Database characterization depending on *vw,exp* and *vr,exp* pairs, for each individual, by distinguishing the linear regressions for each of the tested *D*. The regression equations are shown as a function of the respective *vw,exp* for the considered *D* value (subscript expressed in (cm)).

#### *3.2. Speed Variation Modelling Depending on D and Dry Surface Motion Conditions*

Figure 5 resumes the normalized speed variation models according to the two reference curves, for running conditions, while Figure 6 evidences the same model in walking conditions. In general terms, all the provided models confirm how [33,41,43,44]:


the adopted model. Besides, *v\** in running conditions seems to be equal to about 1 when *D* is about 0.3 to 0.5m, which corresponds to the *D* classes for which the individuals' knee is generally closer to the floodwater level, according to Figure 2 statistics;

4. Differences between dry surface motion in literature reference works [43,46] and the adopted database [44] exist (compare the second linear regression coefficient in all the models), by underlining how specific database should be adopted to represent the flood evacuation. Forecasted *v\** for *D* < 0.2 m are beyond the lower limit of the tested *D* conditions and so they could be affected by additional non-linear speed-*D* interferences [33]. Nevertheless, the importance of this result is shown by the *v\** values for the minimum tested constraint conditions (*D* = 0.2 m) in both walking and running conditions.

**Figure 5.** Normalized speed variation for running conditions depending on *D* in respect to dry surface motion conditions, for the whole sample (male and female together): (**a**) according to the curve *a* [46]; (**b**) according to the curve *b* [43]. Linear regressions (dashed lines refer to prevised values outside of the tested *D* range) and boundary conditions (99%) are shown. Outliers are shown in the boxplot representation (marked by "+").

**Figure 6.** Normalized speed variation for walking conditions depending on *D* in respect to dry surface motion conditions, for the whole sample (male and female together): (**a**) according to the curve *a* [46]; (**b**) according to the curve *b* [43]. Linear regressions (dashed lines refer to prevised values outside of the tested *D* range) and boundary conditions (99%) are shown. Outliers are shown in the boxplot representation (marked by "+").

The same results are highlighted by the running and walking models related to male and female separated samples, as shown by Figures 7 and 8. Besides the higher accuracy of the curve *a*-based model in prediction *v\** variation depending on *D*, the comparison between running and walking conditions for male and female subsamples underlines how:


**Figure 7.** Normalized speed variation depending on *D* in respect to dry surface motion conditions, for the male sample: (**a**) for running conditions, according to the curve *a* [46]; (**b**) for running conditions, according to the curve *b* [43]; (**c**) for walking conditions, according to the curve *a* [46]; (**d**) for walking conditions, according to the curve *b* [43]. Linear regressions (dashed lines refer to expected values outside of the tested *D* range) and boundary conditions (99%) are shown. Outliers are shown in the boxplot representation (marked by "+").

**Figure 8.** Normalized speed variation depending on *D* in respect to dry surface motion conditions, for the female sample: (**a**) for running conditions, according to the curve *a* [46]; (**b**) for running conditions, according to the curve *b* [43]; (**c**) for walking conditions, according to the curve *a* [46]; (**d**) for walking conditions, according to the curve *b* [43]. Linear regressions (dashed lines refer to expected values outside of the tested *D* range) and boundary conditions (99%) are shown. Outliers are shown in the boxplot representation (marked by "+").

Figures 9 and 10 confirm the previous results by focusing on the knee-floodwater depth interaction and by representing the normalized speed variation according to the reciprocal knee-*D* position. The database characterization according to *K-D* values is offered by previous Figure 2. The male and female samples are jointly considered to focus the regression model on the individuals' height-related behavioral driver. Firstly, according to Figures 6 and 7 outcomes, the effect of *D* on the human motion is more relevant in running conditions (see the first regression coefficient in all the models). Secondly, *v\** is quite equal to 1 for *K-D* values close to 0 in both running conditions related models. Finally, all the regression models (both using curve *a* and curve *b*) seem to have a more relevant statistical significance in respect of the *v\*-D* ones (R2 generally higher also for curve *b*-based model). All these results confirm the importance of floodwater-knee interaction as the main behavioral driver for the whole sample [44].

It is also worth noting that the general dispersion of forecaster *v\** data is quite small and increases for individuals moving with the knee outside of the water (i.e., *K-D* > 0 m). This result is shown by the reciprocal position of the linear boundary conditions (99%) in respect of the linear regression line (which refers to forecasted median values).

**Figure 9.** Normalized speed variation for running conditions depending on *K-D* in respect to dry surface motion conditions, for the whole sample (male and female together): (**a**) according to the curve *a* [46]; (**b**) according to the curve *b* [43]. Linear regressions (dashed lines refer to expected values outside of the tested *D* range) and boundary conditions (99%) are shown. Male and female samples experimental pairs are shown.

**Figure 10.** Normalized speed variation for walking conditions depending on *K-D* in respect to dry surface motion conditions, for the whole sample (male and female together): (**a**) according to the curve *a* [46]; (**b**) according to the curve *b* [43]. Linear regressions (dashed lines refer to expected values outside of the tested *D* range) and boundary conditions (99%) are shown. Male and female samples experimental pairs are shown.

The *age*-evacuation speed *v* (m/s) models are assessed at the different tested *D* values for the whole sample (average regression) and maximum *age-v* pairs, by using a 4th-degree polynomial regression which has a similar form in respect of the reference curves. Figures 11 and 12 respectively show the results for the overall sample in running and walking conditions, and by considering the comparison between the curve *a* and the curve *b*. Figures 13 and 14 show the regression for the maximum *age-v* pairs. In each excitement condition, considered age being equal, the increase of *D* corresponds to a decrease in predicted motion speed. Differences are more relevant in running tests. The regression models could be affected by the original database characterization in terms of age classes and overall age range [44], as outlined by low R2 values (<0.25) for "average" *age-v* regression line (due to the original database pairs dispersion). Nevertheless, they generally confirm the previous modelling outcomes as well as the results of previous works. Firstly, the general trend of 4th polynomial regression is confirmed in each

excitement conditions, by underlining that elderly's speeds are lower than the one of adult individuals. In addition, it could be evidenced that:


**Figure 11.** Average dimensional age-evacuation speed (m/s) model in running conditions, for the different *D* values according to a polynomial (4th degree) regression model, performed on all the experimental pairs (male and female together): (**a**) compared to the curve *a* [46]; (**b**) compared to the curve *b* [43].

**Figure 12.** Average dimensional age-evacuation speed (m/s) model in walking conditions, for the different *D* values according to a polynomial (4th degree) regression model, performed on all the experimental pairs (male and female together): (**a**) compared to the curve *a* [46]; (**b**) compared to the curve *b* [43].

**Figure 13.** Maximum dimensional age-evacuation speed (m/s) model in running conditions, for the different *D* values according to a polynomial (4th degree) regression model, performed on all the experimental pairs (male and female together): (**a**) compared to the curve *a* [46]; (**b**) compared to the curve *b* [43].

**Figure 14.** Average dimensional age-evacuation speed (m/s) model in walking conditions, for the different *D* values according to a polynomial (4th degree) regression model, performed on all the experimental pairs (male and female together): (**a**) compared to the curve *a* [46]; (**b**) compared to the curve *b* [43].

#### *3.3. Normalized Speed Variation Modelling Depending on D, Minimum Constraint, and Maximum Excitement Conditions for the Considered Database*

Figure 15 traces the normalized speed trends in respect to the minimum constraint-maximum excitement conditions (*D* = 0.2 m; running experiments), depending on *D*, for data concerning running (Figure 15a) and walking (Figure 15b) tests. Although the sample dimension is quite limited (31 individuals for 186 considered tests), the results confirm previous trends discussed in Section 3.2 concerning the speed reduction as a function of *D* increase.

In particular, Figure 15a evidences how the maximum motion speed is higher for lower *D* values in running conditions. Boxplot distribution data for *D* = 0.2 m are connected to the normalization of experimental speed by *vr,*0.2 (hence, all the values are equal to 1), but the maximum *vi,r,*0.2 values decrease while *D* increases. In addition, the median linear regression denotes a moderate relationship between *D* and *vi,r,*0.2, as shown by the R2 value. On the contrary, walking conditions-related speeds *vi,w,*0.2 seems to be less influenced by *D*, as demonstrated by the scattered boxplot values distribution and by the inconsistent R2 value.

**Figure 15.** Normalized speed variation depending on *D*, by normalizing the speed values by *vr,*0.2 (speed of the individual in minimum constraint and maximum excitement conditions within the tested *D* range): (**a**) for running conditions; (**b**) for walking conditions. Linear regressions (dashed lines refer to prevised values outside of the tested *D* range) and boundary conditions (99%) are shown.

Finally, Figures 16 and 17 examine the effect of Knee position in respect to the floodwater depth, respectively in running and walking conditions, by considering the normalization according to the individuals' effective maximum speed within the tested *D* range. Although the regressions generally show a lower statistical significance according to poorer R2 values, it is demonstrated that:

In running conditions, most of *vi,r,max* values are linked to the lowest D values, while walking conditions see a more widespread *vi,w,max* distribution. This outcome evidences the existence of excitement-related issues (i.e., motion effort increasing with constraints increase) in pseudo-evacuation excitement conditions, in respect to normal excitement conditions (pseudo-evacuation conditions can be approximated by running test, while normal conditions by walking tests) [44,47].

• The effect of *D* on an individual's speed is higher while the pedestrian is moving with the knee outside of the floodwater level, especially in running conditions as shown for the sub-sample with the knee inside the floodwater by: (a) the lower R<sup>2</sup> value, (b) the wider range for boundary conditions (99%) regression lines, and (c) the lower first regression coefficient value (only for running conditions of Figure 16).

Such outcomes verify the results reported in Section 3.2 about the normalized speed variation modelling depending on *D* and the dry surface motion condition. Meanwhile, they use data on the experimental *D* range from individuals who performed all the tests (possibility to trace the individuals' response in all the considered scenarios).

**Figure 16.** Effects of knee position (outside and inside the floodwater level) on the normalized speed variation depending on *D*, by normalizing the speed values by *vr,max*. The regressions are shown for running conditions. Linear regressions (dashed lines refer to prevised values outside of the tested *D* range) and boundary conditions (99%) are shown.

**Figure 17.** Effects of knee position (outside and inside the floodwater level) on the normalized speed variation depending on *D*, by normalizing the speed values by *vw,max*. The regressions are shown for walking conditions. Linear regressions (dashed lines refer to prevised values outside of the tested *D* range) and boundary conditions (99%) are shown.

#### **4. Results Discussion in View of Model Implementation in Flood Evacuation Simulators**

The modelling approaches offered by this work represent different solutions for the implementation of quantitative behavioral response aspects (i.e., individuals' speed) in flood evacuation simulators.

Firstly, all of them are based on a microscopic point of view and so they can be applied in such kind of simulation models [10,34,39,48]. According to this microscopic standpoint, they trace the "isolated" pedestrian speed by considering a single environmental driver, which is the floodwater depth and his/her specific features (mainly age, gender, and excitement level). In some terms, it could be defined as the ideal speed at which the evacuees try to tend. Anyway, in a real environment, this ideal speed could be affected by some other behavioral drivers [10,41], such as those described in Section 1.1. Hence, the final motion velocity (considering the speed (that is, the velocity magnitude), its direction, and its verse) of each individual should be calculated by overlapping all these phenomena. In this

context, it is important to underline that some specific effects can provide consistent modifications to the effective motion velocity:


In view of the above, the proposed microscopic standpoint allows increasing the possibility of applying the proposed model to different modelling techniques (e.g., force-driven, velocity-driven, agent-based).

Secondly, most of the proposed modelling approaches also include specific individuals' features, like age, gender and height. Such choice allows considering the possible variation in such parameters within the simulated population, especially while adopting agent-based modelling techniques [21,34].

Thirdly, the modelling approaches focus on the two different excitement conditions for pedestrians [41,43,44]:


In this way, simulators could adopt the related models depending on the assumed excitement conditions (e.g., walking conditions imply the maximization of motion time, moving towards conservative evaluations). At the same time, simulations under different excitement levels could be performed to assess the alterations among them [41].

Finally, in view of the above, Tables 1 and 2 summarize the selected modelling approaches (according to the model significance in terms of fitting performance and description of evacuation phenomena) that can be included in flood evacuation simulators, by distinguishing between:


Finally, the proposed models focus on different drivers of human behaviors by confirming the existence of the main driver in individuals' response while moving in floodwaters [6,9,10,33,44]. Nevertheless, it is worth noting that their statistical significance is strictly connected to the original database adopted by this work.


**Table 1.** Selected modelling approaches based on dimensional dependent variables.

<sup>1</sup> Coefficients are valid for the considered sample; further activities should be performed to check their validity. <sup>2</sup> F refers to female sample, M refers to male sample, A refers to both male and female samples together.

**Table 2.** Selected modelling approaches based on non-dimensional dependent variables. In the regression model column, the expression inside round brackets is *vnorm*.


<sup>1</sup> Coefficients are valid for the considered sample; further activities should be performed to check their validity. <sup>2</sup> F refers to female sample, M refers to male sample, A refers to both male and female samples together. <sup>3</sup> This model considers the original database-related speed and not the reference curve as the multiplication value of *vnorm*.

#### **5. Conclusions and Future Research Remarks**

The flood risk assessments in urban areas should consider the evaluation of hosted community's exposure to possible emergency conditions, especially if considering the possible man-floodwaters interactions that play a pivotal role in immediate emergency scenarios, i.e., in the evacuation process. The effectiveness verification of emergency planning and other risk-mitigation strategies implemented in the built environment (BE) and oriented at direct support of the involved population should take advantage of simulation-based analysis.

This work offers a fundamental step toward this goal by providing general and unified modelling approaches to represent the individuals' evacuation speed (m/s) depending on (a) his/her walking speeds (m/s) and gender, (b) the floodwater depth (m), (c) his/her age (years), and (d) the relative position between the knee height and the floodwater level (m). According to previous research outcomes, these models confirm how the individuals' speed decreases when age and floodwater depths increase, while males generally seem to move faster than females. Moreover, they are generally characterized by moderate or strong relationship confidence degree, guaranteeing their applicability to

modelling purposes. Finally, walking speeds can be evaluated by using inverse correlation modelling based on the proposed ones concerning running speed tests.

Although the used database is significant, different databases could be selected to collect input data and extend the validity of proposed regression models or compare/validate them. The considered original database describes characterization only in terms of floodwater depth. Hence, future research should try including additional significant variables, such as floodwater speed, sediment transport/mud presence. Additional experimental activities are needed to provide significant data to this end.

Besides, the current work is structured by defining regression models by preferring a continuous data existence field (for input and outputs). Nevertheless, widest databases could support the development of multiple linear models to derive the relationship between the moving speed and other factors (e.g., age, flood depth) one by one. Coherence models (e.g., regression tree-based) could be provided to have complete modelling result in a microscopic approach.

The proposed modelling approaches could be introduced in flood evacuation simulators by providing input data, which are more accurate than one of the general-purpose existing databases. Microscopic-based simulation tools can take advantage of this work's outcomes since they assign evacuation quantities to each simulated individual and allow considering specific individuals' features (i.e., age, gender, excitement level). The combination of behavioral simulation with flood spreading simulation in the BE could allow assessing the evacuation timing and so if/how many individuals could be exposed to significant risk depending on the floodwater hazard evaluation over space. Such simulators could additionally include rules for motor vehicle and public transport-based evacuation over time, to evidence which dependencies among effects could emerge about evacuation typologies and pedestrian-vehicle interactions. Finally, they could also support the analysis of First Responders' motion in flood-affected scenarios during the emergency response phases.

From this point of view, risk assessment analyses for different flood events and in different BE layouts will be achieved to trace the effectiveness of risk-reduction interventions. Proposed risk reduction strategies could involve, e.g., early warning systems, interventions on buildings and architectural components in public spaces, emergency planning and management including the possibility to schedule (delay, anticipate, divide per areas) the evacuation starting, interventions for controlling floodwater spreading, and evacuees' wayfinding systems. Furthermore, they could be combined with each other. Simulation-based approaches including the evacuation process will allow evaluating how such public efforts could effectively support efficiency gains also in case of risky conditions directly affecting the population and provoking the necessity to evacuate. The positioning over space of some solutions could be also evaluated.

Meanwhile, they will evaluate which are the minimum strategies that could be introduced in the BE, e.g., by moving towards minimum conditions in: (a) implementation costs (to reduce public efforts by local authorities), (b) management of complexity (to reduce the organizational efforts requested by the First Responders); (c) impact on the BE features (i.e., if considering historical urban heritage). In this sense, such kind of evaluation could be of interest especially in those urban scenarios where pedestrians could be caught in the middle of the flood "by surprise" (e.g., due to possible limitations in warning and communication with the population) as well as where difficulties due to social and economic factors could increase the overall urban risk (e.g., in underdeveloped countries).

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2073-4441/12/5/1316/s1, manuscript-supplementary.xlsx: raw data (Excel file) used as the input reference for the model definition [44].

**Author Contributions:** Conceptualization, E.Q.; methodology, G.B. and E.Q.; validation, G.B. and E.Q.; formal analysis, G.B.; investigation, G.B.; resources, E.Q. and G.B.; data curation, G.B.; writing—original draft preparation, G.B.; writing—review and editing, E.Q. and G.B.; visualization, G.B.; supervision, E.Q.; project administration, E.Q.; funding acquisition, E.Q. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the scientific project "Building Resilience to Flood Impact Deriving from Global Warming in Europe (BRIDGE)" funded by the Polytechnic University of Marche internal program 2017/2018.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

#### *Article*

## **Accounting for Uncertainty and Reconstruction of Flooding Patterns Based on Multi-Satellite Imagery and Support Vector Machine Technique: A Case Study of Can Tho City, Vietnam**

#### **Sastry Dhara 1,\* , Thanh Dang 2, Kajori Parial <sup>3</sup> and Xi Xi Lu <sup>1</sup>**


Received: 31 March 2020; Accepted: 25 May 2020; Published: 28 May 2020

**Abstract:** One of the most frequent natural perils affecting the world today is flooding, and over the years, flooding has caused a large loss of life and damage to property. Remote sensing technology and satellite imagery derived data are useful in mapping the inundated area, which is useful for flood risk management. In the current paper, commonly used satellite imagery from the public domain for flood inundated extent capturing are studied considering Can Tho City as a study area. The differences in the flood inundated areas from different satellite sensors and the possible reasons are explored. An effective and relatively advanced method to address the uncertainties—inundated area capture from different remote sensing sensors—was implemented while establishing the inundated area pattern between the years 2000 and 2018. This solution involves the usage of a machine learning technique, Support Vector Machine Regression (SVR) which further helps in filling the gaps whenever there is lack of data from a single satellite data source. This useful method could be extended to establish the inundated area patterns over the years in data-sparse regions and in areas where access is difficult. Furthermore, the method is economical, as freely available data are used for the purpose.

**Keywords:** remote sensing; flood extent mapping; Can Tho City; Google Earth engine; uncertainty; support vector machine regression (SVR)

#### **1. Introduction**

Floods are one of the most frequent natural hazards affecting the world. Over the years, they have caused a large loss of life and damage to property [1,2]. Due to changing climates, it is observed that the magnitude and frequency of the floods are on the rise. In recent times, some of the events were severe enough to be termed as catastrophic [3]. Such events underlined the importance of estimating the flood risk and its management.

An important prerequisite for flood risk management is precise information on the real extent of inundation. There has been limited experience and recorded evidence available about the spatial extent of extreme floods. Traditional ground survey techniques have certain challenges, like a limited network of monitoring stations in some developing nations, inaccessibility of the area during floods, time consuming and laborious processes. [4,5]. In order to address these challenges of conventional surveys, remote sensing has emerged as an effective and efficient alternative technique. The ability of remote sensing data to provide a synoptic view of a large area has been found useful for identifying

flood inundated areas [6]. It has become critical to use data derived from remote sensing technology along with hydrological models to better manage flooding risk [7].

With the increasing availability of satellite imagery data in the public domain, government agencies may use the remote sensing data for designing flood mitigation measures [8,9]. In the absence of reliable historical records, flood risk modelling helps us to look at possible extreme scenarios, while incorporating the changes in the flood risk profile with the help of remote sensing data [10–14]. Multilateral agencies such as the World Bank and the Asian Development Bank have used the results from flood risk assessments from time to time to explore various disaster risk transfer mechanisms [15,16].

The precise detection of dynamic changes in water levels and land cover change information to use for flood risk assessment depends on the spatial resolution of the image captured [17]. Various remote sensing products offer options in capturing the water boundaries over a large area at a certain point of time. Among them, high resolution satellite imagery data, such as Quickbird and WorldView images are available, but many of these data are not freely available [18]. Freely available optical imagery, such as Landsat Thematic Mapper (TM) and Enhanced Thematic Mapper (ETM+) can provide satellite imagery data at 30 m spatial resolution and an interval of 16 days. Archival records of satellite imagery data for more than three decades are available [19–21]. The longer-archived data and continual capture over the same areas makes the data very handy for mapping water bodies at a large regional scale for a variety of hydrological conditions. Band 7 data from Landsat is useful for distinguishing water bodies from surrounding land mass, with an error rate around 5%. There is another remote sensing satellite platform called Moderate Resolution Imaging Spectroradiometer (MODIS) (NASA, Washington, DC, USA) which includes two key instruments, Terra and Aqua satellites. Terra MODIS captures the imagery over the earth's surface every one to two days, and acquires the data in 36 spectral bands [8,9,15]. While Landsat data have relatively higher spatial resolution compared to data from MODIS, the revisit time of the Landsat sensor is longer than the MODIS. Remotely sensed images captured from the Sentinel-2 A and B optical sensors, launched by the European Space Agency in 2014, were also used in the current study. The revisit time of the Sentinel-2 sensor is about 10 days and the spatial resolution is between 10 to 60 m [10,16].

The Support Vector Machine (SVM) has been earlier used for applications such as land cover classification [22] in Can Tho City. The Google Earth Engine (GEE) became available later, which provides access to data from multiple satellite sensors via a single platform and also facilitates image processing functions which were earlier carried out in specialized remote sensing software (such as ERDAS/ENVI). The potential of GEE for analyzing flooded areas from the study area (Can Tho City) has not been studied much and this paper sheds light onto this important area. It is critical that such methods are explored in data scarce regions in SE Asia. Furthermore, some of these urban regions (such as Can Tho City) have experienced high economic growth in the recent years due to infrastructure development and migration from neighboring rural areas. There has been a scarcity of historical flood information, such as inundated area from ground surveys. In addition to establishing a procedure to retrieve the available historical information from GEE platform, our paper also proposes the use of SVM Regression, a relevant machine learning technique to address the spatial and temporal uncertainties in such regions. We believe that our paper proposes an important and economical solution which can be extended to establish historical flooding patterns for many SE Asian regions where such information is lacking.

Flooded water can be delineated using image processing techniques [11,12] on different satellite images, in the midst of increasing cloud cover. In order to gain the maximum benefit from satellite images, it is important to combine information from different kinds of sources, so that problems with one method can be overcome by employing the other methods [13,14]. However, advances in machine learning allow us to combine data from different satellites, thus providing an important option to minimize uncertainties in data resulting from different spatial and temporal resolutions. Here, we hypothesize that the use of an advanced machine learning technique of Support Vector Machine

Regression (SVR) can help identify the water-inundation areas over time when using satellite imagery with different spatio-temporal resolutions [23–25]. SVR also helps when incomplete data are available, thus enabling a more complete set of readings to be included into any further modelling. SVR can predict non-linear relationships and therefore used in the current context. The objective of this study is to minimize the uncertainties related to different spatial resolutions from different satellite platforms, by using SVR to reconstruct the flooding extents.

In this study, we used imagery from Landsat, MODIS and Sentinel-2 satellite sensors. These satellite platforms were chosen because of several reasons: 1) availability of data for long periods 2) data being available free of cost and 3) often two to three datasets would be available within the period from September to October (flooding season in Can Tho City) each year.

#### **2. Materials and Methods**

#### *2.1. Study Area*

The study area is Can Tho City, which is in the center of the Vietnamese Mekong Delta. Regions of the Vietnamese Mekong Delta are prone to high to extreme risk from floods [26–29]. The city is 75 km far from the Vietnamese East Sea (Figure 1). Can Tho City is one of the five cities centrally administered in Vietnam. It is the largest city in the Vietnamese Mekong Delta and an important base of increasing commerce, industry and transportation network [30,31]. As of April 2019, the city has a population of 1,235,171 [32] with a natural land area (in 2017) of about 1439 square km [33].

**Figure 1.** Lower Mekong Basin (**left figure**) in South-East Asia; Study Area Can Tho City (highlighted in the **right figure**).

The socioeconomic development goals for 2020 and 2030 of the Vietnamese Mekong Delta see the increasingly significant role of Can Tho City in achieving the growth objectives [29] Can Tho City is set to be the regional centre for many upcoming and developing industries, primarily based on the vast arable land and water bodies. These industries include aquaculture, fishing and food processing. Figure 2 below shows the entrance to Tra Noc industrial zone in Can Tho City which houses several industrial firms. The increasing outputs from these industries in Can Tho City make the city a key contributor to the food security in the Mekong Delta region. Apart from these industries, Can Tho City plays an important role in the transportation connectivity via its many waterways, road network and the international airport. Due to the development of infrastructure as well as the industries, Can Tho City has attracted a lot of migrants in search of jobs and favorable economic engagement. As a growing city, it has witnessed a lot of construction activities in the recent times which resulted in changes in land use patterns. Figure 3 shows the ongoing construction activity of an industrial establishment. However, some of these economic growth related activities are hampered by the frequent natural disasters such as floods [34,35].

**Figure 2.** Tra Noc Industrial Zone in Can Tho City. The zone is home to industrial firms producing electronic, food processing, automotive, construction materials and pharmaceutical products.

**Figure 3.** An industrial building under construction in Can Tho City.

Analysis of flood risk in Can Tho City can be adapted to larger sections of South and South-East Asia since several of the geographical features with relation to proximity to major rivers and ocean as well as the weather patterns are noted in several developing cities in South and South-East Asia. Can Tho City faces threats from the three kinds of flooding: fluvial, pluvial, and tidal [36].

Upstream hydropower development along the Mekong may also cause hydrological alterations [37], but the impact on the delta is limited, with the change in flooding extents being slow to develop once they reach the lower parts of the delta. Importantly, the circumstances of the city are constantly changing and this brings about challenges and uncertainties in flood risk analysis and management.

Given the importance of Can Tho City to the economy, indeed to several aspects of the Vietnamese Mekong Delta, it is a good place to understand the development and maintenance of flood mitigation and management strategies in place. Can Tho City faces multiple challenges such as the increase in sea level and river runoff due to climate change, urban runoff caused by imperviousness and potential intensification of extreme rainfall due to microclimatic shifts [38]. Floods in Can Tho City cause losses to the city's infrastructure and result in an adverse impact to the livelihoods of citizens. Some of the reasons for the flooding in Can Tho City include the lack of robust flood-prevention system and rapid urbanization [39].

Can Tho City is also an ideal place to study the effect of uncertainties in current flood risk assessment, which can eventually lead to strategies for tackling these uncertainties. In time, strategies developed in an accessible—but developing—city, can be adapted to larger sections of South and South East Asia.

Figure 4a–d below capture some of the scenes observed in the streets of Can Tho City during the floods in the year 2014. Due to the varied processes via which flooding can occur in the Mekong Delta, the Vietnamese government is attempting to provide plans to mitigate damage from flooding [29]. One way to do this would be to assess inundation patterns from the past. There is a paucity of on-the-ground data regarding flood extents in these areas. This has led to greater reliance on data gathered from remote sensed data. Remotely sensed images from satellites are used to reconstruct historical flood extent maps in several nations [40,41]. With the different kinds of satellites in use for accessing these data, reliability and usability of such data should be studied. One of the ways would be via examining the potential of freely available satellite imagery for the study area. Here, we used the capacity of the Google Earth Engine (GEE), a web-based platform, to analyze the flooded areas from three different satellite imagery (Landsat, MODIS and Sentinel-2). Each of these three satellite image sensors has their own advantages and disadvantages (e.g., different spatial and temporal resolutions). A range of potential flooded areas are provided by each of these sensors, and SVM was used to reconstruct the flooding patterns.

**Figure 4.** (**a**–**d**) Photographs showing the different inundated areas in Can Tho City during the floods in the year 2014 (Courtesy: Dr Dunja Krause, United Nations Research Institute for Social Development (UNRISD, Geneva, Switzerland).

#### *2.2. Methodology*

The below steps were followed to conduct the study. Further description about the data and techniques are provided in Sections 2.2.1–2.2.3, 2.3 and 2.4.


#### 2.2.1. Google Earth Engine

Google Earth Engine (GEE) is a web-based cloud computing platform capable of storing multi-layer catalog of satellite images and geospatial datasets within GEE's Public Data Catalog [42–44]. In addition to archival, within the GEE script-environment these large datasets can be analyzed for identifying the changes, mapping variations, and variances on the surface of the earth. Since GEE is a cloud-based service, there is no necessity to download and analyze imagery in the traditional manner and thus this saves time.

Phongsapan and colleagues studied the potential of GEE platform in deriving an operational flood risk index covering Myanmar [45]. Uddin et al. outlined the utility of GEE in using multi-temporal Sentinel-1 Synthetic Aperture Radar (SAR) satellite imagery in capturing the flood inundated areas in Bangladesh [46]. Sidhu and team and Celik and team also used GEE for identifying the land cover changes in urban areas in Singapore [47] and Ankara [48], respectively.

Nguyen and colleagues analyzed several Landsat satellite imageries to examine the changes in built-up area in Can Tho City between the years 1998 and 2018. However, GEE was not used to retrieve and analyze the Landsat imagery in this case [49]. Goldblatt et al. earlier demonstrated the application of GEE for identification of boundaries of urban regions (i.e., built-up areas) in India [50]. Review of studies such as the above illustrate that there is a gap in research and utilization of GEE for research work in Can Tho City (study area), more so in the important area of flooding risk. The important idea of utilization of historical satellite imagery via GEE, an efficient archival and computational platform in Can Tho City was introduced in our paper.

Researchers wishing to use the service neither need to be familiar with nor need to use specialized remote sensing software such as Environment for Visualizing Images (ENVI) and Earth Resources Data Analysis System (ERDAS) Imagine for regular image processing functions. As with other open source software, several algorithms are provided by other users, thus providing a greater expertise in analyzing different datasets than would be possible for a single person or team working together. Bulk downloads and memory use can be avoided since work can be performed online. In this study, datasets related to Can Tho City from three publicly available platforms, Landsat (5, 7 and 8), Sentinel-2 and MODIS were obtained and analyzed using the Google Earth Engine.

#### 2.2.2. Extraction of Water Body Extent

The Normalized Difference Water Index (NDWI) [42,51,52] is an index derived from the satellite imagery using the Near-Infrared (NIR) and Green wavelengths. NDWI can capture the water body presence from remote sensing imagery via separating non-water related features. Reflected near-infrared radiation and green light are used in the calculation since this helps in excluding vegetation and land, while improving water detection [52]. NDWI is calculated with the formula as shown below:

$$\text{NDWI} = \frac{\text{Green} - \text{NIR}}{\text{Green} + \text{NIR}} \tag{1}$$

where the Green is the band which captures reflected green light and the NIR represents the nearinfrared radiation.

By using green wavelengths, the typical reflectance of water features is maximized, the low reflectance of NIR by water bodies is minimized and the high reflectance of NIR by vegetation and soil features from the land is maximized. The outcomes from the above index are that water bodies have positive values while using multispectral imagery which has a reflected green band and an NIR band. Soil and vegetation features have zero or negative values due to the typical high reflectance of NIR than the green light [52].

During the processing of satellite imagery, an appropriate threshold was applied based on the presence and extent of cloud cover in the imagery. For example, a threshold value of 0.1 was applied for imagery with high cloud coverage whereas a threshold value of 0.3 was applied for the satellite imagery with relatively less cloud coverage [53]. This helps in capturing even very low values of NDWI which may result from high cloud coverage or the absence of water bearing pixels. When the water bodies are clearly visible, the increase in threshold value beyond 0.3 does not have much effect; hence a threshold value of 0.3 is good enough for imagery with clear water body presence and less cloud cover.

#### 2.2.3. Data (Landsat 5,7,8, Terra MODIS and Sentinel-2)

#### Landsat

The Landsat programme is a joint venture of United States Geological Survey (USGS), National Aeronautics and Space Administration (NASA) and National Oceanic and Atmospheric Administration (NOAA), and has been in use since 1972 [20,21]. For most Landsat imagery, the temporal resolution (sensor revisit time) of Landsat is 16 days. For Landsat imagery that includes multi-spectral and thermal data, spatial resolution is mostly about 30 m. Landsat data are available in the GEE in its raw form, surface reflection, Top Of Atmosphere (TOA)-corrected reflection, and other ready-made products, such as NDWI, NDVI and EVI indices. Landsat sensor, however, cannot capture the data when there is a cloud cover. From Figure 5a below, it can be noted that parts of the study area, covered with clouds are captured as white patches. This posed issue with derivation of accurate inundated area during certain periods. Scanline collector failure of the Landsat 7 Enhanced Thematic Mapper in 2003 led to data gaps with 22 percent of the coverable area missing. Figure 5b below illustrates this issue. Images acquired in these two cases (with cloud cover and scan line cover) introduce some uncertainty in the inundation area captured. Figure 5c,d below show the false color Landsat images acquired during dry and flood season respectively.

**Figure 5.** False color Landsat imagery showing the study area during (**a**) cloud cover (**b**) scan line collector failure (**c**) dry season (**d**) flood season.

#### MODIS

The Moderate Resolution Imaging Spectroradiometer (MODIS) was built by Santa Barbara Remote Sensing facility and was launched into the orbit by NASA during the year 1999 on board the Terra (EOS AM) Satellite, and in 2002 on board the Aqua (EOS PM) satellite [15,54]. These instruments are capable of imaging the surface of the Earth every 1 to 2 days, at 250 m, 500 m and 1 km resolutions. Data are captured in more than 35 spectral bands, ranging from <1 μm to nearly 15 μm. Changes and processes occurring in the oceans and on land can be captured by MODIS. Figure 6a,b show the False color Terra MODIS images acquired during dry and flood seasons respectively.

**Figure 6.** False color Terra Moderate Resolution Imaging Spectroradiometer (MODIS) image showing the study area during (**a**) dry season and (**b**) flood season.

#### Sentinel-2

Sentinel is a group of satellites managed by the European Space Agency (ESA, Paris, France) under the Copernicus program in 2014 [10,16,55]. Here, we used high-resolution optical images from

Sentinel-2A and 2B, which together have a temporal resolution of about 5 days and a spatial resolution of 30 to 60 m. Figure 7a,b show the false color Sentinel images showing the study area captured during dry and flood season respectively.

**Figure 7.** False color Sentinel-2 image showing the study area during (**a**) dry season and (**b**) flood season.

#### *2.3. Data Time-Period*

In this study, data from the years 2000 to 2018 were collected for the study area from Landsat (5,7,8) and Terra MODIS. We attempted to collect one data point from the data from each month in this time period (total of 228 possible months). Certain prerequisites needed to be maintained when remotely sensed images were collected for analyses of flooding extent, like less than 20% cloud coverage (0% if possible, since floods occur during the rainy season) of the images. Landsat provided spotty images between 2003 and 2014, due to the partial failure (scan line collector failure) of the Landsat 7 satellite [56,57]. Additionally, the Sentinel-2 satellite was launched recently, and data are only available from the year 2015 onwards.

Here, we used the classification of pixels in the remote sensing image into those containing water and those not containing water by using the NDWI. This was undertaken for all the available and usable images from Landsat, MODIS and Sentinel-2 sensors. Of the possible 228 data points from each type of sensor, data from 82 months is available from Landsat and data from 214 months is available from MODIS. For Sentinel, data are only available from September 2015, and of the possible data from 40 months, we obtained data available from 27 months.

While analyzing data from different platforms, discrepancies are noted in the inundated areas obtained from each platform. We therefore used support vector regression (SVR), a non-linear regression technique, to reconstruct flooding patterns.

#### *2.4. Support Vector Machine Regression (SVR)*

SVM has earlier been used mainly as a supervised classification technique while analyzing satellite imagery [58]. It is a supervised, non-parametric classification algorithm, used for linear binary classification of data points [23,59]. Support vector refers to data points that fall along the border of the margin of separation. As with any supervised classification algorithm, SVM requires a training data set. However, with SVM, a small training set is enough and can provide good classification accuracy. For every data point, a decision is made whether it is far from the support vector. The distance from the support vector determines the margin of classification. It classifies each new datapoint from the testing set without making assumptions as to the classification, i.e., data classification is not dependent on assumption of Gaussian distribution and not based on nearest neighbors. SVMs also provide a balance between data overfitting and underfitting, even when limited training samples are used. This method is often used to for images obtained from multi-spectral satellite sensor images.

The potential of Support Vector Machine (SVM) as a supervised machine learning classifier in deriving the flooded area from Landsat imagery has been studied by Ireland and colleagues [60]. Syifa et al. utilized SVM and Artificial Neural Network (ANN) classification techniques to derive the map showing the flooded area after the collapse of Brumadinho dam wall in Brazil during January 2019 [61]. However, the SVM technique can be applied for both classification and regression. In the current study, we applied the principles of SVM for regression, called Support Vector Machine Regression or simply Support Vector Regression (SVR). Gizaw and Gan applied SVR technique for the estimation of regional flood frequency covering two study areas in Canada [62]. Chen and Yu adopted SVR to develop a model to provide a deterministic flooding stage forecasting for Lang-Yang river in Taiwan [63]. SVM and SVR have become effective machine learning substitutes to Artificial Neural Networks (ANNs), finding more applications in flood prediction [64]. These methods have been applied in several flood prediction studies with better performance results compared to ANN and Multiple Linear Regression (MLR) techniques. These methods were applied involving data such as flood time series, extreme rainfall and streamflow [64,65].

For the training set, data from homogenous areas in the collected images were utilized. Here, SVR was used to adjust for the missing data from several months. When data from at least one sensor are available, SVR can generate a composite value for the inundated area. The major reason for using SVR is the different spatial and temporal resolutions for the three satellite imagery we analyzed here. While Landsat and Sentinel-2 imagery provide relatively high spatial resolution, they have lower temporal resolution. In addition, SVR can better predict non-linear predictor/predictand data as used here, and which cannot be well predicted via linear regression methods.

In total, we analyzed 83 Landsat images, 214 MODIS images and 27 Sentinel images. Where available, for each Landsat or Sentinel image, we found a corresponding MODIS image and there were 105 image pairs. We bootstrapped these pairs with the training and the testing ratio of 7:3, 100 times. The average value of coefficient of determination (R2) for 100 rounds of this was 0.85 for training, and 0.72 for prediction. By using this method, we were able to reconstruct integrated flood extent values from MODIS data using the SVR model constructed from the image pairs. Packages e1071 and boot were used for these analyses, performed on R [66–69].

#### **3. Results**

#### *3.1. Capturing Inundated Areas Using GEE*

In this study, "ground-truth" information from the field survey was not available for the study area under consideration. This issue is not uncommon for flooding risk and this often results in scarcity of quantitative validation [70]. To address this challenge, we collected 200 random high-resolution Google Earth images (100 water and 100 non-water images). The Kappa Coefficient of Landsat, MODIS and Sentinel-2 accessed from GEE were 0.87, 0.75 and 0.89 respectively. These Kappa index values demonstrate strong (Kappa value > 0.8) to moderate (Kappa value > 0.7) correlation between inundated areas as determined by satellite imagery from GEE and an external source, thus allowing for the use of satellite images and the corresponding thresholds in classifying the inundated areas/water bodies. The lower K value of MODIS compared to the two others may be caused by its lower spatial resolution.

#### *3.2. Correlation and Di*ff*erences Between Data from Di*ff*erent Satellite Sources*

In this study, data from different satellites were utilized to derive the extent of water inundation in Can Tho City. As an initial step, it is necessary to assess the degree of agreement between the flood extents derived from the different satellite images. As such, we performed Pearson's correlation between data from Landsat and MODIS images. The data are highly correlated, with a Pearson's r of 0.84 (Figure 8a). We also performed Pearson's correlation between Landsat and MODIS data separately for the low inundation period (from the months of February to July) and the high inundation period (August to January). For this process, we collated the Landsat data from the months of February to July throughout the study period and correlated it with the MODIS data collected from February to July during the study period. A similar correlation was performed on data collected from August

to January. The results showed that the Pearson's correlation coefficient was low during the low inundation period (r = 0.15, low inundation period). However, the correlation was high during the high inundation period (r = 0.79, high inundation).

**Figure 8.** Correlation of inundation data between different sensors. Monthly inundated area captured from one sensor is plotted on the *x*-axis and the other is plotted along the *y*-axis. (**a**) Left picture: Landsat and MODIS Sensors; (**b**) Middle picture: Landsat and Sentinel-2 Sensors and (**c**) Right picture: MODIS and Sentinel-2 Sensors.

Similarly, Pearson's correlation analysis was performed between data from Landsat and Sentinel-2 images (Figure 8b) and MODIS and Sentinel-2 images (Figure 8c). These data are also highly correlated, with a Pearson's r of 0.91 and 0.80, respectively. Limited data are available from Sentinel-2, since it was launched more recently than the others. Its correlation with the Landsat data is very high, given the similarly high spatial resolutions of both these data.

A major reason for the differences between the data collected from the three sensors is the differences in spatial resolution of the data collected. Additionally, cloud coverage, especially during the times of high inundation, is another important factor. High inundation happens during the rainy season, and therefore completely avoiding cloud cover would be nearly impossible. Nevertheless, the great degree of correlation informs us that over a long period of time, data from different sensors can similarly inform us about the water-inundated area over several years.

Unfortunately, the "ground-truth" information from the field survey is not available for the study area under consideration. This issue is not uncommon for flooding risk and this often results in scarcity of quantitative validation [70].

#### *3.3. Seasonal Variations in Water Levels Observed by Di*ff*erent Sensors*

As stated in the previous section, the correlation between Landsat and MODIS data was lower during the months of low inundation. To assess the agreeability between the different sensor data, we also plotted the time course of the Landsat, MODIS and Sentinel-2 data over the full time period of analysis, from 2000 to 2018. We observed that the data collected from the different sensors show some differences. However, the results from all the sensors demonstrate increase in the inundation area during the months of increase in rainfall and decrease in months when there is decrease in rainfall (Figure 9).

**Figure 9.** Seasonal variations in water levels observed by different sensors Landsat (red), Terra MODIS (blue) and Sentinel-2 (pink).

#### *3.4. Use of Support Vector Machine Regression (SVR) for Reconstructing Inundation Areas*

The correspondence of inundation areas derived from Landsat and Sentinel-2 imagery is observed to be high (Figure 8b), as can be expected based on the high correlation (Pearson's r = 0.91). Hence, we used the Landsat and Sentinel-2 data to train the MODIS data using the SVR approach.

Reconstructed values were obtained for inundation during all the months when images from MODIS were available. Inundated area values obtained from the sensors and reconstructed data are plotted in Figure 10 below. The MODIS data from the 105 image pairs are trained from the corresponding Landsat/Sentinel-2 data (Figure 10a, red dots) via SVM regression and the reconstructed data derived (Figure 10a, blue dots). There is a reduction in the variation in the reconstructed values compared to those from MODIS data (Figure 10b).

**Figure 10.** Reconstructed data from Support Vector Machine Regression (SVR) analysis (**a**)**.** Comparison of image pairs used for Support Vector Machine (SVM) regression analysis and reconstructed data. The 105 image pairs used to train the SVM regression model are plotted in red and the SVR results are plotted in blue. The *x*-axis indicates original inundated area in sq km derived from Terra MODIS and *y*-axis indicates inundated area data in sq km from Landsat/Sentinel-2 (for the red dots) and from reconstructed data (for the blue dots); (**b**)**.** Box plots showing the inundated area in square kilometers from Terra MODIS sensor (left) and the reconstructed value (right). The *y*-axis shows inundated area (MODIS data and reconstructed values) in sq km.

When the reconstructed data are plotted against time, the pattern of the data shows increases in the water-inundated area during the wet months and decreases during the dry months, when water flows only in the waterways, as is seen with the original data (Figure 11). This training approach ensures that data from more than one sensor can be used to generate a composite model of the inundation areas during all the months when data from at least one sensor is available.

**Figure 11.** Reconstructed area of flood inundation using support vector machine regression (black).

#### *3.5. Trend of Inundation Areas in Can Tho City*

During the years of the study, it was noted that the total inundation area has steadily decreased in Can Tho City. To better observe this, we obtained the average monthly inundation every year using the reconstructed SVR data (Figure 10). In the graph shown below (Figure 12), the average monthly inundated area (in square kilometers) reconstructed using Support Vector Machine technique is plotted on the *y*-axis against the respective years from 2000 to 2018 on the *X*-axis. The seasonal Mann-Kendall test was used to test this. The Mann-Kendall Tau was −0.1003, with a *p* value of 0.03. This indicates that the inundation decreased overall during the time period of the study. After the year 2000, the Vietnamese government invested more in the dyke system to increase agricultural production (rice) in Vietnamese Mekong Delta (VMD). This resulted in a general decrease in flooding area over the years. A similar pattern has been reported in studies conducted in the other parts of VMD, such as the Long Xuyen Quadrangle (LXQ), the Plains of Reeds (POR) and An Giang province of the Vietnamese Mekong delta [34]. Though it is not a direct validation, the similarity in the trends with the other regions provides the consistency in the effects after the investment in dyke systems from the year 2000.

**Figure 12.** The reconstructed average monthly inundation area in Can Tho City from 2000 to 2018. Inundation is decreasing over time similar to that observed in other parts of the Vietnamese Mekong Delta [34].

#### **4. Discussion**

In this study, we analyzed images of Can Tho city from three different satellites (Landsat, MODIS and Sentinel-2) with varying temporal and spatial resolutions, and used a non-linear regression method—Support Vector Machine Regression—to integrate all the data an derive one single value. Here, we show that the correlation between data from different satellites is high, allowing for the training of the SVM to obtain inundation data for the period of 19 years, from 2000 to 2018. We also observe that the inundation area has been steadily decreasing over the years. Our results provide a proof-of-concept that machine learning algorithms like SVM can be successfully applied to combine data from different satellites to obtain one seamless inundation result that can be used for flood risk management. The results show that this can be completed with the use of freely available data and cloud computing platforms, thus allowing economically viable solutions for rapidly developing regions in South East Asia, such as those exemplified by Can Tho City.

Several studies have been carried out on flood risk assessment, via the use of satellite imagery. However, use of satellite data leads to the introduction of uncertainties. This is because of the different spatial and temporal resolutions of the satellite data. In addition, the failure of Landsat scan line corrector from May 2003 adds another layer of uncertainty, leading to lack of data for a certain period. These uncertainties are the epistemic uncertainties seen in flood risk assessment, that are introduced due to a lack of sufficient data and knowledge [71]. These uncertainties can be reduced by combining data from different satellites as we did in this study, and by using the power of machine learning to resolve these issues, which is a step in the right direction. While it is tempting to compare the data from our current study to historical flood extent data, this was not carried out, because the historical flood extent data are obtained from Landsat data, and thus cannot be compared here.

Here, we also showed that GEE can be used successfully to capture the inundated areas. The cloud computing capacity of GEE can be used to analyze massive amounts of data in a relatively short period of time. One other advantage is that the GEE platform provides data from different satellites under one platform. While GEE has mostly been used to detect changes in land cover, here, we used the engine to analyze data related to inundated regions. From the reconstructed average monthly inundation between the years 2000 and 2018, it is noted that there is a general decreasing trend. The possible reasons for this decreasing trend include urban expansion (i.e., urban built up area) and increase in the local flood protection systems [22,72]. Furthermore, more water is transferred to the middle of the Vietnamese Mekong Delta due to the upstream flood prevention systems and sea level rise. [27,73,74]. There is some uncertainty in the inundated area capture due to the cloud cover in some satellite imagery and the operation of sluice gates. However, the effect of this uncertainty on the long-term inundation pattern is minimized by using the data from three different satellites (Landsat, MODIS and Sentinel-2) and SVM regression approach.

The temporal resolution of MODIS satellite sensor is around two days, whereas the satellite revisit periods of Landsat and Sentinel-2 are 16 and five days respectively. Furthermore, Sentinel-2 data only became available from September 2015. Landsat sensor provided spotty images between 2003 and 2014. Data from Terra MODIS are therefore more frequently available than the other two sensors, although its spatial resolution is relatively low. Hence, the data were available for certain months from the three different sensors, but there are cases where the inundated area data are available only from MODIS. We used the SVM technique to integrate the three techniques to output a single inundation value, a reconstructed inundated area figure when two or three different figures are available.

Satellite remote sensing imagery derived data have been increasingly recognized as a critical source for identifying land features and changes in the land cover from time to time. There are several freely available satellite data sources and the user community, particularly engaged in flood risk management, could benefit from this open source data. Whilst there are differences in the temporal and spatial resolutions from different data sources, appropriate techniques could be applied to derive optimal advantage from this disparate sets of data. For example, advanced data science techniques like machine learning could be employed to "fill-in" the data gaps while using the available records from multiple sources. This information is quite useful in the sense that it helps provide an indicative reference information based on scientific methods. During floods, it is not uncommon that a robust mechanism is not always available or possible to capture the "ground-truth" or actual flood extent map [75]. This issue is prevalent in many Asian territories, including Vietnam. The reasons for this include the difficulties in logistics of field surveying, and issues with reaching inaccessible areas during or immediately after a flooding event [5].

In the absence of ground-truthing data, alternative methods such as the usage of crowdsourcing data (photographs), development and deployment of mobile applications for automatic inundated area detections and analyzing the information shared via social media platforms (e.g., geotagged images) can be considered [76]. The role of people living in the flooded areas is crucial for information dissemination about the flood event. They can help in verifying the data collected and contribute to the reports when flood detection sensors are not available.

The contribution from citizens and development of integrated platforms to collect and analyse the data can help in improving the granularity of the data, such as both the temporal and spatial resolutions, and can be an effective form of the validation of data from other sources like remote sensing satellites. Some good examples of such initiatives include WeSenseIt, Ground Truth 2.0, and FloodCitiSense projects funded by European Union [77]. The accretion of crowdsourcing data for understanding flooding events is vital, even more so when the events occur in urban regions. The limitations in the temporal resolution from remote sensing sensors can be overcome as the crowdsourcing information can be made immediately accessible. Furthermore, flood depth information can also be collected at various places of interest at regular intervals to augment the data from other sources [78].

The recent improvements in remote sensing and geographical information systems (GIS) technologies have paved the way for further innovations in natural disaster risk management, particularly in flood risk assessment, damage assessment and planning [79]. There could be newer machine learning methods or improvisations to potentially enhance the SVR going forward. Further studies in this direction are important.

#### **5. Conclusions**

Flooding in Can Tho City can cause huge economic losses and impact many lives. In this study, a cost-effective solution for the study of flood risk via the usage of freely available satellite imagery (Landsat, MODIS and Sentinel-2) was proposed. Google Earth Engine, a powerful satellite imagery data archival and computing platform, was used to derive the areal extents of the inundated area for every month over a period between the years 2000 and 2018. Support Vector Regression (SVR), a supervised machine learning algorithm has been used to reconstruct the water-inundated areas. This reconstruction allowed for the filling of the data gaps while considering the data uncertainties from different satellite sensors with various temporal and spatial resolutions. This also helped establish a relationship for identifying the flooding pattern over the years. The inundated area pattern allows the users to examine the trends in flooding and assess the flood risk. This is a useful technique which could be extended to several other SE Asian cities and provinces which are prone to flooding.

Remote sensing can provide spatially continuous data, unlike point measurements, from gauging stations, and this is especially useful in the case of long duration floods [80]. Satellite revisit time allows for flood monitoring at regular intervals without a physical presence at the location of flooding, especially when access is difficult. In addition, the remote sensing data are easy to access, and are rapidly processed. However, these data carry some limitations; the quality and applicability is subject to the spatial resolution and satellite repeat visit time. There is also scope for errors in identifying the flooded area based on the image processing algorithm used. If the revisit time of the satellite is longer, then the monitoring is not possible at short intervals of time [14]. Nevertheless, the advantages of these data far outweigh the disadvantages and with improved techniques and sensors, these disadvantages can be overcome.

In this study, we provided a proof-of-concept, showing that using machine learning algorithms can help integrate data from three different satellite sensors to derive one single integrated value, providing economically more feasible methods in regions of the world undergoing rapid growth. A small number of data pairs can be sufficient to do so, especially when techniques such as bootstrapping are utilized along with support vectors, as was carried out here. Since a greater number of MODIS images are available due to its relatively higher temporal resolution, we propose that Landsat and/or Sentinel-2 data can be used to train MODIS to derive the composite inundation values. In this way, the advantages of data from greater temporal and spatial resolution may be integrated by the researchers and other practitioners. Here, all the data and analytical tools we used are freely available. We hope that others can use similar methods to access data in places where field data are not easily available, but where the assessment of flood risk is critical for planning and development purposes.

**Author Contributions:** Conceptualization, S.D.; Data curation, S.D.; Formal analysis, S.D. and T.D.; Investigation, S.D.; Methodology, S.D. and T.D.; Project administration, S.D.; Resources, S.D.; Software, S.D. and T.D.; Supervision, X.X.L.; Validation, S.D.; Visualization, S.D., T.D. and K.P.; Writing—original draft, S.D.; Writing—review & editing, S.D., T. D. and K.P. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Acknowledgments:** We acknowledge the support from Van Pham Dang Tri, Can Tho University and Pham Thi Mai Thy, Vietnam National Space Centre for their support in connecting with the relevant research personnel and sharing the local knowledge. Thanks to Dunja Krause, United Nations Research Institute for Social Development (UNRISD) for allowing to use her pictures. We appreciate the support from Nguyen Hanh Quyen and her team at Asian Disaster Preparedness Centre, Thailand for sharing useful inputs related to Google Earth Engine. We finally thank the Google Earth Engine development team and community for their support through their forum.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **An Integrated Approach for Assessing Flood Risk in Historic City Centres**

#### **Tiago M. Ferreira 1,\* and Pedro P. Santos <sup>2</sup>**


Received: 9 May 2020; Accepted: 7 June 2020; Published: 9 June 2020

**Abstract:** Historic city centres near watercourses are a specific type of urban area that are particularly vulnerable to flooding. In this study, we present a new methodology of flood risk assessment that crosses hazard and physical vulnerability information. We have selected the Historic City Centre of Guimarães (Portugal), a UNESCO Heritage Site, for developing and testing the defined methodology. The flood hazard scenario was obtained through the hydrologic–hydraulic modelling of peak flows with a 100-year return period, which provided flood extent, depths, and velocities. A decomposition of the momentum equation, using depth and velocity, allowed reaching a final hazard score. Flood vulnerability was assessed through combining an exposure component and a sensitivity component, from field-collected data regarding wall orientation, heritage status, age, number of storeys, condition, and material of buildings. By combining the results of the hazard and vulnerability modules in a risk-matrix, three qualitative levels of flood risk were defined. The individual and crossed analysis of results proved to be complementary. On one hand, it allows the identification of the more relevant risk factors—from the hazard or vulnerability modules. On the other hand, the risk-matrix identified other buildings with a high risk that otherwise would remain unnoticed to risk managers.

**Keywords:** urban flood; flood risk assessment; risk management; Historic City Centre of Guimarães

#### **1. Introduction**

The dizzying rate of urbanisation over the last few decades has led to an exponential increase in the magnitude of losses caused by natural and technological hazards worldwide. According to the risk-related literature, these losses can be understood as the consequence of a certain level of risk associated with a specific community or society over some specified time period [1]. In a broader sense, disaster risk is a compound concept determined by the combination of the hazard, exposure, and intrinsic vulnerability. Moreover, because vulnerability and hazard may change over time, disaster risk is highly dynamic [2,3]. For this reason, it is fundamental to have ways of monitoring this risk and, when necessary, supporting the definition and implementation of risk mitigation actions.

As a result of their heritage value and high physical vulnerability, historical centres are particularly critical areas and, therefore, deserve particular attention. According to the literature [4,5], in the last few decades, the impact of flooding events in historic centres has indeed increased steadily. As reported by Miranda and Ferreira [6], examples of recent high-impacting flooding events in historical centres include those that occurred in Central Europe, back in 2002 and 2010, in South Asia, in 2007 and 2008, and the New Orleans flood of 2005, caused by Hurricane Katrina. In a recent attempt to study this phenomenon, Marzeion and Levermann [7] investigated the number of cultural heritage sites worldwide that, due to global warming, are at risk of being flooded in the next two millennia. The results of this study are quite conclusive, pointing out that approximately 6% of current UNESCO sites (about 40 sites) will be flooded, particularly in China and India.

The present article addresses this challenge by discussing the application of an integrated flood risk assessment approach, which combines flood hazard and building vulnerability indicators to identify and classify risk and to narrow intervention priorities. The Historic City Centre of Guimarães (Portugal)—a World Heritage Site inscribed by UNESCO since 2001 due to its extraordinary authenticity and well-preserved condition—is explored here to illustrate the application of the methodology. After modelling the flood hazard using the hydrologic–hydraulic method and evaluating the flood vulnerability of the buildings by resorting to a simplified vulnerability assessment method, we provide a comprehensive analysis of the outputs, both in an individual and integrated manner. Finally, we used a risk-matrix approach to aggregate these hazard and vulnerability outputs and categorise the buildings into three qualitative levels of risk.

More than the results themselves—which are indeed very important for the Municipality of Guimarães, but eventually will be of little relevance to the reader—the interest and the innovation of this paper lies in how this integrated flood risk assessment approach (which results from the original combination of two already existing methods) can be used to individualise and guide intervention decisions. It is this analysis that, we believe, can be of great interest and direct relevance to the reader's own practice.

#### **2. The Historic City Centre of Guimarães**

Located in northern Portugal, the city of Guimarães is usually referred to as the cradle of the Portuguese nationality [8]. The Historic City Centre of Guimarães is a World Heritage Site, deemed as such in 2001. According to the classification committee, the Historic City Centre of Guimarães is an exceptionally well-preserved and authentic example of the evolution of a Medieval settlement into a modern town, see Figure 1a. The architectural characteristics of the Historic City Centre of Guimarães, marked by a rich diversity of construction typologies associated with different evolution periods of the place, as well as its unity and integration with the landscape setting, compounds outstanding universal values.

**Figure 1.** The Historic City Centre of Guimarães: (**a**) View to one of its squares; (**b**) Example of a flood event in the Couros' river basin (source: Guimarães City Council).

According to Miranda and Ferreira [6], in recent decades, the city of Guimarães has been subjected to intense anthropogenic pressure as a result of a steady increase of urban and industrial occupation. This has led to the depreciation of the Couros river basin, which was once a central part of the population's life and a vital element for the development of the leather industry [9]. As a result, the levels of pollution and contamination in the Couros creek, as well as the severity of flood events in the Historic City Centre of Guimarães, have increased substantially, Figure 1b.

The coexistence of the aspects mentioned above makes the assessment of the flood risk in the Historic City Centre of Guimarães a particularly challenging and relevant work, constituting, therefore, the justification for the selection of this case study.

#### *Past Flood Events in the Historic City Centre of Guimarães*

The Couros river basin (11.2 km2) is part of the Selho river basin (67.7 km2), which is a sub-basin of the Ave river basin (1390 km2) which drains to the Atlantic Ocean. The Couros river (5.6 km length) is thus a small left tributary of Selho river, located entirely in the Municipality of Guimarães.

Water availability soon became a relevant driver of human occupancy along the Couros river valley that crosses the Historic City Centre of Guimarães.

According to the DISASTER database [10], there are 10 flood occurrences in the municipality. This database covers the period from 1865 to 2015 and includes only the most severe cases of flooding (i.e., when at least one casualty, injured, missing, displaced, or evacuated person is registered in the data sources). In the study area, along the Couros river, there are two flood occurrences of the DISASTER type, both occurring during the flood event of 28 February 1978: in the elder residential structure "Lar dos Santos Passos", where 23 persons were evacuated; and a few hundred meters downstream, at the trucking station, where three families were evacuated.

Despite this apparently modest record of severe losses, the dense artificialisation and imperviousness of the basin—particularly in the Historic City Centre—has been causing frequent episodes of flooding with only material consequences and economic losses. The most recent occurred on 20 December 2019, which was associated with the Elsa atmospheric depression.

#### **3. Materials and Methods**

To better explain the methodological framework adopted in this work, this section presents and discusses the fundamentals of the applied flood risk assessment method. As schematically illustrated in Figure 2, the approach is composed of two main modules, the hazard and the vulnerability module. The bases of these two modules are detailed in Sections 3.1 and 3.2. Finally, Section 3.4 introduces the study area and the approach adopted in this research to collect, manage, and explore data.

**Figure 2.** The conceptual framework of the integrated flood risk assessment approach.

#### *3.1. Hazard Module*

The flood hazard was assessed using the hydrologic–hydraulic method. The assessment process involved the acquisition and preparation of geometric data, the estimation of the peak flow, the hydraulic modelling, and the GIS post-processing and mapping.

In the first step, we obtained the geometric data, representing the morphological features of the floodplain from a base map at a scale of 1:1000, contour lines 1 m equidistant, and a dense cover of mass points. Such data allowed us to create a detailed digital elevation model (DEM) with 1 m cell size, covering the entire valley of the Couros river, oriented E-W and crossing the Historic City Centre of Guimarães, illustrating how significantly modified by human intervention the natural morphology of the study area has been. The DEM features a topographic obstruction crossing the floodplain– and the Couros river—with an orientation N-S. At this section, flow occurs through a hydraulic passage of 30 m length, with a rectangular shape of 4.0 m × 3.5 m, which was geometric and hydraulically modelled as a bridge. Roughness was represented through the Manning's *n* value: 0.025 in the channel and 0.05 in overbank areas.

Flow data were estimated for the 100-year flood, based on the results obtained from the hydrologic study of Ramísio, Duarte, and Vieira [11,12], in which peak flows were obtained using empirical, kinematic, and statistical methods. The adopted 100-year peak flow results from a simple average of the estimates presented in the cited study, regarding four methods. Three of the methods are kinematic: the Giandotti, the rational method (using Kirpich concentration-time), and the rational method (using the Chow concentration-time). As for the rational method, we used the rainfall Intensity-Duration-Frequency (IDF) curve parameters from a rain gauge station located 45 km ENE of the study area, with a time series of 25 years. Although the gauge is located 45 km from the study area, it represents the same climatological and pluviogenic context found at the Couros river basin. Although the rain gauge is located at an elevation of 95 m, its 10 km radius has a mean elevation of 331.4 m, which is circa 70 m above the mean basin's elevation of 258.8 m. Considering the mountainous context where both areas are located, such a difference can be considered as not relevant. Of the several rainfall durations for which those parameters are valid, we have adopted the interval 30 min to 6 h, which frames the concentration-time found in the upstream sub-basin that drains to the modelled reach—30.8 min and 71.7 min according to the Kirpich and Chow formulas, respectively. The rainfall duration is assumed to equal the concentration-time obtained from the simple average of the applied methods. The runoff coefficient C was estimated from the slope angle, land use data, and the degree of imperviousness [11]: 0.56 in the upstream basin and 0.73 in the basin crossed by the modelled stream of the Couros river.

The fourth method is a statistical one, designated as the Loureiro's formula. This method uses the area of the basin and regional parameters, valid for the Portuguese context only, calculated from peak flow data fitted to the Gumbel distribution. The average of the four estimations was used from the Ramísio, Duarte, and Vieira [12] study in two sections: one, defined at the upstream inlet (35.95 m3/s, corresponding to a contributing basin of 3.75 km2); the other, estimated at the downstream outlet of the modelled reach (54.15 m3/s), the difference (18.2 m3/s) of which we have distributed along the 1786 m length of the modelled reach, proportionally to the drainage area of 11 pour points, Table 1.


**Table 1.** Estimated flow data for use in the hydraulic model of the Couros river reach, in the Historic City Centre of Guimarães.

By adopting this approach, we have intended to consider the effect of the complex sewer system that drains the small urban basins directly to the Couros river along the modelled reach.

Pre-processing of geometric data was performed at the HEC-RAS 5.0.7 environment [13], using the RAS Mapper tool. The modelled reach goes further downstream from the area to which the vulnerability of buildings was available. This is explained by the existence of the mentioned obstruction near the study area, leading us to model a flood pathway long enough to allow a proper elongation of floodwaters downstream of the obstruction. A normal depth slope of 0.015 m/m was used to define the downstream reach boundary conditions. A steady flow water surface computation was initiated at the upstream boundary using the 35.95 m3/s peak flow, to which the mentioned 11 flow change locations were sequentially added. In the modelling plan, we selected a mixed flow regime and requested the optional floodplain mapping.

Finally, we sequentially exported the HEC-RAS modelling results to SDF, XML, and GIS formats. Velocity and depth data were exported as GRID files (ESRI raster format) and the flood boundary as shapefile (vector polygon), both for the 100-year flood.

#### *3.2. Vulnerability Module*

Flood vulnerability is assessed through the application of the simplified flood vulnerability assessment methodology, which was initially proposed by Miranda and Ferreira [6]. This methodology consists of the evaluation of two vulnerability components, an exposure and a sensitivity component (Figure 2), which, through an index, quantify the vulnerability of the building to flood inundation.

As presented in Figure 3, the Exposure Component is composed of one single parameter (Wall Orientation), which evaluates the influence of the orientation of the main façade wall of the building to the water flow. This parameter intends to bring together the multiple aspects behind this phenomenon: the location of the building, the orientation of its main façade wall, and the existence of openings, knowing that buildings located in low-lying areas are theoretically more susceptible to inundation due to runoff. Regarding the Sensitivity Component, this focusses on the physical characteristics of the building by evaluating its Material characteristics, Condition (or conservation state), Number of Storeys, Age, and Heritage Status. It is worth noting that these indicators were defined from a comprehensive review of analogous indicators proposed for assessing similar building typologies and structural characteristics under similar conditions [6].

**Figure 3.** The schematisation of the vulnerability module.

We obtained the values of the exposure (EC) and the sensitivity (SC) components, which was done based on the vulnerability classes identified in Figure 3 (where A = 10, B = 40, C = 70, and D = 100), and individual flood vulnerability (FV) can be obtained using Equation (1).

$$\text{EV} = \text{EC} \times \text{SC} \,. \tag{1}$$

For simplicity's sake, the values of these three indices (EC, SC, and FV)—which range, respectively, from 10 to 100, 50 to 500, and 500 to 50,000—are normalised to fall within the range of 0 to 100; the lower

the value of the index, the lower the level of exposure, sensitivity, or vulnerability of the building. According to Miranda and Ferreira [6], despite the fact that this approach was primarily intended for assessing the flood vulnerability of single buildings, its simple formulation makes it particularly suitable to evaluate large urban areas, such as historic city centres. It is precisely in this context that we apply this methodology herein.

#### *3.3. Risk Matrix*

Finally, flood risk is computed from the combination of the hazard and vulnerability results obtained by using the above-presented approaches. This is done through a vulnerability–hazard matrix, which relates each building's vulnerability with the level of hazard to which it is exposed (see Table 2).


**Table 2.** Flood risk matrix.

Regarding the vulnerability level, this is measured directly by the flood vulnerability index. For such, 20 and 40 were conservatively defined here as plausible threshold values for "moderate" and "high" vulnerability, respectively. Although this criterion may be debatable, it is important to note that the value of 40 (the boundary value between "moderate" and "high") is frequently used in index-based vulnerability assessment approaches as a threshold for high vulnerability, see, for example, [14]. Concerning the level of hazard, this is obtained by combining flood velocity (*v*) and water depth (*y*) results according to the criterion given in Equation (2):

$$\begin{cases} \quad v < 2 \text{ m/s} \vee y \times v < 3 \text{ m}^2/\text{s} \triangleq \text{Low Hazard} \\ \quad v > 2 \text{ m/s} \wedge \not\equiv 3 \text{ m}^2/\text{s} < y \times v < 7 \text{ m}^2/\text{s} \triangleq \text{Moderate Hazard} \\ \quad v > 2 \text{ m/s} \wedge y \times v > 7 \text{ m}^2/\text{s} \triangleq \text{High Hazard} . \end{cases} \tag{2}$$

This criterion was proposed originally by Clausen [15] based on empirical data. According to Kelan and Spence [16], a physical meaning for this condition is related to momentum = mass × *v* = *pw* × volume × *v* = *pw* × horizontal flood area × *y* × *v*. If *pw* is constant in this hypothesis, the horizontal flood area can be considered constant, thus leaving *y* × *v* as the variable.

#### *3.4. Study Area, Building Assessment, and Implementation of a Geographic Information System (GIS) Tool*

The study area includes nine blocks with 116 buildings in total. Each one of these buildings was comprehensively assessed onsite to collect the data required for applying the flood vulnerability assessment approach, by using a detailed checklist. After being digitalised and systematised, the data that were contained in the checklists were manually inputted into a spreadsheet database to create a digital record and to automate some later steps of the work.

Once all the indices and flood indicators were computed, the results were plotted and analysed spatially using the open-source Geographic Information System software QGIS [17]. Geo-referenced graphical data (i.e., vectorised information and orthophoto maps) and specific information related to the hazard and the characteristics of the buildings were combined within the software to obtain firstand second-order outputs. In this case, each polygon (corresponding to a building) is associated with several features and attributes, allowing for their visualisation, selection, and search.

Because this GIS tool can efficiently combine highly-relevant hazard, vulnerability, and risk outputs in a very flexible and dynamic environment (easily updatable or modified at any time), it is undoubtedly a significant asset for risk management purposes, allowing the local authorities to define more consequent risk mitigation strategies.

#### **4. Results and Discussion**

Before getting into the integrated risk assessment outputs in Section 4.3, it is worth exploring the individual hazard and vulnerability results (in Sections 4.1 and 4.2, respectively), which we define as first-order results.

#### *4.1. Hazard Modelling*

The flood modelling process described in Section 3.1 allowed us to obtain a broad set of primary hazard indicators, which, alone, give us good insight into the potential magnitude of a flood event in the study area. As already noticed in Section 3, we focused on three primary hazard indicators: flood extent, velocity, and water depth.

As for the flood extent, we found that for the adopted 100-year peak flow scenario, the study area was significantly affected by the flood. As illustrated in Figure 4, 41 out of the 116 buildings considered in this analysis are potentially affected by the flood.

**Figure 4.** Flood inundation map for the adopted 100-year peak flow scenario.

Of these 41 buildings, 23 are affected to the fullest extent, whereas the other 18 are partially flooded, see Table 3. In absolute numbers, about 11,000 m2 of a total of about 32,000 m2 of built-up area are affected, which corresponds to about 34%.

**Table 3.** Number of affected building distributed by ranges of the affected extent.


Concerning the flood velocity, presented in Figure 5, it ranges between 0.01 m/s and 5.72 m/s. The average velocity value at the surface of the 41 buildings affected by the flood is about 2.15 m/s—with a standard deviation value (STD) of 1.68—being that 22 of these 41 building are exposed to surface velocities higher than 2 m/s. As can be observed in Figure 5, there are two blocks that are particularly

affected: one located in the northeast zone (15 buildings) and the other roughly at the central zone of the study area (7 buildings).

**Figure 5.** Flood velocities resulted from the adopted 100-year peak flow scenario.

As for the water depth, from our hazard analysis, it was found that for the considered 100-year peak flow scenario, the buildings will expectably be exposed to an average depth of about 1.20 m (STD = 0.97). As illustrated in Figure 6, 31 out of the 41 buildings affected—which is slightly more than 75%—present a water height of more than 0.5 m, which is a very significant value. The maximum water depth value was obtained at the triangular shape building located at the southwest zone of the study area (4.58 m), see Figure 6.

Table 4 summarises the above-discussed results, presenting the absolute and relative number of potentially affected buildings for different ranges of flood velocity and water depth.


**Table 4.** Number of affected buildings distributed by ranges of flood velocity and water depth.

**Figure 6.** Water depth resulted from the adopted 100-year peak flow scenario.

#### *4.2. Flood Vulnerability*

From the application of the flood vulnerability assessment approach detailed in Section 3.2, we found that 60% of the 116 buildings evaluated present a flood vulnerability value (FV) between 0 and 30. As presented in Figure 7, the remaining 40% present values ranging between 30 and 100. Statistically, the dataset has an average value of 25.70 (STD = 15.95).

Although flood damages cannot be estimated directly from these vulnerability values, the distribution shown in Figure 7 seems to demonstrate that a significant proportion of the buildings assessed are potentially very vulnerable to a flood event. We will provide more insight into this in Section 4.3.

When exploring vulnerability results, it is often relevant to dive into the analysis of the specific parameters of the methodology. Such analysis allows for a better understanding of the vulnerability sources, which is a fundamental prerequisite to defining more consequent and effective risk mitigation measures. Figure 6 presents a set of six maps associated with the spatial distribution of the parameters that compose the flood vulnerability index.

A comprehensive discussion of the maps gathered in Figure 7 would be of little interest. However, we find it relevant to highlight some main results.

Firstly, the general conservation state of the buildings: As it is apparent in Figure 7c, in general, the buildings are either in good condition (Vulnerability Class A, 34%) or have minor conservation issues (Class B, 46%). Most of these minor conservation issues are related to small cracks or coating decay. Further, 20% of the buildings are in poor condition, presenting a significant cracking and moisture phenomena.

Secondly, the distribution of the number of storeys, in Figure 7d: This aspect is especially relevant since, according to several authors [18,19], the number of storeys has a direct influence on the vulnerability of the building to flooding. We will go into further detail regarding this when discussing the second-order risk results. For now, let us emphasise that 113 out of the 116 buildings evaluated have between 1 and 3 floors, distributed as follows: 38% are single-storey buildings, 33% have two storeys, and 27% have three.

Finally, the Heritage Status, in Figure 7f: This is a relevant aspect in the sense that buildings with heritage value deserve particular attention when it comes to risk assessment. Thus, regarding this aspect, 59% of the buildings assessed are ordinary buildings (i.e., non-classified). Then, 38% are currently in the process of classification and only 3% correspond to classified buildings.

**Figure 7.** Mapping of the spatial distribution of the vulnerability classes: (**a**) Wall Orientation; (**b**) Building Material; (**c**) Condition (or conservation status); (**d**) Number of Storeys; (**e**) Building Age; (**f**) Heritage Status.

When getting into the discrete exposure, sensitivity, and vulnerability results, it is possible to gain a much better understanding of the overall vulnerability of the buildings. As illustrated in Figure 8a,b, respectively, spatial analysis reveals a scattered distribution of exposure and the sensitivity indicators over the study area. Comparatively, it is also clear that the exposure values are generally higher than sensitivity values. Despite this, and although there is no correlation between these two indicators, it is interesting to see that some of the most exposed buildings are also those that revealed to be more sensitive. This cross analysis is presented in Figure 9a, where we identify the buildings to which the exposure and the sensitivity values are cumulatively higher than 40. For vulnerability reduction purposes, these buildings are the most critical ones.

**Figure 8.** Spatial distribution of the exposure (**a**) and the sensitivity (**b**) indicators.

**Figure 9.** Cross analysis of the exposure and the sensitivity indicators (**a**) and spatial distribution of the vulnerability index results (**b**).

Figure 9b presents the distribution of the vulnerability index results. Although we have already commented on the most significant aspects when addressing the exposure and the sensitivity results, it is worth emphasising two further points. First, the fact that a significant part of the most vulnerable buildings is located in the central part of the study area. As we will have the opportunity to prove afterwards, this aspect may be essential, taking into account that these more vulnerable buildings are located coincidentally within the most hazardous area. Second, the fact that some of these buildings are abandoned. Keeping the social aspects out of the discussion, buildings' abandonment is one of the main factors of rapid degradation and, as a result, increased physical vulnerability. During the discussion of the risk results, we will provide further insight into these critical issues.

#### *4.3. Integrated Risk Assessment*

After having discussed the main outputs resulting from the hazard and the vulnerability analysis, we are in an excellent position to integrate these results in order to obtain more comprehensive flood risk indicators. As detailed in Section 3.3, this integration will ultimately result in a risk matrix that correlates the level of vulnerability with the level of hazard to which each building is exposed. However, before going into such an outcome, we would like to analyse a set of second-order results obtained by crossing some of the above-discussed indicators.

#### 4.3.1. Second-Order Analysis

The first second-order result that we think is of great interest is the joint analysis of the water depth, the exposure component of the vulnerability index, and the conservation state of the buildings. Before going into the result, it is worth justifying the selection of these particular indicators. Water depth

is recognised as the most relevant hazard indicator when analysing the impact of flood actions on buildings [15]: depth is usually used to produce vulnerability curves associated with flood events, which is also known as depth-damage curves [20–22]. In fact, although some studies note the importance of flood parameters other than depth [16], those are barely analysed so comprehensively. Equally important is the fact that water depth is directly related to flow velocity and, therefore, with the effects of the hydrodynamic actions. The impact of these hydrodynamic actions on the buildings are, of course, very dependent on their level of exposure—a building of which its façade wall is perpendicular to the direction of the water flow is potentially much more affected than another where its façade is parallel to the direction of the flow. As referred to in Section 3, this is exactly what the sensitivity component of the flood vulnerability index seeks to evaluate and that is why it is considered in this second-order analysis. It is also known that the weaker the state of conservation of the building, the higher the impact of the flood (due to hydrostatic and hydrodynamic actions). This fact rationalises the inclusion of this aspect in the present analysis.

Justifying the three indicators considered herein, Figure 10 presents the map resulting from their integrated analysis. We want to highlight a couple of interesting conclusions from the analysis of this map. The first one is the identification of highly-exposed buildings (i.e., with an EC value higher than 40) that, for the adopted 100-year peak flow scenario, are subject to a water depth over 2 m. Based on this first criterion, it is possible to identify six buildings—about 5% of the building stock within our study area—which are highlighted in red in Figure 10. However, when including the conservation state into the analysis, this number can be further reduced to 5 (about 4% of the building stock), making this result even more informative for decision-making purposes.

**Figure 10.** Joint analysis of water depth, exposure, and condition indicators.

Another second-order result that we find worthy of particular discussion herein is the joint analysis of the water depth and the sensitivity component of the flood vulnerability index methodology. Since this component assesses the intrinsic characteristics of the buildings that make them sensitive to the impact of flood actions, it is undoubtedly relevant to consider these two indicators together. We would like to note that the conservation sate of the buildings is already part of the sensitivity component, which is why we now choose not to consider this aspect explicitly. Figure 11 presents the maps resulting from this analysis.

**Figure 11.** Joint analysis of water depth and the sensitivity indicator.

As can be inferred from the analysis of Figure 11, the number of buildings identified from the joint analysis of the water depth and the buildings' sensitivity (for the very same 100-year peak flow scenario) is reduced to 2, which is a little less than 2%. It is also interesting to note that these two buildings are among the five already identified in Figure 10, which, from a decision-making standpoint, pushes them to the top of the intervention priorities.

#### 4.3.2. Matrix-Based Analysis

After the above preliminary second-order analysis—through which we have already gained a deeper understanding about the potential impact of the considered flood scenario in some particular buildings—we got to the point of combining the obtained hazard and vulnerability results into a single flood risk indicator. As a result, the level of hazard and vulnerability associated with each building are related through a risk matrix, wherein the vulnerability is inputted directly using the flood vulnerability index (FV) and the hazard is derived from the velocity and water depth results, using Equation (2).

Thus, before diving deeper into the results obtained from the flood risk matrix, it is useful to map and analyse the spatial distribution of the hazard and vulnerability results. These maps are presented in Figure 12a,b. According to the criterion adopted in this matrix-based analysis, the level of flood hazard throughout the study area is quite homogeneous, see Figure 12a. Further, 22 out of the 116 buildings assessed (about 19%) were identified as having a "Moderate" flood hazard, whereas all the remaining were labelled with "Low" hazard. In terms of spatial distribution, it is possible to identify two blocks that can be particularly affected. Let us notice that although the hazard is being evaluated differently in this section—here, velocity and water depth are combined into a single hazard indicator—this result is in essential agreement with the discussion provided in Section 4.1. As for the spatial distribution of the flood vulnerability results, given in Figure 12b, it is visibly more heterogeneous: 37.9% (44), 41.4% (48), and 20.7% (24) of the buildings were identified as having a "Low", "Moderate", and "High" flood vulnerability, respectively.

**Figure 12.** Spatial distribution of the flood hazard (**a**) and flood vulnerability results (**b**).

Contextualising the hazard and the vulnerability results, it remains for us to present and discuss the final flood risk results. They are illustrated in Figure 13 and quantified in Table 3.

**Figure 13.** Spatial distribution of the flood risk results.

As shown in the flood risk matrix provided in Table 5, 66.4% of the buildings (77) were identified with "Low" flood risk, 29.3% (34) with "Middle" risk, and 4.3% (5) with "High" risk for the same 100-year peak flow scenario considered in this work.


**Table 5.** Flood risk matrix.

Complementarily to the main conclusions drawn from the second-order analysis discussed in Section 4.3.1, this outcome allows us to recognise some additional buildings that, because the hydrodynamics effects of the flood have been disregarded in that analysis (only water depth was considered explicitly), are not identified there. In fact, none of the building previously highlighted in Figures 10 and 11 are identified in this final analysis as having a "High" flood risk. If in a less thoughtful consideration these two results may seem divergent, in fact, they prove to us the importance of considering different criteria and approaches to assess flood risk in urban areas. Still, in this regard, we would like to stress that these results are obviously conditioned by the criterion used to define the levels of hazard and vulnerability. If, for example, we had used a different criterion from that given by Equation (2)—which, as is always the case, was proposed by Clausen [15] from a set of specific conditions—the results could be very different. This said, the five buildings identified in this analysis, together with those five identified in Section 4.3.1, should be the priority targets of future flood risk mitigation programs.

#### **5. Conclusions**

This paper presents a new method for assessing the flood risk of the built environment, specifically adapted to areas classified by their heritage value. For the effect, the Historic City Centre of Guimarães (Portugal) was selected as a study area.

We have defined a hazard scenario for the 100-year flood by hydraulic modelling with HEC-RAS, of which its outputs are flood extent, depth, and velocity. Such outputs, derived from a steady flow analysis, must assume the limitation of not providing information that ultimately would allow for a better understanding of the flooding process, namely that from a flood hydrograph (time to peak and duration of inundation), an unsteady analysis would result. The vulnerability has been assessed by considering an exposure component (based on wall orientation) and a sensitivity component (based on heritage status, age, number of storeys, condition, and material of buildings). They define the hazard and vulnerability modules of a GIS tool, which later provided a cross-analysis of information, culminating in a flood risk matrix. A total of 116 buildings were evaluated with the developed methodology.

The first insight into each module's results provided a detailed understanding of flood risk roots or causes, which gives decision-makers and planners information on which risk factors are more relevant in each block or building. This highlighted wall orientation and condition as the most concerning aspects, with exposure being more relevant than sensitivity in explaining physical vulnerability. The second-order analysis evaluated evidence risk contexts that otherwise would go unnoticed, namely a) the analysis of flood depths, conservation state (from the sensitivity component of FV), and wall orientation (from the exposure component of FV) and b) the overlay of flood depths with the sensitivity component of FV. Complementarily, the risk-matrix analysis identified other buildings as high risk, some of them not coincident with the individualised analysis of risk components (hazard and vulnerability).

It is the ability to characterise the potential impacts of hazardous processes that allows us to prepare and promote the necessary formal and informal changes, which, ultimately, will contribute to risk reduction [23]. In that sense, we expect the generated knowledge to be applicable in different fields of flood risk management.

If the individual first-order results are used, resources' assignment will prove to be more efficient in addressing the particular risk factors identified as more relevant in each building. Combined with the risk-matrix classification, the results are capable of informing municipal decision-makers and planners regarding the intervention priorities of urban rehabilitation projects. Civil protection agents will be capable of planning efficient and safe evacuation routes, in case of flash flooding. The business sector will be able to prepare for recurrent flooding with minor consequences, defined by some as nuisance flooding. Finally, medium- to long-term strategies of spatial planning and design can be drawn from the risk-matrix results: the eventual relocation of buildings classified as high risk; the retrofitting of their physical characteristics in order to reduce their physical vulnerability while maintaining their functionality, if possible, even during flood events, in the areas of low risk. Intermediate contexts of flood risk require, in terms of spatial and design planning, pondering the

range of possible interventions, specifically those that better combine urbanity and safety in distinct degrees of flood adaptation [24].

When coupled with social vulnerability data, the provided risk assessment will significantly contribute to increasing the resilience of the built environment. The historical centres of cities represent places of high sensitivity of their exposed elements. In addition to their heritage and cultural value, the mandated authorities need to consider the functions that these buildings provide, both for the resident population and the transient population.

The analysis we have presented, although focused on flood risk, can be replicated with some adaptation, concerning other hazard processes, not exclusively of hydro-geomorphologic origin. Moreover, provided that the evaluated buildings can be grouped typologically, we believe that this framework can be easily applied in larger-scale risk assessments, keeping a very reasonable balance between accuracy and applicability. In that case, blocks of buildings or even entire neighbourhoods can be used as the basic assessment units. An interesting example of using neighbourhoods as the basic assessment unit to evaluate fire risk can be found in [25].

**Author Contributions:** Conceptualisation, T.M.F. and P.P.S.; methodology, T.M.F. and P.P.S.; hazard analysis, P.P.S.; vulnerability analysis, T.M.F.; investigation, T.M.F. and P.P.S.; writing—original draft preparation, T.M.F. and P.P.S.; writing—review and editing, T.M.F. and P.P.S.; funding acquisition, T.M.F. and P.P.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** Tiago M. Ferreira is funded by the Portuguese Foundation for Science and Technology (FCT) through the postdoctoral grant SFRH/BPD/122598/2016 and Pedro P. Santos is funded through the project with the reference CEEIND/00268/2017.

**Acknowledgments:** The authors would like to acknowledge the City Council of Guimarães, particularly the Architect Filipe Fontes, for his generous support and contribution to the development of this work.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article*
