*Article* **Regional Inundation Forecasting Using Machine Learning Techniques with the Internet of Things**

### **Shun-Nien Yang and Li-Chiu Chang \***

Department of Water Resources and Environmental Engineering, Tamkang University, New Taipei City 25137, Taiwan; aa22814946@yahoo.com.tw

**\*** Correspondence: changlc@mail.tku.edu.tw; Tel.: +886-2-26258523

Received: 23 April 2020; Accepted: 29 May 2020; Published: 31 May 2020

**Abstract:** Natural disasters have tended to increase and become more severe over the last decades. One preparatory measure for coping with future floods is area-specific flood forecasting, which allows the people affected to be warned and damage to be reduced. Machine learning (ML) techniques have a great capability to model the nonlinear dynamic features of hydrological processes, such as floods. Internet of Things (IoT) sensors are useful for monitoring natural environments. This study proposes an ML-based flood forecast model that predicts the average regional flood inundation depth in the Erren River basin in southern Taiwan and uses IoT sensor data as input factors, so that the model can be continuously revised and the forecasts can be closer to the current situation. The results show that adding IoT sensor data as input factors can reduce the model error, especially under high-flood-depth conditions, where underestimation is significantly mitigated. Thus, the ML model can be adjusted on-line, and its forecasts can be visually assessed against the inundation levels measured by the IoT sensors, which improves the model's accuracy and applicability in multi-step-ahead flood inundation forecasting.

**Keywords:** machine learning model; Internet of Things (IoT); regional flood inundation depth; recurrent nonlinear autoregressive with exogenous inputs (RNARX)

#### **1. Introduction**

Floods are among the most disruptive natural hazards: they cause significant damage to life, agriculture, and the economy and have a great impact on city development. Floods now tend to become more frequent and more severe as the climate changes and as cities undergo rapid urbanization while their infrastructure ages. Early notification of flood incidents helps the authorities and the public to devise preventive measures, prepare evacuations, and provide relief to flood victims. Flood forecasting and the timely warning of the people affected are therefore precautionary measures that reduce damage and loss of life. Flood forecast models have been developed over the last decades. Among them, physically based models have been commonly used and have shown great capabilities for flood estimation, but they often require hydro-geomorphological monitoring datasets and intensive computation, which hinders short-term prediction [1–3]. Statistical models, such as multiple linear regression (MLR) [4–6] and the autoregressive integrated moving average (ARIMA) [7–10], are also frequently used for flood modeling. Nevertheless, their capability for short-term forecasting is limited because the nonlinear dynamics of storm events reduce the accuracy and robustness of statistical methods [11].

Machine learning (ML) models have a great capability to model nonlinear dynamic features and have been widely used in hydrological problems, such as predicting the sewage level in sewers [12], the arsenic concentration in groundwater [13], or flood levels [14–19]. Among the ML models, nonlinear autoregressive models with exogenous inputs (NARX) [20] can adaptively learn complex flood systems and have been reported as valid for flood forecasting [21–26]. Relevant literature has also applied them to regional flood forecasting. For instance, Shen and Chang [27] established a NARX model for flood forecasting in flood-prone areas in Yilan County, Taiwan, and demonstrated that it tolerates errors and can effectively suppress error accumulation when forecasting 1–6 h ahead. Chang et al. [28,29] proposed combining the self-organizing map with the recurrent nonlinear autoregressive model with exogenous inputs (SOM-RNARX) for multi-step-ahead regional flood forecasting and indicated that the method could effectively forecast regional floods and improve the accuracy and reliability of flood management systems. The majority of these ML models have used rainfall as the input for regional flood inundation forecasts. A drawback of these models is that they rely mainly on measurements from rain stations, so they cannot be sequentially adjusted because observed inundation depths are lacking. Consequently, most existing models could not properly respond to a sudden flood, and their flood forecasts could not be verified. Furthermore, the forecasts were based on present data only, which restricts how far ahead flood inundation depths can be determined. There is therefore a research gap from the perspective of both ML modeling and the data monitoring system. In light of this, this study analyzes the use of monitored inundation depth data gathered from urban areas to forecast flooding, with a view to updating the model on-line and reducing the residuals between model outputs and observed inundation depths.

Internet of Things (IoT) sensors are a useful means of monitoring rivers and other natural environments. They have attractive features: they are simple to install, consume little energy, and measure with high precision. The integration of a large number of IoT sensors can provide comprehensive on-line information for effective environmental monitoring and forecasting [30–34]. Recently, several studies have applied IoT and big data [35–38] to flood forecasting. For instance, Chang et al. [39] proposed an intelligent hydro-informatics integration platform that integrates ML models with sensor data for flood prediction. Sood et al. [40] proposed IoT-based smart flood monitoring to predict floods and flood levels. Mishra et al. [41] combined IoT and deep learning to identify blockages of ditches or drainage channels from images. IoT has lately become one of the vital development projects in Taiwan. The authorities' IoT centers (e.g., that of the Water Resources Agency) can analyze the data, provide suitable flood countermeasures, and rapidly transmit the data to regional control centers. The integration and application of various sensors can offer a large amount of real-time monitoring data (e.g., inundation depths) for updating and testing ML models.

This study uses IoT sensor data to construct a machine learning-based embedded flood forecast model for predicting the average flood depth in a river basin. The IoT sensor data are sequentially fused into the ML model as extra input factors, so that the forecast model can be continuously updated and assessed and its results can be closer to the current situation. This paper is organized as follows. The next section describes the proposed methodology. Sections 3 and 4 then present the study area and practices relating to flood forecasting systems, respectively. The results of visual assessments and numerical evaluations for the study area are then reported and discussed. Concluding remarks are given in the last section.

#### **2. Methodology**

We propose a methodology that couples a machine learning model, namely the recurrent nonlinear autoregressive with exogenous inputs (RNARX) model [20], with IoT sensor data to provide multi-step-ahead forecasts of the average regional inundation depth (ARID) during storm events. Various IoT sensor data were explored and fed into the ML model as input factors to continuously update the model's parameters and thus forecast the ARID more accurately.

The architecture of the RNARX, shown in Figure 1, is a three-layer dynamic neural network. The network is configured so that its outputs are fed back as part of its inputs, and its weights are adjusted by the conjugate gradient back-propagation learning algorithm. The major difference between dynamic neural networks (e.g., RNARX) and static neural networks (e.g., the back-propagation neural network, BPNN) is that a dynamic neural network uses its output at the current time step as one of the input factors for the next time step, so the model can effectively track time-series features. Dynamic neural networks have been widely used in time-series modeling, especially when no observed value is available for the upcoming time steps, as is the case for inundation depth. In this study, for instance, the average regional flood inundation depth in the urban area is predicted continuously throughout a storm event, although no observed value exists for the coming hours. In the model configuration stage, we used simulation results to train and validate the model, whereas in actual applications the average flooding depth over all grids cannot be obtained. As is well known, the longer the forecast horizon, the greater the forecast error, owing to the lack of real-time monitoring of the inundation depth. In this study, a number of flood sensors installed in the study area were therefore examined for their effectiveness in multi-step-ahead flood inundation forecasting.

**Figure 1.** The architecture of the recurrent nonlinear autoregressive with exogenous inputs (RNARX) network.
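The feedback loop described above, in which each forecast becomes an input for the next step because no observation is available yet, can be sketched in a few lines of Python. This is only an illustrative sketch: `rollout`, `toy_step`, and all numerical values are hypothetical and are not taken from the study, and `toy_step` merely stands in for a trained network.

```python
import numpy as np

def rollout(step_fn, rain, sensors, y_prev, n_steps):
    """Produce n_steps forecasts, feeding each output back as an input."""
    forecasts = []
    for _ in range(n_steps):
        # Input vector: P rainfall values, Q sensor depths, 1 fed-back output.
        x = np.concatenate([rain, sensors, [y_prev]])
        # No observed depth exists for the coming step, so reuse the forecast.
        y_prev = float(step_fn(x))
        forecasts.append(y_prev)
    return forecasts

def toy_step(x):
    # Hypothetical stand-in for a trained network: a damped linear response.
    return 0.5 * x[-1] + 0.01 * x[:-1].sum()

rain = np.array([10.0, 20.0])    # P = 2 rainfall inputs (made-up values)
sensors = np.array([0.2, 0.4])   # Q = 2 sensor depths (made-up values)
forecasts = rollout(toy_step, rain, sensors, y_prev=0.0, n_steps=3)
print(forecasts)
```

The rainfall and sensor inputs are held at their time-*t* values here, so any error in an early forecast is carried into the later ones, which is one way to see why the error grows with the forecast horizon.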

The network contained P rainfall inputs (R), Q flood sensor inputs (S), and one recurrent input from the previous output. Let *X*(*t*) denote the (P + Q + 1) × 1 input vector, *ŷ*(*t* + *n*) denote the *n*-step-ahead network output, and *y<sub>j</sub>*(*t* + *n*) denote the output of the *j*th hidden-layer neuron. The network multiplied the input factors by weights and forwarded them to the neurons of the next layer. Each neuron summed its weighted inputs and passed the sum through the activation function *f*(·), and the result was forwarded in turn until the final output layer. The output values of the neurons were defined as in Equations (1)–(4).

$$net_j(t+n) = \sum_{i} w_{ji} x_i(t) \tag{1}$$

$$y_j(t+n) = f(net_j(t+n)) \tag{2}$$

$$net(t+n) = \sum_{j} v_j y_j(t+n) \tag{3}$$

$$
\hat{y}(t+n) = f(net(t+n)) \tag{4}
$$

Let *W* denote the N × (P + Q + 1) weight matrix of the hidden layer, and let *V* denote the N × 1 weight matrix of the output layer. During the training phase, the RNARX uses the conjugate gradient back-propagation learning algorithm to adjust these weights so that the network output approaches the average regional inundation depth (ARID) at a specific time. For the target output *y*(*t* + *n*) at time *t* + *n*, the error *e*(*t* + *n*) is defined in Equation (5), and the instantaneous error function *E*(*t* + *n*) in Equation (6).

$$e(t+n) = y(t+n) - \hat{y}(t+n) \tag{5}$$

$$E(t+n) = \frac{1}{2}e^2(t+n) \tag{6}$$
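As a rough illustration of Equations (1)–(6), the forward pass and the error can be written with NumPy as below. The sketch rests on several assumptions that the text does not state: a sigmoid is used for *f*, the layer sizes P, Q, and N are arbitrary, the weights are random rather than trained, and the target value is made up and assumed to be scaled to the range of the activation.

```python
import numpy as np

def sigmoid(z):
    # Assumed activation; the text only names a generic f(.).
    return 1.0 / (1.0 + np.exp(-z))

def forward(W, V, x, f=sigmoid):
    net_hidden = W @ x         # Eq. (1): net_j = sum_i w_ji * x_i
    y_hidden = f(net_hidden)   # Eq. (2): y_j = f(net_j)
    net_out = V @ y_hidden     # Eq. (3): net = sum_j v_j * y_j
    y_hat = f(net_out)         # Eq. (4): y_hat = f(net)
    return y_hat, y_hidden

P, Q, N = 3, 2, 4              # hypothetical numbers of inputs and hidden neurons
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(N, P + Q + 1))   # hidden-layer weights
V = rng.normal(scale=0.1, size=N)                # output-layer weights

# Input vector: rainfall, sensor depths, and the fed-back previous output.
x = np.concatenate([rng.random(P), rng.random(Q), [0.0]])

y_hat, _ = forward(W, V, x)
y = 0.3                        # made-up target (scaled ARID)
e = y - y_hat                  # Eq. (5)
E = 0.5 * e ** 2               # Eq. (6)
print(y_hat, e, E)
```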

The correction of *v<sub>j</sub>* is based on the partial derivative of the error function *E*(*t* + *n*) with respect to *v<sub>j</sub>*, as in Equation (7). Applying the chain rule yields Equation (8), and *v<sub>j</sub>* is then updated according to Equations (9) and (10).

$$\frac{\partial E(t+n)}{\partial v_j} = \frac{\partial \left(\frac{1}{2}e^2(t+n)\right)}{\partial v_j} \tag{7}$$

$$\frac{\partial E(t+n)}{\partial v_j} = \left(y(t+n) - \hat{y}(t+n)\right)\left(-f'(net(t+n))\right) \times \left(y_j(t+n) + \sum_{j} v_j \left(f'(net_j(t+n))\, w_j \frac{\partial \hat{y}(t+n-1)}{\partial v_j}\right)\right) \tag{8}$$

$$
\Delta v_j = -\eta_2 \frac{\partial E(t+n)}{\partial v_j} \tag{9}
$$

$$
v_j(p) = v_j(p-1) + \Delta v_j, \quad p = \text{epochs} \tag{10}
$$

where η<sub>2</sub> is the learning rate. Similarly, *E*(*t* + *n*) can be partially differentiated with respect to *w<sub>ji</sub>*; the results are shown in Equations (11)–(14).

$$\frac{\partial E(t+n)}{\partial w_{ji}} = \frac{\partial \left(\frac{1}{2}e^2(t+n)\right)}{\partial w_{ji}} \tag{11}$$

$$\frac{\partial E(t+n)}{\partial w_{ji}} = \left(y(t+n) - \hat{y}(t+n)\right)\left(-f'(net(t+n))\right) \times \left(\sum_{j} v_j f'(net_j(t+n)) \left(w_j \frac{\partial \hat{y}(t+n-1)}{\partial w_{ji}} + \delta_{ji} x_i(t)\right)\right) \tag{12}$$

$$
\Delta w_{ji} = -\eta_1 \frac{\partial E(t+n)}{\partial w_{ji}} \tag{13}
$$

$$
w_{ji}(p) = w_{ji}(p-1) + \Delta w_{ji}, \quad p = \text{epochs} \tag{14}
$$

where η<sub>1</sub> is the learning rate and δ<sub>ji</sub> is the Kronecker delta, which equals 1 if and only if *i* = *j*. Through the above weight-adjusting process, the error arising from the complicated interaction between the inputs (rainfall, sensor data, and fed-back flood depth) and the outputs (flood depths at times T + 1 to T + 3) was computed, and the weights were systematically adjusted through successive iterations so that the error was gradually reduced until a satisfactory accuracy and/or the specified number of iterations was reached. The RNARX model was therefore used to predict the multi-step-ahead flood depth, and the forecasts obtained with rainfall inputs only were compared with those obtained when inundation sensor data were added.
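The weight updates of Equations (7)–(14) can be illustrated with a deliberately simplified sketch. It is not the training procedure of this study. It replaces the conjugate gradient algorithm with a plain gradient-descent step, it drops the recursive terms ∂*ŷ*(*t* + *n* − 1)/∂*v<sub>j</sub>* and ∂*ŷ*(*t* + *n* − 1)/∂*w<sub>ji</sub>* (the fed-back input is treated as a constant), and it assumes a sigmoid activation. The function names, layer sizes, learning rates, and data are all hypothetical.

```python
import numpy as np

def f(z):
    # Assumed sigmoid activation; the text only names a generic f(.).
    return 1.0 / (1.0 + np.exp(-z))

def f_prime(z):
    s = f(z)
    return s * (1.0 - s)

def loss_and_grads(W, V, x, y):
    net_h = W @ x                      # Eq. (1)
    y_h = f(net_h)                     # Eq. (2)
    net_o = V @ y_h                    # Eq. (3)
    y_hat = f(net_o)                   # Eq. (4)
    e = y - y_hat                      # Eq. (5)
    E = 0.5 * e ** 2                   # Eq. (6)
    common = e * (-f_prime(net_o))     # factor shared by Eqs. (8) and (12)
    grad_V = common * y_h              # Eq. (8) without the recursive term
    grad_W = np.outer(common * V * f_prime(net_h), x)  # Eq. (12), same simplification
    return E, grad_V, grad_W

def update(W, V, x, y, eta1=0.1, eta2=0.1):
    # One plain gradient-descent step, Eqs. (9)-(10) and (13)-(14).
    _, grad_V, grad_W = loss_and_grads(W, V, x, y)
    return W - eta1 * grad_W, V - eta2 * grad_V

rng = np.random.default_rng(1)
W = rng.normal(scale=0.5, size=(4, 6))   # N = 4 hidden neurons, P + Q + 1 = 6 inputs
V = rng.normal(scale=0.5, size=4)
x = rng.random(6)                        # made-up input vector
y = 0.3                                  # made-up scaled target depth

E0, gV, gW = loss_and_grads(W, V, x, y)
W1, V1 = update(W, V, x, y)
E1, _, _ = loss_and_grads(W1, V1, x, y)
print(E0, E1)
```

Repeating `update` over many samples and epochs would correspond to the iteration index *p* in Equations (10) and (14). A full implementation would also need to propagate the recursive derivative terms through time.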
