*Article* **Spatiotemporal Modes of Short Time Rainstorms Based on High-Dimensional Data: A Case Study of the Urban Area of Beijing, China**

**Wei Liu 1, Sheng Chen <sup>2</sup> and Fuchang Tian 1,\***

	- **\*** Correspondence: tianfuchang@tju.edu.cn; Tel.: +86-22-27401156

**Abstract:** The identification of the characteristics of short time rainstorms in urban areas is a difficult problem. The traditional rainfall definition methods, using rainfall graph or a GIS map, respectively reflect the temporal or spatial variations of a rainfall process, but do not regard a rainfall as one complete process including its temporal and spatial dimension. In this paper, we present an approach to define typical modes of rainfall from the temporal and spatial dimensions. Firstly, independent rainfall processes are divided based on the continuous monitoring data of multiple rainfall stations. Subsequently, algorithms are applied to identify the typical spatiotemporal modes of rainfall and reconstruction of the process of modes, including dimensionality reduction, clustering, and reconstruction. This approach is used to analyze the monitoring data (5 min intervals) from 2004 to 2016 of 14 rainfall stations in Beijing. The results show that there are three modes of rainstorms in the Beijing urban area, which account for 31.8%, 13.7%, and 54.6% of the total processes. Rainstorm of mode 1 moves from the northwest to the center of Beijing, then spreads to the eastern part of the urban area; rainstorm of mode 2 occurs in the southwestern region of the urban area, and gradually northward, but there is no rainfall in the mountainous northwest; rainstorm of mode 3 is concentrated in the central, eastern, and southern regions. The approach and results of this study can be applied to rainstorm forecasting or flood prevention.

**Keywords:** rainstorm mode; high dimension; dimension reduction; cluster

#### **1. Introduction**

The security of water resources in this changing environment has become a research focus, due to the fact that climate change causes variations in rainfall at a large scale, and human activities influence the spatiotemporal characteristics at the regional scale [1,2]. Big cities are especially concerned as they are the hub of human activities. Human activities lead to the frequent occurrence of extreme rainstorms, through the urban heat island effect and air pollution [3,4]. Statistics show that 60% of the cities in China suffered from waterlogging from 2014 to 2016 [2]. Studies showed that urban waterlogging is directly related to the temporal and spatial distribution of rainstorms [5]. For example, the flood peak of triangle rainfall with a rain-peak in the central or rear is 30% larger than that of even rainfall, irrespective of whether the average rainfall is the same [6]. It is of great significance to study the spatial and temporal modes of rainfall to prevent waterlogging [7].

At present, there are two main approaches to the definition of urban rainfall process. First, different rainfall types are applied to describe various rainfall processes, such as single peak and double peaks, the method based on site monitoring data. Studies of this method focus on monitoring data of a single station or the average of multiple stations. Pilgrim and Cordery [8] put the time of a rain peak at the most likely position, and the proportion of the rain peak in the total rainfall is the average of the proportion of the rainfall peaks in each field. Keifer and Chu [9] designed the Chicago-mode according

**Citation:** Liu, W.; Chen, S.; Tian, F. Spatiotemporal Modes of Short Time Rainstorms Based on High-Dimensional Data: A Case Study of the Urban Area of Beijing, China. *Water* **2021**, *13*, 3597. https:// doi.org/10.3390/w13243597

Academic Editors: Yaohuan Huang, Yesen Liu, Runhe Shi and Hongyan Ren

Received: 9 November 2021 Accepted: 5 December 2021 Published: 14 December 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

to the strength, diachronic, and frequency of rainfall. Huff [10] designed four modes of rainfall, according to the location of the rain peak and duration of rainfall in Illinois, USA. These methods only consider the total rainfall and extreme rainfall of a single station [11]. However, the data of a single station cannot reflect the spatial characteristics of rainfall, especially in metropolitan areas with obvious spatiotemporal variations of the background environment, such as temperature and wind direction [12]. This method is widely used in urban planning or urban construction, but it is increasingly criticized for neglecting the spatial variations of rainfall, especially in large cities. The second method defines rainfall from the spatial perspective based on geographic theory. The general method is to use a spatial interpolation algorithm to interpolate the monitoring data from stations into spatial distribution data, such as the software ANUSPLIN, developed by Hutchinson, of the Australian National University [13]. In recent years, with the advancement of satellite and radar technology, rainfall spatial data can be obtained more directly, with higher accuracy, for example with Global Precipitation Climatology Project (GPCP) and Tropical Rainfall Measuring Mission (TRMM). These spatial distribution data are more suitable for analyzing the distribution of total rainfall or cumulative rainfall in a certain period, but it is difficult to express the correlation between rainfall distributions in different periods. However, in the real rainfall process, the rain belt usually moves rapidly. The spatial distribution of total rainfall or the time history distribution of rainfall intensity at a single point are not enough to accurately describe the dynamic temporal and spatial distribution characteristics of a rainfall, which is very important for the risk emergency management of rainstorm.

For the purpose of reducing the urban storm disaster effectively, it is necessary to express a complete "rainfall process", which includes not only the rainfall graph of stations at different locations, but also the spatial relationship of rainfall of these stations at different period. So, rainfall is a multidimensional data that includes time and spatial characteristics. Some scholars try to integrate the temporal and spatial characteristics of rainfall with multidimensional data. For example, based on the 3-dimensional mosaic reflectivity data from 10 S-band Doppler radars in Guangdong province, an artificial intelligence (AI) algorithm for automatic hail detection and nowcasting is developed in the light of the machine learning (ML) technology [14]. The temporal and spatial distribution characteristics of short duration rainfall in Shenzhen are analyzed and extracted by LLE algorithm [15]. It is necessary to conduct further study on the rainfall process with temporal and spatial dimensions.

In contrast to taking the rainfall data of a single station as the research object, this paper used the ML algorithm to extract the temporal and spatial distribution characteristics of rainfall from the rainfall data of all rainfall stations in the whole research area. The continuous monitoring data of rainfall stations from 2004 to 2016 were divided into different rainfall processes in Beijing. The rainfall processes were taken as the research objects, then, algorithms, such as dimensionality reduction, clustering, and reconstruction were applied to identify the typical spatiotemporal process of rainfall, and then simulate the rainfall process of different modes.

#### **2. Study Area and Data Process**

According to previous studies, Beijing has become one of the most urbanized cities in China in the past 30 years [16,17]. In 2013, urban population accounted for 86.3% of the total population in Beijing, far higher than the average of 53.73% for China, causing a rapid expansion of built-up areas [17]. In recent years, severe waterlogging has frequently occurred in Beijing [18]. For example, floods and waterlogging caused serious casualties on 21 July 2012 [19].

The Beijing urban area is 396 km2. Fourteen meteorological monitoring stations have continuously recorded the rainfall data of the Beijing urban area in recent years. The continuous monitoring data of 14 rainfall stations were selected from the database, which contains data from 2004 to 2016, at intervals of 5 min. The 14 rainfall stations were evenly distributed in urban areas, as shown in Figure 1.

**Figure 1.** Study area.

Statistics show that 89 rainstorms occurred during 2004 and 2016, which is defined as rainfall of 1 hour exceeding 30 mm [20]. For the comparability of rainfall processes, it was necessary to standardize the rainfalls with different duration. According to previous studies, the method of 1-h moving average was used to deal with the rainfall processes, as shown in Figure 2. The data in the red box were selected as the standardized rainfall.

**Figure 2.** Rainfall of 1-h moving average.

Figure 2 shows the data of a rainstorm process at 14 stations. In Figure 2, "cumulative rainfall" represents the sum of rainfall at 14 stations at a certain period. For example, "527" represents the sum of rainfall at all 14 stations from the beginning of rainfall to 1 h. The range surrounded by red dotted/dashed outlines indicates that the "cumulative rainfall" was the largest in this hour compared with any other hour.

According to the early warning standard issued by Beijing Flood Control Office, when the rainfall exceeds 50 mm in one-hour, yellow warning signals will be issued to the public, indicating that serious urban disasters may occur. In 2004–2016, 22 rainfall events had a maximum sliding rainfall of more than 50 mm in one hour, accounting for 24.7% of the total rainfalls. In this paper, the 22 rainstorms were selected as samples, which are called "short duration storm". The rainfall data of these 22 rainstorms are shown in Table 1.


**Table 1.** Data for 22 rainstorms.

Data source: Beijing Municipal Land and Water Protection Station.

#### **3. Methodology**

#### *3.1. Flowchart for Extraction of Rainfall Temporal and Spatial Modes*

Rainstorm is described according to the duration, intensity, total amount, and frequency by most researchers. In this paper, a new method is introduced to describe a rainstorm. The spatiotemporal process of a rainstorm was constructed as a high-dimensional array, and then principal component analysis (PCA), dynamic clustering (k-means), and reconstruction were applied to analyze the array [21]. The flowchart is shown in Figure 3.

In Figure 3, "m" is the number of samples, which represents the number of rainfalls in the manuscript. "n" is the dimension of the samples which represents the dimension of the rainfall matrix. "k" represents the dimension of rainfall samples after dimensionality reduction.


(4) With the inverse calculation of principal component analysis, the low dimensional array was reduced to a high-dimensional array to express the spatiotemporal process of each rainstorm modes.

**Figure 3.** Flowchart for extraction of rainfall temporal and spatial modes.

#### *3.2. Construction of High-Dimensional Array for Rainstorms*

Rainstorm was characterized as a spatiotemporal process. The continuous monitoring data was divided into independent rainfall periods, where the duration of no rainfall was longer than 120 min [20]. When the maximum rainfall in 1 h was greater than 30 mm, it was called a rainstorm.

We take the monitoring data from multiple stations in a rainstorm as a multi-dimensional array (shown in Figure 2). Based on the processing method introduced by "2. Study Area and Data process" in this paper, the multi-dimensional array was converted to matrix with same rows and columns, the number of which is 14 × 12 in this paper. Fourteen is the number of stations and twelve is the number of periods. The matrix forms one sample in the database of rainstorms (Ω). This is shown in Equation (1).

$$\Omega = \{Q\_1, Q\_2, \dots, Q\_m\}, \ Q\_i = \begin{bmatrix} r\_{1t1} & r\_{2t1} \dots & r\_{st1} \\ r\_{1t2} & r\_{2t2} \dots & r\_{st2} \\ \vdots & \vdots & \vdots \\ r\_{1tn} & r\_{2tn} \dots & r\_{stn} \end{bmatrix} \tag{1}$$

where *Qi* represents rainstorm i, m is the number of rainstorms, *rstn* is the rainfall at *tn* period of s rainfall station, *s* = 1,2,3 ... S, *tn* = 1,2,3 ... N, *S* is the number of stations, and N is the number of periods. The main objective of this paper is to study the "mode" of rainfall, or may be called "structure", so the monitoring rainfall data is standardized by Equation (2).

$$x\_{jtn} = \frac{r\_{stn}}{\sum\_{j=1}^{s} r\_{jtn}} \times 100\tag{2}$$

where *xjtn* represents the ratio of *j* station rainfall to all station at *tn* period, *S* is the number of stations.

Figure 4 is a sample of visual representation of the standardized rainstorm matrix.


**Figure 4.** Sample of rainstorm matrix.

#### *3.3. Dimensionality Reduction of High Dimensional Array*

As shown in Figure 4, the *Qi* is a *S* × N dimensional array. It was necessary to map the high-dimensional array to low dimensional space, in order to cluster and analyze arrays in Ω [22]. Principal component analysis (PCA) was used to reduce the dimensions of high-dimensional arrays [23].

Taking U represents the arrays in Ω, which is an n×m matrix, where n is the number of rainstorms, and m represents the number of characteristics, which equal S × N in this paper, including the monitoring data of each station at all periods. *Yn*×*<sup>k</sup>* is the matrix converted from *Xn*×*<sup>m</sup>* by the transformation matrix *Vm*×*k*, which means that the original data is converted from m dimensions to k dimensions (k m). The steps are as follows:


$$Y\_{n \times k} = X\_{n \times m} \times V\_{m \times k} \tag{3}$$

*Yn*×*<sup>k</sup>* is the low dimensional matrix after transformation, which *n* is the number of samples and *k* is the number of dimensions of new matrix.

#### *3.4. Clustering and Feature Selection*

After the dimensionality reduction of high-dimensional samples, the k-means clustering algorithm is used to classify low dimensional samples of *Yn*×*<sup>k</sup>* [24].

(1) *r* initial cluster centers are set up: *Z*1(*p*), *Z*2(*p*), ...... *Zr*(*p*), where *p* is the number of iterations.


$$Z\_{\vec{j}}(p+1) = \frac{1}{N} \sum\_{i=1}^{N} x\_i \, . \, j = 1, 2, \dots \cdot r \tag{4}$$

where *N* is the number of samples contained in the cluster *Sj*, and *xi* is the sample in *Sj*. Using *Zj*(*p* + 1) as the new cluster center, and the clustering criterion function can be minimized (Equation (5)).

$$J\_{\hat{\jmath}} = \left[ \sum\_{\mathbf{x} \in S\_{\hat{\jmath}}(k)} \mathbf{x} - z\_{\hat{\jmath}}(k+1)^2 \right]^{\frac{1}{2}} \tag{5}$$

where *j* = 1, 2, ··· *K*.

(4) If *Zj*(*p* + 1) = *Zj*(*p*), *j* = 1,2, ... *r*, then go to step (2); if *Zj*(*p* + 1) = *Zj*(*p*), *j* = 1,2, ... *r*, the calculation is over.

In this paper, different *r* values were calculated, and the initial values of different cluster centers were selected for the k-means cluster. Finally, the rainstorms were divided into three modes. The mean value of each rainstorm was taken as a typical mode of the rainstorm.

#### *3.5. Reconstruction of Rainstorms*

With the inverse calculation of principal component analysis, the low dimensional array was reduced to a high-dimensional array to express the spatiotemporal process of each rainstorm. The i clustering centers are reconstructed into i×m matrices (Equation (6)).

$$X\_{app} = Z\_{i \times k} \times V\_{m \times k} \,'\tag{6}$$

where *i* is the number of rainstorm modes and *m* is the dimension of the original data.

#### **4. Results**

The method of extracting rainfall temporal and spatial modes was applied to analyze the monitoring data of 22 rainstorms in the urban area of Beijing. It was found that the spatiotemporal distribution can be divided into three modes, and there are obvious differences in the center, spatial distribution, and occurrence time of three modes. The movement process of rainfall centers in different modes at different periods is shown in Figure 5. The centroid coordinates for a certain period was obtained by the method of calculating the geographical center with the weight of rainfall.

As shown in Figure 5, there are obvious differences in the three modes. The geographical centers of mode 1, mode 2, and mode 3 are located in northwest, southwest, and southeast, respectively. Mode 1 moves from the northwest to the urban center, mode 2 mainly spreads from the southwest and south to the north and the urban center, and mode 3 is basically concentrated in the urban center.

The matrixes of 3 modes are shown in Figure 6, which shows the characteristics of rainfall distribution. For example, the rainstorms belonging to mode 2 are more centralized than that belonging to modes and mode 3, which shows rainfall concentrated at several stations of 30748000, 30523900, 30504030, and 30523650. The rainfall spatial distribution at different periods of each mode is shown in Figures 7–9, which represents the percentage of rainfall at all rainfall stations in the period by the depth of the color in the location of the station.

As shown in Figure 6, the proportion of rainfall at each station is significantly different. In mode 1, rainfall is mainly concentrated in six stations, which are stations 3, 6, and 4. The spatial non-uniformity of rainfall in mode 2 is the most obvious, and only four stations

account for a large proportion of rainfall, which are stations 5, 9, 12, and 4. In mode 3, almost all stations have obvious rainfall, except station 1 and station 2.

In mode 1, as shown in Figures 5–7, rainstorm spreads from the northwest mountainous area to the central area of the city and the eastern part of the city. Rainfall began at the beginning of the northwest mountain area, and the rest of the city did not have any rainfall. Rainfall gradually dispersed and rainfall occurred at all stations. There are seven rainstorms of this mode, accounting for 31.8% of all sample. Among them, the single peak and homogeneity stations are 43%, and the double-peak type accounts for 14%.

**Figure 5.** Centroids of three modes at different periods.

**Figure 6.** Matrixes of 3 modes. (**a**) Mode 1; (**b**) mode 2; (**c**) mode 3.

**Figure 7.** Spatiotemporal process of mode 1.

In mode 2, as shown in Figures 5, 6 and 8, the main rainfall was concentrated in the southern and southwestern regions of the city, gradually spreading to the northern and urban central areas, and there was no rainfall in the northwest mountain areas. There are three rainstorms of this mode, accounting for 13.7% of the total sample. Two of them are unimodal and one is homogeneous. In addition, this type of rainfall occurs between 14:00 and 16:00.

**Figure 8.** Spatiotemporal process of mode 2.

In mode 3, as shown in Figures 5, 6, and 9, the main rainfall was concentrated in the central area of the city and the eastern and southern parts of the city, which basically did not move. There was no rain in the northwest mountain areas. There were 12 rainstorms of this mode, accounting for 54.6% of total sample, among them, 62% were single peak type and 38% were homogeneity type. This mode was the main rainstorm type in summer in Beijing, which mainly occurred from afternoon to evening.

**Figure 9.** Spatiotemporal process of mode 3.

#### **5. Discussion**

How to identify and extract valuable information from multidimensional and massive rainfall monitoring data is a problem faced by many researchers. In this paper, a new approach for rainfall mode recognition is introduced, and different rainfall modes are identified from the massive monitoring data through the algorithms of dimensionality reduction, clustering, and reconstruction of a high dimensional array. This approach can be applied to both multidimensional data analysis and spatiotemporal data mining.

The results show that there are three modes of rainstorms in the Beijing urban area. Rainstorms of mode 1 moved from the northwest to the center of Beijing, then spread to the eastern part of the urban area; rainstorms of mode 2 occurred in the southwestern region

of the urban area, and gradually northward, but there was no rainfall in the mountainous northwest; rainstorms of mode 3 were concentrated in the central and eastern regions, and basically did not move. The results are consistent with the actual rainstorm process. This approach provides a framework for analysis, and there is uncertainty in some respects. For example, the result of the restructured rainfall has certain randomness and uncertainty in spatiotemporal distribution, which should be paid attention to in future. Because of the abundance of available data, this paper only selected the rainfall data of 14 years in Beijing. It should be noted that urban rainfall is a part of a larger range of rainfall in most cases, the temporal and spatial distribution characteristics of rainfall extracted from the rainfall data of 14 stations may have a certain randomness and uncertainty. More extensive rainfall data should be collected, considering terrain, climate, etc., according to the availability of data. In addition, it is necessary to distinguish the historical evolution of rainfall modes in different periods due to the climate changes in the city.

There are several suggestions for practice, First, the spatial and temporal resolution for rainfall data needs to reflect the temporal and spatial differences of a rainfall. Secondly, the study area should include the urban area and surrounding areas as much as possible, which is mainly to maintain the integrity of rainfall process.

#### **6. Conclusions**

In this paper, a high dimensional array is introduced for the study of the spatiotemporal distribution of rainfall, which describes rainfall by storing continuous rainfall monitoring data of all rainfall stations.

Through the establishment of high dimensional arrays of each rainstorm and algorithms, such as dimensionality reduction, clustering, feature extraction, and reconstruction, the spatiotemporal distribution of rainstorms in the flood season of Beijing city was analyzed. It was found that there were three spatiotemporal modes of rainstorms in the urban area of Beijing from 2004 to 2016. Rainstorms of mode 1 moved from the northwest mountain area to the central district, and further spread to the eastern part of the area. Rainstorms of mode 2 concentrated in the southwest of the urban area, gradually spreading to the northern and urban central areas; the northwest mountainous area basically had no rain. Rainstorms of mode 3 concentrated in the central area of the urban area and the eastern and southern regions, and basically did not move. The variation of the centroids of different modes shows a significant difference between the modes. The approach and conclusions in this paper can be applied to the study of rainfall modes in other cities or regions at a different scale, so as to provide assistance for rainfall forecasting and flood prevention.

The limitation of current machine algorithms is that it is too dependent on the number and quality of learning samples. If the rainfall stations are dense and the rainfall data quality is accurate, this method can achieve good results. If it is in an area with insufficient data and sparse rainfall stations, results may not be satisfactory. With the increasing density of rainfall stations, the improvement of rainfall data quality, and improvement of machine learning algorithm, the machine learning algorithm will get more reasonable and objective results.

**Author Contributions:** The study was designed by W.L. and F.T. The data were collected and analyzed by W.L. The paper was written by S.C. and F.T. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **The Impact of Rainfall Movement Direction on Urban Runoff Cannot Be Ignored in Urban Hydrologic Management**

**Yesen Liu 1, Yaohuan Huang 2,\*, Yuanyuan Liu 1, Kuang Li <sup>1</sup> and Min Li <sup>1</sup>**


**Abstract:** Urban floods have been exacerbated globally, associated with increasing spatial-temporal variations in rainfall. However, compared with rainfall variabilities of intensity and duration, the effect of rainfall movement direction is always ignored. Based on 1313 rainfall scenarios with different combinations of rainfall intensity and rainfall movement direction in the typically rainy city of Shenzhen in China, we find that the effect of rainfall movement direction on the peak runoff may reach up to 20%, which will decrease to less than 5% under heavy rainfall intensity conditions. In addition, our results show that the impact of rainfall movement direction is almost symmetrical and is associated with the direction of the river. The closer rainfall movement direction is to the Linear Directional Mean of rivers, the larger is the peak runoff of section. Our results reveal that rainfall movement direction is significant to urban peak runoff in the downstream reaches, which should be considered in urban hydrological analysis.

**Keywords:** urban floods; rainfall movement direction (RMD); rainfall intensity (RI); peak runoff; Linear Directional Mean (LDM); Shenzhen

#### **1. Introduction**

Extreme rainstorms have been increasing in urban areas due to continued global warming and rapid urbanization [1–4].This increase is expected to lead to an increment in urban runoff generation and, consequently, to the intensification of urban flood risks [5,6]. During 2010–2018, more than 160 cities in China suffered from floods each year [7]. Hydrological processes in urban areas are sensitive to small-scale temporal and spatial variations in rainfall [8]. Furthermore, the probability of extreme rainfall in urban areas is very likely to increase, with high spatial and temporal variabilities [9,10], yielding further uncertainties in urban runoff estimations and flood-related damages [11,12]. Understanding the impacts of spatial and temporal rainfall variabilities on runoff is significant for urban hydrologic management (e.g., urban drainage design and construction, forecasting and prevention of flood risk) [13–15] and the development of an Early Warning System (EWS) [16,17].

However, the interactions among extreme rainfall variability, river features and runoff responses remain poorly understood, especially in urban areas [11,18]. Such attributions require sufficient information about the spatial distribution of short-term rainstorms and runoff responses, which is lacking in measurements for the sudden rainfall process and the complex inhomogeneity of urban areas introduced by the building envelope. With the development of new instruments, techniques and methods for capturing rainfall and hydrological processes at high resolution, urban hydrological models, such as the Storm Water Management Model [19], have been proposed and applied in urban hydrologic management [20,21]. Although various components of rainfall variability such as the RI, rainfall duration spatial and temporal resolution, and degree of imperviousness are

**Citation:** Liu, Y.; Huang, Y.; Liu, Y.; Li, K.; Li, M. The Impact of Rainfall Movement Direction on Urban Runoff Cannot Be Ignored in Urban Hydrologic Management. *Water* **2021**, *13*, 2923. https://doi.org/10.3390/ w13202923

Academic Editor: Renato Morbidelli

Received: 12 September 2021 Accepted: 15 October 2021 Published: 17 October 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

involved in previous studies, these components are far from complete. For example, similar extreme rainfall patterns of RI and rainfall duration in the same urban area may induce different flood damages. As a result, there remain critical errors and uncertainties in the impacts of rainfall variabilities on hydrological process such as runoff and floods [22].

Rainfall movement direction (RMD) is a significant component of rainfall variability [23–29] that is always ignored. While some studies have found no impact of RMD on hydrological responses [8,30], relatively few studies reported in the literature have remained inconclusive with respect to the impact of RMD on urban runoff. This contrast may be explained by (1) the limited RMDs observed from rainstorm events, which is almost fixed rather than 360 degree and lacks information in an urban area; (2) isolated analyses of the impacts of RMD were conducted in these studies by neglecting the interactions between RMD and the directions of urban river segments.

In this study, we focus on verifying whether RMDs play a significant role in generating peak runoff in urban areas. This paper is organized as follows: Section 2 introduces the study area and the experimental design. Results and discussion are given in Section 3, which also summarizes the main findings of the impact of RMD on runoff.

#### **2. Materials and Methods**

#### *2.1. Study Area*

The typically rainy city of Shenzhen in China, with four urban rivers, is selected as the study area [31], this city has complete hydrological infrastructure and observation information systems. Shenzhen, located in the southeast coast of China, has the most rapid urbanization of any city in China with high urban flood risk caused by extreme rainfall. Shenzhen city ranks fifth among the 136 coastal cities in the world in terms of future flood loss risk [32]. Shenzhen city is a typical case study area, in which the spatial and temporal variations in rainfall is wide with rapid movement and obvious "squall line" [33]. Additionally, urban hydrological datasets are sufficient in Shenzhen, which is the benefit from a 30 million RMB project of building an urban flood model, which began in 2018. These datasets include underground pipe network data (nearly 3000 km), 1:1000 terrain data, 13 years of rainfall monitoring data, historical water level data, reservoir operation data, and more.

#### *2.2. Methods*

#### 2.2.1. Construction of Rainfall Schemes

We designed the idealized experimental conditions of rainfall intensity (RI) and RMD to construct comprehensive rainfall schemes. According to the RI value of 132.7 mm/h recorded once every 1000 years in Shenzhen, we constructed 13 RIs with 10 mm intervals from 10 mm/h to 130 mm/h, and chose the commonly used Chicago rain pattern (Table 1) to represent short-term rainfall. In addition, 100 moving directions with equal intervals were designed to reflect the continuous RMD. We collected rainfall monitoring data from 63 meteorological stations in Shenzhen from 2008 to 2018, with a time resolution of 5 min. The rainfall events were extracted from the data and the rainfall center of every 5 min is calculated, then the moving speed of each rainfall events is calculated through the movement of the center. The average speed of all rainfall events is about 10 km/h.The asynchronous rainfall process of the whole basin is set to 3 h based on the average moving speed of the rainfall center (10 km/h) and the diameter of the circumscribed circle of the basin (30 km). By combining RI and RMD, 1300 rainfall processes (13 RIs × 100 RMDs = 1300) were constructed in Figure 1. The two rainfall distribution maps in Figure 1 show distributions when the RI is 80 mm and the RMDs are 15 and 80 respectively.


**Table 1.** Rainfall process of 13 rainfall intensity schemes.

**Figure 1.** Schematic diagram of rainfall schemes.

#### 2.2.2. Runoff Numerical Simulation

We applied the IFMS/urban (Integrated Urban Flood Modeling System) platform, an urban flood simulation platform developed by China Institute of Water Resources and Hydropower Research, to simulate the runoff processes of the studied rivers (Figure 2). Before the simulations, the model was calibrated by the rainfall–runoff relationship measured from 2018 to 2020. Based on 1313 rainfall processes, we constructed a set of input conditions for the model, including the initial water level of each section, the previous rainfall, and the infiltration process; these conditions were kept consistent for comparison. We calculated the runoff processes of 414 sections in each scheme by using a parallel computing program and output the result to structured files in CSV format.

**Figure 2.** Location of the four rivers.

IFMS/urban model is a flood analysis model developed in China, which has been successfully applied in Shenzhen, Beijing, Chengdu, Shanghai, Foshan and other cities [34,35]. Its main modules include: (1) Flood simulation and analysis: River, lake, nearshore flow, flood simulation analysis and calculation of flood protection area and flood storage area; urban rainstorm and waterlogging, overflow (break) flood and dam break flood simulation Analysis; storm surge simulation and analysis; (2) Engineering scheduling simulation: reservoir and lake gate dam, weir, culvert, box culvert and pumping station and other water conservancy engineering facilities scheduling simulation; pipeline one-way valve control; lake reservoir, underground storage, tank storage, and so on. (3) Data management and pre-processing functions: 2D and 3D data and result display platform; data pre-processing (grid generation, pipeline processing, data management).

#### 2.2.3. Index of Peak Runoff Deviation

We designed a concise index to describe the peak runoff deviation (Equation (1)):

$$I\_{fp} = \frac{Q\_{fp}}{Q\_p} \tag{1}$$

where *If p* is the peak runoff deviation of a section caused by the rainfall scheme with RI = *p* and RMD = *f*, *Qp* is the peak runoff (m3/s) of the section under the condition of a synchronous rainfall scheme with RI = *p*, and*Qf p* is the peak runoff (m3/s) with RI = *p* and RMD = *f*, where *f* ∈ 1, 2, . . . , 100.

In this paper, the *If p* values of 414 sections were extracted from all 1300 rainfall processes.

#### 2.2.4. Rainfall Movement Direction and Flow Concentration Direction

The trend in a set of line features is measured by calculating the average angle of the lines, which called the "Linear Directional Mean"(LDM) in GIS [36,37]. The LDM often represents the paths of objects that move, such as rainfall. We use the LDM to represent the confluence direction above a certain river section (Equation (2)):

$$\text{LDM} = \arctan \frac{\sum\_{i=1}^{n} \sin \theta\_i}{\sum\_{i=1}^{n} \cos \theta\_i} \tag{2}$$

where *θ<sup>i</sup>* is the angle between section *i*-1 and section *i* in the confluence area.

The *LDM* is adjusted according to the angle quadrant, and the adjusted *LDM* is between 0◦ and 360◦:

$$\text{LDM} = \begin{cases} \begin{array}{l} \text{LDM}\_{\prime} \sum\_{i=1}^{n} \sin \theta\_{i} \ge 0 \text{ and } \sum\_{i=1}^{n} \cos \theta\_{i} > 0\\ 180 - \text{LDM}\_{\prime} \sum\_{i=1}^{n} \sin \theta\_{i} \ge 0 \text{ and } \sum\_{i=1}^{n} \cos \theta\_{i} < 0\\ 360 - \text{LDM}\_{\prime} \sum\_{i=1}^{n} \sin \theta\_{i} < 0 \text{ and } \sum\_{i=1}^{n} \cos \theta\_{i} > 0\\ 180 + \text{LDM}\_{\prime} \sum\_{i=1}^{n} \sin \theta\_{i} < 0 \text{ and } \sum\_{i=1}^{n} \cos \theta\_{i} < 0 \end{array} \\ \end{cases}$$

The angle between the *LDM* and RMD is calculated as Equation (3):

$$\Delta\_{LDM} = \left\{ \begin{array}{c} \left| D\_{\text{rainfall}} - LDM\_{\text{river}} \right|, \left| D\_{\text{rainfall}} - LDM\_{\text{river}} \right| \le 180 \\ 360 - \left| D\_{\text{rainfall}} - LDM\_{\text{river}} \right|, \left| D\_{\text{rainfall}} - LDM\_{\text{river}} \right| > 180 \end{array} \tag{3}$$

where *Drainf all* represents RMD, *LDMriver* represents the LDM, and Δ*LDM* is between 0 and 180◦.

#### 2.2.5. Dynamic Clustering of Sections

We use the dynamic clustering machine learning method to classify the *If p* values of all sections. The *If p* value of each section in different RMD with same RI is used as the attribute to construct the sample set. The calculation is as follows:

$$S\_{\bar{1}} = \left\{ I\_{f(1)p^{\nu}} I\_{f(2)p^{\nu}} \dots I\_{f(100)p} \right\} \tag{4}$$

$$
\Omega = \{\mathcal{S}\_1, \mathcal{S}\_2, \dots, \mathcal{S}\_{429}\} \tag{5}
$$

where *Si* is section i and *If*(1)*<sup>p</sup>* is the simulated result with RMD = *f*(1) and RI = *p*.

The dynamic cluster analysis method is used to classify the sections in Ω; then, the features of each subset, that is, the typical features influencing the peak runoff of the section, are extracted.

The k-means clustering algorithm is used to classify low dimensional with 4 steps:

(1) A total of r initial cluster centers is set up: *Z*1(*p*), *Z*2(*p*), ...... *Zr*(*p*), where *p* is the number of iterations.

(2) The distance from samples x (*x* ∈ *X*) to each cluster center is calculated,

$$\text{if } D\_{\mathfrak{x}}(j) = \min\{D\_{\mathfrak{x}}(i)\} \text{ } i = 1, 2, \cdot \cdot \cdot \text{r}, \text{ then } \mathfrak{x} \in \mathcal{S}\_{j^\*}$$

where *Sj* represents cluster j with the center of *Zj*.

(3) The new center of each cluster is calculated. The new center of *Zj* is calculated as follows.

$$Z\_j(p+1) = \frac{1}{N} \sum\_{i=1}^{N} \mathbf{x}\_i \, , j = 1, 2, \dots, r$$

where *N* is the number of samples contained in cluster *Sj* and *xi* is sample *i*. using *Zj*(*p* + 1) as the new cluster center, the clustering criterion function can be minimized:

$$J\_{\vec{\jmath}} = \left[ \sum\_{\mathbf{x} \in S\_{\vec{\jmath}}(k)} \left\| \mathbf{x} - z\_{\vec{\jmath}}(k+1) \right\|^2 \right]^{\frac{1}{2}}$$

where *j* = 1, 2, ··· , *K*.

(4) If *Zj*(*p* + 1) = *Zj*(*p*), *j* = 1, 2, ... , r, then go to step (2); if *Zj*(*p* + 1) = *Zj*(*p*), *j* = 1, 2, . . . , r, then cease the iteration.

In this paper, different r values are calculated, and the initial values of different cluster centers are selected for the k-means cluster analysis. Then, the rainstorms are divided into 3 modes. The mean value of each rainstorm is taken as a typical rainstorm mode.

In addition, we reconstructed the typical features of the river sections. With the inverse calculation of the principal component analysis, the low-dimensional array is reduced to a high-dimensional array to express the spatiotemporal process of each rainstorm. The *i* clustering centers are reconstructed into *i* × *m* matrices (Equation (6)):

$$X\_{app} = Z\_{i \times p} \times V\_{m \times p}{}'\tag{6}$$

where *i* is the number of rainstorm modes and *m* is the dimension of the original data.

#### **3. Results and Discussion**

*3.1. Influence of Variation in RI and RMD Combinations on the Peak Runoff*

We found that the influence of variation in RI and RMD combinations on the peak runoff can reach 30% based on the *index of peak runoff deviation (Ifp)* in the study area. The statistics of 538,200 *Ifp*s show that the variations in the RI and RMD are more likely to cause negative effects on peak runoff, with 77.1% of *Ifp*s < 1.0 and 22.9% of *Ifp*s > 1.0. However, this influence is always ignored, possibly due to most *Ifp*s (83.3%) ranging from 0.9 to 1.1, indicating that the effect of these variations is smaller than 10% in most instances. Additionally, a heavy RI will reduce the effect of RMD on the peak runoff, with more concentrated of *Ifp*s value of approximately 1.0 correspondingly with the more enhanced RI. We found that 96% of *Ifp*s values ranged from 0.9 to 1.1 when RI > 70mm/h. Furthermore, the range of *Ifp*s decreased to 0.95–1.05 when the RI reach an extreme value larger than the once-in-one-hundred-years value (100 mm/h). We also found that the duration of the peak runoff was lengthened with extremely high RI values (Figure 3), which is associated with the processes of drainage networks operating under maximum waterlogging-elimination capacities. Figure 3A shows that the heavier the RI is, the more concentrated the *Ifp* is between 0.9 and 1.1, and the number of *Ifp* values less than 1.0 is obviously greater than that of *Ifp* values more than 1 for all RIs. Figure 3B shows that with an increase in the RI, the fluctuation range of *Ifp* decreases obviously and tends toward 1.0.

#### *3.2. Influence of Variation in RMD on the Peak Runoff*

To estimate the impact of RMD, we isolate the effect of RMD from the RI by analyzing the spatial pattern of the RMD impact under similar RI. Based on the dynamic clustering of 414 sections with 100 RMDs each, we do find three typical patterns with proportions of 42%, 21% and 37% of total number of sections, including (1) model 1 shows a nondirectional effect of RMD by reducing the peak runoff with most *Ifps* < 1.0; (2) model 2 shows an obvious directional effect of RMD with a symmetrical distribution of Ifps, in which the minimum effect of direction is opposite to direction of the maximum; (3) model 3 shows completely no obvious effect of RMD on peak runoff neither in direction nor in magnitude with *Ifps* approximately equal to 1.0 (Figure 4).

(**B**)

**Figure 3.** Influence of the rainfall intensity (RI) and rainfall movement direction (RMD) on the peak runoff. (**A**) The distribution of *Ifp* values under 13 different RIs. (**B**) The distribution of *Ifp* under different RIs.

**Figure 4.** Distribution of *Ifp*s for typical modes.

Spatially, these three models are primarily related to the location of sections in rivers (Figure 5).

**Figure 5.** Influence of the peak runoff on the sections of four rivers. (**A**) Dasha River. (**B**) Buji River. (**C**) Xinzhou River. (**D**) Futian River. Only certain parts of the 414 sections are drawn.

Figure 5 shows that the in lower reaches of the rivers, the larger the variation range of *Ifp* is, the more symmetrical is the influence. In model 1 and model 3, the influence of RMD on the peak runoff is negligible in the upper reaches of the river. However, model 2 shows symmetrical variations in the lower reaches of the river. We also find that the river length potentially influences the RMD effect. Taking four rivers in Shenzhen as an example, the Dasha River and Buji River show the effect of model 2 from middle reaches to lower reaches. While, this effect occurs at the ends of the lower reaches in the other two rivers, Xinzhou River and Futian River, the lengths of which are just half of the two aforementioned rivers. The impact of RMD on the peak runoff is primarily present in the lower reaches, possibly due to the longer duration and larger area of the flow concentration that gradually increased the asynchrony of rainfall with the peak runoff.

Furthermore, we find a slightly jagged shape in models of *Ifps* compared with the natural watershed, which may be explained by the uneven distribution of underground pipes used to drain flows into rivers in urban areas and their corresponding covering areas. In rapidly urbanizing area, the lack of pipe network datasets will increase the complexity of urban flood analyses.

#### *3.3. How RMD Affect the Peak Runoff across Rivers*

To further understand how RMD affect the peak runoff, we calculated three angles of sections: θmaxPR-FC (the angle between RMD with maximum peak runoff and flow concentration direction), θmin PR-FC (the angle between RMD with minimum peak runoff and flow concentration direction) and θmax PR-FD (the angle between RMD with maximum peak runoff and flow direction). In this paper, we used "Linear Directional Mean" (LDM) to represent the flow concentration direction, which is the geometric mean of all the reaches upstream. In contrast with the flow direction, we do find a significant relationship between RMD and the flow concentration direction during maximum peak runoff (Figure 6), with decreasing trends of θmaxPR-FC from upstream to downstream. For example, the angles of the second half of downstream reaches of the Buji River are almost smaller than 20◦, and some sections even reach 0◦. In contrast, θminPR-FC is gradually increased with the opposite RMD to θmaxPR-FC, which is consistent with the result of model 2 shown in Figure 4. Additionally, the θmax PR-FC values are larger in more meandering rivers (e.g., Dasha River) than in rivers with straight channels (e.g., Buji River). We can therefore conclude that RMD is a significant factor to peak runoff downstream in urban areas, and this influence is not isolated but needs to be combined with the spatial feature of rivers such as direction and bending.

The impact of RMD on the peak runoff indicates that river flood risks and discharge capacities should be evaluated from more rainfall variations including RI, rainfall duration, etc. Commonly used methods that do not consider the impact of RMD, such as constructing rainfall processes by independent zoning [34,38,39], may underestimate or overestimate the peak runoff magnitude in rivers. Figure 7 compares the difference of peak runoff between considering and not considering the impact of RMD, in four urban rivers of Shenzhen with similar RIs. We found that the uncertainties yielded from RMD may reach −40% and 50% (Figure 7). In most cases, these uncertainties range from −20% to 20%. These analyses confirm that the impact of RMD on runoff cannot be ignored in urban hydrologic management.

**Figure 6.** *Cont.*

**Figure 6.** Influences of rainfall movement direction (RMD) and the direction of river confluence on the peak runoff. (**A**) Dasha River. (**B**) Buji River. (**C**) Xinzhou River. (**D**) Futian River.

**Figure 7.** The peak runoff ranges with the same rainfall intensity (RI) value but different RMDs for 6 river sections. (**A**), section 50 of the Dasha River. (**B**), section 100 of the Dasha River. (**C**), section 150 of the Dasha River. (**D**), section 30 of the Buji River. (**E**), section 80 of the Buji River. (**F**), section 130 of the Buji River.

#### **4. Conclusions**

Our results show that rainfall movement direction (RMD) are very likely responsible for variations in flood risks in different sections of urban rivers, which should not be ignored in urban hydrologic management. Although the impact of RMD on peak runoff in rivers is always covered up by heavy rainfall intensity (RI), this impact is significant in the downstream reaches of urban river when combined with spatial features of rivers, such as the river direction and bending. We propose that RMD should be involved in urban hydrologic models and predication, as it may yield substantial uncertainties in peak runoff in rivers. We provide an empirical evaluation to quantify the contribution of RMD to peak runoff in urban rivers, indicating that the river flood risk and discharge capacity should be evaluated based on more variations in rainfall such as RMD. Given the importance of the prevention and treatment of urban waterlogging with the accelerating process of urbanization, this empirical evaluation of peak runoff variations due to changing RMDs represents a contribution for the development of an Early Warning System (EWS) for the study area and provides critical information to inform policy and decision making.

**Author Contributions:** Y.L. (Yesen Liu) and M.L. collected and processed the data, Y.L. (Yesen Liu), K.L. and Y.L. (Yuanyuan Liu) proposed the model and analyzed the results, Y.H. and Y.L. (Yesen Liu) wrote the manuscript. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by National Natural Science Foundation of China (Grant NO. 41822104), the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA19040402), the National Key Research and Development Program of China (Grant No. 2017YFB0503005), National Science and Technology Major Project of China's High Resolution Earth Observation system (21-Y20B01- 9001-19/22), the Fundamental Research Funds of IWHR (WH0145B032024), the Fundamental Research Funds of IWHR(WH0145B032024).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

