#### **1. Introduction**

The last decades have witnessed significant changes in climate and hydrological conditions. The increased frequency of extreme storm floods has raised the risk of damage from weather-related hazards. Forecasting such high-intensity floods on short time scales has immense benefits, such as saving lives, protecting economic assets, and improving quality of life [1–3]. In the mesoscale mountain areas along the Daqing River in northern China, steep slopes combined with high-intensity, short-duration convective rainfall substantially shorten hydrological lead times. In addition, owing to the lack of dense, high-resolution observations, the "throughfall" observed by rain gauges cannot reflect the realistic distribution of rainfall in space and time; forecast accuracy is therefore limited by the layout of the rain gauge network. For runoff and routing, models add and derive a range of interdependent processes, including soil infiltration, overland transport, and channel routing, which introduce complexity and uncertainty into deducing the generation mechanisms of flash floods. In this study, we selected two upstream mountainous catchments along the Daqing River, where accurate flood prediction is urgently needed to prevent and reduce the risks facing downstream development, including the Xiong'an New Area.

*Article*

**Citation:** Wang, W.; Liu, J.; Li, C.; Liu, Y.; Yu, F. Data Assimilation for Rainfall-Runoff Prediction Based on Coupled Atmospheric-Hydrologic Systems with Variable Complexity. *Remote Sens.* **2021**, *13*, 595. https://doi.org/10.3390/rs13040595

Academic Editors: Yongqiang Zhang, Dongryeol Ryu and Donghai Zheng

Received: 31 December 2020; Accepted: 3 February 2021; Published: 7 February 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Although recent advances have improved rainfall forecasting [4], several challenges remain. One is reproducing the magnitude and disturbance patterns of rainfall by assimilating suitable observations into numerical weather prediction (NWP) models [3]. Rainfall is among the variables that NWP models generate with the greatest errors, yet it plays an important role in forecasting atmospheric-hydrological processes because it influences the timing and scale of floods [5,6]. There are three main sources of error in rainfall prediction: the initial conditions, the lateral boundary conditions, and the physical approximations in the model equations. Data assimilation allows atmospheric information to be extracted from multiple data sources, thereby improving the reliability of coarse-resolution data and the representation of complex atmospheric motion, and reducing the initial and lateral boundary errors [7,8]. Routray et al. [9] found that the Weather Research and Forecasting (WRF) model can be used to assimilate observations from different sources and contribute to a better understanding of mesoscale convective rainfall activity within the Indian monsoon region. Kumar and Varma [10] further explored a short-duration intense rainfall event in India, demonstrating the potential of WRF to improve rainfall forecast accuracy. Fierro et al. [11] conducted a data assimilation study in the eastern USA showing that WRF, in conjunction with data assimilation, could significantly improve the modeling of local short-term rainfall processes. Although data assimilation can help NWP models capture rainfall more accurately and enable rainfall-runoff conversion within an atmospheric-hydrological model system, its potential to further improve flood forecasting has not been fully investigated.
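For orientation, variational assimilation schemes are usually described as minimizing a cost function that balances a background (first-guess) state against the available observations. The standard textbook form is sketched below; the symbols are generic notation introduced here for illustration and are not taken from this paper.

$$
J(\mathbf{x}) = \tfrac{1}{2}\,(\mathbf{x}-\mathbf{x}_b)^{\mathsf{T}}\,\mathbf{B}^{-1}\,(\mathbf{x}-\mathbf{x}_b) + \tfrac{1}{2}\,\bigl(\mathbf{y}-H(\mathbf{x})\bigr)^{\mathsf{T}}\,\mathbf{R}^{-1}\,\bigl(\mathbf{y}-H(\mathbf{x})\bigr)
$$

Here $\mathbf{x}$ is the model state being sought, $\mathbf{x}_b$ the background state, $\mathbf{y}$ the observation vector, $H$ the observation operator that maps the model state to observation space, and $\mathbf{B}$ and $\mathbf{R}$ the background and observation error covariance matrices. The minimizer of $J$ is the analysis, which can then serve as an improved initial condition for the forecast.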

A reliable atmospheric-hydrological model system is required to improve rainfall predictions and hydrological forecasts for early flood hazard mitigation [12]. A promising method is to couple hydrological models to a regional NWP model in order to rapidly obtain high-resolution rainfall and flood forecasting data. Lin et al. and Lu et al. [13–15] discussed the implementation and improvement of rainfall forecasting with the Canadian regional mesoscale compressible community model (MC2) in the Huaihe Basin of southern China, concentrating on a Huaihe sub-basin coupled to the Xinanjiang hydrological model. Wu et al. [16] further explored MC2 and the multiple linear regression integrated forecast and found that high-resolution rainfall distributions were problematic at finer temporal and spatial scales, requiring data assimilation or sub-grid-scale parameterization. Yucel et al. [4] and Moser et al. [17] tested WRF data assimilation as the input for flood forecasting in the Black Sea region and Iowa, respectively, both finding improved accuracy of flood warnings.

The selection of coupled models is complicated by the diversity and non-universality of the governing laws, which makes it difficult to represent physical movement processes accurately. Hydrological models have a more comprehensive physical foundation and include lumped, grid-based, and fully distributed setups [18,19]. Besides these physically based models, machine learning models are also widely applied in rainfall-runoff forecasting (e.g., artificial neural networks (ANNs) [20], support vector machines (SVMs) [21], and the recently emerged theory-guided data science (TGDS) [22,23]). Flood forecasting is affected by the discretization method: different discretizations lead to differences in heterogeneity analysis and model calculation structure, and in turn influence the accuracy of the physical representation in the hydrological model's predictions [24–26]. In terms of the scale of hydrological processes, large-scale studies are still the mainstay. During the last ten years, attention has focused on downscaling and modeling through appropriate discretization methods [27]. The development of models with high prediction accuracy and computational efficiency is a key issue for basin-scale flood forecasting. Liu et al. [28] coupled lumped hydrological modeling with WRF for flood forecasting on a 135.2 km<sup>2</sup> catchment at a 10 km resolution; Li et al. [29] used rainfall from a 20 km WRF output to drive the distributed Luxihe model, extending the flood forecast period in the Liujiang catchment (58,270 km<sup>2</sup>). Rogelis [30] compared the flows obtained when WRF data of different resolutions (minimum resolution 1.67 km) were used to drive different lumped hydrological models on a 380 km<sup>2</sup> catchment. Previous studies have mostly focused on humid regions; consequently, runoff methods are mostly based on saturation excess, and the appropriate construction of atmospheric-hydrological model systems for semi-humid and semi-arid areas has received limited discussion.

The coupling system can also integrate land surface models (LSMs) with hydrological models. Most LSMs and hydrological models incorporate the same descriptions of water balance, albeit with different aims [31,32]. LSMs evolved from land-atmosphere coupling models with the purpose of solving the surface energy balance equation and providing the necessary lower boundary conditions for the atmosphere [31,33]. Conversely, hydrological models focus less on radiation and more on hydrological changes (e.g., the lateral routing of water along land surfaces). Such models are the most complicated among the current coupling systems because of their complex structure, the sensitive parameters that must be determined for the relevant physical processes, and their hard parameters (fixed parameters written directly into the source code when the model is compiled).

Limited research has been undertaken to model atmospheric-hydrological processes in the semi-humid and semi-arid regions of northern China. Consequently, there is a lack of effective atmospheric-hydrological coupled forecasting systems for this region. Herein, we used the WRF model and a three-dimensional variational (3DVar) data assimilation module, coupled to three hydrological models of varying complexity, to construct the required model systems. To test the influence of the level of complexity, three types of models were selected, namely the lumped Hebei model, the grid-based Hebei model, and the WRF-Hydro model. These models were run both standalone and coupled with the WRF model and the 3DVar data assimilation module. Four typical storm flood events with different spatial and temporal rainfall distributions, all of which occurred in the upper Daqing River catchment, were explored before and after data assimilation. The purpose of this study was to investigate the impact of data assimilation on forecasting different types of rainfall-runoff events after coupling with hydrological structures of varying complexity. It should be noted that the atmospheric-hydrologic coupling in this study refers to "one-way" coupling of the three standalone hydrological model structures with the WRF and 3DVar data assimilation module, which means that the hydrological models are driven by the WRF and 3DVar outputs without feedback to the atmospheric modeling processes. The results obtained in this way reflect the direct effects of data assimilation on rainfall and runoff forecasts.
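The one-way coupling described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the single linear reservoir is a hypothetical stand-in for a lumped hydrological model (it is not the Hebei model or WRF-Hydro), the function names and parameter values are invented here, and the rainfall series are made-up placeholders rather than WRF or 3DVar output. The only point it shows is the direction of data flow: rainfall passes from the atmospheric side to the hydrological side, and nothing is passed back.

```python
from typing import List


def linear_reservoir_runoff(
    rainfall_mm: List[float],
    k: float = 0.3,             # hypothetical outflow fraction per time step
    runoff_coeff: float = 0.5,  # hypothetical share of rainfall that becomes runoff
) -> List[float]:
    """Toy lumped model: one linear reservoir (illustrative stand-in only)."""
    storage = 0.0
    discharge = []
    for p in rainfall_mm:
        storage += runoff_coeff * p  # effective rainfall enters storage
        q = k * storage              # outflow is proportional to storage
        storage -= q
        discharge.append(q)
    return discharge


def one_way_coupling(forecast_rainfall_mm: List[float]) -> List[float]:
    """Drive the hydrological model with forecast rainfall.

    The rainfall is read but never modified, and no hydrological state
    is returned to the atmospheric side: that is the 'one-way' part.
    """
    return linear_reservoir_runoff(forecast_rainfall_mm)


if __name__ == "__main__":
    # Made-up placeholder series (mm per step), standing in for rainfall
    # forecasts without and with data assimilation.
    rain_without_da = [0.0, 2.0, 10.0, 4.0, 0.0, 0.0]
    rain_with_da = [0.0, 4.0, 16.0, 6.0, 0.0, 0.0]

    for label, rain in [("without DA", rain_without_da), ("with DA", rain_with_da)]:
        flow = one_way_coupling(rain)
        print(label, [round(q, 2) for q in flow])
```

Because the hydrological model is identical in both runs, any difference between the two simulated discharge series comes only from the rainfall input, which is why a one-way setup isolates the effect of data assimilation on the runoff forecast.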

There were four basic questions we aimed to explore:


#### **2. Study Area and Events**
