**4. Results**

The forecasting results for the three coupled systems before and after data assimilation for the four studied rainfall-runoff events are shown in Figure 6. In each subfigure, the black bars and black solid curve indicate observed rainfall and runoff, respectively. Red and blue bars and curves indicate rainfall and runoff before and after data assimilation, respectively. For the studied coupling systems, assimilation improved the rainfall and runoff forecasts to varying degrees. The following analyses examine the effect of data assimilation on rainfall, runoff, and the coupled model systems separately.

**Figure 6.** Rainfall-runoff predictions for the three atmospheric-hydrologic coupled systems before and after data assimilation: (**a**) Lumped Hebei; (**b**) Grid-based Hebei; (**c**) WRF-Hydro.

#### *4.1. Effect of Data Assimilation on Rainfall Prediction*

Figure 7 shows the variation in cumulative precipitation over the simulated period caused by the cycling assimilation processes. The black solid curve represents the observed precipitation, while the other colors represent different assimilation periods. As mentioned above, run1 was the precipitation forecast without assimilation. With data assimilation every 6 h, the cumulative rainfall gradually approached the observed values by the end of run6. For Event 2 and Event 3, run2 and run3 performed markedly better than run1. For Event 1, run4 and run5 changed and improved the temporal distribution of precipitation within a few hours after data assimilation, whereas run2 and run3 would have given slightly worse results than run1 had no further assimilation taken place. Generally, forecasts from cycling data assimilation were largely stable after five cycles for all events, and the final curve integrated from the data assimilation runs (run2 to run6) was enhanced relative to the original run (run1). In addition to performing well for the selected typical precipitation events, the cycling data assimilation gradually improved the temporal variability.

**Figure 7.** Cumulative rainfall curves before and after data assimilation under different assimilation run conditions.

Table 4 presents further statistics on the performance of data assimilation in improving cumulative rainfall. The table shows that forecasted precipitation increased significantly compared to the case of no data assimilation. Overall, assimilation reduced the relative error (*RE*) by 0.26. Event 3 showed the largest change in error, with a reduction of 0.342, while Event 4 showed the largest increase in rainfall after assimilation, at 19.76 mm.
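The comparison of accumulated rainfall can be sketched as follows. This is a minimal illustration, assuming *RE* is the signed relative error of the event total, (forecast − observed) / observed; the exact definition used in the study is given in its methods section, and the hourly values and function names below are hypothetical.

```python
def cumulative(hourly):
    """Running total of an hourly rainfall series (mm)."""
    total = 0.0
    out = []
    for value in hourly:
        total += value
        out.append(total)
    return out


def relative_error(forecast_total, observed_total):
    """Signed relative error of accumulated rainfall (assumed definition)."""
    if observed_total == 0:
        raise ValueError("observed total must be non-zero")
    return (forecast_total - observed_total) / observed_total


# Hypothetical hourly rainfall (mm); these are NOT values from Table 4.
observed = [0.0, 2.0, 5.0, 8.0, 3.0, 2.0]
before = [0.0, 1.0, 2.0, 4.0, 2.0, 1.0]   # forecast without assimilation
after = [0.0, 2.0, 4.0, 7.0, 3.0, 2.0]    # forecast with assimilation

obs_total = cumulative(observed)[-1]
re_before = relative_error(cumulative(before)[-1], obs_total)
re_after = relative_error(cumulative(after)[-1], obs_total)

# Reduction in the magnitude of the error and increase in rainfall depth
re_reduction = abs(re_before) - abs(re_after)
rainfall_increase = cumulative(after)[-1] - cumulative(before)[-1]
```

A positive `re_reduction` means the assimilated total is closer to the observed total; `rainfall_increase` is the change in forecasted depth in millimetres.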

**Table 4.** Observed and forecasted rainfall accumulation before and after data assimilation.


The normalized Taylor diagrams of cumulative rainfall before and after assimilation are shown in Figure 8; the horizontal and vertical coordinates are normalized by dividing by the standard deviation (*SD*) of the observed series. The variations of the assimilated results were closer to the actual observations. The correlation coefficients (*CC*) for cumulative rainfall both before and after assimilation were above 0.9, and the *CC* and root mean square error (*RMSE*) after data assimilation showed a significant improvement over the corresponding values before assimilation, especially for Event 1, in which the *CC* increased from 0.93 to 0.99. In the case of Event 3, the *CC* decreased from 0.98 to 0.96 after assimilation; a similar trend was noted for Event 4. The previous calculations of the temporal *Cv* values for the precipitation events showed that these two events were more heterogeneous in time and space than Event 1 and Event 2. In Event 3, for example, the temporal *Cv* value of 2.3925 was much higher than that of Event 1 (0.6011). This may explain the increased bias in assimilation, since the improvement in the rainfall forecast after assimilation is determined by the amount of effective information contained in the data. It is easier for radar and GTS to capture data during periods of rainfall that are homogeneously distributed in space and time.
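The quantities plotted on a normalized Taylor diagram, and the coefficient of variation, can be computed as in the sketch below. It assumes population (divide-by-n) statistics, the centered RMS difference conventionally used on Taylor diagrams, and *Cv* = *SD* / mean; the study may use slightly different conventions, and the function names are illustrative.

```python
import math


def taylor_stats(simulated, observed):
    """Return (CC, normalized SD, normalized centered RMS difference).

    SD and the centered RMS difference are divided by the SD of the
    observed series, so the observation sits at (1, 0) on the diagram.
    """
    n = len(observed)
    mean_s = sum(simulated) / n
    mean_o = sum(observed) / n
    dev_s = [s - mean_s for s in simulated]
    dev_o = [o - mean_o for o in observed]
    sd_s = math.sqrt(sum(d * d for d in dev_s) / n)
    sd_o = math.sqrt(sum(d * d for d in dev_o) / n)
    cc = sum(a * b for a, b in zip(dev_s, dev_o)) / (n * sd_s * sd_o)
    crmsd = math.sqrt(sum((a - b) ** 2 for a, b in zip(dev_s, dev_o)) / n)
    return cc, sd_s / sd_o, crmsd / sd_o


def coefficient_of_variation(series):
    """Cv = SD / mean, using the population SD (assumed convention)."""
    n = len(series)
    mean = sum(series) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    return sd / mean


# Hypothetical cumulative rainfall series (mm), for illustration only.
obs = [0.0, 2.0, 7.0, 15.0, 18.0, 20.0]
sim = [0.0, 1.0, 5.0, 12.0, 16.0, 19.0]
cc, sd_ratio, crmsd_ratio = taylor_stats(sim, obs)
```

The three returned values satisfy the Taylor diagram identity `crmsd_ratio**2 == 1 + sd_ratio**2 - 2 * sd_ratio * cc` (up to rounding), which is why a single point on the diagram encodes all three statistics.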

**Figure 8.** Taylor diagrams of forecasted hourly cumulative rainfall before and after data assimilation.

The above results demonstrate that WRF-3DVar effectively improved the consistency of simulated precipitation. Specifically, cycling assimilation of radar reflectivity and GTS data in the study area improved both the initial and lateral boundary conditions, providing a basis for future research into the accurate modeling of atmospheric-hydrological systems.

#### *4.2. Effect of Data Assimilation on Runoff Prediction*

In addition to the above analyses of the precipitation process, the effect of data assimilation on the runoff process was also of interest. We found that runoff forecasts were generally more accurate when data assimilation was used. For example, the flood peak of Event 2 simulated by the coupling system with the lumped model improved significantly after assimilation (Figure 6).

A heat map of the *NSE*, *Rv*, and *Rp* indices for the flood processes of the four studied events is given in Figure 9. To evaluate the effects of data assimilation on runoff prediction, the indices for each event were averaged over the results of the three coupled systems with different complexities. Figure 9 also shows the degree of improvement in the three indices after assimilation, demonstrating an overall improvement in *NSE*, *Rv*, and *Rp* of 0.386, 0.474, and 0.252, respectively. Event 1, which had a relatively homogeneous distribution in space and time, showed the largest improvements in *Rp* and *NSE* after assimilation compared to the case of no assimilation, of 0.502 and 0.597, respectively. In contrast, although Event 3 showed the largest improvement in the *RE* of cumulative rainfall after assimilation (Table 4), it showed the smallest improvements in both *Rv* and *NSE*, of 0.293 and 0.322, respectively. This limited improvement may stem from the poor spatial and temporal homogeneity of Event 3, which poses difficulties for prediction with coupled systems even if the overall rainfall input does not differ significantly from the actual observations. The indices thus reflect the complexity of rainfall-runoff processes.
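The runoff indices can be sketched as below. *NSE* follows the standard Nash-Sutcliffe definition; treating *Rv* and *Rp* as signed relative errors of flood volume and flood peak is an assumption made here for illustration (the study's own definitions are given in its methods section), as is measuring their improvement by the reduction in absolute value. The discharge series are hypothetical.

```python
def nse(simulated, observed):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, <= 0 is no better than the mean."""
    mean_o = sum(observed) / len(observed)
    num = sum((s - o) ** 2 for s, o in zip(simulated, observed))
    den = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - num / den


def relative_volume_error(simulated, observed):
    """Assumed Rv: signed relative error of total flood volume."""
    return (sum(simulated) - sum(observed)) / sum(observed)


def relative_peak_error(simulated, observed):
    """Assumed Rp: signed relative error of the flood peak."""
    return (max(simulated) - max(observed)) / max(observed)


def improvement(before, after, closer_to_zero=False):
    """Gain after assimilation: increase for NSE, reduction in |error| for Rv and Rp."""
    if closer_to_zero:
        return abs(before) - abs(after)
    return after - before


# Hypothetical hourly discharge (m^3/s); not data from the study.
obs = [1.0, 2.0, 4.0, 8.0, 4.0, 2.0, 1.0]
q_before = [1.0, 1.0, 2.0, 4.0, 2.0, 1.0, 1.0]
q_after = [1.0, 2.0, 4.0, 7.0, 4.0, 2.0, 1.0]

nse_gain = improvement(nse(q_before, obs), nse(q_after, obs))
rv_gain = improvement(relative_volume_error(q_before, obs),
                      relative_volume_error(q_after, obs), closer_to_zero=True)
rp_gain = improvement(relative_peak_error(q_before, obs),
                      relative_peak_error(q_after, obs), closer_to_zero=True)
```

Averaging such gains over the three coupled systems for one event, or over the four events for one system, gives summary values of the kind reported in the heat maps.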

**Figure 9.** Averaged runoff prediction indices for *NSE*, *Rv*, and *Rp* of the four storm events.

Figure 10 shows the normalized Taylor diagrams of the runoff events for the atmospheric-hydrological coupling systems before and after assimilation. With the exception of Event 3, the assimilated runoff processes had smaller *RMSE* values and *SD* values closer to those of the observation series. Assimilated flood discharge often had a higher *CC* than in the case of no data assimilation, but this was not always so; for some of the spatio-temporally heterogeneous rainfall-runoff events, the *CC* was slightly smaller after assimilation in all three coupling systems. The *CC* values after rainfall data assimilation were mostly above 0.6; only the WRF-Hydro simulations of Event 2 and Event 3 had smaller *CC* values, as shown in Figure 8. This corresponds to the rainfall processes of Event 2 (which had two rainfall peaks) and Event 3 (which featured a short and concentrated rainfall process).

**Figure 10.** Taylor diagrams of forecasted hourly runoff before and after data assimilation.

#### *4.3. Effect of Data Assimilation on Coupled Systems with Variable Complexity*

Smaller catchments are particularly vulnerable to uncertainties and spatial shifts in rainfall patterns that may result in poor streamflow performance [27]. Figure 10 illustrates the stability of the coupling systems, with the grid-based model in blue, WRF-Hydro in yellow, and the lumped model in red. The *CC* for the grid-based model exceeded 0.9 both before and after assimilation, and its *RMSE* values were smaller than those of the other coupling systems, with the exception of Event 3. The lumped model had the next highest *CC* values, between 0.6 and 0.9, whereas WRF-Hydro exhibited a more scattered *CC* distribution. Nonetheless, after assimilation, WRF-Hydro captured the flood peak of Event 4 better. Detailed indices for the storm events are given below, followed by further distinctions between the effects of the different coupled systems under different types of rainfall before and after assimilation, as shown in Figure 11.

**Figure 11.** Runoff prediction indices of *NSE*, *Rv*, and *Rp* for different atmospheric-hydrologic modeling systems.

#### 4.3.1. Results with the Lumped Hebei Model

Most lumped models lack the spatial information required to describe hydrological processes [56]. The lumped Hebei model represents spatial variability through probability functions (i.e., infiltration excess and saturation excess curves), thus ignoring the true spatial distribution. For the studied storms, the lumped model generally produced an early flood peak, which was 8 h and 4 h early for Event 3 and Event 4, respectively (Figure 6). Both Event 3 and Event 4 exhibited inhomogeneous spatial and temporal rainfall distributions. For these events, forecasts from the lumped model were less accurate in peak timing. This may be because, when catchment-averaged rainfall is used as an input, the effect of spatial variability in the underlying surface on runoff generation can only be considered when the spatial distribution of the rainfall is homogeneous. When the rainfall is spatially heterogeneous, the model can represent neither the combined effect of the rainfall distribution and the underlying surface variability, nor the effect of net rainfall processes acting as multiple input sources.
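The peak timing error discussed above can be quantified with a short helper. This is an illustrative sketch only: the function name, the hourly time step, and the two discharge series are hypothetical and are not taken from Figure 6.

```python
def peak_time_offset(simulated, observed, dt_hours=1.0):
    """Time of the simulated peak minus time of the observed peak, in hours.

    A negative value means the simulated flood peak arrives early.
    If a series has several equal maxima, the first one is used.
    """
    i_sim = max(range(len(simulated)), key=lambda i: simulated[i])
    i_obs = max(range(len(observed)), key=lambda i: observed[i])
    return (i_sim - i_obs) * dt_hours


# Hypothetical hourly discharge (m^3/s): observed peak at hour 10,
# simulated peak at hour 2.
observed = [1, 1, 2, 3, 5, 8, 12, 18, 25, 30, 40, 28, 15]
simulated = [1, 4, 35, 20, 12, 8, 6, 5, 4, 3, 3, 2, 2]

offset = peak_time_offset(simulated, observed)  # -8.0, i.e. 8 h early
```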

#### 4.3.2. Results with the Grid-Based Hebei Model

Based on the grid-based Hebei model, an atmospheric-hydrological coupling system was constructed for different rainfall resolutions and applied to the study area. This approach further demonstrated that the model's descriptions of rainfall-runoff generation are compatible with local rainfall and flood forecasting, and that the grid-based Hebei model is more stable for forecasts both before and after assimilation than the other coupling systems tested. Its flood forecasts before assimilation were slightly less accurate than those of WRF-Hydro for Event 1 and Event 2, but were generally better than those of the lumped model. In particular, for the flood processes of Event 4, the grid-based Hebei model obtained the best *NSE* results (0.643 before assimilation and 0.874 after assimilation), demonstrating that this model is well adapted to modeling flash floods. Although there was no clearly defined division of soil in the grid-based hydrological model, the influence of spatial heterogeneity due to soil type was somewhat reduced by the relatively homogeneous nature of the study area.

#### 4.3.3. Results with the WRF-Hydro Modeling System

WRF-Hydro showed the opposite pattern of prediction accuracy to the lumped model: its forecasts for the spatio-temporally heterogeneous Event 3 and Event 4 were better than its forecast for Event 1. With the exception of Event 4, flooding processes were found to exhibit faster surface runoff recession, which may be related to the rainfall-runoff generation and land surface interactions that occur at every short integration time step in WRF-Hydro. This increased the volume of infiltrated precipitation, producing higher soil moisture and reducing runoff [57]. For Event 1, which had a small flood magnitude and long flood duration, the flood process was subject to a rapid recession, resulting in a poor *NSE* (−0.5 before assimilation and −0.107 after assimilation). Nevertheless, WRF-Hydro provided a more favorable forecast for Event 4, demonstrating its potential ability to predict flash floods. Overall, however, the accuracy of the WRF-Hydro forecasts was poor, supporting the findings of previous studies by Wang et al. [58] and Sharma et al. [31].

#### *4.4. Improvement with Different Coupled Systems after Data Assimilation*

Figure 12 provides further statistics on the extent to which data assimilation improved the flood forecasts of each of the three coupled systems. The indices for each system were averaged over the results of the four studied storm events, which had different spatial and temporal characteristics.

**Figure 12.** Improvement of the forecasted runoff indices of *NSE*, *Rv*, and *Rp* for the three coupled systems.

Data assimilation enhanced the simulated flood response in all three coupled systems. The coupling system using the lumped model had a weak response without data assimilation, indicated by low *Rp* (−0.640) and *Rv* (−0.504) values, and therefore showed the largest enhancement in *Rp* and *Rv* after assimilation, especially in *Rp*, with an enhancement of 0.604. Before assimilation, the grid-based system was more stable than the other two systems, so its improvement was moderate among the three systems. For the WRF-Hydro system, the *Rp* values for Event 3 and Event 4, which were spatially and temporally heterogeneous rainfall events, were better before assimilation, but problems such as rapid surface runoff recession remain a concern. The improvements in *Rp* and *Rv* were more obvious after assimilation, whereas *NSE* was less improved.
