#### **3. Methodology**

For our study period, 782 eHYD gauges were used in this domain. The eHYD network accumulates daily precipitation over the 24 h ending at 07:00 local time, which differs from the accumulation convention of the satellite daily products. Therefore, the sub-daily precipitation estimates (e.g., half-hourly, three-hourly) were aggregated into daily totals aligned with the local time of the gauge measurement and, subsequently, into monthly totals. The eHYD gauge-based data were then compared with the gridded precipitation products across Austria. Because the SM2RAIN data are available only at daily (and not sub-daily) scale, we could use only the original SM2RAIN daily data, which estimate rainfall between 00:00 and 23:59 UTC of the indicated day.
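The re-aggregation of sub-daily estimates onto the gauge day can be sketched as follows. This is a minimal illustration and not the processing code of the study: the fixed UTC+1 offset (no daylight saving), the convention of labelling each window by the local date on which it ends, and the function names are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: gauge "local time" is a fixed UTC+1 offset (no daylight saving).
LOCAL_OFFSET = timedelta(hours=1)
# The gauge day ends at 07:00 local time (as stated in the text).
DAY_END_HOUR = 7


def gauge_day(start_utc):
    """Date label of the 07:00-to-07:00 local window containing an interval.

    `start_utc` is a naive datetime interpreted as UTC and marks the start of
    a sub-daily accumulation interval.
    Assumption: a window is labelled with the local date on which it ends.
    """
    local = start_utc + LOCAL_OFFSET
    # Shifting back by 7 h maps [07:00, 07:00 next day) onto one calendar
    # date; adding one day converts that start date into the end-date label.
    return (local - timedelta(hours=DAY_END_HOUR)).date() + timedelta(days=1)


def aggregate_to_gauge_days(records):
    """Sum sub-daily amounts (mm) into gauge-aligned daily totals.

    `records` is an iterable of (interval_start_utc, amount_mm) pairs.
    """
    totals = defaultdict(float)
    for start_utc, amount_mm in records:
        totals[gauge_day(start_utc)] += amount_mm
    return dict(totals)
```

With these assumptions, a half-hourly interval starting at 05:30 UTC (06:30 local) is assigned to the gauge day ending that morning, whereas an interval starting at 06:00 UTC (07:00 local) is assigned to the following gauge day.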

In this research, to evaluate how well the precipitation products capture precipitation patterns and to intercompare the gridded products, the value of each grid pixel is compared with the ground point observation (i.e., the station) located within it. Only cells containing at least one reporting station are selected for computation. Because the network is dense, numerous pixels contain two or more gauges; in that case, the average of those gauges is used as the basis for comparison. In total, the stations fell within 601 pixels over the country.
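The pixel-to-point matching described above can be sketched as below. The sketch assumes a regular latitude/longitude grid whose origin is the south-west corner of cell (0, 0); the grid resolution, origin, and function names are hypothetical and would differ between products.

```python
import math
from collections import defaultdict


def pixel_index(lat, lon, res_deg, lat0=-90.0, lon0=-180.0):
    """Row and column of the grid cell that contains a point.

    Assumption: cells are `res_deg` degrees wide and (lat0, lon0) is the
    south-west corner of cell (0, 0).
    """
    row = math.floor((lat - lat0) / res_deg)
    col = math.floor((lon - lon0) / res_deg)
    return (row, col)


def pixel_reference(gauges, res_deg):
    """Average all gauges that fall within the same grid cell.

    `gauges` is an iterable of (lat, lon, precip_mm) tuples. Cells without
    any gauge never appear in the result, so they drop out of the comparison.
    """
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [running total, count]
    for lat, lon, precip_mm in gauges:
        cell = pixel_index(lat, lon, res_deg)
        sums[cell][0] += precip_mm
        sums[cell][1] += 1
    return {cell: total / count for cell, (total, count) in sums.items()}
```

For example, two gauges located in the same 0.1° cell and reporting 4 mm and 6 mm yield a single reference value of 5 mm for that cell.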

The assessment and validation of the precipitation products are carried out based on continuous and categorical statistical metrics. To quantitatively compare the performance of the gridded products against in situ observations at daily and monthly time-scales, the continuous statistical metrics correlation coefficient (CC), bias, mean absolute error (MAE), and root mean square error (RMSE) are used. The CC is used to assess the agreement between SPEs and rain-gauge observations. CC values range from −1 to +1, where +1 indicates a perfect positive linear correlation (a perfect score) and −1 indicates a perfect negative linear correlation. The bias is defined as the average difference between the satellite/model precipitation estimates and the in situ observations (estimate minus observation), and can be either positive or negative. A negative bias indicates underestimation by the satellite precipitation, while a positive bias indicates overestimation. The MAE represents the average magnitude of the error, and MAE = 0 indicates a perfect score. The RMSE also measures the average error magnitude but weighs the errors according to their squared value, which gives greater weight to larger errors than the MAE does. RMSE = 0 represents a perfect score.

To examine the capability of the products to detect precipitation, two categorical statistical metrics, probability of detection (POD) and false alarm ratio (FAR), are used (see Appendix A). POD is an indicator of the SPE's ability to correctly detect precipitation events. Values vary from 0 to 1, with 1 as a perfect score. FAR denotes the fraction of cases in which the SPEs record precipitation when the rain gauges do not. Values vary from 0 to 1, with 0 as a perfect score. For extreme precipitation, we use the R90 index to measure extreme wet conditions. R90 is the 90th percentile of precipitation on wet days in a year (i.e., after excluding days with precipitation less than 0.1 mm).

We further break down the analysis by station elevation, classifying the stations over the whole country into those at or below 1000 m and those above 1000 m.
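The metrics described above can be written compactly as in the following sketch. It uses the standard textbook forms, which are assumed (not confirmed here) to match the definitions in Appendix A. The wet-day threshold of 0.1 mm for POD and FAR and the linear interpolation rule for the percentile are likewise assumptions, and all function names are illustrative.

```python
import math


def continuous_metrics(est, obs):
    """CC, bias, MAE and RMSE of estimates against gauge observations."""
    n = len(obs)
    mean_e, mean_o = sum(est) / n, sum(obs) / n
    diffs = [e - o for e, o in zip(est, obs)]  # estimate minus observation
    cov = sum((e - mean_e) * (o - mean_o) for e, o in zip(est, obs))
    var_e = sum((e - mean_e) ** 2 for e in est)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return {
        "cc": cov / math.sqrt(var_e * var_o),
        "bias": sum(diffs) / n,  # negative means underestimation
        "mae": sum(abs(d) for d in diffs) / n,
        "rmse": math.sqrt(sum(d * d for d in diffs) / n),
    }


def categorical_metrics(est, obs, wet_mm=0.1):
    """POD and FAR from hits, misses and false alarms.

    Assumption: a day is wet when precipitation >= wet_mm, and the sample
    contains at least one observed and one estimated wet day.
    """
    pairs = list(zip(est, obs))
    hits = sum(e >= wet_mm and o >= wet_mm for e, o in pairs)
    misses = sum(e < wet_mm and o >= wet_mm for e, o in pairs)
    false_alarms = sum(e >= wet_mm and o < wet_mm for e, o in pairs)
    return {
        "pod": hits / (hits + misses),
        "far": false_alarms / (hits + false_alarms),
    }


def r90(daily_mm, wet_mm=0.1):
    """90th percentile of wet-day precipitation.

    Days below wet_mm are excluded first. Assumption: linear interpolation
    between the two nearest ranked values.
    """
    wet = sorted(p for p in daily_mm if p >= wet_mm)
    pos = 0.9 * (len(wet) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(wet) - 1)
    return wet[lo] + (pos - lo) * (wet[hi] - wet[lo])
```

The same functions can be applied separately to the two elevation classes (stations at or below 1000 m and stations above 1000 m) by filtering the station list before the metrics are computed.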

The processing stages for the error analysis of this study are shown in Figure 2.

**Figure 2.** Processing stages for error analysis of this study.

#### **4. Results**

In the first step, we evaluated IMERG-V05, -V05-RT, -V06A, MSWEP, ERA5, and SM2RAIN at the daily time-scale over all days from June 2014 to December 2015, using the in situ observations as reference. Because IMERG-FR V06A starts in June 2014 and the eHYD data were discontinued by December 2015, the study period was confined to the 19 months between June 2014 and December 2015. Further work relying on a larger data sample is needed to evaluate the seasonal and inter-annual behavior of these products. Figure 3 shows the spatial distribution and average statistical indices (bias, CC, RMSE, and MAE) at the daily precipitation time-scale for all products over Austria.

Precipitation types vary across the area. The region is typified by stratiform and convective precipitation, while the west and middle of the area (along the Alpine mountains) are additionally dominated by complex precipitation systems due to the orography. Figures 3 and 4 show the spatial distribution and the statistical summary of the metrics for the aforementioned products at daily and monthly resolution over Austria. According to Figure 3, ERA5 and SM2RAIN show a weaker correlation with the observations, with mean CC of 0.53 and 0.57, respectively, compared to MSWEP (CC = 0.86) at the daily time-scale. In particular, both SM2RAIN and IMERG-V05-RT showed low CC over the Alpine mountains, while they showed a better CC over the east and middle of the country.

The CC metric describes the agreement between the gridded precipitation products and the in situ observations. As can be seen in Figure 3, MSWEP yields markedly better CC than the other products over the whole domain, with values in the range of 0.8 to 1 in most pixels. In contrast, ERA5 showed very low CC over the southern and northern parts of the country and rather high CC in low-altitude areas. The CC of all three IMERG versions and of SM2RAIN is broadly similar and relatively low over the western part in comparison to MSWEP. Several factors could contribute to this lower CC over such areas: a) the topography and climate of the western domain are partly complex, which poses a major challenge for SPE accuracy [12]; b) the GPCC stations used for the calibration of IMERG are at a monthly time-scale, whereas the products are examined here at a daily time-scale, which potentially degrades the apparent quality of the IMERG products. In general, the MAE and RMSE are markedly higher at high altitudes and lower at low altitudes. This is likely due to the greater sensitivity of RMSE to large errors, combined with the high number of local, heavy convective precipitation events over the high altitudes of Austria.
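The sensitivity of RMSE to a few large errors, such as those produced by localized convective events, can be illustrated with a small constructed example. The error values below are hypothetical and are not taken from the study; they are chosen so that both series have the same MAE.

```python
import math


def mae(errors):
    """Mean absolute error of a list of errors (mm)."""
    return sum(abs(e) for e in errors) / len(errors)


def rmse(errors):
    """Root mean square error of a list of errors (mm)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical daily errors (estimate minus observation, in mm).
uniform_errors = [2.0, 2.0, 2.0, 2.0]  # the same moderate error every day
single_burst = [0.0, 0.0, 0.0, 8.0]    # one badly missed heavy event

# MAE is 2.0 mm for both series, whereas RMSE is 2.0 mm for the uniform
# errors and 4.0 mm for the series containing the single large error.
mae_uniform, rmse_uniform = mae(uniform_errors), rmse(uniform_errors)
mae_burst, rmse_burst = mae(single_burst), rmse(single_burst)
```

The total absolute error is identical in both cases, yet the RMSE doubles when it is concentrated in one event, which is the behavior expected where heavy local events are frequent.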

The general analysis of the results shows that IMERG-V05B, -RT, -V06A, SM2RAIN, and ERA5 have similar scores with respect to MAE and RMSE, although ERA5 surpasses the other products according to the bias scores. However, MSWEP yields the best result according to the error indices, with bias, RMSE, and MAE of −15 mm, 2.86 mm, and 1.08 mm, respectively. This means that, among these recent state-of-the-art precipitation products, MSWEP daily precipitation is the closest to the in situ precipitation observations in Austria. These results are consistent with Beck et al. [22], who underscore the importance of applying daily rather than monthly gauge corrections and of accounting for gauge reporting times. Meanwhile, IMERG and ERA5 show a relatively larger spatial variability of bias and CC. One cause of error in the gridded precipitation products might be that they estimate precipitation for a whole pixel when a localized precipitation event affects only some of the stations within it, particularly in the western part of the area, which is characterized by complex systems, thereby wrongly assigning the event to unaffected stations [23]. After numerous occurrences, this process causes an average areal overestimation/underestimation. The tendency of reanalysis data to overestimate precipitation frequency might be the cause of the ERA5 precipitation overestimation [18,24].

Although all the products exhibited broadly similar mean statistical scores overall, there were considerable regional differences. Compared to MSWEP and IMERG-FR, the ERA5, SM2RAIN, and IMERG-V05-RT products performed substantially worse over regions of complex terrain [22]. The results suggest that the topography and climate characteristics of the region should be considered when choosing between satellite and reanalysis datasets.

**Figure 3.** Statistical indices (bias, MAE, RMSE, and CC, from left to right columns) at daily time scale for IMERG-V05, IMERG-V06A, IMERG-V05-RT, MSWEP, ERA5, and SM2RAIN. The center-line of each boxplot depicts the median value (50th percentile) and the box encompasses the 25th and 75th percentiles of the sample data, while the whiskers represent the extreme values.

**Figure 4.** Statistical indices (bias, CC, RMSE, and MAE, from top to bottom rows) at monthly time scale for IMERG-V05B, IMERG-V06A, MSWEP, ERA5, and SM2RAIN. The center-line of each boxplot depicts the median value (50th percentile) and the box encompasses the 25th and 75th percentiles of the sample data, while the whiskers represent the extreme values.

The monthly statistical indices from all precipitation products versus in situ observations are shown in Figures 4 and 5. At the monthly scale, although all the examined products performed rather close to the in situ measurements, MSWEP, followed by IMERG-V05B and -V06A, compared best with the corresponding in situ measurements. The CC of MSWEP, IMERG-V05B, -V06A, and SM2RAIN exhibited strong agreement with observations over the whole area. Although IMERG-V05-RT and ERA5 showed rather good CC for the eastern part of the country, they performed weakly for the western and southern parts of the region, respectively, which might be due to the effect of relief and complex systems in that area. MSWEP, with scores of 6.14 mm, 22.37 mm, 28.29 mm, and 0.93, and ERA5, with −2.08 mm, 43.26 mm, 54 mm, and 0.68, for bias, MAE, RMSE, and CC, respectively, were identified as the best and worst products. ERA5 strongly overestimated precipitation in the northern part of the area and underestimated it in the southern and western parts of the country, so that these opposing errors partly cancel in the mean areal bias of −2.08 mm. This pattern might be due to its coarse native resolution and/or parameterization limitations in the precipitation generation processes [17,25]. The box-plots confirm that the bias of most IMERG-V05B, -V06A, and MSWEP pixels lies in a narrow range close to zero, whereas ERA5 has a wider bias range. This suggests that the gauge-correction methodologies of IMERG and MSWEP are effective. Overall, MSWEP, followed by IMERG-V05B and -V06A, showed improvements in monthly precipitation in comparison with the other products.
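The box-plot quantities used in this comparison (median, 25th and 75th percentiles, and extreme values) can be computed from the pixel-wise scores of a product as in the sketch below. The quartile interpolation rule (the "inclusive" method of the Python standard library) is an assumption, since the plotting software used for the figures is not stated, and the function name is illustrative.

```python
import statistics


def box_summary(values):
    """Five-number summary matching the box-plot elements in the captions.

    Whiskers are taken as the minimum and maximum (the extreme values).
    Assumption: quartiles use linear interpolation between data points.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }
```

Applied to the pixel-wise bias of each product, a narrow interquartile range (`q3` minus `q1`) centred near zero corresponds to the behavior described for MSWEP and IMERG, and a wide range to that described for ERA5.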

**Figure 5.** Time-series of mean monthly precipitation and box-plots of daily and monthly time-scales across Austria from IMERG, MSWEP, ERA5, and SM2RAIN products as compared to eHYD stations for the period of June 2014–December 2015. The center-line of each boxplot depicts the median value (50th percentile) and the box encompasses the 25th and 75th percentiles of the sample data, while the whiskers represent the extreme values.

Figure 5 shows the daily and monthly time-series comparison and boxplots of the regional average precipitation from the stations and from the precipitation products over Austria from June 2014 to December 2015. All precipitation products generally captured the spatio-temporal precipitation patterns at daily and monthly time-scales, with the highest amounts occurring in July and August 2014 and the lowest amounts observed in February and December 2015. However, in the monthly comparison it is evident that the MSWEP estimates are very close to the in situ observations, with a tendency to slightly overestimate precipitation during August 2014 and July 2015, which might be due to the small scale of the precipitation systems that dominate during these months, while IMERG-V05-RT seems to overestimate systematically from March to November 2014.
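A regional mean monthly series of the kind plotted in Figure 5 can be derived from daily station (or pixel) totals as sketched below. The data layout (one dictionary of daily totals per station) and the function names are assumptions made for the illustration.

```python
from collections import defaultdict
from datetime import date


def monthly_totals(daily):
    """Sum daily precipitation (mm) into (year, month) totals.

    `daily` maps a date to the daily total of one station or pixel.
    """
    totals = defaultdict(float)
    for day, amount_mm in daily.items():
        totals[(day.year, day.month)] += amount_mm
    return dict(totals)


def regional_mean(stations_daily):
    """Mean monthly total across stations, for each (year, month).

    Assumption: a station contributes to a month only if it reported at
    least one value in that month.
    """
    sums = defaultdict(lambda: [0.0, 0])  # month -> [running total, count]
    for daily in stations_daily:
        for month, total in monthly_totals(daily).items():
            sums[month][0] += total
            sums[month][1] += 1
    return {month: total / count for month, (total, count) in sums.items()}
```

Running the same two steps on the gauge data and on each gridded product, restricted to the pixels that contain gauges, gives series that can be compared month by month.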

According to Figure 5, MSWEP outperformed the other products, with slight overestimation only in November 2014 and January 2015. The mean monthly data indicate that SM2RAIN underestimated precipitation during December 2014–April 2015 over Austria, which reflects a possible limitation of SM2RAIN-ASCAT data during the cold months. The SM2RAIN underestimation in winter can be related to snowfall, which SM2RAIN is unable to estimate. IMERG-V05B and -V06A behaved almost identically, with slight overestimation overall and a greater overestimation mainly in the months with lower precipitation intensity. IMERG-V05-RT shows systematic underestimation for February–October 2015. Additionally, from the median and the 25th and 75th percentiles of the box-plots, one can infer that the precipitation estimated by MSWEP, followed by IMERG-V05B and -V06A, is more accurate than that of the other products, although the MSWEP whiskers extend to the most extreme data points. In the box-plots of the daily comparison, SM2RAIN showed fewer extremes and outliers.
