#### 4.2.2. Categorical Evaluation Metrics

#### (1) Temporal Variation

Besides the continuous metrics, three categorical metrics (POD, FAR, and CSI) are used to assess the daily precipitation detection capabilities of the SPPs. Table 4 compares the mean values of the categorical evaluation metrics among the SPPs at the daily scale. A daily rainfall threshold of 1 mm/d is used in calculating the metrics.
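As a minimal sketch of how the three metrics can be derived from paired gauge and satellite daily totals, the snippet below counts hits, misses, and false alarms at the 1 mm/d threshold. It assumes the standard contingency-table definitions (POD = H/(H+M), FAR = F/(H+F), CSI = H/(H+M+F)) and treats a day as rainy when its total is at or above the threshold; the function name, the sample values, and the `>=` convention are illustrative assumptions, not taken from this study.

```python
def categorical_metrics(gauge, satellite, threshold=1.0):
    """Return POD, FAR and CSI for paired daily totals in mm/d.

    Assumption: a day counts as rainy when its total is >= threshold.
    """
    hits = misses = false_alarms = 0
    for g, s in zip(gauge, satellite):
        observed = g >= threshold    # gauge reports rain
        estimated = s >= threshold   # satellite product reports rain
        if observed and estimated:
            hits += 1
        elif observed:
            misses += 1
        elif estimated:
            false_alarms += 1

    def ratio(numerator, denominator):
        # Undefined (NaN) when no event falls in the denominator.
        return numerator / denominator if denominator else float("nan")

    return {
        "POD": ratio(hits, hits + misses),
        "FAR": ratio(false_alarms, hits + false_alarms),
        "CSI": ratio(hits, hits + misses + false_alarms),
    }


# Hypothetical six-day series, for illustration only.
gauge_mm = [0.0, 2.0, 5.0, 0.0, 0.5, 12.0]
satellite_mm = [1.5, 3.0, 0.0, 0.0, 2.0, 10.0]
scores = categorical_metrics(gauge_mm, satellite_mm, threshold=1.0)
print(scores)
```

In this toy series there are two hits, one miss, and two false alarms, so a perfect product would score POD = 1, FAR = 0, and CSI = 1, whereas the example falls short on all three.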


**Table 4.** Mean categorical evaluation metrics of the SPPs at daily scale.

<sup>a</sup> Spring extends from March to May; Summer extends from June to August; Fall extends from September to November; Winter extends from December to the following February.
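The seasonal grouping in the footnote can be sketched as a simple month-to-season lookup. Because winter spans two calendar years, the sketch labels January and February with the year of the preceding December; that labeling, along with the names `SEASON_BY_MONTH` and `season_key`, is a hypothetical choice made for illustration rather than a convention stated in this study.

```python
from datetime import date

# Month-to-season mapping following the footnote to Table 4.
SEASON_BY_MONTH = {
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Fall", 10: "Fall", 11: "Fall",
    12: "Winter", 1: "Winter", 2: "Winter",
}


def season_key(day):
    """Return (season, season_year) for a calendar date.

    Assumption: January and February belong to the winter that began
    in December of the previous calendar year.
    """
    season = SEASON_BY_MONTH[day.month]
    season_year = day.year - 1 if day.month in (1, 2) else day.year
    return season, season_year


print(season_key(date(2015, 12, 31)))
print(season_key(date(2016, 2, 29)))
print(season_key(date(2016, 3, 1)))
```

Grouping daily records by this key keeps each December to February winter contiguous before seasonal metrics are averaged.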

In terms of POD, the IMERG products tend to outperform the TMPA products both annually and in every season except summer, when IMERG\_L gives the poorest performance. The two product families also differ somewhat in their seasonal patterns. The PODs of the IMERG products peak in spring (≥ 0.85), decline slightly to around 0.8 in summer, and fall further to around 0.67 in fall and winter. The PODs of the TMPA products peak in summer, drop to around 0.75 in spring, and decline further to 0.6 in fall and below 0.45 in winter. IMERG\_F has the highest POD throughout the year, except in spring, when it is slightly lower than the other two IMERG products (Table 4). The higher PODs in spring and summer suggest that the SPPs are poorer at detecting the light precipitation that dominates in fall and winter. Meanwhile, the much lower PODs of the TMPA products in winter suggest that they are less capable of detecting solid precipitation than the IMERG products.

In terms of FAR, by contrast, the IMERG products all tend to perform worse than the TMPA products. The FARs of the five SPPs nevertheless follow a similar seasonal pattern: they peak in summer, decrease slightly in fall, and drop further in spring and winter (Table 4).

CSI, which incorporates both correct rainfall detections and false alarms, indicates a mixed performance among the SPPs. Among the five SPPs, IMERG\_F performs the best annually, as well as in fall and winter. It performs slightly worse than both TMPA products in summer and slightly worse than IMERG\_L in spring. Seasonally, all five SPPs tend to perform best in spring, followed by summer. However, the IMERG products tend to perform slightly better in winter than in fall, while the TMPA products perform considerably worse in winter (Table 4).

Consistent with our study, Xu et al. (2019) [48] conclude that IMERG\_F performs better than 3B42 in detecting precipitation events in the relatively flat Huang-Huai-Hai Plain of East Coastal China, with an annual POD of 0.83 and CSI of 0.52. The PODs and CSIs of IMERG\_F surpass those of 3B42 in all seasons, especially in winter. This indicates that IMERG\_F performs better in detecting precipitation events, especially in capturing light or solid precipitation.

#### (2) Spatial Variation

Figure A5 compares the spatial distribution of the three annual categorical evaluation metrics among the SPPs. Unlike the case of continuous metrics, topography does not seem to impose a consistent impact on categorical metrics at the daily scale. For example, while most SPPs have lower correct precipitation detection rates (PODs) at the two high-altitude stations (Stations 12 and 13), they also have lower false alarm rates (FARs) at those stations. This leads to no obvious pattern in the spatial distribution of CSIs, with performance varying among stations at similar altitudes.

#### (3) Variation with Rainfall Thresholds

Figure 6 compares the precipitation detection performance of the five SPPs across daily rainfall magnitudes. Each of the three categorical metrics has been calculated annually, in turn, for the days when daily rainfall exceeds 1, 5, 10, 25, 50, 75, 100, and 150 mm/d. Similar daily rainfall thresholds have been used in previous studies, such as Wu et al. [29], Anjum et al. [43], and Tan et al. [49].
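The threshold sweep behind Figure 6 can be sketched as a loop that recomputes the contingency counts at each threshold. The sketch below assumes the standard definitions of POD, FAR, and CSI and counts a day as an event when its total is at or above the threshold; the function name `scores_by_threshold`, the sample series, and the extra `observed_events` count are hypothetical additions for illustration, not part of this study's workflow.

```python
# Daily rainfall thresholds (mm/d) listed in the text.
THRESHOLDS_MM = [1, 5, 10, 25, 50, 75, 100, 150]


def scores_by_threshold(gauge, satellite, thresholds=THRESHOLDS_MM):
    """Return one row of POD/FAR/CSI per rainfall threshold.

    Assumption: a day is an event when its total is >= the threshold.
    """
    nan = float("nan")
    rows = []
    for t in thresholds:
        # (observed, estimated) flags for every day at this threshold.
        pairs = [(g >= t, s >= t) for g, s in zip(gauge, satellite)]
        hits = sum(o and e for o, e in pairs)
        misses = sum(o and not e for o, e in pairs)
        false_alarms = sum(e and not o for o, e in pairs)
        detected = hits + false_alarms
        total = hits + misses + false_alarms
        rows.append({
            "threshold": t,
            "observed_events": hits + misses,
            "POD": hits / (hits + misses) if hits + misses else nan,
            "FAR": false_alarms / detected if detected else nan,
            "CSI": hits / total if total else nan,
        })
    return rows


# Hypothetical six-day series, for illustration only.
gauge_mm = [0.0, 2.0, 6.0, 12.0, 30.0, 0.5]
satellite_mm = [1.2, 0.0, 7.0, 8.0, 40.0, 0.0]
rows = scores_by_threshold(gauge_mm, satellite_mm)
for row in rows:
    print(row)
```

Reporting the number of observed events alongside each score shows how many gauge days support the value at a given threshold, since the metrics become undefined once no event exceeds it.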

As seen from Figure 6a, the PODs of all five SPPs largely decrease as the daily rainfall threshold increases, reaching their minimum at the threshold of 100 mm/d. The PODs of all SPPs then rebound substantially at the threshold of 150 mm/d. Interestingly, the PODs of the two TMPA products mostly surpass those of the IMERG products, indicating a better capability of correctly detecting daily rainfall occurrences.

However, as shown in Figure 6b, the FAR values of the TMPA products also surpass those of the IMERG products, especially IMERG\_F, at the majority of daily rainfall thresholds, indicating a higher risk of falsely detecting daily rainfall occurrences. Because it accounts for both false alarms and missed events, CSI provides a more comprehensive evaluation of the precipitation detection performance of the SPPs. As shown in Figure 6c, IMERG\_F has the highest CSI value at daily rainfall thresholds below 100 mm/d, whereas 3B42 catches up with it at thresholds of 100 mm/d and above.

**Figure 6.** Comparison of the changes in annual categorical evaluation metrics with daily rainfall thresholds among five SPPs: (**a**) *POD*; (**b**) *FAR*; and (**c**) *CSI*.

#### 4.2.3. Comparison with Previous Studies

Table 5 summarizes the performance of the SPPs in estimating daily rainfall in previous studies worldwide. Previous studies have mostly assessed SPPs over approximately two years, compared to nine years in this study. Note that the table is not intended as a rigorous comparison of the relative performance of the SPPs across regions, given the differences in time frame, geographical region, and climatic regime among the studies.

The CCs of both the IMERG and TMPA products in this study exceed those in all previous studies except the one by Su et al. [46], conducted in the Upper Huai River Basin of China. Unlike the CC, the values of the other continuous metrics and of the categorical metrics in this study all fall in the middle of the range reported in previous studies. In addition, similar to our findings, many previous studies have found that the IMERG products perform moderately better than the TMPA products in estimating daily rainfall. However, the tendency observed in this study for the IMERG products to underestimate daily rainfall and for the TMPA products to overestimate it is not consistent with the findings of some previous studies.


