(2) Statistical Performance Comparison among the SPPs

Figure 4 compares boxplots of the four annual continuous evaluation metrics for the five SPPs. For example, Figure 4a contains five boxplots, each depicting the distribution of the annual CCs of one SPP across the 13 rainfall stations. At the daily scale, the annual CCs of IMERG\_F range from 0.79 to 0.82 among the 13 rainfall stations, compared to 0.71–0.75 for IMERG\_E, 0.73–0.77 for IMERG\_L and 3B42RT, and 0.74–0.77 for 3B42.

One-way ANOVA can be used to assess whether the mean values of the four continuous metrics differ significantly among the five SPPs. A critical precondition for ANOVA is homogeneity of variance among the compared groups. In this study, we use Levene's test to compare the variances of the metrics among the five SPPs; the test confirms that all four metrics meet the homogeneity-of-variance requirement (*p* > 0.05 in every case; Figure 4). The subsequent one-way ANOVA shows that the mean values of all four metrics differ significantly among the SPPs at a significance level (α) of 0.05 (Figure 4).
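The two-step check described above (Levene's test, then one-way ANOVA) can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used in this study: the station values are synthetic, the helper name `compare_groups` is hypothetical, and SciPy's `levene` defaults to the median-centered variant, which may differ from the variant used here.

```python
import numpy as np
from scipy import stats

# Synthetic, illustrative data only: 13 station-level annual CC values per product.
# The group centers and spread are assumptions, not the study's measurements.
rng = np.random.default_rng(0)
centers = {"IMERG_E": 0.73, "IMERG_L": 0.75, "IMERG_F": 0.80,
           "3B42RT": 0.75, "3B42": 0.755}
cc = {name: rng.normal(mu, 0.01, size=13) for name, mu in centers.items()}


def compare_groups(groups, alpha=0.05):
    """Run Levene's test, then one-way ANOVA, on a dict of 1-D samples."""
    samples = list(groups.values())
    # Levene's test: the null hypothesis is that all groups share one variance.
    _, levene_p = stats.levene(*samples)
    # One-way ANOVA: the null hypothesis is that all group means are equal.
    _, anova_p = stats.f_oneway(*samples)
    return {
        "levene_p": float(levene_p),
        "variances_homogeneous": bool(levene_p > alpha),
        "anova_p": float(anova_p),
        "means_differ": bool(anova_p < alpha),
    }


result = compare_groups(cc)
print(result)
```

A Levene *p*-value above α means that the homogeneity assumption is not rejected, so the ANOVA result can be interpreted; an ANOVA *p*-value below α indicates that at least one group mean differs from the others.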

In view of the significant ANOVA results, multiple posterior (post hoc) comparison tests, namely the Bonferroni, Sidak, Tukey, and Scheffe tests, are further conducted to identify the pairs of SPPs whose mean metrics differ significantly. In Figure 4, two SPPs are connected with a black dotted line if the posterior comparison tests conclude that the difference between their means is not significant at α = 0.05. As shown in the figure, the mean values of CC differ significantly between all pairs of SPPs except between IMERG\_L and the two TMPA products and between the two TMPA products themselves; the mean values of RMSE differ significantly except between 3B42 and the two near-real-time IMERG products and between the two near-real-time IMERG products themselves; the mean values of RB differ significantly except among the three pairs of IMERG products; and the mean values of MAD differ significantly except between IMERG\_L and the other two IMERG products and between IMERG\_E and 3B42. Notably, the posterior comparison tests show that IMERG\_F is the only IMERG product that differs significantly from both TMPA products in all four metrics.
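As a rough illustration of how such pairwise comparisons work, the sketch below computes Bonferroni- and Sidak-adjusted *p*-values for every pair of groups, using the pooled within-group mean square from the one-way ANOVA. The data are synthetic, the helper name `pairwise_posthoc` is hypothetical, and the Tukey and Scheffe tests are not covered; the statistical software used in this study may implement these tests differently.

```python
from itertools import combinations

import numpy as np
from scipy import stats

# Synthetic, illustrative data only (assumed centers and spread, 13 stations each).
rng = np.random.default_rng(1)
centers = {"IMERG_E": 0.73, "IMERG_L": 0.75, "IMERG_F": 0.80,
           "3B42RT": 0.75, "3B42": 0.755}
cc = {name: rng.normal(mu, 0.01, size=13) for name, mu in centers.items()}


def pairwise_posthoc(groups, alpha=0.05):
    """Return (a, b, p_bonferroni, p_sidak, not_significant) for each pair."""
    names = list(groups)
    k = len(names)
    n_total = sum(len(groups[n]) for n in names)
    df_within = n_total - k
    # Pooled within-group mean square (the ANOVA error term).
    ss_within = sum(
        float(((np.asarray(groups[n]) - np.mean(groups[n])) ** 2).sum())
        for n in names
    )
    mse = ss_within / df_within
    m = k * (k - 1) // 2  # number of pairwise comparisons
    rows = []
    for a, b in combinations(names, 2):
        xa, xb = np.asarray(groups[a]), np.asarray(groups[b])
        se = np.sqrt(mse * (1.0 / len(xa) + 1.0 / len(xb)))
        t = (xa.mean() - xb.mean()) / se
        p = float(2.0 * stats.t.sf(abs(t), df_within))  # two-sided p-value
        p_bonf = min(1.0, m * p)          # Bonferroni adjustment
        p_sidak = 1.0 - (1.0 - p) ** m    # Sidak adjustment
        # Flag the pair using the more conservative Bonferroni value.
        rows.append((a, b, p_bonf, p_sidak, bool(p_bonf >= alpha)))
    return rows


rows = pairwise_posthoc(cc)
for a, b, p_bonf, p_sidak, not_sig in rows:
    print(f"{a} vs {b}: Bonferroni p={p_bonf:.3g}, Sidak p={p_sidak:.3g}, "
          f"non-significant={not_sig}")
```

Pairs flagged as non-significant correspond to the pairs that would be joined by a dotted line in a figure such as Figure 4. For the Tukey test, recent SciPy releases provide `scipy.stats.tukey_hsd`.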

**Figure 4.** Boxplots of the four annual continuous evaluation metrics of the SPPs at the daily scale: (**a**) Correlation coefficient (*CC*) (Levene's test, *p* = 0.08; one-way ANOVA, *p* = 0.0); (**b**) Root-mean-square error (*RMSE*) (Levene's test, *p* = 0.16; one-way ANOVA, *p* = 7.3 × 10<sup>−13</sup>); (**c**) Relative bias (*RB*) (Levene's test, *p* = 0.78; one-way ANOVA, *p* = 3.2 × 10<sup>−13</sup>); and (**d**) Mean absolute difference (*MAD*) (Levene's test, *p* = 0.35; one-way ANOVA, *p* = 3.0 × 10<sup>−15</sup>). Two SPPs are connected with a black dotted line if the posterior comparison tests conclude that the difference between their means is not significant at α = 0.05. Each boxplot depicts the distribution, and therefore the variation, of a continuous evaluation metric among the 13 rainfall stations. In each boxplot, the bottom and top of the box represent the first and third quartiles, respectively. The whiskers extend to 1.5 times the interquartile range. The horizontal line inside the box represents the median. The '×' inside the box represents the mean.
