*3.5. Geospatial Performance of IMERG Products*

Several studies have revealed a relationship between physical parameters (e.g., elevation, slope, latitude, longitude, and temperature) and satellite observation error [36]. Some of the errors are associated with the satellite sensor technology and the applied algorithm, while others are related to physical parameters on the ground. In this section, we discuss the most prominent results from investigating the relationship between the rainfall index factor (introduced in Section 2.3.4), which can introduce errors into the satellite data, and multiple statistical criteria (e.g., rBIAS, CC, and POD). The resulting boxplots for the other geospatial factors can be found in the Supplementary Materials.

#### Relation to Location-Specific Rainfall

The charts in Figure 8 display the variation of the location-specific criteria indices of the IMERG daily products for different categories of stations based on the rainfall index factor, which, among all factors, showed a significant relationship with the criteria indices. In these boxplots, the limits of the bins (categories of the rainfall index) are shown on the horizontal axis. For example, a hypothetical station with an average annual rainfall of 200 mm and a mean dry period of 20 days (i.e., a rainfall index of 200/20 = 10) falls in the third bin. An equal number of stations fall inside each bin. The vertical axis shows the variation of a specific index, e.g., CC. Figure 9 maps the locations of the stations in each category using different colors. As shown in this map, rain gauges with the highest rainfall index are located in the northern regions, mostly adjacent to the Caspian Sea coastline, as well as in the western regions. Moving from the north and west toward the central, eastern, and southern regions, the rainfall index generally decreases. This spatial pattern is mainly controlled by two major mountain ranges in Iran (Alborz along the northern border and Zagros along the western border).
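The binning described above can be illustrated with a minimal Python sketch. It assumes that the bin limits are the deciles of the station rainfall indices, which is one way to obtain an equal number of stations per bin; the function names and the randomly generated station values are hypothetical and do not reproduce the data of this study.

```python
import numpy as np

def rainfall_index(annual_rainfall_mm, mean_dry_period_days):
    """Rainfall index = average annual rainfall (mm) / mean dry period (days)."""
    return (np.asarray(annual_rainfall_mm, dtype=float)
            / np.asarray(mean_dry_period_days, dtype=float))

def equal_count_bins(index, n_bins=10):
    """Assign each station to a quantile-based bin (0 = driest category).

    Assumption: bin limits are quantiles of the index, so that each bin
    holds (about) the same number of stations.
    """
    index = np.asarray(index, dtype=float)
    edges = np.quantile(index, np.linspace(0.0, 1.0, n_bins + 1))
    # Only the interior edges separate bins; the outer two are min and max.
    labels = np.searchsorted(edges[1:-1], index, side="right")
    return labels, edges

# Hypothetical stations (random values, for illustration only).
rng = np.random.default_rng(0)
annual = rng.uniform(50.0, 1500.0, size=100)   # mm per year
dry = rng.uniform(5.0, 60.0, size=100)         # days
ri = rainfall_index(annual, dry)

labels, edges = equal_count_bins(ri, n_bins=10)
counts = np.bincount(labels, minlength=10)     # stations per bin
print(counts)
```

With 100 distinct index values and 10 bins, every bin receives the same number of stations, which is the property the horizontal axis of the boxplots relies on.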

In Figure 8a–h, each bin has three boxplots in blue, green, and red, corresponding to the IMERG-Early, -Late, and -Final products, respectively. According to the CC chart (Figure 8a), for instance, IMERG-Final showed a higher correlation with the rain gauge measurements than both IMERG-Early and -Late, which displayed nearly identical variation in each bin. Also, the first and last bins tend toward a lower correlation than the other categories, although the variation of CC within each bin is rather high. In general, the CC between the rain gauge data and the satellite products at the daily time scale varies between 0 and 0.9 across stations in the country.

As seen in Figure 8b–d, as the rainfall index increases (i.e., for stations in wetter locations), overestimations generally become less frequent (Figure 8b) and underestimations more frequent (Figure 8d) for all products. However, no significant change is observed in the equal index (i.e., negligible difference) across bins (Figure 8c).

Comparing the three charts (Figure 8b–d) shows that, for stations in the first bin (i.e., the driest locations), the IMERG-Final product has a lower frequency of overestimations (over) than the other daily products, accompanied by a higher frequency of underestimations (under) and, to some degree, of negligible differences (equal). For stations in the last bin, the changes in over and under were reversed, while the frequency of negligible differences (equal) again increased for the IMERG-Final product. It can be concluded that the correction process applied in the IMERG-Final product changes the frequency of over- and underestimations differently at different locations in Iran, while it provides an overall decrease in the error for a majority of locations. The latter statement is confirmed by Figure 8f,h: in each bin, the IMERG-Final boxplots mostly shift toward a reduced MAE and a reduced magnitude (absolute value) of rBIAS relative to the boxplots of the two other products.

**Figure 8.** Box plots of the criteria indices for 10 rainfall index bins in blue, green, and red corresponding to IMERG-Early, IMERG-Late, and IMERG-Final daily products, respectively. The horizontal line in the boxes, and the upper and lower bounds of the boxes are the 50th, 75th, and 25th percentiles, respectively. The red plus symbols denote the outlier data and the whiskers (dashed black lines) extend to the most extreme data not considered as outliers.

**Figure 9.** Spatial distribution of rain gauges colored differently for different categories of the rainfall index on the map of digital elevation and average annual rainfall contour lines.

MAE is the average magnitude of the individual errors, so a smaller MAE is favorable. However, it can lead to a misleading interpretation. For example, at a dry location with zero rainfall for more than 90% of the length of the dataset, MAE will not reflect a few major individual errors related to extreme events. On the other hand, rBIAS is the accumulated individual error (overall bias) relative to the accumulated observed rainfall during the period of comparison. Thus, it represents both overall under- and overestimation (according to its negative or positive sign) and provides a bias that is comparable between locations. As a result, using MAE together with rBIAS is essential. While a small magnitude of both MAE and rBIAS indicates a high performance of the satellite products, a combination of a large rBIAS with a low MAE at a location can be interpreted as typically low individual errors. This situation is more likely to appear at drier locations with a higher frequency of smaller rainfall amounts. Also, a low rBIAS needs to be taken into account when the MAE value is large. Figure 10 illustrates these statements using Q-Q plots for a few locations selected from the different categories of the rainfall index.
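The contrast between the two indices can be made concrete with a small Python sketch. It assumes the usual definitions, MAE as the mean absolute daily difference and rBIAS as the summed difference divided by the summed gauge rainfall (expressed here as a fraction rather than a percentage); the two synthetic stations are hypothetical and chosen only to reproduce the two combinations discussed above.

```python
import numpy as np

def mae(sat, gauge):
    """Mean absolute error between satellite and gauge daily rainfall."""
    sat, gauge = np.asarray(sat, dtype=float), np.asarray(gauge, dtype=float)
    return np.mean(np.abs(sat - gauge))

def rbias(sat, gauge):
    """Relative bias: accumulated error / accumulated gauge rainfall (fraction)."""
    sat, gauge = np.asarray(sat, dtype=float), np.asarray(gauge, dtype=float)
    return np.sum(sat - gauge) / np.sum(gauge)

# Hypothetical dry station: 95 dry days out of 100; the satellite doubles
# every rainfall amount on the 5 wet days.
dry_gauge = np.zeros(100)
dry_gauge[:5] = [2.0, 1.0, 3.0, 1.0, 3.0]
dry_sat = 2.0 * dry_gauge

# Hypothetical wet station: 10 mm every day; the satellite alternates
# between +5 mm and -5 mm errors, which cancel in the total.
wet_gauge = np.full(100, 10.0)
wet_sat = np.tile([15.0, 5.0], 50)

print(mae(dry_sat, dry_gauge), rbias(dry_sat, dry_gauge))  # low MAE, large rBIAS
print(mae(wet_sat, wet_gauge), rbias(wet_sat, wet_gauge))  # large MAE, zero rBIAS
```

In this sketch the dry station has an MAE of only 0.1 mm per day but an rBIAS of 1.0 (100% overestimation of the total), whereas the wet station has an MAE of 5 mm per day and an rBIAS of zero, so neither index alone characterizes the performance.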

In theory, a low rBIAS means that the total amount of rainfall observed at a location is accurately estimated by the satellite during the period of comparison. In other words, the sum of the positive individual errors is almost equal to the sum of the absolute values of the negative individual errors, regardless of the magnitude of the individual errors. CC, on the other hand, is an accuracy criterion showing the degree of linear correlation between two datasets; it is not intended as an error index. To discuss how these indices may lead to contradictory situations, Figure 11 shows Q-Q plots for five other locations with different combinations of CC, MAE, and rBIAS. At locations no. (1) and no. (2), deviations from the 45-degree line are smaller than at the other locations, so the satellite product performed better than at location no. (3), and better still than at locations no. (4) and no. (5), in representing the actual daily rainfall distribution. Yet the correlation values for locations no. (1) and no. (2) are substantially different. Likewise, the performance of the IMERG-Early product at locations no. (3) and no. (5) appears completely different, although both showed a high correlation (0.79) and a low MAE (~0.7 mm day<sup>−1</sup>). Therefore, rBIAS could play a more discriminating role than a potentially misleading CC or MAE when comparing the statistical distributions of satellite and gauge datasets.
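Why CC and the Q-Q comparison can disagree is shown in the following minimal Python sketch. The gauge series and the two "satellite" series are hypothetical and deliberately extreme: one is a scaled copy of the gauge data (perfect correlation, wrong totals) and the other is the gauge data in reversed order (identical distribution, negative correlation). The helper names are assumptions made for this illustration.

```python
import numpy as np

def rbias(sat, gauge):
    """Relative bias as a fraction of the accumulated gauge rainfall."""
    return np.sum(sat - gauge) / np.sum(gauge)

def qq_points(sat, gauge, probs=np.linspace(0.0, 1.0, 11)):
    """Matching quantiles of both series: the points drawn in a Q-Q plot."""
    return np.quantile(gauge, probs), np.quantile(sat, probs)

# Hypothetical daily gauge rainfall (mm), sorted only to keep the example simple.
gauge = np.array([0.0, 0.0, 1.0, 2.0, 3.0, 5.0, 8.0, 12.0, 20.0, 30.0])

sat_scaled = 2.0 * gauge      # perfectly correlated, but doubles every amount
sat_reversed = gauge[::-1]    # same values in a different order

for name, sat in [("scaled", sat_scaled), ("reversed", sat_reversed)]:
    cc = np.corrcoef(gauge, sat)[0, 1]
    q_gauge, q_sat = qq_points(sat, gauge)
    # Largest departure of the Q-Q points from the 45-degree line.
    max_dev = np.max(np.abs(q_sat - q_gauge))
    print(name, round(cc, 2), rbias(sat, gauge), max_dev)
```

The scaled series has a CC of 1 but lies far from the 45-degree line, while the reversed series sits exactly on the line with zero rBIAS despite a negative CC. Real stations fall between these extremes, but the sketch shows that a high CC by itself says little about how well the rainfall distribution is reproduced.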

**Figure 10.** Q-Q plots for comparing different combinations of rBIAS and MAE at different locations with variable rainfall indices where lower values of MAE and lower absolute values of rBIAS are favorable (the large circles' center shows the position of the 95th percentile).

**Figure 11.** Q-Q plots comparing different combinations of CC, rBIAS, and MAE at five different locations where very low values of rBIAS show better performance for different combinations of CC and rBIAS (large circles' center shows the position of the 95th percentile).

To evaluate the detection ability of the satellite products, the FAR and POD criteria must be calculated (Figure 8e,g). As seen in Figure 8e, FAR values generally decreased as the rainfall index increased. The FAR values were higher for the majority of locations in the driest category (the first bin) than for wetter locations. For example, a median FAR of about 75% for the first category means that 75% of the rainy days detected by the satellite were not observed by the rain gauge. Also, the overall minimum FAR of around 35%, found mostly at the wetter locations, indicates that at least 35% of the rainfall events detected by the satellite were not recorded by the rain gauges located within the corresponding satellite grids across the country. Setting aside errors due to interruptions of the rain gauge measurements or false detection by the satellite sensor, both of which are possible, the increase of FAR for the drier bins (Figure 8e) suggests that local rainfall events are more likely at drier locations. Under such conditions, a rainfall event that affects only part of a grid may not be observed by a rain gauge located in a dry part of the grid. Conversely, this is less likely at wetter locations, where uniform rainfall over a vast area is common.

According to Figure 8g, POD at different locations in Iran varied between 45% and 95%, most frequently between 60% and 80%. Higher PODs were more frequent at drier locations and less frequent at wetter locations. POD indicates the chance that the satellite detects a rainfall event that is observed by a rain gauge within the satellite grid. By this definition, POD is not related to the spatial variability of rainfall within a grid. Instead, it reflects the sensor's inability to detect rainfall due to temporal variability and the satellite visiting time. The variation of FAR and POD was almost the same for the different IMERG daily products; hence, the correction applied in the IMERG-Final product did not address the detection ability of the sensors. It appears that the IMERG corrections in the Final product mostly target the bias in the satellite observations. There is, however, some consistency among the results of the different criteria. For example, the highest frequency of underestimations (Figure 8d) and the negative rBIAS values (Figure 8h) at the wettest locations (the last bin) may share common causes related to detection problems, as the lowest POD values were observed for a majority of the locations in the last bin (Figure 8g).
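The two detection criteria can be sketched in Python from a simple contingency count. The sketch assumes the standard definitions, POD = hits / (hits + misses) and FAR = false alarms / (hits + false alarms), which match the interpretations given above; the rain/no-rain threshold of 0.1 mm per day and the short daily series are hypothetical.

```python
import numpy as np

def detection_scores(sat, gauge, threshold=0.1):
    """Return (POD, FAR) for paired daily series.

    threshold: hypothetical rain/no-rain limit in mm per day (an assumption,
    not necessarily the value used in the study).
    """
    sat_rain = np.asarray(sat, dtype=float) >= threshold
    gauge_rain = np.asarray(gauge, dtype=float) >= threshold

    hits = np.sum(sat_rain & gauge_rain)            # both report rain
    misses = np.sum(~sat_rain & gauge_rain)         # gauge only
    false_alarms = np.sum(sat_rain & ~gauge_rain)   # satellite only

    pod = hits / (hits + misses)
    far = false_alarms / (hits + false_alarms)
    return pod, far

# Hypothetical 8-day record (mm per day).
gauge = [0.0, 5.0, 0.0, 2.0, 0.0, 0.0, 3.0, 0.0]
sat = [1.0, 4.0, 0.0, 0.0, 2.0, 0.0, 6.0, 3.0]

pod, far = detection_scores(sat, gauge)
print(pod, far)
```

Here the satellite detects two of the three gauge-observed rainy days (POD of about 67%), while three of its five rainy days are not confirmed by the gauge (FAR of 60%). Note that the sketch does not guard against a series with no rainy days, for which both ratios are undefined.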

The boxplots of the IMERG-Monthly product's performance for different categories of location (based on the rainfall index) showed similar trends in the variation of the criteria indices across the categories of the rainfall index factor (Figure 12). The correlation for most of the locations was above 0.7 (Figure 12a). The frequency of overestimations (over) generally decreased with the rainfall index (Figure 12b), similar to what was observed for the daily products, while the frequency of negligible differences (equal) between the IMERG-Monthly product and the rain gauge measurements also decreased as the rainfall index increased (Figure 12c). The frequency of underestimations increased with the rainfall index (Figure 12d). The FAR was much lower than the daily FAR (Figure 12e). However, there are still considerable FAR values (i.e., above 40%) for the first three bins in Figure 12e (drier locations), which is related to local rainfall events in summer months (when there are only a few rainy days). This implies that rainfall events are not uniformly distributed over a given satellite grid, so in some months the rain gauge located in the grid records no rainfall while the satellite sensor does. MAE in Figure 12f increases with the rainfall index, with a gentler slope than the increasing trend observed for the daily products (Figure 8f), which may be due to the smoother nature of the monthly data compared to the more erratic daily rainfall data. The POD for monthly data was close to 100% for almost all locations (Figure 12g), because there is a high chance that both the rain gauge and the satellite recorded at least one rainy day in a given month. For rBIAS, there is overestimation for almost all locations with a rainfall index between 0 and 87, while the satellite underestimated monthly rainfall for a majority of the locations in the 10th bin (the wettest category of locations) in Figure 12h.
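The effect of temporal aggregation on POD can be reproduced with a short Python sketch. It assumes POD = hits / (hits + misses), a hypothetical rain/no-rain threshold of 0.1 mm, and, purely for simplicity, months of exactly 30 days; the 60-day series are invented so that the satellite and the gauge mostly disagree on the timing of rainy days.

```python
import numpy as np

def pod(sat, gauge, threshold=0.1):
    """Probability of detection; threshold (mm) is a hypothetical assumption."""
    sat_rain = np.asarray(sat, dtype=float) >= threshold
    gauge_rain = np.asarray(gauge, dtype=float) >= threshold
    hits = np.sum(sat_rain & gauge_rain)
    misses = np.sum(~sat_rain & gauge_rain)
    return hits / (hits + misses)

days_per_month = 30  # simplification: every "month" has 30 days

# Hypothetical daily rainfall (mm) over two months.
gauge = np.zeros(60)
gauge[[3, 10, 40]] = [4.0, 6.0, 5.0]
sat = np.zeros(60)
sat[[4, 10, 45]] = [3.0, 7.0, 2.0]   # two events shifted in time

daily_pod = pod(sat, gauge)

# Aggregate to monthly totals and evaluate detection again.
gauge_monthly = gauge.reshape(-1, days_per_month).sum(axis=1)
sat_monthly = sat.reshape(-1, days_per_month).sum(axis=1)
monthly_pod = pod(sat_monthly, gauge_monthly)

print(daily_pod, monthly_pod)
```

At the daily scale only one of the three gauge events is matched (POD of about 33%), but once the data are summed by month both series report rain in both months and the POD becomes 100%, consistent with the explanation given for Figure 12g.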

**Figure 12.** Box plots of the criteria indices for the 10 rainfall index bins for the IMERG-Monthly product. The horizontal line in the boxes, and the upper and lower bounds of the boxes are the 50th, 75th, and 25th percentiles, respectively. The red plus symbols denote the outlier data and the whiskers (dashed black lines) extend to the most extreme rainfall data not considered outliers.
