**3. Results**

#### *3.1. Comparison of Surface Albedo between Satellite Products and In Situ Retrievals*

Tower-measured shortwave radiation data from 19 sites were used to estimate both DHRs and BHRs and to evaluate the accuracy of the satellite-derived values. No intercomparison results are presented for the SPO site, due to the lack of higher-resolution satellite data covering this region. In Figures 7 and 8, DHRs and BHRs retrieved from tower albedometers were compared with the CGLS, MODIS and MISR products for the following sites: AU-TUM (evergreen broadleaf), US-FPK (grasslands) and US-BRW (snow and ice). The intercomparison of tower and satellite-derived albedo time series for the other SURFRAD and BSRN sites is provided in Supplementary Figure S1. The satellite albedo products were produced using different time windows: 30 days for CGLS, 16 days for MODIS and near-simultaneous acquisition (~7 min) for MISR. Accordingly, three different time windows (30, 16 and three days) were used in the tower albedo retrieval for the intercomparison with the CGLS, MODIS and MISR products, respectively. A three-day rather than a one-day window was used in the tower albedo retrievals for the comparison with MISR products, because the number of usable measurements acquired within a one-day window was often insufficient to retrieve DHRs or BHRs after data screening.
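The window matching described above can be sketched as follows. This is a minimal illustration only: the function name, the `(date, value)` sample layout and the choice to centre the window on the satellite date are assumptions for the example, not the processing chain used for the retrievals.

```python
from datetime import date, timedelta

# Window lengths in days, as stated in the text. MISR acquisitions are
# near-simultaneous, so the tower data use a three-day window for that case.
WINDOW_DAYS = {"CGLS": 30, "MODIS": 16, "MISR": 3}


def tower_samples_in_window(samples, product, centre_day):
    """Select tower samples falling inside the window matched to `product`.

    samples    : list of (date, value) tuples (hypothetical layout)
    product    : "CGLS", "MODIS" or "MISR"
    centre_day : date on which the window is centred (an assumption; the
                 actual products may define their windows differently)
    """
    n_days = WINDOW_DAYS[product]
    start = centre_day - timedelta(days=n_days // 2)
    end = start + timedelta(days=n_days)  # exclusive upper bound
    return [(d, v) for d, v in samples if start <= d < end]


# Usage: 60 days of synthetic daily samples.
daily = [(date(2014, 6, 1) + timedelta(days=i), 0.2) for i in range(60)]
selected = tower_samples_in_window(daily, "MODIS", date(2014, 6, 30))
```

A retrieval would then proceed only if enough screened samples remain in the selected window, which is the reason given above for preferring three days over one day for MISR.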

**Figure 7.** CGLS (column 1), MODIS (column 2) and MISR (column 3) DHR products compared with tower-derived DHRs at the Tumbarumba, Fort Peck and Barrow sites.

**Figure 8.** CGLS (column 1), MODIS (column 2) and MISR (column 3) BHR products compared with tower-derived BHRs at the Tumbarumba, Fort Peck and Barrow sites.

At some sites (e.g., Tumbarumba and Fort Peck), the DHR intercomparison showed a good match in terms of absolute values and seasonal variations. The DHRs at the Barrow site showed a better match during the snow-free season than during the snow-covered season. Among the three satellite DHR products, MODIS showed the best agreement with the in situ measurements. At the Tumbarumba site, both CGLS and MISR retrievals systematically overestimated the DHR, while the MODIS retrievals agreed fairly well with in situ measurements in all time periods. At the Fort Peck site, the MISR DHRs were comparable with the MODIS DHRs, whereas the CGLS retrievals were still overestimated during the snow-free season. It is interesting to note that the MODIS retrievals performed better in capturing the albedo of snow-covered points. At the Barrow site, the MISR retrievals were closer to the in situ measurements during the snow-free season than the CGLS and MODIS retrievals.

The intercomparison of BHR measurements at the three sites discussed above is displayed in Figure 8, and results for the other sites can be found in Supplementary Figure S2. It should be noted that the BHR products provided by MISR were a very close approximation to the blue-sky albedo, rather than the white-sky albedo. Therefore, for this specific comparison with MISR retrievals, the tower albedos were retrieved directly from the ratio between the upwelling and downwelling radiation. The tower data used for the MISR BHR comparison were screened over a ±1-h window around local solar noon on a single day. Surface albedo varies with the solar zenith angle; see [33] for a more detailed explanation of why solar noon is employed.
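The blue-sky tower albedo used for the MISR comparison can be illustrated with a short sketch. The record layout, the function name and the use of a ratio of summed fluxes (rather than a mean of per-sample ratios) are assumptions made for this example; the text only states that the albedo is the ratio of upwelling to downwelling radiation within ±1 h of local solar noon.

```python
def blue_sky_albedo(records, solar_noon_hour, half_window_hours=1.0):
    """Estimate blue-sky albedo from tower fluxes near local solar noon.

    records         : iterable of (local_hour, downwelling, upwelling),
                      fluxes in W m^-2 (hypothetical layout)
    solar_noon_hour : local solar noon as a decimal hour
    Returns None when no valid sample falls inside the window.
    """
    up_sum = 0.0
    down_sum = 0.0
    for hour, down, up in records:
        inside = abs(hour - solar_noon_hour) <= half_window_hours
        if inside and down > 0.0:  # skip samples with no incoming radiation
            up_sum += up
            down_sum += down
    if down_sum == 0.0:
        return None
    # Assumption: ratio of summed fluxes over the window.
    return up_sum / down_sum


# Usage: the 14.5 h sample lies outside the window and is ignored.
example = [(11.5, 800.0, 160.0), (12.0, 1000.0, 200.0), (14.5, 600.0, 300.0)]
albedo = blue_sky_albedo(example, solar_noon_hour=12.0)
```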

Generally speaking, the DHR retrievals showed better agreement between satellite and in situ measurements than the BHR retrievals. In our method for BHR retrievals, the illumination was assumed to be uniform from all angles when the diffuse ratio was larger than *β<sup>B</sup>*. However, not all the tower data screened for BHR retrievals met this condition, which is a source of error that may reduce the accuracy of the BHR retrievals. Similarly, the MODIS DHR retrievals showed the best agreement with tower measurements, followed by the MISR retrievals and then the CGLS retrievals.

The albedo values derived from satellite products and tower retrievals are summarised in the 2D scatterplots shown in Figure 9 for DHRs at the three selected sites. At the Tumbarumba site, all the satellite products were well correlated with the in situ retrievals, with a bias of less than 0.025. The Fort Peck site also showed a good correlation, except for some points that were incorrectly identified as snow in CGLS and MODIS. The MISR products performed better at the Barrow site during the snow-free season, while CGLS and MODIS were better at detecting snow, although the snow-covered DHRs were often underestimated.
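As an illustration of how matched satellite and tower values might be summarised against the 0.025 threshold, the following sketch computes the mean bias and the fraction of pairs lying within ±0.025 of the 1:1 line. The function name and the definition of bias as the mean of (satellite − tower) are assumptions for this example.

```python
def bias_and_within_offset(satellite, tower, offset=0.025):
    """Return (mean bias, fraction of pairs within +/- offset of the 1:1 line).

    satellite, tower : equal-length sequences of matched albedo values
    Bias is taken as mean(satellite - tower), an assumed convention.
    """
    if len(satellite) != len(tower) or len(satellite) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [s - t for s, t in zip(satellite, tower)]
    bias = sum(diffs) / len(diffs)
    within = sum(1 for d in diffs if abs(d) <= offset) / len(diffs)
    return bias, within


# Usage with synthetic values: the third pair lies well off the 1:1 line.
bias, within = bias_and_within_offset([0.12, 0.15, 0.30], [0.10, 0.16, 0.20])
```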

**Figure 9.** Scatterplots for DHRs from CGLS, MODIS and MISR. The data are summarised from all the results from 2012-01-01 to 2016-12-31 for the three selected sites. The blue, green and red lines indicate CGLS, MODIS and MISR DHR products, respectively. The central solid lines are 1:1 lines (perfect agreement), and the outer dashed lines are offset from them by 0.025.

The albedo values derived from satellite products are plotted against the corresponding tower retrievals in Figure 10 for BHRs. Again, the satellite products and tower retrievals agreed better for DHR values than for BHR values. Large biases occurred during the snow-covered season, as can be observed at the Fort Peck and Barrow sites.

**Figure 10.** Scatterplots for BHRs from CGLS, MODIS and MISR. The data are summarised from all the results from 2012-01-01 to 2016-12-31. The blue, green and red lines indicate CGLS, MODIS and MISR BHR products, respectively. The central solid lines are 1:1 lines, and the outer dashed lines are offset from them by 0.025.

#### *3.2. Comparison of Surface Albedo between Coarse-Resolution Satellite Products and Upscaled Tower Values*

DHRs and BHRs retrieved from tower data were upscaled to 1-km resolution and compared with CGLS products to assess the performance of this upscaling strategy. Coincident Landsat-8 30-m albedo data, which were used as a bridge between the small-footprint tower measurements and the coarse-resolution measurements, were produced using the method introduced in Section 2.4.1. Scatter plots of the upscaled albedo against the CGLS 1-km albedo are displayed in Figure 11 for DHR comparisons and in Figure 12 for BHR comparisons. Comparisons for other sites are given in Supplementary Figure S5. MODIS BRDF climatology data were used as input in the upscaling process; therefore, the upscaled values were not directly compared with the MODIS albedo products. The sparsity of MISR albedo products severely increased the difficulty of finding coincident cloudless Landsat-8 data.

**Figure 11.** Density plot of DHR upscaled from tower FoV to CGLS resolution over a 20 × 20 CGLS pixel region.

**Figure 12.** Density plot of BHR upscaled from tower FoV to CGLS resolution over a 20 × 20 CGLS pixel region.

At the Fort Peck site, the upscaled DHRs were well correlated with the CGLS DHRs, with an R-squared (*R*<sup>2</sup>) of 0.685 and a root-mean-square error (RMSE) of 0.006. The Barrow site had fewer pixels upscaled to coarse resolution than the other sites, and the upscaled DHRs showed larger differences from CGLS for the pixels with smaller albedo values, due to melt ponds and tundra in this region. At the Tumbarumba site, most pixels were clustered around the 1:1 line, because of the good agreement in the "point-to-pixel" time series analysis. However, the upscaled DHRs had a relatively small *R*<sup>2</sup> (0.233) when compared with CGLS DHRs, which suggests that the upscaling coefficient was not suitable for upscaling to a region covering 20 × 20 pixels around the tower. The BHR upscaling results were close to the DHR upscaling results in terms of *R*<sup>2</sup> and RMSE.
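The two statistics quoted above can be computed as in the sketch below. It assumes that *R*<sup>2</sup> is the squared Pearson correlation coefficient between the upscaled and CGLS pixel values; the text does not state which definition was used, and the function name is hypothetical.

```python
import math


def r_squared_and_rmse(upscaled, reference):
    """Return (R^2, RMSE) for two equal-length sequences of pixel albedos.

    R^2 is computed as the squared Pearson correlation (an assumption);
    RMSE is the root-mean-square of the pixel-wise differences.
    """
    n = len(upscaled)
    if n == 0 or n != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_x = sum(upscaled) / n
    mean_y = sum(reference) / n
    sxx = sum((x - mean_x) ** 2 for x in upscaled)
    syy = sum((y - mean_y) ** 2 for y in reference)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(upscaled, reference))
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("R^2 is undefined when either input is constant")
    r2 = (sxy * sxy) / (sxx * syy)
    rmse = math.sqrt(sum((x - y) ** 2 for x, y in zip(upscaled, reference)) / n)
    return r2, rmse


# Usage with synthetic values offset by a constant 0.01.
r2, rmse = r_squared_and_rmse([0.10, 0.20, 0.30], [0.11, 0.21, 0.31])
```

Under this definition, pixels that cluster tightly around the 1:1 line can still yield a small *R*<sup>2</sup> when the albedo values span only a narrow range, so a low RMSE and a low *R*<sup>2</sup> are not necessarily contradictory.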
