**4. Discussion**

The DHR and BHR values retrieved from the tower albedometer data were first compared directly with satellite values derived from the pixel near the tower location. In general, the homogeneous sites showed better agreement between tower and satellite retrievals than the heterogeneous sites. Among the homogeneous sites, all except US-BAO, whose tower data appear to be anomalous from 2016 onwards, showed good agreement with the satellite retrievals during the snow-free season (AU-TUM, US-TBL, US-FPK, US-DRA, US-BRW). Among the heterogeneous sites, US-BON, US-GCM and US-PSU had large differences between the tower and satellite retrievals, while US-SXF showed good agreement.

MODIS products showed the best agreement with tower retrievals, followed by MISR products and then CGLS products. The MODIS products appeared to perform well in picking up snow-covered points, as can be seen from the time-series analysis at the US-FPK, US-SXF and US-TBL sites. The MISR products were comparable with MODIS products at most of the sites studied in this work, and better than MODIS products at sites such as US-BRW during the snow-free season. However, the MISR products were produced using a near-simultaneous retrieval, and were compared with tower DHRs generated over a three-day window and tower BHRs generated in one day. In this case, the agreement between the MISR and tower retrievals suggests that the MISR products were closer to the actual surface albedo values.

The albedo values retrieved from both homogeneous and heterogeneous sites were upscaled to coarse resolutions through the use of Landsat-8 spectral reflectance and MODIS BRDF climatology data. There was no obvious difference in the agreement between upscaled albedos and coarse-resolution albedos over homogeneous and heterogeneous sites. However, relatively better correlations could still be found at homogeneous sites, such as US-DRA with an *R*<sup>2</sup> of 0.81 for DHR comparisons. Several sources of error can affect the accuracy of the upscaled albedo values. Firstly, the accuracy of the high-resolution albedo values plays an important role in the upscaling process. Secondly, the upscaling coefficient calculated from the tower FoV is accurate for a local area because of the high coherence. If it is applied to a larger area, errors are more likely to be introduced at pixels further away from the tower. This can explain why at some homogeneous sites (e.g., AU-TUM with an *R*<sup>2</sup> of 0.233 for DHRs) the upscaled albedos appeared to correlate poorly with the coarse-resolution albedos. However, for heterogeneous sites such as US-GCM, an *R*<sup>2</sup> value larger than 0.5 could be found between the upscaled albedos and coarse-resolution albedos. This suggests that an optimal sample size for the upscaling can be determined, and that this upscaling method can be applied to both homogeneous and heterogeneous surfaces.
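For readers who wish to reproduce this kind of comparison, the following is a minimal sketch of how an *R*<sup>2</sup> between upscaled and coarse-resolution albedos might be computed. It assumes that *R*<sup>2</sup> denotes the squared Pearson correlation coefficient and that missing retrievals are stored as NaN; the function name `r_squared` and the sample values are hypothetical and are not taken from this study.

```python
import math


def r_squared(upscaled, coarse):
    """Squared Pearson correlation between two paired albedo series.

    Assumption: pairs in which either value is NaN or infinite
    (e.g. a missing satellite retrieval) are skipped.
    """
    pairs = [(x, y) for x, y in zip(upscaled, coarse)
             if math.isfinite(x) and math.isfinite(y)]
    n = len(pairs)
    if n < 3:
        raise ValueError("need at least three valid pairs")

    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n

    # Sums of squared deviations and of cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)

    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical albedo values, for illustration only.
upscaled_dhr = [0.12, 0.15, 0.14, float("nan"), 0.18, 0.21]
coarse_dhr = [0.11, 0.16, 0.13, 0.17, 0.19, 0.20]
print(round(r_squared(upscaled_dhr, coarse_dhr), 3))
```

Because the squared correlation is insensitive to a constant bias or a scale factor between the two series, a high value here indicates coherent variation and not necessarily close absolute agreement.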
