#### *4.2. Evaluation against MOD11*

We compared MOD11\_L2 with our LST retrievals. Before the comparison, the MOD11\_L2 data were regridded to a 0.05° × 0.05° latitude/longitude grid.
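The regridding method is not specified in the text. A minimal sketch, assuming simple averaging of all valid swath pixels that fall in each 0.05° cell, could look as follows; the function name `regrid_to_005` and its arguments are hypothetical and are not taken from the original processing chain.

```python
import numpy as np

def regrid_to_005(lat, lon, lst, lat_min, lat_max, lon_min, lon_max, res=0.05):
    """Average swath pixels into res x res degree cells (NaN where a cell is empty).

    Assumption: plain bin averaging; the paper does not state the method used.
    """
    lat = np.asarray(lat, dtype=float).ravel()
    lon = np.asarray(lon, dtype=float).ravel()
    lst = np.asarray(lst, dtype=float).ravel()

    # Drop pixels with missing coordinates or missing LST
    finite = np.isfinite(lat) & np.isfinite(lon) & np.isfinite(lst)
    lat, lon, lst = lat[finite], lon[finite], lst[finite]

    n_rows = int(round((lat_max - lat_min) / res))
    n_cols = int(round((lon_max - lon_min) / res))

    # Index of the grid cell each pixel falls into
    rows = np.floor((lat - lat_min) / res).astype(int)
    cols = np.floor((lon - lon_min) / res).astype(int)

    # Keep only pixels inside the target domain
    inside = (rows >= 0) & (rows < n_rows) & (cols >= 0) & (cols < n_cols)
    flat = rows[inside] * n_cols + cols[inside]

    # Sum and count per cell, then divide where at least one pixel was found
    total = np.bincount(flat, weights=lst[inside], minlength=n_rows * n_cols)
    count = np.bincount(flat, minlength=n_rows * n_cols)

    grid = np.full(n_rows * n_cols, np.nan)
    filled = count > 0
    grid[filled] = total[filled] / count[filled]
    return grid.reshape(n_rows, n_cols)
```

Row 0 corresponds to the southernmost band of cells and column 0 to the westernmost. Other choices, such as nearest-neighbour or area-weighted resampling, would fit the same interface.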

Figure 4 shows an example comparison between the GOES\_RTTOV LST and the MOD11\_L2 LST at 17:15 UTC on 11 June 2004.

**Figure 4.** Example case: GOES\_RTTOV\_LST, MOD11\_L2 LST, their difference, and the distribution of the difference at 17:15 UTC on 11 June 2004.

We compared 25 match-up cases from 2004 in which the two products overlap and both scans start 15 min past the same hour. In most cases, the overlap area contains more than 40,000 points. Figure 5 presents the correlation coefficient (*corr*), mean bias (*bias*), standard deviation (*std*) and root mean square error (*rmse*) between the two products. In most cases the two products are highly correlated; the coefficient is below 0.8 in only one case, and the average *corr* over all cases is 0.91. More than 50% of the cases have a mean bias below 2 K, and the average bias is 1.7 K. The average *std* and *rmse* are 2.7 K and 3.3 K, respectively.
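The four statistics reported for each match-up case can be computed along the following lines. This is a minimal sketch rather than the code used for the paper: the function name `matchup_stats`, the sign convention (satellite minus reference) and the use of the population standard deviation of the differences are assumptions.

```python
import numpy as np

def matchup_stats(lst_sat, lst_ref):
    """Return n, corr, bias, std and rmse for one match-up case.

    Assumptions: bias = mean(sat - ref); std is the population standard
    deviation of the differences; pairs with a missing value are skipped.
    """
    a = np.asarray(lst_sat, dtype=float).ravel()
    b = np.asarray(lst_ref, dtype=float).ravel()

    # Use only grid points where both products are valid
    ok = np.isfinite(a) & np.isfinite(b)
    a, b = a[ok], b[ok]
    diff = a - b

    return {
        "n": int(diff.size),
        "corr": float(np.corrcoef(a, b)[0, 1]),
        "bias": float(diff.mean()),
        "std": float(diff.std()),
        "rmse": float(np.sqrt(np.mean(diff ** 2))),
    }
```

With these definitions, *rmse*² = *bias*² + *std*² holds exactly within a single case; it need not hold for values averaged over several cases.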

#### *4.3. Evaluation against ARM SGP Site at Instantaneous Time Scale*

The IRT data are available from two levels of a tower: one instrument was located 25 m and one 10 m above the ground. The probability distribution of the differences between GOES\_RTTOV\_LST and the ARM IRT, using all available retrievals, is shown in Figure 6. Most of the differences fall within 1 *std*; fewer than 20% of the cases lie beyond it. The correlation between the two data sets is high at both levels (greater than 0.80 in all cases). The mean differences are smaller during daytime than at nighttime at both levels, and the same applies to *std* and *rmse*. Numerical values for the cases in Figure 6 are given in Table 3. Differences between the two instrument heights can be caused by differences in the instruments' fields of view and, consequently, different shading effects.

**Figure 5.** Evaluation of 25 instantaneous match-ups of LST retrievals from GOES observations against MOD11\_L2. The *x*-axis gives the case number; the *y*-axis gives the correlation (in blue) and the other variables in K.

**Figure 6.** Probability distribution of differences between the GOES\_RTTOV and ARM SGP C1 LST. Red dotted line: 1 *std*; blue dotted line: 3 *std*.

**Table 3.** Statistical results for cases illustrated in Figure 6.


Figure 7 shows the results of the evaluation against tower observations for 2004–2009: (a) daytime, 25 m level; (b) same as (a) but for the 10 m level (2006 is excluded because that year requires additional quality control); (c) same as (a) but for nighttime; (d) same as (b) but for nighttime. Only values within 1 *std* are used. The satellite product underestimates the ARM IRT observations, yet the difference between them is less than 1%, and the *std* and *rmse* are also around 1%. The performance is better during daytime than at nighttime, most likely due to better cloud detection during the daytime, when observations from the visible channel are available.
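The text does not specify how the 1 *std* screening is implemented. One plausible reading, sketched here under that assumption, keeps the match-ups whose difference lies within one standard deviation of the mean difference; the function name `within_one_std` is hypothetical.

```python
import numpy as np

def within_one_std(lst_sat, lst_ref, n_std=1.0):
    """Boolean mask of match-ups whose difference is within n_std * std of the mean.

    Assumption: the interval is centred on the mean difference, not on zero.
    Pairs with a missing value are excluded (mask is False there).
    """
    diff = np.asarray(lst_sat, dtype=float) - np.asarray(lst_ref, dtype=float)
    centre = np.nanmean(diff)   # mean difference over valid pairs
    spread = np.nanstd(diff)    # population std of the differences
    return np.abs(diff - centre) <= n_std * spread
```

The mask can then be applied to both series before computing the statistics, for example `sat[mask]` and `ref[mask]` for NumPy arrays `sat` and `ref`.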

**Figure 7.** Evaluation of RTTOV-based estimates from GOES-E at the SGP C1 ARM test site using observations at hourly intervals during 2004–2009, for daytime and nighttime, from the 25 m and 10 m tower levels. Only data with differences of less than one *std* were used.

#### *4.4. Evaluation against SURFRAD/BSRN*

The SURFRAD/BSRN network observes upwelling (*F*↑) and downwelling (*F*↓) radiative fluxes, which are converted to temperature using the following relation:

$$F\uparrow = \varepsilon_{\rm IR}\,\sigma T_S^4 + (1 - \varepsilon_{\rm IR})\,F\downarrow \tag{10}$$

where *T<sub>S</sub>* is the surface temperature, ε<sub>*IR*</sub> is the surface broadband emissivity assigned by surface type, and σ is the Stefan-Boltzmann constant, equal to 5.669 × 10<sup>−8</sup> J m<sup>−2</sup> s<sup>−1</sup> K<sup>−4</sup>. Solving for *T<sub>S</sub>* gives

$$T_S = \left[\frac{F\uparrow - (1 - \varepsilon_{\rm IR})\,F\downarrow}{\varepsilon_{\rm IR}\,\sigma}\right]^{1/4} \tag{11}$$
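As an illustration, Equations (10) and (11) can be evaluated as follows. This is a sketch under stated assumptions, not the processing code used for the paper: the function names `flux_to_lst` and `lst_to_flux_up` are hypothetical, fluxes are taken to be in W m<sup>−2</sup>, and the default constant is the value of σ quoted above.

```python
import numpy as np

SIGMA = 5.669e-8  # Stefan-Boltzmann constant as quoted in the text, W m^-2 K^-4

def flux_to_lst(f_up, f_down, emissivity, sigma=SIGMA):
    """Equation (11): surface temperature (K) from upwelling and downwelling fluxes."""
    f_up = np.asarray(f_up, dtype=float)
    f_down = np.asarray(f_down, dtype=float)
    # Remove the reflected part of the downwelling flux from the upwelling flux
    emitted = f_up - (1.0 - emissivity) * f_down
    return (emitted / (emissivity * sigma)) ** 0.25

def lst_to_flux_up(t_s, f_down, emissivity, sigma=SIGMA):
    """Equation (10): upwelling flux from surface temperature and downwelling flux."""
    t_s = np.asarray(t_s, dtype=float)
    f_down = np.asarray(f_down, dtype=float)
    return emissivity * sigma * t_s ** 4 + (1.0 - emissivity) * f_down
```

Applying `lst_to_flux_up` and then `flux_to_lst` with the same emissivity and downwelling flux recovers the input temperature, which is a convenient consistency check on the two equations. Calling `flux_to_lst` with two different emissivity values gives the sensitivity of the converted temperature to the emissivity assumption discussed below.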

The same approach has been applied by others. The main issue in the conversion is the value of the emissivity. Heidinger et al. [34] assume a broadband longwave emissivity of 0.97. They indicate that a 0.1 error in emissivity equates to an error in the SURFRAD LST not exceeding 0.25 K. Yu et al. [59] also used the SURFRAD data to evaluate their LST retrievals, following the same procedure. In their approach, the emissivity is estimated by mapping the surface type classification of Snyder et al. [60] to emissivity (an approach that was popular for some time when direct information on emissivity was not available). They assume that the mean broadband emissivity of the satellite sensor is applicable. We use the CAMEL emissivity, which is derived spectrally, integrated over the window spectral interval of the satellite used, and variable at a monthly time scale; namely, for each month and each location the spectral values are integrated to give a new broadband value. This is the most advanced use of surface emissivity in such retrievals.

Scatter plots of the instantaneous GOES\_RTTOV LST against observations at the SURFRAD sites, for both daytime and nighttime, are shown in Figure 8. The satellite estimates and the ground observations are very highly correlated, with correlations mostly above 0.98. For daytime (left panel of Figure 8), the differences ranged from 0.4 K (0.2%) to 1.16 K (0.4%), while the *std* ranged from 1.88 K (0.6%) to 2.53 K (0.9%). For nighttime (right panel of Figure 8), the results are comparable to those for daytime.

**Figure 8.** Evaluation of instantaneous GOES-based LST estimates at hourly intervals against four SURFRAD/BSRN stations, independently for daytime (left panel) and nighttime (right panel), using observations from 2004–2009.
