#### *4.5. Evaluation against the Oklahoma MESONET Sites*

The distribution of sites used in the current evaluation is illustrated in Figure 3a. The evaluations are carried out against all stations for both daytime and nighttime in January and July during 2004–2007. Results are presented in Figure 9, where outliers outside 1 *std* were removed. Red color designates results that have bias smaller than 1 *std*, ranging from 1.74 to 2.47 K. The retrieved data are highly correlated with the in-situ data (correlation greater than 0.9). Results for daytime and nighttime in January and July are comparable (unlike the results of Figure 6, where all outliers were retained).

**Figure 9.** Evaluation of instantaneous GOES-based LST estimates at hourly intervals against the MESONET stations, independently for daytime (left panel) and nighttime (right panel), using observations for the years 2004–2007.
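The screening and statistics described for Figure 9 can be sketched as follows. This is a minimal illustration, not the processing code used for the paper: the function names are hypothetical, and we assume that an outlier is a satellite/ground pair whose difference deviates from the mean difference by more than one standard deviation of the differences.

```python
import math
from statistics import fmean, pstdev


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def screened_stats(satellite_k, ground_k, n_std=1.0):
    """Bias, std, RMSE and correlation after screening the differences.

    Assumption: a pair is an outlier when its satellite-minus-ground
    difference deviates from the mean difference by more than n_std
    standard deviations of the differences.
    """
    diffs = [s - g for s, g in zip(satellite_k, ground_k)]
    mean_d, std_d = fmean(diffs), pstdev(diffs)
    keep = [i for i, d in enumerate(diffs) if abs(d - mean_d) <= n_std * std_d]
    sat = [satellite_k[i] for i in keep]
    gnd = [ground_k[i] for i in keep]
    kept = [diffs[i] for i in keep]
    return {
        "n": len(kept),
        "bias": fmean(kept),
        "std": pstdev(kept),
        "rmse": math.sqrt(fmean([d * d for d in kept])),
        "corr": pearson(sat, gnd),
    }


# Synthetic values in K; the last pair mimics a cloud-contaminated retrieval.
ground = [280.0, 285.0, 290.0, 295.0, 300.0, 305.0]
satellite = [281.0, 286.0, 291.0, 296.0, 301.0, 285.0]
stats = screened_stats(satellite, ground)
print(stats)
```

In this synthetic case the contaminated pair is rejected and the remaining five pairs give a bias of 1 K. Other screening definitions (for example, relative to the ground value) would change which pairs are kept.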

#### *4.6. Applications*

• Seasonal distribution of LST at monthly scale

Since many users of LST data are interested in monthly mean values, we have conducted a comprehensive comparison at that scale. This was possible due to the availability of both ground observations and satellite estimates for a period of six years.

The evaluation was expanded to include ground observations at the USA BSRN/SURFRAD sites, and we also used information from MOD11C3, Version 006, at 0.05° (https://lpdaac.usgs.gov/dataset\_discovery/modis/modis\_products\_table/mod11c3\_v006). The derived statistics, which include the mean, standard deviation, maximum/minimum, and median values for each month, are shown in Figure 10. The GOES\_RTTOV LST clearly has a distribution pattern very close to that of SURFRAD/BSRN and is able to describe the annual variability of the LSTs at the SURFRAD/BSRN sites. At DRA, the MOD11C3v6 LST is higher than SURFRAD/BSRN for most seasons; its annual mean LST over all study years is 306.6 K, which is 3.1 K higher than the SURFRAD/BSRN value, while the GOES\_RTTOV estimates are much closer to the site value, with an annual mean LST of 301.6 K. For BON, both the MOD11C3v6 and GOES\_RTTOV estimates are close to the site values, except in April, May and June. The annual mean LSTs of SURFRAD/BON, GOES\_RTTOV and MOD11 are 287.9 K, 288.0 K and 289.8 K, respectively. For GWN, GOES\_RTTOV gives a relatively lower annual mean LST of 293.6 K; the site value is 295.1 K and the MOD11 value is 295.7 K. The performance of the satellite estimates at FPK is similar to that at DRA: the MOD11 annual mean LST is 288.4 K, the GOES\_RTTOV value is 283.8 K, and the site value is 285.2 K.
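The annual means quoted above can be turned into product-minus-site differences with a few lines of arithmetic. The sketch below only restates the numbers given in the text; the DRA site value is not quoted directly and is inferred here from the statement that MOD11C3v6 is 3.1 K higher than SURFRAD/BSRN. The dictionary layout and function name are illustrative.

```python
# Annual mean LST in K, as quoted in the text.
# Assumption: the DRA site value is inferred as 306.6 K - 3.1 K.
ANNUAL_MEAN_K = {
    "DRA": {"site": 306.6 - 3.1, "goes_rttov": 301.6, "mod11": 306.6},
    "BON": {"site": 287.9, "goes_rttov": 288.0, "mod11": 289.8},
    "GWN": {"site": 295.1, "goes_rttov": 293.6, "mod11": 295.7},
    "FPK": {"site": 285.2, "goes_rttov": 283.8, "mod11": 288.4},
}


def annual_bias(table):
    """Product-minus-site annual mean difference (K), rounded to 0.1 K."""
    return {
        station: {
            product: round(values[product] - values["site"], 1)
            for product in ("goes_rttov", "mod11")
        }
        for station, values in table.items()
    }


bias = annual_bias(ANNUAL_MEAN_K)
for station, b in bias.items():
    print(f"{station}: GOES_RTTOV {b['goes_rttov']:+.1f} K, MOD11 {b['mod11']:+.1f} K")
```

With these inputs, the GOES\_RTTOV annual means differ from the site values by about −1.9 to +0.1 K, and the MOD11 annual means by about +0.6 to +3.2 K.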

**Figure 10.** Daytime LST distribution of GOES\_RTTOV and SURFRAD/BSRN sites over 2004–2009 for each month, and their monthly mean values compared with MOD11C3v6. Top/bottom of *dashed line*: max/min LST; *Solid "-"*: median LST; *Solid box*: quartiles of LST. Stars are monthly mean LST of SURFRAD/BSRN sites (orange), GOES\_RTTOV (blue) and MOD11C3v6 (red).
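The per-month statistics summarized in Figure 10 (mean, standard deviation, extremes, median and quartiles) can be computed as in the following minimal sketch. The input layout, a list of (month, LST) pairs, and the function name are assumptions made for illustration, and the quartile convention (Python's "inclusive" method) is one of several in common use; the paper does not state which one was applied.

```python
import statistics
from collections import defaultdict


def monthly_summary(samples):
    """Summarize LST by calendar month.

    samples: iterable of (month 1-12, LST in K); each month needs
    at least two values so that std and quartiles are defined.
    """
    by_month = defaultdict(list)
    for month, lst_k in samples:
        by_month[month].append(lst_k)

    summary = {}
    for month, values in sorted(by_month.items()):
        q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
        summary[month] = {
            "mean": statistics.fmean(values),
            "std": statistics.stdev(values),
            "min": min(values),
            "q1": q1,
            "median": median,
            "q3": q3,
            "max": max(values),
        }
    return summary


# Synthetic daytime LST samples (K) for January and July.
samples = [(1, 270.0), (1, 272.0), (1, 274.0), (1, 276.0), (1, 278.0),
           (7, 300.0), (7, 310.0)]
summary = monthly_summary(samples)
print(summary[1])
```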

• A six-year climatology of LST over the US

Six-year (2004–2009) monthly means of LST at 0.05° spatial resolution for January and July at UTC 06:15 and UTC 18:15 are shown in Figure 11 to illustrate the product.

**Figure 11.** Monthly mean LST at 0.05° spatial resolution averaged over 6 years (2004–2009) for January and July; (**a**) UTC18:15, July; (**b**) UTC06:15, July; (**c**) UTC18:15, January; (**d**) UTC06:15, January.

As shown, for July, the differences in surface temperature between these two hours (which are close to the daily maximum and minimum) are large. During the daytime, the western part of the US is dominated by clear-sky conditions (the 100°W meridian is known to separate the humid and dry parts of the US). During the nighttime, the clear conditions contribute to cooling by emitted LW radiation, especially over high elevations. Also noticeable is the pronounced latitudinal variability in the LST during January, dominated by the solar zenith angle dependence of heating by SW radiation. The high spatial and temporal resolution of this product makes it useful for addressing hydrological issues such as modeling of evapotranspiration, snow-melt, or soil moisture estimation (utilizing morning heating rates) [61].
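A multi-year mean field of the kind shown in Figure 11, and the difference between the 18:15 and 06:15 UTC fields, can be sketched as below. This is an illustration under stated assumptions: each monthly field is a small nested list standing in for the 0.05° grid, missing (for example, persistently cloudy) pixels are marked with NaN, and the function names are hypothetical.

```python
import math


def multi_year_mean(fields):
    """Pixel-wise mean of equally shaped 2-D grids, skipping NaN pixels."""
    n_rows, n_cols = len(fields[0]), len(fields[0][0])
    mean_field = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            valid = [f[r][c] for f in fields if not math.isnan(f[r][c])]
            row.append(sum(valid) / len(valid) if valid else math.nan)
        mean_field.append(row)
    return mean_field


def field_difference(day, night):
    """Pixel-wise day-minus-night difference of two grids."""
    return [[d - n for d, n in zip(d_row, n_row)]
            for d_row, n_row in zip(day, night)]


# Two synthetic July fields per hour (one row, two pixels), in K.
july_1815 = [[[310.0, math.nan]], [[312.0, 300.0]]]
july_0615 = [[[290.0, 285.0]], [[292.0, 287.0]]]

day_mean = multi_year_mean(july_1815)
night_mean = multi_year_mean(july_0615)
day_night = field_difference(day_mean, night_mean)
print(day_night)
```

For full-resolution grids an array library would normally replace the nested loops; plain lists are used here only to keep the sketch self-contained.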

In Figure 12 we depict the diurnal variation of LST as observed at four SURFRAD/BSRN stations and from GOES-12. Notable is the large amplitude at the dry site of Desert Rock (DRA) (characterized as desert, gravel, flat, rural) compared with the more vegetated regions at the other sites (BON is grass, flat, rural; FPK is grass, flat; GWN is grass, hilly, rural). The effect of latitude is also evident: the amplitude at GWN, which is at ~34°N, is much smaller than the amplitudes at the higher-latitude stations (BON at ~40°N and FPK at ~48°N). Of interest are the differences between satellite estimates and ground observations, which are more noticeable at DRA and FPK than at the other sites. A possible explanation for DRA is the lower homogeneity of the site compared with the others. FPK is at a higher elevation than BON and GWN and also at a higher latitude, so the cooling of the ground at the observational site may not represent the grid domain. Additional investigation is needed to better understand the behavior at these four sites during the earlier part of the day. The full agreement between the satellite and ground observations from about noon to late afternoon may be due to more even heating of the ground than at the earlier hours of the day, when the higher moisture content can differentially affect the emissivity. While ground observations are very sparse, the findings shown in Figure 12 indicate that satellites alone can be used to characterize the diurnal cycle over the domain of the GOES satellites (a comprehensive analysis over the entire US is needed). This information is of considerable interest since most satellite-based estimates of LST use polar orbiters, which are unable to depict the true diurnal cycle.

**Figure 12.** A six-year average (2004–2009) of the hourly LST at four SURFRAD/BSRN stations as observed (solid line) and as derived from the GOES observations (broken line).
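The mean diurnal cycle and its amplitude, as compared in Figure 12, can be derived from hourly samples along the following lines. This is a minimal sketch: the (hour, LST) input layout and the function names are assumptions, and the amplitude is taken simply as the difference between the largest and smallest hourly means.

```python
from collections import defaultdict
from statistics import fmean


def mean_diurnal_cycle(samples):
    """Average LST (K) for each hour of day from (hour, LST) pairs."""
    by_hour = defaultdict(list)
    for hour, lst_k in samples:
        by_hour[hour].append(lst_k)
    return {hour: fmean(values) for hour, values in sorted(by_hour.items())}


def diurnal_amplitude(cycle):
    """Amplitude taken as max minus min of the hourly means."""
    return max(cycle.values()) - min(cycle.values())


# Synthetic samples: (local hour, LST in K) from several days.
samples = [(6, 280.0), (6, 282.0), (12, 300.0), (12, 304.0),
           (18, 310.0), (18, 306.0)]
cycle = mean_diurnal_cycle(samples)
amplitude = diurnal_amplitude(cycle)
print(cycle, amplitude)
```

Applying the same two functions to the satellite and the ground series at a station gives the pair of curves, and the pair of amplitudes, that the figure compares.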

#### **5. Discussion**

Available information on LST and DTR from remotely sensed data is deficient. Discrepancies and inconsistencies arise due to the quality of the satellite and the ground observations, differences in their spatial, spectral and temporal resolution, as well as differences in the inference methods and auxiliary data used. In principle, the well-established split window approach is known to perform better than the use of a single channel for deriving LST; however, the 12 µm channel is no longer available during the operational period of GOES 12–15. Homogenizing satellite observations to obtain a consistent long-term record therefore requires the use of observations from a single channel only.

With the advancement in archiving of satellite data, their maintenance in terms of calibration, geolocation, improved inference schemes and auxiliary information, it is timely to formulate an approach for deriving long-term, consistent, and calibrated data across multiple satellite sensors, as demanded by the user community. Progress has also been made in ground observations in terms of instrument characterization, guidelines for high quality maintenance and calibration. The issue of optimal coupling between satellite and ground observations is still widely debated.

LST is known to have large spatial variability at different temporal scales (diurnal, annual, inter-annual), and this variability has informative value. For instance, the importance of the diurnal cycle of LST has been widely recognized [62–64] and numerous attempts have been made to estimate it. An early attempt [65] used the International Satellite Cloud Climatology Project (ISCCP) data (the C-2 product at 280 km resolution) [66] in combination with ground observations to derive the monthly mean diurnal cycle of surface temperature over land (suitable for global climate studies). Duan et al. [64] estimated it using high spatial resolution clear-sky MODIS data, while Inamdar et al. [33] disaggregated the diurnal cycle of LST from the GOES pixel scale to the MODIS pixel scale. Yet, the daytime and nighttime products from polar-orbiting satellites (e.g., MODIS) do not fully represent the daily amplitude that can be obtained from GEO satellites. Our effort represents a contribution to the development of a framework for obtaining long-term records of consistent LST and DTR from the entire record of GOES satellites, using a physically based approach and utilizing the best currently available auxiliary information and the best available ground observations to evaluate the proposed approach.

In the evaluation process, factors that play a role include differences in ground instrumentation, their location above the surface, the method of estimating LST from the measured outgoing LW radiation, calibration and maintenance of the instruments, and scale issues between ground observations and satellite footprints. There is a need to ensure that the satellite observations used represent clear-sky conditions. Detailed information on each of these factors is needed for a full assessment of errors in the retrieved LST products. While under controlled short-term experiments the uncertainty of many of these factors can be minimized, the results obtained are not representative of extended areas and all seasons.

In this paper, the quality of the new product is evaluated against an extensive record of the best available observations and products that are accepted by the scientific community. Specifically, evaluation of a six-year record of instantaneous LST as well as monthly averages was performed against the DOE Atmospheric Radiation Measurement (ARM) site at the Southern Great Plains central facility, the BSRN/SURFRAD stations, MOD11 products and the Oklahoma MESONET sites.

While the quality of the instrumentation used at each site can be traced to factory specifications, it is not possible to establish how much differences in daily maintenance at each site contribute to the quality of the observations. The hypothesis of our approach is that, by using long-term observations at numerous sites and seasons, the evaluation results do provide an indication of the robustness of the approach. One of the major factors affecting the evaluation results is cloud screening, which varies among methodologies, as recently discussed in Ermida et al. [67]. Spatial and temporal variability in emissivity is also a contributing factor. As reported in [6], the brightness temperature error due to emissivity error in the 11 µm band is about 3% for a 0.5% error in emissivity and up to 5% for an emissivity error of 2.0%; these estimates are based on global simulations over a wide temperature range. To fully understand discrepancies between products, controlled experiments are needed to evaluate independently the factors that can cause differences. Until now, available retrievals have been based on different satellite observations, retrieval methodologies, atmospheric inputs and time periods. An early attempt to compare the performance of several well-known algorithms was presented in [6]. To make such an algorithm comparison consistent, the individual methodologies need to be modified; it is necessary to rederive the relevant coefficients of the algorithms in a systematic manner using the same inputs. The need for controlled experiments to facilitate discussion of the sources of discrepancies between methods has been recognized by the scientific community, and such experiments are conducted frequently. Examples are numerical model evaluation as conducted at Lawrence Livermore National Laboratory (https://pcmdi.llnl.gov/?projects/amip/0), and controlled experiments to estimate errors due to aerosols, as described in Randles et al. [68].

The objective of the current study is to present a credible methodology for generating long-term time series of LST at the best available spatial and temporal resolution (that are currently possible with a long-term outlook), and to evaluate it against the best available satellite products and ground observations. Long-term observations representing different climatic regions and seasons were used and provided a statistically robust indication of the soundness of the proposed approach. The need for further work to investigate the sources of discrepancies is also recognized. The limitations and advantages of each method, and their trade-offs, also need to be fully understood.
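The role of emissivity as a contributing factor, noted earlier in this section, can be illustrated with a deliberately simplified single-channel calculation. The sketch below is not the RTTOV-based retrieval and is not intended to reproduce the figures reported in [6]: it assumes a monochromatic 11 µm channel, no atmosphere and no reflected downwelling radiance, so that the surface-leaving radiance is the emissivity times the Planck radiance. The function names are hypothetical.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
K_B = 1.380649e-23   # Boltzmann constant, J/K


def planck_radiance(wavelength_m, temp_k):
    """Blackbody spectral radiance, W m^-2 sr^-1 m^-1."""
    a = 2.0 * H * C**2 / wavelength_m**5
    return a / math.expm1(H * C / (wavelength_m * K_B * temp_k))


def brightness_temperature(wavelength_m, radiance):
    """Inverse of planck_radiance: temperature (K) for a given radiance."""
    a = 2.0 * H * C**2 / wavelength_m**5
    return H * C / (wavelength_m * K_B) / math.log1p(a / radiance)


def lst_error_from_emissivity(true_lst_k, true_emis, emis_error,
                              wavelength_m=11e-6):
    """Retrieved-minus-true LST (K) when emissivity is off by emis_error.

    Assumption: radiance = emissivity * Planck radiance (no atmosphere,
    no reflected downwelling term).
    """
    surface_radiance = true_emis * planck_radiance(wavelength_m, true_lst_k)
    assumed_emis = true_emis + emis_error
    retrieved = brightness_temperature(wavelength_m,
                                       surface_radiance / assumed_emis)
    return retrieved - true_lst_k


error_k = lst_error_from_emissivity(300.0, 0.97, 0.005)
print(f"LST error for +0.005 emissivity error: {error_k:.2f} K")
```

In this idealized setting, overestimating the emissivity lowers the retrieved LST. A full treatment would add atmospheric transmittance, path radiance and reflected downwelling radiance, which change the magnitude of the sensitivity.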

#### **6. Summary**

In principle, the split window approach is known to perform better than a single channel for deriving LSTs, but the 12 µm channel is no longer available during the operation periods of GOES 12–15. Homogenizing satellite observations into a consistent long-term record therefore requires the use of single-channel observations.

We have implemented the RTTOV radiative transfer approach adjusted for GEO channel 4 to derive LST at a high resolution of about 5 km. The model is driven with the MERRA-2 reanalysis profiles for water vapor and temperature and the CAMEL product. A homogeneous six-year record of LST at 0.05° spatial resolution at hourly time scale was produced from GOES observations and evaluated for the period 2004–2009. A six-year climatology at monthly time scales was also derived and used to construct representative diurnal cycles for selected surface types.

The results show close agreement between the GEO and MOD11 products: the averaged correlation coefficient between them is over 0.9, the averaged difference is less than 2 K and the averaged *rmse* is less than 3.5 K. The derived LST is also highly correlated with ground-based observations; in most cases, the correlation coefficients are greater than 0.9. The mean differences between the satellite LST and the station LST are less than 1% and over 80% of the differences fall within 1 *std*. The performance of the retrieved LST for daytime and nighttime is comparable after elimination of outliers caused by imperfect cloud detection. The estimated quality of the LST information can serve as a guideline for users in a wide range of applications, such as a realistic representation of the diurnal cycle.

Future improvement would be possible with satellite observations of higher spatial resolution, the incorporation of surface emissivity at higher temporal resolution, improved/innovative methodologies to remove cloud contamination [68], and by accounting for anisotropy in emissivity (Pinheiro et al. [69]; Ermida et al. [70]).

**Author Contributions:** Conceptualization, R.P.; Data curation, G.H., E.B. and J.B.; Formal analysis, W.C.; Funding acquisition, S.H.; Investigation, S.H.; Methodology, R.P. and Y.M.; Project administration, K.C.-N.; Resources, C.H.; Software, Y.M. and T.I.; Validation, W.C.; Writing—original draft, R.P.; Writing—review & editing, K.C.-N.

**Funding:** This research was funded by the National Aeronautics and Space Administration, grant NNH12ZDA001N-MEASURES to Jet Propulsion Laboratory.

**Acknowledgments:** We acknowledge the ECMWF for providing the ERA-I data; the MERRA-2 data are provided by the Global Modeling and Assimilation Office (GMAO). Information from the MOD11\_L2: MODIS/Terra Land Surface Temperature and Emissivity 5-Minute L2 Swath 1 km V006 data base (Zhengming Wan, PI) (https://search.earthdata.nasa.gov/search/granules/collectiondetails?p=C194001236-LPDAAC\_ECS&m=-26.4375!136.6875!0!1!0!0%2C2&tl=1515438649!4!!&q=MOD11\_L2%20V006) was provided courtesy of the NASA EOSDIS Land Processes Distributed Active Archive Center (LP DAAC), USGS/Earth Resources Observation and Science (EROS) Center, Sioux Falls, South Dakota. GOES data were obtained from the NOAA Comprehensive Large Array data Stewardship System (CLASS) (https://www.bou.class.noaa.gov/saa/products/search?sub\_id=0&datatype\_family=GVAR\_IMG&submit.x=25&submit.y=8). The BSRN/SURFRAD data were provided by the NOAA Earth System Research Laboratory, Global Monitoring Division (https://www.esrl.noaa.gov/gmd/grad/surfrad/). Data were obtained from the Atmospheric Radiation Measurement (ARM) user facility, a U.S. Department of Energy (DOE) Office of Science user facility managed by the Office of Biological and Environmental Research. The team effort in generating and providing all the required data is greatly appreciated. We thank the anonymous Reviewers for constructive comments that helped to improve the manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest.
