#### **1. Introduction**

Land surface temperature (LST) is the radiative temperature of the land surface, and it plays a crucial role in understanding various environmental problems such as heatwaves, drought, wildfire, air quality, and urban heat islands [1–7]. Since LST reflects the energy flux stability at the boundary of the surface and atmosphere, it is also used as a major parameter in modeling global physical processes, including hydrological and biogeochemical cycles [8–10]. Therefore, it is important to obtain accurate LST over large areas at both high spatial and high temporal resolution.

With the continued development of remote sensing technology, LST has been retrieved from satellite data for large areas with high temporal and spatial resolution. Thermal infrared (TIR) sensors are the most widely used source of satellite-based LST. Several algorithms, such as single-channel, split-window, and temperature and emissivity separation (TES) techniques, have been developed to provide TIR-based LST [11]. One of the most well-known TIR-based LST datasets is the LST product from the moderate-resolution imaging spectroradiometer (MODIS) onboard the Terra and Aqua satellites. MODIS offers global LST products within a 1–2 K accuracy range, with a relatively high spatial resolution of 1 km, four times a day (two daytime and two nighttime LSTs). In addition, LST products are provided by several other TIR sensors with different specifications on both low earth orbit and geostationary orbit satellites: the visible infrared imaging radiometer suite (VIIRS), the spinning enhanced visible and infrared imager (SEVIRI), and the advanced spaceborne thermal emission and reflection radiometer (ASTER). Unfortunately, TIR-based LST is significantly affected by weather and atmospheric conditions; in particular, the surface temperature under clouds is not available. Several studies have attempted to fill the gaps in LST data caused by clouds [12–26].

Previous studies aimed at overcoming the lack of TIR-based LST data under cloudy areas can be divided into four groups. The first group reconstructs LSTs in cloudy areas by combining spatially, temporally, or spatiotemporally neighboring clear sky LSTs [13,18,22,25]. In particular, recent studies have looked at modeling by combining multiple algorithms, such as regression and interpolation, using spatial and temporal information of multi-temporal LSTs [16,21,26]. The second group not only uses multiple LSTs, but also auxiliary variables that are highly correlated with LST, to estimate cloudy sky LSTs, using statistical methods such as regression kriging and spline interpolation. However, the critical limitation in the methods of these first two groups is that they assume that LST under cloudy weather conditions is not different from that under clear sky conditions. In general, clouds reduce incoming shortwave radiation during daytime by blocking the Sun, and increase downward longwave radiation during nighttime. Thus, nighttime LSTs in cloudy conditions are only slightly lower than those under clear skies, while the difference in daytime LST is more significant [27]. Consequently, it is essential to model LSTs under cloudy conditions.
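To make the limitation of the first group concrete, the following is a minimal sketch of temporal gap filling by linear interpolation between clear sky neighbors. It is an illustration only and not the algorithm of any cited study; the function name `fill_gaps_linear` and the LST values are hypothetical.

```python
def fill_gaps_linear(series):
    """Fill None entries by linear interpolation between the nearest
    valid neighbours in time. Leading and trailing gaps stay None."""
    filled = list(series)
    valid = [i for i, v in enumerate(series) if v is not None]
    # Walk over consecutive pairs of clear sky observations
    for a, b in zip(valid, valid[1:]):
        for i in range(a + 1, b):
            w = (i - a) / (b - a)  # weight of the later observation
            filled[i] = series[a] * (1 - w) + series[b] * w
    return filled


# Hypothetical daytime LST series in K; None marks cloudy days
lst = [305.0, None, None, 308.0, None]
filled = fill_gaps_linear(lst)
```

Every filled value lies between the surrounding clear sky observations, so a sharp daytime drop under cloud cannot be reproduced by this kind of reconstruction.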

The third group uses physical modeling approaches based on surface energy balance (SEB) theory to derive cloudy LSTs from spatially neighboring clear sky LSTs. The effect of the clouds is simulated using a correction term that takes into account surface insolation, air temperature, and wind speed [15,17,23,24]. The SEB techniques, however, require complex parameterization with air temperature and wind speed as input data. Moreover, because the variation of LST is assumed to be driven by insolation during the daytime, the method cannot be applied at nighttime [17].

The fourth group uses passive microwave (PMW)-based data to overcome the issue of cloudy areas in TIR-based LST data. PMW-based data are less affected by water vapor and clouds than TIR-based LST data. Brightness temperature (BT), measured by the advanced microwave scanning radiometer-earth observing system (AMSR-E) and advanced microwave scanning radiometer 2 (AMSR2) sensors, is frequently used as PMW-based data to estimate LST. Although PMW-based BT is limited by its coarse spatial resolution (10–25 km), it can be used as supporting data in estimating missing values of TIR-based LSTs under cloudy conditions [14]. For example, Shwetha and Kumar resampled 25-km AMSR-E/AMSR2-based BTs directly to 1 km, using them as input variables for artificial neural networks with auxiliary data of elevation, latitude, and longitude to model the all-weather 1 km LST [19]. Meanwhile, rather than resampling BT to 1 km, many studies have derived the PMW-based LST at the original resolution of BT (i.e., 10–25 km) and then downscaled it to a high resolution to merge it with TIR-based LST [12,20]. PMW-based methods simulate cloudy sky LSTs based on the fact that PMW can penetrate clouds. However, the previous studies have limitations in terms of spatial accuracy in merging coarse PMW-based data with high-resolution TIR-based LST.

In South Korea, summers can often be scorching, causing a variety of disasters, including heatwaves and tropical nights. During these hot summers, Northeast Asia, especially South Korea, is usually covered by clouds transported by the East Asia monsoon [28]. Therefore, reconstruction methods that use temporally or spatially neighboring clear sky LSTs cannot be applied successfully in this area in summer because of the very high cloud cover rate. Moreover, daytime LST on humid summer days (i.e., July and August) in South Korea is generally high under clear sky conditions, but it drops sharply in cloudy weather, such as during the rainy season or typhoon periods. Previous studies, however, have failed to consider the variability (i.e., rapid change) of LST under cloudy conditions. In addition, many studies have used air temperature rather than LST as in situ data to validate their cloudy sky LST predictions [16,29,30]. However, it should be noted that air temperature and LST often show different patterns in regions with heterogeneous land surfaces [31,32].

This study proposes two different schemes for estimating all-weather 1 km MODIS LSTs for humid summer days over South Korea, based on machine learning, using multiple datasets made up of AMSR2 BTs and the annual cycle parameters (ACPs) of satellite TIR-derived LSTs. The first scheme (S1) is a two-step approach that first estimates 10 km LSTs and then downscales the LSTs from 10 km to 1 km. The second scheme (S2) is a one-step algorithm that directly estimates the 1 km all-weather LSTs. The primary objective of this study is to investigate how well the two proposed schemes simulate dynamic humid summer LSTs under clear and cloudy sky conditions through a series of validation processes. The remainder of this paper is organized as follows. Section 2 presents the study area and the data we used. Section 3 introduces the methods in detail, including the framework of our two proposed schemes. In Section 4, the distribution of clear and cloudy sky LSTs in the summer season is analyzed using in situ station data for daytime and nighttime. Then, the two different schemes are evaluated by a series of validations, especially using in situ LSTs for both clear and cloudy conditions. Finally, Section 5 presents the conclusion of this study.
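Both schemes have to bridge the scale gap between 10 km and 1 km grids. As a rough illustration of that gap only (the actual procedures of S1 and S2 are given in Section 3), the sketch below replicates each 10 km cell into 10 × 10 pixels of 1 km by nearest-neighbor upsampling. It assumes perfectly aligned, exactly nested grids, which is a simplification; the function names and BT values are hypothetical.

```python
def replicate_to_fine(coarse, factor=10):
    """Nearest-neighbour upsampling: copy each coarse cell into a
    factor x factor block of fine pixels."""
    fine = []
    for coarse_row in coarse:
        # Repeat each value `factor` times along the column axis
        fine_row = [value for value in coarse_row for _ in range(factor)]
        # Repeat the expanded row `factor` times along the row axis
        fine.extend(list(fine_row) for _ in range(factor))
    return fine


def parent_cell(row, col, factor=10):
    """Index of the coarse cell that contains fine pixel (row, col)."""
    return row // factor, col // factor


# Hypothetical 2 x 2 grid of 10 km brightness temperatures in K
bt_10km = [[270.0, 272.5], [268.0, 271.0]]
bt_1km = replicate_to_fine(bt_10km)  # 20 x 20 grid of 1 km pixels
```

Here each 1 km pixel simply inherits the value of its parent 10 km cell, so `bt_1km[13][7]` equals `bt_10km[1][0]`; any sub-cell detail has to come from additional 1 km predictors.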

#### **2. Study Area and Data**

#### *2.1. Study Area*

The study area is the mainland of South Korea with an area of approximately 99,728 km<sup>2</sup> (latitude 34° N–38.5° N and longitude 126° E–129.5° E) (Figure 1). South Korea generally has a humid, continental climate affected by the Asian monsoons, with a large amount of precipitation in summer during the rainy season (usually from the end of June to the end of July). The annual mean temperature is about 10–15 °C; August, the hottest month, has a mean temperature of 23–26 °C. Humidity ranges from 60% to 75% on a national scale, rising to 70%–85% in summer (July and August). The southern coast is subject to late-summer typhoons. As seen in Figure 1, the dominant land-cover categories of the study area are forest (67.7%), agricultural land (22.2%), urban areas (4.6%), and others, including grass, water, barren, and wetlands (5.5%).

**Figure 1.** (**a**) Study area and in situ reference data locations with 1 km elevation derived from the Shuttle Radar Topography Mission (SRTM) digital elevation model (DEM), and (**b**) land cover provided by the Ministry of Environment of South Korea (http://egis.me.go.kr).

#### *2.2. Satellite Data*

The AMSR2 onboard the global change observation mission 1st-water (GCOM-W1) satellite, launched in May 2012, provides global PMW-based BT data. It acquires a set of daytime and nighttime microwave data twice a day: the equator crossing time is 1:30 p.m. for the ascending pass, and 1:30 a.m. for the descending pass. The AMSR2 has seven frequencies, with both vertical and horizontal polarizations, and approximately 62 × 35, 62 × 35, 42 × 24, 22 × 14, 19 × 11, 12 × 7, and 5 × 3 km spatial resolution at 6.9, 7.3, 10.7, 18.7, 23.8, 36.5, and 89.0 GHz, respectively. Among these, we used the four frequencies (36.5, 23.8, 18.7, and 10.7 GHz) most commonly used for the estimation of LST in previous studies [30]. Low-frequency data resampled to a 10 km resolution were downloaded from the Japan Aerospace Exploration Agency (https://gcom-w1.jaxa.jp) for 2013–2018. We used daily MODIS daytime and nighttime Aqua LST data (MYD11A1) because the equator crossing times of Aqua MODIS are nearly the same as those of AMSR2 (1:30 p.m. for daytime and 1:30 a.m. for nighttime). The MYD11A1 LST data, which have 1 km spatial resolution, were retrieved using a generalized split-window algorithm [33]. The MYD11A1 products from 2013–2018 were downloaded from Earthdata Search (https://search.earthdata.nasa.gov/search). South Korea's elevation was retrieved from the shuttle radar topography mission (SRTM) digital elevation model (DEM), with 30 m spatial resolution (https://earthexplorer.usgs.gov). Global man-made impervious surface (GMIS) data with 30 m spatial resolution, derived from Landsat images for the year 2010 [34], were obtained to derive the fractional impervious surface used in this study.
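The 30 m layers described above are far finer than the 1 km MODIS grid, so they have to be aggregated before they can be used alongside LST. The aggregation method is not stated here; one common choice, shown below as a sketch under that assumption, is to average non-overlapping blocks of fine cells. Because 1 km is not an integer multiple of 30 m, real processing would also involve reprojection and resampling; the toy grid, the block factor of 2, and the function name `block_mean` are hypothetical.

```python
def block_mean(grid, factor):
    """Average non-overlapping factor x factor blocks of a 2D list.

    Cells equal to None are treated as missing and skipped; a block
    with no valid cells yields None.
    """
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    out = []
    for r0 in range(0, rows, factor):
        out_row = []
        for c0 in range(0, cols, factor):
            vals = [grid[r][c]
                    for r in range(r0, r0 + factor)
                    for c in range(c0, c0 + factor)
                    if grid[r][c] is not None]
            out_row.append(sum(vals) / len(vals) if vals else None)
        out.append(out_row)
    return out


# Toy 4 x 4 grid of percent imperviousness (hypothetical values)
fine = [
    [0, 0, 100, 100],
    [0, 100, 100, 100],
    [None, None, 50, 50],
    [None, None, 0, None],
]
coarse = block_mean(fine, 2)  # 2 x 2 grid of mean imperviousness
```

The same block averaging applies to elevation; for imperviousness, the block mean can be read directly as the fractional impervious surface of the coarse pixel.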

#### *2.3.* In Situ *LST Data*

In situ LSTs (1 a.m./p.m. and 2 a.m./p.m.) from 2013 to 2018, obtained from the automated surface observing systems (ASOSs) operated by the Korea Meteorological Administration, were used as reference data. As shown in Figure 1a, a total of 10 ASOSs were selected based on the following conditions: first, the stations close to the coastline were excluded, because satellite-based LST data could be contaminated by ocean water included in the grid; and second, the stations that still showed high bias after applying the bias-correction method described in Section 3.1 were also excluded from the reference data.
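The two screening conditions above amount to a simple filter over the candidate stations. The sketch below illustrates that logic only; the numeric thresholds are not given in this section, so `min_coast_km` and `max_abs_bias_k`, the `Station` fields, and all values are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class Station:
    name: str
    coast_km: float  # distance to the coastline in km (hypothetical field)
    bias_k: float    # residual bias after correction in K (hypothetical field)


def select_stations(stations, min_coast_km, max_abs_bias_k):
    """Keep stations that are far enough from the coast and whose
    residual bias magnitude is within the allowed limit."""
    return [s for s in stations
            if s.coast_km >= min_coast_km
            and abs(s.bias_k) <= max_abs_bias_k]


# Hypothetical candidates and thresholds, for illustration only
candidates = [
    Station("A", coast_km=40.0, bias_k=0.8),
    Station("B", coast_km=0.5, bias_k=0.3),    # too close to the coast
    Station("C", coast_km=60.0, bias_k=-4.2),  # bias too large
]
selected = select_stations(candidates, min_coast_km=1.0, max_abs_bias_k=2.0)
```

With these placeholder values only station "A" passes both conditions.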
