**2. Data Description**

This section introduces four SSM products derived from SMAP, two SSM products derived from SMOS, the in situ SSM measured by the REMEDHUS network, and the ancillary information used in this work. The data products are summarized in Table 1 and described in the following subsections.


**Table 1.** Summary of the data products used in this study.

#### *2.1. Soil Moisture Data*

#### 2.1.1. NASA SMAP Products

SMAP is a NASA mission within the Earth System Science Pathfinder (ESSP) program. The mission was launched in January 2015 with the main goal of measuring the SSM and the freeze/thaw state of the soil with high spatio-temporal resolution and global coverage [2]. The data products of this mission serve applications in many disciplines, including hydrology, weather and climate, meteorology, environmental sciences, agriculture, human health, and security [2,23]. Its scientific requirements are to provide estimates of soil moisture in the top 5 cm of the soil with a target accuracy of 0.04 m<sup>3</sup>/m<sup>3</sup> and a spatial resolution of 10 km every 3 days over continental land, excluding areas with standing water, high vegetation content (>5 kg/m<sup>2</sup>), or frozen ground, as well as urban or mountainous areas.

Three SMAP SSM products, which together provide four datasets, were investigated in this study: the SMAP L2 Radiometer (SMAPL2) with a spatial resolution of 36 km [24]; the SMAP Enhanced L2 Radiometer (SMAPL2\_E), posted on a 9 km grid [25] but still at the radiometer resolution (~40 km); and the SMAP/Sentinel-1 L2 Radiometer/Radar (SMAP\_AP), available at spatial resolutions of 3 km (SMAP\_AP3) and 1 km (SMAP\_AP1) [26].

The SMAPL2 is a radiometer-only based SSM product derived directly from the SMAP Level-1C TB (L1CTB) product in a 36 km Equal-Area Scalable Earth Grid 2.0 (EASEv2) grid. To obtain the SMAPL2 from the L1CTB, the Single Channel Algorithm at vertical polarization (SCA-V) is used [27]. In addition to SSM and TB observations, the ancillary data required to apply the retrieval algorithm is included in the product, namely surface temperature, vegetation opacity, vegetation single scattering albedo, surface roughness, land cover information, soil texture, together with data flags for identification of land, water, precipitation, radio frequency interference, urban areas, mountainous terrain, permanent ice, snow, and dense vegetation [27–29].

The SMAPL2\_E is derived from the SMAP Level-1C TB Enhanced (L1CTB\_E) product and contains SSM and TB data. The TB data are first interpolated with the Backus-Gilbert technique. This optimal interpolation takes advantage of the SMAP radiometer oversampling to generate an enhanced version of the TB that is posted on a 9 km grid. The SCA-V is then applied to these TB data to obtain the SSM retrievals [30].

The SMAP\_AP is generated by merging SMAP radiometer data with Sentinel-1A/1B data through a recently developed active/passive downscaling algorithm [12], given in Equation (1). The algorithm disaggregates the SMAP TB from a resolution of 36 km to 3 km or 1 km (depending on the speckle noise filtering) [31].

$$T_{B_p}(M_j) = \left[\frac{T_{B_p}(C)}{T_s} + \beta'(C) \cdot \left\{ \left[\sigma_{pp}(M_j) - \sigma_{pp}(C)\right] + \Gamma \cdot \left[\sigma_{pq}(C) - \sigma_{pq}(M_j)\right] \right\}\right] \cdot T_s \tag{1}$$

where *M* (medium) and *C* (coarse) are the spatial resolutions at which the variables are evaluated, *T<sub>s</sub>* is the land surface temperature, β′ is the active-passive microwave covariation parameter [12], σ<sub>*pp*</sub> and σ<sub>*pq*</sub> are the co-polarized and cross-polarized radar backscatter, respectively, and Γ represents the vegetation heterogeneity within a pixel at resolution *C*. The SSM at 3 km (or 1 km) is retrieved by applying the SCA-V to the disaggregated TB.
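As an illustration, Equation (1) can be evaluated for a single medium-resolution cell as in the following minimal sketch. The function and argument names are hypothetical and are not taken from the SMAP processing software. The sketch assumes that the radar term enters with its sign and that the backscatter values are expressed in the units in which β′ was estimated.

```python
def disaggregate_tb(tb_coarse, ts, beta, gamma,
                    sigma_pp_medium, sigma_pp_coarse,
                    sigma_pq_medium, sigma_pq_coarse):
    """Evaluate Equation (1) for one medium-resolution cell M_j.

    tb_coarse : T_Bp(C), coarse-resolution brightness temperature (K)
    ts        : T_s, land surface temperature (K)
    beta      : beta'(C), active-passive covariation parameter
    gamma     : Gamma, vegetation heterogeneity parameter
    sigma_*   : co-pol (pp) and cross-pol (pq) backscatter at the
                medium (M_j) and coarse (C) resolutions
    All names are illustrative assumptions.
    """
    if ts <= 0:
        raise ValueError("ts must be a positive temperature in kelvin")
    # Signed radar term: co-pol anomaly plus Gamma times cross-pol anomaly.
    radar_term = ((sigma_pp_medium - sigma_pp_coarse)
                  + gamma * (sigma_pq_coarse - sigma_pq_medium))
    # Work in emissivity space (TB / Ts), then convert back to TB.
    return (tb_coarse / ts + beta * radar_term) * ts
```

When the medium-resolution backscatter equals the coarse-resolution backscatter, the radar term vanishes and the function returns the coarse TB unchanged, which is a convenient sanity check.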

Descending orbits (06:00 am) of all the SMAP products were used in this study, since their local overpass time matches that of the ascending orbits of the SMOS products.

#### 2.1.2. BEC SMOS Products

The SMOS satellite was launched in November 2009, and it is the second Earth observation mission of ESA's Living Planet program [32,33]. After 10 years in orbit, many studies have contributed to understanding and improving the quality of SMOS soil moisture products. This mission was designed to observe both soil moisture and ocean salinity, as required by climatological, meteorological, hydrological, and oceanographic applications. The SMOS instrument, the Microwave Imaging Radiometer with Aperture Synthesis (MIRAS), is the first L-band (1.4 GHz) interferometric radiometer in space. It provides global views of the Earth at multiple incidence angles (from 0° to 65°) with a spatial resolution of 35–40 km and a temporal resolution of 3 days [34].

The SMOS Level 3 (L3) and Level 4 (L4) SSM products used in this study are provided by the BEC [35], an ESA Expert Support Laboratory (ESL) for SMOS L1 and L2 ocean salinity. The BEC SMOS L3 SSM product (SMOSL3) is generated directly from the L2 SSM after discarding invalid retrievals by applying quality filters to each grid point. A weighted average based on a data quality index is then used to bin the data from the Icosahedral Snyder Equal Area (ISEA) grid to the 25 km EASEv2 grid [22].
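The binning step can be illustrated with the sketch below. It assumes that each L2 retrieval has already been assigned to a target EASEv2 cell and that the weights are the inverse square of the data quality index. Both the weighting function and all names are assumptions made for illustration, since the exact scheme used by the BEC is described in [22] and is not reproduced here.

```python
from collections import defaultdict


def weighted_bin(retrievals, weight=lambda dqx: 1.0 / (dqx * dqx)):
    """Quality-weighted average of SSM retrievals per target grid cell.

    retrievals : iterable of (cell_id, ssm, dqx) tuples, where cell_id is
                 the target cell each retrieval falls in (assumed known)
    weight     : maps the quality index to a weight; the inverse-square
                 default is an assumption, not the documented BEC scheme
    """
    numerator = defaultdict(float)
    denominator = defaultdict(float)
    for cell_id, ssm, dqx in retrievals:
        # Skip retrievals without a usable quality index (assumed filter).
        if ssm is None or dqx is None or dqx <= 0:
            continue
        w = weight(dqx)
        numerator[cell_id] += w * ssm
        denominator[cell_id] += w
    return {cell: numerator[cell] / denominator[cell] for cell in numerator}
```

With this weighting, a retrieval whose quality index is half that of another contributes four times as much to the cell average.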

The BEC SMOS L4 SSM (SMOSL4) product is derived from the SMOSL3 using a semi-empirical downscaling algorithm, Equation (2), which links the SSM with the TB, a vegetation index (NDVI), and the land surface temperature (LST) [20,21]:

$$SSM = b_0 + b_1 \cdot LST + b_2 \cdot NDVI + \frac{b_3}{3} \cdot \sum_{i=1}^{3} T_{BH_{\theta_i}} + \frac{b_4}{3} \cdot \sum_{i=1}^{3} T_{BV_{\theta_i}} \tag{2}$$

where *T<sub>BH</sub>* and *T<sub>BV</sub>* are the TB at horizontal and vertical polarizations, respectively, at three incidence angles (32.5°, 42.5°, and 52.5°). The *b* parameters are the downscaling coefficients associated with each variable. The downscaling is applied daily and the resulting L4 SSM maps are posted on the MODIS 1 km grid.
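A minimal sketch of how Equation (2) is evaluated for one pixel is given below. The function name and argument layout are hypothetical, and the sketch only evaluates the linear model. It does not show how the *b* coefficients are estimated.

```python
def downscaled_ssm(b, lst, ndvi, tbh, tbv):
    """Evaluate Equation (2) for a single pixel.

    b    : sequence of the five coefficients (b0, b1, b2, b3, b4)
    lst  : land surface temperature
    ndvi : vegetation index
    tbh  : TB at horizontal polarization at the three incidence angles
    tbv  : TB at vertical polarization at the three incidence angles
    All names are illustrative assumptions.
    """
    if len(b) != 5:
        raise ValueError("b must contain five coefficients")
    if len(tbh) != 3 or len(tbv) != 3:
        raise ValueError("tbh and tbv must each contain three values")
    b0, b1, b2, b3, b4 = b
    # (b3 / 3) * sum(...) is b3 times the mean TB over the three angles.
    return (b0 + b1 * lst + b2 * ndvi
            + b3 / 3.0 * sum(tbh)
            + b4 / 3.0 * sum(tbv))
```

Because each TB sum is divided by 3, *b*<sub>3</sub> and *b*<sub>4</sub> act on the mean TB over the three incidence angles at each polarization.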

Ascending orbits (06:00 am) were selected for all the SMOS products used in this study.

#### 2.1.3. REMEDHUS Network

The Soil Moisture Measurements Station Network of the University of Salamanca (REMEDHUS) is an in situ network located in the central part of the Duero basin (41.1° to 41.5°N; 5.1° to 5.7°W). It contains 20 soil moisture monitoring stations that provide information at different depths (here we use exclusively the topsoil data at 5 cm depth), and four automatic weather stations that measure precipitation, air temperature, relative humidity, wind speed, and solar radiation [36]. These stations are located within a nearly flat area of 1300 km<sup>2</sup> in a semi-arid Continental-Mediterranean agricultural region. This area receives an average annual precipitation of 385 mm, and it has a mean temperature of 12 °C [37]. Most of the region is dedicated to growing rainfed cereals, as shown in Figure 1. Other land uses within this area include irrigated crops, fallow, vineyards, and forest-pasture. The stations record the SSM every hour; for this study, the hourly data were aggregated to a daily average [38].
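The aggregation of hourly records to daily averages can be sketched as follows. The input layout (pairs of timestamp and SSM value) and the handling of missing values are assumptions made for illustration and are not a description of the procedure in [38].

```python
from collections import defaultdict
from datetime import date, datetime


def daily_mean(hourly_records):
    """Average hourly SSM readings into one value per calendar day.

    hourly_records : iterable of (datetime, ssm) pairs; ssm may be None
                     or NaN for missing readings, which are skipped
    Returns a dict mapping each date to the mean of its valid readings.
    """
    by_day = defaultdict(list)
    for timestamp, ssm in hourly_records:
        # ssm != ssm is True only for NaN.
        if ssm is None or ssm != ssm:
            continue
        by_day[timestamp.date()].append(ssm)
    return {day: sum(values) / len(values)
            for day, values in sorted(by_day.items())}
```

Days with no valid reading are simply absent from the result, so downstream comparisons only use days on which the station reported data.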

**Figure 1.** CCI land cover map (at 300 m) over the Iberian Peninsula (left) and a close-up of the REMEDHUS area (right). Black dots depict the 20 in situ SSM stations of the REMEDHUS network available for the study period (from April 2015 to December 2017). The distribution of the land cover within the REMEDHUS area is: agriculture, 95.45% (cropland, 75.44%; irrigated, 16.11%; other, 3.90%); forest, 2.70%; grassland, 0.63%; wetland, 0%; settlement, 0.26%; and other, 0.95%.

#### *2.2. Ancillary Data*

#### Climate Change Initiative: Land Cover

The ESA Climate Change Initiative (CCI) program includes a variety of biological, physical, and chemical variables known as Essential Climate Variables (ECV). Here the CCI land cover (LC) is used, which provides information on the geographical distribution of global land cover at a resolution of 300 m [39,40]. The CCI LC from 2015 is used in this study to characterize the dominant land cover within each SMOS/SMAP pixel. Minimal differences were observed in the CCI LC over the study region during the period 2013–2017.
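One simple way to characterize the dominant land cover within a pixel is to take the most frequent class among the 300 m LC cells that fall inside it, as in the sketch below. The class labels are hypothetical placeholders rather than actual CCI LC codes, and the assignment of LC cells to a SMOS/SMAP pixel is assumed to have been done beforehand.

```python
from collections import Counter


def dominant_land_cover(lc_classes):
    """Return the most frequent land cover class and its area fraction.

    lc_classes : sequence of class labels, one per 300 m LC cell located
                 inside a given satellite pixel (labels are illustrative)
    """
    if not lc_classes:
        raise ValueError("the pixel contains no land cover cells")
    counts = Counter(lc_classes)
    # most_common(1) returns a list with the single (label, count) pair
    # of the most frequent class.
    label, count = counts.most_common(1)[0]
    return label, count / len(lc_classes)
```

Returning the fraction alongside the label makes it possible to flag heterogeneous pixels in which no single class clearly dominates.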

#### **3. Methodology**

#### *3.1. Statistical Analysis of SSM Time Series at the Network Scale*

Ground-based SSM measurements from REMEDHUS have been selected as a benchmark for a cross-validation of the multi-scale remotely sensed SSM products. REMEDHUS stations were placed by the Water Resources Research group of the University of Salamanca (responsible for the maintenance of the network) in areas in which the land use, during the years from 2015 to 2017, was one of the following: fallow, rainfed, forest-pasture, vineyard, or irrigated. A thorough analysis of the 20 operational in situ stations available during the study period and their comparison to satellite data was performed. For the sake of clarity and simplicity, in this work we focus on 11 of them (see Table 2). They cover the five land uses, and therefore allow the impact of land use on the downscaled products to be studied, and they also provide a good spatial representation when averaged at the network scale.

In this first step of the analysis, we used the data provided by six stations (H13, H9, J3, K13, N9, and O7) representative of the five different land uses over the REMEDHUS network. The SMAP and SMOS time series of the pixels overlapping these stations have been statistically evaluated against the in situ SSM at two spatial levels: low resolution (from 9 km up to ~40 km) and high resolution (3 km and 1 km). Performance metrics, namely Pearson's correlation coefficient (R), the root mean square error (RMSE), the unbiased root mean square error (uRMSE), and the bias, together with the number of available samples (N), have been computed for each station-pixel pair. These performance metrics have been calculated exactly as described in [41].
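For reference, the sketch below computes these metrics for one station-pixel pair using their commonly used definitions. It is an illustration only: the sign convention for the bias (satellite minus in situ) and the handling of missing values are assumptions, and the exact formulation followed in this work is the one given in [41].

```python
import math


def validation_metrics(satellite, in_situ):
    """Compute N, R, RMSE, uRMSE and bias for collocated SSM series.

    satellite, in_situ : equally long sequences of daily SSM values;
                         None or NaN marks a missing sample
    """
    # Keep only the days on which both series have a valid value.
    pairs = [(s, g) for s, g in zip(satellite, in_situ)
             if s is not None and g is not None
             and not math.isnan(s) and not math.isnan(g)]
    n = len(pairs)
    if n < 2:
        raise ValueError("at least two collocated samples are required")
    mean_s = sum(s for s, _ in pairs) / n
    mean_g = sum(g for _, g in pairs) / n
    bias = mean_s - mean_g  # assumed convention: satellite minus in situ
    rmse = math.sqrt(sum((s - g) ** 2 for s, g in pairs) / n)
    # uRMSE: RMSE of the anomalies, i.e. after removing each series' mean.
    urmse = math.sqrt(
        sum(((s - mean_s) - (g - mean_g)) ** 2 for s, g in pairs) / n)
    cov = sum((s - mean_s) * (g - mean_g) for s, g in pairs)
    var_s = sum((s - mean_s) ** 2 for s, _ in pairs)
    var_g = sum((g - mean_g) ** 2 for _, g in pairs)
    if var_s > 0 and var_g > 0:
        r = cov / math.sqrt(var_s * var_g)
    else:
        r = float("nan")  # correlation is undefined for a constant series
    return {"N": n, "R": r, "RMSE": rmse, "uRMSE": urmse, "bias": bias}
```

With these definitions, RMSE<sup>2</sup> = uRMSE<sup>2</sup> + bias<sup>2</sup>, so a product with a constant offset from the in situ data has a nonzero RMSE but a uRMSE of zero.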


**Table 2.** Land use of the region where 11 in situ stations of the REMEDHUS network were located, for 2015, 2016, and 2017 (provided by the Water Resources Research group of the University of Salamanca). The land uses are: fallow (F), rainfed (R), forest-pasture (FP), vineyard (V), irrigated (I).

Since rainfed is the most common land cover type within the REMEDHUS area (see Figure 1), the second step of the analysis consisted of reproducing the same statistical evaluation, but using only the data from the stations located over rainfed/fallow land uses (F11, H13, J12, J14, K10, M9, and O7). The average value of all these rainfed/fallow stations was compared to the average of the respective SMAP and SMOS pixels covering these stations.
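The network-scale averaging can be sketched as below. The sketch assumes that each station (or pixel) series is stored as a mapping from day to SSM, and that a day is averaged over whichever stations report a value on that day. Both choices are assumptions for illustration, and the names are hypothetical.

```python
def network_average(series_by_station):
    """Average several daily SSM series into one network-scale series.

    series_by_station : dict mapping a station (or pixel) identifier to a
                        dict of {day: ssm}; missing days may be absent or
                        stored as None
    Returns a dict {day: mean over the stations with data on that day}.
    """
    all_days = set()
    for series in series_by_station.values():
        all_days.update(series)
    averaged = {}
    for day in sorted(all_days):
        values = [series[day] for series in series_by_station.values()
                  if series.get(day) is not None]
        if values:
            averaged[day] = sum(values) / len(values)
    return averaged
```

The same function can be applied to the in situ stations and to the satellite pixels covering them, so that both averages are built in the same way before the metrics are computed.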

Additionally, statistical scores have been obtained for all seasons (DJF: December, January, February; MAM: March, April, May; JJA: June, July, August; SON: September, October, November). This analysis is needed to evaluate whether the precision (R), accuracy (bias) and quadratic errors (RMSE/uRMSE) of the studied products/methodologies have any seasonal dependence.
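The seasonal stratification can be sketched as follows, using the season definitions given above. The record layout (date, satellite SSM, in situ SSM) is an assumption for illustration. Each seasonal subset can then be evaluated with the same metrics as the full series.

```python
from datetime import date

# Month number to season label, following the DJF/MAM/JJA/SON definition.
SEASON_OF_MONTH = {
    12: "DJF", 1: "DJF", 2: "DJF",
    3: "MAM", 4: "MAM", 5: "MAM",
    6: "JJA", 7: "JJA", 8: "JJA",
    9: "SON", 10: "SON", 11: "SON",
}


def split_by_season(records):
    """Group collocated samples by meteorological season.

    records : iterable of (date, satellite_ssm, in_situ_ssm) tuples
    Returns a dict mapping each season label to a list of
    (satellite_ssm, in_situ_ssm) pairs, pooled over all years.
    """
    groups = {"DJF": [], "MAM": [], "JJA": [], "SON": []}
    for day, sat, ground in records:
        groups[SEASON_OF_MONTH[day.month]].append((sat, ground))
    return groups
```

Note that samples are pooled across years, so the DJF group combines December of one year with January and February of the next.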
