2.3.2. ERA5

ERA5 is the most recent reanalysis produced by ECMWF, using a recent version of the ECMWF Integrated Forecasting System. The data cover the Earth on a 30 km grid and resolve the atmosphere using 137 levels from the surface up to a height of 80 km. ERA5 uses a vast number of observations, including several reprocessed datasets [32]. Although ERA5 is still in production, the data for the period of this study are already available; soil moisture fields for the top 7 cm were extracted for the selected target sites and the 6-year study period. After averaging the 00 UTC and 12 UTC fields to obtain daily values, the 1-day time series were used to construct 18-day temporal average fields every 5 days.
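The daily averaging and 18-day windowing described above can be sketched as follows. This is a minimal illustration, not the processing chain used in the study: the function names are hypothetical, and the window alignment (windows starting at day 0, 5, 10, ...) and the handling of missing days (ignored within each window) are assumptions, since the text does not specify them.

```python
import numpy as np

def daily_mean_from_synoptic(sm_00utc, sm_12utc):
    """Average the 00 UTC and 12 UTC fields into one value per day."""
    a = np.asarray(sm_00utc, dtype=float)
    b = np.asarray(sm_12utc, dtype=float)
    return 0.5 * (a + b)

def windowed_average(daily, window=18, step=5):
    """Mean over `window` consecutive days, one window every `step` days.

    Missing days (NaN) are ignored; a window with no valid data yields NaN.
    Assumption: windows start at day 0, step, 2*step, ... and only complete
    windows are returned.
    """
    daily = np.asarray(daily, dtype=float)
    out = []
    for start in range(0, len(daily) - window + 1, step):
        chunk = daily[start:start + window]
        valid = chunk[~np.isnan(chunk)]
        out.append(valid.mean() if valid.size else np.nan)
    return np.array(out)
```

For a 28-day series, `windowed_average` returns three values, covering days 0 to 17, 5 to 22 and 10 to 27.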

#### *2.4. Ground-Based Soil Moisture*

In-situ soil moisture from REMEDHUS (target site E, see Figure 1, Table 1) was obtained through the International Soil Moisture Network [33]. The REMEDHUS soil moisture monitoring network is composed of 23 automated stations deployed within an area of 1300 km<sup>2</sup> in a semi-arid sector of the Duero basin in Spain. Each station is equipped with capacitance probes providing hourly measurements over the top 5 cm of soil with a reported accuracy of 0.003 m<sup>3</sup>·m<sup>−3</sup>. This network has been continuously operating and quality-controlled since 2005 and is therefore well suited for validation of multi-year satellite time series. Further details on the network can be found in [12]. For the period of study, data from 17 REMEDHUS stations were available. Data from these stations were first averaged daily and then used to construct 18-day temporal average fields every 5 days (see Section 2.5).

#### *2.5. Temporal Averaging and Filtering of SMOS Data*

SMOS L3 daily SM maps need to be pre-processed to ensure smooth spatio-temporal transitions and representative soil moisture states in the climatology. To this end, outliers were first detected and screened out from the 1-day maps. This was done by comparing the SM retrieved at each pixel with the retrievals within the one-degree box centered on that pixel. Any pixel value failing the Tukey outlier test [34] was removed from the dataset. In the next step, the filtered 1-day SM maps were used to construct 18-day temporal average fields every five days. Although the daily SMOS SM maps are useful for the monitoring and evaluation of episodic events, we opted for an averaging period of 18 days to increase the spatial coverage of the resulting maps, reduce random retrieval errors and filter out high-frequency modes that can be considered as noise when calculating climatological averages. The 18-day temporal window is the closest to the SMOS repeat cycle and the one generally chosen to avoid orbital artifacts (e.g., [35]).
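The outlier screening step can be illustrated with the sketch below. It is not the authors' implementation: the function name is hypothetical, the box is taken as 5 × 5 pixels (an assumed approximation of one degree on a 25 km grid), the fence multiplier is the conventional Tukey value of 1.5, and the central pixel is included in its own neighbourhood statistics. None of these choices is stated in the text.

```python
import numpy as np

def tukey_filter(sm, half_width=2, k=1.5):
    """Set to NaN every pixel outside the Tukey fences of its neighbourhood.

    sm         : 2-D array (lat, lon) of soil moisture, NaN where no retrieval
    half_width : box half-size in pixels (2 gives a 5x5 box; assumed value)
    k          : fence multiplier (1.5 is the conventional choice; assumed)
    """
    sm = np.asarray(sm, dtype=float)
    out = sm.copy()
    nrow, ncol = sm.shape
    for i in range(nrow):
        for j in range(ncol):
            if np.isnan(sm[i, j]):
                continue
            # Box centred on (i, j), clipped at the map edges
            box = sm[max(i - half_width, 0):i + half_width + 1,
                     max(j - half_width, 0):j + half_width + 1]
            vals = box[~np.isnan(box)]
            if vals.size < 4:
                # Too few neighbours to estimate quartiles: keep the pixel
                continue
            q1, q3 = np.percentile(vals, [25, 75])
            iqr = q3 - q1
            if sm[i, j] < q1 - k * iqr or sm[i, j] > q3 + k * iqr:
                out[i, j] = np.nan
    return out
```

The explicit double loop keeps the logic readable; for global daily maps a vectorised or sliding-window formulation would be preferable.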

As shown in detail by Robock et al. [36], there are two distinct scales that determine the variations of SM in time and space. The small scale, referred to as the hydrological or land-surface-related scale, is on the order of days and tens of meters. Soil moisture can vary on this scale due to variations in soil properties, vegetation, and topography or drainage patterns. This small-scale variability is intertwined with a much larger scale, on the order of weeks-to-months and tens-to-hundreds of kilometers, that is mainly due to atmospheric forcing. Microwave satellite measurements integrate over relatively large areas on the order of ∼25 km with a typical revisit of 3 days. At these spatio-temporal scales, the short-term (up to 3 days) and small-scale (tens of meters) land surface component of soil moisture variability appears as random (white) noise in comparison with the long-term (about 1–4 months) and large-scale (about 400–800 km) signal related to atmospheric forcing.

Examples of the time series of the original 1-day SM, together with the 18-day average, are shown in Figure 3 for the 8 target sites, which cover a variety of vegetation seasonality and climatic conditions (Figure 1, Table 1). Dots represent the daily values and the solid line corresponds to the 18-day average every five days. It can be seen that the variability of the daily signal is captured in the filtered time series, where the six SM annual cycles can be clearly identified. Differences in the magnitude and extent of rainy seasons among years are more clearly distinguished in the filtered series (e.g., sites E and H). Also, notice the opposite timing of wet and dry seasons in Northern and Southern Africa (sites F and G), which reflects the displacement of the Inter-Tropical Convergence Zone (ITCZ). Site D exhibits limited temporal variability in both the original and the filtered series.

**Figure 3.** Time series of the 1-day SMOS-based soil moisture retrievals (dots), overlaid with the 18-day average every five days (solid line), at the target locations.

Notice that not all the 25-km pixels in these 18-day averaged SM fields are continuously observed. SMOS's wide swath (1000 km) and polar orbit allow for a 3-day global revisit period. However, a substantial amount of data is missing due to radio-frequency interference masking L-band measurements, particularly in South-East Asia [27]. In addition, no retrievals are attempted in areas of high topography (e.g., the Himalayas) or over snow-covered soils. The latter strongly reduces data availability at high latitudes, especially during the fall and winter seasons. The global map in Figure 4 shows the SMOS temporal coverage for the study period, with 1 representing 100% coverage of the filtered and temporally averaged SMOS signal. In this study, only the pixels with a minimum of 80% temporal coverage were used in order to ensure representativeness and robustness of the signal decomposition and of the results presented. As we will show later in the study, this threshold does not exclude areas with limited data in winter due to snow. The impact of these "intermittent" data gaps on the STL decomposition is discussed in Sections 3 and 4.
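The coverage criterion can be expressed compactly, as in the sketch below. It assumes that the 18-day averaged fields are stored as a (time, lat, lon) array with NaN marking missing values; the array layout and the function names are hypothetical and not taken from the study.

```python
import numpy as np

def temporal_coverage(sm_stack):
    """Fraction of time steps with a valid (non-NaN) value at each pixel.

    sm_stack : 3-D array (time, lat, lon) of 18-day averaged soil moisture
    Returns a (lat, lon) array in [0, 1], where 1 means 100% coverage.
    """
    sm_stack = np.asarray(sm_stack, dtype=float)
    return np.mean(~np.isnan(sm_stack), axis=0)

def coverage_mask(sm_stack, min_coverage=0.8):
    """Boolean (lat, lon) mask, True where coverage is at least min_coverage."""
    return temporal_coverage(sm_stack) >= min_coverage
```

Pixels where the mask is `False` would be excluded from the subsequent decomposition.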

**Figure 4.** Global map of SMOS temporal coverage during the study period.
