*3.6. Downscaling Process*

This study employed four ML techniques (RF, XGBoost, LGBM, and GA-BP) to downscale the CYGNSS SM retrievals to 3 km. All four downscaling methods operate on a common principle: they establish a statistical relationship between the CYGNSS observables, geospatial variables (such as elevation and land cover type), land-surface variables (such as NDVI and precipitation), and SMAP SM at a coarse resolution of 36 km. SMAP SM served as the reference (target) value that the downscaling model was trained to predict. The model output and the input covariates were linked by the following equation:

$$\text{SM} = f(\rho_1, \rho_2, \rho_3, \dots, \rho_n) + \varepsilon \tag{11}$$

where SM represents the downscaled SM data, determined by the regression function *f* of the machine learning models (RF, XGBoost, LGBM, and GA-BP); *ε* is the model retrieval error; and *ρ*<sub>1</sub>, *ρ*<sub>2</sub>, *ρ*<sub>3</sub>, ..., *ρ*<sub>*n*</sub> represent the input covariates (i.e., SNR, SR, LES, TES, NDVI, DEM, land cover type, and precipitation), with *n* denoting the total number of predictors. The steps of the downscaling method can be briefly summarized as follows:


using in situ data. Spatial analysis of the downscaled CYGNSS SM is conducted using MODIS EVI and MODIS EV products.

**Figure 3.** Daily sampling counts of CYGNSS and their corresponding matched counts with SMAP.

The experimental process is based on the assumption that the relationship among SMAP SM, CYGNSS observables, and auxiliary variables is consistent across spatial scales. In other words, the relationship models established at the coarse resolution are assumed to remain applicable at the high resolution [39,53,54]. The experimental process is shown in Figure 4.

**Figure 4.** Flow chart of downscaling procedure.
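The scale-invariance assumption behind Equation (11) can be illustrated with a minimal sketch: a regression function *f* is fitted at 36 km against the reference SM and then applied unchanged to predictors at 3 km. The sketch below is not the implementation used in this study. It assumes scikit-learn's `RandomForestRegressor` as a stand-in for the RF model, and the helper functions, array names, sample sizes, and the synthetic response are all hypothetical placeholders filled with random numbers rather than real CYGNSS, SMAP, or auxiliary data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Predictor names follow the covariates listed for Eq. (11); the order is arbitrary.
PREDICTORS = ["SNR", "SR", "LES", "TES", "NDVI", "DEM", "land_cover", "precipitation"]
SCALE = 36 // 3  # 12 fine (3 km) cells per coarse (36 km) cell along each axis

rng = np.random.default_rng(0)


def synthetic_predictors(n_cells):
    """Random placeholder predictors in [0, 1); NOT real CYGNSS or auxiliary data."""
    return rng.uniform(0.0, 1.0, size=(n_cells, len(PREDICTORS)))


def synthetic_sm(X):
    """Made-up SM response, used only so the sketch has a target to fit."""
    ndvi = X[:, PREDICTORS.index("NDVI")]
    precip = X[:, PREDICTORS.index("precipitation")]
    noise = rng.normal(0.0, 0.01, size=len(X))
    return 0.05 + 0.20 * ndvi + 0.15 * precip + noise


# 1) Coarse scale: predictors at 36 km paired with the reference SM (SMAP in the text).
n_coarse = 400
X_coarse = synthetic_predictors(n_coarse)
sm_smap = synthetic_sm(X_coarse)

# 2) Fit the regression function f of Eq. (11) at the coarse scale.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_coarse, sm_smap)

# 3) Fine scale: apply the same f to 3 km predictors (scale-invariance assumption).
n_fine = 5 * SCALE * SCALE  # fine cells covering five coarse cells
X_fine = synthetic_predictors(n_fine)
sm_fine = model.predict(X_fine)

print(sm_fine.shape)
```

Because a random forest averages training targets within its leaves, the fine-scale predictions in this sketch cannot fall outside the range of the coarse-scale reference SM; other regressors, such as a neural network, carry no such guarantee.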
