*2.4. Soil Moisture Retrieval Algorithm*

In previous spaceborne GNSS-R soil moisture (SM) retrieval studies, the primary methodology has been to establish a relationship between the GNSS-R derived land surface reflectivity and reference SM values, which assumes that the coherent component dominates the GNSS-R land scattering field. In surface electromagnetic scattering theory, the surface reflectivity is a function of the incidence angle of the incoming signal and the Fresnel reflection coefficient; the latter is mainly affected by the near-surface SM [1]. The solid lines in Figure 3 show the simulated relationship between reflectivity and SM at different incidence angles, where the semi-empirical Dobson model is used to map soil moisture to complex permittivity [36]. The surface reflectivity increases monotonically as the soil becomes wetter, and the response of reflectivity to a change in SM from 0.0 cm³/cm³ to 0.7 cm³/cm³ can reach 10 dB. The effect of the incidence angle on the relationship between SM and reflectivity is negligible when the incidence angle is less than 60°. In our CYGNSS SM inversion experiments, the DDM peak value of the coherent scattered power is taken from the CYGNSS Level-1 data as the left-hand side of Equation (7). Since small-scale roughness and overlying vegetation cover attenuate the scattered signal, the roughness and vegetation corrections in Equation (7) directly use the roughness coefficient and VOD parameter provided in the SMAP product for each individual observation. Although the influence of the signal incidence angle is small, the incidence angle correction proposed in [25] is still applied in this work. The effect of this correction is shown by the dashed lines in Figure 3.

**Figure 3.** The relationship between reflectivity and soil moisture at different incidence angles of the GPS signal, and the effect of the incidence angle correction.
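To make the dependence of reflectivity on permittivity and incidence angle concrete, the following Python sketch evaluates the smooth-surface Fresnel reflectivity. It is an illustration only and rests on assumptions not stated in the text: the function name `fresnel_reflectivity_lr` is hypothetical, the cross-polarized (right-hand circular in, left-hand circular out) combination of the linear Fresnel coefficients is assumed, and the permittivity is passed in directly rather than derived from SM with the Dobson model [36].

```python
import numpy as np


def fresnel_reflectivity_lr(eps, theta_deg):
    """Smooth-surface Fresnel power reflectivity, cross-polarized circular case.

    eps       : relative permittivity of the soil (real or complex), assumed given
    theta_deg : incidence angle in degrees
    """
    theta = np.deg2rad(theta_deg)
    cos_t = np.cos(theta)
    root = np.sqrt(eps - np.sin(theta) ** 2)

    # Linear-polarization Fresnel reflection coefficients.
    r_hh = (cos_t - root) / (cos_t + root)
    r_vv = (eps * cos_t - root) / (eps * cos_t + root)

    # Assumed cross-polarized circular combination (RHCP in, LHCP out).
    r_lr = 0.5 * (r_vv - r_hh)
    return np.abs(r_lr) ** 2


def to_db(reflectivity):
    """Convert a linear power reflectivity to decibels."""
    return 10.0 * np.log10(reflectivity)
```

For example, `to_db(fresnel_reflectivity_lr(eps, 30.0))` can be evaluated for a low and a high permittivity to compare a drier and a wetter soil. The permittivity values are left to the reader because the SM-to-permittivity mapping used for Figure 3 is not reproduced here.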

Because CYGNSS measurements are pseudorandomly distributed, are affected by observation noise, and sample spatially varying surface roughness and vegetation cover at the specular point, it is currently difficult to establish a reliable SM retrieval model directly at the GNSS-R specular point that accounts for all these factors. The optimal approach is to improve the SNR of the reflectivity by space-time averaging and to form a gridded retrieval model [18]. Since the SM reference data used in this work come from the SMAP Level-3 version 6 product, the individual CYGNSS reflectivities calculated with Equation (8) within one day are projected onto the global cylindrical 36 km × 36 km EASE-Grid 2.0 grid to align with the reference SM values, and the average reflectivity is taken as the grid value. Here, we apply a data quality control criterion: if fewer than three reflectivity values are projected into a grid cell, the corresponding grid observation is considered invalid for that day. Next, time matching is used to pair the gridded reflectivity with the SMAP SM to build the training dataset, and pixels flagged in the SMAP SM data as inland water or urban areas are masked. Finally, the retrieval model is fitted at each grid cell. Because the local surface soil moisture usually varies over a limited range within a year, a linear model can achieve high modeling accuracy. Therefore, the training samples are used to fit a linear model between the mean reflectivity and the reference SM values pixel by pixel:

$$\text{SM}_{i,j}^{\text{CYGNSS}} = a_{i,j} \Gamma_{i,j} + b_{i,j} \tag{15}$$

where *a* and *b* are the model parameters to be fitted, *i* and *j* are the row and column indices of the cell in the 36 km × 36 km EASE-Grid 2.0 grid, and Γ is the grid mean reflectivity after space-time averaging.
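The gridding, quality control, and pixel-by-pixel fitting described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names are hypothetical, the EASE-Grid 2.0 row and column indices of each specular point are assumed to be computed beforehand, masked or missing SMAP values are assumed to be stored as NaN, and ordinary least squares via `numpy.polyfit` is assumed for the fit in Equation (15).

```python
import numpy as np

MIN_COUNT = 3  # quality-control threshold stated in the text


def grid_daily_mean(rows, cols, gamma, shape, min_count=MIN_COUNT):
    """Average one day's reflectivities per grid cell.

    rows, cols : integer grid indices of each observation (assumed precomputed)
    gamma      : reflectivity of each observation
    shape      : (n_rows, n_cols) of the target grid
    Cells with fewer than min_count observations are set to NaN (invalid).
    """
    total = np.zeros(shape)
    count = np.zeros(shape, dtype=int)
    # np.add.at accumulates correctly when the same cell index repeats.
    np.add.at(total, (rows, cols), gamma)
    np.add.at(count, (rows, cols), 1)

    mean = np.full(shape, np.nan)
    valid = count >= min_count
    mean[valid] = total[valid] / count[valid]
    return mean


def fit_pixel_models(gamma_stack, sm_stack, min_samples=2):
    """Fit SM = a * Gamma + b independently at every pixel.

    gamma_stack, sm_stack : arrays of shape (n_days, n_rows, n_cols),
                            time-matched, with NaN marking invalid samples.
    Returns the coefficient grids a and b (NaN where no fit was possible).
    """
    _, n_rows, n_cols = gamma_stack.shape
    a = np.full((n_rows, n_cols), np.nan)
    b = np.full((n_rows, n_cols), np.nan)
    for i in range(n_rows):
        for j in range(n_cols):
            g = gamma_stack[:, i, j]
            s = sm_stack[:, i, j]
            ok = np.isfinite(g) & np.isfinite(s)
            # Skip pixels with too few samples or no spread in reflectivity.
            if ok.sum() < min_samples or g[ok].max() == g[ok].min():
                continue
            a[i, j], b[i, j] = np.polyfit(g[ok], s[ok], 1)
    return a, b
```

In this sketch, `grid_daily_mean` would be called once per day and its outputs stacked along the time axis before being passed to `fit_pixel_models` together with the time-matched reference SM. The `min_samples` default is an illustrative choice and is not taken from the text.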
