#### 2.1.2. Data from Land Surface Model

As target data, we use the CLDAS soil volumetric water content analysis product, which is published by the China Meteorological Data Service Centre. Comparing the quality-controlled SM observation data from automated monitoring stations in China with the CLDAS-V2.0 soil volumetric water content data shows that the CLDAS soil volumetric water content product fits the actual ground observation data [24], with a national regional average correlation coefficient of 0.89, a root mean square error (RMSE) of 0.02 m<sup>3</sup>/m<sup>3</sup>, and a deviation of 0.01 m<sup>3</sup>/m<sup>3</sup>. Therefore, CLDAS SM products are considered of higher quality than similar international products (such as the GLDAS and NLDAS products), and they also offer better spatial and temporal resolution. In addition, we use the SM data from the SM ground model GEOS-5 as the input product for triple collocation (TC). The GEOS-5 model provides hourly products, and this study uses the 05:30 surface SM dataset to represent the SM between 0 and 7 cm within the surface layer [25].

#### 2.1.3. MODIS Data and Terrain Data

The auxiliary input data included land surface temperature (LST) and vegetation index (VI), and the VI data further included the enhanced VI (EVI) and the NDVI [26]. The daily LST data were provided by the MODIS-Terra LST product (MOD11A1) with a spatial resolution of 1 km. The EVI and NDVI data were provided by the MODIS-Terra VI products (MOD13A2) and have a temporal resolution of 16 days and a spatial resolution of 1 km. This study also uses data from the MODIS-Terra surface reflectance product (MOD09A1), with a temporal resolution of 8 days and a spatial resolution of 500 m. The annual land cover data (MCD12Q1) were also used here. The specific spatial distribution of vegetation cover is shown in Figure 2. In addition, we use the results of the SRTM 90-m digital elevation model (DEM) data and the calculated slope data (SLOPE) based on the DEM data for the study area. Table 2 summarizes the datasets used. The above data were obtained from the land processes distributed active archive center (https://lpdaac.usgs.gov/, accessed on 29 March 2021).

**Figure 2.** Map of vegetation cover (MCD12Q1).



#### 2.1.4. In Situ Observations

The measured data used in this paper consist mainly of SM data and precipitation data from stations in the Qinghai area. The SM data are hourly measurements at a soil depth of 10 cm from six automated SM stations (Delingha, Dulan, Golmud, Nuomuhong, Tianjun, and Wulan). The precipitation data are daily and come from seven stations: Yeniugou, Xiaozaohuo, Dachaidan, Chaka, Wudaoliang, Xinghai, and Qumarai.

#### 2.1.5. Data Preprocessing

Active and passive microwave data were not available for the study area from December to March because snow cover and frozen soil typically cover the land surface in winter, which might introduce large biases in satellite-retrieved products such as SM [27]. Therefore, this study uses the satellite data and ground station data for Qinghai Province (31°–40° N, 89°–103° E) from 1 April 2017 to 30 November 2017 and from 1 April 2018 to 30 November 2018. To apply satellite observation data and SM data in the NN model, all data were resampled to a grid of 0.25° × 0.3125°. SMAP and AMSR2 data were resampled by bilinear interpolation, ASCAT and FY3C data by the inverse distance weighted method, and CLDAS, MODIS, and terrain data by simple average aggregation. Since the passive microwave datasets (SMAP, AMSR2, and FY3C) include both ascending and descending orbits, we processed the data from these orbits separately. In addition, the brightness temperature data of the passive microwave data can be divided into vertical and horizontal polarization channels, based on which the microwave polarization difference index (MPDI) is calculated as [28]

$$\text{MPDI} = (\text{Tbv} - \text{Tbh}) / (\text{Tbv} + \text{Tbh}), \tag{1}$$

where Tbv and Tbh are the brightness temperatures of the vertical and horizontal polarizations, respectively. Based on the assumption that the microwave channel is not subject to strong atmospheric attenuation, the MPDI is designed to eliminate the influence of surface temperature on microwave signals. It is a normalized polarization difference, which can serve as an indicator of SM status as a function of incidence angle. The MPDI is also sensitive to the dielectric properties of soil, and even more so to the surface roughness. Therefore, the MPDI is high for flat surfaces but relatively low for rough surfaces, such as areas with vegetation cover. In the preprocessing of the ASCAT data, the σ40 time series of each grid cell was renormalized to the range [0, 1], which means that the highest (lowest) backscatter value measured in this study is assigned the value 1 (0). The backscatter time index obtained from these preprocessed ASCAT data is abbreviated "BTI" [29]. This processing method emphasizes the temporal pattern of the ASCAT signal and has been shown to reduce the retrieval time. However, since the processing is performed per grid cell, it might reduce the spatial information provided by the radar.
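As a minimal sketch of the two preprocessing steps described above, the following Python functions compute the MPDI of Equation (1) and rescale the σ40 time series of one grid cell to [0, 1] to obtain the BTI. The function names, the use of `None` for missing observations, and the sample values are illustrative assumptions and are not taken from the study.

```python
def mpdi(tbv, tbh):
    """Microwave polarization difference index, Equation (1)."""
    return (tbv - tbh) / (tbv + tbh)


def backscatter_time_index(sigma40_series):
    """Min-max rescale one grid cell's sigma40 time series to [0, 1].

    None marks a missing observation and is passed through unchanged
    (an assumption of this sketch).
    """
    valid = [s for s in sigma40_series if s is not None]
    if not valid:  # nothing observed for this cell
        return list(sigma40_series)
    lo, hi = min(valid), max(valid)
    if hi == lo:  # constant series: no temporal signal to rescale
        return [None if s is None else 0.0 for s in sigma40_series]
    return [None if s is None else (s - lo) / (hi - lo) for s in sigma40_series]


# Hypothetical values for one grid cell (K for Tb, dB for sigma40).
print(mpdi(270.0, 250.0))
print(backscatter_time_index([-12.0, -10.0, None, -8.0]))  # [0.0, 0.5, None, 1.0]
```

Because the rescaling uses the minimum and maximum of each cell separately, every cell spans the full [0, 1] range, which is why the text notes that spatial contrast between cells may be reduced.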

#### *2.2. Method*

#### 2.2.1. Triple Collocation Method

The traditional error estimation method typically compares retrieved SM data with actual observation data obtained from ground stations. However, such comparisons are usually limited in the number and location of instrument verification points, which makes it difficult to ensure robust datasets. In addition, the spatial mismatch between the ground data and the remote-sensing satellite data, as well as the heterogeneity of the ground surface, leads to representativeness errors and scale-conversion errors. Therefore, we used TC analysis [30–32] to estimate SM error. Compared with the traditional method of estimating SM error, TC (1) does not require a high-quality reference dataset, which means that it can verify the three different SM datasets in the study area without ground measurement data; (2) simultaneously obtains the error variances of the three different SM datasets; and (3) avoids the representativeness error caused by the spatial mismatch between the ground measurement data and the remote-sensing satellite data in the traditional estimation method. (4) Finally, the improved extended TC method [33] also estimates the correlation between each retrieved SM dataset and the actual surface-layer SM. The error variance is expressed by [33]

$$\begin{aligned} \sigma_{\varepsilon_X}^2 &= \sigma_X^2 - \frac{\sigma_{XY}\sigma_{XZ}}{\sigma_{YZ}},\\ \sigma_{\varepsilon_Y}^2 &= \sigma_Y^2 - \frac{\sigma_{YX}\sigma_{YZ}}{\sigma_{XZ}},\\ \sigma_{\varepsilon_Z}^2 &= \sigma_Z^2 - \frac{\sigma_{ZX}\sigma_{ZY}}{\sigma_{XY}}, \end{aligned} \tag{2}$$

where σ<sub>X</sub><sup>2</sup> is the variance of X, and σ<sub>XY</sub> is the covariance of X and Y. The correlation coefficient is [33]

$$R_X = \sqrt{\frac{\sigma_{XY}\sigma_{XZ}}{\sigma_X^2\,\sigma_{YZ}}}, \tag{3}$$

where R<sub>X</sub> is the correlation between X and the unknown true SM state.

In this study, the SMs retrieved by the NN method, collected from satellite data, and obtained from the ground model GEOS-5 are triple-matched in the form [NN data, satellite data, reanalysis data] to estimate the error variance and correlation coefficient of the TC. The SM obtained from the NN method is evaluated based on these results.
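To make the TC estimates concrete, the sketch below evaluates the error variances of Equation (2) and the extended-TC correlations from sample covariances of three collocated series. It is only an illustration under stated assumptions: the synthetic "truth", the noise levels, the scaling factors, and all function names are hypothetical, and the three error terms are generated independently, as TC requires.

```python
import random


def cov(a, b):
    """Sample covariance of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (n - 1)


def triple_collocation(x, y, z):
    """Return (error variances, correlations with the truth) for x, y, z."""
    sxx, syy, szz = cov(x, x), cov(y, y), cov(z, z)
    sxy, sxz, syz = cov(x, y), cov(x, z), cov(y, z)
    # Variance of each dataset that is explained by the common signal.
    signal = (sxy * sxz / syz, sxy * syz / sxz, sxz * syz / sxy)
    variances = (sxx, syy, szz)
    err_var = tuple(v - s for v, s in zip(variances, signal))
    # Clip at zero so sampling noise cannot produce a negative radicand.
    corr = tuple(max(s / v, 0.0) ** 0.5 for s, v in zip(signal, variances))
    return err_var, corr


# Synthetic example: one hidden truth seen by three noisy "products".
random.seed(0)
truth = [random.gauss(0.25, 0.05) for _ in range(5000)]
nn = [t + random.gauss(0.0, 0.01) for t in truth]
sat = [0.8 * t + 0.02 + random.gauss(0.0, 0.02) for t in truth]
model = [1.1 * t + random.gauss(0.0, 0.03) for t in truth]

err_var, corr = triple_collocation(nn, sat, model)
print("error variances:", err_var)
print("correlations with truth:", corr)
```

In this sketch the two outputs are linked by R<sub>X</sub><sup>2</sup> = 1 − σ<sub>εX</sub><sup>2</sup>/σ<sub>X</sub><sup>2</sup>, so a dataset with a small error variance relative to its total variance has a high correlation with the unknown truth.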

#### 2.2.2. Evaluation Index

This study uses spatial correlation, temporal correlation, root mean square error, and mean absolute error to quantitatively analyze the aspects that differentiate SM retrieved by the NN and SM obtained from a model, as well as aspects that differentiate SM retrieved by the NN and SM collected from ground stations.
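The four indices defined in this section can be sketched in a few lines of Python. The data layout (a list of daily maps, each stored as a flat list of pixel values), the function names, and the sample numbers are assumptions made for illustration; the significance filter applied to the daily spatial correlations in the text is omitted here for brevity.

```python
def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5


def rmse(a, b):
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5


def mae(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def mean(values):
    return sum(values) / len(values)


def evaluate(retrieved, reference):
    """Both inputs: list of daily maps, each a flat list of pixel values."""
    n_pix = len(retrieved[0])
    # One correlation per day, computed across all pixels of that day.
    spatial = [pearson(r, c) for r, c in zip(retrieved, reference)]
    # One (retrieved, reference) time-series pair per pixel.
    series = [([day[p] for day in retrieved], [day[p] for day in reference])
              for p in range(n_pix)]
    return {
        "rho_spatial": mean(spatial),
        "rho_temporal": mean([pearson(r, c) for r, c in series]),
        "rmse": mean([rmse(r, c) for r, c in series]),
        "mae": mean([mae(r, c) for r, c in series]),
    }


# Hypothetical 3-day, 3-pixel example (m3/m3).
retrieved = [[0.10, 0.20, 0.30], [0.12, 0.22, 0.35], [0.15, 0.24, 0.33]]
reference = [[0.11, 0.19, 0.31], [0.13, 0.23, 0.33], [0.16, 0.26, 0.35]]
scores = evaluate(retrieved, reference)
print(scores)
```

The spatial index is averaged over days, whereas the temporal correlation, RMSE, and MAE are first computed per pixel and then averaged over the map, mirroring the definitions given for each index.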

(1) Spatial correlation: ρspatial

Spatial correlation serves to evaluate how accurately the retrieval reproduces the spatial pattern of SM. It is obtained by calculating the Pearson correlation coefficient between the retrieved SM map and the simulated SM map over the whole area, which produces one correlation value per day. For a better comparison, the average spatial correlation is calculated as the average of all daily spatial correlations that are significant at the 95% level.

(2) Temporal correlation: ρtemporal

Temporal correlation is used to evaluate how well the retrieved SM matches the temporal variations in the SM. It is a location-related metric calculated at the pixel level. The Pearson correlation between the retrieved time series and the modeled SM is calculated for each pixel, which gives a correlation map. The mean temporal correlation is the mean value of all the pixels in the temporal correlation map.

(3) Root mean square error: RMSE

The RMSE is calculated based on the unit error and the deviation from a reference of the unit error. Therefore, it provides a comprehensive assessment of the retrieval, including both its accuracy and its precision. The RMSE is calculated at the pixel level by using the original SM time series, and a map of the RMSE is obtained for each retrieval. The mean RMSE is the mean value of all the pixels in the RMSE map.

(4) Mean absolute error: MAE

The MAE is the mean absolute error between the retrieved SM and the simulated SM. It is calculated at the pixel level, and each retrieval generates a map. The mean MAE is the mean value of all the pixels in the MAE map.

#### 2.2.3. Downscaling Scheme Based on Neural Network

A NN [34–36] is essentially a system to do nonlinear mathematical calculations and can represent any complex nonlinear process. The multivariable nature and nonlinear ability of NN fully exploit the synergy between different data. The NN used in this study has three parts: (1) an input layer, which receives the satellite observation data and auxiliary variable inputs; (2) a hidden layer; and (3) an output layer, which provides the SM. This structure suffices to fit any continuous function.

The NN was trained with satellite observation data as input data and the corresponding ground model data as target data. The training dataset must represent the entire range of expected scenarios, which means that it must include all climate regimes and seasons. If the training data are well selected, the NN's performance when applied to the training data should differ little from its performance when applied to the entire dataset. Similarly, a NN should perform in the same way when applied to two sufficiently representative but completely different target datasets, meaning that any potential local or regional bias in the target data is corrected. These characteristics can be traced to the fact that the estimated spatial-temporal structure of the NN is determined by satellite observations instead of by target data [37]. In addition, the NN correlates the satellite observations with the most common SM among the input values in the target data, regardless of the location or acquisition time of the data [29].

The NN constructed in this study uses the Levenberg-Marquardt (LM) [38,39] training algorithm and applies error backpropagation [40] to update the weights. Since the LM algorithm stops when it finds a local minimum, the error surface is not fully explored. Therefore, in this study, the NN training was repeated four times, each time using random initial NN weights to ensure different starting points on the error surface; the optimal NN was selected for retrieving SM products.
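The restart strategy described above can be illustrated with a small, dependency-free sketch. Note the assumptions: plain stochastic gradient descent is swapped in for the Levenberg-Marquardt algorithm to keep the example short, the "optimal" network is taken to be the one with the lowest training error (the selection criterion is not specified in the text), and the network size, learning rate, epoch count, and toy data are all hypothetical.

```python
import math
import random


def init_weights(n_in, n_hidden, rng):
    """Random starting point on the error surface."""
    return {
        "w1": [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "w2": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    """One tanh hidden layer followed by a linear output unit."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["w1"], net["b1"])]
    return hidden, sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]


def mse(net, xs, ys):
    return sum((forward(net, x)[1] - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(net, xs, ys, epochs=100, lr=0.02):
    """Stochastic gradient descent with error backpropagation (not LM)."""
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            hidden, out = forward(net, x)
            d_out = 2.0 * (out - y)
            for j, h in enumerate(hidden):
                # Gradient at hidden unit j, using w2[j] before its update.
                d_hid = d_out * net["w2"][j] * (1.0 - h * h)
                net["w2"][j] -= lr * d_out * h
                net["b1"][j] -= lr * d_hid
                for i, xi in enumerate(x):
                    net["w1"][j][i] -= lr * d_hid * xi
            net["b2"] -= lr * d_out
    return net


def best_of_restarts(xs, ys, n_restarts=4, n_hidden=4, seed=0):
    """Train from several random initializations and keep the best network."""
    rng = random.Random(seed)
    candidates = []
    for _ in range(n_restarts):
        net = train(init_weights(len(xs[0]), n_hidden, rng), xs, ys)
        candidates.append((mse(net, xs, ys), net))
    losses = [loss for loss, _ in candidates]
    best_loss, best_net = min(candidates, key=lambda c: c[0])
    return best_net, best_loss, losses


# Hypothetical toy data: two inputs standing in for satellite observations.
data_rng = random.Random(1)
xs = [[data_rng.random(), data_rng.random()] for _ in range(40)]
ys = [0.3 + 0.2 * math.sin(3.0 * x0) - 0.1 * x1 for x0, x1 in xs]
best_net, best_loss, losses = best_of_restarts(xs, ys)
print("loss per restart:", losses)
print("selected loss:", best_loss)
```

Each restart generally ends at a different final loss because it starts from a different point on the error surface; keeping the best of four runs reduces the chance of reporting a poor local minimum.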

The key step in downscaling SM in this study is to build a statistical relationship using low-spatial-resolution data and then input high-spatial-resolution data into that relationship to obtain the downscaled SM [26,41,42]. The spatial scales of the different data are unified by different resampling methods, as described in Section 2.1.5. In particular, the low-spatial-resolution microwave data are resampled to 1 km spatial resolution by replication expansion, without changing the specific values, to make them consistent with the spatial resolution of MODIS and other auxiliary data. Figure 3 shows a flow chart of this process, which is described as follows:


**Figure 3.** Schematic of the proposed SM downscaling method.
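The replication expansion and the application of a coarse-scale relationship to fine-scale inputs can be sketched as follows. This is a simplified illustration: `model` is a hypothetical stand-in for the trained NN with a single microwave input and a single auxiliary input, and the grids and the expansion factor are made-up values.

```python
def replicate(coarse, factor):
    """Expand a 2-D grid by repeating each cell factor x factor times.

    Values are copied unchanged, so only the grid spacing becomes finer.
    """
    fine = []
    for row in coarse:
        expanded = [value for value in row for _ in range(factor)]
        fine.extend(list(expanded) for _ in range(factor))
    return fine


def downscale(model, coarse_microwave, fine_auxiliary, factor):
    """Apply a relationship fitted at coarse scale to fine-scale inputs.

    model: callable (microwave_value, auxiliary_value) -> SM (hypothetical).
    """
    fine_microwave = replicate(coarse_microwave, factor)
    return [[model(m, a) for m, a in zip(m_row, a_row)]
            for m_row, a_row in zip(fine_microwave, fine_auxiliary)]


coarse = [[1, 2], [3, 4]]
print(replicate(coarse, 2))
# [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]

fine_aux = [[10, 20, 30, 40] for _ in range(4)]
print(downscale(lambda m, a: m + a, coarse, fine_aux, 2))
```

All sub-pixel detail in the output comes from the fine-resolution auxiliary grid, since the replicated microwave values are constant within each original coarse cell.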

#### **3. Results and Discussion**

#### *3.1. Selection of Microwave Band*

By studying the quality of data retrieved from various satellites and in different bands, we identified the SM data from different sensors. This exercise was done using the CLDAS SM dataset as reference data. Although the NN model was trained on a small subset of the available dataset, the entire dataset was used for retrieval and evaluation. Table 3 summarizes the average quality index of the SM calculated by comparing a single microwave input dataset with the target SM dataset (CLDAS). Table 4 summarizes the average quality index of the SM calculated by comparing a combination microwave input dataset with the target SM dataset (CLDAS). Below, we discuss in detail the results of using different satellites and different bands.


**Table 3.** Average quality index for soil moisture calculated by comparing a single microwave input dataset with data from CLDAS.

**Table 4.** Average quality index for soil moisture calculated by comparing different microwave observation combinations with data from CLDAS.


SMAP\_TBV\_A: Ascending Tbv in the 1.41 GHz (SMAP) band; AMSR2\_TBH\_D: Descending Tbh in the 10.7 GHz (AMSR2) band; AMSR2\_MPDI\_D: Descending MPDI in the 10.7 GHz (AMSR2) band; σ40: σ40 from ASCAT; BTI: BTI from ASCAT.

As shown in Table 3, in the 1.41 GHz (SMAP) band, Tbv has a higher spatial sensitivity to SM than Tbh, and the quality of Tbv in the ascending orbit exceeds that of Tbv in the descending orbit, with the average spatial correlation higher by 0.014, the average temporal correlation higher by 0.087, and the RMSE and MAE lower by 0.001 and 0.002, respectively. In the 6.9, 7.3, and 10.7 GHz bands, Tbh is more sensitive to SM than is Tbv. In addition, the MPDI obtained from preprocessing in these bands also has greater spatial sensitivity to SM. Among the AMSR2 microwave data, the 10.7 GHz band produces greater spatial and temporal correlation and lower RMSE and MAE in Qinghai Province than the 6.9 and 7.3 GHz bands and FY3C's 10.7 GHz band, which indicates a higher sensitivity to SM. In addition, the experiments show that Tbh and MPDI are highly similar in terms of spatial distribution in the 6.9, 7.3, and 10.7 GHz bands but differ significantly from Tbv in the 1.41 GHz band, which leads to the assumption that complementary relationships exist between them. The processed BTI data were also more sensitive to soil moisture than the original σ40 data, with an increase of 0.01 in the average spatial correlation and an increase of 0.002 in the average temporal correlation, whereas the RMSE and MAE decreased by 0.002 and 0.001, respectively. Finally, the best microwave band combination in Qinghai Province was selected by joint retrieval using Tbv in the ascending orbit of the 1.41 GHz (SMAP) band, Tbh and MPDI in the descending orbit of the 10.7 GHz (AMSR2) band, and the BTI and σ40 data of ASCAT. Figure 4 shows the raw images of these five single bands, a map of daily average SM as obtained by the NN, and a map of the temporal correlation between the NN SM and the CLDAS SM.

**Figure 4.** From top to bottom: (**a**) Ascending Tbv in 1.41 GHz band (SMAP); (**b**) Descending Tbh in 10.7 GHz band (AMSR2); (**c**) Descending MPDI in 10.7 GHz band (AMSR2); (**d**) BTI data from ASCAT; and (**e**) σ40 data from ASCAT. From left to right: original image, map of daily average soil moisture as obtained by the NN, and map of temporal correlation between NN soil moisture and CLDAS soil moisture.

Figure 4 shows the SM monitoring capability of different wavebands in Qinghai Province. The first row of Figure 4 shows that the original Tbv data at 1.41 GHz indicate higher temperatures in bare land and forest areas than in grassland areas, and both bare land and forest areas have lower daily average SM, which means that more information is needed to distinguish forest areas from bare lands when retrieving SM. Meanwhile, lake areas, such as the Qinghai Lake area in the northeast, are coolest. The temporal correlation map indicates a weak negative correlation in hotter areas (higher Tbv), such as the bare lands in the northwest (Qaidam Basin) and the bare lands in the northeast corner, and a strong positive correlation in cooler areas. The original Tbh maps in the second row reveal a high sensitivity to vegetation; combining these with the maps of average daily SM shows that higher vegetation coverage and higher temperature result in greater SM. However, the poor distinction between the bare lands in the northwest and the mixture of bare lands and grassland in the southwest indicates that more information is needed to distinguish between these two areas. The MPDI image in the third row shows a spatial distribution highly similar to that of Tbh in the second row. The BTI maps in the fourth row show higher BTIs in the grassland in the southeast and in the hinterland of the Qaidam Basin in the northwest, which reflects greater SM in the SM map, whereas the mixture of bare lands and grassland around the Qaidam Basin has a lower BTI, which reflects a lower SM. Therefore, the correlation map shows a negative correlation for BTI in the northwest corner of the Qaidam Basin and a strong positive correlation for the mixtures of bare lands and grassland in the southwest and in the northeast. The σ40 image in the fifth row shows that its spatial distribution is highly similar to that of the NN SM and CLDAS SM temporal correlation map of BTI in the fourth row. However, the differentiation between different areas of vegetation cover is worse in the SM map.

Figure 5 shows that the spatial distributions of SM obtained by NN retrieval from the four different combinations of data are highly similar to each other, and the overall SM increases from northwest to southeast, which clearly distinguishes bare soil areas, bare soil and grassland mixed areas, grassland areas, and forest areas, making up for the limited detection capability of the single microwave bands in Figure 4. Figure 5b,d, with BTI data added, do a better job of distinguishing bare soil areas than Figure 5a,c, with σ40 added (i.e., bare soil areas in Figure 5b,d have lower SM values and are better distinguished from grassland areas in the same image). The NN SM and CLDAS SM temporal correlation maps show that all four images are poorly correlated with CLDAS data in the northwest bare soil region and in the southeast corner of the mixed forest-steppe region but achieve better correlation in all other regions. Figure 5a,b show greater positive correlation in the northeast region than do Figure 5c,d, which confirms that the descending-orbit Tbh in the 10.7 GHz (AMSR2) band is more capable of detecting SM information in Qinghai Province than is the descending-orbit MPDI in the same band.

The results in Table 4 show that the combination of SMAP\_TBV\_A and AMSR2\_TBH\_D produces higher-quality SM data than the combination of SMAP\_TBV\_A and AMSR2\_MPDI\_D. Taking the CLDAS soil moisture data as a benchmark, the spatial and temporal correlations reach 0.621 and 0.393, respectively, for the combination of SMAP\_TBV\_A and AMSR2\_TBH\_D, and 0.600 and 0.354, respectively, for the combination of SMAP\_TBV\_A and AMSR2\_MPDI\_D. Given also the high similarity between Tbh and MPDI (see Figure 4), AMSR2\_TBH\_D is used as a final input variable. Furthermore, the addition of σ40 and BTI to the above NN reduces the RMSE and MAE. Compared with the original σ40 data, BTI data obtained after preprocessing translate into a greater temporal correlation between the SM obtained by the NN model and CLDAS data. Moreover, active and passive microwaves have different sensitivities to SM, vegetation, and surface roughness, and the 5.3 GHz observation frequency of ASCAT differs significantly from that of SMAP (1.41 GHz) and AMSR2 (10.7 GHz). Therefore, the ASCAT dataset is considered a potentially useful dataset that could complement the combination of passive microwave data in the NN, and BTI is used as a final input variable.

**Figure 5.** From left to right: Map of daily average SM retrieved by applying the NN, map of temporal correlation between NN soil moisture and CLDAS soil moisture. From top to bottom: (**a**) NN input data: SMAP\_TBV\_A, AMSR2\_TBH\_D, and σ40. (**b**) NN input data: SMAP\_TBV\_A, AMSR2\_TBH\_D, and BTI. (**c**) NN input data: SMAP\_TBV\_A, AMSR2\_MPDI\_D, and σ40. (**d**) NN input data: SMAP\_TBV\_A, AMSR2\_MPDI\_D, and BTI. SMAP\_TBV\_A: Ascending Tbv in the 1.41 GHz (SMAP) band; AMSR2\_TBH\_D: Descending Tbh in the 10.7 GHz (AMSR2) band; AMSR2\_MPDI\_D: Descending MPDI in the 10.7 GHz (AMSR2) band; σ40: σ40 from ASCAT; BTI: BTI from ASCAT.

Finally, the input variables are ascending Tbv in the 1.41 GHz (SMAP) band, descending Tbh in the 10.7 GHz (AMSR2) band, and BTI data from ASCAT.

#### *3.2. Selection of Auxiliary Data*

This section discusses the results of a collaborative analysis of microwave data and auxiliary input data for SM retrieval. The purpose is to determine the content and type of information that can be extracted from microwave data and auxiliary observation data and determine how to combine these data to provide maximal information for SM retrieval. Experimental trials were conducted to add and combine various auxiliary input data based on microwave data and to retrieve SM from different combinations of datasets using the NN model. These results are compared with the CLDAS data to determine the optimal combination (see detailed results in Table 5). In addition, for completeness, the SM products retrieved from all available data are compared among themselves.

**Table 5.** Average quality index of soil moisture calculated from auxiliary input variables compared with data from CLDAS.


The brightness temperature and backscattering coefficient obtained by active and passive microwave data are all affected by the opacity of vegetation cover, which reduces the radiation from the soil surface. Therefore, information about the vegetation strongly affects SM retrieval. Table 5 also shows that adding the VI data to the microwave data improves spatial and temporal correlations and reduces MAE and RMSE. Compared with EVI, NDVI improves the spatial correlation to 0.623. Given the high correlation between NDVI and EVI, NDVI is used as a final input variable.

Terrain data such as DEM and SLOPE also play an important role in the retrieval of SM by physical models. The complex mountainous terrain reduces the quality of the microwave data retrieved. In addition, precipitation is mainly concentrated at higher altitudes in many areas of Qinghai Province, leading to relatively lush vegetation cover, which strongly affects the SM. As a result, DEM and SLOPE are also used as final input variables. Table 5 shows that adding DEM to the NN model improves the spatial correlation to 0.634 and the temporal correlation to 0.412. When NDVI, DEM, and SLOPE are all added to the NN model, the spatial correlation reaches 0.676, and the temporal correlation reaches 0.450. The surface temperature information strongly affects the soil surface emissivity, which directly affects the brightness temperature and the backscatter coefficient. Table 5 shows that, when a single auxiliary input variable is added to the NN model, LST produces the greatest improvement in the temporal correlation, which reaches 0.441.

Compared with CLDAS data, the SM retrieved (see Table 5) from the combination of microwave data and auxiliary inputs NSDSI, NDVI, DEM, and SLOPE produces the highest spatial correlation of 0.684, whereas the temporal correlation is only 0.453. The SM retrieved from the combination of microwave data and auxiliary inputs LST, NDVI, DEM, and SLOPE produces the highest temporal correlation of 0.477, whereas the spatial correlation is only 0.663. The SM retrieved from the combination of microwave data and auxiliary inputs NSDSI, LST, NDVI, DEM, and SLOPE produces a spatial correlation of 0.669 and a temporal correlation of 0.475, which is the most balanced combination. Therefore, we use the microwave data and the auxiliary inputs NDVI, DEM, SLOPE, LST, and NSDSI as final input variables to obtain the daily average SM map of Qinghai Province, which is shown on the left side of Figure 6. The right side of Figure 6 shows the correlation map between SM data retrieved from the NN and CLDAS SM data.

**Figure 6.** (**a**) Map of daily average soil moisture retrieved by applying the NN to the combination of microwave data and auxiliary inputs NDVI, LST, DEM, SLOPE, and NSDSI. (**b**) Map of temporal correlation between the NN SM data (left side) and CLDAS soil moisture data.

Figure 6 shows that the overall SM in the entire study area increases from northwest to southeast. Also, the high positive correlation in the grassland areas and poor correlations in the Qaidam Basin (northwest corner) and the mixture of forest and grassland (southeast corner) show that the above input variables do not allow us to retrieve SM from bare lands and forest areas but do allow us to retrieve SM from grassland areas.

#### *3.3. Triple Collocation Method to Verify Soil Moisture as Determined by Neural Network*

To estimate how accurately the NN model determines the SM on a large scale, we apply TC to analyze the SM from the NN. TC estimates the distribution of spatial error for each dataset by locally solving the linear relationships between the three SM datasets. One of the assumptions is that the errors in all three datasets are independent, so the FY3C SM data, which were not used to train the NN model, are combined with the GEOS-5 ground-model SM data and the SM data retrieved by the NN model in the form [NN SM, FY3C SM, GEOS-5 SM] for TC. Furthermore, to ensure the accuracy of the TC results, areas where the correlation coefficient between the three datasets is less than 0.2 are masked and excluded from the final TC calculation. Finally, the error variance and correlation coefficient between the NN data and the actual SM data are estimated.
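The masking step can be sketched as a per-pixel screening of the three time series before TC is applied. The sketch assumes that a pixel is masked as soon as any one of the three pairwise Pearson correlations falls below 0.2; the text does not state whether the criterion applies to any pair or to all pairs, so this choice, the function names, and the sample series are assumptions.

```python
def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5


def tc_mask(nn, fy3c, geos5, threshold=0.2):
    """Return one flag per pixel: True means 'exclude from TC'.

    Each argument is a list of per-pixel time series.
    """
    mask = []
    for x, y, z in zip(nn, fy3c, geos5):
        pairwise = (pearson(x, y), pearson(x, z), pearson(y, z))
        mask.append(min(pairwise) < threshold)  # assumed: any weak pair masks
    return mask


# Two hypothetical pixels: the second has an anti-correlated FY3C series.
nn = [[0.10, 0.20, 0.30, 0.40], [0.10, 0.20, 0.30, 0.40]]
fy3c = [[0.12, 0.18, 0.33, 0.41], [0.40, 0.30, 0.20, 0.10]]
geos5 = [[0.20, 0.25, 0.30, 0.50], [0.15, 0.22, 0.28, 0.45]]
print(tc_mask(nn, fy3c, geos5))  # [False, True]
```

Screening in this way keeps the covariances in the denominators of Equation (2) away from zero, where the TC estimates become unstable.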

The spatial distribution shown in Figure 7 of the variance in TC error indicates that the variance in error between the SM retrieved from NN and FY3C and the actual SM is lowest in the Qaidam Basin in the northwest, whereas the variance in the error between the SM retrieved from GEOS-5 and the actual SM in the Qaidam Basin area is significantly greater. Combining these results with the maps of the spatial distribution of TC correlation coefficients shows that the spatial distribution of the correlation coefficient between the SM retrieved from NN and FY3C and the actual SM correlates to the error variance, meaning that the areas with greater error variance correlate more to the actual SM data.

**Figure 7.** (**a**) Top row from left to right, Map of error variance for NN SM, FY3C SM, and GEOS-5 SM compared with actual SM data. (**b**) Bottom row from left to right, Map of correlation coefficient for NN SM, FY3C SM, and GEOS-5 SM compared with actual SM data.

Figure 8 shows that the error variance between NN SM and the actual SM is much less than the error variance between (i) FY3C SM and GEOS-5 SM and (ii) the actual SM, with a median error variance of 0.00003 (NN) < 0.00017 (FY3C) < 0.00030 (GEOS-5). The correlation coefficient between (i) NN SM and FY3C SM and (ii) the actual SM is much greater than that for GEOS-5 SM, with a median correlation coefficient of 0.811 (NN) > 0.792 (FY3C) > 0.516 (GEOS-5). Among these three datasets, NN and FY3C have similar median correlation coefficients, but NN has Q1 = 0.681 and a lower-limit outlier of 0.338, which are much greater than for FY3C (Q1 = 0.594 and lower-limit outlier of 0.115). Therefore, after comprehensive analysis and comparison, the SM data retrieved by the NN model are of better quality for Qinghai Province.

**Figure 8.** (**a**) Box-whisker plots of error variance for NN SM, FY3C SM, and GEOS-5 SM compared with actual SM data. (**b**) Box-whisker plots of correlation coefficients for NN SM, FY3C SM, and GEOS-5 SM compared with actual SM data.

#### *3.4. Verification of Downscaled Soil Moisture from Neural Network*

In this study, the downscaled SM dataset for Qinghai Province and the map of the daily average SM (Figure 9) were obtained by inputting MODIS data with high spatial resolution and resampled microwave data into the NN model verified by the TC. To verify the adaptability of the downscaled SM data for Qinghai Province, we apply a correlation analysis in which we compare the downscaled 1 km SM data, original SMAP SM data, original AMSR2 SM data, and CLDAS SM data with the SM data collected from six ground stations in Qinghai Province. In terms of data selection, for each time series, we use all data available from the ground stations and from CLDAS, SMAP, AMSR2, and NN. In addition, each time series extends over at least 30 days to obtain good statistics. Furthermore, to determine whether the downscaled SM data capture the actual ground SM dynamics, we verify the variations over time of the downscaled SM by studying the time series of the seven ground precipitation stations (see Figure 10).

**Figure 9.** Map of downscaled daily average soil moisture.


*Remote Sens.* **2021**, *13*, x FOR PEER REVIEW 19 of 23

**Figure 10.** Daily precipitation, downscaled SM, and CLDAS SM over time for seven ground precipitation stations (Yeniugou, Xiaozaohuo, Dachaidan, Chaka, Wudaoliang, Xinghai, Qumarai).

Table 6 shows that the correlation of the downscaled SM results of the NN model at the Dulan, Tianjun, and Wulan sites exceeds 0.6, which is a larger average than CLDAS, SMAP, and AMSR2, thereby demonstrating that the NN model properly downscales the SM. Table 6 also reveals negative correlations with CLDAS SM at both the Golmud and Nuomuhong sites, whereas SMAP, AMSR2, and NN produce negative correlations at the Golmud site but positive correlations at the Nuomuhong site. This indicates that the temporal and spatial structures based on the NN model are driven by the satellite observations rather than by the target data. Figure 9 shows a map of the daily average SM in Qinghai Province after downscaling; these results provide much more SM information than do large-scale maps of SM.


**Table 6.** Correlation coefficients between downscaled NN soil moisture, CLDAS soil moisture, SMAP soil moisture, AMSR2 soil moisture, and soil moisture collected from ground stations.

| Dataset | Delingha | Dulan | Golmud | Nuomuhong | Tianjun | Wulan |
|---|---|---|---|---|---|---|
| DOWNSCALED | 0.212 | 0.768 | −0.524 | 0.251 | 0.620 | 0.616 |

Figure 10 shows that the downscaled SM strongly correlates with the precipitation data because the SM increases significantly after precipitation and decreases significantly during drought. Furthermore, the downscaled SM data from Xiaozaohuo, Chaka, Wudaoliang, and other sites depart significantly from the absolute value of the CLDAS data, whereas both maintain good temporal consistency. The results demonstrate that the downscaled SM better captures the variations in precipitation over time, which indicates that it better reflects the actual variations in SM over time.

#### **4. Conclusions**

This paper presents a method to retrieve soil moisture (SM) by combining multi-instrument observation data. The method is based on a neural network (NN) that retrieves SM information from the passive microwave sensors SMAP and AMSR2, the active microwave sensor ASCAT, MODIS data (LST, NSDSI, NDVI), and topographic data (DEM, SLOPE). The greatest advantage of this method is that it exploits the potential of the joint retrieval of SM by the different microwave sensors and also makes full use of the segmentation capability of high-spatial-resolution MODIS data and topographic data.

In the microwave band selection, the best retrieval was achieved by the NN that combines Tbv in the ascending orbit for the 1.41 GHz (SMAP) band, Tbh in the descending orbit for the 10.7 GHz (AMSR2) band, and the BTI data of ASCAT. The final NN SM dataset is obtained by combining the auxiliary data (LST, NDVI, NSDSI, DEM, and SLOPE) with these three sets of microwave data. Both models, without and with the auxiliary data, were compared with the CLDAS SM dataset. With the auxiliary data, the spatial correlation increases from 0.597 to 0.669, the temporal correlation increases from 0.401 to 0.475, the root mean square error decreases from 0.051 to 0.046, and the mean absolute error decreases from 0.041 to 0.036. All indicators improve, which confirms that the use of the auxiliary data improves the performance of the NN model.
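For readers who wish to reproduce this kind of comparison, the following is a minimal sketch of the three indicators quoted above (correlation coefficient, root mean square error, and mean absolute error), assuming their standard definitions. The function names and the sample values are hypothetical and are not taken from this study. A temporal correlation would apply `pearson_r` to the time series of one pixel, whereas a spatial correlation would apply it across the pixels of one day.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def rmse(x, y):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def mae(x, y):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


# Hypothetical SM values (m3/m3), for illustration only.
nn_sm = [0.12, 0.18, 0.25, 0.21, 0.15]
cldas_sm = [0.10, 0.20, 0.22, 0.24, 0.13]

print(f"R    = {pearson_r(nn_sm, cldas_sm):.3f}")
print(f"RMSE = {rmse(nn_sm, cldas_sm):.3f}")
print(f"MAE  = {mae(nn_sm, cldas_sm):.3f}")
```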

In the triple collocation analysis, the low-resolution SM products obtained from the NN retrieval are of higher quality than the SM products from the FY3C satellite and the ground model GEOS-5 in Qinghai Province (i.e., the NN low-resolution products have the highest median correlation of 0.811, the highest correlation Q1 value of 0.681, and the lowest error variance of 0.00003).
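The error variance and correlation reported by a triple collocation can be illustrated with the widely used covariance-based formulation, sketched below. This is an assumption made for illustration: the exact implementation used in this study is not restated here, and the function names and the synthetic series are hypothetical. The formulation further assumes that the errors of the three products are mutually uncorrelated and uncorrelated with the true SM.

```python
import math
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (n - 1)


def triple_collocation(x, y, z):
    """Covariance-based triple collocation for three collocated series.

    Returns (error_variances, correlations_with_truth), each a 3-tuple
    ordered as (x, y, z).
    """
    cxx, cyy, czz = cov(x, x), cov(y, y), cov(z, z)
    cxy, cxz, cyz = cov(x, y), cov(x, z), cov(y, z)
    # Signal variance seen by each product, from the cross-covariances.
    sig_x = cxy * cxz / cyz
    sig_y = cxy * cyz / cxz
    sig_z = cxz * cyz / cxy
    err_var = (cxx - sig_x, cyy - sig_y, czz - sig_z)
    # Squared correlation with the unknown truth; clip small negatives
    # that can arise from sampling noise before taking the square root.
    rho = tuple(
        math.sqrt(max(s / c, 0.0))
        for s, c in ((sig_x, cxx), (sig_y, cyy), (sig_z, czz))
    )
    return err_var, rho


# Synthetic example: one "true" SM signal observed by three noisy products.
rng = random.Random(0)
truth = [0.20 + 0.05 * math.sin(i / 10) for i in range(200)]
prod_a = [t + rng.gauss(0, 0.005) for t in truth]
prod_b = [t + rng.gauss(0, 0.010) for t in truth]
prod_c = [t + rng.gauss(0, 0.020) for t in truth]

err_var, rho = triple_collocation(prod_a, prod_b, prod_c)
for name, ev, r in zip("abc", err_var, rho):
    print(f"product {name}: error variance = {ev:.6f}, correlation = {r:.3f}")
```

In this sketch, the product with the smallest noise should receive the lowest estimated error variance and the highest correlation, which is the sense in which the NN product is ranked above the other two products.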

Based on the comparison with the ground station data, the NN SM dataset obtained at the small scale is also of better quality than the CLDAS product: its correlation with station SM exceeds 0.6 at three stations, namely Dulan (0.768), Tianjun (0.620), and Wulan (0.616), showing strong correlation. The correlation of the CLDAS SM product exceeds 0.6 only at Dulan (0.759) and Wulan (0.670). In addition, comparison with the rainfall site data shows that the downscaled NN SM data also better capture the dynamic changes of SM in the study area, producing higher SM values when there is more rainfall and lower SM values during the long dry season. Comparing the images before and after downscaling also shows that the downscaled SM provides more detailed SM information. We also discuss some shortcomings of the downscaling process. The downscaled SM is susceptible to interference from clouds and rain, leading to a significant quantity of missing data, so future work will focus on data completion.

The results of this study confirm that the NN method can be used to obtain SM with high spatial resolution and can be applied to the Qinghai Province area. The data used herein can be downloaded for free from the official websites of the National Aeronautics and Space Administration (NASA), the Japan Aerospace Exploration Agency (JAXA), the European Centre for Medium-Range Weather Forecasts (ECMWF), and the China Integrated Meteorological Information Service System (CIMISS) without regional restrictions and can be used to produce stable 1 km SM data in the Qinghai Province area.

**Author Contributions:** A.L. designed and conducted the study. Z.Z. performed the data analysis and wrote the manuscript. H.Z. read and edited the manuscript. All authors reviewed and approved the manuscript. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the National Natural Science Foundation of China (41671026), the Important Science & Technology Specific Projects of Qinghai Province (2019-SF-A4-1), and the Scientific Research and Promotion Projects of the Second Phase Project of Ecological Protection and Construction of the Three Rivers Source in Qinghai Province (2018-S-3).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data used in this study were downloaded from the National Aeronautics and Space Administration (NASA), the Japan Aerospace Exploration Agency (JAXA), the European Centre for Medium-Range Weather Forecasts (ECMWF), and the China Integrated Meteorological Information Service System (CIMISS).

**Acknowledgments:** We are very grateful to the teams at NASA, JAXA, ECMWF, and CIMISS who have made their datasets available and ready to use.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

