*2.5. Auxiliary Data*

According to existing research, SM is influenced not only by rainfall but also by elevation, land cover type, annual accumulated days (i.e., day of year), the Normalized Difference Vegetation Index (NDVI), and the latitude and longitude of satellite sampling points [31,39]. These factors are commonly used as auxiliary variables in downscaling methods [40–43].

Topography, as a major abiotic factor, strongly influences the variability of soil hydrothermal resources: differences in elevation directly affect the spatial redistribution of solar radiation and rainfall. We therefore incorporated elevation as the topographic variable in our downscaling model, using data from the Shuttle Radar Topography Mission (SRTM) [44].

The influence on SM varies with land cover type, since different cover types store and release moisture differently. The land cover data used in this paper are the International Geosphere-Biosphere Programme (IGBP) [45] land cover map derived from MODIS. NDVI is widely used to assess vegetation growth, drought conditions, and ecological environments. Because NDVI is highly sensitive to factors such as vegetation cover and SM content, it is also used for retrieving SM and vegetation cover [46]. The NDVI product is calculated from the daily 250 m surface reflectance product provided by MODIS (MOD09GQ). Precipitation plays a significant role in vegetation growth and has a strong impact on SM: it recharges SM on reaching the soil, and SM is positively correlated with precipitation. We therefore include precipitation as an input variable in the model; the daily average precipitation over the study area is obtained from the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) project. Table 1 summarizes the fundamental characteristics of the auxiliary variables used in this paper.
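As a concrete illustration, the sketch below derives NDVI from the MOD09GQ red and NIR bands and stacks the auxiliary variables into a per-pixel feature table. The function names and input arrays are hypothetical stand-ins (the paper does not prescribe this interface); the NDVI formula (NIR − Red)/(NIR + Red) and the MOD09GQ band roles (band 1 red, band 2 NIR) are standard.

```python
import numpy as np
import pandas as pd

def ndvi(red, nir):
    """NDVI = (NIR - Red) / (NIR + Red) from surface reflectance.

    For MOD09GQ, band 1 (sur_refl_b01) is red and band 2 (sur_refl_b02)
    is near-infrared, both at 250 m daily resolution.
    """
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        out = (nir - red) / (nir + red)
    out[(red <= 0) | (nir <= 0)] = np.nan  # mask invalid reflectance
    return out

def build_feature_table(lon, lat, doy, elev, igbp, precip, red, nir):
    """Stack the auxiliary variables into one row per pixel.

    All inputs are hypothetical 2-D arrays already resampled to a common
    grid: elevation from SRTM, land-cover class from the IGBP map, daily
    precipitation from CHIRPS, and MOD09GQ red/NIR for NDVI.
    """
    return pd.DataFrame({
        "lon": lon.ravel(),
        "lat": lat.ravel(),
        "doy": np.full(lon.size, doy),   # day of year of this scene
        "elevation": elev.ravel(),
        "land_cover": igbp.ravel(),      # categorical IGBP class code
        "precip": precip.ravel(),        # daily precipitation (mm)
        "ndvi": ndvi(red, nir).ravel(),
    }).dropna()
```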


**Table 1.** Overview of data utilized for the downscaling process in this paper.

#### **3. Soil Moisture Downscaling Framework**

### *3.1. Random Forest (RF)*

Ho [47] first proposed the concept of the Random Forest (RF) in 1998, and Breiman [48] systematically developed it in 2001. RF is an ensemble model built on decision trees and implemented through the bagging idea of ensemble learning, which addresses the overfitting common in single decision tree algorithms. Owing to its advantages in handling high-dimensional feature data and large datasets, RF is widely used in multivariate regression problems. Compared with an ordinary decision tree, RF changes how each tree is built: whereas a regular decision tree splits each node on the optimal feature among all features, RF first randomly selects a subset of features at each node and then picks the optimal splitting feature from that subset. The trees in an RF therefore have different structures. Adding more trees does not cause overfitting; instead, the generalization error converges to a limiting value. This approach not only reduces fitting errors but also avoids repetitive learning, which enhances the predictive performance of the final model and improves its generalization ability.
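As a minimal illustration of these ideas, the sketch below fits an RF regressor with scikit-learn on synthetic stand-in data; the hyperparameter values are placeholders, not the settings used in this paper. `max_features` controls the size of the random feature subset evaluated at each split, and the out-of-bag score gives an internal estimate of generalization performance.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
# Synthetic stand-in for the training set: rows are coarse pixels, columns
# are the auxiliary variables of Section 2.5 (elevation, land cover, NDVI,
# precipitation, lon, lat, day of year); y is coarse-resolution SM.
X = rng.random((1000, 7))
y = rng.random(1000)

rf = RandomForestRegressor(
    n_estimators=500,      # number of trees in the ensemble
    max_features="sqrt",   # random feature subset evaluated at each split
    oob_score=True,        # out-of-bag estimate of generalization error
    n_jobs=-1,
    random_state=0,
)
rf.fit(X, y)
print(f"OOB R^2: {rf.oob_score_:.3f}")
```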

The process of the random forest algorithm is as follows (a code sketch is given after the list):

1. Draw *n* bootstrap samples from the training dataset, each time taking *m* samples with replacement, to obtain data subsets $S_n = \{(x_1, y_1), (x_2, y_2), (x_3, y_3), \cdots, (x_m, y_m)\}$ of *m* samples each.
2. Use these sub-datasets to train *n* weak prediction models $f_n(x)$ separately.
3. When training a decision tree node, select a random subset of features from all features, and choose the optimal splitting feature from this subset.
4. Consolidate the outputs of the weak prediction models according to the problem at hand; for regression, the final output is the arithmetic average of all the weak prediction models.
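Steps (1)–(4) translate directly into a short bagging loop. The sketch below is a didactic reimplementation using scikit-learn decision trees, not the pipeline used in this paper; the per-split feature randomness of step (3) is delegated to the tree's `max_features` option.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_forest(X, y, n_trees=100, seed=0):
    """Steps (1)-(3): bootstrap each tree and randomize its splits."""
    rng = np.random.default_rng(seed)
    m = X.shape[0]
    trees = []
    for _ in range(n_trees):
        idx = rng.integers(0, m, size=m)   # step (1): m samples drawn with replacement
        tree = DecisionTreeRegressor(
            max_features="sqrt",           # step (3): random feature subset per split
            random_state=int(rng.integers(1 << 31)),
        )
        tree.fit(X[idx], y[idx])           # step (2): train one weak model
        trees.append(tree)
    return trees

def predict_forest(trees, X):
    """Step (4): arithmetic average of the weak models' outputs."""
    return np.mean([t.predict(X) for t in trees], axis=0)
```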
