Comparison of Machine Learning-Based Snow Depth Estimates and Development of a New Operational Retrieval Algorithm over China

Yang, Jianwei; Jiang, Lingmei; Pan, Jinmei; Shi, Jiancheng; Wu, Shengli; Wang, Jian; Pan, Fangbo

doi:10.3390/rs14122800


by Jianwei Yang ¹, Lingmei Jiang ¹,*, Jinmei Pan ², Jiancheng Shi ³, Shengli Wu ⁴, Jian Wang ¹ and Fangbo Pan ¹

¹ State Key Laboratory of Remote Sensing Science, Jointly Sponsored by Beijing Normal University and Aerospace Information Research Institute of Chinese Academy of Sciences, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China

² State Key Laboratory of Remote Sensing Science, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China

³ National Space Science Center, Chinese Academy of Sciences, Beijing 100190, China

⁴ National Satellite Meteorological Center, China Meteorological Administration, Beijing 100081, China

* Author to whom correspondence should be addressed.

Remote Sens. 2022, 14(12), 2800; https://doi.org/10.3390/rs14122800

Submission received: 21 April 2022 / Revised: 3 June 2022 / Accepted: 8 June 2022 / Published: 10 June 2022

(This article belongs to the Special Issue Microwave Remote Sensing for Quantitative Parameters Retrieval: Methods and Applications)


Abstract:

Snow depth estimation with passive microwave (PM) remote sensing is challenged by spatial variations in the Earth’s surface, e.g., snow metamorphism, land cover types, and topography. Thus, traditional static snow depth retrieval algorithms cannot capture snow thickness well. In this study, we present a new operational retrieval algorithm, hereafter referred to as the pixel-based method (0.25° × 0.25° grid-level), to provide more accurate and nearly real-time snow depth estimates. First, the reference snow depth was retrieved using a previously proposed model in which a microwave snow emission model was coupled with a machine learning (ML) approach. In this process, an effective grain size (effGS) value was optimized by utilizing the snow microwave emission model, and then the nonlinear relationship between snow depth and multiple predictive variables, e.g., effGS, longitude, elevation, and brightness temperature (Tb) gradients, was established with the ML technique to retrieve reference snow depth data. To select a robust and well-performing ML approach, we compared the performance of widely used support vector regression (SVR), artificial neural network (ANN) and random forest (RF) algorithms over China. The results show that the three ML models performed similarly in snow depth estimation, which was attributed to the inclusion of effGS in the training samples. In this study, the RF model was used to retrieve the snow depth reference dataset due to its slightly stronger robustness according to our comparison of results. Second, the pixel-based algorithm was built based on the retrieved reference snow depth dataset and satellite Tb observations (18.7 GHz and 36.5 GHz) from Advanced Microwave Scanning Radiometer 2 (AMSR2) during the 2012–2020 period. For the pixel-based algorithm, the fitting coefficients were achieved dynamically pixel by pixel, making it superior to the traditional static methods. 
Third, the built pixel-based algorithm was verified using ground-based observations and was compared to the AMSR2, GlobSnow-v3.0, and ERA5-land products during the 2012–2020 period. The pixel-based algorithm exhibited an overall unbiased root mean square error (unRMSE) and R² of 5.8 cm and 0.65, respectively, outperforming GlobSnow-v3.0, with unRMSE and R² values of 9.2 cm and 0.22, AMSR2, with unRMSE and R² values of 18.5 cm and 0.13, and ERA5-land, with unRMSE and R² values of 10.5 cm and 0.33, respectively. However, the pixel-based algorithm estimates were still challenged by the complex terrain, e.g., the unRMSE was up to 17.4 cm near the Tien Shan Mountains. The proposed pixel-based algorithm in this study is a simple and operational method that can retrieve accurate snow depths based solely on spaceborne PM data in comparatively flat areas.

Keywords:

passive microwave remote sensing; machine learning; snow depth; pixel-based algorithm

1. Introduction

As a key parameter indicating snow mass, snow water equivalent (SWE) plays a vital role in processes related to fresh drinking water for humans and animals, agricultural water uses and the prevention of natural hazards [1,2,3,4,5]. Thus, accurate knowledge of the spatiotemporal variations in SWE would improve hydrological forecasts and water resource management [6,7,8].

Satellite passive microwave (PM) remote sensing is an effective tool for monitoring global SWE due to its daily observation capability, good spatial coverage, and high sensitivity to the amount of snow [9,10]. The underlying physics of PM-based SWE retrieval is that the volume scattering of microwave radiation in snow varies with wavelength [11]. Typically, longer wavelengths (e.g., K-band) penetrate snow well and reflect the radiation characteristics of the underlying surface, whereas shorter wavelengths (e.g., Ka-band) are easily scattered by snow particles. Thus, the contrast between the microwave brightness temperature (Tb) at the K- and Ka-bands can yield SWE estimates.

PM SWE retrieval algorithms are structured into empirical and semiempirical formulas [9,11,12,13,14,15,16], physical-based methods [17,18,19,20,21,22,23], machine learning (ML) approaches [24,25,26,27,28,29,30], and assimilation models [31,32,33,34,35,36]. The most widely used methods for operational product generation are still empirical and semiempirical algorithms. This is because these algorithms are independent of complex auxiliary data, computationally inexpensive, and easy to operate. In addition, empirical algorithms can provide snow depth estimates in real time, which plays an important role in disaster monitoring and relief efforts. Physical-based methods have low computational efficiency. In addition, rich prior knowledge is required to run physical emission models of snowpack. Moreover, the accuracy of the snow emission model is questionable at the satellite pixel scale (~25 km). Thus, the physical-based algorithm is less suitable for operational purposes. The assimilation methods depend on auxiliary data, e.g., ground-based observations and weather forcing datasets, to correct the estimated SWE. Moreover, land surface models are generally used to provide input parameters for snow emission models (observation operators), while atmospheric reanalysis data are required to implement the land surface model. In this process, some potential problems, e.g., model structure errors and model forcing errors, probably lead to large uncertainties in global-scale SWE retrieval [4,35,37]. Thus, the methodology of the assimilation model is very complex and not suitable for operation.

ML-based algorithms are currently popular and widely used to retrieve land surface parameters. ML techniques are a powerful set of tools for fitting the multivariate nonlinear relationships between predictor variables and a given dependent variable. Generally, the relationship between snow depth and satellite-observed Tb is nonlinear due to snow metamorphism, forest canopy, atmosphere, saturation effect and wet snow [38,39,40,41,42]. The greatest challenge is how to consider those factors in a defined algorithm. Unfortunately, this kind of retrieval algorithm has not yet been built due to an unclear physical basis. Although ML’s “black box” weakness has received substantial criticism from the scientific community, it initiated the era of data-driven models and will benefit traditional quantitative models. Previous studies have demonstrated that ML techniques present promising performance in retrieving stationary land surface parameters, e.g., soil types and soil properties. However, they are also challenged by the nonstationary Earth system [43]; for example, temporally and spatially dynamic snow microstructures still limit ML applications in snow depth retrieval. Our previous study explored the potential of the random forest (RF) approach in snow depth retrieval and demonstrated that an RF algorithm trained well in a specific area is not applicable over different regions because snow characteristics change both spatially and over time, and there was no prior snowpack information (predictor variable) in the RF training samples [29]. To address this problem, we proposed a methodology that combines the snow microwave emission model with the RF approach [30]. In this methodology, an effective grain size (effGS) value, a prior snowpack descriptor, was optimized with the Helsinki University of Technology (HUT) model by minimizing the difference between the Advanced Microwave Scanning Radiometer 2 (AMSR2) observations (both 18.7 and 36.5 GHz) and HUT simulations. Then, we used the RF model to build the nonlinear relationship between snow depth and selected predictor variables, such as vertically polarized Tb differences between 18.7 and 36.5 GHz observations and 10.65 and 36.5 GHz observations, longitude and elevation, and the optimized effGS.

We previously reported the baseline retrieval algorithm and demonstrated its performance in snow depth estimation [30]. The proposed model was greatly superior to a single RF algorithm without effGS. Additionally, it significantly outperformed other snow depth products, e.g., AMSR2 and GlobSnow-v2.0. Our validation also showed that it can partially address some major issues presented in existing algorithms, e.g., underestimation for thick snow (>20 cm) and overestimation for thin snow cover (≤20 cm). However, the proposed model is seriously dependent on auxiliary data, e.g., snow and ground temperatures, snow density, and ground-based snow depth data. That is, this algorithm does not work if those auxiliary data are unavailable. In addition, the proposed model cannot monitor snow depth in near real time, which limits its applications in hydrological forecasting and water resource management. In our previous publication, we directly used the RF model in combination with the HUT model [30], but the performances of various other ML models in snow depth estimation were unknown.

Thus, the specific objectives of this follow-up study were to (1) compare the snow depth estimates from different ML methods, including support vector regression (SVR), artificial neural network (ANN), and RF, and determine which ML technique is more suitable for retrieving snow depth; (2) develop a simple operational snow retrieval algorithm (named the “pixel-based algorithm” in this study) based solely on space-borne PM data over China; and (3) compare the pixel-based algorithm with the GlobSnow-v3.0, AMSR2, and ERA5-land retrieval methods. The outline of this paper is as follows: Section 2 provides a description of the datasets and methodology of coupling the HUT model with the RF technique; Section 3 and Section 4 provide the results and the discussion, respectively; and conclusions are given in Section 5.

2. Data and Methodology

2.1. Ground-Based Measurements

The weather station data consisted of eight winters of snow observations (2012–2020) that were collected from the National Meteorological Information Centre, China Meteorology Administration (CMA, http://data.cma.cn/en (accessed on 15 May 2020)). The spatial distribution of the meteorological stations is shown in Figure 1. The attribute parameters of each station include the site name, geolocation, and elevation. The daily measured variables consist of ground surface temperature, air temperature, latitude, longitude, elevation, and snow depth. In this study, station observations of snow from eight winters (2012–2020, corresponding to the AMSR2 period) were used to train the ML model and verify its performance. To keep the training samples independent of the validation data, all the stations were randomly separated into two parts, A and B (see Figure 1). Here, part A samples from 341 stations were used to train the ML models, while part B measurements from 342 stations, as spatially independent reference data, were employed to evaluate the well-trained ML algorithms and the proposed pixel-based algorithm (see Section 3.2).
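The random separation of stations into parts A and B can be sketched as follows. This is only an illustration: the station identifiers, the `split_stations` helper, and the random seed are hypothetical, and only the counts (341 and 342 stations) come from the text.

```python
import random


def split_stations(station_ids, n_part_a, seed=0):
    """Randomly split station IDs into part A (training) and part B (validation)."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(station_ids)
    rng.shuffle(shuffled)
    part_a = sorted(shuffled[:n_part_a])
    part_b = sorted(shuffled[n_part_a:])
    return part_a, part_b


# Hypothetical station identifiers; 341 + 342 = 683 stations in total.
stations = [f"S{i:04d}" for i in range(683)]
part_a, part_b = split_stations(stations, n_part_a=341)
print(len(part_a), len(part_b))
```

Splitting by station rather than by individual sample is what makes part B spatially independent of the training data.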

Additionally, field campaign observations were collected to support the assessment of snow depth estimates. Four snow courses were designed to measure snow characteristics (snow depth, snow density, snow mass, and temperature) in Northeast China and northern Xinjiang [30]. Figure 1 shows the four snow survey routes, hereafter referred to as snow courses 1, 2, 5, and 6. Snow courses 1 and 2 were distributed around the Junggar Basin and the Tien Shan Mountains, respectively, in northern Xinjiang. Snow courses 5 and 6 were located in Northeast China. Snow course 5 stretched from Xilingol to the Greater Khingan Range, and snow course 6 included parts of the Changbai Mountains and Lesser Khingan Mountains. For snow courses 5 and 6, the land cover types were diverse, including grassland, farmland, barren land, and forest. The snowpit measurements were performed for three successive winters from 2018–2020, every 15 to 30 km along each snow course. All snowpit measurements within a satellite pixel were averaged as the ground truth value. In most cases, there was only one snowpit measurement in each satellite pixel. The part B station observations during the 2012–2020 period, together with the snow course observations from 2018 to 2020, were used to assess the pixel-based algorithm (see Section 2.4 and Section 3.4) proposed in this study and other snow depth products, e.g., assimilated GlobSnow-v3.0 data, remotely sensed AMSR2 estimates, and the reanalysis ERA5-land product.
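The averaging of snowpit measurements within a satellite pixel could look roughly like the sketch below. It assumes a regular 0.25° latitude/longitude grid indexed by flooring the coordinates, which is a simplification and not necessarily the gridding used for the AMSR2 L3 data; the function names and sample coordinates are hypothetical.

```python
import math
from collections import defaultdict


def pixel_index(lat, lon, res=0.25):
    """Row/column index of the grid cell containing (lat, lon); assumed regular grid."""
    return (math.floor(lat / res), math.floor(lon / res))


def average_by_pixel(snowpits, res=0.25):
    """snowpits: iterable of (lat, lon, depth_cm). Returns {pixel: mean depth in cm}."""
    groups = defaultdict(list)
    for lat, lon, depth in snowpits:
        groups[pixel_index(lat, lon, res)].append(depth)
    return {pix: sum(vals) / len(vals) for pix, vals in groups.items()}


# Made-up snowpits: the first two fall in the same cell, the third in another.
pits = [(45.10, 127.05, 20.0), (45.20, 127.20, 30.0), (45.60, 127.05, 12.0)]
pixel_means = average_by_pixel(pits)
```

A pixel with a single snowpit simply keeps that measurement as its reference value.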

2.2. Gridded Products

The AMSR2 instrument onboard the Global Change Observation Mission (GCOM) satellite of the Japan Aerospace Exploration Agency (JAXA) has collected data since 2012 at frequencies of 6.9 GHz, 7.3 GHz, 10.65 GHz, 18.7 GHz, 23.8 GHz, 36.5 GHz, and 89.0 GHz with both V and H polarizations [44]. AMSR2 EASE-Grid Tb data (L3) from December 2012 to March 2020 were downloaded from http://gportal.jaxa.jp/gpr/ (accessed on 25 May 2020). Here, the AMSR2 Tb data were selected to train the ML models because of the sensor's advanced capabilities, such as its fine footprint size, which can reduce the uncertainties caused by mixed pixels. Meanwhile, AMSR2 L3 Tb data were used to build the pixel-based algorithm and produce snow depth estimates during the 2012–2020 period.

To demonstrate the magnitude of improvement in snow depth estimation for the newly proposed algorithm, the retrieval results were compared with those of three globally published products (Table 1): the stand-alone satellite AMSR2 product, the assimilated GlobSnow-v3.0 retrieval results, and the reanalysis ERA5-land data. Note that the temporal resolution of the ERA5-land product is at the hour scale, and the data at two o’clock near the descending time of AMSR2 were selected. For the GlobSnow-v3.0 product, snow density was assumed to be a constant value of 240 kg/m³ [45]. Thus, 240 kg/m³ was applied to convert SWE to snow depth in this study. To maintain a uniform spatial resolution with AMSR2, the GlobSnow-v3.0 and ERA5-land products were also spatially resampled to 0.25° × 0.25° grid scale (Table 1).
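As a worked illustration of the SWE-to-depth conversion, the following sketch applies the constant density of 240 kg/m³ stated above. The function name is hypothetical; the arithmetic relies on the fact that 1 mm of water equivalent corresponds to 1 kg/m².

```python
def swe_to_depth_cm(swe_mm, density_kg_m3=240.0):
    """Convert SWE (mm water equivalent, i.e. kg/m^2) to snow depth in cm.

    depth [m] = SWE [kg/m^2] / density [kg/m^3]; multiply by 100 for cm.
    """
    return 100.0 * swe_mm / density_kg_m3


depth_cm = swe_to_depth_cm(48.0)
print(depth_cm)  # 20.0
```

With this density, each millimetre of SWE corresponds to about 0.42 cm of snow depth.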

2.3. ML Models

In this study, three widely used ML approaches (ensemble tree-based RF, kernel-based SVR, and neural network-based ANN) were selected to compare their performances in snow depth retrieval. Figure 2 illustrates the architectures of the three ML models. The RF algorithm is an ensemble ML algorithm proposed by Breiman in 2001 [46]. It combines several randomized decision trees and aggregates their predictions by averaging in regression. The default value of the number of trees in the ensemble (ntree) is 500. The number of random variables at each node (mtry) is typically set to 1/3 of the number of input variables for regression tasks (the square root is the usual choice for classification).

SVR was initially developed to solve classification problems and then extended to regression tasks by introducing the ε-insensitive loss function [47,48]. ε is the termination criterion of the loss function; the default is 0.001. The key configurations of SVR include the punishment factor (c), kernel function selection, and gamma (g) parameter of the kernel function. Parameter c denotes the tolerance to error; a high (low) c tends to lead to overfitting (underfitting). The kernel is a mapping function that converts input predictor variables from low dimensions to high dimensions (hyperspace) so that samples can be separated easily. In general, the radial basis function (RBF) kernel is a reasonable first choice according to previous studies [49]. For the RBF kernel, the g parameter determines the distribution of high-dimensional data and typically defaults to the reciprocal of the number of input variables. In this study, we used the grid-search method to determine the optimal c and g parameters in the range of [2⁻¹⁰, 2¹⁰].

The inspiration for designing the ANN learning process comes from the working mode of the human brain [50,51]. Currently, a multilayer perceptron (MLP) is one of the most popular ANN types and is widely used in various studies and applications [49,52]. The architecture of the ANN consists of input, hidden, and output layers (Figure 2). The function of the input layer is to fetch the input elements and pass them to the first hidden layer of neurons. The output layer denotes the prediction results obtained by the final hidden layer that are then returned as the output of the ANN. The task of the hidden layer is to build the nonlinear relationship between the inputs and the outputs through a Levenberg–Marquardt backpropagation learning rule. The hidden layers are fully interconnected through a popular logistic sigmoid (logsig) function [25]. In this study, a relatively simple network structure (two hidden layers with 10 neurons) was used according to a previous study [25,53,54].

2.4. Workflow

The workflow of this study is shown in Figure 3. The first step is to optimize the effective grain size using HUT simulation and AMSR2 observation. Our previous study demonstrated that RF performance was greatly improved when effGS was involved in the training of predictor variables [30]. The effGS variable, as the most important predictor variable, was optimized by utilizing the HUT model by minimizing the difference between AMSR2 observations and HUT simulations at 18.7 and 36.5 GHz. The optimization procedure of effGS (ranging from 0 to 4 mm, with a step size of 0.01 mm) is described as follows:

\min_{d_{0}} \left\{ \left[ Tb_{18.7V,\mathrm{HUT}}(d_{0}, SD, \rho, T_{snow}) - Tb_{36.5V,\mathrm{HUT}}(d_{0}, SD, \rho, T_{snow}) \right] - \left[ Tb_{18.7V,\mathrm{AMSR2}} - Tb_{36.5V,\mathrm{AMSR2}} \right] \right\}^{2}

where SD denotes the station-based snow depth (cm), Tb_18.7V and Tb_36.5V denote the vertically polarized observations (K) at 18.7 and 36.5 GHz from AMSR2, respectively, d₀ is effGS (mm), ρ is the snow density (kg/m³) from the ERA5-land product, and T_snow is the snow surface physical temperature (K) provided by daily weather station observations.
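The optimization above amounts to a one-dimensional grid search over d₀. A minimal sketch is given below, assuming the forward model is available as a callable that returns the simulated (Tb_18.7V − Tb_36.5V) difference. The `toy_forward` function is a made-up linear stand-in used only to make the example runnable; it is not the HUT model, and all names and input values here are hypothetical.

```python
def optimize_effgs(forward_tb_diff, observed_tb_diff, sd_cm, density, t_snow,
                   d_min=0.0, d_max=4.0, step=0.01):
    """Grid search for the grain size d0 (mm) minimizing the squared Tb-difference mismatch."""
    best_d0, best_cost = None, float("inf")
    n_steps = int(round((d_max - d_min) / step))
    for i in range(n_steps + 1):
        d0 = d_min + i * step
        simulated = forward_tb_diff(d0, sd_cm, density, t_snow)
        cost = (simulated - observed_tb_diff) ** 2
        if cost < best_cost:
            best_d0, best_cost = d0, cost
    return best_d0, best_cost


def toy_forward(d0, sd_cm, density, t_snow):
    """Placeholder for the emission model: NOT the HUT physics, illustration only."""
    return 0.5 * d0 * sd_cm


# Hypothetical inputs: observed Tb difference 15 K, snow depth 20 cm.
d0_opt, cost = optimize_effgs(toy_forward, observed_tb_diff=15.0,
                              sd_cm=20.0, density=240.0, t_snow=260.0)
```

With 0.01 mm steps between 0 and 4 mm, the search evaluates the forward model 401 times per sample, which is cheap for a simple model but adds up when applied to every station and day.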

The second step is to determine predictor variables (Figure 3). According to our previous study [30], the training predictor variables consist of (Tb_10.65V − Tb_36.5V), (Tb_18.7V − Tb_36.5V), elevation, longitude, and effGS (Table 2 and Figure 3). As shown in [30], latitude is significantly correlated with elevation (negatively) and longitude (positively). Moreover, longitude has a large variation in snow cover areas, spanning from 75°E to 134°E (Figure 1). Thus, longitude is more important to ML estimates than latitude. Although latitude was excluded from the predictor variables, elevation and longitude together act as a proxy for it.

The third step is to select an optimal ML model (Figure 3). To determine a more accurate reference snow depth dataset, three ML approaches, the RF, SVR, and ANN, were verified and compared (Figure 3). Then, the best ML model was selected to provide a reference snow depth dataset for the pixel-based retrieval algorithm, in which fitting coefficients vary spatially pixel by pixel.

To assess the three trained ML algorithms, the well-known k-fold cross-validation (k-CV) technique was performed. In this process, all samples were first randomly and equally divided into k subsamples. Then, (k − 1) randomly selected subsamples were used to train ML models, and the remaining subsample was used to evaluate the trained ML algorithms. This procedure was repeated k times to reduce evaluation uncertainty. The above sample-based k-CV method cannot guarantee that the training samples are both temporally and spatially independent from the validation data. Thus, two extended k-CV approaches, temporal- and spatial-based k-CV, were applied to assess the performances of well-trained ML algorithms [30]. For temporal- and spatial-based k-CV, all samples were randomly divided into k groups according to dates (time) and station location (space), respectively. In this study, k was set to 10, namely, 10-CV. In addition, the spatially independent data from part B stations (Figure 1) were used to further validate snow depth estimates retrieved by the ML model trained with observations from part A stations (Figure 1).
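The spatial- and temporal-based k-CV variants differ from sample-based k-CV only in what is shuffled into folds: stations or dates rather than individual samples. The sketch below illustrates the idea with a hypothetical `grouped_kfold` helper and made-up station labels; it is not the authors' implementation.

```python
import random


def grouped_kfold(groups, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs so that no group appears on both sides.

    groups[i] is the station ID (spatial k-CV) or date (temporal k-CV) of sample i.
    """
    unique = sorted(set(groups))
    rng = random.Random(seed)
    rng.shuffle(unique)
    folds = [set(unique[i::k]) for i in range(k)]   # k roughly equal sets of groups
    for held_out in folds:
        test_idx = [i for i, g in enumerate(groups) if g in held_out]
        train_idx = [i for i, g in enumerate(groups) if g not in held_out]
        yield train_idx, test_idx


# Made-up example: 200 samples from 50 stations, split by station.
station_of_sample = [f"S{i % 50:02d}" for i in range(200)]
splits = list(grouped_kfold(station_of_sample, k=10))
```

Passing observation dates instead of station IDs as `groups` gives the temporal variant with the same code.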

The fourth step is to build the pixel-based algorithm (Figure 3). The proposed algorithm follows the traditional empirical form, but the fitting coefficients vary from pixel to pixel. In this study, the 89 GHz channel was excluded because it is more subject to atmospheric contributions. Considering the merits of 10.65 GHz in monitoring comparatively thick snowpack, a Tb gradient (Tb_10.65V − Tb_18.7V) was used. However, satellite observations at 10.65 GHz were not available before 2002, so we cannot ensure the consistent and continuous monitoring of snowpack from 1978 to the present. Thus, two pixel-based algorithms were defined, PAG1 and PAG2:

PAG1: SD = slope1 × (Tb_10.65V − Tb_18.7V) + slope2 × (Tb_18.7V − Tb_36.5V) + intercept    (1)

PAG2: SD = slope1′ × (Tb_18.7V − Tb_36.5V) + intercept′    (2)

where SD denotes the estimated snow depth (cm). Tb_10.65V, Tb_18.7V, and Tb_36.5V denote the vertically polarized Tb observations at 10.65, 18.7, and 36.5 GHz. Vertical polarization was used in this study because the channels of V-pol are more sensitive to snow depth than the H-pol channels [29,55]. The slope and intercept are the fitting coefficients. In PAG1, a low frequency of 10.65 GHz was considered due to its strong penetrability to deep snow; that is, the Tb spectral difference at 10.65 GHz and 18.7 GHz can partially reflect thick snowpack. The PAG2 algorithm makes long-term snow depth production possible based solely on observations at two frequencies (18.7 and 36.5 GHz). The PAG1 and PAG2 algorithms were compared in snow depth retrieval using part B station observations, and then a suitable model was determined for monitoring snowpack.
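To make the pixel-by-pixel fitting concrete, the sketch below fits the PAG2 form independently in each pixel. It assumes an ordinary least-squares fit, which is one plausible reading of "linear fitting"; the data structure, the function names, and the sample values are hypothetical.

```python
def fit_pag2(tb_gradient, ref_depth):
    """Least-squares fit of SD = slope * (Tb18.7V - Tb36.5V) + intercept for one pixel."""
    n = len(tb_gradient)
    mean_x = sum(tb_gradient) / n
    mean_y = sum(ref_depth) / n
    sxx = sum((x - mean_x) ** 2 for x in tb_gradient)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(tb_gradient, ref_depth))
    slope = sxy / sxx                  # undefined if the Tb gradient never varies
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_all_pixels(samples_by_pixel):
    """samples_by_pixel: {pixel: (tb_gradients_K, reference_depths_cm)}."""
    return {pix: fit_pag2(x, y) for pix, (x, y) in samples_by_pixel.items()}


def estimate_depth(coeffs, pixel, tb_gradient):
    """Apply the fitted per-pixel coefficients to a new Tb gradient."""
    slope, intercept = coeffs[pixel]
    return slope * tb_gradient + intercept


# Made-up time series for two pixels (Tb gradient in K, reference depth in cm).
samples = {
    (180, 508): ([5.0, 10.0, 15.0, 20.0], [6.0, 10.0, 14.0, 18.0]),
    (182, 508): ([4.0, 8.0, 12.0], [3.0, 5.0, 7.0]),
}
coeffs = fit_all_pixels(samples)
```

Once the coefficients are stored, a retrieval needs only the two Tb channels, which is what makes this form suitable for near-real-time use. PAG1 would follow the same pattern with a two-predictor regression.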

The last step is to validate the pixel-based algorithm estimates and compare them with those of the other three products (Figure 3). To demonstrate the advantages of the proposed pixel-based retrieval algorithm, three global snow depth datasets, the remotely sensed AMSR2 product, the assimilated GlobSnow-v3.0 product, and the reanalysis ERA5-land product, were compared to the newly proposed algorithm in three stable snow cover areas across China using part B station observations and field campaign measurements (Figure 3).

3. Results

3.1. Sensitivity of ML Models to Training Sample Size

One objective of this study was to compare the snow depth estimates based on three ML methods and determine which ML technique is more accurate in snow depth retrieval. To confirm the appropriate number of training samples, it is necessary to determine the sensitivity of the three ML models to the training sample size. Thus, the three models were trained on identical sample sets of increasing size (from 5000 to 50,000 samples) drawn from the 2015–2020 period. Then, the trained ML models were verified using station-observed snow depth data from 2012 to 2014. To demonstrate whether the sensitivity is affected by effGS, the trained ML models were compared based on predictor variables with and without effGS. Figure 4 shows the ML models’ performances with increasing training sample size. The results show that any ML model considering the effGS variable performs better than that without effGS. Additionally, inclusion of effGS in the ML model leads to a stable performance, that is, a low sensitivity to sample size. If effGS values are excluded from the training samples, the correlation coefficients increase with increasing sample size, and the unRMSE values decrease, especially for the ANN technique.

3.2. ML Model Performances

Based on the analysis in Section 3.1, the number of training samples was set to 30,000, which is large enough to ensure that the ML model performs stably and to make the cross-comparison much more objective. Figure 5 shows the validation of the SVR, ANN and RF algorithms with sample-, temporal-, and spatial-based 10-CV approaches. This demonstrates that these three ML algorithms perform similarly in snow depth retrieval, which is attributed to the inclusion of effGS in training samples (Figure 3). The effGS variable records the spatial–temporal variation in snow metamorphism and makes the ML model robust and stable.

The ML-based estimates were further verified with spatially independent station measurements from part B stations, whereas the ML retrieval models were trained with measurements from part A stations (Figure 1). Figure 6 shows the validation and comparison results. Here, ANN1, SVR1, and RF1 represent the ML models trained without effGS, while ANN2, SVR2, and RF2 include effGS. This result demonstrates that the ANN2-, SVR2-, and RF2-based estimates were in good agreement with the station observations, with a high R² of 0.86 and a low unRMSE of approximately 3.4 cm. However, ANN1, SVR1, and RF1 present poor performances, with an unRMSE of up to 7.5 cm. Thus, the inclusion of effGS in predictor variables enhanced the predictive power of the ML models (Figure 4, Figure 5 and Figure 6). In this study, the RF model was selected to retrieve the snow depth reference dataset for building a pixel-based retrieval algorithm due to its high computational efficiency relative to that of the SVR and ANN methods.

3.3. Development of the Pixel-Based Algorithm

Based on the RF estimates (see Section 3.2) during the 2012–2020 period, the slope and intercept coefficients were determined for each pixel based on the linear fitting between Tb gradients and ML snow depth estimates (Figure 7). Here, the snow cover detection method proposed by Li et al. [56] was used to identify dry snow pixels. Both the slope and intercept present high spatial heterogeneity (Figure 7). In Northeast China and northern Xinjiang, the slope varies smoothly, but the intercept changes sharply. The slope denotes the linear relationship between the Tb gradient and snow depth, while the intercept denotes the baseline snow depth when the Tb gradient is zero. According to scattering theory, the snow depth should be 0 cm when the Tb gradient is equal to zero. In practice, however, microwave signals saturate under certain snow conditions, for example, deep and mature (strongly metamorphosed) snow, so the intercept is not zero. As illustrated in Figure 7, the intercept coefficient is high in deep snow cover areas, such as the Tien Shan and Altai Mountains, Changbai Mountains, and Greater and Lesser Khingan Mountains. The slope coefficient is very low and even negative in unstable snow cover areas, such as the Taklamakan Desert and the southern part of the Qinghai–Tibet Plateau (QTP).

To demonstrate the role of the 10.65 GHz channel in retrieving snow depth, two kinds of pixel-based algorithms (PAG1 and PAG2) are compared in Figure 8. Here, the part B station observations from 2012 to 2020 were treated as true data. Figure 8 shows that the PAG1 and PAG2 algorithms perform similarly in January, February, November, and December but present some differences in March, with correlation coefficients of 0.74 and 0.69, respectively. Figure 8 also shows that the PAG1 and PAG2 algorithms have similar performances in thin snow cover areas (less than 20 cm). However, PAG1 outperformed PAG2 under deep snow cover conditions (greater than 20 cm) because of the inclusion of the 10.65 GHz channel.

Generally, the 10.65 GHz channel plays a significant role in snow depth estimation globally, but its contribution is not apparent in China. Figure 9 shows the relationship between AMSR2-observed Tb gradients and field-measured snow depth data along four snow courses (see Figure 1). Both (Tb_10.65V − Tb_18.7V) and (Tb_18.7V − Tb_36.5V) increase with increasing snow depth. However, (Tb_18.7V − Tb_36.5V) is more sensitive to snow depth than (Tb_10.65V − Tb_18.7V); that is, (Tb_10.65V − Tb_18.7V) has only a small dynamic range for snow depths from 0 to 70 cm. Snow cover is generally shallow over China’s flat areas; thus, both the 10.65 GHz and 18.7 GHz channels can penetrate the snow and sense the microwave emission of the soil beneath the snowpack. Although the snow cover over mountains is deep, the effects of the topography and mixed pixels disturb the relationship between the Tb gradient and snow depth. For example, snow course 2 is located in the Tien Shan Mountains. As illustrated in Figure 9, the relationship between the Tb gradient and snow depth is poor. Although the point-based measured snow depth is as high as 50–70 cm, the averaged snow depth in a pixel (0.25° × 0.25°) is unknown; that is, the so-called true data are poorly representative of the pixel.

3.4. Evaluation of the Pixel-Based Algorithm and Comparison with Other Satellite Products

The pixel-based snow depth algorithm was verified using observations from part B stations during the 2012–2020 period and snow course measurements from 2017 to 2020. Additionally, it was compared to the global remotely sensed AMSR2, assimilated GlobSnow-v3.0 and reanalysis ERA5-land products. Figure 10 illustrates the overall performances of the four algorithms’ estimates over China. The proposed pixel-based algorithm outperforms the other three methods, with a higher R² of 0.65 and lower unRMSE and MAE values of 5.82 cm and 4.21 cm, respectively. GlobSnow-v3.0 and ERA5-land have similar overall performances. The AMSR2 snow depth estimates present a serious overestimation over China, with the highest bias of 18.52 cm among these methods.
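The evaluation statistics quoted here can be computed as in the sketch below. The definitions are assumptions, since the text does not spell them out: bias is taken as the mean of (estimate − observation), unRMSE as sqrt(RMSE² − bias²), and R² as the squared Pearson correlation coefficient. The function name and sample values are hypothetical.

```python
import math


def evaluation_metrics(estimated, observed):
    """Bias, MAE, RMSE, unbiased RMSE and R^2 (squared Pearson r) for paired samples."""
    n = len(observed)
    errors = [e - o for e, o in zip(estimated, observed)]
    bias = sum(errors) / n
    mae = sum(abs(d) for d in errors) / n
    rmse = math.sqrt(sum(d * d for d in errors) / n)
    # Assumed definition: remove the squared bias from the squared RMSE.
    unrmse = math.sqrt(max(rmse ** 2 - bias ** 2, 0.0))
    mean_e = sum(estimated) / n
    mean_o = sum(observed) / n
    cov = sum((e - mean_e) * (o - mean_o) for e, o in zip(estimated, observed))
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    r2 = cov ** 2 / (var_e * var_o)
    return {"bias": bias, "mae": mae, "rmse": rmse, "unrmse": unrmse, "r2": r2}


# Made-up example: estimates that are uniformly 2 cm too high.
metrics = evaluation_metrics([12.0, 22.0, 32.0], [10.0, 20.0, 30.0])
```

Under these definitions a product can have a large bias and still a small unRMSE, which is why both statistics are reported side by side.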

Figure 11 shows the comparison and validation results of the pixel-based algorithm, AMSR2, GlobSnow-v3.0, and ERA5-land products in three stable snow cover areas (Northeast China, northern Xinjiang and the QTP) across China. The pixel-based algorithm outperforms the other three products in any snow cover area. The AMSR2 estimates tend to be higher than the station measurements in northern Xinjiang, with a bias of 13.83 cm. Additionally, the AMSR2 product presents a large uncertainty, with a high unRMSE of 22.47 cm. The ERA5-land product also presents a high error (unRMSE: 14.48 cm) in northern Xinjiang, but the averaged estimates tend to be close to the station measurements. Although the unRMSE of GlobSnow-v3.0 is 11.78 cm, the relationship between the estimates and station measurements is poor. In Northeast China, the AMSR2 and ERA5-land products present an overestimation for snow depths less than 50 cm, especially for AMSR2, with the highest bias of 21.14 cm. The pixel-based algorithm is superior to AMSR2 and ERA5-land products on the QTP. However, it tends to underestimate snow depth for conditions greater than 20 cm.

The validation and comparison of the pixel-based algorithm and the three products along four snow courses are illustrated in Figure 12, Figure 13, Figure 14 and Figure 15. For snow courses 1, 5 and 6, the pixel-based algorithm’s estimates are in good agreement with the ground measurements (Figure 12, Figure 13 and Figure 14). For snow courses 5 and 6, ERA5-land performs similarly to the pixel-based algorithm (Figure 13 and Figure 14). Figure 12, Figure 13 and Figure 14 also demonstrate that ERA5-land estimates are more accurate in Northeast China than in northern Xinjiang. AMSR2 estimates are significantly related to the ground-measured snow depth but show an overestimation, especially for deep snow conditions. The GlobSnow-v3.0 product also shows a large uncertainty in northern Xinjiang and Northeast China, although previous studies report that it performs very well when applied globally [4,55].

The proposed pixel-based algorithm outperforms the other three products in flat and stable snow cover areas (Figure 12, Figure 13 and Figure 14). However, deriving snow depth in mountains remains a challenge for space-borne PM remote sensing. Figure 15 shows the validation and comparison of snow depth estimates retrieved from the pixel-based algorithm and the GlobSnow-v3.0, AMSR2, and ERA5-land products in the Tien Shan Mountains. Note that the sample size of GlobSnow-v3.0 is smaller than that of the other three products because of its mountain mask. Here, the roughness of the topography was calculated as the natural logarithm of the standard deviation of the elevation within a 0.25° × 0.25° pixel, that is, roughness = log_e(Stdev), with darker colors indicating more rugged terrain. Figure 15 shows that all four methods perform poorly. The pixel-based algorithm tends to underestimate snow depth for deep snow conditions (>40 cm) where the roughness of the terrain is high. The AMSR2 and ERA5-land products can reflect deep snow, but with large unRMSE values of 18.94 cm and 18.10 cm, respectively. GlobSnow-v3.0 estimates perform best among the four products, with an unRMSE value of 12.23 cm. However, some samples with unusually deep snow conditions are filtered out by its mountain mask.
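The roughness index defined above, roughness = log_e(Stdev), can be sketched in a few lines. The function name `terrain_roughness` is hypothetical; the choice of the population standard deviation and the handling of perfectly flat cells (returning `None`) are assumptions, since the text does not specify either.

```python
import math
import statistics

def terrain_roughness(elevations_m):
    """Natural log of the elevation standard deviation inside one grid cell.

    elevations_m: elevations (in meters) of the fine-resolution DEM cells
    that fall inside a single 0.25 x 0.25 degree pixel.
    """
    # Assumption: population standard deviation over all DEM cells in the pixel.
    stdev = statistics.pstdev(elevations_m)
    if stdev <= 0:
        return None  # perfectly flat pixel: the logarithm is undefined
    return math.log(stdev)

# Example: a gently rolling pixel versus a mountainous one (made-up elevations).
print(terrain_roughness([500.0, 502.0, 498.0, 500.0]))
print(terrain_roughness([1500.0, 2500.0, 3500.0, 2000.0]))
```

Because of the logarithm, a tenfold increase in the elevation spread raises the index by only about 2.3, which compresses the wide range between plains and high mountains into a narrow scale.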

Figure 16 shows the spatial patterns of snow depth from the four retrieval algorithms over China. They all show that snow cover is thick in northern Xinjiang and Northeast China. The AMSR2 estimates are the highest among the four methods, exceeding 50 cm in places. The spatial pattern of the ERA5-land estimates is consistent with the topography, that is, thick snow in mountains and comparatively shallow snow in flat areas.

4. Discussion

ML techniques perform very well in snow depth estimation when effGS, which represents snow metamorphism, is included in the training sample (Figure 4, Figure 5 and Figure 6). However, ML-based snow depth estimates are still disturbed by complex terrain. For example, our previous study demonstrated that, with increasing roughness, the unRMSE increases significantly (from 4.7 cm to 19.6 cm) and the bias shifts from 2.6 cm to −9.8 cm, i.e., towards increasingly serious underestimation [30]. Thus, the uncertainties in the ML-based estimates inevitably propagate to the pixel-based algorithm proposed in this paper.

According to the validation in Section 3, the pixel-based algorithm performed better in northern Xinjiang and Northeast China than on the QTP (Figure 11, Figure 12, Figure 13 and Figure 14). However, in complex terrain, e.g., the Tien Shan Mountains, the pixel-based algorithm was ineffective, especially for snow depths greater than 40 cm (Figure 15). Snow depth estimation in complex mountains faces two main problems. One is that ground-based measurements are generally sparse in remote areas and high mountains (Figure 1). Moreover, the snow cover, and the snow depth in particular, is strongly heterogeneous because of the complex topography. Thus, point measurements are unrepresentative at a coarse pixel resolution (typically on the order of tens of kilometers). The other problem is that microwave signals are affected not only by the snow mass but also by heterogeneous forest cover, terrain variability, and incomplete snow cover [57]. Synthetic aperture radar (SAR) remote sensing is a promising option for mapping snow depth in mountains. For example, recent work by Lievens et al. [58,59] attempted to retrieve snow depth in complex mountains using Sentinel-1 (SE1) SAR C-band observations. The results demonstrated that this approach provides much more accurate snow depth estimates in Europe.

Figure 17 shows the spatial distribution of snow depth from the SE1 product and the pixel-based estimates. Figure 17a shows the initial SE1 product at a 0.01° × 0.01° scale. Figure 17b shows the linearly resampled SE1 product at a 0.25° × 0.25° scale. The snow cover fraction (SCF) was determined from the number of initial SE1 pixels in a 0.25° × 0.25° grid cell. Figure 17c shows the pixel-based estimates at a 0.25° × 0.25° scale. We also compared the SE1 estimates with station measurements in three complex mountain regions (Altai Mountains, Tien Shan Mountains, and QTP) in China (Figure 17d–f). Figure 17d–f shows that the SE1 estimates are much higher than the station observations, even in areas where the SCF approaches 100%, indicating that snow thickness is strongly heterogeneous within a 0.25° × 0.25° pixel because of the complex topography.
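The aggregation from the 0.01° SE1 grid to the 0.25° grid can be illustrated roughly as follows. This sketch is only one plausible reading of the text: it uses simple block averaging as a stand-in for the linear resampling mentioned above, and it takes the coverage of a coarse cell to be the fraction of fine pixels that carry a valid retrieval. The function name `aggregate_block` and the use of `None` for missing retrievals are assumptions.

```python
def aggregate_block(fine_grid, factor=25):
    """Block-average a fine grid (None = no retrieval) to a coarse grid.

    factor=25 corresponds to 0.25 degree / 0.01 degree.
    Returns (mean_depth, coverage): two coarse grids as nested lists, where
    coverage is the fraction of fine cells with a valid value in each block.
    """
    rows, cols = len(fine_grid), len(fine_grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid size must be a multiple of factor")
    mean_depth, coverage = [], []
    for r0 in range(0, rows, factor):
        mean_row, cov_row = [], []
        for c0 in range(0, cols, factor):
            # Collect the valid fine-resolution values inside this coarse cell.
            values = [fine_grid[r][c]
                      for r in range(r0, r0 + factor)
                      for c in range(c0, c0 + factor)
                      if fine_grid[r][c] is not None]
            cov_row.append(len(values) / (factor * factor))
            mean_row.append(sum(values) / len(values) if values else None)
        mean_depth.append(mean_row)
        coverage.append(cov_row)
    return mean_depth, coverage
```

Keeping the coverage alongside the mean makes it possible to restrict a comparison with station data to coarse cells that are almost fully covered by valid fine-resolution retrievals.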

Thus, it is difficult to determine which dataset (station vs. SE1) represents the true snow depth in spatial pixels. For the station observations, the representativeness within a coarse pixel is unknown. For the SE1 estimates, the C-band is not well suited to comparatively shallow snow-covered areas because it penetrates snow more strongly than the Ku-band (Yueh et al., 2009; King et al., 2015; Tsang et al., 2021). Snow grain sizes are typically 1–3 mm, roughly 18–55 times smaller than the C-band wavelength of 5.5 cm, which leads to only slight volume scattering by snow grains in the C-band. At a minimum, it is necessary to verify whether SE1 estimates can be treated as true data in mountains.
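The size comparison between snow grains and the C-band wavelength is simple arithmetic, shown here as a short check. The function name is hypothetical; the inputs are the 5.5 cm wavelength and the 1–3 mm grain sizes quoted in the text.

```python
def wavelength_to_grain_ratio(wavelength_cm, grain_mm):
    """Number of grain diameters that fit into one wavelength."""
    return (wavelength_cm * 10.0) / grain_mm  # convert cm to mm first

for grain_mm in (1.0, 2.0, 3.0):
    ratio = wavelength_to_grain_ratio(5.5, grain_mm)
    print(f"grain {grain_mm:.0f} mm -> wavelength/grain = {ratio:.1f}")
# grain 1 mm -> wavelength/grain = 55.0
# grain 2 mm -> wavelength/grain = 27.5
# grain 3 mm -> wavelength/grain = 18.3
```

For 1–3 mm grains the ratio therefore spans roughly 18 to 55, so the grains remain much smaller than the wavelength across the whole range.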

In the future, a combination of active sensors (C-band, X-band, and Ku-band) can be exploited to improve snow depth estimates in mountain areas [8], which can further support a quantitative assessment of the uncertainty in snow depth retrieval with PM observations. Additionally, automatic measurement networks, e.g., global navigation satellite system receivers (GNSS-R), satellite altimetry, and light detection and ranging (LiDAR), can support studies of snow cover in remote areas and high mountains [60,61,62].

Figure 8 shows that the uncertainty of the proposed pixel-based retrieval algorithm increases from January to March, e.g., the unRMSE increases from 5.1 cm to 7.9 cm, and the correlation coefficient (corr.coe) decreases from 0.84 to 0.69. One reason is snow accumulation, with the peak snow depth typically being reached in the middle of March. Another reason is snow metamorphism (grain size growth), whose effect on volume scattering is usually stronger than that of snow depth [35]. In this study, the fitting coefficients of the pixel-based retrieval algorithm vary in space but are fixed in time, which neglects the scattering contribution caused by snow grain evolution from the beginning to the end of the snow season. To reduce the errors caused by snow metamorphism, our ongoing work will attempt to build temporally (monthly) and spatially (pixel by pixel) dynamic retrieval algorithms based on ML estimates. However, this kind of method still cannot solve the problem thoroughly and theoretically because snow grains grow even at sub-hourly scales. The optimum method is to develop an algorithm in which a specific index can decouple the scattering effects of snow grains and snow mass. In addition, the pixel-based method proposed in this paper is only suitable for retrieving snow depth in dry snow conditions. For a dry snowpack, snow scattering typically dominates the signal at some frequencies, e.g., the K- and Ka-bands. Once snow contains liquid water, the penetration depth of electromagnetic waves decreases, and the snowpack absorbs the radiation emitted by the underlying soil. Thus, wet snow also increases the uncertainty of the pixel-based algorithm.
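The idea of coefficients that vary in space but are fixed in time can be sketched as a separate least-squares fit for each pixel. The sketch below is an illustration under stated assumptions, not the authors' implementation: it assumes a linear form, snow depth = slope × (Tb_18.7V − Tb_36.5V) + intercept, and the function names, the pixel identifiers and the clipping of negative depths to zero are all hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired samples")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variance")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_pixel_coefficients(samples):
    """Fit one (slope, intercept) pair per pixel from its whole time series.

    samples: {pixel_id: [(tb18v, tb36v, reference_depth_cm), ...]}
    """
    coeffs = {}
    for pixel_id, records in samples.items():
        # Assumed predictor: the 18.7 GHz minus 36.5 GHz Tb gradient.
        gradient = [tb18 - tb36 for tb18, tb36, _ in records]
        depth = [sd for _, _, sd in records]
        coeffs[pixel_id] = fit_line(gradient, depth)
    return coeffs

def retrieve_depth(coeffs, pixel_id, tb18v, tb36v):
    """Apply the pixel's own coefficients; negative depths are clipped to 0."""
    slope, intercept = coeffs[pixel_id]
    return max(slope * (tb18v - tb36v) + intercept, 0.0)
```

In this sketch, keying the dictionary on a (pixel, month) pair instead of the pixel alone would give a monthly variant of the kind described as ongoing work, at the cost of fewer samples per fit.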

5. Conclusions

Our previous study proposed a methodology that coupled HUT-optimized effGS parameters with the RF ML approach and demonstrated that it significantly improved snow depth estimates over China due to the inclusion of effGS in the predictor variables [30]. However, this method is complex and time-consuming because input data are needed to drive the snow emission model and optimize effGS before the ML model can be trained.

This study further presents a new operational snow depth retrieval algorithm built on ML-based estimates. The newly proposed snow depth estimation algorithm follows the traditional empirical form, but the fitting coefficients vary from one satellite pixel to another. Thus, the new method is called the pixel-based retrieval algorithm in this study. To provide more accurate ML-based estimates for building the pixel-based algorithm, we compared three widely used ML approaches (SVR, ANN and RF) over China. We also tested the sensitivity of these ML models to the training sample size. The results indicated that (1) when the predictor variables did not include effGS, the three ML approaches performed significantly differently and were sensitive to the training sample size, especially the ANN technique, and (2) when the predictor variables included effGS, the three ML models performed consistently in snow depth estimation and showed little sensitivity to the training sample size.

The RF model was used to retrieve the snow depth reference dataset (2012–2020) in this study due to its slightly stronger robustness. The pixel-based algorithm was built from the retrieved reference snow depth dataset and satellite Tb observations and then verified using ground-based observations from weather stations and field campaigns. Additionally, the pixel-based algorithm estimates were compared to the global AMSR2, GlobSnow-v3.0 and ERA5-land products. The results indicated that the pixel-based algorithm exhibited overall unRMSE and R² values of 5.8 cm and 0.65, respectively, outperforming GlobSnow-v3.0 (unRMSE of 9.2 cm and R² of 0.22), AMSR2 (unRMSE of 18.5 cm and R² of 0.13), and ERA5-land (unRMSE of 10.5 cm and R² of 0.33). Moreover, the pixel-based algorithm’s estimates were closest to the ground measurements along snow courses 1, 5, and 6, with unRMSE values of 4.8 cm, 6.6 cm, and 5.2 cm, respectively.

The pixel-based algorithm proposed in this paper is a simple and operational method that can retrieve accurate snow depth based solely on space-borne PM observations in comparatively flat areas. However, it shows a high uncertainty in complex terrain, e.g., the unRMSE was up to 17.4 cm around the Tien Shan Mountains. Additionally, it seriously underestimates deep snow (>40 cm). We are attempting to improve snow depth estimates in mountainous areas by combining SAR C-, X-, and Ku-band observations and will present this endeavor in future work.

Author Contributions

Conceptualization, J.Y. and L.J.; methodology, J.Y.; software, J.Y.; validation, J.Y., J.P. and S.W.; formal analysis, F.P.; investigation, J.Y., J.W. and F.P.; resources, L.J., J.S. and S.W.; data curation, J.P.; writing—original draft preparation, J.Y.; writing—review and editing, L.J. and J.S.; visualization, J.Y.; supervision, L.J.; project administration, L.J. and J.Y.; funding acquisition, L.J. and J.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was jointly funded by the National Natural Science Foundation of China (42171317 and 42090014), the Fundamental Research Funds for the Central Universities (2021NTST02), the National Key Research and Development Program of China (No. 2021YFB3900104), and the Second Tibetan Plateau Scientific Expedition and Research Program (2019QZKK0206).

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Barnett, T.P.; Adam, J.C.; Lettenmaier, D.P. Potential impacts of a warming climate on water availability in snow-dominated regions. Nature 2005, 438, 303–309.
2. Qin, Y.; Abatzoglou, J.T.; Siebert, S.; Huning, L.S.; AghaKouchak, A.; Mankin, J.S.; Hong, C.; Tong, D.; Davis, S.J.; Mueller, N.D. Agricultural risks from changing snowmelt. Nat. Clim. Chang. 2020, 10, 459–465.
3. Sturm, M.; Goldstein, M.; Parr, C. Water and life from snow: A trillion dollar science question. Water Resour. Res. 2017, 53, 3534–3544.
4. Pulliainen, J.; Luojus, K.; Derksen, C.; Mudryk, L.; Lemmetyinen, J.; Salminen, M.; Ikonen, J.; Takala, M.; Cohen, J.; Smolander, T.; et al. Patterns and trends of Northern Hemisphere snow mass from 1980 to 2018. Nature 2020, 581, 294–298.
5. Kraaijenbrink, P.; Stigter, E.; Yao, T.; Immerzeel, W.W. Climate change decisive for Asia’s snow meltwater supply. Nat. Clim. Chang. 2021, 11, 591–597.
6. Derksen, C.; Toose, P.; Rees, A.; Wang, L.; English, M.; Walker, A.; Sturm, M. Development of a tundra-specific snow water equivalent retrieval algorithm for satellite passive microwave data. Remote Sens. Environ. 2010, 114, 1699–1709.
7. Qu, X.; Hall, A. On the persistent spread in snow-albedo feedback. Clim. Dyn. 2014, 42, 69–81.
8. Tsang, L.; Durand, M.; Derksen, C.; Barros, A.P.; Kang, D.H.; Lievens, H.; Marshall, H.P.; Zhu, J.; Johnson, J.; King, J.; et al. Review Article: Global Monitoring of Snow Water Equivalent Using High Frequency Radar Remote Sensing. Cryosphere Discuss. 2021, in review.
9. Foster, J.L.; Sun, C.; Walker, J.P.; Kelly, R.; Chang, A.; Dong, J.; Powell, H. Quantifying the Uncertainty in Passive Microwave Snow Water Equivalent Observations. Remote Sens. Environ. 2005, 94, 187–203.
10. Saberi, N.; Kelly, R.; Flemming, M.; Li, Q. Review of snow water equivalent retrieval methods using spaceborne passive microwave radiometry. Int. J. Remote Sens. 2020, 41, 996–1018.
11. Chang, A.T.C.; Foster, J.L.; Hall, D.K. Nimbus-7 SMMR derived global snow cover parameters. Ann. Glaciol. 1987, 9, 39–44.
12. Derksen, C.; Walker, A.; Goodison, B. Evaluation of passive microwave snow water equivalent retrievals across the boreal forest tundra transition of western Canada. Remote Sens. Environ. 2005, 96, 315–327.
13. Che, T.; Li, X.; Jin, R.; Armstrong, R.; Zhang, T. Snow depth derived from passive microwave remote-sensing data in China. Ann. Glaciol. 2008, 49, 145–154.
14. Kelly, R. The AMSR-E Snow Depth Algorithm: Description and Initial Results. J. Remote Sens. Soc. Jpn. 2009, 29, 307–317.
15. Jiang, L.; Wang, P.; Zhang, L.; Yang, H.; Yang, J. Improvement of snow depth retrieval for FY3B-MWRI in China. Sci. China Earth Sci. 2014, 44, 531–547.
16. Yang, J.; Jiang, L.; Wu, S.; Wang, G.; Wang, J.; Liu, X. Development of a Snow Depth Estimation Algorithm over China for the FY-3D/MWRI. Remote Sens. 2019, 11, 977.
17. Jiang, L.; Shi, J.; Tjuatja, S.; Dozier, J.; Chen, K.; Zhang, L. A parameterized multiple-scattering model for microwave emission from dry snow. Remote Sens. Environ. 2007, 111, 357–366.
18. Langlois, A.; Royer, A.; Derksen, C.; Montpetit, B.; Dupont, F.; Goïta, K. Coupling the snow thermodynamic model SNOWPACK with the microwave emission model of layered snowpacks for subarctic and arctic snow water equivalent retrievals. Water Resour. Res. 2012, 48, W12524.
19. Che, T.; Dai, L.; Zheng, X.; Li, X.; Zhao, K. Estimation of snow depth from passive microwave brightness temperature data in forest regions of northeast China. Remote Sens. Environ. 2016, 183, 334–349.
20. Picard, G.; Brucker, L.; Roy, A.; Dupont, F.; Fily, M.; Royer, A.; Harlow, C. Simulation of the microwave emission of multi-layered snowpacks using the dense media radiative transfer theory: The DMRT-ML model. Geosci. Model Dev. 2013, 6, 1061–1078.
21. Picard, G.; Sandells, M.; Löwe, H. SMRT: An active-passive microwave radiative transfer model for snow with multiple microstructure and scattering formulations (v1.0). Geosci. Model Dev. 2018, 11, 2763–2788.
22. Dai, L.; Che, T.; Wang, J.; Zhang, P. Snow depth and snow water equivalent estimation from AMSR-E data based on a priori snow characteristics in Xinjiang, China. Remote Sens. Environ. 2012, 127, 14–29.
23. Pan, J.; Durand, M.; Sandells, M.; Vander Jagt, B.; Liu, D. Application of a Markov Chain Monte Carlo algorithm for snow water equivalent retrieval from passive microwave measurements. Remote Sens. Environ. 2017, 192, 150–165.
24. Tedesco, M.; Jeyaratnam, J. A New Operational Snow Retrieval Algorithm Applied to Historical AMSR-E Brightness Temperatures. Remote Sens. 2016, 8, 1037.
25. Santi, E.; Brogioni, M.; Leduc-Leballeur, M.; Macelloni, G.; Montomoli, F.; Pampaloni, P.; Lemmetyinen, J.; Cohen, J.; Rott, H.; Nagler, T.; et al. Exploiting the ANN Potential in Estimating Snow Depth and Snow Water Equivalent from the Airborne SnowSAR Data at X- and Ku-Bands. IEEE Trans. Geosci. Remote Sens. 2021, 99, 1–16.
26. Bair, E.H.; Abreu Calfa, A.; Rittger, K.; Dozier, J. Using machine learning for real-time estimates of snow water equivalent in the watersheds of Afghanistan. Cryosphere 2018, 12, 1579–1594.
27. Xiao, X.; Zhang, T.; Zhong, X.; Shao, W.; Li, X. Support vector regression snow-depth retrieval algorithm using passive microwave remote sensing data. Remote Sens. Environ. 2018, 210, 48–64.
28. Wang, J.; Forman, B.A.; Xue, Y. Exploration of synthetic terrestrial snow mass estimation via assimilation of AMSR-E brightness temperature spectral differences using the catchment land surface model and support vector machine regression. Water Resour. Res. 2020, e2020WR027490.
29. Yang, J.; Jiang, L.; Luojus, K.; Pan, J.; Lemmetyinen, J.; Takala, M.; Wu, S. Snow depth estimation and historical data reconstruction over China based on a random forest machine learning approach. Cryosphere 2020, 14, 1763–1778.
30. Yang, J.; Jiang, L.; Lemmetyinen, J.; Pan, J.; Luojus, K.; Takala, M. Improving snow depth estimation by coupling HUT-optimized effective snow grain size parameters with the random forest approach. Remote Sens. Environ. 2021, 264, 112630.
31. Che, T.; Li, X.; Jin, R.; Huang, C. Assimilating passive microwave remote sensing data into a land surface model to improve the estimation of snow depth. Remote Sens. Environ. 2014, 143, 54–63.
32. Li, D.; Durand, M.; Margulis, S. Estimating snow water equivalent in a Sierra Nevada watershed via spaceborne radiance data assimilation. Water Resour. Res. 2017, 53, 647–741.
33. Xue, Y.; Forman, B.A.; Reichle, R.H. Estimating snow mass in North America through assimilation of Advanced Microwave Scanning Radiometer brightness temperature observations using the Catchment land surface model and support vector machines. Water Resour. Res. 2018, 54, 6488–6509.
34. Larue, F.; Royer, A.; De Sève, D.; Roy, A.; Picard, G.; Vionnet, V.; Cosme, E. Simulation and assimilation of passive microwave data using a snowpack model coupled to a well-calibrated radiative transfer model over North-Eastern Canada. Water Resour. Res. 2018, 54, 1–26.
35. Merkouriadi, I.; Lemmetyinen, J.; Liston, G.E.; Pulliainen, J. Solving Challenges of Assimilating Microwave Remote Sensing Signatures with a Physical Model to Estimate Snow Water Equivalent. Water Resour. Res. 2021, 57, 1–24.
36. Kim, R.S.; Durand, M.; Li, D.; Baldo, E.; Margulis, S.A.; Dumont, M.; Morin, S. Estimating alpine snow depth by combining multifrequency passive radiance observations with ensemble snowpack modeling. Remote Sens. Environ. 2019, 226, 1–15.
37. Xiong, C.; Shi, J.; Pan, J.; Xu, H.; Che, T.; Zhao, T.; Ren, Y.; Geng, D.; Chen, T.; Jiang, K.; et al. Time Series X- and Ku-Band Ground-Based Synthetic Aperture Radar Observation of Snow-Covered Soil and Its Electromagnetic Modeling. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–13.
38. Lemmetyinen, J.; Derksen, C.; Toose, P.; Proksch, M.; Pulliainen, J.; Kontu, A.; Hallikainen, M. Simulating seasonally and spatially varying snow cover brightness temperature using HUT snow emission model and retrieval of a microwave effective grain size. Remote Sens. Environ. 2015, 156, 71–95.
39. Xue, Y.; Forman, B.A. Atmospheric and Forest Decoupling of Passive Microwave Brightness Temperature Observations Over Snow-Covered Terrain in North America. IEEE J. Select. Top. Appl. Earth Observ. Remote Sens. 2017, 10, 3172–3189.
40. Li, Q.; Kelly, R.; Leppanen, L.; Juho, V.; Kontu, A.; Lemmetyinen, J.; Pulliainen, J. The Influence of Thermal Properties and Canopy-Intercepted Snow on Passive Microwave Transmissivity of a Scots Pine. IEEE Trans. Geosci. Remote Sens. 2019, 99, 1–10.
41. Li, Q.; Kelly, R.; Lemmetyinen, J.; Roo, R.D.D.; Pan, J.; Qiu, Y. The influence of tree transmissivity variations in winter on satellite snow parameter observations. Int. J. Digit. Earth 2021, 14, 1337–1353.
42. Venäläinen, P.; Luojus, K.; Lemmetyinen, J.; Pulliainen, J.; Moisander, M.; Takala, M. Impact of dynamic snow density on GlobSnow snow water equivalent retrieval accuracy. Cryosphere 2021, 15, 2969–2981.
43. Reichstein, M.; Camps-Valls, G.; Stevens, B.; Jung, M.; Denzler, J.; Carvalhais, N.; Prabhat. Deep learning and process understanding for data-driven Earth system science. Nature 2019, 566, 195–204.
44. Tedesco, M.; Jeyaratnam, J.; Kelly, R. NRT AMSR2 Daily L3 Global Snow Water Equivalent EASE-Grids; NASA LANCE AMSR2 at the Global Hydrology Resource Center Distributed Active Archive Center: Huntsville, AL, USA, 2015.
45. Luojus, K.; Pulliainen, J.; Takala, M.; Lemmetyinen, J.; Mortimer, C.; Derksen, C.; Mudryk, L.; Moisander, M.; Hiltunen, M.; Smolander, T.; et al. GlobSnow v3.0 Northern Hemisphere snow water equivalent dataset. Sci. Data 2021, 8, 163.
46. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
47. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
48. Vapnik, V.; Golowich, S.E.; Smola, A.J. Support vector method for function approximation, regression estimation and signal processing. Adv. Neural Inf. Processing Syst. 1997, 7, 281–287.
49. Liu, H.; Li, Q.; Bai, Y.; Yang, C.; Wu, G. Improving satellite retrieval of oceanic particulate organic carbon concentrations using machine learning methods. Remote Sens. Environ. 2021, 256, 112316.
50. Specht, D.F. A general regression neural network. IEEE Trans. Neural Netw. 1991, 2, 568–576.
51. Basheer, I.; Hajmeer, M. Artificial neural networks: Fundamentals, computing, design, and application. J. Microbiol. Methods 2000, 43, 3–31.
52. Dobreva, I.; Klein, A. Fractional snow cover mapping through artificial neural network analysis of MODIS surface reflectance. Remote Sens. Environ. 2011, 115, 3355–3366.
53. Broxton, P.; van Leeuwen, W.; Biederman, J. Improving snow water equivalent maps with machine learning of snow survey and lidar measurements. Water Resour. Res. 2019, 55, 3739–3757.
54. Tarpanelli, A.; Santi, E.; Tourian, M.; Filippucci, P.; Amarnath, G.; Brocca, L. Daily river discharge estimates by merging satellite optical sensors and radar altimetry through artificial neural network. IEEE Trans. Geosci. Remote Sens. 2019, 57, 329–341.
55. Takala, M.; Luojus, K.; Pulliainen, J.; Lemmetyinen, J.; Juha-Petri, K.; Koskinen, J.; Bojkov, B. Estimating northern hemisphere snow water equivalent for climate research through assimilation of space-borne radiometer data and ground-based measurements. Remote Sens. Environ. 2011, 115, 3517–3529.
56. Li, X.J.; Liu, Y.J.; Zhu, X.X.; Zheng, Z.J.; Chen, A.J. Snow Cover Identification with SSM/I Data in China. J. Appl. Meteorol. Sci. 2007, 18, 12–20.
57. Dozier, J.; Bair, E.; Davis, R. Estimating the spatial distribution of snow water equivalent in the world’s mountains. WIREs Water 2016, 3, 461–474.
58. Lievens, H.; Demuzere, M.; Marshall, H.P.; Reichle, R.H.; Brucker, L.; Brangers, I.; de Rosnay, P.; Dumont, M.; Girotto, M.; Immerzeel, W.W.; et al. Snow depth variability in the Northern Hemisphere mountains observed from space. Nat. Commun. 2019, 10, 4629.
59. Lievens, H.; Brangers, I.; Marshall, H.P.; Jonas, T.; Olefs, M.; De Lannoy, G. Sentinel-1 snow depth retrieval at sub-kilometer resolution over the European Alps. Cryosphere 2022, 16, 159–177.
60. Painter, T.; Berisford, D.; Boardman, J.; Bormann, K.; Deems, J.; Gehrke, F. The Airborne Snow Observatory: Fusion of scanning lidar, imaging spectrometer, and physically-based modeling for mapping snow water equivalent and snow albedo. Remote Sens. Environ. 2016, 184, 139–152.
61. Royer, A.; Roy, A.; Jutras, S.; Langlois, A. Performance assessment of radiation-based field sensors for monitoring the water equivalent of snow cover (SWE). Cryosphere 2021, 15, 5079–5098.
62. Treichler, D.; Kääb, A. Snow depth from ICESat laser altimetry-a test study in southern Norway. Remote Sens. Environ. 2017, 191, 389–401.

Figure 1. Spatial distribution of weather stations with possible snow over mainland China. The base map shows the land cover types provided by the Data Center for Resources and Environmental Sciences, Chinese Academy of Sciences (http://www.resdc.cn/ (accessed on 12 May 2018)).

Figure 2. Schematic diagrams of the (a) RF, where n = samples, m = subsamples, m << n, k = variables, ntree = 500, and mtry = k/3; (b) SVR, where input data are rescaled to the range of [−1, 1], the RBF represents the radial basis function kernel, c and g are the parameters of the RBF, and 5-CV denotes 5-fold cross-validation; and (c) ANN, where i and j are the numbers of neurons and hidden layers, respectively.

Figure 3. Workflow of the proposed pixel-based snow depth retrieval algorithm over China.

Figure 4. Trends of (a) corr.coe and (b) unRMSE with increasing training sample size for the three ML models.

Figure 5. Performance comparison of the SVR (top row), ANN (middle row), and RF (bottom row) approaches for snow depth estimation with sample- (left column), temporal- (middle column), and spatial-based (right column) 10-CV approaches. The ground truth SD was randomly selected from all station measurements, including parts A and B. The black solid and dashed lines represent the 1:1 line and the linear regression line, respectively.

Figure 6. Performance validation of ANN- (left column), SVR- (middle column) and RF-based (right column) estimates over China using independent data spatially (see Figure 1, from part B stations). ANN1, SVR1, and RF1 denote the trained ML models without considering effGS, while ANN2, SVR2, and RF2 denote models considering effGS as a predictor variable.

Figure 7. Fitting coefficients of snow depth algorithms PAG1 (upper row) and PAG2 (bottom row).

Figure 8. Validation and comparison of PAG1 and PAG2 algorithms during the winter period (January, February, March, November, and December) and snowpack conditions (thin snow (<20 cm) and deep snow (>20 cm)).

Figure 9. Relationship between the satellite-observed Tb gradient and snow course-measured snow depth over China.

Figure 10. Evaluation and comparison of snow depth estimates from (a) the pixel-based algorithm, (b) GlobSnow-v3.0, (c) AMSR2, and (d) ERA5-land using part B station measurements.

Figure 11. Evaluation and comparison of snow depth estimates in three stable snow cover areas (northern Xinjiang (XJ), Northeast China (NE), and the QTP) over China using part B station measurements. GlobSnow-v3.0 has no estimates in the mountains, e.g., the QTP.

Figure 12. Evaluation of snow depth estimates using snow course 1 measurements in northern Xinjiang.

Figure 13. Evaluation of snow depth estimates using snow course 5 measurements in Northeast China.

Figure 14. Evaluation of snow depth estimates using snow course 6 measurements in Northeast China.

Figure 15. Evaluation of snow depth estimates using snow course 2 measurements around the Tien Shan Mountains. The roughness is equal to the natural logarithm of the standard deviation (Stdev) of the elevation within a grid cell (roughness = log_eStdev). The elevation at 90 m spatial resolution was downloaded from http://www.resdc.cn/ (accessed on 15 March 2022).

Figure 16. Spatial distribution of monthly averaged snow depths in 2016 over China.

Figure 17. Comparison of SE1 estimates and station observations in the Altai Mountains, Tien Shan Mountains, and QTP.

Table 1. Descriptions of the gridded snow depth products used in this study.

Data Product	Initial Resolution	Post-Processing	Final Resolution	Reference/Availability
AMSR2	0.25° × 0.25° (daily)	none	0.25° × 0.25° (daily)	http://gportal.jaxa.jp/gpr/ (accessed on 25 May 2020)
GlobSnow-v3.0	25 km × 25 km (daily)	linear resampling	0.25° × 0.25° (daily)	https://www.globsnow.info/ (accessed on 1 March 2022)
ERA5-land	0.1° × 0.1° (hourly)	linear resampling	0.25° × 0.25° (daily)	https://cds.climate.copernicus.eu/ (accessed on 28 May 2020)

Table 2. Summary of predictor and target variables.

Predictor Variables	Interpretation	Source	Target Variable
Tb_10.65V − Tb_36.5V	vertical polarized spectral difference at 10.65 GHz and 36.5 GHz	AMSR2	Snow depth in centimeters (weather station, 2012–2020)
Tb_18.7V − Tb_36.5V	vertical polarized spectral difference at 18.7 GHz and 36.5 GHz	AMSR2
Elevation	altitude in meters	weather station
Longitude	longitude in degrees	weather station
EffGS	effective grain size in millimeters	optimized by HUT

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Yang, J.; Jiang, L.; Pan, J.; Shi, J.; Wu, S.; Wang, J.; Pan, F. Comparison of Machine Learning-Based Snow Depth Estimates and Development of a New Operational Retrieval Algorithm over China. Remote Sens. 2022, 14, 2800. https://doi.org/10.3390/rs14122800
