*Article* **Evaluation of Spaceborne GNSS-R Retrieved Ocean Surface Wind Speed with Multiple Datasets**

**Zhounan Dong 1,2 and Shuanggen Jin 1,3,\***


Received: 26 September 2019; Accepted: 20 November 2019; Published: 22 November 2019

**Abstract:** Spaceborne Global Navigation Satellite Systems-Reflectometry (GNSS-R) can estimate the geophysical parameters by receiving Earth's surface reflected signals. The CYclone Global Navigation Satellite System (CYGNSS) mission with eight microsatellites launched by NASA in December 2016, which provides an unprecedented opportunity to rapidly acquire ocean surface wind speed globally. In this paper, a refined spaceborne GNSS-R sea surface wind speed retrieval algorithm is presented and validated with the ground surface reference wind speed from numerical weather prediction (NWP) and cross-calibrated multi-platform ocean surface wind vector analysis product (CCMP), respectively. The results show that when the wind speed was less than 20 m/s, the RMS of the GNSS-R retrieved wind could achieve 1.84 m/s in the case where the NWP winds were used as the ground truth winds, while the result was better than the NWP-based retrieved wind speed with an RMS of 1.68 m/s when the CCMP winds were used. The two sets of inversion results were further evaluated by the buoy winds, and the uncertainties from the NWP-derived and CCMP-derived model prediction wind speed were 1.91 m/s and 1.87 m/s, respectively. The accuracy of inversed wind speeds for different GNSS pseudo-random noise (PRN) satellites and types was also analyzed and presented, which showed similar for different PRN satellites and different types of satellites.

**Keywords:** spaceborne GNSS-R; DDM; ocean surface wind speed; GMF; CYGNSS

#### **1. Introduction**

Rapidly acquiring global high temporal and spatial resolution ocean surface wind field has extremely significant in many fields. Spaceborne GNSS-R is a relatively new remote sensing technique with promising prospects, which receives reflected GNSS signals from the Earth's surface. With the GNSS-R receiver mounted on a low Earth orbit (LEO) microsatellite, it can form a spaceborne bistatic radar scatterometer to sense wind speed near the sea surface. The hardware instrument of the GNSS-R payload is lightweight and low in power, which can greatly reduce the deployment cost of this remote sensing technique. Through reasonable satellite constellation designing, continuous and rapid measurement of the global sea surface wind speed can be reached, which will effectively compensate for the shortcomings of the traditional monostatic scatterometer and radiometer.

The idea of using the signals of opportunity from GNSS for geophysical parameter detection has been through nearly 30 years of development. The passive reflectometry and interferometry system (PARIS) mesoscale ocean altimetry concept was first proposed by Martin-Neira in 1993 [1]. In 1996, Katzberg conceived of receiving GNSS signals from ocean surface reflection using receivers mounted on LEO satellites to remotely sense ocean states and ocean surface physical parameters [2]. In 2000, Zavorotny and Voronovich presented the GNSS-R bistatic radar scattering model [3], which is

based on the geometric optical approximation of the Kirchhoff (KA-GO) model that describes the average scattering power of the GNSS signal reflected by the sea surface to the receiver direction over different time delays and Doppler frequency shifts. Subsequently, simulation based on the theoretical model [4–6] and the feasibility test of the real satellite-based GNSS-R mission have been gradually carried out.

In 1998, Garrison and Katzberg conducted the airborne GNSS-R experiment that preliminarily demonstrated the potential of GNSS-R to sense the ocean state [7,8]. The first spaceborne GNSS-R experiment was implemented on Shuttle Imaging Radar with Payload C (SIR-C) carried on the space shuttle to verify the feasibility and determine the expected signal-to-noise (SNR) ratio on LEO [9]. The UK Disaster Monitoring Constellation (UK-DMC) was launched in 2003, and successfully verified that GPS signals could be received from the ice-ocean, snow, ocean water, and even in the case of land on the LEO [10]. On July 8, 2014, the UK Technology Demonstration Satellite-1 (TDS-1) satellite was successfully launched with a specific Spaceborne GNSS Receiver REmote Sensing Instrument (SGR-ReSI) payload specifically designed for generating Delay/Doppler map (DDM) in real-time [11]. On 15 December 2016, NASA launched the CYGNSS satellite constellation consisting of eight small satellites, which became the first microsatellite constellation dedicated to GNSS-R ocean wind remote sensing [12]. In addition, GNSS-R experiments on-board Soil Moisture Active and Passive (SMAP) [13] and GNSS Reflectometry, Radio Occultation, and Scatterometry Onboard the International Space Station (GEROS-ISS) [14] missions both aim to exploit the GNSS-R technique for geophysical parameters remote sensing.

The onboard Delay-Doppler mapping instrument (DDMI) outputs DDM after a series of digital signal processing and precise calibration [15]. The scattering power in different DDM pixels is the average cross-correlation power of the sea surface reflected signal received by the nadir left-hand circular polarization (LHCP) antenna and a locally clear replica GNSS navigation code in the receiver. In most missions, this mode is called traditional satellite-based GNSS-R, abbreviated as cGNSS-R. There is another GNSS-R operation mode that directly uses the reflected and direct signals for correlation processing that is called iGNSS-R [16]. In the early stage of airborne GNSS-R experiments, only a small number of observations was obtained, and the wind speed was inferred mainly through optimal correlation fitting between the observed time delay waveform and the simulation waveform from the Z-V model [3]. However, the current on-orbit UK Technology Demonstration Satellite-1 (TDS-1) satellite and CYGNSS constellation provide massive observation data. The sea surface wind speed retrieval method, similar to the backward microwave scatterometer [17], is commonly employed, which regresses so-called DDM observables against the collocated wind from other observing data sources, and quantifies the mapping relationship to form the empirical geophysical model function (GMF) for future wind predicting.

Previous studies have demonstrated that compared with the DDM model-fitting method, the formed empirical GMF can obtain better performance [18,19]. Before establishing a retrieval model through empirical regression, it is necessary to further calibrate DDM to remove the influence of non-geophysical effects. The UK TDS-1 mission provided DDM calibrated to SNR [20], while for the CYGNSS mission, the DDM can be calibrated to the bistatic radar cross-section (BRCS) with more comprehensive ancillary parameters [15,21]. However, the calibrated DDM still cannot be used directly for wind speed inversion but needs to further extract the feature quantity, it should sensitive to wind speed in a specific size of the DDM window determined in terms of the requirement of the inversion spatial resolution. Clarizia and Ruf studied a fitting method by using CYGNSS DDM observables to build empirical GMF, adaptively performed minimum variance (MV) estimates based on DDM range-corrected gain (RCG) [22], and the Bayesian estimate was also adopted to calculate the weighting coefficient for each observable [23]. The results of these studies show that when the real wind speed over sea surface is less than 20 m/s, the wind speed retrieval error is less than 2 m/s, but a larger error appears at a higher wind speed range [24]. The latest studies have carried out some new attempts: [25] proposed sequential processing based on the extended Kalman filter, which is evaluated

by the simulated wind field data; and [26] presented a method to train the neural network for wind speed prediction by using the entire DDM and different feature information. Both showed promising results, but need to further validate with a large amount of measured data.

Although scatterometers, radiometers, buoys, the NWP product, and even ship observations can be used to construct a robust empirical GMF model, the accuracy and spatial-temporal resolution of the reference wind speed provided by different data sources are different. Therefore, it is important to evaluate the effect of the reference wind and the reliability of the inversion model by the retrieval results obtained from different ground truth winds. In this paper, a refined wind speed retrieval algorithm was used to establish the GMF from three different wind data sources, and the inversion results were analyzed and evaluated by multi-dataset. The rest of this manuscript is organized as follows. Section 2 introduces the GNSS-R remote sensing theory and the improved wind speed inversion algorithm. Section 3 depicts the dataset and presents the performance of the wind speed retrieval algorithm compared to different ground truth wind speeds. The discussion is given in Section 4, and Section 5 summarizes the main conclusions.

#### **2. Theory and Methods**

#### *2.1. Bistatic Radar Equation*

The spaceborne GNSS-R remote sensing sea surface wind speed based on the onboard DDMI, which is capable of cross-correlation reflected signals with the local replica code in the receiver and mapping the scattering power over a range of time delay and Doppler frequency bins, is known as DDM. To generate the DDM, the coherent integration time commonly takes 1 ms during signal processing in the receiver to avoid the influence of strong speckle noise in short-time correlation, and 1 s non-coherent integration is performed to obtain higher SNR DDM. Both the current TDS-1 and CYGNSS projects belong to this traditional cGNSS-R. The bistatic radar equation (BRE) theoretically explains the physical meaning of DDM [3].

$$\langle \left| Y\_s(\mathbf{\hat{r}}, \hat{f}) \right|^2 \rangle = \frac{P\_T G\_T \lambda^2 T\_I^2}{(4\pi)^3} \iint \frac{G\_R \sigma^0}{R\_T^2 R\_R^2} \Lambda^2(\mathbf{\hat{r}} - \mathbf{\tau}) \text{sinc}^2(\mathbf{\hat{f}} - \mathbf{\hat{f}}) dA \tag{1}$$

where P*<sup>T</sup>* is the transmit power of GNSS satellite; G*<sup>T</sup>* is the transmit antenna gain; G*<sup>R</sup>* is the antenna gain of the receiver; λ is the wavelength of the signal carrier; T*<sup>I</sup>* is the relevant integration time; R*T*, R*<sup>R</sup>* represent the distance from the transmitter to the sea surface and surface to the receiver, respectively; Λ(τˆ − τ) is the GNSS signal spreading function in delay; and τˆ and τ are the replica signal and incoming signal time delays, respectively. *sinc*<sup>2</sup> ˆ *f* − *f* is the frequency response of the GNSS signal; and ˆ *f* and *f* are the replica signal and incoming signal frequencies, respectively. A is the effective scattering area of DDM and *dA* is a differential area within *A*. σ<sup>0</sup> is the normalized bistatic radar scattering cross-section, which can be expressed as:

$$
\sigma^0 = \frac{\pi \left| \Re \right|^2 \overrightarrow{q}^4}{q\_z^4} P \left( -\frac{\overrightarrow{q}\_\perp}{q\_z} \right) \tag{2}
$$

where is the Fresnel reflection coefficient associated with the signal polarization characteristics; <sup>→</sup> *q* is the scattering unit vector; <sup>→</sup> *q* <sup>⊥</sup> is the horizontal component of the scattering unit vector; *qz* is the vertical component (surface normal direction); *P* is the probability density function of the rough sea surface slope. The scattering coefficient is related to geophysical parameters in the bistatic radar equation. The scattering intensity from the ocean surface is mainly affected by the sea surface roughness. For the wind speed retrieval, the sea surface roughness is affected due to the influence of local winds, and the change in sea surface roughness will be reflected in the variation of the scattering power.

#### *2.2. Delay-Doppler Map Observables*

So far, two types of operational wind speed retrieval algorithm have been used in airborne or spaceborne GNSS-R: (1) through the optimal fitting between the simulated DDM by the Z-V theoretical model and actual observed DDM; and (2) using the DDM derived observables to matchup the collocated wind from another observing system to model an empirical GMF, then using GMF to map the new observable to predict wind speed. The method of using the DDM deconvolution to restore the scattering coefficient, then computing the probability density function (PDF) of the surface slope to relate the wind state, is still undergoing further theoretical research [27]. Usually, multiple DDM observables can be used to establish the separated GMF and predict corresponding wind speed, therefore, an optimal weighting estimator is needed to estimate the weighting coefficients to calculate the final weighted wind speed. Due to the influence of parameter uncertainty such as DDM observation noise, receiver antenna gain, and the effective isotropic radiated power (EIRP) of different GNSS satellites, it is difficult to generate very accurate reference DDM from the DDM simulator [6]. At the same time, the whole DDM from a spaceborne platform corresponds to a larger sea surface area, so the glistening area can be more than 400 km in diameter [28,29]. To meet the specific spatial resolution of inversed wind speed, the latter approach is commonly used in the current GNSS-R field. Before modeling the empirical GMF, the original DDM needs to be calibrated and wrapped according to the bistatic radar forward equations to remove non-geophysical effects, which are usually calibrated as bistatic radar cross-sections [15,21]. Based on the requirement of retrieved spatial resolution, the wind speed inversion model has been established to extract observables in a specific delay Doppler window from DDM [24], which is less affected by observed noise, and sensitive to wind speed around the specular point. Furthermore, the processing of time average is carried out to improve the SNR of observables [24].

Clarizia et al. proposed five DDM observables for the UK-DMC project to exploit the characteristics of DDM that sensitively respond to variations in wind speed, including the Delay-Doppler Map Average (DDMA), Leading Edge Slope (LES), Trailing Edge Slope, Delay-Doppler Map Variance (DDMV), and Allan Delay-Doppler Map Variance (ADDMV) [22]. However, the current UK TDS-1 and CYGNSS mission can only extract DDMA and LES, as both projects directly provide DDM after non-coherent processing in DDMI without raw intermediate frequency (IF) signals published. Since the limitation of spatial resolution, TES observables also cannot be adopted [24]. DDMA represents the average value of the BRCS near the specular point. The LES is the leading edge slope of the integral delay waveform (IDW), and the IDW is obtained from scattering power DDM by summing the columns along the Doppler axis in the specific Delay/Doppler window. Readers can refer to the method in [24] to calculate the inversion observations.

#### *2.3. Wind Speed Retrieval Algorithm*

The fundamental process of the spaceborne GNSS-R wind speed retrieval algorithm is shown in Figure 1 as follows. (a) calibrate the DDM and compute DDM observables, we directly used the DDM of BRCS in the CYGNSS Level 1 dataset, where the calibration can be found in [15,21]; (b) improving the SNR by time-averaging while satisfying the requirement of spatial resolution; (c) DDM observables matchup the ground surface truth wind speed from other observing techniques to establish training samples; (d) considering the geometry configuration of the bistatic radar system to form the mapping relation between DDM observables and referenced wind speed; (e) using the algebraic parametric model to smooth the 2D GMF to remove the influence of insufficient training dataset and observation noise; (f) estimating the weighting coefficients of different observables by the minimize variance (MV) estimator; and (g) using the model for wind speed predictions.

**Figure 1.** Flow chart of spaceborne GNSS-R wind speed retrieval

In this paper, a refined wind speed algorithm is presented based on [19], the NWP and CCMP wind products are used as ocean surface truth wind speed. Matching the observables with reference wind to form the inversion model, counterparts were obtained by bilinear interpolation in space and linear interpolation in time. In fact, the modeled GMF is a series of mapping relations between the DDM observables and referenced wind speed at different incidence angles but presented as a discrete point. The dynamic range of derived wind speed is limited below 35 m/s, and the incidence angle ranges from 0◦ to 70◦. This study does not distinguish the effects of different sea surface states on GNSS-R wind speed retrieval.

In the incident angle dimension, GMF is modeled under different incident angles starting from 0.5◦ in a step length of 1◦. At a certain incidence angle, wind speeds starting from 0.05 m/s increments to 35 m/s in the step length of 0.1 m/s, and the weighted average observables at different wind speed bins are calculated to form discrete empirical mapping relationships. In order to expand the training samples, the training data can be overlapped in both dimensions. At a certain incident angle, all the matched data pairs falling within the left and right two step length intervals (steps by 1◦) are taken as the model training samples. At a certain wind speed range, the weighted DDM observables are calculated by taking the data in the left and right two step length intervals as well, but it should be noted that the step length of the wind speed interval needs to be re-determined according to the wind speed probability density. The strategy in [19] was directly used in this paper. In both dimensions, the training data within different intervals use different weights, and samples within the first step length interval from certain incidence angle/wind speed take twice the weight. The weighting strategy can be explained in Figure 2. There are four different cases for the sample points to construct a specific discrete point for GMF to map between the observable and wind speed at a certain incidence angle: (1) the sample located in the first step length interval around the incident angle and the first step length around specific wind speed, as shown in the blue region in Figure 2, with a scale factor is 4; (2) the sample located within the first step length range of the incident angle and the second step length size around the wind speed, as shown in the purple area in Figure 2 with a scale factor is 2; (3) the sample point is located in the second step length interval around the current incident angle and in the interval of the first step length around the wind speed, as shown in the green area in Figure 2 with a scale factor is 2; and (4) the sample point is located in the second step length interval around the current incident angle and in the second step length interval around the wind speed, as shown in the gray area in Figure 2 with a scale factor is 1. The number of samples in the different intervals is multiplied by the corresponding scale factor as the numerator of the sample weight coefficients in a certain interval, and the sum of the four intervals is used as the denominator of the weight coefficient to compute the weighted value of the discrete points.

**Figure 2.** Weighting strategy of observables to form empirical geophysical model function.

To eliminate the fluctuation of the empirical model caused by insufficient training samples and system noise and ensure the accuracy of the empirical GMF, first, the maximum probability density bin of the wind speed in the training dataset is calculated at each incident angle because the probability density distribution of the wind speed slightly varies at different incident angle bins, as shown in Figure 3, it presents the PDF of the matched NWP wind for the CYGNSS DDM observables in each incidence angle bin. Then, the corresponding weighted DDM observables were calculated with the wind speed close to its maximum probability density, the observables above or below the wind speed were sequentially computed with the step size of 0.1 m/s, and the discrete GMF was also forced to be a monotone function with wind speed, which means that the GMF values could be same if monotonicity is violated during calculation. Since a single function form was not found to fit the discrete GMF well, we directly chose a piecewise function to obtain the final GMF model as the CYGNSS science operations center. The smoothing function smaller or larger than the piecewise point is shown in Equations (3) and (4), respectively:

$$\mathbf{a}\mathbf{b}\mathbf{s} = a\mathbf{o} + a\_1\mathbf{u}^{-1} + a\_2\mathbf{u}^{-2} \tag{3}$$

$$
\cos \mathbf{b} = b\_0 + b\_1 \boldsymbol{\mu} + b\_2 \boldsymbol{\mu}^2 \tag{4}
$$

where *obs* means the DDM observables and *u* represents the ground truth wind speed *U*10.

**Figure 3.** Distribution of *Pwind* as a function of incidence angle (unlabeled colored lines indicate the probability density distribution of wind speed at each incident angle bins).

The nonlinear least-squares fitting was performed for the GMF with wind speeds smaller than the segmented point, a new constraint was added, which requires all coefficients of formula (3) are limited to be non-negative. For discrete points of GMF where the wind speed is greater than the segmented point, the parabolic function fitting means that the values of two functions are equal at the piecewise point, and the first derivatives are equal. Furthermore, new constraints have been added such as limiting the opening of the parabola downward and its axis of symmetry on the left side of the piecewise point to force the established mapping relationship to be smoother and more consistent with the distribution of training data. The smoothing procedure of the discrete GMF is transformed into a nonlinear least square fitting and convex quadratic programming problem. The standard form of the convex quadratic programming is:

$$\begin{array}{ll}\min & \frac{1}{2} \mathbf{x}^T P \mathbf{x} + q^T\\ & \text{s.t. } G \mathbf{x} \le h\\ & A \mathbf{x} = b \end{array} \tag{5}$$

One of the important things to apply a piecewise function is to determine the segmentation point. In order to determine the optimal piecewise point, first, the smooth processing is performed at each GMF discrete point to find one with the smallest fitting residual. Then, do the smoothing again in the interval of a step length (0.1 m/s) around the discrete point with a smaller step. Finally, find the best location with the smallest fitting residual as the final transition point to re-smooth the final model. After obtaining the GMF function *u*(θ, *obsi*) (where θ denotes the incident angle and *i* denotes DDMA or LES) of the spaceborne GNSS-R, more precisely, it is a lookup table. When using the GMF model for the wind speed prediction. The model *u*(θ, *obsi*) is linearly interpolated to the incident angle θ corresponding to the observable getting *u*θ(*obsi*) at first, then *u*θ(*obsi*) linearly interpolates again to obtain the inversed wind speed *u*θ,*obsi* corresponding to the observable.

Establishing the separated empirical GMF for both DDMA and LES can benefit quality control. When the retrieved wind speed between the two types of observables is greater than 3 m/s, the inversion results are considered unreliable. Finally, the MV estimation is used to dynamically adjust the DDM SNR variation caused by GNSS-R geometry changes. The optimal wind speed estimator is obtained by weighting the wind speeds from DDMA and LES [22].

#### **3. Results and Validation**

The CYGNSS dataset became available in March 2017, and the experiments in this work used the V2.1 version of CYGNSS level 1 data downloaded from the Physical Oceanography Distributed Active Archive Center (PO.DAAC). In the process of forming GMF, the RCG of samples is required to be greater than 10 to ensure the quality of the GNSS-R observations. The training dataset used was from 1 July, 2017 to 30 November, 2017. Data collected in December 2017 were used as a test dataset. The inversed results were compared with the NWP-based and CCMP-based wind, also evaluated by the buoy data. Figure 4 shows the distribution of specular point tracks of the 8 CYGNSS satellite on 1st December 2017 and the distribution of the used moored buoys, which are indicated by the red dot.

**Figure 4.** The distribution of daily specular point tracks of 8 CYGNSS satellites and buoys.

#### *3.1. Wind Speed Retrieval Based on ERA5 and GDAS*

The atmospheric reanalysis numerical weather prediction (NWP) products from ECMWF ERA5 provided by the Climate Data Store (ADS) and the Global Data Assimilation System (GDAS) products provided by the Research Data Archive (RDA) were used as the ground truth wind speed in the GNSS-R wind speed retrieval. ERA5 is the fifth generation of NWP products provided by the European Center for Medium-range Weather Forecast (ECMWF), the spatial resolution of the grid wind field product is 0.25◦ × 0.25◦and the time resolution reaches per hour [30]. National Centers for Environmental Prediction (NCEP) operates a global data assimilation system and the surface flux grid for the NCEP GDAS/FNL global surface flux products uses the T574 Gaussian global grid with a time resolution of 6 hours for wind speed products [31]. Both NWP products provide the U10 wind field. The wind speed data are interpolated bilinearly in space and linearly in time to matchup the DDM observables. In order to improve the accuracy of the reference wind speed in the training samples, samples with a deviation larger than 3 m/s from the two reference winds were removed. When the wind speed was less than 20 m/s, only the matched reference wind from ERA5 was used. When the wind speed was greater than 20 m/s, but less than 25 m/s, the average of two matching wind speeds was used. While the wind speed was greater than 25 m/s, only the GDAS wind was adopted.

When DDMA and LES take the log scale, the fitting wind speed inversion model for DDMA and LES observables as shown in Figure 5, it is important to note that we have removed the parts of GMF that exceeded the range of the coordinate axis. Different levels of fold appeared where the wind speed was greater than 20 m/s, especially for the LES, which was obviously due to insufficient training samples. The correlation between the DDM observables and wind speed decreased as the wind speed increased, and when the wind speed was greater than 10 m/s, the first-order derivative change rate of the GMF function was small. This can be seen more clearly in Figure 6, which presents a group of DDMA and LES GMF at specific incidence angles. However, there is an apparent difference in the sensitivity between the two types of DDM wind speed observables, the LES could not respond to the variation of wind speed when it reached 25 m/s, while the DDMA was still sensitive to the change in the observables under the condition of strong wind speed. If the training sample is small or the DDM noise is large, the modeling error is easily shifted to GMF, which will definitely amplify the observation error in the final wind retrieval. However, the incident angle has little effect on the inversion model at low wind speed range, but as the wind speed increases, the influence becomes distinct.

**Figure 5.** Delay-Doppler Map Average (**a**) and Leading Edge Slope (**b**) geophysical model function using Numerical Weather Prediction winds as the ground truth wind speed.

**Figure 6.** Delay-Doppler Map Average (**a**) and Leading Edge Slope (**b**) geophysical model function at specific incidence angles using Numerical Weather Prediction winds as the ground truth wind speed.

When the NWP product was used as the surface truth wind, the density scatterplot of the retrieved wind speed against reference winds, residual versus reference winds, incident angle, and RCG is shown in Figure 7, where the blue dash line in Figure 7a represents 1:1, the wind speed deviation refers to the inverted wind speed subtracts the ground truth wind speed. The total number of matched test samples was 12,356,042 with bias and the Root Mean Square (RMS) of inferred winds are 0.14 m/s and 1.84 m/s, respectively. The inversed uncertainty of different wind intervals are also counted, and a wind speed greater than 10 m/s accounts for 6.71% with an uncertainty of 2.76 m/s; when the wind speed is larger than 15 m/s, the test samples account for 0.19% of the total training dataset with an uncertainty of 3.24 m/s, and a larger retrieval error appeared at higher wind speed range. The dependence of retrieval error on the NWP-derived ground truth wind is shown in Figure 7b, where positive biases appeared at reference wind speeds of 5–12 m/s, while negative biases can be seen at ground truth winds above 12 m/s. The dependence of retrieval error on the incidence angle is shown in Figure 7c, where the highest density of retrieval errors is generally distributed near zero error. There is a pyramid distribution between the retrieval error and RCG as shown in Figure 7d; since RCG represents the received signal strength, larger RCGs can mean better-received signal quality. Figure 8 shows the average deviation and RMS of the inversed wind speed at different wind speed bins. The maximum matching reference wind speed of the test samples was 23 m/s, and the bias and RMS had large

fluctuations around 20 m/s. The main reason is that there are fewer test samples under this wind speed range, which caused large errors.

**Figure 7.** Log(density) scatterplots of the retrieved wind speed with Numerical Weather Prediction ground truth wind speed (**a**), residual versus truth wind speed (**b**), incident angle (**c**), and range-corrected gain (**d**).

**Figure 8.** Root Mean Square and bias from Numerical Weather Prediction derived sea surface wind speed in different wind bins.

#### *3.2. Wind Speed Retrieval Based on CCMP*

The cross-calibrated multi-platform ocean surface wind vector analysis product V2 (CCMP) combines cross-calibrated satellite microwave winds and in-situ wind, using a variational analysis method (VAM) to produce high-resolution (0.25◦) gridded product with a time resolution of 6 h [32], it is also provided by RDA. Satellite-based passive and active microwave measurements mainly contribute to this wind product. The main reason for adopting the CCMP wind as ground truth wind speed is not only because it has high accuracy, but also because it is mainly based on a satellite-based microwave

scatterometer and radiometer. Using CCMP wind can further compare the accuracy of wind retrieval between GNSS-R and other satellite-based techniques.

The CCMP data are also interpolated bilinearly in space and linearly in time to match the DDM observables for the training dataset. The empirical wind speed retrieval algorithm mentioned in Section 2.3 was used to construct the 2D GMF, Figure 9 shows the GMF established by DDMA and LES at an incident angle of 30◦. Magenta dots in the figure represent the empirical GMF directly obtained by weighting the training samples, and the black line represents the GMF obtained after parametric smoothing. It should be noted that the *x*-axis magnitude of the two subgraphs is different. Compared with DDMA, the distribution of the LES samples is closer to the *y*-axis. The comparison of the empirical GMF and parametric smoothed GMF shows that the parametric model of DDMA GMF was better than LES, and the latter had a slightly bigger gap at a lower wind speed range between the smoothed parametric GMF and original GMF, where the phenomenon indeed appeared at the full incidence angle dimension. We infer that the reason for this is that the number of samples used to construct the GMF was still quite small.

**Figure 9.** Delay-Doppler Map Average (**a**) and Leading Edge Slope (**b**) geophysical model function at an incident angle of 30◦ with the CCMP wind.

Since the wind speed products provided by the CCMP dataset are only available until 30 December, 2017, the CCMP testing dataset only included 30 days of samples, and the matchup wind speed pairs reach a total of 10,272,765 groups. The density scatterplot of retrieved wind speed and reference winds, residual versus reference wind speed, incident angle, and RCG are shown in Figure 10 with the logarithmic in the number density of samples. Figure 10a shows that the highest density of samples occurred along the 1:1 blue dash line, and the statistic indicates that the bias and RMS of the CCMP-based inversed winds are 0.05 m/s and 1.68 m/s, respectively. In order to further clarify the inversion accuracy at the different wind speed ranges, the percent and uncertainty of the test samples were also presented, and the wind speed values in training dataset greater than 10 m/s account for 8.48% of the training set with an RMS 2.56 m/s, and wind speeds greater than 15 m/s account for 0.26% with 3.13 m/s. The behavior of the inversed residual with the CCMP wind, incidence angle, and RCG was similar to the NWP-based results, but the residual was closer to the zero error.

**Figure 10.** Log(density) scatterplots of retrieved wind speed with the CCMP ground truth wind speed (**a**), residual versus truth wind speed (**b**), incident angle (**c**), and RCG (**d**).

Figure 11 shows the mean bias and RMS of the CCMP-derived wind speed at different wind speed bins. The RMS uncertainty rose gradually with the wind speed below approximately 10 m/s, with bias around 0 m/s and RMS below 2 m/s. However, the inversion uncertainty increased sharply and is accompanied by fluctuations after the wind speed is above 15 m/s, the bias increasing is even surprising. However, the behavior of the RMS and bias is related to many factors. The primary contribution is from the insufficient testing samples at winds above 15 m/s, which also occurred on the training samples, where it appears as a large modeling error, and the observation is very sensitive to the variant of wind speed.

**Figure 11.** Root Mean Square and bias from CCMP-derived sea surface wind speed in different wind bins.

#### *3.3. GNSS-R Wind Speed Validation with Buoy Observation*

Continuous high temporal-resolution wind speed data were obtained from moored buoy measurements provided by the National Data Buoy Center (NDBC). The time resolution of the wind speed product is 10 minutes, the dynamic range of wind speed is 0.0–35 m/s, and the measurement uncertainty is 0.3 m/s or 3%. Figure 4 shows the distribution of the 93 moored buoys used in this study, buoy wind is employed as an additional data source to validate NWP-derived and CCMP-derived wind speed. It should be noted that the wind of the Tropical Atmosphere Ocean (TAO) program buoy array provided by the NDBC is not calibrated to the standard wind speed reference height U10, and is directly corrected by the following formula [33]:

$$\mathbf{U}\_{10} = 8.87403 \times \mathbf{U}\_{\mathbf{Z}} / \ln(z/0.0016) \tag{6}$$

where *UZ* is the measured wind speed at the anemometer height of *z* above the sea surface in meters. When matching the buoy wind speed with GNSS-R retrieved wind speed, the distance between the specular point of observables and the buoy is limited to less than 50 km, and the reference wind speed was obtained by linear interpolation in the time domain.

Buoy wind speed is obtained by in situ observations, so it has the highest accuracy. Since most of the buoys are located on the offshore coast, the number of training samples collected under the matching condition is very small, so it is difficult to model a 2D GMF. Through analyzing the training dataset, it is found that the buoy reference wind speed is almost below 15 m/s in the matched dataset, even if we tried to neglect the influence of incident angle, the 1D GMF still lacks the reference value for application, so we only used the buoy wind speed as external data source to further evaluate the wind speed inversion results of the other two datasets.

Figure 12 compares the NWP-derived and CCMP-derived winds using the buoy wind as a reference value, where RLM represents a robust regression line, and the histogram shows the distribution of winds. It can be seen from the wind distribution histogram that the wind speed range of the matched samples was basically in the range of 0–15 m/s. The test wind speed pairs from NWP-derived winds and buoy winds only had 23,812 pairs in December 2017, only the sample less than four times of the standard deviation was selected for accuracy statistics, which are contained in the patch of the figure. The bias and RMS are −0.18 m/s and 1.91 m/s, respectively, and the Pearson correlation coefficient between the two sets of wind is 0.78. The CCMP-derived inversed winds demonstrated a slightly better performance compared to NWP, where there are 20,604 groups of opportunity matched wind speed pairs obtained, with the bias and RMS of 0.11 m/s and 1.87 m/s, respectively, the Pearson correlation coefficient between the estimated winds and buoy winds is 0.78, at the same time, the linear regression line had a larger slope. Generally, the accuracy of the retrieved wind speed from two sets of reference wind data sources is considerable. Both the NWP and CCMP wind speed products can be employed as the truth reference wind speed for spaceborne GNSS-R ocean wind speed remote sensing.

**Figure 12.** Retrieved wind speed from NWP-based (**a**) and CCMP-based surface truth wind (**b**) versus buoy wind speed. The gray histogram indicates the distribution of wind speed in testing samples, and statistics only compute samples less than four times of the standard deviation within the blue patch.

#### **4. Discussion**

Comparing the GMF model of DDMA and LES in Section 3, it can be seen that the sensibility of both observables to the variety of wind speed decreases with the increase of wind speed, and the LES completely loses its response to wind speed changes when the real ocean surface wind speed is close to 25 m/s. On one hand, it is due to the influence of the characteristics of the two wind speed indicators [34], which also means that the accurate calibration of the DDM is very important in improving the inversion accuracy of the retrieval model under high wind speed. On the other hand, because the spaceborne GNSS-R bistatic radar wind speed retrieval algorithm strongly relies on the size of the training dataset and its reliability. Figure 13 shows the probability density distribution of wind speed in the training samples establishing GMF with CCMP wind products as the sea surface truth wind speed. The wind speed in the training dataset is mainly concentrated below 10 m/s, which can also be confirmed in Figure 3 with the NWP reference wind. The most probable wind speed in the dataset is 7 m/s, and the statistics show that wind speeds greater than 10 m/s only accounted for 11.71% in the training samples, and a wind speed greater than 15 m/s only accounted for 0.52%. Therefore, to further improve the inversion accuracy in the medium-strong wind speed range, it is necessary to expand the training data volume in this wind speed range. It should be noted that when CCMP was used as the surface truth wind to model GMF, the retrieved results are better than the NWP. However, when the wind speed was larger than 15 m/s, the bias of the CCMP-based retrieved wind speed was larger than the NWP-based retrievals.

Furthermore, the accuracy of the inversed wind for different GNSS types and PRN satellites is performed. Figure 14 depicts the histogram of the bias, RMS, and the testing dataset size of the inferred wind speed corresponding to different GPS PRN satellites with NWP-derived and CCMP-derived surface truth winds. Currently, the latest published V2.1 version of the CYGNSS data has removed the newly launched GPS block II-F satellite data because their transmit power monitoring is still inaccurate, so only GPS block II-R and GPS block IIR-M related DDMs are available. It can be seen that the accuracy of the inversed wind speed with different PRN satellites is approximately similar to the two different surface reference wind speed sources and there are also no significant differences between the different types of satellites.

**Figure 13.** CCMP reference wind speed probability density distribution in the training samples.

**Figure 14.** Wind speed inversion accuracy from the NWP-based reference wind (**a**) and CCMP-based reference wind (**b**) corresponding to different GPS PRN satellite types.

#### **5. Conclusions**

In this paper, the NWP and CCMP wind products are used as the ground truth wind to establish 2D GMF for spaceborne GNSS-R ocean surface wind speed remote sensing, and the buoy data are included to validate the results of the two groups' inversion results. In order to improve the accuracy of the retrieval algorithm, new constraints are added in the process of smoothing the discrete empirical GMF to eliminate fluctuations caused by the measurement noise and insufficient training samples to ensure that the model is consistent with the trend of the actual physical process. The established 2D GMF is sensitive to the variation of DDM observables under medium-strong wind speed conditions. The results show that the inversion accuracy could reach 1.84 m/s if the surface reference wind speed is given by the NWP when the wind speed was less than 20 m/s, while the inversion accuracy of the CCMP-based retrievals was 1.68 m/s. There are no large deviations between the derived wind based on different reference wind sources, and this result further proves the reliability of GNSS-R derived wind speed. From the distribution of the winds in the training dataset, it shows that the training samples are mainly concentrated below 15 m/s, and therefore, further expanding the size of the training dataset at a high wind range can improve the accuracy of the wind speed retrieval algorithm.

**Author Contributions:** S.J. and Z.D. conceived and designed the experiments and Z.D. performed the experiments and analyzed the data. Both authors contributed to the writing of the paper.

**Funding:** This work was supported by the National Key Research and Development Program of China Sub-Project (grant no. 2017YFB0502802), the Strategic Priority Research Program Project of the Chinese Academy of Sciences (grant no. XDA23040100), the Jiangsu Province Distinguished Professor Project (grant no. R2018T20), and the Startup Foundation for Introducing Talent of NUIST (grant no. 2243141801036).

**Acknowledgments:** The authors would like to thank ECMWF and GDAS for providing the reanalysis data, and the NASA CYGNSS team. The CCMP wind products were provided by the National Center for Atmospheric Research's Research Data Archive, and we also appreciate the work for this dataset.

**Conflicts of Interest:** The authors declare no conflicts of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
