1. Introduction
Wind speed and wind direction are important basic climate variables. As a passive and non-contact remote sensing method, the Global Navigation Satellite System Reflectometry (GNSS-R) technique uses the reflected signal of the navigation satellite L-band signal as the remote sensing source. The GNSS-R technique is based upon a bistatic configuration of the transmitter and receiver. The scattering problem involves the Global Positioning System (GPS) signals transmitted from satellites at altitudes of about 20,000 km [
1]. The technique has been applied successfully to the retrieval of various physical parameters, such as sea surface wind speed [
2,
3], sea ice [
4,
5,
6], ocean surface altimetry [
7,
8,
9] and ocean oil slick detection [
10,
11]. Sea surface wind field retrieval comprises wind direction retrieval and wind speed retrieval, of which wind direction retrieval is the more difficult. Few studies have addressed the sensitivity of GNSS-R signals to wind direction [
12].
Many studies have focused on finding feature parameters closely related to wind direction from the Delay Doppler Map (DDM). Early research was mainly based on the relationship between the signal delay waveform, the probability density function (PDF) and wind direction. In 2003, Zuffada et al. simulated the influence of wind direction on the GPS signal delay waveform by analyzing the reflected waveform geometry, scattering area and receiver integration time [13,14]. For a given satellite elevation angle, wind speed and receiver height, they found significant differences between wind directions at the trailing edge of the waveform. In 2004, similar results were shown in the carrier shape of the simulator at the height of 10 km. Further research found that the trailing edges of the left-hand circular polarization (LHCP) and right-hand circular polarization (RHCP) waveforms differed [
15]. This difference decreased as the incident angle increased. In 1994, Hildebrand et al. used the least-squares method to perform PDF-based wind direction retrieval with an aircraft receiver height of 3–5 km; the best wind direction retrieval accuracy was 5° on the filtered data set and 40° on the unfiltered data set [
16]. On this basis, in 2004, the anisotropy in the PDF of the mean square slope (MSS) of the sea surface was found to correspond to the local near-surface wind direction. The PDF can be calculated from the code correlation waveform of the GPS reflected signal, which showed the wind direction can be obtained by using the scattered signals of two or more GPS satellites [
17]. In 2003, the MSS was estimated by fitting the delayed waveform obtained in the flight experiment to the geometric optical model [
18]. When the satellite azimuth was aligned with the aircraft heading, the retrieved wind direction matched the European Centre for Medium-Range Weather Forecasts (ECMWF) wind direction data. In 2004, Komjathy et al. combined signals from different satellites collected from airplanes with a nonlinear least-squares algorithm to retrieve the wind direction. The retrieved wind direction agreed with QuikSCAT measurements to within 30° at wind speeds of 5–10 m/s when the airplane maintained both a stable flight level and a stable flight direction [
19]. In 2014, by fitting the measured DDM with the simulated DDM, Chen et al. achieved a wind direction retrieval accuracy of 30° [
20].
Another line of research on wind direction retrieval focused on the surface-reflected signal under the general mirror geometry observation configuration. In 2017, the influence of wind direction on the near-specular reflection area of the DDM was examined through the Delay Doppler Map Average (DDMA) [21]. The results showed that, for pure mirror geometry, the DDMA responded to wind direction, but the dependence was weak and the influence of wind speed could not be ruled out. When the incident angle was 20°–25° and the wind speed was about 9 m/s, the normalized signal-to-noise ratio (SNR) peak difference between the tailwind and the headwind was 1 dB. Retrieving the wind direction from such small changes remains difficult [
22]. However, refs. [
20,
21] both found differences in the dependence of different parts of the DDM on the wind direction in the process and predicted the possibility of using the part of the DDM far away from the specular reflection point to retrieve the wind direction.
At present, more studies tend to use the relationship between wind direction and the bistatic radar cross-section. In 2014, the sensitivity of the bistatic radar cross-section to the wind direction was evaluated using a small slope approximation model [
23]. The model was more accurate when using scattered signals away from the nominal specular reflection direction. In 2016, Park et al. used a Normalized Bistatic Radar Cross Section (NBRCS) to study the effect of wind direction on GNSS-R application sea surface specular scattering [
24]. For purely mirrored geometry, the change in NBRCS was too small to perceive the wind direction. For only slightly non-mirrored geometry, a large change in wind direction can be found at a single surface point. An airborne wind direction retrieval model based on NOAA G-IV jet aircraft using the Doppler angle of DDM as the retrieval observation was established in the study [
25]. The average accuracy of wind direction retrieval obtained under a fixed model was 20°. However, the model was a non-general model with a small amount of airborne data, and there was also a problem of 180° ambiguity. In 2018, Wang et al. explored the feasibility of using backscattered signals to retrieve wind direction through theoretical simulations, using multi-beam antennas to observe wind direction from at least three different directions to avoid ambiguity. The retrieval accuracy reached 24° at low wind speeds [
26]. In 2021, Pascual et al. concluded that wind speed and SNR had an important influence on the wind direction retrieval model. That work studied the sensitivity to wind direction of the ocean surface bistatic scattering cross-section measured by the Cyclone Global Navigation Satellite System (CYGNSS), using the kurtosis of the DDM samples within a given area. The results show a coefficient of determination (R²) between 0.6 and 0.9 for wind speeds between 4 and 10 m/s [
27].
The research reviewed above shows that it is difficult to establish a sea surface wind direction retrieval model, especially over a large span of space and time. Resolving the 180° wind direction ambiguity is also a key problem in wind direction retrieval. Artificial intelligence algorithms, such as machine learning and deep learning, have made it possible to build complex models. The Convolutional Neural Network (CNN) is a typical deep learning algorithm, and the support vector machine (SVM) is a typical machine learning algorithm [
28,
29,
30]. Because the number of training samples in this paper is small, the SVM algorithm is chosen. In an SVM, a small number of support vectors determine the final result, which makes the method insensitive to outliers and gives it excellent generalization capabilities [
31].
Following our earlier work on a machine-learning-based sea surface wind speed inversion model using CYGNSS sea surface data [32], this paper studies an SVM-based sea surface wind direction retrieval model for space-borne GNSS-R. The data come from CYGNSS Full DDM data, CYGNSS L1 data and ECMWF reanalysis datasets from 2019 to 2020. The geometric relationship parameters of the DDM and CYGNSS satellite parameters are extracted as SVM feature parameters. A grid search is used to optimize the model parameters, a global satellite-borne sea surface wind direction retrieval model is established, and its effectiveness is verified. Compared with the research reviewed above, space-borne GNSS-R sea surface wind direction retrieval has a wider detection range and a larger amount of data. The CYGNSS satellite data selected in this paper have wide coverage and a long time span. The results show that the SVM method proposed in this paper can effectively retrieve the sea surface wind direction.
3. Geometric Feature Parameter Extraction
According to the GNSS-R bistatic radar equation, wind direction causes a change in the direction of the sea surface slope resulting in DDM asymmetry, but other parameters can also contribute to the shape of DDM [
34]. Therefore, it is difficult to extract the wind direction from the DDM directly, and the geometric feature parameter of the DDM, which is more sensitive to wind direction and less sensitive to other factors, is needed. Most research and observations have shown that the DDM expands when the wind speed increases. When the DDM expands, the DDM peak, the center of mass of
and the center of mass of
will shift toward larger delay bins. The difference between these shifts is determined more by wind direction than by other factors, such as wind speed.
is to filter out DDM parts with a horseshoe shape whose power value is greater than the specified threshold after normalizing the DDM.
is the region whose power is within a certain ratio of the peak value of DDM.
Therefore, two measurement parameters to establish the relationship with the wind direction are proposed. The first parameter is the vector azimuth angle from the peak point of DDM to the center of mass of
, namely angle φ1 in
Figure 1,
Figure 2,
Figure 3 and
Figure 4. The second parameter is defined as the vector azimuth angle from the center of mass of
to the center of mass of
, namely angle φ2. The peak point corresponds to the specular reflection point.
When calculating these two indicators, the DDM power is normalized based on the DDM peak value to avoid any calibration problems. The specific steps are as follows:
(a) Normalize DDM based on the peak point.
(b) Filter out the part with normalized power value greater than from DDM.
(c) Calculate the coordinates of the DDM peak point (, ) and the centroid coordinates ( ) of , where X represents the time delay, Y represents the Doppler shift and T(X, Y) represents the corresponding power at the time delay X and the Doppler shift Y. The calculations of X and Y are shown in Equations (1)–(3).
(d) Filter out the part of whose power value is 30% to 70% of the peak power value from . This paper tuned the thresholds by trial, based on a statistical analysis of a large number of DDMs, during the calculation of φ1 and φ2. Then calculate the center of mass of in the same way as above.
(e) Calculate φ1 and φ2 from the DDM peak point, the center of mass of (, ) and the center of mass of to obtain geometric feature parameters.
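Steps (a)–(e) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this paper: the DDM is taken to be a list of rows (row index = Doppler bin, column index = delay bin), the centroid is assumed to be the power-weighted mean of the bin indices, the azimuth is assumed to be the unsigned angle relative to the delay axis, and `region_threshold=0.1` is a hypothetical placeholder for the first power threshold. The function names are illustrative only.

```python
import math

def centroid(ddm, keep):
    """Power-weighted centre of mass (delay X, Doppler Y) over cells where keep(power) is True."""
    total = sum_x = sum_y = 0.0
    for y, row in enumerate(ddm):          # y: Doppler bin index (assumed row index)
        for x, power in enumerate(row):    # x: delay bin index (assumed column index)
            if keep(power):
                total += power
                sum_x += x * power
                sum_y += y * power
    if total == 0.0:
        raise ValueError("no DDM cell satisfies the threshold")
    return sum_x / total, sum_y / total

def azimuth_deg(start, end):
    """Unsigned angle in [0, 180] of the vector start -> end, measured from the delay axis (assumed convention)."""
    return abs(math.degrees(math.atan2(end[1] - start[1], end[0] - start[0])))

def feature_angles(ddm, region_threshold=0.1, band=(0.3, 0.7)):
    """Return (phi1, phi2) in degrees; region_threshold is a hypothetical value."""
    peak = max(max(row) for row in ddm)
    norm = [[p / peak for p in row] for row in ddm]                  # step (a)
    peak_xy = next((x, y) for y, row in enumerate(norm)
                   for x, p in enumerate(row) if p == 1.0)           # first peak cell
    c1 = centroid(norm, lambda p: p > region_threshold)              # steps (b)-(c)
    c2 = centroid(norm, lambda p: p > region_threshold
                  and band[0] <= p <= band[1])                       # step (d)
    return azimuth_deg(peak_xy, c1), azimuth_deg(c1, c2)             # step (e)
```

Normalizing by the peak first means that both thresholds are expressed as fractions of the peak power, so the angles do not depend on the absolute calibration of the DDM.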
Figure 1,
Figure 2,
Figure 3 and
Figure 4 show that the shape of the DDM with different wind directions is significantly different between
Figure 1 (or
Figure 3) and
Figure 2 (or
Figure 4). The part outside the dark blue area is
. The five-pointed star represents the peak point, the diamond represents the center of mass of
and the triangle represents the center of mass of
.
The two angle parameters are related to the sea surface wind direction through the statistical analysis of a large amount of data of CYGNSS Full DDM.
Figure 1,
Figure 2,
Figure 3 and
Figure 4 are typical examples representing different wind directions. Therefore, φ1 and φ2 can be used to retrieve the wind direction. However, the wind direction retrieved from φ1 and φ2 alone has a 180° ambiguity. To resolve this ambiguity and improve the accuracy of wind direction retrieval, other parameters related to wind direction are needed.
Figure 1,
Figure 2,
Figure 3 and
Figure 4 also indicate that the normal range of φ1 is between 0° and 10°, and the normal range of φ2 is between 170° and 180°. Based on a large amount of statistical data, this paper finds that most values of φ1 and φ2 that reflect the asymmetry of the DDM fall within these normal ranges. To improve the robustness of the model, this paper slightly expands the normal ranges and defines φ1 greater than 15° and φ2 less than 165° as abnormal angles. Abnormal angles include most angles that cannot correctly reflect the asymmetry of the DDM, and in most such cases φ1 and φ2 cannot be used to establish a connection with wind direction. However, some data samples with abnormal angles can still be used, so this paper does not simply delete such samples according to their numerical values. Specific data preprocessing conditions will be discussed later.
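The abnormal-angle definition can be written as a simple flag. The sketch below assumes that a sample counts as abnormal when either angle leaves its widened range; the text does not state whether the two conditions are combined with "or" or "and", so this is an interpretation. The function only labels a sample and does not delete it, in line with the paragraph above. The name `is_abnormal` is hypothetical.

```python
def is_abnormal(phi1_deg, phi2_deg, phi1_max=15.0, phi2_min=165.0):
    """Flag a sample whose phi1 exceeds 15 degrees or whose phi2 is below 165 degrees.

    Assumption: either condition alone is enough to mark the sample as abnormal.
    """
    return phi1_deg > phi1_max or phi2_deg < phi2_min
```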
5. Data Process and Results
5.1. Data Preprocessing
The sample data in this paper span 27 February 2019 to 17 November 2020. The data set information (
Table 1) is as follows:
The dataset is quality-controlled and preprocessed. The preprocessing criteria are as follows:
(1) Through the quality control (QC) flag in the L1 data of CYGNSS, this paper selects data samples with good quality.
(2) The power of DDM with low wind speed is mainly concentrated near the specular reflection point, and it is difficult to get the normal angle. Data samples with a wind speed above 5 m/s are selected in this paper, and data are processed in different wind speed ranges above 5, 8, 10, 12 and 15 m/s.
(3) When the SNR is too low, the original shape of DDM is “submerged” in the noise, which affects the data quality of geometric relationship feature parameters (φ1 and φ2). The abnormal angles (φ1 and φ2) are defined in
Section 3.
Figure 7 shows that far more data samples have an abnormal angle when the SNR is lower than 1.3, so only data samples with an SNR higher than 1.3 are selected here.
(4) Near the coast, wind direction is affected by land airflow, which produces unusually shaped DDMs and, in extreme cases, a peak point located at the end of the horseshoe shape. Therefore, only sample data with an offshore distance of more than 25 km are selected here.
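The four criteria above amount to a per-sample filter followed by a split into cumulative wind speed ranges. The following sketch shows one way to express this; the `Sample` fields are hypothetical names rather than CYGNSS L1 variable names, `qc_ok` stands in for whatever combination of QC flags is treated as good quality, and the 5 m/s bound is assumed to be inclusive.

```python
from dataclasses import dataclass

@dataclass
class Sample:
    qc_ok: bool          # hypothetical summary of the CYGNSS L1 QC flags
    wind_speed: float    # reference wind speed in m/s
    snr: float           # SNR, in the same units as the 1.3 threshold in the text
    offshore_km: float   # distance of the specular point from land in km

def keep(sample, min_wind=5.0, min_snr=1.3, min_offshore_km=25.0):
    """Apply criteria (1)-(4): QC flag, wind speed, SNR and offshore distance."""
    return (sample.qc_ok
            and sample.wind_speed >= min_wind
            and sample.snr > min_snr
            and sample.offshore_km > min_offshore_km)

def subsets_by_wind(samples, lower_bounds=(5, 8, 10, 12, 15)):
    """Group the retained samples into the cumulative wind speed ranges used in the text."""
    kept = [s for s in samples if keep(s)]
    return {lb: [s for s in kept if s.wind_speed >= lb] for lb in lower_bounds}
```

Because the ranges are cumulative, a sample at 13 m/s appears in the subsets for 5, 8, 10 and 12 m/s but not in the subset for 15 m/s.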
The distribution of the final filtered data on different data sets is shown in
Table 2, the quantity distribution of the data in different months before and after the screening is shown in
Figure 8, and the distribution of the filtered data in different wind directions is shown in
Figure 9. It can be seen from
Figure 8 that the data samples are mainly concentrated in July to November, and
Figure 9 shows that wind direction is mainly concentrated in the interval of 1–36 (0°–180°).
5.2. Accuracy Assessment
According to the flow chart in
Figure 6, the SVM sea surface wind direction retrieval model is established. The data set imported here is from CYGNSS satellite and ECMWF reanalysis datasets. The dataset is randomly divided into 80% for the test set and 20% for the training set, and then 20% of the training set is divided for grid cross-validation. The cross-validation grid search method is carried out five times. After obtaining the model, the accuracy results are obtained in the test set, and the root mean square error (RMSE) is calculated as follows:
In Equation (9), n is the number of samples. is the true value of the wind direction from the ECMWF reanalysis data set, and is the middle value of the predicted wind direction interval (the middle value of label 1 interval is 2.5°, and that of label 2 interval is 7.5°).
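The interval labels and the RMSE can be illustrated as follows. This is a sketch under assumptions, not a reproduction of Equation (9): it uses the standard root-mean-square form, assumes that label k covers [5(k − 1)°, 5k°) so that its middle value is 5k − 2.5°, and assumes that the error between two directions is wrapped to the range 0°–180°. All function names are illustrative.

```python
import math

def label_of(direction_deg, step=5.0):
    """1-based interval label: [0, 5) -> 1, [5, 10) -> 2, ..., [355, 360) -> 72."""
    return int((direction_deg % 360.0) // step) + 1

def midpoint_of(label, step=5.0):
    """Middle value of an interval: label 1 -> 2.5 degrees, label 2 -> 7.5 degrees."""
    return (label - 0.5) * step

def angular_error(a_deg, b_deg):
    """Smallest absolute difference between two directions (assumed wrap to 0-180 degrees)."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)

def rmse(true_dirs_deg, predicted_labels):
    """Root mean square error between true directions and predicted interval midpoints."""
    errors = [angular_error(t, midpoint_of(k))
              for t, k in zip(true_dirs_deg, predicted_labels)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))
```

Since the prediction is an interval midpoint, a correctly classified sample can still contribute an error of up to 2.5° to the RMSE.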
5.3. Retrieval Results of Wind Direction
5.3.1. Grid Search Results
Based on the best parameters obtained by grid search cross-validation, the SVM model is established to evaluate the accuracy of 80% of test sets randomly selected from the dataset.
Figure 10 is the result of a grid search for datasets with a wind speed range of more than 8 m/s with nine dimensional feature parameters. It can be seen from
Figure 10a that the optimal parameter combination is near the range of
C = 10 and
Γ =
, and the optimal parameter can be obtained by further narrowing the range. For datasets with different wind speed ranges, the accuracy evaluation results are shown in
Table 3.
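The "search, then narrow the range" procedure described above can be sketched as a coarse-to-fine grid search over the two SVM hyperparameters. The code is a generic illustration with assumed exponent ranges and grid sizes; `score_fn` is a hypothetical stand-in for the cross-validated accuracy of an SVM trained with the given pair, and the second hyperparameter, Γ in the text, is written `gamma`.

```python
import itertools
import math

def log_grid(lo_exp, hi_exp, steps):
    """Return `steps` (at least 2) values evenly spaced in log10 from 10**lo_exp to 10**hi_exp."""
    width = hi_exp - lo_exp
    return [10.0 ** (lo_exp + i * width / (steps - 1)) for i in range(steps)]

def grid_search(score_fn, c_values, gamma_values):
    """Return the (C, gamma) pair with the highest score on the grid."""
    return max(itertools.product(c_values, gamma_values),
               key=lambda pair: score_fn(*pair))

def coarse_to_fine(score_fn, c_exp=(-2.0, 4.0), gamma_exp=(-4.0, 2.0),
                   steps=7, rounds=2):
    """Search a coarse log grid, then re-centre a narrower grid on the best point."""
    (c_lo, c_hi), (g_lo, g_hi) = c_exp, gamma_exp
    for _ in range(rounds):
        best_c, best_gamma = grid_search(score_fn,
                                         log_grid(c_lo, c_hi, steps),
                                         log_grid(g_lo, g_hi, steps))
        c_step = (c_hi - c_lo) / (steps - 1)
        g_step = (g_hi - g_lo) / (steps - 1)
        # Narrow each range to one coarse grid step on either side of the best point.
        c_lo, c_hi = math.log10(best_c) - c_step, math.log10(best_c) + c_step
        g_lo, g_hi = math.log10(best_gamma) - g_step, math.log10(best_gamma) + g_step
    return best_c, best_gamma
```

Searching on a logarithmic scale is a common choice for these hyperparameters because useful values can differ by orders of magnitude; each refinement round reuses the same number of grid points over a smaller range.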
5.3.2. Results Comparison between Two Different Mapping Relationships
Table 3 shows the optimal parameters results (after Grid searching) and RMSEs of SVM retrieval results of different wind speed datasets (≥5, ≥8, ≥10, ≥12 and ≥15 m/s) with two mapping relationships,
and
.
6. Results Analysis
6.1. Analysis of Data Set Results for Different Wind Speeds
According to
Table 3, SVM models with different wind speed ranges have different optimal parameters, and the wind direction retrieval result with a wind speed greater than 10 m/s is best. In the case of a wind speed of 5–10 m/s of
, with the increase of wind speed, φ1 and φ2 can better reflect the geometric features of DDM asymmetry, which is related to the wind direction, so the RMSE gradually decreases. In the case of a wind speed greater than 10 m/s, the RMSE reaches its lowest. When the wind speed is greater than 12 m/s, the RMSE increases. The main reason is that the data quality decreases with the increase of wind speed, which is reflected in the decrease of SNR and the increase in the number of abnormal angles. When the wind speed range is more than 15 m/s, the RMSE decreases to 26.78°, which is mainly due to the reduction of the amount of data and the concentration of high wind speed data in time and space, resulting in the improvement of classification accuracy of the SVM sea surface wind direction retrieval model.
Compared with , the overall classification accuracy of decreased significantly. It shows that the introduction of LES, NBRCS, SNR and RCG is conducive to improving the accuracy of wind direction retrieval. These parameters more accurately reflect the variation of sea surface roughness under different wind directions and improve the classification performance of SVM.
According to
Table 3, the SVM wind direction retrieval result with a wind speed greater than 10 m/s using mapping-relationship
is the best.
Figure 11 shows the SVM classification confusion matrix for the data set with this wind direction retrieval result. Most of the data samples are classified into the correct wind direction interval or an interval adjacent to it, and the overall classification accuracy is high. This indicates that the SVM sea surface wind direction retrieval model established in this paper classifies well. However, the RMSE on all datasets is more than 20°, so the results need to be further analyzed.
Figure 12 shows the distribution of different angles on the test set. According to Equation (10), the difference can be calculated. The prediction error of most samples is within 5–10°, and only a small part of the data error is within 20–180°. It shows that the SVM model can effectively solve the problem of 180° wind direction ambiguity. However, even if the number of samples with an error of more than 90° accounts for a small proportion of the test set, they still have a considerable impact on RMSE. The biggest penalty for RMSE is to predict a data sample as the opposite wind direction. This is the main reason why the overall classification accuracy of this paper is higher, but the RMSE rises to more than 26°.
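A short numerical illustration shows why a few opposite-direction predictions dominate the RMSE. The proportions below are hypothetical and chosen only for the arithmetic; they are not taken from the test set in this paper.

```python
import math

def rmse(errors_deg):
    """Root mean square of a list of angular errors in degrees."""
    return math.sqrt(sum(e * e for e in errors_deg) / len(errors_deg))

typical = [7.5] * 980      # hypothetical: 98% of samples with an error of 7.5 degrees
reversed_ = [180.0] * 20   # hypothetical: 2% predicted as the opposite direction

rmse_clean = rmse(typical)               # RMSE without reversed predictions
rmse_mixed = rmse(typical + reversed_)   # RMSE with reversed predictions included

# Share of the total squared error that comes from the reversed predictions.
share = (sum(e * e for e in reversed_)
         / sum(e * e for e in typical + reversed_))
```

With these made-up numbers, the RMSE rises from 7.5° to about 26.5°, and the 2% of reversed predictions account for roughly 92% of the total squared error.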
6.2. Further Analysis of RMSE Variation under Different Wind Speeds
The previous section analyzed how the RMSE changes with wind speed when nine-dimensional feature parameters are used. Although the reasons given above are likely the main ones, the impact of changes in the amount of data cannot be ignored. The amount of data decreases as the wind speed increases, which is likely to have a significant impact on the training of the SVM model. Therefore, the results need to be analyzed further with the amount of data controlled.
First, the increase in RMSE for wind speed ≥ 12 m/s compared to wind speed ≥ 10 m/s should be analyzed. To avoid the impact of data set changes, the under-sampling method was adopted to control the number of data samples. In this paper, 16,552 data samples are randomly selected from the data set in the wind speed between 9 and 12 m/s for the establishment of the SVM model.
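Random under-sampling of this kind can be sketched as drawing a fixed number of samples without replacement. The function name and the fixed seed are assumptions made for reproducibility; the text does not describe how the random selection was implemented.

```python
import random

def under_sample(samples, n, seed=0):
    """Draw n samples without replacement; the seed makes the draw repeatable."""
    if n > len(samples):
        raise ValueError("cannot draw more samples than are available")
    return random.Random(seed).sample(samples, n)
```

For the comparison described above, `n` would be set to the size of the smaller data set (16,552 here), so that both models are trained on the same number of samples.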
From
Table 4, the RMSE of the SVM wind direction retrieval model established under the wind speed between 9 and 12 m/s is lower than that of the wind speed greater than 12 m/s.
Figure 13 and Figure 14 show the distribution of SNR in different wind speed ranges. The average SNR in Figure 14 is 3.34, whereas the average SNR in Figure 13 is lower, at 2.78. This indicates that the increase in RMSE for wind speeds greater than 12 m/s, compared with wind speeds greater than 10 m/s, is indeed due to the reduction in data quality. However, reducing the amount of data also degrades the accuracy: even in similar wind speed ranges, a reduction in data directly leads to a rise in RMSE.
Second, this paper analyzes the decrease in RMSE for wind speeds greater than 15 m/s. Few data samples have a wind speed greater than 15 m/s, and their data quality is lower. The high concentration of data samples in time and space is probably the main reason for the decline in RMSE, but the impact of the smaller number of data samples also needs to be studied. In this paper, 6498 data samples are randomly selected from the data set in the wind speed range of 12 to 15 m/s to establish the SVM model.
From
Table 5, the RMSE at wind speeds of 12 to 15 m/s is higher than that at wind speeds greater than 15 m/s. It shows that under normal circumstances, the reduction of data and the reduction of data quality will lead to the increase of RMSE. However, the high wind speed data samples collected by the CYGNSS satellite are highly concentrated in time and space, which makes the RMSE lower. The SVM model established with this data set is not suitable for the retrieval of all wind directions. The wind direction information contained in the training data set is not complete.
This paper established two different mapping relationships between feature parameters and WD ( and ). The results using show that it can effectively solve the problem of wind direction ambiguity. In addition, in order to get φ1 and φ2, which can reflect the geometric features of DDM, it is necessary to preprocess the dataset. Firstly, the QC of the CYGNSS L1 data product is used for data preprocessing. Secondly, by analyzing a large number of abnormal angles’ data samples, it is found that wind speed and SNR have a greater impact on φ1 and φ2. Finally, the selected specular reflection point should be far away from the land. After data preprocessing, the wind direction retrieval based on SVM using nine dimensional feature parameters can accurately retrieve wind direction with a certain condition of wind speed and SNR.
To further reduce the RMSE of the sea surface wind direction retrieved by the SVM model, more and higher-quality DDM data are needed. More data allow the model to cover all wind directions, and higher-quality DDMs allow φ1 and φ2 to reflect the DDM asymmetry fully. Both improve the accuracy of the wind direction retrieved by the model. In addition, this paper retrieves the wind direction from the asymmetry of the DDM. When the wind speed is lower than 5 m/s, the asymmetry of the DDM is not obvious, and under low wind speeds and signal interference the wind direction retrieval result becomes worse. However, based on a large amount of high-quality data, the model is also robust to wind direction retrieval under low wind speeds. Finally, one problem remains unsolved: the sea surface wind direction close to land cannot be retrieved accurately, so data samples with specular points close to land need to be deleted.
7. Conclusions
In this paper, a GNSS-R sea surface wind direction retrieval method based on the SVM model is proposed for a large span of space and time. The data are from CYGNSS Full DDM, CYGNSS L1 data and ECMWF reanalysis datasets. The geometric relationship features φ1 and φ2 extracted from the DDM reflect the wind direction more accurately and serve as important feature parameters for wind direction retrieval. Together with other feature parameters related to wind direction, they form the input feature parameters of the dataset, which helps resolve the wind direction ambiguity. The wind direction is divided into 72 intervals in 5° steps. Wind speed and SNR have an important influence on the retrieval of sea surface wind direction, especially on the geometric feature parameters of the DDM (φ1 and φ2). Therefore, to improve the retrieval accuracy of wind direction, the data are screened by wind speed and SNR.
The retrieval results in different wind speed ranges are evaluated. To reflect the impact of sea surface roughness on the DDM more accurately, this paper sets up two different mapping relationships, which contain five-dimensional and nine-dimensional feature parameters, and compares their experimental results. With nine-dimensional feature parameters, the RMSE of the wind direction retrieved by the SVM model is 30.03° when the wind speed is higher than 5 m/s and 27.31° when the wind speed is higher than 8 m/s. In the dataset with wind speed higher than 10 m/s, the RMSE is 26.70°, which is the best of all. With five-dimensional feature parameters, the overall RMSE in the different wind speed ranges increases by more than 10°, which shows that the introduction of LES, NBRCS, SNR and RCG can effectively improve the accuracy of SVM wind direction classification. The RMSE differs between wind speed ranges, and this paper discusses the reasons for the change. To eliminate the influence of data volume, under-sampling is used to control the number of samples. The increase of RMSE from a wind speed greater than 10 m/s to a wind speed greater than 12 m/s is due to the decline of data quality. The decrease of RMSE from a wind speed greater than 12 m/s to a wind speed greater than 15 m/s is due to the high concentration of data samples in time and space. The results show that a sea surface wind direction retrieval model based on SVM can effectively retrieve the sea surface wind direction and solve the problem of wind direction ambiguity. The spatial-temporal discontinuity of full DDM data, the relatively small amount of filtered data and the error of wind speed products will affect the results of sea surface wind direction retrieval. As more CYGNSS data products become available, the accuracy of the wind direction retrieval method based on SVM is expected to improve further.