1. Introduction
Microwave radar plays a crucial role in sea surface remote sensing and maritime military surveillance. The backscattered signals received from the sea surface are referred to as sea clutter, which is a significant factor affecting the accuracy of detecting weak targets on the sea surface [1,2]. Sea clutter reflectivity is the radar cross section of the sea surface per unit illuminated area. It is one of the important characteristics of sea clutter and reflects the power level of the clutter. Therefore, as radar signal processing becomes progressively more refined, the high-precision prediction of sea clutter reflectivity is critically important.
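As a brief illustration of this definition, the reflectivity can be written as a normalized radar cross section. The symbols here are assumptions introduced for this sketch and are not taken from the paper: \sigma is the radar cross section of the illuminated sea patch and A_c is the area of that patch.

```latex
\[
\sigma^{0} = \frac{\sigma}{A_{c}}, \qquad
\sigma^{0}_{\mathrm{dB}} = 10 \log_{10} \sigma^{0}
\]
```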
Researchers in many countries have derived semi-empirical models of mean sea clutter reflectivity from measured sea surface data. For instance, the Georgia Institute of Technology (GIT) [3] proposed a model applicable to radar frequencies from 1 GHz to 100 GHz and grazing angles from 0.1° to 10°. This model takes the sea state, grazing angle, wind speed, wind direction, and radar wavelength as inputs. The model developed by the Technology Service Corporation (TSC) [4] uses the same inputs as GIT but also takes into account the effects of anomalous propagation. It is suitable for estimating average sea clutter backscattering coefficients for radar frequencies from 0.5 GHz to 35 GHz, grazing angles from 0° to 90°, and all azimuthal wind directions. Reilly [5] introduced a hybrid (HYB) model as a modification of GIT. It is a function of the sea state, grazing angle, wind direction, and radar wavelength, and it incorporates features from Nathanson’s data and the GIT model, with an increased incidence angle of up to 30°. The Naval Research Laboratory (NRL) [6] proposed a model applicable to radar frequencies from 0.1 GHz to 35 GHz and grazing angles from 0° to 60°, which is a function of the sea state, grazing angle, and radar wavelength. Rosenberg et al. [7] extended the continuous model for the X-band based on GIT, widening the usable range of grazing angles from below 10° to the interval from 0.1° to 45°. Empirical models are built on experimental data collected under different environmental conditions and measurement settings, which leads to inherent differences among them. Each empirical model therefore has different parameters and a different effective range [8].
The above-mentioned models, which were developed abroad, are not suitable for predicting sea clutter reflectivity in the seas surrounding China. Chinese researchers have therefore modified these models specifically for the China Sea area. For example, Zhang et al. [9] proposed a modified GIT model that improved wind speed prediction. Wu et al. [10] used the finite-difference time-domain (FDTD) method to correct the reflectivity of four semi-empirical sea clutter models, namely HYB, GIT, TSC, and NRL. Chen et al. [11] combined the influences of the sea state and grazing angle to propose a sea clutter reflectivity prediction model based on NRL. Li et al. [12] presented a modified TSC model specifically for the UHF band and horizontal polarization in the Yellow Sea at small grazing angles. Ma et al. [13] proposed the NRL-MSINN model based on deep learning, which takes multiple marine environmental parameters, sea conditions, the grazing angle, and NRL model parameters as inputs. Shi [14] constructed a prediction system for the backscattering coefficients of sea clutter based on neural networks and various empirical models. Shui et al. [15] proposed a prediction model based on a generalized regression neural network (GRNN) that considers the grazing angle, significant wave height, wind speed, and wind direction. Wang [16] developed a high-accuracy computational method for the average backscattering coefficient of sea clutter in rough sea conditions. This method is based on electromagnetic calculations and combines the small slope approximation method with ray-tracing techniques to compute the backscattering coefficient, achieving high precision and efficiency under such conditions. Most of these prediction models consider only a limited number of marine environmental parameters and are suitable for estimating sea clutter reflectivity only under specific conditions. The NRL-MSINN model considers multiple marine environmental parameters, which improves prediction accuracy. However, different marine environmental parameters can lead to different wave structures; for example, swell is more likely to occur when the wave period is longer. The electromagnetic scattering structure of the sea surface varies across wave structures. Therefore, to further improve the prediction accuracy of sea clutter reflectivity, it is necessary to consider different wave structures.
Traditional sea state classifications, such as the grade of wave height issued by the Standardization Administration of China [17], are based on a single marine environmental parameter, namely significant wave height. However, even under the same sea state, wave structures often vary because of differences in parameters such as the wave period, maximum wave height, and maximum wave period. Wave structures can be even more complex and diverse across different sea conditions. Liu et al. [18] verified through theoretical analysis and empirical data that the rate of reflectivity increases when the grazing angle varies significantly under different radar wavelengths, polarization modes, and sea surface roughness conditions. Das et al. [19] found that sea clutter reflectivity increases with increasing sea surface roughness. Yang et al. [20] discovered that when only the sea state is used as a descriptor of sea surface roughness, the empirical models for sea surface reflectivity can be biased. In particular, the various marine environmental parameters influence the statistical characteristics of sea clutter. Even when the radar parameters are fixed, the uncertainty of these parameters (such as significant wave height, maximum wave height, wave period, maximum wave period, and wind speed), together with the stochastic nature of wave motion, contributes to the variability in radar echoes caused by different wave structures. Therefore, significant wave height alone is not a sufficient basis for a more refined classification of sea states, and the influence of multiple marine environmental parameters should be taken into account.
To achieve a finer classification of sea states, this study uses multiple marine environmental parameters and employs clustering to further divide the sea states. Clustering analysis and discriminant analysis play significant roles in the field of remote sensing [21,22,23,24]. Clustering analysis is an unsupervised learning method that groups samples into clusters based on their similarities. By maximizing intra-cluster similarity and maximizing inter-cluster differences, clustering algorithms reveal the underlying structure and patterns in the data. Discriminant analysis, by contrast, is a supervised learning method that establishes a function or model for predicting the class or label of input samples. It learns a discriminant function or decision boundary that maximizes both the compactness of samples within the same class and the separation between different classes. In this study, clustering helps reveal the underlying structure and similarities in the marine environmental parameter data, providing valuable information and guiding feature selection for subsequent discriminant modeling. The discriminant method then learns the relationship between the features and the label of each clustered wave structure, offering an effective mechanism for classifying and predicting sea wave structures.
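The interplay between the two steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used in the study: plain k-means stands in for the clustering algorithm, a nearest-centroid rule stands in for the discriminant model, and the two-feature samples (significant wave height in m, wave period in s) are hypothetical values chosen only to form two obvious groups.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(samples, k, iterations=20):
    """Plain k-means (a stand-in clustering step, unsupervised).

    The first k samples seed the centroids so the result is deterministic.
    """
    centroids = [list(s) for s in samples[:k]]
    labels = [0] * len(samples)
    for _ in range(iterations):
        # Assignment step: each sample joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(s, centroids[c]))
            for s in samples
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def discriminate(sample, centroids):
    """Nearest-centroid rule (a stand-in discriminant step, supervised by
    the cluster labels): returns the index of the closest cluster centre."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(sample, centroids[c]))


# Hypothetical samples: (significant wave height [m], wave period [s]).
samples = [(0.3, 3.0), (2.0, 8.0), (0.4, 3.2), (0.5, 3.5), (1.8, 7.5), (2.2, 8.4)]

labels, centroids = kmeans(samples, k=2)
new_label = discriminate((0.45, 3.3), centroids)
print(labels, new_label)
```

In practice the features would usually be standardized before distances are computed, because wave heights and wave periods are on different scales.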
Based on measured wave and sea clutter data from the Yellow Sea, this study first used multiple marine environmental parameters (significant wave height, maximum wave height, wave period, maximum wave period, and wind speed) and a clustering algorithm to finely classify wave structures. A discriminant model for the finely classified wave structures was then developed, which can assign new data to the defined wave structure categories. Next, for each wave structure category, a deep neural network model called GIT-HYB-DNN, which combines the empirical models GIT and HYB, was used for prediction. The root mean square error (RMSE) was chosen as the loss function. After optimization and parameter tuning, the prediction RMSE for the individual wave structures ranged from 0.62 dB to 0.84 dB, which is better than the 1.08 dB obtained by applying the GIT-HYB-DNN model directly without dividing the wave structures. These results provide higher-precision prediction of sea clutter reflectivity and have application value for the selection of radar parameters in the Yellow Sea area.
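For reference, the RMSE used as the loss function has the standard form below. The notation is an assumption made for this sketch: N is the number of samples, and \hat{\sigma}^{0}_{i} and \sigma^{0}_{i} are the predicted and measured reflectivity of sample i, both in dB.

```latex
\[
\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( \hat{\sigma}^{0}_{i} - \sigma^{0}_{i} \right)^{2}}
\]
```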
4. Discussion
This section covers the selection of the dataset division ratio used to establish the discriminant model and a comparison of methods for improving the prediction of sea clutter reflectivity. This study improves the prediction of sea clutter reflectivity in two ways. First, it finely divides the wave structures using a clustering algorithm, which is an unsupervised learning method. This algorithm divides the marine environmental parameter data into six clusters based on similarity. Waves within the same cluster have similar structures, leading to similar scattering characteristics. Second, it uses a high-precision prediction model called GIT-HYB-DNN, whose inputs include various marine environmental parameters and therefore cover the influencing factors more comprehensively. The model combines the rational elements of the empirical models GIT and HYB with the learning ability of deep neural networks. By combining these two methods, a separate prediction model is established for each type of wave structure. The input data for each model are highly similar and strongly correlated internally, which reduces the sea clutter reflectivity prediction error (RMSE) in the Yellow Sea region to between 0.62 and 0.84 dB.
4.1. Division of Training Set and Test Set for Discriminant Model
When constructing the discriminant model, the 516 groups of labeled marine environmental data were divided into training and test sets using four different partition ratios: 6:4, 7:3, 8:2, and 9:1. Training was performed separately for each ratio, and Table 6 presents the accuracy of the discriminant model under the different dataset divisions. Table 6 shows that the accuracy was highest when the training-to-test split ratio was 8:2.
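The four partitions can be reproduced in outline as follows. This is only a sketch: the shuffling seed, the rounding of the training-set size, and the helper name `split_dataset` are assumptions, since the text does not state how fractional sample counts were handled.

```python
import random


def split_dataset(items, train_parts, test_parts, seed=0):
    """Shuffle items and split them in the ratio train_parts:test_parts.

    The training-set size is rounded to the nearest integer (an assumption).
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_parts / (train_parts + test_parts))
    return shuffled[:n_train], shuffled[n_train:]


# Integer IDs stand in for the 516 labeled groups of environmental data.
sample_ids = list(range(516))

split_sizes = {}
for train_parts, test_parts in [(6, 4), (7, 3), (8, 2), (9, 1)]:
    train, test = split_dataset(sample_ids, train_parts, test_parts)
    split_sizes[f"{train_parts}:{test_parts}"] = (len(train), len(test))
    print(f"{train_parts}:{test_parts} -> {len(train)} training, {len(test)} test")
```

Under these assumptions the 8:2 split gives 413 training groups and 103 test groups.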
4.2. Sea Clutter Reflectivity Prediction
Based on the test data in Figure 9, we compared the predictive performance of the four empirical models, the NRL-MSINN model, and our GIT-HYB-DNN model. Figure 16 displays the distribution of the differences between the predicted sea clutter reflectivity and the true values for each method, with a bin size of 2 dB. The GIT model’s prediction errors are mainly distributed in the range of 12 to 20 dB and are relatively scattered, with few instances near the 0-dB line. The TSC and NRL models have similar error distributions, which are approximately normal and mainly concentrated in the range of 10 to 18 dB. The HYB model’s prediction errors are primarily distributed in the range of −2 to 6 dB. Both the GIT-HYB-DNN model and the NRL-MSINN model have prediction errors concentrated around the 0-dB line, with the GIT-HYB-DNN model showing a more concentrated distribution.
Figure 17 illustrates the distribution of the absolute differences between the true values and the predicted values for each method. The median and mean of the absolute prediction errors for the GIT, TSC, and NRL models are around 15 dB. Among the four empirical models, the HYB model demonstrates the best predictive performance. However, the NRL-MSINN model and the GIT-HYB-DNN model exhibit lower mean and median prediction errors, with the GIT-HYB-DNN model having the lowest values.
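The error statistics discussed here can be computed along the following lines. The sketch assumes that the error is defined as predicted minus measured reflectivity in dB and that each bin includes its lower edge; the five sample values are illustrative only and are not data from the study.

```python
import math
import statistics


def error_histogram(errors_db, bin_width_db):
    """Count errors per bin; each bin covers [lower, lower + bin_width_db)."""
    counts = {}
    for err in errors_db:
        lower = math.floor(err / bin_width_db) * bin_width_db
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


# Illustrative values only (dB), not measurements from the study.
predicted_db = [-38.5, -41.0, -35.2, -44.1, -39.7]
measured_db = [-39.0, -40.2, -36.9, -43.0, -40.0]

# Signed errors feed the histogram; absolute errors feed the mean and median.
errors_db = [p - m for p, m in zip(predicted_db, measured_db)]
abs_errors_db = [abs(e) for e in errors_db]

hist_2db = error_histogram(errors_db, 2)  # bin width of 2 dB
hist_1db = error_histogram(errors_db, 1)  # bin width of 1 dB
mean_abs = statistics.mean(abs_errors_db)
median_abs = statistics.median(abs_errors_db)

print(hist_2db, hist_1db)
print(f"mean |error| = {mean_abs:.2f} dB, median |error| = {median_abs:.2f} dB")
```

The same function covers both bin sizes mentioned in the text: a width of 2 dB for the comparison of all six models and a width of 1 dB for the closer comparison of the two neural network models.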
Figure 18 compares the prediction error distributions of the NRL-MSINN model and the GIT-HYB-DNN model, with a bin size of 1 dB. The GIT-HYB-DNN model has a higher frequency of occurrences in the range of −1 to 1 dB and fewer occurrences in the other bins. This indicates that the GIT-HYB-DNN model has smaller prediction errors and therefore demonstrates superior predictive performance compared to the NRL-MSINN model.
4.3. Prediction of Sea Clutter Reflectivity for Different Wave Structures
By clustering five marine environmental parameters in the Yellow Sea region, namely significant wave height, maximum wave height, wave period, maximum wave period, and wind speed, the waves were divided into six different structures. For each of the six wave structures, sea clutter reflectivity was predicted using the NRL-MSINN model and the GIT-HYB-DNN model. Table 7 presents the root mean square error (RMSE) of these predictions for each of the six wave structures. The RMSE of the NRL-MSINN predictions ranged from 0.73 to 0.98 dB, an improvement over the RMSE of 1.23 dB obtained without wave structure division [31]. The RMSE of the GIT-HYB-DNN predictions ranged from 0.62 to 0.84 dB, likewise an improvement over the RMSE of 1.08 dB obtained without division. Both models predicted sea clutter reflectivity more accurately, suggesting that the fine division of wave structures improves the prediction accuracy of sea clutter reflectivity.
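To put these figures in relative terms, the short sketch below computes the percentage reduction in RMSE implied by the values quoted in the text. The helper names are hypothetical, and the `rmse` function is included only to show how each per-structure value would be obtained from predicted and measured reflectivity in dB.

```python
import math


def rmse(predicted_db, measured_db):
    """Root mean square error between two equal-length sequences (dB)."""
    squared = [(p - m) ** 2 for p, m in zip(predicted_db, measured_db)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction_percent(undivided_rmse, divided_rmse):
    """Percentage by which the divided-case RMSE is below the undivided one."""
    return 100.0 * (undivided_rmse - divided_rmse) / undivided_rmse


# Toy check of the RMSE helper (values are illustrative, not study data).
toy_rmse = rmse([-40.0, -42.0], [-41.0, -41.0])

# RMSE values (dB) quoted in the text: the undivided case, then the lowest
# and highest values over the six wave structures.
reported = {
    "NRL-MSINN": (1.23, 0.73, 0.98),
    "GIT-HYB-DNN": (1.08, 0.62, 0.84),
}

reductions = {}
for model, (undivided, lowest, highest) in reported.items():
    reductions[model] = (
        relative_reduction_percent(undivided, highest),
        relative_reduction_percent(undivided, lowest),
    )
    smallest, largest = reductions[model]
    print(f"{model}: RMSE reduced by {smallest:.1f}% to {largest:.1f}%")
```

With the quoted values, dividing the wave structures lowers the RMSE by roughly 20% to 41% for NRL-MSINN and by roughly 22% to 43% for GIT-HYB-DNN, depending on the wave structure.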
4.4. Limitations of This Study and Directions for Future Research
The data in this paper were collected in the Yellow Sea region over a certain period, during which the marine environment was complex and constantly changing. This study considers only data for sea states 1 to 4 on the wave height grade scale. To model sea clutter reflectivity prediction comprehensively, data from more sea states must be included. Applying the AP clustering algorithm in this study yielded a refined classification of wave structures into six categories. With a larger dataset, an even more detailed classification of wave structures may be possible.
Future research should expand the dataset, especially by collecting data from a wider range of marine areas and over a longer time span, to obtain a more comprehensive and diverse set of marine environmental samples and thus enhance the model’s generalization ability and predictive accuracy. It is also worth considering deep clustering algorithms to explore wave structures further. This paper focuses solely on sea clutter reflectivity; however, the concept of refined wave structures could also be used to investigate other characteristics of sea clutter.