1. Introduction
Hainan Island, nestled in the northern part of the South China Sea, enjoys a warm tropical climate that endows the island with rich biodiversity and stunning natural landscapes. Economically, the island stands out as a hotspot for both tourism and fishing. However, the tropical climate brings its own set of challenges, including seasonal storms and the long-term effects of climate change. Given Hainan’s unique status in China and the various challenges it faces, understanding and monitoring its marine currents is crucial for ensuring the ecological and economic stability of Hainan and its surrounding maritime regions.
High-Frequency Radar (HFR) has emerged as a revolutionary tool in marine observation in recent years. By harnessing high-frequency electromagnetic waves, it not only captures subtle changes on the ocean’s surface but also, through cutting-edge inversion techniques, accurately deciphers the activities of winds, waves, and, crucially, ocean currents [1,2,3,4]. The application of this technology offers researchers extensive, real-time marine data, and provides essential informational support and solutions in the realms of global climate change and marine safety.
HFR technology has demonstrated high accuracy in measuring winds, waves, and currents on the ocean surface, an achievement widely recognized by the academic community [5,6,7,8,9,10]. However, like other measuring tools, HFR faces inherent challenges in its measurements, including inaccuracies due to equipment errors, electromagnetic interference, or data loss during transmission [11,12]. In previous studies, quality control efforts for HFR were mainly centered on radial velocity data, which capture the component of the surface current directed along the line between the radar and a specific point. To obtain a complete ocean current vector, radial velocity information is needed from at least two different radar locations; these data are then combined to produce a two-dimensional vector representing the flow direction and speed of the ocean’s surface at a specific point.
To form a continuous ocean current vector field, researchers typically employ mathematical methods to merge radial velocity data from multiple radar stations onto a grid. One commonly used method relies on inverse-distance interpolation, using the distances and angular differences between the radar stations and each grid point to estimate velocity values. Another method, based on the least-squares principle, finds the vector that best fits the radial data gathered from the different radar stations. Despite these advanced mathematical approaches, the merging process can still produce anomalous vector data. Such anomalies can stem from mismatches between radar data, limitations of the merging algorithm, incomplete or subpar quality control of the radial velocities, and other factors. Quality issues in the radial velocities, such as noise, discontinuity, or other unstable elements, might be only partially addressed in the initial quality control phase and can be magnified, or give rise to new anomalies, during the merging process.
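As a concrete illustration of the least-squares principle, the sketch below fits a single 2-D current vector to radial speeds seen from several stations. The bearing geometry, data values, and function name are hypothetical illustrations, not the merging algorithm actually used by the radar network described here.

```python
import numpy as np

def radials_to_vector(bearings_deg, radial_speeds):
    """Least-squares fit of a 2-D current vector (u, v) to radial
    speeds observed from several radar stations.

    Each radial speed is the projection of the true vector onto the
    unit vector along the radar's look direction:
        r_i = u * cos(theta_i) + v * sin(theta_i)
    With two or more distinct bearings the system is (over)determined,
    and np.linalg.lstsq returns the best-fitting (u, v).
    """
    theta = np.radians(np.asarray(bearings_deg, dtype=float))
    A = np.column_stack([np.cos(theta), np.sin(theta)])
    sol, *_ = np.linalg.lstsq(A, np.asarray(radial_speeds, dtype=float),
                              rcond=None)
    return sol[0], sol[1]

# A grid cell viewed from bearings 40° and 120°, with a true current
# of (u, v) = (0.30, 0.10) m/s:
theta1, theta2 = np.radians(40.0), np.radians(120.0)
r1 = 0.30 * np.cos(theta1) + 0.10 * np.sin(theta1)
r2 = 0.30 * np.cos(theta2) + 0.10 * np.sin(theta2)
u, v = radials_to_vector([40.0, 120.0], [r1, r2])
```

With only two stations the fit is exact; with three or more, the least-squares solution averages out inconsistencies between stations, which is precisely where residual anomalies can survive into the merged field.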
In terms of radar data quality control, researchers have adopted various methods. Traditional approaches mainly focus on the quality control of radial velocity data obtained from high-frequency radars. The accuracy of these data largely hinges on the quality of electromagnetic signals and inherent characteristics of radial velocities. Past research has concentrated on the following areas:
Signal-to-noise ratio control: Cosoli and colleagues [13] advocated for using the signal-to-noise ratio as a pivotal quality metric. They discovered that, when the ratio exceeds a certain threshold, data accuracy notably improves. Based on this, they devised an algorithm to filter and correct radar data according to the signal-to-noise ratio.
Spatial analysis: Roarty and his team [14] proposed a quality control method based on spatial characteristics. They examined the latitude, longitude, average radial direction, and speed of the radial data and used this information to assess data quality. Moreover, they compared the measured radial velocities with theoretical values, serving as a benchmark for quality control.
Parameter monitoring: Haines’s team [15] recognized that radars output parameters beyond just radial velocities. They utilized these parameters for quality control, reasoning that they provide added evidence for the quality of the radial velocity data.
Real-time monitoring: Lorente’s group [16] introduced a practical approach. They installed real-time monitoring equipment on buoys within the radar coverage area, enabling instant diagnostics of non-velocity parameters. This offered real-time quality feedback for radar data, assisting researchers in adjusting their quality control strategies.
Although these methods significantly improved the quality of radar data, anomalies might still emerge during the vector field synthesis. Thus, we present a new machine learning method specifically tailored for the quality control of these synthesized vector fields.
In this study, we employed machine learning techniques for the quality control of the synthesized radar ocean-current vector velocity data. We used the Bi-LSTM model to analyze the time-domain data and utilized its predictive residuals for anomaly detection. Compared to traditional methods, machine learning offers a more automated, efficient, and real-time solution. Conventional approaches often involve multiple steps, such as manual filtering, threshold setting, and geographical validation, which are not only time-consuming but may also introduce human errors. In contrast, deep learning models can automatically extract data features, substantially reducing the need for manual intervention and providing more consistent and reliable results. By applying machine learning for data anomaly detection, we demonstrated its potential in streamlining processes and enhancing data quality. The results of this research provide valuable guidance for improving radar data quality.
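The detection step behind this approach reduces to thresholding the absolute prediction residual. A minimal sketch in plain NumPy, with the Bi-LSTM’s predictions assumed given; `flag_anomalies` and the toy numbers are hypothetical, not code or data from this study:

```python
import numpy as np

def flag_anomalies(observed, predicted, threshold):
    """Flag time steps whose absolute prediction residual exceeds the
    threshold (m/s). Returns a boolean mask and the residual series."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residuals = np.abs(observed - predicted)
    return residuals > threshold, residuals

# A spike at index 2 stands out against accurate predictions elsewhere:
obs  = [0.10, 0.12, 0.95, 0.11]
pred = [0.11, 0.12, 0.13, 0.10]   # stand-in for Bi-LSTM output
mask, res = flag_anomalies(obs, pred, threshold=0.20)
```

The interesting design questions, addressed in the rest of this section, are how to choose the threshold and how to keep the predictions themselves clean when the input window contains anomalies.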
In this section, we have discussed the high-frequency radar technology and its applications and challenges in marine observations. In response to these challenges, we introduce a novel method utilizing machine learning for radar data quality control. In the sections that follow, we will describe in detail the data and methods used, then present our research outcomes, and wrap up with a summary of the entire study.
3. Results
As previously mentioned, the threshold is determined by comparing anomaly detection performance metrics on the test set. The test set consists of full time series from 15 different locations, totaling over 58,000 h. These data have been meticulously labeled to distinguish the various anomaly types; their respective quantities and categories are depicted in Figure 5.
We categorized anomalies into two main classes:
Single anomalies: These are isolated incidents where data at a specific time point deviate significantly from normal values, often due to momentary equipment faults or external interference.
Continuous anomalies: These anomalies manifest over multiple consecutive time points, possibly arising from equipment issues or external events. They can be further divided into:
Short-term: Spanning two consecutive time points, indicating brief yet noticeable disturbances.
Long-term: Persisting for more than two consecutive time points. These anomalies are less frequent, as they are more likely to have been removed in earlier quality control procedures.
As illustrated in Figure 5, subplot (a) displays the different types of anomalies within a specific time frame; the marked points provide a clear distinction between the anomaly classifications. Subplot (b) shows the frequency of each anomaly type. Single anomalies occurred 947 times, making them the most commonly observed type. Short-term continuous anomalies were noted at 194 distinct time points, while long-term continuous anomalies were observed across 102 different time points.
Despite the chosen locations being in areas with high data coverage, evident anomalies were observed in these core regions. This underscores the central role of anomaly detection in data analysis. Particularly in the edge regions probed by the radar station, which are low data coverage areas, the data not only face significant continuity issues but are also more susceptible to various interferences and synthetic problems, exacerbating the anomaly conditions. Hence, the precise detection and handling of these anomalies become particularly vital to ensure data quality and accuracy, laying a solid foundation for subsequent data analysis.
In the ensuing discussions, we will delve into the anomaly detection methods and results for each of these three anomaly types.
3.1. Single Anomaly
As illustrated by Figure 5b, single anomalies are the most prevalent type. In the actual operation of radar systems, the combined effects of various random error factors, such as short-term signal interference, make single anomalies comparatively frequent. Long-duration anomalies, by contrast, have more evident abnormal features and are typically filtered out before the radar ocean-current vector data are synthesized. Consequently, single anomalies are the primary focus of this study.
Figure 6a illustrates the input sequence structure of the prediction model. When the model attempts to predict a given time point, it considers data spanning six hours around that point. This design ensures that the model can fully capture the temporal context adjacent to the target point, thereby enhancing prediction accuracy. More detailed explanations regarding the choice of the input sequence length are provided in the sensitivity analysis experiment at the end of Section 3.
As shown in Figure 6b,c, the values in the residual time series at the timestamps corresponding to anomalies are significantly high, confirming our model’s anomaly detection capabilities. Nevertheless, it is worth mentioning that the normal data around these single-point anomalies also exhibit significant discrepancies in the residual series. One reason is that, when predicting these normal points, the model included the single-point anomalies in its input sequence, leading to pronounced discrepancies even in what should have been accurate predictions. With a lower threshold, these points can be mistakenly classified as anomalies, compromising overall precision.
Figure 6d depicts the relationship between precision and recall. The thresholds start from 0 and increase in increments of 0.01 m/s up to 0.5 m/s; each is applied to the residual time series generated by the model on the test set to obtain the variation curves of precision and recall, with the aim of determining the optimal threshold. Consistent with the discussion of Figure 6b,c, Figure 6e provides a broader perspective from the entire dataset, showcasing the misclassifications at lower thresholds: despite the high recall in this range, the precision is low. When the threshold exceeds 0.2 m/s, there is a notable decline in recall with only a slight increase in precision. Constrained by these factors, the F-score reaches its maximum value of 0.627.
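The threshold sweep just described can be sketched as follows, on a labeled toy residual series rather than the study’s test set; the helper name and data are illustrative assumptions:

```python
import numpy as np

def sweep_thresholds(residuals, labels, thresholds):
    """Return (threshold, precision, recall, f_score) at the maximum
    F-score over the candidate thresholds."""
    residuals = np.asarray(residuals, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best = (float(thresholds[0]), 0.0, 0.0, -1.0)
    for t in thresholds:
        flagged = residuals > t
        tp = int(np.sum(flagged & labels))
        precision = tp / max(int(np.sum(flagged)), 1)
        recall = tp / max(int(np.sum(labels)), 1)
        f = 2 * precision * recall / max(precision + recall, 1e-12)
        if f > best[3]:
            best = (float(t), precision, recall, f)
    return best

# Toy residual series with anomalies (label 1) at indices 1 and 3:
residuals = [0.02, 0.45, 0.03, 0.40, 0.01]
labels    = [0, 1, 0, 1, 0]
t_best, p, r, f = sweep_thresholds(residuals, labels,
                                   np.arange(0.0, 0.50, 0.01))
```

On real, contaminated residual series the precision–recall trade-off is far less clean than in this toy, which is exactly why the observed F-score peaks at 0.627 rather than near 1.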
These prediction biases were expected given the nature of the training data. During model training, only normal data were used as input. In real-world application, however, a significant amount of anomalous data enters the prediction model’s input sequences, degrading its ability to accurately predict regular data.
Addressing the aforementioned issue and enhancing the anomaly detection performance necessitates improvements to our model. It should be capable of discerning the most relevant input information while minimizing the influence of anomalous data on its outputs.
To enhance the model’s ability to recognize anomalous data and its robustness, we decided to draw inspiration from data augmentation techniques in deep learning. We adopted an innovative strategy: injecting simulated anomalous information into the normal data. This strategy aims to optimize the model’s generalization capabilities. Experience tells us that solely relying on the original training data may not be sufficient to achieve optimal results in real-world anomaly detection tasks. However, if the model can learn and adapt to simulated anomalous data during the training phase, its ability to recognize anomalies and stability in real scenarios will be significantly improved.
To simulate anomalous information, we opted for the Gaussian (normal) distribution, a common probability distribution in statistics that models the overall effect of many small random error factors in the real world. According to the Central Limit Theorem, the sum of such effects is approximately normally distributed, making the Gaussian distribution an ideal choice for our purposes.
During the anomaly injection process, we fine-tuned the Gaussian distribution’s relevant parameters to ensure that the produced anomalies clustered mainly within a range akin to observed real-world anomalies. Specifically, from 15 different locations, we randomly selected 10% of the data and introduced anomalies at the ‘t’ time point. To emulate various sudden changes potentially present in actual data, we then chose another distinct 10% data subset, introducing anomalies at either the ‘t − 1’ or ‘t + 1’ time points.
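A minimal sketch of this kind of injection is given below; the fraction, Gaussian parameters, and helper name are illustrative assumptions, not the tuned values used in the study:

```python
import numpy as np

def inject_gaussian_anomalies(series, frac=0.10, mean=0.0, std=0.30, seed=0):
    """Add Gaussian perturbations at a random `frac` of the time points.
    Returns the perturbed copy and a boolean mask of injected positions.
    (A second call on shifted indices can emulate the t-1 / t+1
    neighbour injection described in the text.)"""
    rng = np.random.default_rng(seed)
    out = np.asarray(series, dtype=float).copy()
    n = out.size
    idx = rng.choice(n, size=max(1, int(frac * n)), replace=False)
    out[idx] += rng.normal(mean, std, size=idx.size)
    mask = np.zeros(n, dtype=bool)
    mask[idx] = True
    return out, mask

clean = np.zeros(100)
noisy, mask = inject_gaussian_anomalies(clean)
```

Returning the mask alongside the perturbed series is what lets the injected points serve as labeled anomalies during training.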
Figure 7a,b compares the predictions of the original and improved models at two different time points. As observed from the charts, the improved model can effectively identify anomalous data by producing large residuals. Simultaneously, it excels at making accurate predictions for normal data, with almost negligible residuals.
Figure 7c presents the PR curves of both models. As illustrated in the chart, the PR curve of the improved model consistently envelops that of the original model, indicating that, at equivalent recall levels, the improved model achieves greater precision.
Figure 7d displays the precision and recall of the new model at different thresholds. Viewed over the entire dataset, this further validates the observations made from Figure 7a,b: the model significantly reduced residuals for normal data even in the presence of anomalous data, which is reflected in the graph by the model retaining high precision even at lower thresholds.
All sub-figures in Figure 7 validate the efficacy of our model improvements. The F-score of the new model has increased from 0.63 to 0.79, an improvement of roughly 16 percentage points over the old model. By injecting simulated anomalous data into the training set, we have enhanced not only the model’s generalization capabilities but also its robustness against anomalies.
3.2. Continuous Anomalies
Continuous anomalies represent another focal area of our study. Compared to single-point anomalies, they present a greater detection challenge due to their persistent nature in the time series. As shown in Figure 8a,b, when the predictive model encounters continuous anomalies, it struggles to generate predictions that significantly deviate from the anomalous data. This observation can be explained in two ways. Firstly, while the Bi-LSTM model excels at capturing long-term dependencies in time series data, when faced with consecutive anomalous data points it may over-rely on previous “memories” to interpret the current point, reducing its sensitivity to continuous anomalies. Secondly, although the bidirectional nature of the Bi-LSTM allows it to capture both past and future context in an input sequence, this mechanism may lack sufficient discriminative power when a large number of consecutive anomalies is present.
Figure 8c compares the recall rates for continuous and single-point anomalies at different thresholds. As the threshold increases, the recall rate for single-point anomalies decreases slowly, while that for continuous anomalies drops rapidly. Combined with the specific examples in Figure 8a,b, it becomes clear that the model, when predicting certain continuous anomalies, cannot generate residuals large enough to distinguish them from normal data.
Figure 8d shows that, even when the model’s F-score is at its peak, the recall rate for continuous anomalies remains significantly lower than that for single-point anomalies. The findings from all sub-figures in Figure 8 emphasize the challenges that continuous anomalies pose to the predictive model.
In summary, while the model’s performance in detecting single-point anomalies is commendable, there is room for optimization in the detection of continuous anomalies. Thus, the next steps in research should focus on further enhancing the model’s performance in this area.
This iterative process presents a practical solution for addressing continuous anomalies, effectively transforming them into a sequence of single-point anomaly detections. This approach streamlines the detection process and has the potential to enhance the model’s performance in handling continuous anomalies.
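One way such an iterative scheme can be sketched is shown below, with a simple neighbour-mean predictor standing in for the trained Bi-LSTM; all names, the toy predictor, and the toy data are illustrative assumptions:

```python
import numpy as np

def neighbour_mean(x):
    """Toy stand-in for the Bi-LSTM: predict each interior point as the
    mean of its two neighbours; endpoints predict themselves."""
    pred = x.copy()
    pred[1:-1] = 0.5 * (x[:-2] + x[2:])
    return pred

def iterative_detect(series, predict_fn, threshold, n_iter=5):
    """Iteratively flag points whose residual exceeds `threshold` and
    replace them with their predictions, so that later passes see a
    cleaner context around a run of consecutive anomalies."""
    series = np.asarray(series, dtype=float).copy()
    flagged = np.zeros(series.size, dtype=bool)
    for _ in range(n_iter):
        pred = predict_fn(series)
        bad = np.abs(series - pred) > threshold
        if not bad.any():
            break
        flagged |= bad
        series[bad] = pred[bad]
    return flagged, series

# Two consecutive spikes embedded in an otherwise flat record:
x = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 0.0])
flags, cleaned = iterative_detect(x, neighbour_mean, threshold=0.3)
```

Note that with this symmetric toy predictor the immediate neighbours of the run also exceed the threshold on the first pass, the same contamination effect observed for single anomalies above, which is one motivation for restricting replacement to a high-precision threshold.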
To ensure rationality during the iterative forecasting process, the data being replaced should consist of as many genuinely anomalous points as possible; the selected threshold should therefore guarantee a high level of precision. In this study, a threshold corresponding to a precision of 90% or higher was used. As shown in Figure 7d, the minimum such threshold is 0.3 m/s. The threshold values therefore range from 0.3 to 0.5 m/s in increments of 0.01 m/s, and the number of iterations ranges from one to five. The specific results are illustrated in Figure 9.
Figure 9a clearly illustrates how the iterative prediction method significantly enhances the efficiency of the anomaly detection. The trend in F-score variation is influenced by several factors, including the type and quantity of anomalous data, the selected threshold, and the number of iterations.
As shown in Figure 5b, anomalies with a length of 2 predominate among the continuous anomalies. Figure 9b,c further emphasizes the strong performance of the iterative prediction model in addressing consecutive anomalies with lengths of 2 and 3. Taken together, even though the model encounters challenges when detecting longer consecutive anomalies, owing to constraints related to the input sequence length, its performance on consecutive anomalies of lengths 2 and 3 remains evident in Figure 9d,e.
In Figure 9d, the overall curve shifts upward and to the right, indicating an improvement over the previous version of the model. In Figure 9e, it can be observed that iterative prediction effectively enhances the model’s anomaly detection performance, with improved recall rates for both single-point and consecutive anomalies.
3.3. Sensitivity Study
In the concluding part of this section, we will examine variations in the model’s anomaly detection performance under different input sequence lengths, considering two key aspects.
Firstly, as discussed in Section 2 regarding missing data imputation, we only impute missing data when the gap between sequences spans fewer than six standard time intervals. However, when processing radar sea-current data, we face a particularly severe problem: a significant amount of data is missing, especially in the peripheral regions covered by the radar (as depicted in Figure 1). Even after applying missing value imputation, the data inevitably become fragmented into numerous time series segments. Because the Bi-LSTM model requires a certain amount of data both before and after the point being predicted, the data at the beginning and end of each segment are undetectable. Consequently, as the input sequence length increases, the amount of data genuinely available for detection diminishes. This leaves certain anomalous data undetected, producing the ‘escape’ phenomenon illustrated in Figure 10a,b.
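The short-gap imputation rule can be sketched as follows. Linear interpolation and reading “fewer than six standard time intervals” as fewer than `max_gap` consecutive missing samples are illustrative assumptions here; the actual scheme is the one specified in Section 2:

```python
import numpy as np

def impute_short_gaps(v, max_gap=6):
    """Linearly interpolate interior NaN runs shorter than `max_gap`
    samples; longer runs are left as NaN, splitting the record into
    separate segments for the Bi-LSTM."""
    v = np.asarray(v, dtype=float).copy()
    isnan = np.isnan(v)
    i, n = 0, v.size
    while i < n:
        if isnan[i]:
            j = i
            while j < n and isnan[j]:
                j += 1
            # interior run of length (j - i); fill only short ones
            if 0 < i and j < n and (j - i) < max_gap:
                v[i:j] = np.linspace(v[i - 1], v[j], j - i + 2)[1:-1]
            i = j
        else:
            i += 1
    return v

gap = np.nan
series = [0.0, gap, gap, 3.0, gap, gap, gap, gap, gap, gap, gap, 4.0]
filled = impute_short_gaps(series)
```

The long run left as NaN is what fragments the record: every surviving segment loses its first and last few points to the Bi-LSTM’s context requirement, which is the source of the ‘escape’ phenomenon.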
In the second aspect, we compared the model’s anomaly detection metrics under different input sequence lengths. To account for shorter input sequence lengths, we introduced anomalies only at the points under examination, as opposed to the method described in Section 3.1, where anomalies were added both at the points under examination and at their neighboring points. The specific results are shown in Figure 10c.
In an overall assessment, longer input sequences lead to the presence of more undetectable data. As the amount of detectable data diminishes, some anomalous data can slip through the detection process, thereby undermining the overall quality control effectiveness. Conversely, overly short input sequence lengths make it challenging for the model to address consecutive anomalies, resulting in suboptimal quality control. Considering these factors, we chose an input sequence length of 7 to strike a balance.
4. Conclusions
Compared to earlier research on the quality control of radar data, the deep learning approach employed in our study focuses on the synthesized vector sea-current fields rather than on radial velocities. This allows a more effective integration with previous quality control efforts on radial velocities: not only can we address anomalies introduced by the synthesis algorithms, but we can also rectify anomalies that persisted due to incomplete or suboptimal earlier quality control. Moreover, our method does not rely on radar echo-related signals for quality assessment. Instead, it capitalizes on the continuity of the time series and the strengths of deep learning, significantly simplifying the quality control process and improving data quality.
In Section 2, we provided a comprehensive introduction to the Bi-LSTM neural network model, clarified the data preprocessing steps, explained our reasons for choosing specific anomaly detection metrics, and detailed the architecture of the anomaly detection system. In Section 3, we began by discussing the various types of anomalies commonly found in radar sea-current data and their respective distribution percentages. Subsequently, we optimized the model’s input and detection process for the different anomaly types. At the end of Section 3, we conducted a sensitivity analysis experiment designed to elucidate the effects of input sequence length on model performance.
In our study, the quantitative results, as detailed in Figure 9d,e, display the final PR curve of the model and the F-score, which stands as the core evaluation metric in our anomaly detection research. For datasets like high-frequency radar sea-current data, imbalance is prevalent: anomalies (or ‘positive instances’) are significantly outnumbered by normal observations. Such imbalanced datasets, characterized by large data volumes and high complexity, can pose challenges to conventional machine learning algorithms. In this context, our proposed method achieved an F-score of 0.814, signifying strong performance in both precision and recall and thus effectively tackling the anomalies.
Our method has provided substantial assistance for the quality control of radar and offered insights into potential areas of exploration. However, there remains potential for further optimization in certain areas. In subsequent research, we aim to enhance the model’s sensitivity to specific anomaly patterns and consider increasing the model’s complexity to capture more information.