1. Introduction
Wave observation is an important aspect of physical oceanography and is crucial for marine forecasting, disaster prevention and mitigation, ocean engineering, and maritime safety. Conducting observational research on waves therefore has significant scientific and practical value. The parameters observed for nearshore waves primarily include the mean wave height, wave direction, wave period, and significant wave height (SWH), with SWH being particularly important. SWH represents the average height of the highest one-third of waves within a given period. Current wave observation methods include manual observation, instrument measurement, and remote sensing inversion [
1]. At present, wave observation technology primarily relies on instrument measurement, with buoys and wave staffs being widely used for nearshore observations. Wave staff observation has advantages such as a simple structure and quick response, but it is difficult to apply in open sea areas and requires frequent maintenance. Buoy observation is easier to maintain but is prone to issues such as anchor dragging in harsh sea conditions. In recent years, the application of remote sensing radar in wave observation has been increasing. X-band radar, HF radar, lidar, and synthetic aperture radar (SAR) have significantly expanded the observable regions and achieve promising inversion accuracy. Zhu et al. [
2] used wave mode data from the GF-3 SAR combined with existing model algorithms to invert SWH; when compared to reanalysis data from the European Centre for Medium-Range Weather Forecasts (ECMWF), the inverted SWH achieved a root mean square error (RMSE) of 0.57 m. Klotz et al. [
3] employed the Ice, Cloud, and land Elevation Satellite-2 (ICESat-2) land surface algorithm to determine wave and wind characteristics, with the inverted SWH yielding an RMSE of 0.3 m compared to ERA5 reanalysis data. Liu et al. [
4] utilized ensemble empirical mode decomposition (EEMD) with X-band radar sea surface images to estimate SWH, achieving an RMSE of 0.36 m against buoy observations. Similarly, Zhao et al. [
5] employed high-frequency radar data to obtain wave information, where the RMSE between the inverted SWH and buoy data was found to be 0.29 m. However, satellite and radar observations often involve high costs, and their observational accuracy is limited by spatial resolution. Additionally, research on using binocular vision technology to invert wave surface information from wave images captured by binocular cameras has made some progress. However, this technique requires a complex calibration process to ensure inversion accuracy, and thus most research remains in the experimental phase [
6,
7,
8,
9]. With the development of deep learning in the field of computer vision, research that integrates video imagery with various deep learning techniques for image recognition and object detection has emerged across multiple domains, demonstrating the advantages of deep learning methods in terms of accuracy and computational efficiency [
10,
11]. Currently, research on inverting sea conditions using various types of wave images has made some progress. Compared to previous measurement methods, acquiring nearshore wave video images is more cost-effective, and when combined with artificial intelligence technology, it can produce more accurate and timely inversion results.
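As a concrete illustration of the SWH definition given above (the mean of the highest one-third of individual waves), the following minimal sketch computes SWH from a hypothetical wave-height record; the values and function name are illustrative only:

```python
import numpy as np

def significant_wave_height(heights):
    """SWH: mean of the highest one-third of individual wave heights."""
    h = np.sort(np.asarray(heights, dtype=float))[::-1]  # sort descending
    n_third = max(1, len(h) // 3)                        # highest one-third
    return h[:n_third].mean()

# Hypothetical record of individual wave heights (m)
record = [0.8, 1.2, 0.5, 2.1, 1.7, 0.9, 1.4, 0.6, 1.1]
print(round(significant_wave_height(record), 2))  # → 1.73
```

Here the three largest waves (2.1, 1.7, 1.4 m) are averaged, giving roughly 1.73 m.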
With the rapid development of computers and the continuous improvement in computational power, deep learning has gradually gained widespread attention. Deep neural networks are a significant branch of deep learning, enhancing the ability to extract features from targets by constructing more complex network structures. Compared to traditional neural networks, deep neural networks utilize multiple hidden layers to perform nonlinear transformations, making them more capable of handling complex environments and problems. Convolutional neural networks (CNNs) [
12,
13] are a type of deep learning model commonly used for image and video processing. CNNs have a better capability for handling image and sequential data because they can automatically learn features from images and extract the most useful information. Krizhevsky et al. [
14] enhanced the basic structure of a CNN by deepening the hierarchical structure and using the nonlinear activation function ReLU along with the Dropout method, resulting in the AlexNet model. The success of AlexNet greatly propelled the development of CNNs, encouraging other researchers to propose models like VGGNet [
15] and GoogLeNet [
16] to tackle more complex image recognition problems. Traditional convolutional or fully connected networks often suffer from information loss and degradation during transmission, as well as vanishing or exploding gradients, which make training very deep networks challenging. To address the issue where training accuracy saturates and then rapidly degrades as the network depth increases, He et al. [
17] introduced the deep residual network (ResNet). ResNet alleviates this problem by passing input information directly to the output through shortcut connections, preserving information integrity; the network then only needs to learn the residual between the input and output, which simplifies the learning target and reduces the training difficulty. Depending on the number of network layers, ResNet can be divided into ResNet18, ResNet34, ResNet50, ResNet101, and ResNet152. This study explores the capability of various deep learning models in extracting features from wave video images, including AlexNet, VGGNet, MobileNet [
18], DenseNet [
19], and ResNet. Additionally, the inclusion of attention mechanisms can enhance the ability of CNNs to capture and express important features. Attention mechanisms simulate the human brain’s focus on specific regions at particular moments, selectively acquiring more useful information while ignoring irrelevant data [
20]. The principle of attention mechanisms can thus be understood as the model focusing on the input information that is important for a specific task while ignoring less important information; performance is improved by weighting different parts of the input to adjust the model's focus accordingly. Currently, common attention mechanism modules include the SE module [
21], ECA module [
22], and CBAM module [
23]. Among these, the SE module and ECA module are channel attention mechanisms, while the CBAM module combines both channel and spatial attention mechanisms. Channel attention mechanisms focus on which channel features are more meaningful, whereas spatial attention mechanisms focus on which spatial parts are more significant [
24].
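To make the channel attention idea concrete, the following NumPy sketch implements an SE-style squeeze-and-excitation step on a single feature map. The shapes, reduction ratio, and random weights are illustrative assumptions, not the configuration of the modules used in this study:

```python
import numpy as np

rng = np.random.default_rng(0)

def se_attention(feature_map, w1, w2):
    """SE-style channel attention on a (C, H, W) feature map.
    Squeeze: global average pooling per channel.
    Excitation: two small dense layers producing per-channel weights in (0, 1).
    """
    squeeze = feature_map.mean(axis=(1, 2))        # (C,) channel descriptor
    hidden = np.maximum(0.0, w1 @ squeeze)         # ReLU bottleneck, (C//r,)
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid gate, (C,)
    return feature_map * scale[:, None, None]      # reweight each channel

C, r = 8, 2                                        # channels, reduction ratio
x = rng.standard_normal((C, 16, 16))
w1 = rng.standard_normal((C // r, C)) * 0.1        # illustrative weights
w2 = rng.standard_normal((C, C // r)) * 0.1
y = se_attention(x, w1, w2)
print(y.shape)  # → (8, 16, 16)
```

Because the sigmoid gate lies in (0, 1), each channel is attenuated according to its learned importance while the spatial layout is left unchanged, which is the essence of a channel attention mechanism.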
Against the backdrop of the rapid development and application of deep learning and machine learning, deep learning algorithms have already been applied across various domains within oceanography. In wave forecasting, numerous studies have emerged that utilize CNNs, long short-term memory (LSTM) networks, and various optimized models for long-term prediction of wave parameters [
25,
26,
27,
28]. Additionally, there has been considerable research on predicting and inverting factors such as sea ice concentration and sea surface temperature using GPR data [
29], infrared remote sensing images [
30], and imaging data based on MWRI and SSMI [
31,
32].
In recent years, video images combined with deep learning and computer vision methods have shown great potential in wave information recognition and monitoring. Andriolo et al. [
33] developed two new methods for estimating breaking wave height based on Timex images. One method provides accurate breaking wave height estimates by integrating a series of video-derived parameters and the beach profile data, while the other does not require local water depth data, demonstrating cost-effectiveness and applicability across a wide range of field conditions. Scardino et al. [
34] proposed a monitoring technique that combines CNN and Optical Flow technology to assess tidal and storm parameters from video recordings. The results indicate that the system achieves good accuracy in wave flow and height monitoring in a low-cost manner. Additionally, Valentini et al. [
35] further demonstrated the application of a low-cost video surveillance system for wave monitoring. By combining a CNN with superpixel segmentation, they achieved precise classification of coastal images, showcasing its practicality in environmental monitoring. In another study, Andriolo [
36] proposed a technique for automatically extracting nearshore wave transformation domains from images captured by coastal video monitoring stations, demonstrating the method’s reliability and providing strong support for nearshore hydrodynamics and sediment transport research. In addition, research using video images as a data source has also been applied to visibility inversion and prediction. Hu et al. [
37] proposed a cloud image retrieval method for sea fog recognition using a dual-branch residual neural network, effectively identifying sea fog and low-level clouds. Additionally, several studies have used image processing and deep learning techniques to detect sea fog from various image data sources, including video images captured by cameras and multispectral images [
38,
39,
40].
Utilizing deep learning algorithms and wave images or videos for wave parameter inversion has emerged as a novel approach in recent years and has been employed by numerous researchers to enhance the inversion accuracy of various wave parameters. Xue et al. [
41] proposed a method for SWH inversion based on CNNs and Sentinel-1 SAR data, constructing an inversion dataset with over 3000 images. When compared with buoy measurements, the method achieved an RMSE of 0.32 m for the inverted SWH. Liu et al. [
42] utilized LeNet to classify waves using sea clutter data, demonstrating excellent wave height inversion capability on high-quality datasets. The method achieved an average accuracy of over 93% on the experimental dataset, and its generalization ability was validated using data from different periods and sea areas. However, this study lacked inversion results for actual wave height values. Choi et al. [
43] employed deep learning techniques to estimate significant wave height from a single ocean image, achieving an accuracy of 84% with their proposed classification model. Additionally, they introduced a regression model based on a convolutional long short-term memory (ConvLSTM) network to estimate continuous significant wave height from a sequence of ocean images, achieving a mean squared error of 0.02 m on their proposed dataset. Song et al. [
44] utilized images extracted from nearshore wave monitoring videos to capture both the static and dynamic features of waves. Two independent Network In Network (NIN) networks were constructed to learn the spatial and temporal characteristics of the waves, and the two types of features were fused using a central network to determine the SWH. The results achieved a relative error of 6.4% ± 4.9%, and the authors pointed out that this method can well meet the operational requirements for nearshore wave forecasting. It is therefore necessary to evaluate the performance of any proposed method against operational standards. In this study, the performance of the model was assessed based on China’s nearshore observation standards [
45]. The standard stipulates that the absolute error between instrument measurements and actual observed values should be maintained within 15% of the actual observed values. Gal et al. [
46] proposed a method to estimate the height of breaking waves within the breaking zones from video recordings, which in turn estimates nearshore wave height, achieving results comparable to buoy observations. Sun et al. [
47] conducted airborne interferometric radar altimeter experiments in the Yellow Sea; they used a mean filtering algorithm to invert sea surface height and its wave spectrum, achieving good results for swell but facing limitations in the application to wind waves. Jinah et al. [
48] used various deep learning methods and coastal video images to track wave propagation. Learning the behavior of transformed and propagated waves in the surf zone, they successfully estimated the instantaneous wave speed of each crest and breaking wave in the video domain. Yun-Ho et al. [
49] combined CNNs and LSTMs to classify sea state and average wave height from monocular ocean videos. These video-based learning methods achieved classification accuracies exceeding 93% but require substantial training time. To more comprehensively learn the spatial and temporal information of waves, some research proposed using three-dimensional convolutional networks for wave parameter monitoring, achieving good accuracy [
50]. Nonetheless, these approaches are still limited to using wave images alone to represent wave information. Overall, research on using nearshore video images for wave inversion is still in its early stages and mostly focuses on wave classification; there is relatively little research on inverting SWH from a regression perspective. Analysis of the current research status reveals that many methods in this field remain to be explored and improved.
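For reference, the operational criterion cited above (an absolute error within 15% of the observed value) reduces to a simple compliance-rate check. The sketch below uses hypothetical SWH samples; the function and variable names are our own illustrative choices, not taken from the standard:

```python
def compliance_rate(inverted, observed, tol=0.15):
    """Fraction of samples whose absolute error is within tol * observed."""
    pairs = list(zip(inverted, observed))
    ok = sum(abs(p - o) <= tol * o for p, o in pairs)
    return ok / len(pairs)

# Hypothetical SWH samples (m): inverted vs. buoy-observed
inv = [1.10, 0.95, 2.40, 0.60]
obs = [1.00, 1.00, 2.00, 0.50]
print(compliance_rate(inv, obs))  # → 0.5
```

In this toy batch the first two samples fall within 15% of the observed value and the last two do not, so half of the samples would meet the operational requirement.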
In summary, there is currently limited research on using deep learning algorithms and wave video images to achieve SWH inversion. In this study, a deep learning classification method for distinguishing wind waves and swell in nearshore instantaneous wave video images was first proposed. Subsequently, a deep learning regression method based on CNNs and a multilayer perceptron (MLP) was introduced to invert SWH from instantaneous wave video images together with meteorological and oceanographic observation data. Additionally, specialized models for wind waves and swell were trained on the wind wave and swell inversion datasets, respectively, and independent inversion was performed on the wind wave and swell samples in the test set. Finally, the impact of an improved loss function on the inversion accuracy was discussed. This paper is organized as follows:
Section 2 describes the materials and methods;
Section 3 presents the experimental results of wind wave and swell classification and SWH inversion, and discusses the impact of various factors on the inversion accuracy; and
Section 4 provides the main conclusions.
4. Conclusions
This study explored the feasibility of using deep learning techniques to extract wave features from instantaneous wave video images. First, various mainstream CNN models were compared, revealing that the ResNet architecture has a stronger ability to extract wave features from instantaneous wave images. Furthermore, the SE attention mechanism was incorporated into ResNet50, and, inspired by the design of the Swin Transformer network structure, improvements were made to the ResNet architecture, leading to the development of the ResNet-SW model, which is specifically designed to identify wave types from wave images. The results show that ResNet-SW can accurately classify wind waves and swell, even in the presence of a small amount of noise and label configuration bias, achieving a classification accuracy of 94.61%. Building on this, the effectiveness of using the ResNet-SW network structure to invert SWH from instantaneous wave video images was examined. The results showed that the method is effective for SWH inversion, but because it relies on the image quality of instantaneous wave images to represent wave information, the model’s stability is poor. To address these limitations, various meteorological and oceanographic factors were introduced to jointly represent wave information, and the Inversion-Net algorithm, which combines CNNs and MLP, was developed. Feature selection was conducted before model training to analyze the impact of various factors on the inversion process and to select suitable input features. The SWH inversion results obtained using this method showed an RMSE of 0.11 m compared to buoy observations. To further enhance the accuracy and stability of the inversion results, specialized models for wind waves and swell were trained on the wind wave and swell inversion datasets, respectively.
The results showed that this method effectively increased the proportion of samples meeting operational observation standards, with a CR reaching 84.07%. Additionally, we attempted to enhance the stability of the model by adding conditional constraints. The SWH observations from the nearshore and offshore buoys were fitted to derive a linear relationship, which was then embedded into the loss function. The results showed that this approach significantly improved the model’s stability and inversion accuracy. Finally, the model’s performance was evaluated over an entire wave process. In the application phase, the wind wave and swell classification models were first used to determine the wave type and assess whether the test samples met the constraints proposed in this study. Subsequently, the samples were input into the corresponding inversion models, and time series inversion results with a 1-hour resolution were synthesized from the outputs of multiple models. The results demonstrate that the method not only meets operational observation requirements but also maintains a low error margin.
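As an illustration of how such a fitted linear relationship might be embedded in a loss function, the following sketch adds a penalty term to the MSE. The functional form, coefficients, and weighting factor are our own assumptions for exposition and not the exact formulation used in this study:

```python
import numpy as np

def constrained_mse(pred, target, nearshore_ref, a, b, lam=0.1):
    """MSE plus a penalty tying predictions to a fitted linear relation
    offshore_SWH ~ a * nearshore_SWH + b (names and lam are illustrative)."""
    mse = np.mean((pred - target) ** 2)                       # data-fit term
    penalty = np.mean((pred - (a * nearshore_ref + b)) ** 2)  # constraint term
    return mse + lam * penalty

# Hypothetical fitted coefficients and a toy batch of SWH values (m)
a, b = 0.9, 0.05
pred = np.array([1.00, 1.60])
target = np.array([0.95, 1.70])
near = np.array([1.05, 1.80])
loss = constrained_mse(pred, target, near, a, b)
print(loss >= 0.0)  # → True
```

The penalty discourages predictions that stray far from the linear relation derived from the buoy pair, which is one plausible way a conditional constraint of this kind can stabilize training.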
In summary, deep learning techniques can effectively extract wave characteristics from instantaneous wave video images and use them for wave type classification and SWH inversion. Additionally, integrating multiple factors and improving the training process can significantly enhance the inversion accuracy and model stability. Nevertheless, the current study has certain limitations, and these shortcomings point to directions for future research.