Prediction of the Occurrence Probability of Freak Waves in Unidirectional Sea State Using Deep Learning

Zhou, Binzhen; Wang, Jiahao; Ding, Kanglixi; Wang, Lei; Liu, Yingyi

doi:10.3390/jmse11122296

by Binzhen Zhou ¹, Jiahao Wang ¹, Kanglixi Ding ¹, Lei Wang ¹,* and Yingyi Liu ²

¹ School of Civil Engineering and Transportation, South China University of Technology, Guangzhou 510641, China

² Research Institute for Applied Mechanics, Kyushu University, Fukuoka 812-8581, Japan

* Author to whom correspondence should be addressed.

J. Mar. Sci. Eng. 2023, 11(12), 2296; https://doi.org/10.3390/jmse11122296

Submission received: 9 November 2023 / Revised: 28 November 2023 / Accepted: 1 December 2023 / Published: 3 December 2023

(This article belongs to the Section Ocean Engineering)

Abstract

Predicting extreme waves helps characterize the hydrodynamic environment of marine engineering, which is critical for avoiding disaster risks. Until now, there have been barely any models that can rapidly and accurately predict the occurrence probability of freak waves in a given sea state. This paper develops a trained model based on the Back Propagation (BP) neural network, with wave parameters of the unidirectional sea state fed into the model, such as significant wave height, wave period, spectral type, and the intermodal distance of the peak frequencies. A rapid and accurate model optimized for predicting the occurrence probability of freak waves in a unidirectional sea state, from unimodal to bimodal configuration, is achieved by iterating to reduce accumulated errors. Compared to the regression and least-squares boosting trees, the optimized model performs much better in accurately predicting the occurrence probability of freak waves. In both unimodal and bimodal sea states, the optimized model is more accurate than theoretical models such as the Rayleigh prediction and the MER prediction, with the maximum error reduced by at least 41%. The established model based on the BP neural network can quickly predict the threshold of freak waves in a given sea state, guiding practical engineering applications.

Keywords:

occurrence probability of freak waves; unidirectional sea state; high-order spectral method; BP neural network; wave height distribution

1. Introduction

With the vigorous development of renewable energy, the demand for marine environmental safety is increasingly urgent. Freak waves [1], whose wave height is at least twice the significant wave height [2], pose a severe threat to the safety of offshore structures, ships, and personnel because of their concentrated energy [3]. Recently, freak waves have been regarded as one of the most catastrophic wave types in real sea states and are of widespread concern in academia and industry [4,5,6,7]. Therefore, predicting the probability of freak waves is of great significance for avoiding the risk of extreme sea conditions and ensuring the safety of marine engineering.

Scholars have conducted extensive research on the statistical characteristics of freak waves in unidirectional waves to estimate the wave load threshold in the traditional approach. Assuming a random wave train is idealized as a stationary, Gaussian, and highly narrow-banded process, the wave height distribution obeys the Rayleigh distribution [8]. However, real sea waves are usually nonlinear and broad-banded, making the linear distribution inapplicable, especially for large waves [9]. Analyses of field data indicated that nonlinearity plays a dominant role in large wave formation [10,11]. Consequently, kurtosis, reflecting the third-order nonlinear interactions, was pointed out to be an important measure of the tail of the exceedance probability of wave height [12]. Introducing kurtosis into the distribution function as a correction to the Rayleigh distribution, Mori and Janssen [13] developed the modified Edgeworth–Rayleigh (MER) distribution under the assumptions of weak nonlinearity, a narrow spectrum, and a crest-to-trough height equal to twice the wave amplitude. Although the MER distribution predicts the wave height distribution better than the Rayleigh distribution, it underestimates the statistical wave height distribution when H/Hs > 2.0, which is recognized as the criterion for freak waves [14]. In addition to theoretical predictions, traditional numerical methods are also available, such as the commonly used high-order spectral method [15]. Taking full nonlinearity into account, the numerically simulated results are closer to real values, but they take a long time to simulate and process [16,17]. None of the above methods can predict the occurrence probability of freak waves both quickly and accurately for a given application.

In the past few years, machine learning has witnessed a resurgence with groundbreaking successes across a wide range of ocean engineering assessment methods [18,19,20,21], such as support vector machines (SVM) [22], support vector regression (SVR) [23], recurrent neural networks (RNN) [24], wavelet neural networks (WNN) [25], long short-term memory (LSTM) [26,27,28], the regression tree [29,30], the least-squares boosting tree (LSBoost) [31], and so on. Among these, the last two algorithms have proved competitive for wave prediction. Mahjoobi et al. [22] used the regression tree for wave height prediction and found that the error statistics of the predicted results were within the allowable range. Tang et al. [32] explored the maximum wave crest distribution using random forest and pointed out that its accuracy was higher than that of traditional numerical models. Although the regression tree can deal with discrete and continuous variables, places few requirements on the data, and is only mildly sensitive to outliers, it is prone to overfitting, especially for high-dimensional data [29]. Later, Zhang et al. [33] predicted the elastic modulus of compressive strength of standard high-strength concrete by the LSBoost method, with better prediction performance compared to traditional empirical equations and statistical regression methods. Furthermore, the Back Propagation (BP) neural network was successfully adopted to predict short-term effective wave height [34], demonstrating its nonlinear mapping ability, self-learning ability, and generalization ability [35]. These advantages should help the BP neural network better predict the wave height distribution, which has not been attempted in previous studies.

Based on the above, the existing theoretical models, including the Rayleigh linear distribution and the MER weakly nonlinear distribution, and the traditional statistics based on fully nonlinear numerical simulation do not achieve both high efficiency and high accuracy. The performance of a model using the BP neural network, which is better at nonlinear mapping, is as yet unclear. Therefore, the present work aims to establish an empirical model with a BP neural network algorithm to rapidly and accurately predict the occurrence probability of freak waves. The novelties are two-fold. First, the BP neural network is applied to predict freak wave occurrence under various sea conditions, from unimodal to bimodal configurations. Second, the internal parameters are further optimized to improve the accuracy in different sea conditions, thus yielding a more universal empirical prediction model.

The rest of the paper is arranged as follows. Section 2 briefly introduces the existing methods, such as the Rayleigh distribution, MER distribution, regression tree, least-squares boosting tree, and BP neural network. Section 3 describes the routine of dataset construction based on a mathematical model. Section 4 presents the optimization of the trained model and compares its predictions with those of the other learning models and the theoretical models. Section 5 summarizes the main conclusions.

2. Existing Prediction Model

Forecasting the occurrence probability of freak waves in a given sea state is critical to engineering practices, such as oceanic equipment design and marine energy development. The models currently available for predicting this probability can be classified into theoretical models, such as the Rayleigh linear distribution and the MER weakly nonlinear distribution, and machine learning algorithms. These candidates are introduced in this section. The regression tree is described in Appendix A and LSBoost in Appendix B.

2.1. Rayleigh Distribution

Under the assumption of a stationary, Gaussian, and highly narrow-banded process, the wave height regarded as twice the envelope amplitude obeys the Rayleigh distribution [8], and the exceedance probability of wave height is described as follows:

E (H) = \exp [- \frac{H^{2}}{8 m_{0}}]

(1)

where m₀ = (H_s/4)² = H_s²/16 is the zero-order spectral moment, following the standard relation H_s = 4√m₀.
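As a quick numerical check of Eq. (1), the sketch below evaluates the Rayleigh exceedance probability at the freak-wave threshold H = 2H_s. It assumes the standard relation H_s = 4√m₀; the function name and the sample wave height are illustrative and are not taken from the original study.

```python
import math


def rayleigh_exceedance(h: float, hs: float) -> float:
    """Exceedance probability E(H) = exp(-H^2 / (8 m0)) from Eq. (1)."""
    m0 = (hs / 4.0) ** 2  # assumption: Hs = 4 * sqrt(m0)
    return math.exp(-h ** 2 / (8.0 * m0))


hs = 0.1  # illustrative significant wave height in metres
p_freak = rayleigh_exceedance(2.0 * hs, hs)
print(f"Rayleigh P(H >= 2 Hs) = {p_freak:.3e}")
```

Under this assumption the result equals exp(−8) ≈ 3.35 × 10⁻⁴ for any H_s, which is why the Rayleigh model cannot distinguish between sea states of different nonlinearity or bandwidth.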

2.2. MER Distribution

Introducing weak nonlinearity, Mori and Janssen proposed a Modified Edgeworth–Rayleigh (MER) distribution [13], in which the wave height is also supposed to be twice the wave amplitude in narrow-banded wave trains, expressed as follows:

E (H) = \exp (- \frac{H^{2}}{8}) [1 + κ_{40} \frac{H^{2}}{384} (H^{2} - 16)]

(2)

κ_{40} = \mathrm{kurtosis} - 3 = \frac{π}{\sqrt{3}} {BFI}^{2}

(3)

where BFI denotes the deep-water Benjamin–Feir Index defined by Onorato et al. [12]:

BFI = \frac{\sqrt{2} ε}{2 Δ f / f_{p}}

(4)

in which ε = k_pH_s/2 is the wave steepness. The relative spectral bandwidth Δf/f_p (Δf is the half-width at half-spectrum height, and f_p is the peak frequency of irregular wave trains) refers to the definition introduced in Ref. [14].

Concerning the bimodal spectrum, there are highly complex interactions between high-frequency and low-frequency components that impede a clear definition of the B-F index. In this study, the BFI in the bimodal sea state is represented by that of the corresponding unimodal sea state under the assumption of equivalent energy.
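The chain from Eq. (4) to Eq. (2) can be sketched as follows. The code takes Eq. (4) exactly as written above and assumes that H in Eq. (2) is normalized by √m₀, so that the freak-wave threshold H = 2H_s corresponds to a normalized height of 8. The steepness and bandwidth values are illustrative only.

```python
import math


def bfi(steepness: float, rel_bandwidth: float) -> float:
    """Benjamin-Feir Index as written in Eq. (4)."""
    return math.sqrt(2.0) * steepness / (2.0 * rel_bandwidth)


def kappa40(bfi_value: float) -> float:
    """Excess kurtosis kappa_40 from Eq. (3)."""
    return math.pi / math.sqrt(3.0) * bfi_value ** 2


def mer_exceedance(h_norm: float, k40: float) -> float:
    """MER exceedance probability, Eq. (2).

    h_norm is the wave height divided by sqrt(m0) (assumed normalization).
    """
    h2 = h_norm ** 2
    return math.exp(-h2 / 8.0) * (1.0 + k40 * h2 / 384.0 * (h2 - 16.0))


eps, rel_bw = 0.10, 0.20          # illustrative steepness and relative bandwidth
k40 = kappa40(bfi(eps, rel_bw))
p_rayleigh = mer_exceedance(8.0, 0.0)   # kappa_40 = 0 recovers the Rayleigh value
p_mer = mer_exceedance(8.0, k40)
print(f"kappa_40 = {k40:.4f}, MER / Rayleigh = {p_mer / p_rayleigh:.3f}")
```

At the normalized threshold of 8 the bracket in Eq. (2) reduces to 1 + 8κ₄₀, so the MER correction to the Rayleigh probability grows linearly with the excess kurtosis.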

2.3. BP Neural Network

Back Propagation (BP) neural networks can learn and store a large number of mapping relations between input and output patterns without the mathematical equations describing these mappings being specified beforehand [35]. The learning rule is to continuously adjust the weights and thresholds of the network through backpropagation with the steepest descent method, so as to minimize the sum of squared errors of the network.

The BP neural network is a typical multi-layer feedforward network, shown in Figure 1. x, y, and z represent the parameters in the input, hidden, and output layers. Y and Z are the normalized values. Regarding the occurrence probability of freak waves in unidirectional sea states focused on in this study, there are two parameters (i.e., relative water depth and deep-water BFI) in the unimodal wave configuration but three parameters (i.e., relative water depth, deep-water BFI, and intermodal distance) in the bimodal wave configuration in the input layer, and one parameter (i.e., the occurrence probability of freak waves) in the output layer.

The parameter y_s at the s-th node in the hidden layer can be expressed as follows:

y_{s} = \sum_{r = 1}^{i} w_{r s} x_{r} + a_{s}, \quad r = 1, 2, \dots, i; \; s = 1, 2, \dots, j

(5)

where x_r is the r-th parameter in the input layer, and w_rs is the weight value between the r-th node in the input layer and the s-th node in the hidden layer, and a_s is the bias of the input layer to the s-th neuron of the hidden layer.

Y_s corresponds to the normalization of y_s,

Y_{s} = \mathrm{sigmoid} (y_{s}), \quad s = 1, 2, \dots, j

(6)

in which the activation function sigmoid is

\mathrm{sigmoid} (y_{s}) = \frac{1}{1 + e^{- y_{s}}}

(7)

The parameter z_t at the t-th node in the output layer can be expressed as

z_{t} = \sum_{s = 1}^{j} q_{s t} Y_{s} + b_{t}, \quad t = 1, 2, \dots, k

(8)

where q_st is the weight value between the s-th node in the hidden layer and the t-th node in the output layer. b_t is the bias from the hidden layer to the t-th neuron of the output layer.

Z_t corresponds to the normalization of z_t:

Z_{t} = \mathrm{sigmoid}^{- 1} (z_{t})

(9)

Subsequently, the output results are compared with the expected values from test samples. If the error is not within the allowable range, the weight coefficients w and q are adjusted according to the error feedback, and the updated network is applied to the inputs again in the next training iteration. Training continues until the error between the output results and the expected values is within the allowable range, at which point the current weight values are retained.
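The forward pass of Eqs. (5)–(8) and the error-feedback weight update can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' MATLAB implementation: the class name, learning rate, initialization, and synthetic training data are all assumptions, and the output layer is kept linear instead of applying the normalization step of Eq. (9).

```python
import numpy as np


def sigmoid(y):
    """Activation function of Eq. (7)."""
    return 1.0 / (1.0 + np.exp(-y))


class BPNetwork:
    """One-hidden-layer network following Eqs. (5)-(8), linear output (assumption)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(0.0, 0.5, (n_in, n_hidden))   # weights w_rs
        self.a = np.zeros(n_hidden)                        # hidden biases a_s
        self.q = rng.normal(0.0, 0.5, (n_hidden, n_out))   # weights q_st
        self.b = np.zeros(n_out)                           # output biases b_t

    def forward(self, x):
        self.Y = sigmoid(x @ self.w + self.a)   # Eqs. (5)-(7)
        return self.Y @ self.q + self.b         # Eq. (8)

    def train_step(self, x, target, lr=0.1):
        """One steepest-descent update; returns the mean squared error before it."""
        z = self.forward(x)
        err = z - target
        n = x.shape[0]
        # Backpropagate the error from the output layer to the hidden layer.
        delta = (err @ self.q.T) * self.Y * (1.0 - self.Y)
        self.q -= lr * (self.Y.T @ err) / n
        self.b -= lr * err.mean(axis=0)
        self.w -= lr * (x.T @ delta) / n
        self.a -= lr * delta.mean(axis=0)
        return float(np.mean(err ** 2))


# Synthetic two-input, one-output data standing in for the real dataset.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, (100, 2))
target = 0.5 * x[:, :1] + 0.3 * x[:, 1:] ** 2

net = BPNetwork(n_in=2, n_hidden=4, n_out=1)
losses = [net.train_step(x, target, lr=0.1) for _ in range(2000)]
print(f"MSE: first step {losses[0]:.4f}, last step {losses[-1]:.6f}")
```

In practice the loop would stop once the error falls within the allowable range rather than after a fixed number of steps; a fixed count is used here only to keep the sketch short.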

3. Construction and Prediction Routine of Datasets

3.1. Mathematical Background

Unidirectional sea states are usually used to determine the threshold for extreme configuration. To predict the occurrence probability of freak waves in various sea states, random unidirectional waves tend to be simulated taking full nonlinearity into account. A numerical wave tank based on the high-order spectral (HOS) method [36] is preferred to collect a significant amount of time history data, possessing great advantages in computational efficiency and accuracy when simulating two-dimensional random waves for a long time. Assuming the wave fluid is incompressible, inviscid, irrotational, and non-breaking, the velocity potential in the wave field Φ(x, z, t) satisfies the Laplace equation:

\nabla^{2} Φ (x, z, t) = 0

(10)

where ∇ = (∂/∂x, ∂/∂z). The plane z = 0 denotes the still water surface, with the vertically upward direction taken as positive, and the x-axis lies on the still water surface, with the wavemaker at x = 0.

To solve the fully nonlinear numerical tank with wavemaker boundary introduced, the velocity potential Φ(x, z, t) is split into the sum of a free surface velocity potential Φ_f and an additional velocity potential Φ_add, with reference to Bonnefoy et al. [36].

Φ (x, z, t) = Φ_{f}^{s} (x, t) + Φ_{add} (x, z, t)

(11)

in which free surface velocity potential Φ_f satisfies the Laplace equation, the free surface boundary conditions, and the bottom condition, solved by the traditional HOS method proposed by Dommermuth and Yue [37], and prescribed non-periodic component Φ_add satisfies the Laplace equation, the wave-maker boundary condition, and the bottom condition, composed by a set of specific basis functions concerning a given wavemaker boundary. More details are provided in Refs. [16,38].

3.2. Numerical Set-Up

Besides the often-studied unimodal spectral waves (pure swell or pure wind–sea configuration), 25% of the real sea state morphology appears as bimodal structures. Consequently, unimodal sea states and bimodal sea states are all covered to study the occurrence probability of freak waves in this study. The numerical wave tank based on the HOS method established to simulate unidirectional random waves is illustrated in Figure 2. The wavemaker is assigned to the left side of the wave tank, with the absorbing zone at the right end to eliminate the influence of the reflected waves. The numerical wave tank’s effective length is 220 m to ensure that the random waves evolve over a sufficiently long extent in space, and the other 20 m is arranged as an absorbing zone. The wave gauges are set up at every 5 m point along the wave tank to record the free surface elevations per 0.02 s. The working water depth is set as 4.0 m.

The frequency spectrum applied to determine the amplitude of each wave component is adopted with the six-parameter model proposed by Ochi and Hubble [39] in the following study:

S (ω) = \frac{1}{4} \sum_{j} \frac{{(\frac{4 λ_{j} + 1}{4})}^{λ_{j}}}{Γ (λ_{j})} {(\frac{ω_{p j}}{ω})}^{4 λ_{j}} \frac{H_{s j}^{2}}{ω} \exp [- \frac{4 λ_{j} + 1}{4} {(\frac{ω_{p j}}{ω})}^{4}], \quad j = 1, 2

(12)

where j = 1, 2 correspond to the swell-dominated system and wind-sea-dominated system, respectively. Γ is the gamma function. The significant wave height H_s, the angular frequency ω_p (=2πf_p), and the peak enhancement factor λ are available to define these two different dominated systems.

The significant wave height H_s of the mixed system can be calculated by the energy superposition according to the Rice theory:

H_{s} = \sqrt{H_{s 1}^{2} + H_{s 2}^{2}}

(13)
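Equations (12) and (13) can be cross-checked numerically: integrating the bimodal spectrum gives the zero-order moment m₀, and 4√m₀ should reproduce the combined significant wave height. The sketch below uses illustrative parameter values, not those of the study, and again assumes H_s = 4√m₀.

```python
import math
import numpy as np


def ochi_hubble(omega, hs, omega_p, lam):
    """Six-parameter spectrum of Eq. (12); hs, omega_p, lam each hold two values."""
    s = np.zeros_like(omega, dtype=float)
    for hs_j, wp_j, lam_j in zip(hs, omega_p, lam):
        c = (4.0 * lam_j + 1.0) / 4.0
        s += (0.25 * c ** lam_j / math.gamma(lam_j)
              * (wp_j / omega) ** (4.0 * lam_j) * hs_j ** 2 / omega
              * np.exp(-c * (wp_j / omega) ** 4))
    return s


def trapezoid(y, x):
    """Plain trapezoidal rule, written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


# Illustrative swell (j = 1) and wind-sea (j = 2) parameters.
hs = (0.06, 0.08)                                  # component wave heights, m
omega_p = (2.0 * math.pi * 0.5, 2.0 * math.pi * 0.8)  # peak angular frequencies, rad/s
lam = (3.0, 2.0)                                   # spectral shape parameters

omega = np.linspace(0.05, 80.0, 400_001)
m0 = trapezoid(ochi_hubble(omega, hs, omega_p, lam), omega)
hs_from_spectrum = 4.0 * math.sqrt(m0)             # assumption: Hs = 4 * sqrt(m0)
hs_eq13 = math.hypot(*hs)                          # Eq. (13)
print(f"Hs from spectrum: {hs_from_spectrum:.5f} m, Hs from Eq. (13): {hs_eq13:.5f} m")
```

Each component of Eq. (12) integrates analytically to H_sj²/16, so the two values agree up to the truncation and discretization error of the numerical integration.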

In these cases, the total number of wave components describing the temporal wavemaker motion is 200, and the frequency range is (0.2 Hz, 3.0 Hz) to maintain the total energy involved. The number of points per peak wavelength equals 30, the time step is 1/100 of the wave period, and the order of the HOS method is five, regarding the convergence analysis and numerical validation conducted in Ref. [14]. More details on the present implementation and set-up parameters can be found in Ref. [14].

A series of waves should be recorded to obtain sufficient time history data. Each configuration is simulated several times considering different initial random phases, and for a duration of 10,000 s in total, containing 7000 waves at least.

3.3. Construction of Datasets

A total of 130 cases of unimodal sea state and 1300 cases of bimodal sea state are simulated. Unimodal configurations in unidirectional sea states are listed in Table 1, with 130 cases defined by various relative water depths and deep-water BFI values. Meanwhile, 10 bimodal configurations corresponding to each unimodal configuration (1300 cases in total) are considered under the assumption of equivalent energy. Table 2 shows one set of examples, described by equivalent energy but different intermodal distances (ID), with their corresponding spectra provided in Figure 3.

In this study, the occurrence probability of freak waves is defined as the exceedance probability at H_i/H_s ≥ 2, as also applied by Wang et al. [17]. After the long-time numerical simulations of the various wave configurations, the data were post-processed: individual wave heights were extracted using a zero-crossing method, the exceedance probability distribution of wave height was calculated, and so on. If the wave height distribution is not smooth near H_i/H_s = 2 (the criterion for identifying freak waves), tedious case-by-case manual filtering is required before dataset construction. Figure 4 shows the statistical results of the unimodal wave configuration in the final dataset.
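The post-processing described above can be illustrated with a short sketch. The text does not state whether up- or down-crossings were used or how H_s was estimated from the records, so the zero up-crossing convention and the estimate H_s = 4·std(η) are assumptions here, and the sinusoidal record is only a stand-in for a simulated wave gauge signal.

```python
import numpy as np


def zero_upcrossing_heights(eta):
    """Crest-to-trough heights of the waves between successive zero up-crossings."""
    eta = np.asarray(eta, dtype=float)
    eta = eta - eta.mean()
    # Index of the first sample at or above zero after each up-crossing.
    up = np.where((eta[:-1] < 0.0) & (eta[1:] >= 0.0))[0] + 1
    heights = [eta[i0:i1].max() - eta[i0:i1].min()
               for i0, i1 in zip(up[:-1], up[1:])]
    return np.array(heights)


def freak_wave_probability(heights, hs, threshold=2.0):
    """Fraction of waves with H_i / H_s >= threshold."""
    heights = np.asarray(heights, dtype=float)
    if heights.size == 0:
        return 0.0
    return float(np.mean(heights / hs >= threshold))


# Stand-in record: regular wave, period 2 s, amplitude 1, sampled every 0.02 s.
t = np.arange(5000) * 0.02
eta = np.sin(np.pi * t)

heights = zero_upcrossing_heights(eta)
hs = 4.0 * eta.std()   # assumed estimate of the significant wave height
p_freak = freak_wave_probability(heights, hs)
print(f"{heights.size} waves, mean height {heights.mean():.3f}, P(freak) = {p_freak}")
```

For a regular wave every height equals twice the amplitude, so no wave reaches the threshold; in an irregular record the same two functions give the empirical occurrence probability used as the expected value in the datasets.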

The unimodal dataset is classified into 100 training samples and 30 test samples at random (demonstrated in Figure 5), and the bimodal dataset corresponds to 1000 training samples and 300 test samples, as listed in Table 3.
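A random split of this kind can be sketched as follows. The seed and the function name are hypothetical; only the sample counts come from the text.

```python
import numpy as np


def split_indices(n_total, n_train, seed=0):
    """Randomly partition case indices into training and test sets."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_total)
    return np.sort(perm[:n_train]), np.sort(perm[n_train:])


train_uni, test_uni = split_indices(130, 100)    # unimodal: 100 training, 30 test
train_bi, test_bi = split_indices(1300, 1000)    # bimodal: 1000 training, 300 test
print(len(train_uni), len(test_uni), len(train_bi), len(test_bi))
```

Fixing the seed keeps the split reproducible, which matters when the same test samples are reused to compare several models.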

3.4. Prediction Routine

To achieve an empirical model predicting the freak wave occurrence in unidirectional sea states from unimodal to bimodal configurations, the constructed datasets were trained using machine learning. The flow chart is shown in Figure 6.

Step 1: Train the empirical model

A prediction model based on the BP neural network is established preliminarily using MATLAB code. Convergence analysis is performed by optimizing the internal parameters for various sea states.

Step 2: Validate the trained model

Assigning input conditions of test samples into the trained model, the occurrence probability of freak waves can be rapidly predicted. The trained model can be validated by comparing the forecast and the expected values.

Step 3: Compare with the other predictions

Relative errors are listed by comparing the trained model to the other deep learning algorithms and the existing theoretical model, such as the Rayleigh linear model and the MER weakly nonlinear model.

Step 4: Establish a rapid and accurate prediction model

4. Establishment of the Empirical Model Based on the BP Neural Network

A rapid and accurate model for predicting freak waves is bound to be highly anticipated in engineering applications. To achieve this expectation, optimizing the parameters of the neural network (such as sample numbers or hidden layer nodes) for more generalization is a priority. Hence, parameter optimization and model performance analysis (i.e., error analysis and efficiency analysis) of the BP neural network model applied in various sea states are to be carried out in this section.

The relative error is used as a defined indicator to estimate the model’s accuracy:

E_{r} = |\frac{a_{predict} - a_{expect}}{a_{expect}}| \times 100\%

(14)

where a_predict represents the results obtained from the prediction model, and a_expect is the actual value in datasets obtained from extremely tedious manual work.
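Equation (14), together with the maximum and mean errors reported later, can be computed as in the sketch below. The probability values are hypothetical placeholders, not data from the study.

```python
import numpy as np


def relative_error_percent(predicted, expected):
    """Relative error E_r of Eq. (14), in percent, evaluated element-wise."""
    predicted = np.asarray(predicted, dtype=float)
    expected = np.asarray(expected, dtype=float)
    return np.abs((predicted - expected) / expected) * 100.0


# Hypothetical occurrence probabilities for three test samples.
expected = np.array([2.0e-4, 4.0e-4, 5.0e-4])
predicted = np.array([2.1e-4, 3.8e-4, 5.0e-4])

errors = relative_error_percent(predicted, expected)
print(f"max error {errors.max():.2f}%, mean error {errors.mean():.2f}%")
```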

4.1. Unimodal Sea State

4.1.1. Convergence of the Trained Model

(1): Number of hidden layer nodes

In a highly complex mapping relation, the trained model sometimes performs well in the training dataset but poorly in the test dataset, leading to an “overfitting” phenomenon. It is usually associated with the number of hidden layer nodes. Thus, the influence of the number of hidden layer nodes on the behaviors of the trained model is explored, shown in Figure 7, in which the number of the training sample is fixed at 100. Since changing the number of hidden layer nodes hardly affects the calculational duration, the time spent is not provided here.

When the number of hidden layer nodes is one, the relative error is up to 4.99%. The relative error then decreases rapidly as the number of hidden layer nodes increases up to four. When the number of hidden layer nodes exceeds four, the relative error is stable at around 2.83%. Thus, the minimum relative error occurs at four hidden layer nodes, indicating that the number of hidden layer nodes has little impact on the accuracy of the prediction model once it is greater than three. This can be explained by the fact that there are relatively few parameters in the input layer and output layer in our prediction model. Therefore, the model with four hidden layer nodes is adopted to predict freak waves in the unimodal sea state.

(2): Number of training samples

Taking the number of hidden layer nodes as four, various sample numbers are selected to train the empirical model. Figure 8 demonstrates variations in the relative error and the computational time versus the number of training samples, adopted as 20, 30, 40, 60, 80, and 100. The unit time is defined as the time spent when the number of training samples is 20. As expected, the accuracy of the prediction model is closely related to the number of training samples. For a smaller number of training samples (20–40), the relative error decreases rapidly as the number of training samples increases. When the number of training samples exceeds 40, the relative error changes slowly and tends to be stable at around 3.0%. Meanwhile, the computational time is linearly proportional to the number of training samples. Taking the accuracy and efficiency of the calculation into account, the preferred number of training samples is 40 for the following operation.

4.1.2. Validation of the Trained Model

The preferred parameters, such as the number of samples and the number of hidden layer nodes, are determined through the above optimization. Based on the dataset of 40 training samples, a trained model with four hidden layer nodes is applied to predict the occurrence probability of freak waves in 30 different sea states. The predicted results are compared with the expected values, shown in Figure 9, in which the scatter points are mainly distributed on or very close to the line y = x, reflecting that the prediction accuracy of this model is relatively high. Further, relative errors corresponding to each test sample are provided in Figure 10. The prediction error is small, with a maximum value of 16.4%. These validations indicate that the model trained above is competitive in rapidly predicting the freak wave occurrence in a unimodal sea state.

4.1.3. Comparison to Other Predictions

Similarly, the other two models based on deep learning algorithms are optimized successively. The prediction errors of the three trained models are compared in Figure 11, and the error analyses are listed in Table 4. The relative error of the trained model based on the BP neural network, whose error distribution is much more concentrated with a maximum error of only 16.4%, is far less than that of the regression tree and LSBoost. The mean error predicted by the BP neural network is 4.01%, only one-fifth of that predicted by the regression tree. The comparison shows that, compared to the other deep learning algorithms, the BP-trained model is much better at accurately predicting the occurrence probability of freak waves in a unimodal sea state.

The models currently available for rapidly predicting the occurrence probability of freak waves include the Rayleigh linear model and the MER weakly nonlinear model. To estimate the advantage of the BP-trained model optimized in this paper, the relative errors of these three models are compared in Figure 12 and Figure 13. As seen in Figure 12, the prediction error of the trained model is far less than that of the Rayleigh model and less than that of the MER model. Figure 13 demonstrates the relative errors corresponding to each test sample for a more intuitive comparison. The maximum prediction errors obtained from the Rayleigh, MER, and trained models using the BP neural network are 91.7%, 28.0%, and 16.4%, respectively. Therefore, compared to the existing theoretical models, the prediction accuracy of the trained model based on the BP neural network is higher and thus suitable for predicting freak waves.
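The 41.4% reduction quoted in the conclusions for the unimodal sea state follows directly from these maximum errors, assuming the reduction is defined as the relative decrease in maximum error with respect to the baseline model. A short check, with a hypothetical helper name:

```python
def error_reduction_percent(baseline_max_error, model_max_error):
    """Relative decrease of the maximum error with respect to a baseline, in percent."""
    return (baseline_max_error - model_max_error) / baseline_max_error * 100.0


# Maximum relative errors (%) reported for the unimodal sea state.
max_err = {"Rayleigh": 91.7, "MER": 28.0, "BP": 16.4}

reduction_vs_mer = error_reduction_percent(max_err["MER"], max_err["BP"])
reduction_vs_rayleigh = error_reduction_percent(max_err["Rayleigh"], max_err["BP"])
print(f"vs MER: {reduction_vs_mer:.1f}%, vs Rayleigh: {reduction_vs_rayleigh:.1f}%")
```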

4.2. Bimodal Sea State

4.2.1. Convergence of the Trained Model

In the following study, we focus on the sea-swell energy equivalent configuration (i.e., SSER = 1.0) regarding the bimodal sea state. Similar to the optimization in a unimodal sea state, the model based on a deep learning algorithm must also be trained.

Figure 14 provides the variation in relative error versus the number of hidden layer nodes with the number of training samples fixed as 40. When the number of hidden layer nodes is taken as one, the relative error is up to 5.54%. The variation trend is similar to that in the unimodal sea state, except that the minimum relative error occurs at five hidden layer nodes. This is because, for the bimodal sea state, there are more initial parameters of the wave field, such as significant wave height H_s, peak wave period T_p, sea-swell energy ratio SSER, and intermodal distance ID, of which the last two are not involved in the unimodal sea state. With more input parameters, too few hidden layer nodes cannot satisfy the accuracy requirements. However, if more hidden layer nodes are taken, overfitting might occur, leading to an increased error. Hence, the number of hidden layer nodes is adopted as five when predicting freak waves in the bimodal sea state.

As described above, one unimodal wave configuration corresponds to various bimodal wave configurations considering different spectral distributions. With five hidden layer nodes fixed, training sample numbers of 200, 300, 400, 600, 800, and 1000 are chosen to predict the other 300 test samples, shown in Figure 15. Here, unit time is redefined as the time required when the number of training samples is 200 in a bimodal sea state. As in the unimodal sea state, the relative error decreases as the number of training samples increases. When the number of training samples is taken as 200, the maximum relative error compared to the expected value is 4.74%. Meanwhile, for the other numbers of training samples, it is stable at around 3.0%. Because the time cost increases significantly with the number of training samples, the configuration that keeps the error within the allowable range at the lowest time cost should be selected. Hence, the number of training samples is adopted as 300 when predicting the bimodal sea state.

4.2.2. Validation of the Trained Model

Based on the dataset of 300 training samples, a trained model with five hidden layer nodes is used to predict the occurrence probability of freak waves in 300 different sea states. Relative errors corresponding to each test sample considering different intermodal distances are provided in Figure 16. When the intermodal distance is smaller than 0.10, the relative errors are generally large, but the maximum errors are almost all below 15%, except for ID = 0.02, where a single error reaches 16.0%. Figure 17 summarizes the maximum error and average error in different wave configurations. When the ID value is small, the maximum error and average error are generally large. For a larger ID value, the maximum and average errors are relatively small. It can be explained by the fact that for a smaller ID value, the wave configuration is quite similar to that in the unimodal configuration, in which fewer parameters are in the input layer. As the value of ID increases, the structure of two spectral peaks is increasingly prominent, along with more parameters in the input layer and larger errors made by the prediction model.

4.2.3. Comparison to the Other Predictions

Figure 18 compares the prediction errors of the three trained models. Although the difference between these three models’ mean errors is insignificant, the distribution of the relative error obtained from the trained model based on the BP neural network is much more concentrated, with a maximum error of only 16.0%. This comparison indicates that the BP-trained model still has a slight advantage in predicting the freak waves in a bimodal sea state.

Table 5 compares the maximum prediction errors of the Rayleigh, MER, and trained models based on the BP neural network. Similar to the results in the unimodal sea state, the trained model based on the BP neural network optimized in this paper, with much smaller error, can better predict the occurrence probability of freak waves in a bimodal sea state than the Rayleigh linear model and the MER weakly nonlinear model. This is because the training samples collect data taking full nonlinearity into account. Meanwhile, the maximum error of the BP neural network model in predicting the probability of freak waves is reduced by 42.8%, 46.4%, 75.7%, 52.9%, 67.5%, 77.5%, 88.6%, 86.1%, 90.4%, and 90.0% corresponding to ID = 0.02, 0.04, 0.06, 0.08, 0.10, 0.15, 0.20, 0.25, 0.30, and 0.35, respectively, compared to that of the MER prediction. Therefore, the empirical model based on the BP neural network optimized in this paper is suitable for predicting the occurrence probability of freak waves in bimodal sea states, offering higher accuracy at a lower time cost.

5. Conclusions

In this paper, a trained model based on the BP neural network is developed and available in a unidirectional sea state, from unimodal to bimodal configuration, to predict the occurrence probability of freak waves. Deep optimization is conducted for more generalization. The main conclusions are as follows:

(1): The BP model performs well in accurately predicting the occurrence probability of freak waves in unimodal sea states. Compared with the regression tree and LSBoost, the optimized model based on the BP neural network has a high precision of prediction and a reasonable value of application, with a maximum error of 16.4% and mean error of 4.01%, which is only one-fifth of that of the regression tree.
(2): The trained model based on the BP neural network is still optimal for predicting the bimodal sea state. Although the mean errors obtained by the three methods are not significantly different, the BP model manifests a more concentrated error distribution, with a maximum error of only 16.0%, reflecting better predictive stability under the circumstances characterized by the bimodal structure.
(3): The more comprehensive the spectral bandwidth, the greater the advantage of the BP model. Concerning the unimodal sea state, the maximum error of the BP neural network model can be reduced by 41.4% compared to that of the MER prediction. Further, in bimodal sea state, the maximum error can be reduced by 42.8%, 46.4%, 75.7%, 52.9%, 67.5%, 77.5%, 88.6%, 86.1%, 90.4%, and 90.0% corresponding to ID = 0.02, 0.04, 0.06, 0.08, 0.10, 0.15, 0.20, 0.25, 0.30, and 0.35, respectively.

The proposed model based on the BP neural network is applicable to unidirectional sea states with various spectral distributions and can guide practical engineering applications.

Author Contributions

Conceptualization, B.Z. and L.W.; methodology, J.W.; software, J.W.; validation, J.W. and L.W.; formal analysis, J.W. and L.W.; investigation, J.W. and K.D.; resources, B.Z.; data curation, J.W.; writing—original draft preparation, J.W.; writing—review and editing, B.Z., L.W. and Y.L.; visualization, J.W. and K.D.; supervision, B.Z. and L.W.; project administration, B.Z. and L.W.; funding acquisition, B.Z. and L.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (52071096, 52301319), the National Natural Science Foundation of China National Outstanding Youth Science Fund Project (52222109), Guangdong Basic and Applied Basic Research Foundation (2022B1515020036), Guangzhou Basic and Applied Basic Research Foundation (2023A04J1596), the Fundamental Research Funds for the Central Universities (2022ZYGXZR014), and the Funds of Guangxi Key Laboratory of Beibu Gulf Marine Resources, Environment and Sustainable Development (NRESD-2023-804).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

For complex and nonlinear time series data of random waves, the tree regression algorithm is more suitable for regression modeling than linear regression. The core idea of the algorithm is to split the data recursively until the preset termination conditions are met [29]. At each split, the candidate thresholds of every feature are searched exhaustively to find the optimal splitting feature and splitting point, i.e., those that minimize the squared error. A regression tree consists of internal decision nodes and leaf nodes, and the leaf nodes give the predicted outcome.
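The exhaustive split search described above can be sketched as follows. This is a simplified illustration and not the implementation used in the paper: it assumes candidate thresholds are the midpoints between consecutive distinct feature values, and the names `sse` and `best_split` as well as the sample data are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(X, y):
    """Search every feature and threshold for the split with minimal squared error.

    X is a list of feature rows, y the list of targets.
    Returns (feature_index, threshold, total_squared_error), or None if no
    feature has at least two distinct values.
    """
    best = None
    for j in range(len(X[0])):
        levels = sorted(set(row[j] for row in X))
        # Assumption: thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(levels, levels[1:]):
            thr = (lo + hi) / 2
            left = [yi for row, yi in zip(X, y) if row[j] <= thr]
            right = [yi for row, yi in zip(X, y) if row[j] > thr]
            err = sse(left) + sse(right)
            if best is None or err < best[2]:
                best = (j, thr, err)
    return best


# Hypothetical toy data: two features, six samples.
X = [[1.0, 5.0], [2.0, 3.0], [3.0, 4.0], [10.0, 5.0], [11.0, 3.0], [12.0, 4.0]]
y = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
print(best_split(X, y))
```

A full regression tree would apply `best_split` recursively to the left and right subsets until a termination condition (for example, a minimum number of samples per leaf) is met, and would store the mean target of each leaf as its prediction.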

Appendix B

Least-squares boosting tree (LSBoost) is a gradient tree boosting algorithm with a least-squares loss. In this algorithm, every regression tree is constructed by selecting the optimal splitting point that minimizes the mean square error (MSE) of the current trained model [31]. At each iteration, a new regression tree is fitted and added to the ensemble to improve the fit to the training dataset. The final prediction is obtained by weighting the predicted results of all trees.
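The boosting loop described above can be illustrated with a minimal sketch. It assumes a single input feature, depth-one trees (stumps) as base learners, and a constant learning rate as the weight of each tree; these choices and all names (`fit_stump`, `lsboost_fit`, `lsboost_predict`, `mse`) are illustrative assumptions, not the configuration used in the paper.

```python
def fit_stump(x, r):
    """Fit a depth-one regression tree to residuals r; return (threshold, left, right)."""
    levels = sorted(set(x))
    best = None
    for lo, hi in zip(levels, levels[1:]):
        thr = (lo + hi) / 2
        left = [ri for xi, ri in zip(x, r) if xi <= thr]
        right = [ri for xi, ri in zip(x, r) if xi > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
        if best is None or err < best[0]:
            best = (err, thr, lm, rm)
    if best is None:  # all x identical: fall back to a constant prediction
        mean = sum(r) / len(r)
        return (levels[0], mean, mean)
    return best[1:]


def lsboost_fit(x, y, n_trees=50, learning_rate=0.1):
    """Least-squares boosting: each new stump is fitted to the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        # For the squared loss, the negative gradient is the residual y - pred.
        resid = [yi - pi for yi, pi in zip(y, pred)]
        thr, lv, rv = fit_stump(x, resid)
        stumps.append((thr, lv, rv))
        pred = [pi + learning_rate * (lv if xi <= thr else rv)
                for xi, pi in zip(x, pred)]
    return base, learning_rate, stumps


def lsboost_predict(model, xq):
    """Weighted sum of the base value and all stump outputs for one query point."""
    base, learning_rate, stumps = model
    return base + sum(learning_rate * (lv if xq <= thr else rv)
                      for thr, lv, rv in stumps)


def mse(model, x, y):
    """Mean square error of the model on the samples (x, y)."""
    return sum((lsboost_predict(model, xi) - yi) ** 2 for xi, yi in zip(x, y)) / len(y)


# Hypothetical toy data: a smooth nonlinear target on eight points.
x = [float(i) for i in range(8)]
y = [xi ** 2 for xi in x]
print(mse(lsboost_fit(x, y, n_trees=1), x, y), mse(lsboost_fit(x, y, n_trees=50), x, y))
```

On the training data, adding trees lowers the MSE step by step, which is the behavior the appendix attributes to each boosting iteration; on unseen data the number of trees and the learning rate would normally be tuned to avoid overfitting.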

References

1. Draper, L. Freak wave. Mar. Obs. 1965, 35, 193–195.
2. Kharif, C.; Pelinovsky, E. Physical mechanisms of the rogue wave phenomenon. Eur. J. Mech.-B/Fluids 2003, 22, 603–634.
3. Nikolkina, I.; Didenkulova, I. Rogue waves in 2006–2010. Nat. Hazards Earth Syst. Sci. 2011, 11, 2913–2924.
4. Ma, Y.; Zhang, J.; Chen, Q.; Tai, B.; Dong, G.; Xie, B.; Niu, X. Progresses in the research of oceanic freak waves: Mechanism, modeling, and forecasting. Int. J. Ocean Coast. Eng. 2022, 4, 2250002.
5. Fu, R.; Ma, Y.; Dong, G.; Perlin, M. A wavelet-based wave group detector and predictor of extreme events over unidirectional sloping bathymetry. Ocean Eng. 2021, 229, 108936.
6. Zhou, B.; Ding, K.; Wang, J.; Wang, L.; Jin, P.; Tang, T. Experimental study on the interactions between wave groups in double-wave-group focusing. Phys. Fluids 2023, 35, 037118.
7. He, Y.; Wu, G.; Mao, H.; Chen, H.; Lin, J.; Dong, G. An experimental study on nonlinear wave dynamics for freak waves over an uneven bottom. Front. Mar. Sci. 2023, 10, 1150896.
8. Longuet-Higgins, M. On the statistical distribution of the heights of sea waves. J. Mar. Res. 1952, 11, 245–266.
9. Janssen, P.A.E.M. Nonlinear four-wave interactions and freak waves. J. Phys. Oceanogr. 2003, 33, 863–884.
10. Fedele, F.; Brennan, J.; de León, S.P.; Dudley, J.; Dias, F. Real world ocean rogue waves explained without the modulational instability. Sci. Rep. 2016, 6, 27715.
11. Ponce de León, S.; Osborne, A.R. Role of Nonlinear Four-Wave Interactions Source Term on the Spectral Shape. J. Mar. Sci. Eng. 2020, 8, 251.
12. Onorato, M.; Osborne, A.; Serio, M.; Cavaleri, L.; Brandini, C.; Stansberg, C. Observation of strongly non-Gaussian statistics for random sea surface gravity waves in wave flume experiments. Phys. Rev. E 2004, 70, 067302.
13. Mori, N.; Janssen, P.A.E.M. On kurtosis and occurrence probability of freak waves. J. Phys. Oceanogr. 2006, 36, 1471–1483.
14. Wang, L.; Li, J.; Liu, S.; Ducrozet, G. Statistics of long-crested extreme waves in single and mixed sea states. Ocean Dyn. 2021, 71, 21–42.
15. Ducrozet, G.; Bonnefoy, F.; Le Touzé, D.; Ferrant, P. A modified high-order spectral method for wavemaker modeling in a numerical wave tank. Eur. J. Mech.-B/Fluids 2012, 34, 19–34.
16. Wang, L.; Ding, K.; Zhou, B.; Li, J.; Liu, S.; Tang, T. Quantitative prediction of the freak wave occurrence probability in co-propagating mixed waves. Ocean Eng. 2023, 271, 113810.
17. Wang, L.; Zhou, B.; Jin, P.; Li, J.; Liu, S.; Ducrozet, G. Relation between occurrence probability of freak waves and kurtosis/skewness in unidirectional wave trains under single-peak spectra. Ocean Eng. 2022, 248, 110813.
18. Deo, M.C.; Naidu, C.S. Real time wave forecasting using neural networks. Ocean Eng. 1999, 26, 191–203.
19. Makarynskyy, O. Improving wave predictions with artificial neural networks. Ocean Eng. 2004, 31, 709–724.
20. Makarynskyy, O.; Pires-Silva, A.A.; Makarynska, D.; Ventura-Soares, C. Artificial neural networks in wave predictions at the west coast of Portugal. Comput. Geosci. 2005, 31, 415–424.
21. Xie, Y.; Zhao, X.; Liu, Z. A simple approach for wave absorbing control of plunger wavemakers using machine learning: Numerical study. Coast. Eng. 2023, 179, 104253.
22. Mahjoobi, J.; Adeli Mosabbeb, E. Prediction of significant wave height using regressive support vector machines. Ocean Eng. 2009, 36, 339–347.
23. Chen, S.; Wang, Y. Improving coastal ocean wave height forecasting during typhoons by using local meteorological and neighboring wave data in support vector regression models. J. Mar. Sci. Eng. 2020, 8, 149.
24. Mandal, S.; Prabaharan, N. Ocean wave forecasting using recurrent neural networks. Ocean Eng. 2006, 33, 1401–1410.
25. Deka, P.; Prahlada, R. Discrete wavelet neural network approach in significant wave height forecasting for multistep lead time. Ocean Eng. 2015, 43, 2–42.
26. Fan, S.; Xiao, N.; Dong, S. A novel model to predict significant wave height based on long short-term memory network. Ocean Eng. 2020, 205, 107298.
27. Liu, Y.; Duan, W.; Huang, L.; Duan, S.; Ma, X. The input vector space optimization for LSTM deep learning model in real-time prediction of ship motions. Ocean Eng. 2020, 213, 107681.
28. Hao, W.; Sun, X.; Wang, C.; Chen, H.; Huang, L. A hybrid EMD-LSTM model for non-stationary wave prediction in offshore China. Ocean Eng. 2022, 246, 110566.
29. Mahjoobi, J.; Etemad-Shahidi, A. An alternative approach for the prediction of significant wave heights based on classification and regression trees. Appl. Ocean Res. 2008, 30, 172–177.
30. Elbisy, M.; Elbisy, A. Prediction of significant wave height by artificial neural networks and multiple additive regression trees. Ocean Eng. 2021, 230, 109077.
31. Su, M.; Zhang, Z.; Zhu, Y.; Zha, D. Data-driven natural gas spot price forecasting with Least Squares Regression Boosting algorithm. Energies 2019, 12, 1094.
32. Tang, T.; Adcock, T.A.A. Data driven analysis on the extreme wave statistics over an area. Appl. Ocean Res. 2021, 115, 102809.
33. Zhang, Y.; Xu, X. Modulus of elasticity predictions through LSBoost for concrete of normal and high strength. Mater. Chem. Phys. 2022, 283, 126007.
34. Ban, W.; Shen, L.; Chen, J.; Yang, B. Short-term prediction of wave height based on a deep learning autoregressive integrated moving average mode. Earth Sci. Inform. 2023, 16, 2251–2259.
35. Wang, W.; Tang, R.; Li, C.; Liu, P.; Luo, L. A BP neural network model optimized by Mind Evolutionary algorithm for predicting the ocean wave heights. Ocean Eng. 2018, 162, 98–107.
36. Bonnefoy, F.; Ducrozet, G.; Le Touzé, D.; Ferrant, P. Time domain simulation of nonlinear water waves using spectral methods. In Advances in Numerical Simulation of Nonlinear Water Waves; World Scientific: Singapore, 2010; pp. 129–164.
37. Dommermuth, D.; Yue, D. A high-order spectral method for the study of nonlinear gravity waves. J. Fluid Mech. 1987, 184, 267–288.
38. Li, J.; Liu, S. Focused wave properties based on a high order spectral method with a non-periodic boundary. China Ocean Eng. 2015, 29, 1–16.
39. Ochi, M.; Hubble, E. Six parameter wave spectra. In Proceedings of the 15th International Conference on Coastal Engineering, Honolulu, HI, USA, 11–17 July 1976.

Figure 1. Diagram of the structure of the BP neural network.

Figure 2. Diagram of the numerical wave tank.

Figure 3. Several bimodal spectra correspond to one unimodal spectrum.

Figure 4. Statistical results of the unimodal configuration.

Figure 5. Unimodal datasets are classified into training samples and test samples.

Figure 6. Flow chart of constructing a rapid and accurate model predicting freak wave occurrence.

Figure 7. Variation in prediction error versus the number of hidden layer nodes in unimodal configuration.

Figure 8. Variation in prediction error versus the number of training samples in unimodal configuration.

Figure 9. Comparison between the predicted value and expected value in the unimodal configuration.

Figure 10. Prediction error corresponding to each test sample in unimodal configuration.

Figure 11. Prediction errors of deep learning algorithms corresponding to each test sample in unimodal configuration.

Figure 12. Prediction errors of the three trained models using different deep learning algorithms in unimodal configuration.

Figure 13. Comparison of prediction errors of different models corresponding to each test sample in unimodal configuration.

Figure 14. Variation in prediction error versus the number of hidden layer nodes in bimodal configuration.

Figure 15. Variation in prediction error versus the number of training samples in bimodal configuration.

Figure 16. Prediction error corresponding to each test sample considering different intermodal distances in bimodal configuration. (a) ID = 0.02–0.10. (b) ID = 0.15–0.35.

Figure 17. Maximum and average errors versus different intermodal distances in bimodal configuration.

Figure 18. Prediction errors of deep learning algorithms corresponding to each test sample in bimodal configuration.

Table 1. Wave configurations in unimodal sea states.

Case	k_p h	BFI
single	1.7–10	0.6–1.0

Table 2. A set of detailed parameters in bimodal sea states corresponding to one unimodal sea state.

Case	f_p (Hz)	H_s (m)	ε = k_p H_s/2	ID
single	0.58	0.108	0.0594	-
Case A	0.56	0.0764	0.0487	0.02
Case A	0.71	0.0764	0.0786	0.02
Case B	0.54	0.0764	0.0453	0.04
Case B	0.67	0.0764	0.0700	0.04
Case C	0.52	0.0764	0.0420	0.06
Case C	0.64	0.0764	0.0626	0.06
Case D	0.50	0.0764	0.0389	0.08
Case D	0.60	0.0764	0.0562	0.08
Case E	0.48	0.0764	0.0358	0.10
Case E	0.57	0.0764	0.0505	0.10
Case F	0.46	0.0764	0.0329	0.15
Case F	0.54	0.0764	0.0456	0.15
Case G	0.44	0.0764	0.0301	0.20
Case G	0.52	0.0764	0.0412	0.20
Case H	0.42	0.0764	0.0275	0.25
Case H	0.49	0.0764	0.0374	0.25
Case I	0.40	0.0764	0.0249	0.30
Case I	0.47	0.0764	0.0339	0.30
Case J	0.38	0.0764	0.0225	0.35
Case J	0.45	0.0764	0.0309	0.35

Table 3. Classification of the unimodal datasets and bimodal datasets.

Dataset	Total No.	Classification	Subset No.	Usage
Unimodal dataset	130	Training samples	100	To fit the empirical model
Unimodal dataset	130	Test samples	30	To check accuracy
Bimodal dataset	1300	Training samples	1000	To fit the empirical model
Bimodal dataset	1300	Test samples	300	To check accuracy

Table 4. Error analysis of the three trained models using different deep learning algorithms (%).

Model	Mean	Median	25th Percentile	75th Percentile
BP neural network	4.01	2.64	1.74	4.70
Regression tree	19.11	12.22	7.75	26.75
LSBoost	12.24	7.72	3.09	19.80

Table 5. Maximum prediction errors of different predicting models corresponding to each test sample in bimodal configuration.

ID	MRE of Rayleigh Prediction (%)	MRE of MER Prediction (%)	MRE of BP Prediction (%)
0.02	91.7	28.0	16.0
0.04	91.7	28.0	15.0
0.06	91.7	28.0	6.8
0.08	91.7	28.0	13.2
0.10	91.7	28.0	9.1
0.15	91.7	28.0	6.3
0.20	91.7	28.0	3.2
0.25	91.7	28.0	3.9
0.30	91.7	28.0	2.7
0.35	91.7	28.0	2.8

Note: MRE means maximum relative error.
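
For reference, one common way to write the maximum relative error is given below. This is a sketch of an assumed definition, normalizing by the expected value over the N test samples; the symbols P_i^pred and P_i^exp for the predicted and expected occurrence probabilities are illustrative and are not taken from the paper.

```latex
\mathrm{MRE} = \max_{1 \le i \le N}
  \frac{\left| P_i^{\mathrm{pred}} - P_i^{\mathrm{exp}} \right|}{P_i^{\mathrm{exp}}}
  \times 100\%
```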

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
