Article

Ocean Wave Height Series Prediction with Numerical Long Short-Term Memory

1
College of Oceanography and Space Informatics, China University of Petroleum, Qingdao 266580, China
2
College of Control Science and Engineering, China University of Petroleum, Qingdao 266580, China
3
North Sea Marine Forecast Center of State Oceanic Administration, Qingdao 266061, China
*
Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2021, 9(5), 514; https://doi.org/10.3390/jmse9050514
Submission received: 22 April 2021 / Revised: 4 May 2021 / Accepted: 7 May 2021 / Published: 10 May 2021
(This article belongs to the Section Ocean Engineering)

Abstract

This paper investigates the possibility of using machine learning technology to correct wave height series numerical predictions. This is done by incorporating numerical predictions into long short-term memory (LSTM). Specifically, a novel ocean wave height series prediction framework, referred to as numerical long short-term memory (N-LSTM), is introduced. The N-LSTM takes a combined wave height representation, which is formed of a current wave height measurement and a subsequent Simulating Waves Nearshore (SWAN) numerical prediction, as the input and generates the corrected numerical prediction as the output. The correction is achieved by two modules in cascade, i.e., the LSTM module and the Gaussian approximation module. The LSTM module characterizes the correlation between measurement and numerical prediction. The Gaussian approximation module models the conditional probabilistic distribution of the wave height given the learned LSTM. The corrected numerical prediction is obtained by sampling the conditional probabilistic distribution and the corrected numerical prediction series is obtained by iterating the N-LSTM. Experimental results validate that our N-LSTM effectively lifts the accuracy of wave height numerical prediction from SWAN for the Bohai Sea and Xiaomaidao. Furthermore, compared with the state-of-the-art machine learning based prediction methods (e.g., residual learning), the N-LSTM achieves better prediction accuracy by 10% to 20% for the prediction time varying from 3 to 72 h.

1. Introduction

Ocean waves are irregular combinations of multiple waves with multiple wave heights, periods and travel directions [1]. Ocean waves cause huge losses to the lives and properties of people when big waves reach the coast. The predictions of significant wave height play an important role in marine engineering such as fisheries, exploration, power generation and marine transportation [2].
In the research literature, many efforts have been made in predicting significant wave heights [3,4]. Numerical wave models are widely applied to global sea-state predictions [5,6]. A numerical wave model obtains the wave height, period and other information by solving the wave spectrum equation that describes ocean physical processes. Dentale et al. [7] compare wave height buoy observations with model predictions and find that numerical prediction is a reliable method of wave height prediction. The third-generation models such as the Wave Model (WAM) [8,9], WAVEWATCH-III (WWIII) [10,11] and Simulating Waves Nearshore (SWAN) [12,13] are among the most advanced numerical models. The WAM and WWIII models are similar in structure, but WWIII uses more complicated dissipation source terms and wind input terms than WAM [14]. Liu et al. [15] use data from the South Indian Ocean to compare the performance of WAM and WWIII. Their experiments show that both models perform well in the prediction of significant wave height. The SWAN model was developed to address the complex wave conditions in coastal regions [16]. Various applications validate the effectiveness of the SWAN model. Specifically, Liang et al. [17] validate the SWAN performance against buoy measurements in the Northwest Pacific, Northeast Pacific and Northwest Atlantic. The experimental results show that the SWAN model accurately simulates waves in coastal regions when boundary conditions are accurate. However, energy spectrum equations with fixed expressions can hardly characterize the complex and changeable ocean environment in fine detail. In particular, the numerical predictions of waves under extreme ocean conditions are not satisfactory.
Machine learning is a data-driven methodology and has recently been applied to wave height prediction [18,19,20]. Based on long-term accurate wave height measurement data obtained through buoys [21], satellites [22] and scatterometers [23], machine learning methods [24,25] predict future wave heights by learning the inherent data variability. Deo et al. [26] explore a three-layered feed forward network to obtain the output of significant wave heights. Berbic et al. [27] use the artificial neural network (ANN) and support vector machine (SVM) to predict the significant wave height between 0.5 and 5.5 h. Experiments validate that the ANN and SVM predictions in such intervals outperform numerical models. Dixit and Londhe [28] use the neuro wavelet technique to predict extreme wave heights. Experiments validate that multi-level decomposition of wave data helps to improve the prediction accuracy. Recurrent neural networks (RNN) [29] and their variant, the long short-term memory network (LSTM) [30], have unique advantages in addressing forecasting problems. Zhang et al. [31] use the LSTM model to learn the relationship between the meteorological factors and the rainfall, allowing the correction of the model forecast rainfall. Recurrent networks also have merits in wave height forecasting. Mandal and Prabaharan [32] use a recurrent neural network with the rprop update algorithm to predict wave height. Felix and Marina [33] apply RNN-LSTM to predicting significant wave height and the method performs well in forecasting within 24 h. Kaloop et al. [34] integrate wavelet, particle swarm optimization (PSO) and extreme learning machine (ELM) methods into a wavelet PSO-ELM model for coastal and deep-sea wave height estimation. The evaluations show that the method has high prediction accuracy. The disadvantage of the machine learning methods is that their long-term forecasting capability is significantly reduced and the generalization ability of the model is limited.
Both the numerical model and machine learning have advantages and disadvantages for the wave height forecast. Recently, some research is devoted to combining the numerical model prediction methods and machine learning methods to improve the prediction accuracy. One typical manner for combining numerical prediction and machine learning is to use a machine learning model to learn the difference between the numerical predictions and real measurements. Deshmukh et al. [35] use a wavelet neural network for learning the error between SWAN predictions and real measurements. Wang et al. [36] develop a multi-factor extreme learning machine which characterizes the difference between SWAN wave height numerical predictions and real measurements and refers to the strategy as residual learning. Corrections for the numerical predictions are obtained through the learned residuals. Campos et al. [37,38] use neural networks to perform a nonlinear ensemble averaging of the global ocean wave integrated forecasting system data and predict significant wave heights through residual forecasting. Though effectively improving the prediction accuracy of numerical models, the residual learning method tends to exhibit poor performance when the wave height changes abruptly. One reason for this limitation is that residual learning considers a machine learning model and a numerical model as two separate procedures. Learning the difference between their predictions cannot inherently exploit their joint prediction capability.
In order to explore the potential of machine learning for lifting wave height numerical prediction accuracy, we introduce a numerical long short-term memory (N-LSTM) framework for correcting the wave height predictions from the numerical model. The model is composed of an LSTM module and a Gaussian approximation module in cascade. The LSTM module characterizes the measurement-prediction correlation for wave height sequences, resulting in the condition on which the Gaussian approximation module models the wave height distribution. In contrast to error and residual learning [35,36], which just models the difference between measurement and prediction, our N-LSTM comprehensively characterizes the correlation between measurement sequences and numerical prediction sequences via long short-term memory. It explores temporal characteristics and thus renders accurate corrections for numerical prediction sequences. Furthermore, the measurements in our method are used to provide wave height feature information, so a small number of measurements suffices to predict future wave heights over a longer time. Our method remains applicable as long as the measured data do not contain large gaps. Our N-LSTM is motivated by the autoregressive recurrent networks [39] in terms of model structure but differs from them in two operational aspects. First, our N-LSTM incorporates numerical predictions, which are not considered in the autoregressive recurrent network. Second, our N-LSTM makes predictions by correcting the wave height numerical prediction, in contrast to the straightforward prediction in the autoregressive recurrent networks. Experimental evaluations validate the effectiveness of our proposed model in predicting significant wave heights by correcting significant wave height predictions from SWAN for the Bohai Sea and Xiaomaidao.
The organization of the paper is listed as follows: Section 2 presents the structure of the numerical long short-term memory (N-LSTM) network and describes the training method for the N-LSTM network. Section 3 describes the significant wave height correction based on an N-LSTM network. Section 4 evaluates the effectiveness of our method qualitatively and quantitatively. Section 5 discusses the effectiveness of our method under large wave conditions and the probability correction. Section 6 concludes the paper.

2. Numerical Long Short-Term Memory (N-LSTM) Network

2.1. Overall N-LSTM Model

In this section, we present a numerical long short-term memory (N-LSTM) network for correcting the numerical wave height prediction series. The N-LSTM processes time-sequential data. It consists of an LSTM module and a Gaussian approximation module. Figure 1 shows the structure of the N-LSTM.
At time $t$, the input of the overall N-LSTM, and also of the LSTM module, is a combined significant wave height representation $x_t$, which is formed of the current wave height measurement $m_t$ at time $t$ and a numerical prediction $n_{t+1}$ for the significant wave height at the subsequent time $t+1$. The LSTM input $x_t$ is denoted as follows:
$$x_t = [m_t, n_{t+1}]^{*}, \tag{1}$$
where $*$ denotes the transpose operation. The LSTM updates the memory cell state $c_t$ and the output of the LSTM module $h_t$. $c_t$ and $h_t$ are fed back for updating the LSTM at the subsequent time $t+1$. In addition, $h_t$ is used as the input of the Gaussian approximation module. The output of the overall N-LSTM, and also of the Gaussian approximation module, is $\hat{n}_{t+1}$, which is the corrected numerical prediction with respect to the numerical prediction $n_{t+1}$.
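As a minimal sketch of how the combined representation $x_t = [m_t, n_{t+1}]$ might be assembled from two aligned series, the helper below pairs each measurement with the numerical prediction for the following time step. The function name `build_inputs` and the sample values are hypothetical and are not taken from the paper.

```python
def build_inputs(measurements, numerical_predictions):
    """Pair m_t with n_{t+1} to form x_t = [m_t, n_{t+1}].

    Both lists are indexed by time step and must have equal length T.
    The last step has no n_{t+1}, so T - 1 input pairs are returned.
    """
    if len(measurements) != len(numerical_predictions):
        raise ValueError("sequences must have the same length")
    return [
        [measurements[t], numerical_predictions[t + 1]]
        for t in range(len(measurements) - 1)
    ]


# Made-up significant wave heights in metres, for illustration only.
m = [0.8, 0.9, 1.1, 1.0]
n = [0.7, 1.0, 1.2, 0.9]
inputs = build_inputs(m, n)
```

Each element of `inputs` is one two-dimensional LSTM input, so the hidden state at time $t$ sees both what was observed and what the numerical model expects next.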

2.2. LSTM Module

This subsection describes the internal structure of the LSTM module. The LSTM module consists of an input gate $i_t$, a forget gate $f_t$ and an output gate $o_t$. Figure 2 shows the architecture of an LSTM module.
At time $t$, the inputs of the LSTM module are the combined significant wave height representation $x_t$, the previous memory cell state $c_{t-1}$ and the previous output of the LSTM module $h_{t-1}$. $\tilde{c}_t$ denotes the candidate memory information. The LSTM computing process is as follows:
$$i_t = \mathrm{sig}(W_i x_t + U_i h_{t-1} + b_i), \tag{2}$$
$$f_t = \mathrm{sig}(W_f x_t + U_f h_{t-1} + b_f), \tag{3}$$
$$o_t = \mathrm{sig}(W_o x_t + U_o h_{t-1} + b_o), \tag{4}$$
$$\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c), \tag{5}$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, \tag{6}$$
$$h_t = o_t \odot \tanh(c_t), \tag{7}$$
where $W$ and $U$ denote weight matrices and $b$ denotes bias vectors. $\mathrm{sig}$ and $\tanh$ are the sigmoid and hyperbolic tangent functions, respectively, and $\odot$ denotes the Hadamard product. The outputs $c_t$ and $h_t$ of the LSTM module are fed back as inputs of the LSTM module at the subsequent time, and $h_t$ is also the input of the Gaussian approximation module. $\Theta_{\mathrm{LSTM}}$ summarizes the parameters of the LSTM module as follows:
$$\Theta_{\mathrm{LSTM}} = \{W_i, U_i, b_i, W_f, U_f, b_f, W_o, U_o, b_o, W_c, U_c, b_c\}. \tag{8}$$
The LSTM structure extracts useful long-term features from sequences and has proven effective in a variety of prediction problems. Different from the original LSTM, which tends to memorize historical measurements to predict subsequent measurements, the LSTM module within our N-LSTM explores a combined significant wave height representation in terms of both real measurements and numerical predictions. Being capable of capturing time-varying characteristics, the LSTM module characterizes the temporal correlation between the real measurements and the numerical predictions. Therefore, the LSTM module within the N-LSTM takes advantage of both real measurements and numerical predictions along with their temporal correlation characteristics, and is expected to exhibit greater prediction capability than the original LSTM, which relies only on memorizing historical measurements.
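The gate computations of the LSTM module can be sketched in plain Python as follows. This is an illustrative single-step implementation, not the authors' PyTorch code. It assumes the conventional cell update in which the forget gate scales the previous cell state and the input gate scales the candidate memory. The helper names (`affine`, `lstm_step`, `random_params`) and the random weights are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, x, U, h, b):
    """Return W x + U h + b for list-of-lists matrices and list vectors."""
    return [
        sum(w * xj for w, xj in zip(w_row, x))
        + sum(u * hj for u, hj in zip(u_row, h))
        + bias
        for w_row, u_row, bias in zip(W, U, b)
    ]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM update; `params` maps 'i', 'f', 'o', 'c' to (W, U, b)."""
    pre = {name: affine(W, x, U, h_prev, b) for name, (W, U, b) in params.items()}
    i = [sigmoid(z) for z in pre["i"]]          # input gate
    f = [sigmoid(z) for z in pre["f"]]          # forget gate
    o = [sigmoid(z) for z in pre["o"]]          # output gate
    c_tilde = [math.tanh(z) for z in pre["c"]]  # candidate memory
    # Assumed conventional update: the forget gate scales the old cell state.
    c = [fk * cp + ik * ck for fk, cp, ik, ck in zip(f, c_prev, i, c_tilde)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def random_params(input_size, hidden_size, seed=0):
    """Small random weights for illustration only (not trained values)."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        name: (matrix(hidden_size, input_size),
               matrix(hidden_size, hidden_size),
               [0.0] * hidden_size)
        for name in ("i", "f", "o", "c")
    }


params = random_params(input_size=2, hidden_size=3)
x_t = [0.9, 1.2]  # [m_t, n_{t+1}], made-up wave heights in metres
h_t, c_t = lstm_step(x_t, [0.0] * 3, [0.0] * 3, params)
```

The paper reports 50 hidden units. The sketch uses 3 only to keep the example small.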

2.3. Gaussian Approximation Module

The LSTM module learns the correlation between historical measurements and numerical predictions. Noise in the historical measurements and in the numerical predictions makes this learned correlation uncertain. To address this uncertainty, we use a probabilistic model. The widely used Gaussian function is exploited for modeling the conditional distribution of the significant wave height, and accordingly a Gaussian approximation module is introduced. Figure 3 shows the architecture of a Gaussian approximation module.
At time $t$, the input of the Gaussian approximation module is the output of the LSTM module $h_t$. Specifically, the Gaussian approximation module commences by processing $h_t$ via two layers in parallel as shown in Figure 3. The top layer is a linear mapping which operates as follows:
$$\mu(h_t) = w_\mu^{*} h_t + b_\mu, \tag{9}$$
where $w_\mu$ and $b_\mu$ are the weight and bias of the linear mapping, respectively. The bottom layer is a linear mapping followed by a softplus activation, and operates as follows:
$$\sigma(h_t) = \log\left(1 + \exp\left(w_\sigma^{*} h_t + b_\sigma\right)\right), \tag{10}$$
where $w_\sigma$ and $b_\sigma$ are the weight and bias of the linear mapping, respectively. The softplus activation ensures that the standard deviation $\sigma(h_t)$ is positive. We denote the parameter set $\Theta_P$ of the Gaussian approximation module as follows:
$$\Theta_P = \{w_\mu, b_\mu, w_\sigma, b_\sigma\}. \tag{11}$$
The Gaussian approximation module terminates by modeling a conditional probabilistic distribution for significant wave height in terms of a Gaussian distribution. The outputs $\mu(h_t)$ and $\sigma(h_t)$ of the two layers (9) and (10) are modeled as the mean and standard deviation of the Gaussian distribution, respectively. The conditional probabilistic distribution for significant wave height is given as follows:
$$P(r_{t+1} \mid h_t; \Theta) = \frac{1}{\sqrt{2\pi}\,\sigma(h_t)} \exp\left(-\frac{(r_{t+1} - \mu(h_t))^2}{2(\sigma(h_t))^2}\right), \tag{12}$$
where $r_{t+1}$ denotes the real wave height observation at time $t+1$ and $\Theta$ represents the overall N-LSTM parameter set including $\Theta_{\mathrm{LSTM}}$ and $\Theta_P$ as follows:
$$\Theta = \{\Theta_{\mathrm{LSTM}}, \Theta_P\}. \tag{13}$$
The Gaussian approximation module models the conditional probabilistic distribution of the subsequent significant wave height given the learned LSTM (i.e., the learned $h_t$), rather than straightforwardly predicting significant wave height. It should be noted that the unconditional significant wave height distribution might not be Gaussian but arbitrary. It is observed in the literature of probability that the distribution of an arbitrary (non-Gaussian) process can be approximated by a weighted combination of Gaussian distribution functions. In the light of this observation, the unconditional distribution $P(r_{t+1})$ of the significant wave height process can be expressed as a weighted combination of the Gaussian distributions $P(r_{t+1} \mid h_t)$ as follows:
$$P(r_{t+1}) = \int_{h_t} P(r_{t+1} \mid h_t)\, P(h_t)\, \mathrm{d}h_t, \tag{14}$$
where $P(h_t)$ acts as the combination weight. This justifies using the Gaussian function to model the conditional distribution of significant wave height.
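The two parallel layers and the resulting Gaussian density can be sketched as below. This is a minimal illustration assuming that $\sigma(h_t)$ is the softplus of a linear mapping of $h_t$ and plays the role of a standard deviation. The function names and the weight values are hypothetical.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def gaussian_head(h, w_mu, b_mu, w_sigma, b_sigma):
    """Map the LSTM output h to the mean and spread of a Gaussian."""
    mu = sum(w * hk for w, hk in zip(w_mu, h)) + b_mu
    sigma = softplus(sum(w * hk for w, hk in zip(w_sigma, h)) + b_sigma)
    return mu, sigma


def gaussian_log_density(r, mu, sigma):
    """log of the Gaussian density evaluated at the observation r."""
    return (-math.log(sigma)
            - 0.5 * math.log(2.0 * math.pi)
            - (r - mu) ** 2 / (2.0 * sigma ** 2))


h = [0.2, -0.1, 0.4]  # made-up LSTM output
mu, sigma = gaussian_head(h, [1.0, 0.5, -0.5], 1.0, [0.3, 0.3, 0.3], -1.0)
log_p = gaussian_log_density(1.0, mu, sigma)
```

Because the softplus output is positive for any moderate input, the logarithm and the division in `gaussian_log_density` are well defined without clipping.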

2.4. Sampling for Correcting Numerical Predictions

The prediction for significant wave height at the time t + 1 is obtained by sampling from the conditional probability distribution of prediction (12) and it is presented as follows:
$$\hat{n}_{t+1} \sim P(\cdot \mid h_t; \Theta). \tag{15}$$
The benefits of the sampling procedure are two-fold. First, the sampling procedure produces a future result by not only learning historical measurements but also correcting numerical predictions with respect to historical measurements. This is enabled by the conditional probability distribution, which encodes both measurements and numerical predictions within $h_t$. Second, the sampling strategy alleviates the accumulation of prediction errors, which causes considerable prediction uncertainty. At each time step, an LSTM that does not sample the conditional probabilistic distribution predicts a fixed value. The predicted value might be contaminated by noise. If the noisy predicted values are repeatedly fed back to update the LSTM, error accumulation inevitably arises. Random sampling of the conditional distribution counteracts such error accumulation and renders predictions more robust than those straightforwardly regressed from an LSTM.

2.5. N-LSTM Training

One sample for training the N-LSTM consists of a real significant wave height observation sequence $r_{1:T} = \{r_1, r_2, \dots, r_\tau, r_{\tau+1}, \dots, r_T\}$ and a significant wave height numerical prediction sequence $n_{1:T} = \{n_1, n_2, \dots, n_\tau, n_{\tau+1}, \dots, n_T\}$, where $T$ indicates the sequence length. The training procedure assumes that real observations are available as input at times $t \in \{1, 2, \dots, \tau\}$ but unavailable at times $t \in \{\tau+1, \dots, T\}$. In the latter interval, the significant wave height predictions obtained as described in Section 2.4 stand in for the measurements.
In addition, we construct the significant wave height measurement sequence $m_{1:T} = \{m_1, m_2, \dots, m_\tau, m_{\tau+1}, \dots, m_T\}$, with the significant wave height measurement $m_t$ defined as follows:
$$m_t = \begin{cases} r_t, & \text{for } t \in \{1, 2, \dots, \tau\}, \\ \hat{n}_t, & \text{for } t \in \{\tau+1, \dots, T\}, \end{cases} \tag{16}$$
where $r_t$ is the real observation of the significant wave height at time $t$ and $\hat{n}_t$ is computed according to (15) and obtained as follows:
$$\hat{n}_t \sim P(\cdot \mid h_{t-1}; \Theta). \tag{17}$$
Given a training dataset including $I$ significant wave height numerical prediction sequences $\{n^{(i)}\}_{i=1,\dots,I}$ and $I$ wave height real observation sequences $\{r^{(i)}\}_{i=1,\dots,I}$, we train the N-LSTM by maximizing the log-likelihood as follows:
$$\hat{\Theta} = \arg\max_{\Theta} L(\Theta) = \arg\max_{\Theta} \sum_{i=1}^{I} \sum_{t=1}^{T} \log P\left(r_{t+1}^{(i)} \mid h_t^{(i)}; \Theta\right). \tag{18}$$
We use a stochastic gradient descent algorithm to optimize Equation (18), computing the gradient of the log-likelihood with respect to $\Theta$. Specifically, the Adam adaptive stochastic gradient descent optimizer [40] is used to minimize the negative log-likelihood, which serves as the loss in our work. The overall procedure of training the N-LSTM network is presented in Algorithm 1.
Algorithm 1 The training procedure of the N-LSTM
1: Input: The significant wave height numerical predictions $\{n^{(i)}\}_{i=1,\dots,I}$ and the real observations $\{r^{(i)}\}_{i=1,\dots,I}$.
2: Output: The parameters $\hat{\Theta}$ of the N-LSTM.
3: for $i = 1, 2, \dots, I$ do
4:     for $t = 1, 2, \dots, T$ do
5:         Compute $h_t$ according to (2)–(7).
6:         Compute $\mu(h_t)$ and $\sigma(h_t)$ according to (9) and (10).
7:         Construct the conditional probabilistic distribution $P(r_{t+1} \mid h_t; \Theta)$ according to (12).
8:     end for
9:     Evaluate the log-likelihood according to (18).
10:    Update $\Theta$ by a stochastic gradient step that increases the log-likelihood.
11: end for
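To illustrate the principle behind the likelihood-based training, the toy example below fits only a constant mean and a softplus-parameterized spread to a handful of values by gradient descent on the Gaussian negative log-likelihood. It is a sketch under strong simplifications: the LSTM is replaced by two free parameters, plain full-batch gradient descent stands in for Adam, and the data, learning rate and step count are made up.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_gaussian(observations, lr=0.02, steps=5000):
    """Fit mu and sigma = softplus(rho) by gradient descent on the mean NLL."""
    mu, rho = 0.0, 0.0
    n = len(observations)
    for _ in range(steps):
        sigma = softplus(rho)
        # Per-sample loss: log(sigma) + (r - mu)^2 / (2 sigma^2) + const.
        grad_mu = sum(-(r - mu) / sigma ** 2 for r in observations) / n
        grad_sigma = sum(1.0 / sigma - (r - mu) ** 2 / sigma ** 3
                         for r in observations) / n
        mu -= lr * grad_mu
        # Chain rule: the derivative of softplus(rho) is sigmoid(rho).
        rho -= lr * grad_sigma * sigmoid(rho)
    return mu, softplus(rho)


heights = [1.0, 1.4, 0.8, 1.2, 1.6]  # made-up wave heights in metres
mu_hat, sigma_hat = fit_gaussian(heights)
```

Minimizing the negative log-likelihood drives `mu_hat` toward the sample mean and `sigma_hat` toward the sample standard deviation, which is the same mechanism that shapes $\mu(h_t)$ and $\sigma(h_t)$ when the parameters belong to a network instead.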

3. Numerical Wave Height Series Correction Based on N-LSTM

Given the significant wave height real observations $r_{1:\tau} = \{r_1, r_2, \dots, r_\tau\}$ and numerical predictions $n_{1:T} = \{n_1, n_2, \dots, n_\tau, n_{\tau+1}, \dots, n_T\}$, the learned N-LSTM corrects the significant wave height numerical predictions $\{n_{\tau+1}, n_{\tau+2}, \dots, n_T\}$ and produces corrected numerical predictions $\{\hat{n}_{\tau+1}, \hat{n}_{\tau+2}, \dots, \hat{n}_T\}$. Figure 4 shows the significant wave height prediction procedure of N-LSTM.
The significant wave heights at times $t \in \{1, 2, \dots, \tau\}$ do not require prediction because real observations are available in this interval. The significant wave height real observations $r_{1:\tau}$ and numerical predictions $n_{1:\tau}$ at times $t \in \{1, 2, \dots, \tau\}$ are used for computing $h_\tau$ according to (7). The real observation is no longer available starting from time $\tau+1$. At time $\tau+1$, the corrected prediction $\hat{n}_{\tau+1}$ is produced by the Gaussian approximation module by sampling the conditional distribution as follows:
$$\hat{n}_{\tau+1} \sim P(\cdot \mid h_\tau; \Theta). \tag{19}$$
At times $t \in \{\tau+1, \dots, T\}$, the corrected prediction $\hat{n}_t$ is used as the measurement $m_t$, i.e.,
$$m_t = \hat{n}_t. \tag{20}$$
Specifically, the input of the N-LSTM is given as follows:
$$x_t = [\hat{n}_t, n_{t+1}]^{*}, \quad \text{for } t \in \{\tau+1, \dots, T\}. \tag{21}$$
The corrected numerical prediction series $\{\hat{n}_{\tau+1}, \hat{n}_{\tau+2}, \dots, \hat{n}_T\}$ is obtained by iterating the N-LSTM procedure as illustrated in Figure 1. In order to render more accurate predictions, the sampling at each time $t \in \{\tau+1, \dots, T\}$ is performed multiple times and the average of the samples is taken as the resulting prediction $\hat{n}_t$.
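The iteration described in this section can be sketched as a short rollout loop. The interface `step_fn(x, h, c) -> (mu, sigma, h, c)` is a hypothetical stand-in for the trained LSTM module plus Gaussian approximation module, and `toy_step` is a made-up model used only to make the example runnable. The number of samples per step is also an assumption, since the paper does not state it.

```python
import random


def rollout(step_fn, h, c, last_measurement, future_numerical,
            n_samples=100, seed=0):
    """Correct future numerical predictions by iterating the model.

    h and c are the states after processing inputs up to time tau - 1,
    last_measurement is r_tau, and future_numerical is the list
    [n_{tau+1}, ..., n_T].
    """
    rng = random.Random(seed)
    corrected = []
    m = last_measurement
    for n_next in future_numerical:
        mu, sigma, h, c = step_fn([m, n_next], h, c)
        samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        # The averaged sample is the corrected prediction and also
        # replaces the unavailable measurement at the next step.
        m = sum(samples) / n_samples
        corrected.append(m)
    return corrected


def toy_step(x, h, c):
    """Stand-in model: mean halfway between the two inputs, fixed spread."""
    m, n_next = x
    return 0.5 * (m + n_next), 0.05, h, c


corrected = rollout(toy_step, h=None, c=None, last_measurement=1.0,
                    future_numerical=[1.2, 1.4, 1.3])
```

Averaging several samples per step keeps the stochastic benefit of sampling while reducing the variance of the value that is fed back into the next step.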

4. Results

4.1. Data Collection and Evaluation Criterions

The study area is the Bohai Sea and the northern part of the Yellow Sea (35°–41° N, 117°–123° E), as shown in Figure 5. The submarine topography in the study area is relatively flat, with an average water depth of about 18 m. The maximum tidal range is about 2.7 m. The study area is dominated by wind waves and is greatly affected by the monsoon.
In our work, the wave height numerical predictions are obtained from the third-generation numerical wave model SWAN. The grids in this study have a spatial resolution of 0.1° × 0.1°. The source term in the SWAN model uses the default formula package. The model considers the dissipation caused by whitecapping, bottom friction and depth-induced wave breaking [41]. Other configurations of the SWAN model are set according to [42]. The real wave height observations for validating our method are collected from buoys located in the Bohai Sea, China, and the Xiaomaidao Nearshore Sea, Shandong. The buoy in the Bohai Sea is located in the open waters inside the Bohai Sea with a water depth of 20 m. The position of the buoy at Xiaomaidao is shown in Figure 5 and the water depth there is about 24 m. All data are recorded hourly over a two-year period from January 2017 to December 2018. We use a total of 17,510 sample pairs to train and test our model. We randomly select 10% of the time-continuous data pairs as the test set and the remaining 90% of the data pairs are used as the training set. All training data and test data are divided into real significant wave height observation sequences and numerical model prediction sequences according to Section 2.5. The number of sequences depends on the prediction time T.
We adopt the root mean square error (RMSE), mean absolute percentage error (MAPE) and skill score (SS) to evaluate the effectiveness of the proposed method. The RMSE and MAPE metrics are defined as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{T} \sum_{t=1}^{T} (r_t - \hat{n}_t)^2}, \tag{22}$$
$$\mathrm{MAPE} = \frac{1}{T} \sum_{t=1}^{T} \left| \frac{r_t - \hat{n}_t}{r_t} \right|. \tag{23}$$
The skill score shows the relative improvement of the proposed models over the numerical model. It is defined as follows:
$$\mathrm{SS} = \frac{\sum_{t=1}^{T} |r_t - n_t| - \sum_{t=1}^{T} |r_t - \hat{n}_t|}{\sum_{t=1}^{T} |r_t - n_t|}. \tag{24}$$
A prediction method performs better when its RMSE and MAPE are closer to 0 and its SS is closer to 1.
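The three criteria can be computed with a few lines of Python. The sketch below assumes that the skill score compares the absolute error of the corrected prediction against observations with the absolute error of the uncorrected numerical prediction, so that a perfect correction gives SS = 1. The function names and the sample values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observations and predictions."""
    return math.sqrt(
        sum((r - p) ** 2 for r, p in zip(observed, predicted)) / len(observed)
    )


def mape(observed, predicted):
    """Mean absolute percentage error as a fraction (observations nonzero)."""
    return sum(abs((r - p) / r) for r, p in zip(observed, predicted)) / len(observed)


def skill_score(observed, numerical, corrected):
    """Relative reduction of absolute error achieved by the correction."""
    base = sum(abs(r - n) for r, n in zip(observed, numerical))
    err = sum(abs(r - p) for r, p in zip(observed, corrected))
    return (base - err) / base


# Made-up wave heights in metres.
obs = [1.0, 2.0, 4.0]
num = [0.5, 1.5, 3.0]
cor = [1.0, 2.0, 3.5]
scores = (rmse(obs, cor), mape(obs, cor), skill_score(obs, num, cor))
```

In this example the correction removes three quarters of the numerical model's absolute error, so the skill score is 0.75, while a correction that is worse than the numerical model would give a negative score.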
In our experiment, the number of epochs and the learning rate are set to 10,000 and 0.001, respectively. The number of hidden units of the LSTM module in the N-LSTM is set to 50. The hyper-parameter setting of the LSTM is the same as that of the N-LSTM. The major hyper-parameter of the ELM methods is the number of hidden nodes, which is set to 36 for both ELM and MFELM. We carry out hyper-parameter optimization through manual search. We conduct our experiments on a computing platform with three Intel Xeon CPUs E5-2690 at 2.60 GHz. The N-LSTM and LSTM are implemented in Python with PyTorch. The ELM residual learning method and the MFELM residual learning method are implemented in MATLAB R2019a.

4.2. Empirical Evaluations

We compare the performance of our N-LSTM method with the LSTM method, the ELM residual learning method [36] and the MFELM residual learning method [43] in correcting the future significant wave height predictions of the numerical model. The numerical model serves as the baseline against which the improvement of each method is evaluated. The LSTM is one module of our N-LSTM, and the other important module is the Gaussian approximation. Comparing against the sole LSTM therefore validates the effectiveness of the overall N-LSTM, especially of the Gaussian approximation. The ELM and MFELM share some ideas with our N-LSTM in terms of correcting numerical predictions. They straightforwardly model the difference between the numerical predictions and the real observations via residual learning. In particular, the MFELM residual learning method exploits multiple factors, such as wind speed and wind direction, and achieves state-of-the-art performance for wave height predictions. In contrast to the ELM and MFELM residual correction methods, the N-LSTM does not model the residuals directly but incorporates the correction procedure into an LSTM followed by a Gaussian approximation. The empirical comparison between the N-LSTM and the (MF)ELM validates the advantage of this integrated correction procedure over direct residual correction.
We correct the significant wave height numerical model predictions for 3 h, 6 h, 12 h, 24 h, 48 h and 72 h with a time step of one hour. A separate N-LSTM model is trained for each of these forecast lead times. Table 1 and Table 2 show wave height prediction results at the buoy positions in the open water of the Bohai Sea and in the nearshore water by Xiaomaidao, respectively. The bold entries denote the best results. From Table 1, we observe that our proposed N-LSTM is significantly better than the LSTM method, the ELM residual learning method and the MFELM residual learning method in each evaluation criterion. This shows that correcting the future numerical model predictions according to the real observations and the numerical model predictions at previous time steps is useful, and that our model effectively learns the relationship between the numerical model predictions and real observations. In addition, we observe that the accuracy of all prediction methods decreases as the prediction time increases. In particular, the accuracies of the ELM-related methods are considerably reduced when the prediction time exceeds 24 h. However, our method maintains good predictions at 24 h and 48 h, where the skill scores are both higher than 50%. The skill score drops to 36% in the 72-h prediction and is still significantly better than those of the other methods. This validates that our N-LSTM maintains good results in long-term predictions, because the LSTM structure retains long-term information that is not available to the ELM methods. From Table 2, the N-LSTM achieves the best performance compared with the other methods except for the six-hour prediction. The comparison methods have different wave height prediction accuracies at different locations. The original LSTM outperforms the two ELM methods for the Bohai Sea, whereas the MFELM outperforms the original LSTM for Xiaomaidao. These results confirm that our method effectively improves the accuracy of nearshore wave height prediction and is robust in predicting wave heights in different sea areas.
Figure 6 shows the significant wave height prediction results at various time steps, which allows a qualitative evaluation of the methods. We see that the performance of our method is better than that of the other three machine learning methods. In particular, Figure 6d,e show that as the significant wave height increases, the error between the predictions of the numerical model and the real observations increases too. Our N-LSTM method is more accurate than the other three machine learning methods in these cases. More detailed validations are given in Section 5.1. In addition, the principle of the ELM residual learning method and the MFELM residual learning method is to predict the residuals between the real observations and the predictions from the numerical model. It is difficult to predict the residual accurately when the residuals change suddenly (i.e., when the significant wave height has a peak). Our N-LSTM method characterizes the temporal correlation between real observations and numerical predictions rather than straightforwardly using their difference, and thus achieves better performance.

5. Discussion

5.1. Analysis of Significant Wave Height Predictions between 1.25 m and 4 m

The World Meteorological Organization (WMO) sea state code describes the wave height and corresponding characteristics [44]. When the significant wave height is below 1.25 m, the sea state is calm or slight, and the numerical model performs well. Significant wave heights above 4 m are extremely unlikely on the coast. Significant wave heights between 1.25 m and 4 m do occur from time to time. In this range the sea state is rough and affects marine production. This section therefore discusses the model predictions for significant wave heights between 1.25 m and 4 m.
Figure 7 shows the scatter plot of the significant wave height prediction results for which the real observations are between 1.25 m and 4 m. In the plot, the diagonal indicates 100% accuracy. Points close to the diagonal reflect accurate predictions and those far from the diagonal reflect inaccurate ones. In Figure 7, the numerical model prediction results are concentrated on the upper left of the diagonal, which means that the numerical model predictions are mostly smaller than the real observations. The prediction results obtained by our method are closer to the diagonal than the results obtained by the ELM residual learning method and the MFELM residual learning method. In particular, when the significant wave heights are higher than 2 m, the predictions of the numerical model deviate strongly and the two ELM methods predict poorly. In contrast, our method maintains good predictions in this situation.

5.2. Analysis of Probabilistic Wave Height Predictions

In addition to correcting the predictions of the wave height numerical model, our method obtains confidence intervals for the predicted wave heights through Monte Carlo sampling [45] and thus provides probabilistic forecasts of future wave heights. Figure 8 shows probabilistic predictions for forecasting times of 1–3 h (nowcast) and 48 h (long forecast) over 600 h of consecutive predictions. The black points show the real observations. The green line shows the numerical model predictions. We plot the prediction results from our method as a red line along with 80% confidence intervals (shaded area). The confidence interval is the range within which the real wave height observation is expected to fall with the stated probability.
Figure 8 indicates that our N-LSTM predictions accurately approach most of the real observations for the one-h probability forecast, no matter the significant wave heights have a peak or stay stable. In addition, the corrected confidence intervals are smooth and their amplitude is small. This reflects that the probabilistic forecast is useful in the one-h prediction. The confidence interval increases as the forecast time step increases and the amplitude fluctuate greatly at the peak. In general, the confidence interval reasonably presents the characteristics of uncertainty, because the confidence interval contains most of the real observations. Accurate probability prediction of significant wave height plays an important role in early warning of Marine disasters.
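One simple way to quantify the statement that the interval contains most of the observations is the empirical coverage, reported together with the mean interval width. The sketch below uses short made-up series in metres; the function names `coverage` and `mean_width` are hypothetical and the numbers are not taken from Figure 8.

```python
def coverage(observations, lowers, uppers):
    """Fraction of observations lying inside [lower, upper] at each time step."""
    if not (len(observations) == len(lowers) == len(uppers)):
        raise ValueError("all series must have the same length")
    hits = sum(lo <= obs <= up for obs, lo, up in zip(observations, lowers, uppers))
    return hits / len(observations)

def mean_width(lowers, uppers):
    """Average width of the interval across all time steps."""
    return sum(up - lo for lo, up in zip(lowers, uppers)) / len(lowers)

# Hypothetical hourly values in metres; NOT data from the paper.
obs = [1.2, 1.5, 2.1, 2.8, 2.0]
low = [1.0, 1.3, 1.6, 2.0, 1.7]
upp = [1.4, 1.8, 2.2, 2.6, 2.3]

cov = coverage(obs, low, upp)
width = mean_width(low, upp)
print(f"coverage = {cov:.2f}, mean width = {width:.2f} m")
```

For a well-calibrated 80% interval, the coverage should be close to 0.8; a much higher value together with a large mean width would suggest the interval is wider than necessary.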

6. Conclusions

This paper presents a numerical long short-term memory (N-LSTM) network for correcting the significant wave height predictions from the numerical model. The N-LSTM takes as input a combined significant wave height representation, formed of real observations from a buoy in the Bohai Sea and the SWAN predictions at the same location, and produces the corrected numerical prediction. The N-LSTM characterizes the temporal correlation between the real measurement and the numerical prediction and models the conditional probabilistic distribution of the significant wave height given the real observations and numerical predictions. Compared with traditional machine learning prediction methods, our method reduces the accumulation of error in the prediction process. The experimental results show that the N-LSTM network has better prediction accuracy. Moreover, our N-LSTM method performs better when the significant wave height is over 1.25 m, a case in which the numerical model performs poorly. The N-LSTM also makes probabilistic predictions of significant wave heights, and the confidence interval of the prediction covers most of the real observations. Therefore, our N-LSTM method is capable of coping not only with difficult wave situations but also with prediction uncertainties, and provides a robust prediction strategy.
Furthermore, we find that the wave height prediction of the numerical model sometimes lags behind the observations. Since our model uses only the significant wave height as input data, the prediction may fail in such cases. In future work, we will add sea surface wind, atmospheric pressure and other factors as inputs and investigate in depth the impact of the wave type [46,47] and meteorological elements.

Author Contributions

Conceptualization, X.Z. and P.R.; methodology, X.Z.; software, X.Z.; validation, X.Z., Y.L. and P.R.; formal analysis, X.Z. and P.R.; data curation, S.G.; writing—original draft preparation, X.Z. and P.R.; writing—review and editing, X.Z., Y.L. and P.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key R&D Program of China, grant number 2019YFC1408400; the National Natural Science Foundation of China, grant number 61971444; and the Shandong Provincial Natural Science Foundation, grant number ZR2019MF019.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Yang, Z.; Shao, W.; Ding, Y.; Shi, J.; Ji, Q. Wave Simulation by the SWAN Model and FVCOM Considering the Sea-Water Level around the Zhoushan Islands. J. Mar. Sci. Eng. 2020, 8, 783.
  2. Young, I.R.; Ribal, A. Multiplatform evaluation of global trends in wind speed and wave height. Science 2019, 364, 548–552.
  3. Carter, D. Prediction of wave height and period for a constant wind velocity using the JONSWAP results. Ocean Eng. 1982, 9, 17–33.
  4. Muzathik, A.M.; Nik, W.B.W.; Samo, K.B.; Ibrahim, M.Z. Ocean wave measurement and wave climate prediction of Peninsular Malaysia. J. Phys. Sci. 2011, 22, 77–92.
  5. Simmons, H.L.; Jayne, S.R.; Laurent, L.C.; Weaver, A.J. Tidally driven mixing in a numerical model of the ocean general circulation. Ocean Model. 2004, 6, 245–263.
  6. Mandal, S.; Prabaharan, N. Ocean Wave Prediction Using Numerical and Neural Network Models. Open Ocean Eng. J. 2010, 3, 12–17.
  7. Dentale, F.; Furcolo, P.; Carratelli, E.P.; Reale, F.; Contestabile, P.; Tomasicchio, G.R. Extreme wave analysis by integrating model and wave buoy data. Water 2018, 10, 373.
  8. Group, T.W. The WAM model—A third generation ocean wave prediction model. J. Phys. Oceanogr. 1988, 18, 1775–1810.
  9. Bottcher, A.B.; Whiteley, B.J.; James, A.I.; Hiscock, J.G. Watershed Assessment Model (WAM): Model Use, Calibration, and Validation. Trans. ASABE 2012, 55, 1367–1383.
  10. Tolman, H.L. User manual and system documentation of WAVEWATCH III TM version 3.14. Technical note. MMAB Contrib. 2009, 276, 220.
  11. Mentaschi, L.; Besio, G.; Cassola, F.; Mazzino, A. Performance evaluation of Wavewatch III in the Mediterranean Sea. Ocean Model. 2015, 90, 82–94.
  12. Booij, N.; Holthuijsen, L.; Ris, R. The “Swan” Wave Model for Shallow Water. Coast. Eng. 1996, 1997, 668–676.
  13. Rogers, W.E.; Hwang, P.A.; Wang, D.W. Investigation of Wave Growth and Decay in the SWAN Model: Three Regional-Scale Applications. J. Phys. Oceanogr. 2003, 33, 366–389.
  14. Swain, J.; Umesh, P.; Balchand, A. WAM and WAVEWATCH-III intercomparison studies in the North Indian Ocean using Oceansat-2 Scatterometer winds. J. Ocean. Clim. Sci. Technol. Impacts 2019, 9, 1–24.
  15. Liu, Q.; Rogers, W.E.; Babanin, A.V.; Young, I.R.; Romero, L.; Zieger, S.; Qiao, F.; Guan, C. Observation-Based Source Terms in the Third-Generation Wave Model WAVEWATCH III: Updates and Verification. J. Phys. Oceanogr. 2019, 49, 489–517.
  16. Akpınar, A.; van Vledder, G.P.; Kömürcü, M.İ.; Özger, M. Evaluation of the numerical wave model (SWAN) for wave simulation in the Black Sea. Cont. Shelf Res. 2012, 50, 80–99.
  17. Liang, B.; Gao, H.; Shao, Z. Characteristics of global waves based on the third-generation wave model SWAN. Mar. Struct. 2019, 64, 35–53.
  18. Asma, S.; Sezer, A.; Ozdemir, O. MLR and ANN models of significant wave height on the west coast of India. Comput. Geosci. 2012, 49, 231–237.
  19. Kumar, N.K.; Savitha, R.; Al Mamun, A. Ocean wave height prediction using ensemble of Extreme Learning Machine. Neurocomputing 2018, 277, 12–20.
  20. Yu, T.; Wang, J. A Spatiotemporal Convolutional Gated Recurrent Unit Network for Mean Wave Period Field Forecasting. J. Mar. Sci. Eng. 2021, 9, 383.
  21. Bidlot, J.-R.; Holmes, D.J.; Wittmann, P.A.; Lalbeharry, R.; Chen, H.S. Intercomparison of the Performance of Operational Ocean Wave Forecasting Systems with Buoy Data. Weather Forecast. 2002, 17, 287–310.
  22. Fan, C.; Wang, X.; Zhang, X.; Gao, D. A newly developed ocean significant wave height retrieval method from Envisat ASAR wave mode imagery. Acta Oceanol. Sin. 2019, 38, 120–127.
  23. Wang, J.; Zhang, J.; Yang, J.; Bao, W.; Wu, G.; Ren, Q. An evaluation of input/dissipation terms in WAVEWATCH III using in situ and satellite significant wave height data in the South China Sea. Acta Oceanol. Sin. 2017, 36, 20–25.
  24. Deo, M.; Naidu, C.S. Real time wave forecasting using neural networks. Ocean Eng. 1998, 26, 191–203.
  25. Tsai, C.-P.; Lin, C.; Shen, J.-N. Neural network for wave forecasting among multi-stations. Ocean Eng. 2002, 29, 1683–1695.
  26. Deo, M.; Jha, A.; Chaphekar, A.; Ravikant, K. Neural networks for wave forecasting. Ocean Eng. 2001, 28, 889–898.
  27. Berbić, J.; Ocvirk, E.; Carević, D.; Lončar, G. Application of neural networks and support vector machine for significant wave height prediction. Oceanologia 2017, 59, 331–349.
  28. Dixit, P.; Londhe, S. Prediction of extreme wave heights using neuro wavelet technique. Appl. Ocean. Res. 2016, 58, 241–252.
  29. Mikolov, T.; Kombrink, S.; Burget, L.; Cernocky, J.; Khudanpur, S. Extensions of recurrent neural network language model. In Proceedings of the 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, 22–27 May 2011; pp. 5528–5531.
  30. Gers, F.A.; Schraudolph, N.N.; Schmidhuber, J. Learning precise timing with LSTM recurrent networks. J. Mach. Learn. Res. 2002, 3, 115–143.
  31. Zhang, C.; Zeng, J.; Wang, H.; Ma, L.; Chu, H. Correction model for rainfall forecasts using the LSTM with multiple meteorological factors. Meteorol. Appl. 2020, 27, 1852.
  32. Mandal, S.; Prabaharan, N. Ocean wave forecasting using recurrent neural networks. Ocean Eng. 2006, 33, 1401–1410.
  33. Pushpam P., M.M.; Enigo V.S., F. Forecasting Significant Wave Height using RNN-LSTM Models. In Proceedings of the 2020 4th International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India, 13–15 May 2020; pp. 1141–1146.
  34. Kaloop, M.R.; Kumar, D.; Zarzoura, F.; Roy, B.; Hu, J.W. A wavelet—Particle swarm optimization—Extreme learning machine hybrid modeling for significant wave height prediction. Ocean Eng. 2020, 213, 107777.
  35. Deshmukh, A.N.; Deo, M.C.; Bhaskaran, P.K.; Nair, T.B.; Sandhya, K.G. Neural-network-based data assimilation to improve numerical ocean wave forecast. IEEE J. Ocean. Eng. 2016, 41, 944–953.
  36. Wang, T.; Gao, S.; Xu, J.; Li, Y.; Li, P.; Ren, P. Correcting Predictions from Oceanic Maritime Numerical Models via Residual Learning. In Proceedings of the 2018 OCEANS—MTS/IEEE Kobe Techno-Oceans (OTO), Kobe, Japan, 28–31 May 2018; pp. 1–4.
  37. Campos, R.M.; Krasnopolsky, V.; Alves, J.-H.G.M.; Penny, S.G. Nonlinear Wave Ensemble Averaging in the Gulf of Mexico Using Neural Networks. J. Atmos. Ocean. Technol. 2019, 36, 113–127.
  38. Campos, R.M.; Krasnopolsky, V.; Alves, J.-H.; Penny, S.G. Improving NCEP’s global-scale wave ensemble averages using neural networks. Ocean Model. 2020, 149, 101617.
  39. Salinas, D.; Flunkert, V.; Gasthaus, J.; Januschowski, T. DeepAR: Probabilistic forecasting with autoregressive recurrent networks. Int. J. Forecast. 2020, 36, 1181–1191.
  40. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. In Proceedings of the International Conference Learning Representations (ICLR), San Diego, CA, USA, 5–8 May 2015.
  41. Lv, X.; Yuan, D.; Ma, X.; Tao, J. Wave characteristics analysis in Bohai Sea based on ECMWF wind field. Ocean Eng. 2014, 91, 159–171.
  42. Wu, W.; Li, P.; Zhai, F.; Gu, Y.; Liu, Z. Evaluation of different wind resources in simulating wave height for the Bohai, Yellow, and East China Seas (BYES) with SWAN model. Cont. Shelf Res. 2020, 207, 104217.
  43. Wang, T.; Gao, S.; Bi, F.; Li, Y.; Guo, D.; Ren, P. Residual Learning with Multifactor Extreme Learning Machines for Wave height Prediction. IEEE J. Ocean. Eng. 2020, 46, 611–623.
  44. Taboada, J.V.; Hirpa, G.L. Analysis of Wave Energy Sources in the North Atlantic Waters in View of Design Challenges. In Proceedings of the 2016 35th International Conference on Ocean, Offshore and Arctic Engineering, Busan, Korea, 19–24 June 2016; pp. 1–9.
  45. Hastings, W.K. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 1970, 57, 97–109.
  46. Stopa, J.E.; Cheung, K.F. Intercomparison of wind and wave data from the ECMWF Reanalysis Interim and the NCEP Climate Forecast System Reanalysis. Ocean Model. 2014, 75, 65–83.
  47. Bruno, M.F.; Molfetta, M.G.; Totaro, V.; Mossa, M. Performance Assessment of ERA5 Wave Data in a Swell Dominated Region. J. Mar. Sci. Eng. 2020, 8, 214.
Figure 1. The structure of the N-LSTM.
Figure 2. The architecture of the LSTM module.
Figure 3. The architecture of a Gaussian approximation module.
Figure 4. The prediction procedure of N-LSTM.
Figure 5. Model domain and the buoy position.
Figure 6. Wave numerical predictions and significant wave height prediction results from the LSTM method, the ELM residual learning method, the MFELM residual learning method and our method: (a) 3 h; (b) 6 h; (c) 12 h; (d) 24 h; (e) 48 h; (f) 72 h.
Figure 7. Scatter plot for significant wave height results between 1.25 m and 4 m.
Figure 8. Wave height probabilistic predictions with 80% confidence interval (shaded area) for 1–3 h and 48 h: (a) 1 h forecasting; (b) 2 h forecasting; (c) 3 h forecasting; (d) 48 h forecasting.
Table 1. Significant wave height prediction results for the buoy position in the Bohai Sea.

| Validation metric | Method | 3 h | 6 h | 12 h | 24 h | 48 h | 72 h |
|---|---|---|---|---|---|---|---|
| RMSE | Numerical model | 0.4730 | 0.4325 | 0.3716 | 0.2728 | 0.2811 | 0.2807 |
| RMSE | LSTM | 0.1372 | 0.1835 | 0.2163 | 0.1554 | 0.1706 | 0.2051 |
| RMSE | ELM residual learning | 0.2160 | 0.2525 | 0.2796 | 0.2409 | 0.2340 | 0.2427 |
| RMSE | MFELM residual learning | 0.1888 | 0.2133 | 0.2194 | 0.2126 | 0.2104 | 0.2103 |
| RMSE | N-LSTM (our method) | 0.0581 | 0.1248 | 0.1479 | 0.1237 | 0.1500 | 0.1869 |
| MAPE | Numerical model | 0.2688 | 0.2466 | 0.2267 | 0.2717 | 0.1544 | 0.1420 |
| MAPE | LSTM | 0.0728 | 0.1025 | 0.1281 | 0.1522 | 0.0917 | 0.1092 |
| MAPE | ELM residual learning | 0.1172 | 0.1393 | 0.1569 | 0.2789 | 0.1219 | 0.1209 |
| MAPE | MFELM residual learning | 0.1039 | 0.1167 | 0.1192 | 0.2611 | 0.1026 | 0.1039 |
| MAPE | N-LSTM (our method) | 0.0319 | 0.0691 | 0.0830 | 0.1233 | 0.0833 | 0.0946 |
| SS | LSTM | 0.7412 | 0.5854 | 0.4343 | 0.5142 | 0.4360 | 0.2848 |
| SS | ELM residual learning | 0.5598 | 0.4383 | 0.3128 | 0.1135 | 0.2189 | 0.1497 |
| SS | MFELM residual learning | 0.6362 | 0.5280 | 0.4626 | 0.2215 | 0.3267 | 0.2776 |
| SS | N-LSTM (our method) | 0.8890 | 0.7221 | 0.6530 | 0.5670 | 0.5266 | 0.3655 |
Table 2. Significant wave height prediction results for Xiaomaidao.

| Validation metric | Method | 3 h | 6 h | 12 h | 24 h | 48 h | 72 h |
|---|---|---|---|---|---|---|---|
| RMSE | Numerical model | 0.1529 | 0.1181 | 0.1248 | 0.1547 | 0.1777 | 0.3243 |
| RMSE | LSTM | 0.0610 | 0.0927 | 0.0838 | 0.1353 | 0.1593 | 0.2326 |
| RMSE | ELM residual learning | 0.0839 | 0.0402 | 0.0915 | 0.1119 | 0.1546 | 0.2713 |
| RMSE | MFELM residual learning | 0.0766 | 0.0440 | 0.0775 | 0.0849 | 0.1176 | 0.2944 |
| RMSE | N-LSTM (our method) | 0.0174 | 0.0574 | 0.0765 | 0.0831 | 0.1166 | 0.1858 |
| MAPE | Numerical model | 0.7639 | 0.4938 | 0.6443 | 0.7174 | 0.9697 | 1.2006 |
| MAPE | LSTM | 0.2955 | 0.3278 | 0.4170 | 0.6332 | 0.8954 | 0.8170 |
| MAPE | ELM residual learning | 0.4171 | 0.1446 | 0.4574 | 0.4220 | 0.5910 | 0.8979 |
| MAPE | MFELM residual learning | 0.3815 | 0.1808 | 0.2968 | 0.3629 | 0.6095 | 1.1040 |
| MAPE | N-LSTM (our method) | 0.0774 | 0.2191 | 0.3765 | 0.3516 | 0.6260 | 0.6229 |
| SS | LSTM | 0.6131 | 0.2213 | 0.4069 | 0.1398 | 0.1206 | 0.3317 |
| SS | ELM residual learning | 0.4540 | 0.7309 | 0.2968 | 0.3527 | 0.2518 | 0.1946 |
| SS | MFELM residual learning | 0.5005 | 0.6012 | 0.4339 | 0.5247 | 0.4092 | 0.0908 |
| SS | N-LSTM (our method) | 0.8987 | 0.5825 | 0.4453 | 0.5321 | 0.4218 | 0.4891 |
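For readers who want to reproduce error measures of the kind tabulated above, the sketch below computes RMSE and MAPE under their common textbook definitions, with MAPE expressed as a fraction rather than a percentage. These definitions are an assumption and may differ in detail from the ones used for the tables. The toy series are made up, and the last lines compute a plain relative RMSE reduction from two reported 3 h values for Xiaomaidao; that quantity is an illustrative measure and is not the SS reported in the tables.

```python
import math

def rmse(observations, predictions):
    """Root mean square error between two equally long series."""
    n = len(observations)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observations, predictions)) / n)

def mape(observations, predictions):
    """Mean absolute percentage error as a fraction; observations must be nonzero."""
    n = len(observations)
    return sum(abs(p - o) / abs(o) for o, p in zip(observations, predictions)) / n

# Hypothetical observations and predictions in metres; NOT data from the paper.
obs = [1.0, 2.0, 3.0]
pred = [1.1, 1.8, 3.3]
toy_rmse = rmse(obs, pred)
toy_mape = mape(obs, pred)

# Relative RMSE reduction at 3 h for Xiaomaidao, using the reported values
# 0.1529 (numerical model) and 0.0174 (N-LSTM). Illustrative only, not SS.
reduction = 1.0 - 0.0174 / 0.1529
print(f"RMSE = {toy_rmse:.4f}, MAPE = {toy_mape:.4f}, reduction = {reduction:.3f}")
```

Because MAPE divides by the observation, it grows quickly when the observed wave heights are small, which is worth keeping in mind when comparing MAPE values across sites with different wave climates.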
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Zhang, X.; Li, Y.; Gao, S.; Ren, P. Ocean Wave Height Series Prediction with Numerical Long Short-Term Memory. J. Mar. Sci. Eng. 2021, 9, 514. https://doi.org/10.3390/jmse9050514