Extreme Short-Term Prediction of Unmanned Surface Vessel Nonlinear Motion Under Waves

Wang, Yiwen; Li, Jian; Wang, Shan; Zhang, Hantao; Yang, Long; Wu, Weiguo

doi:10.3390/jmse13030610


by Yiwen Wang ¹, Jian Li ¹,², Shan Wang ³,*, Hantao Zhang ¹,², Long Yang ¹,² and Weiguo Wu

¹ Green & Smart River-Sea-Going Ship Cruise and Yacht Research Center, Wuhan University of Technology, Wuhan 430063, China

² School of Naval Architecture, Ocean and Energy Power Engineering, Wuhan University of Technology, Wuhan 430063, China

³ Centre for Marine Technology and Ocean Engineering, Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais, 1049-001 Lisbon, Portugal

* Author to whom correspondence should be addressed.

J. Mar. Sci. Eng. 2025, 13(3), 610; https://doi.org/10.3390/jmse13030610

Submission received: 27 February 2025 / Revised: 14 March 2025 / Accepted: 18 March 2025 / Published: 19 March 2025

(This article belongs to the Section Ocean Engineering)


Abstract: Under complex hydrodynamic conditions, an Unmanned Surface Vessel (USV) exhibits non-stationary and nonlinear dynamic behavior. Extreme short-term prediction of such nonlinear motion is therefore critical for ensuring navigational safety. To improve prediction accuracy, a combined VMD-CNN-LSTM prediction model is applied, based on Variational Mode Decomposition (VMD), a Convolutional Neural Network (CNN), and a Long Short-term Memory (LSTM) neural network. The method uses VMD to decompose the nonlinear motion time series of the USV, obtained by numerical simulation, into stationary Intrinsic Mode Functions (IMFs), then extracts spatial features from these IMFs with CNN layers, and finally predicts the temporal sequence with the LSTM module. Comparative analyses show that the VMD-CNN-LSTM model outperforms the standalone LSTM and CNN-LSTM models in predicting nonlinear motion under different significant wave heights. At a Prediction Advance Time (PAT) of 3.7 s, the VMD-CNN-LSTM model improves prediction accuracy (R² for pitch) by 13.3% for a wave height of 1.015 m (Case I) and 54.9% for a wave height of 1.998 m (Case II) compared to the CNN-LSTM model. With a PAT of 5.6 s, the accuracy gains increase to 32.9% for Case I and 94.6% for Case II, demonstrating the model’s robustness in extended prediction scenarios.

Keywords: nonlinear motion; variational mode decomposition; convolutional neural network; long short-term memory; extreme short-term prediction

1. Introduction

An Unmanned Surface Vessel (USV), characterized by high-speed motion, exceptional maneuverability, and small size, often encounters complex hydrodynamic phenomena such as porpoising and stalling when operating in challenging marine environments [1]. These phenomena induce non-stationary and highly nonlinear motion, which can significantly increase the risk of capsizing [2]. To mitigate such risks, extreme short-term prediction of the nonlinear motion of the USV under wave excitation is essential. Predicting the dynamic behavior of the USV in the near future can substantially improve its safety and operational reliability in complex sea conditions. Furthermore, the predicted nonlinear motion can be fed back into the autonomous control system, enabling real-time adaptive adjustment to environmental variations and effective stabilization. Several studies have examined how predicted nonlinear motion responses of ships can be used as feedback signals to enhance the adaptability and stability of autonomous control systems. Zhang et al. used a nonlinear dynamic model to predict and adjust the ship trajectory, thereby achieving autonomous obstacle avoidance and trajectory tracking [3]. Ma et al. proposed a nonlinear feedback controller based on an improved Lyapunov function for ship heading keeping. Incorporating a nonlinear feedback term into the control system design improves performance, reduces energy consumption, and strengthens robustness across operational conditions [4]. These studies underline the importance of extreme short-term prediction of USV nonlinear motion, which is of great engineering significance for improving the performance and safety of the USV in practical applications.

Currently, the extreme short-term prediction methods for ship motion primarily include the Kalman filter method [5], the time series method [6,7], and the neural network-based method [8]. Among these, the Kalman filter method is a recursive linear optimal state estimation algorithm, which is highly suitable for online real-time estimation. However, it performs poorly in terms of prediction accuracy for nonlinear systems. The time series method is based on historical data of ship motion for prediction, including the Autoregressive (AR) model [6] and the Autoregressive Integrated Moving Average (ARIMA) model [7]. This method does not rely on flow field information or motion equations during the ship navigation and is easy to implement. Nevertheless, as it is primarily applicable to linear problems, there are some limitations in dealing with strongly nonlinear phenomena such as severe heave and pitch of ships in waves.

As intelligent algorithms and data analysis technologies continue to advance, machine learning methods, including Artificial Neural Networks (ANN) [9,10] and Support Vector Machines (SVM) [11,12], known for their excellent nonlinear fitting capabilities, have been applied by some researchers in the prediction of ship motion. Khan et al. proposed a method of combining a Back-propagation (BP) neural network with a genetic algorithm and compared it with an ARMA prediction model to verify the superiority of neural networks in nonlinear modeling [13]. However, since the BP neural network cannot capture the time correlation in the time series, scholars have turned their attention to the Long Short-term Memory (LSTM) neural network model that can capture time features [14]. This model shows unique advantages in ship motion prediction [15,16]. However, due to the limitations of a single LSTM model in data mining, some scholars have taken optimization measures to enhance its feature extraction abilities in different prediction scenarios. Wang et al. proposed an integrated prediction model for ship roll motion by combining Convolutional Neural Networks (CNN) with bidirectional LSTM architecture, which incorporates multidimensional input variables including wind speed and rudder angle parameters, and demonstrates excellent performance in ship motion prediction [17]. Dong et al. developed an improved LSTM model by combining the LSTM model and attention mechanism. The model maintains reliable prediction performance for ship motion even under conditions lacking environmental observation data [18]. Zhan et al. integrated the CNN model with the LSTM model to explore the relationships between data dimensions through CNN, thereby enhancing the prediction performance of the CNN-LSTM model [19]. 
In addition to the application of new machine learning models, data preprocessing methods such as Wavelet Transform [18] and Empirical Mode Decomposition (EMD) [20] can further enhance the prediction accuracy. Zuo et al. applied Variational Mode Decomposition (VMD) to decompose ship motion data into several intrinsic modal components, which notably enhanced the prediction accuracy of the model [21]. Therefore, to improve the prediction accuracy of the nonlinear motion of the USV, this paper applies a VMD-CNN-LSTM prediction model based on Variational Mode Decomposition (VMD), Convolutional Neural Network (CNN), and Long Short-term Memory (LSTM) neural network for extreme short-term prediction of the nonlinear motion of the USV under waves.

In the study of extreme short-term prediction of nonlinear motion for the USV, the quality of datasets critically influences the training effectiveness and predictive accuracy of the prediction model, making their sources particularly significant. Currently, datasets for the USV nonlinear motion primarily rely on two methodologies: seakeeping tests and numerical simulations. Although seakeeping tests (e.g., towing tank experiments and full-scale sea trials) can directly acquire physically authentic motion data, their high costs, prolonged cycles, and environmental uncontrollability under extreme conditions limit data reliability and applicability [22,23]. In contrast, numerical simulations have emerged as a vital complementary approach due to their flexibility and cost-effectiveness. Traditional potential flow theory, based on inviscid, irrotational, and linear free-surface assumptions, struggles to accurately capture nonlinear factors such as fluid viscosity effects, vortex shedding, and turbulent dissipation, resulting in significantly reduced prediction accuracy under extreme waves or large-amplitude motions [24,25]. Conversely, the Computational Fluid Dynamics (CFD) method that accounts for viscous effects by solving the Reynolds-averaged Navier–Stokes (RANS) equations can more accurately simulate the coupling effects between fluid viscosity and structural motion. This approach has demonstrated advantages in enhancing the precision of capturing nonlinear factors [26,27]. Existing studies have validated that viscous effects substantially influence motion prediction accuracy in high-Reynolds-number flows and flow separation scenarios [26]. Furthermore, compared to potential flow theory, viscous flow models based on RANS equations reduce wave load calculation errors by approximately 15–20% [27]. 
Consequently, adopting CFD methods that account for viscous effects to acquire nonlinear motion datasets of the USV not only circumvents the limitations inherent in physical experiments, but also captures nonlinear dynamic characteristics inaccessible to conventional potential flow theory through refined flow field analysis, and provides high-precision data support for the extreme short-term prediction model.

This paper applies the CFD method to obtain the nonlinear motion dataset of the USV. To enhance prediction accuracy, a VMD-CNN-LSTM combined prediction model is applied for the extreme short-term prediction of the nonlinear motion of the USV. In this model, the VMD algorithm decomposes the nonlinear motion time series data of the USV, obtained through numerical simulation, into stationary signal components, reducing both non-stationarity and nonlinearity. The CNN module then extracts features from these components, improving the calculation efficiency of the model. Finally, the LSTM module is used to predict the time series based on the extracted features, capturing the inherent patterns and enhancing the prediction accuracy and robustness of the prediction model. The results demonstrate that the VMD-CNN-LSTM model outperforms both the LSTM and CNN-LSTM models in the extreme short-term prediction of the nonlinear motion of the USV under various significant wave heights and has high practical value and application prospects.

2. Methodology

2.1. Variational Mode Decomposition (VMD)

Variational Mode Decomposition (VMD), proposed by Dragomiretskiy et al. [28], decomposes a non-stationary and nonlinear input signal into several Intrinsic Mode Functions (IMFs) and a Residual (Res). In the VMD decomposition, each IMF component is defined as a band-limited AM-FM function. Notably, VMD allows the number of IMF components to be set in advance, and the computational cost of the model can be reduced by choosing the convergence conditions appropriately.

VMD is formulated as a variational optimization problem that involves both a construction and a solution process. For each IMF component, the analytic signal is first obtained using the Hilbert transform, which yields a unilateral spectrum. The spectrum of each mode is then shifted to the baseband by mixing with an exponential tuned to the estimated center frequency. The bandwidth of each mode is estimated through Gaussian smoothing of the demodulated signal, and this process is repeated until the bandwidth of each IMF is obtained. By solving for the center frequencies, the method decomposes the non-stationary and nonlinear original time series into a finite number of band-limited modes $u_k(t)$. The goal is to minimize the sum of the bandwidth estimates of all modes, subject to the constraint that the sum of all components equals the original signal, as expressed by the following equation:

$$
\begin{aligned}
&\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) \otimes u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \\
&\text{s.t.} \quad \sum_{k=1}^{K} u_k = f
\end{aligned}
\tag{1}
$$

where $\{u_k\} = \{u_1, u_2, \dots, u_K\}$ denotes the set of all modal functions, $\{\omega_k\} = \{\omega_1, \omega_2, \dots, \omega_K\}$ denotes the corresponding center frequencies, $\otimes$ is the convolution operation, $K$ is the number of modal functions, $f$ is the original signal, and $\delta(t)$ is the impulse function.
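The text does not state how Equation (1) is solved. In the original VMD formulation, the constrained problem is commonly handled with an augmented Lagrangian and alternating updates in the frequency domain; the block below is a sketch of that standard scheme, not necessarily the exact procedure used in this study. The penalty factor $\alpha$, the Lagrange multiplier $\lambda$, the step size $\tau$, the iteration index $n$, and the hats (Fourier transforms) are symbols introduced here for illustration and do not appear in the text above.

```latex
\begin{aligned}
\mathcal{L}(\{u_k\},\{\omega_k\},\lambda)
  &= \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) \otimes u_k(t) \right] e^{-j\omega_k t} \right\|_2^2
   + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2
   + \left\langle \lambda(t),\; f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle \\
\hat{u}_k^{\,n+1}(\omega)
  &= \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha\,(\omega - \omega_k)^2} \\
\omega_k^{\,n+1}
  &= \frac{\int_0^{\infty} \omega \,\lvert \hat{u}_k^{\,n+1}(\omega) \rvert^2 \, d\omega}{\int_0^{\infty} \lvert \hat{u}_k^{\,n+1}(\omega) \rvert^2 \, d\omega} \\
\hat{\lambda}^{\,n+1}(\omega)
  &= \hat{\lambda}^{\,n}(\omega) + \tau \left( \hat{f}(\omega) - \sum_{k=1}^{K} \hat{u}_k^{\,n+1}(\omega) \right)
\end{aligned}
```

In this scheme each mode update acts as a Wiener-type filter centered on the current $\omega_k$, and each center frequency is re-estimated as the power-weighted mean frequency of its mode, so a larger $\alpha$ yields narrower modes.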

This paper applies the VMD method to decompose the pitch and heave motion time series into intrinsic mode components with distinct frequency characteristics, so that the frequency content of the motion signals can be analyzed component by component. Figure 1 shows the decomposition of the non-stationary and nonlinear pitch and heave time series of the USV into four modal components (IMF1-IMF4) and one Residual component (Res). IMF1 has the highest frequency and represents the high-frequency constituents of the pitch and heave signals. Within the 30 s record, the maximum amplitude of IMF1 differs from one 5 s interval to the next, which reflects the non-stationary and nonlinear character of the motion data. The frequency decreases gradually from IMF2 to IMF4; their maximum amplitudes also vary between 5 s intervals, but their waveforms tend to be stable. The Res represents the low-frequency constituents of the signal and reflects the nonlinear trend. This decomposition makes the dynamic characteristics of the signal easier to interpret and allows the changes in the different frequency components to be captured more accurately during prediction. In this paper, the CNN module first extracts features from the individual components. These features are then input into the LSTM module, which captures the inherent patterns of the time series. The predictions for the components are subsequently aggregated, leading to a significant improvement in the overall prediction accuracy.

2.2. Convolutional Neural Network (CNN)

A Convolutional Neural Network (CNN) is a type of deep feedforward neural network that primarily performs convolution operations. CNN is commonly applied to time series data processing and consists of convolutional layers, pooling layers, and fully connected layers. The convolutional layer is the key component, where convolutional kernels are used to extract internal features. The mathematical expression is as follows:

$$
y_j = \sigma\left( \sum x_i \otimes w_i + b_i \right)
\tag{2}
$$

where $x_i$ is the input value, $\otimes$ is the convolution operation, $w_i$ is the weight value, $b_i$ is the offset value, and $\sigma$ is the activation function.
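As a rough illustration of Equation (2), the following sketch computes one output feature map from a multi-channel time series with plain NumPy. The function name `conv1d_feature`, the choice of ReLU as the activation $\sigma$, the use of a single scalar offset, and the idea of feeding one IMF per input channel are assumptions made for this example; the network configuration actually used in the study is not specified here. As in most deep learning frameworks, the kernel is slid along the time axis without being flipped.

```python
import numpy as np

def conv1d_feature(x, w, b=0.0):
    """One output feature map in the spirit of Eq. (2).

    x: array of shape (channels, time), e.g. one row per IMF (assumption).
    w: array of shape (channels, kernel_size), one kernel per input channel.
    b: scalar offset; ReLU stands in for the activation sigma (assumption).
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    w = np.atleast_2d(np.asarray(w, dtype=float))
    k = w.shape[1]
    n_out = x.shape[1] - k + 1  # number of "valid" sliding positions
    # Slide the kernel along time; sum over channels and kernel taps.
    z = np.array([np.sum(x[:, t:t + k] * w) for t in range(n_out)]) + b
    return np.maximum(z, 0.0)   # ReLU activation

# Toy input: two channels, five time steps.
x = np.array([[1.0, 2.0, 3.0, 4.0, 5.0],
              [0.0, 1.0, 0.0, 1.0, 0.0]])
w = np.array([[-1.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
features = conv1d_feature(x, w, b=0.5)
```

Each output value depends only on a short window of the input, which is the local, sliding-window perception that the convolutional layer relies on.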

A CNN can process time series data effectively because its sliding convolution kernels operate on local segments along the time dimension rather than on individual time points. This sliding-window form of local perception helps capture the underlying dynamic changes in the time series and thus extracts more representative information. It not only enhances the accuracy of the model but also improves its generalization ability. Figure 2 shows a schematic diagram of the working principle of the CNN.

As shown in Figure 3, the convolution kernel (also known as the filter) scans the two-dimensional data and performs convolution operations by gradually shifting, thereby extracting data features. The green part is the calculation area of the convolution kernel, the orange part is the convolution kernel, and the pink part is the calculation result of the convolution kernel. To extract the feature information fully, multiple convolution kernels generally need to be used.

The main function of the pooling layer is to compress the data after convolution operation and reduce redundant information, so as to improve the generalization ability and calculation speed of the network. The fully connected layer integrates the extracted features by connecting each node with all the nodes in the previous layer to provide support for the subsequent prediction tasks of the LSTM layer.

In this study, CNN is used to extract effective features from IMFs obtained by VMD. Through the operation of the convolutional layer, CNN can learn useful spatial features from different frequency components, thereby improving the accuracy of the model for nonlinear motion prediction of the USV. This method effectively reduces the influence of noise in the original signal and makes the subsequent time series prediction of LSTM more accurate.

2.3. Long Short-Term Memory (LSTM)

To predict the nonlinear motion of the USV accurately, the influence of long-term historical data must be considered in addition to recent historical data. Although an ordinary Recurrent Neural Network (RNN) can retain information, it suffers from the vanishing gradient problem in practical applications. The Long Short-term Memory (LSTM) neural network was developed to address this problem. Compared with the RNN, the LSTM performs better in allocating historical information and capturing long-term dependencies in time series. Its internal structure is shown in Figure 4.

The characteristic of the LSTM model is that it adds a memory unit to remember past information, which is controlled by the forget gate, input gate, and output gate to determine whether the information is output. The three gate structures of the LSTM model can not only choose whether to accept the information of the previous moment and its degree, but also limit the backward propagation of the information at the current moment. This allows the LSTM model to exhibit stronger memory abilities. Assuming the input sequence is $(x_1, x_2, \dots, x_t)$ and the output sequence is $(h_1, h_2, \dots, h_t)$, then at time $t$ the following relations hold:

$$
f_t = \sigma\left( W_f \cdot [h_{t-1}, x_t] + b_f \right)
\tag{3}
$$

$$
i_t = \sigma\left( W_i \cdot [h_{t-1}, x_t] + b_i \right)
\tag{4}
$$

$$
\tilde{C}_t = \tanh\left( W_c \cdot [h_{t-1}, x_t] + b_c \right)
\tag{5}
$$

$$
C_t = f_t \otimes C_{t-1} + i_t \otimes \tilde{C}_t
\tag{6}
$$

$$
o_t = \sigma\left( W_o \cdot [h_{t-1}, x_t] + b_o \right)
\tag{7}
$$

$$
h_t = o_t \otimes \tanh(C_t)
\tag{8}
$$

where $f_t$ is the forget gate, $i_t$ is the input gate, $o_t$ is the output gate, $\tilde{C}_t$ is the candidate memory state, $C_t$ is the memory unit, $h_{t-1}$ is the output of the hidden layer unit at the previous moment, $h_t$ is the current output, $W_f, W_i, W_c, W_o$ are the weights of the different connection layers, $b_f, b_i, b_c, b_o$ are the offset values of the different connection layers, and $\sigma$ and $\tanh$ are the activation functions. In Equations (6) and (8), $\otimes$ denotes element-wise multiplication.
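Equations (3) to (8) can be read as a single update step of the LSTM cell. The NumPy sketch below is a minimal illustration of that step, not the trained network of this study: the function names, the dictionary layout of the weights, and the zero-valued toy parameters are all assumptions chosen so that the result can be checked by hand.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step following Eqs. (3)-(8).

    W and b are dicts keyed by "f", "i", "c", "o" (hypothetical layout);
    each W[g] has shape (hidden, hidden + n_inputs).
    """
    z = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])       # Eq. (3): forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])       # Eq. (4): input gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])   # Eq. (5): candidate state
    c_t = f_t * c_prev + i_t * c_tilde       # Eq. (6): element-wise products
    o_t = sigmoid(W["o"] @ z + b["o"])       # Eq. (7): output gate
    h_t = o_t * np.tanh(c_t)                 # Eq. (8)
    return h_t, c_t

# Toy check: with all-zero weights every gate equals 0.5 and the candidate is 0.
hidden, n_in = 2, 3
W = {g: np.zeros((hidden, hidden + n_in)) for g in "fico"}
b = {g: np.zeros(hidden) for g in "fico"}
h_t, c_t = lstm_step(np.ones(n_in), np.zeros(hidden), np.ones(hidden), W, b)
```

With these toy parameters the memory unit is simply halved at each step, which makes the role of the forget gate in Equation (6) easy to see.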

In this study, the time dependence of the nonlinear motion of the USV was captured by the LSTM network. Through the recursive structure of LSTM, we can model the long-term dependence in time series data and further improve the accuracy of prediction. LSTM can effectively process the features extracted by the CNN module and provide an accurate prediction of the nonlinear motion of the USV.

2.4. Evaluation Criterion

To evaluate the prediction performance of the proposed model, four commonly used evaluation indicators are adopted: Mean Square Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and the determination coefficient (R²). The closer the MSE, RMSE, and MAE values are to 0, the smaller the difference between the predicted value and the real data, and the better the prediction effect of the model. Conversely, the closer the R² value is to 1, the better the fitting effect of the model, and the closer the predicted value is to the true value. These evaluation indicators are calculated as follows:

$$
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2
\tag{9}
$$

$$
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2}
\tag{10}
$$

$$
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| \hat{y}_i - y_i \right|
\tag{11}
$$

$$
R^2 = 1 - \frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{\sum_{i=1}^{n} (\bar{y} - y_i)^2}
\tag{12}
$$

where $n$ is the number of data samples in the time series, $y_i$ is the true value of the motion, $\hat{y}_i$ is the predicted value of the motion, and $\bar{y}$ is the average of the true values.
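The four indicators in Equations (9) to (12) can be computed directly from a pair of true and predicted sequences. The sketch below is one straightforward way to do so; the function name `evaluate` and the toy values are illustrative and are not data from this study.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 as defined in Eqs. (9)-(12)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mse = np.mean(err ** 2)                                   # Eq. (9)
    rmse = np.sqrt(mse)                                       # Eq. (10)
    mae = np.mean(np.abs(err))                                # Eq. (11)
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((np.mean(y_true) - y_true) ** 2)
    r2 = 1.0 - ss_res / ss_tot                                # Eq. (12)
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2}

# Toy values for illustration only.
scores = evaluate([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
```

Note that R² as defined in Equation (12) can become negative when a model predicts worse than the mean of the true values, so it is not bounded below by 0.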

3. Data Source

3.1. Validation of the Numerical Simulation Method

To predict the nonlinear motion of the USV under waves, this paper takes the Generic Prismatic Planing Hull (GPPH) designed by the Naval Surface Warfare Center Carderock Division as the research object [29,30] and analyzes a USV model with a scale ratio of λ = 5.4. Figure 5 illustrates the numerical calculation model of the GPPH, while the main parameters used in the model are provided in Table 1.

Figure 6 shows the computational domain for the numerical modeling. The origin is located at the midpoint of the GPPH’s bottom, where the mid-transverse section intersects with the mid-longitudinal section. The x-axis aligns with the GPPH’s length, the y-axis with its width, and the z-axis is parallel to the wave height. This study simulates GPPH motion under waves, applying a mid-longitudinal symmetry plane to reduce the computational domain, decrease mesh count, and improve efficiency. According to the computational domain division rules recommended by the International Towing Tank Conference (ITTC) [31], the size of the computational domain is −5.0L < x < 2.7L, 0.0L < y < 2.0L, −2.0L < z < 1.2L (where L is the length of the GPPH).

The boundary conditions of the computational domain play a crucial role in the results of the CFD numerical simulation. These conditions typically consist of the background domain and the overset domain. The division of the types of boundary conditions used in the computational domain in the CFD simulation of this paper is listed in Table 2.

Figure 7 shows the flow field during the simulation and the mesh around the GPPH in head waves. The base mesh size of the entire computational domain is 0.1 m, and the total number of cells is 2.46 million. The hull is surrounded by a prism layer mesh with appropriate surface refinement to ensure accurate capture of the motion response. There are 5 prism layers in total, with a stretching factor of 1.2 between successive layers and a total thickness of 6.25% of the base size. The duration of the numerical simulation is 10 s.

In this paper, the PISO (Pressure-implicit with Splitting of Operators) algorithm is utilized to capture the motion of the USV, and the k-ε turbulence model is employed for the calculation model. The free surface is modeled using the Volume of Fluid (VOF) method. This study considers only two degrees of freedom: heave and pitch. Time is discretized with a second-order scheme, with a maximum of 10 inner iterations per time step. After the computational domain and mesh are set up, the accuracy of the CFD method is verified. The calculated pitch, heave, and acceleration at the center of gravity of the GPPH are compared with experimental test results for validation [29]. The test conditions are a speed of 8.950 m/s, a wave height of 0.133 m, and a wave period of 2.110 s, which correspond to a full-scale speed of 40 knots, a wave height of 0.718 m, and a wave period of 4.903 s. The comparison is shown in Figure 8, where the red solid line represents the experimental results, the blue dashed line shows the CFDShip-Iowa calculation results from the reference, and the black dotted line represents the STAR-CCM+ calculation results from this study, accounting for viscous effects. From Figure 8, the calculated pitch amplitude differs from the experimental value by 4.11%, the heave amplitude by 0.02%, and the acceleration at the center of gravity by 0.26%. Moreover, the heave, pitch, and acceleration results in this study closely match the curve shapes and peak values of the other two methods. This indicates that the model used in this paper effectively simulates the motion of the USV in head waves.
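The model-scale and full-scale test conditions quoted above appear consistent with Froude scaling, in which lengths scale with λ and speeds and times scale with √λ. The text does not name the scaling law, so this is an assumption; the short check below simply reproduces the quoted numbers under it. The helper name `to_full_scale` and the knot conversion constant are introduced here for illustration.

```python
import math

SCALE = 5.4        # scale ratio lambda quoted in the text
KNOT = 0.514444    # metres per second in one knot

def to_full_scale(speed_ms, wave_height_m, wave_period_s, scale=SCALE):
    """Convert model-scale conditions to full scale, assuming Froude scaling."""
    root = math.sqrt(scale)
    return {
        "speed_kn": speed_ms * root / KNOT,      # speed scales with sqrt(lambda)
        "wave_height_m": wave_height_m * scale,  # length scales with lambda
        "wave_period_s": wave_period_s * root,   # time scales with sqrt(lambda)
    }

full = to_full_scale(8.950, 0.133, 2.110)
```

Under this assumption the wave height and period match the quoted 0.718 m and 4.903 s, and the speed comes out at roughly 40.4 knots, which agrees with the nominal 40 knots to within rounding.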

3.2. Dataset Generation

After the numerical simulation method was verified, two sets of environmental parameters were selected as simulation conditions. As shown in Table 3, significant wave height Hs and spectral peak period Tp at the full scale of the USV are listed. The JONSWAP spectrum is used to describe random waves, and the simulation duration is set to 30 s. The prediction method proposed in this study is applicable to any degree of freedom of the GPPH. Since the prediction process is consistent across different degrees of freedom, only the pitch and heave motions are analyzed, following the approach of Deng et al. [32], to avoid redundant results. Numerical simulations, considering viscous effects, provide the nonlinear motion time series of the GPPH under the given environmental parameters, as shown in Figure 9 and Figure 10. The obtained nonlinear motion data are then divided into a training set and a test set in a 7:3 ratio. The training set is used to train the model, while the test set, independent of the training process, is used to evaluate the performance of the model.
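A 7:3 division of a time series is usually done chronologically, so that the test set lies entirely after the training data. The text does not say whether the split was chronological, so the sketch below is an assumption, as are the function name `chronological_split` and the synthetic sine signal that stands in for the simulated motion; the 750-sample length is a hypothetical record chosen for the example.

```python
import numpy as np

def chronological_split(series, train_fraction=0.7):
    """Split a time series into leading training and trailing test parts."""
    series = np.asarray(series)
    n_train = int(round(train_fraction * len(series)))
    return series[:n_train], series[n_train:]

# Synthetic stand-in for a motion record (not data from the study).
t = np.arange(750) / 25.0
motion = np.sin(2.0 * np.pi * 0.5 * t)
train, test = chronological_split(motion)
```

Keeping the order intact avoids leaking future samples into training, which matters when the goal is to predict ahead in time.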

The current validation tests are only carried out at a single speed (40 knots) and under specific wave parameters, mainly to simplify the preliminary validation process and ensure the basic accuracy of the model. In future studies, we plan to expand the test range and consider different speeds and wave parameters to comprehensively evaluate the performance of the model under various operating conditions.

The acquired time series data are normalized as follows:

$$
x' = \frac{x - \min(x)}{\max(x) - \min(x)}
\tag{13}
$$

where $x$ is the original data, $x'$ is the sample data to be trained, and $\max(x)$ and $\min(x)$ are the maximum and minimum values of the dataset, respectively.
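Equation (13) maps the data to the interval [0, 1], and predictions must later be mapped back to physical units with the inverse transform. The sketch below shows both directions; the function names are hypothetical, and storing the minimum and maximum separately (so the same values can be reused on other data) is a design choice made for this example rather than something stated in the text.

```python
import numpy as np

def minmax_fit(x):
    """Return the (min, max) pair used by Eq. (13)."""
    x = np.asarray(x, dtype=float)
    return float(x.min()), float(x.max())

def minmax_apply(x, lo, hi):
    """Eq. (13): scale x using a previously fitted min and max."""
    return (np.asarray(x, dtype=float) - lo) / (hi - lo)

def minmax_invert(x_norm, lo, hi):
    """Map normalized values back to the original units."""
    return np.asarray(x_norm, dtype=float) * (hi - lo) + lo

raw = np.array([2.0, 4.0, 6.0])      # toy values for illustration
lo, hi = minmax_fit(raw)
scaled = minmax_apply(raw, lo, hi)
restored = minmax_invert(scaled, lo, hi)
```

If the minimum and maximum are fitted on one subset and applied to another, values outside [0, 1] can occur, which is expected behavior rather than an error.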

4. Prediction Results and Analysis of Nonlinear Motion Response of the USV

This section applies the combined prediction model VMD-CNN-LSTM to address the issue of insufficient prediction accuracy for the nonlinear motion of the GPPH and evaluates its applicability for this task. To further assess the prediction performance of the VMD-CNN-LSTM model, comparisons are made with the CNN-LSTM and LSTM models across different Prediction Advance Times (PATs). Various evaluation methods are used, including time–history diagrams, R² histograms, error assessment indicators, and scatter plots. The time–history diagram visualizes the prediction performance of each model over the time series. The R² histogram is used to assess the prediction accuracy, while the scatter plot shows the correlation between the simulated and predicted motion sequences.

4.1. The Proposed VMD-CNN-LSTM Model

In the complex marine environment, the nonlinear motion of the USV is affected by many factors, such as waves. It is difficult to accurately predict its nonlinear motion response by a single prediction model. In this paper, a VMD-CNN-LSTM prediction model based on Variational Mode Decomposition (VMD), Convolutional Neural Network (CNN), and Long Short-term Memory (LSTM) neural network is applied to the extreme short-term prediction of nonlinear motion of the USV. As shown in Figure 11, in the structural comparison between LSTM and VMD-CNN-LSTM models, the red box section shows the VMD-based prediction technology. This technique effectively reduces the non-stationarity and nonlinearity of the data and simplifies the subsequent modeling process by decomposing the complex time series data into multiple Intrinsic Mode Functions (IMFs) and a Residual (Res). Then, the CNN module is used to extract the local features of each intrinsic modal component, capture the correlation, and input the extracted features into the LSTM module for time series prediction, thereby improving the overall prediction performance. Through this method, the nonlinear motion of the USV in a complex environment can be predicted more accurately, and the prediction accuracy can be significantly improved.
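The decompose, predict, and aggregate structure described above can be outlined in a few lines. In the sketch below both building blocks are deliberately simple stand-ins: `toy_decompose` is a moving-average split and is not VMD, and `persistence` repeats the last value and is not a CNN-LSTM. All names are hypothetical; the only point is to show how per-component predictions are summed back into a prediction of the original signal.

```python
import numpy as np

def toy_decompose(signal, width=5):
    """Stand-in for VMD (NOT VMD): moving-average trend plus remainder."""
    kernel = np.ones(width) / width
    trend = np.convolve(signal, kernel, mode="same")
    return [signal - trend, trend]   # the components sum back to the signal

def persistence(component, horizon):
    """Stand-in for a CNN-LSTM predictor: repeat the last observed value."""
    return np.full(horizon, component[-1])

def predict_by_components(signal, decompose, predictor, horizon):
    """Predict each component separately, then aggregate by summation."""
    components = decompose(np.asarray(signal, dtype=float))
    return sum(predictor(c, horizon) for c in components)

signal = np.sin(np.linspace(0.0, 6.0, 60))   # synthetic signal
forecast = predict_by_components(signal, toy_decompose, persistence, horizon=3)
```

Because the components add up to the original signal, summing the component predictions gives a prediction on the original scale; in the actual model, the benefit comes from each component being smoother and easier to learn than the raw signal.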

It should be noted that the LSTM parameters in the VMD-CNN-LSTM model are consistent with the traditional LSTM model, aiming to highlight the improvements made by the model. Table 4 lists the training configuration parameters of the LSTM model. The sampling frequency of data is set to 25 Hz, and each input time step contains 200 time series points. The maximum number of training is 150, and the initial learning rate is 0.01. In addition, these two hyperparameters can be adjusted according to the actual test results. According to the research of Shi et al. [33], the Adam optimization algorithm has been widely used in the field of motion prediction, so the LSTM model uses the Adam algorithm for training. The algorithm can effectively optimize the model training process and improve the prediction accuracy. Through these configurations, the LSTM model can complete training in a relatively short time and achieve more accurate motion prediction, which further verifies the effectiveness and adaptability of the model.
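One common way to prepare such training samples is to slide a fixed-length window over the series and pair each window with the value a fixed number of steps ahead. The sketch below assumes that the 200 points mentioned above form the input window; the `horizon` of 10 samples is purely hypothetical, since the text gives the PAT in seconds but not the corresponding number of samples. The function name `make_windows` is also an assumption.

```python
import numpy as np

def make_windows(series, window=200, horizon=10):
    """Build (input window, future target) pairs from a 1-D series.

    The target for a window ending at index j is series[j + horizon].
    """
    series = np.asarray(series, dtype=float)
    n = len(series) - window - horizon + 1
    if n <= 0:
        raise ValueError("series too short for the requested window and horizon")
    X = np.stack([series[i:i + window] for i in range(n)])
    start = window + horizon - 1
    y = series[start:start + n]
    return X, y

# Index-valued toy series so the alignment is easy to verify.
series = np.arange(300, dtype=float)
X, y = make_windows(series, window=200, horizon=10)
```

A longer horizon leaves fewer usable samples and weakens the link between each window and its target, which is one reason accuracy tends to drop as the prediction advance time grows.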

In this study, all data processing was carried out on the Windows 11 operating system. The deep learning tests were performed on a laptop equipped with an 11th-generation Intel(R) Core(TM) i5-11400H CPU, 16 GB of RAM, and an NVIDIA GeForce RTX 3050 Laptop GPU. Compared with the LSTM and CNN-LSTM models, the VMD-CNN-LSTM model adds the VMD computation in the signal processing stage, which may increase the computational burden when processing high-dimensional data. However, owing to its stronger feature extraction and time series modeling capabilities, the model has clear advantages in accuracy and robustness.

4.2. Performance Analysis

Figure 12 and Figure 13 present the prediction results of the models at different wave heights, for Prediction Advance Times (PATs) of 3.7 s and 5.6 s, respectively. The evaluation indicators are listed in Table 5 and Table 6. The figures and tables show that the VMD-CNN-LSTM model provides the predictions closest to the true values and captures peaks and troughs more accurately than both the CNN-LSTM and LSTM models. However, as the PAT increases, all models gradually deviate from the true values. This deviation occurs because the time correlation between input and output weakens as the PAT is extended, which increases uncertainty and reduces the learning ability of all the models. Despite this, the VMD-CNN-LSTM model continues to provide the predictions closest to the true values, demonstrating its ability to predict the nonlinear motion of the GPPH at longer PATs.

The coefficient of determination R² of the three models is shown in Figure 14 and Figure 15. The VMD-CNN-LSTM model has the highest R² in every case, indicating the best performance. Its highest R² values for pitch and heave, 0.945 and 0.959, respectively, are obtained under Case II with a PAT of 3.7 s. The improvements quoted below are relative increases in R². Under Case I with a 3.7 s PAT, the R² values of the VMD-CNN-LSTM model for pitch and heave are 13.3% and 9.8% higher, respectively, than those of the CNN-LSTM model, and 41.8% and 29.3% higher than those of the LSTM model. Under Case II with a 3.7 s PAT, the R² values for pitch and heave are 54.9% and 6.9% higher than those of the CNN-LSTM model, and 130.5% and 24.5% higher than those of the LSTM model. When the PAT extends to 5.6 s, the gains over the CNN-LSTM model grow to 32.9% for pitch and 16.5% for heave under Case I, and to 94.6% for pitch and 9.0% for heave under Case II. Overall, these results show that adding VMD allows the VMD-CNN-LSTM model to outperform both the CNN-LSTM and LSTM models in the extreme short-term prediction of the nonlinear motion of the USV.
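The pitch percentages quoted above can be checked directly against the R² columns of Table 5 and Table 6. The short sketch below does so, assuming each figure is the relative increase (new − baseline) / baseline; the dictionary layout and the helper name `relative_gain_percent` are choices made for this example only.

```python
# Pitch R² values copied from Table 5 (Case I) and Table 6 (Case II).
PITCH_R2 = {
    ("Case I", 3.7): {"LSTM": 0.648, "CNN-LSTM": 0.811, "VMD-CNN-LSTM": 0.919},
    ("Case I", 5.6): {"LSTM": 0.452, "CNN-LSTM": 0.656, "VMD-CNN-LSTM": 0.872},
    ("Case II", 3.7): {"LSTM": 0.410, "CNN-LSTM": 0.610, "VMD-CNN-LSTM": 0.945},
    ("Case II", 5.6): {"LSTM": 0.219, "CNN-LSTM": 0.459, "VMD-CNN-LSTM": 0.893},
}


def relative_gain_percent(new, baseline):
    """Relative increase of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


for (case, pat), r2 in PITCH_R2.items():
    gain = relative_gain_percent(r2["VMD-CNN-LSTM"], r2["CNN-LSTM"])
    print(f"{case}, PAT {pat} s: pitch R² gain over CNN-LSTM = {gain:.1f}%")
```

Read this way, the gains are ratios of R² values rather than differences in percentage points, which is why a low baseline (for example 0.459 for CNN-LSTM pitch in Case II at 5.6 s) produces a large percentage.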

Figure 16 and Figure 17 show scatter plots of the prediction results for the three models. These plots give an overall view of the dispersion of the predictions, which cannot be judged from time–history curves alone. Compared with the CNN-LSTM and LSTM models, the VMD-CNN-LSTM model shows markedly less dispersion, with predictions concentrated more closely around the true values. It also remains robust across the different feature distributions of the two cases, which supports its suitability for extreme short-term prediction of the nonlinear motion of the USV at different wave heights.

5. Conclusions

Under various significant wave heights, Unmanned Surface Vessels (USVs) are susceptible to nonlinear motion induced by nonlinear wave loads, which compromises their navigational safety. In this paper, a numerical simulation method that accounts for viscous effects is used to obtain the pitch and heave time series of the USV at significant wave heights of 1.015 m (Case I) and 1.998 m (Case II) as the original data. To improve the accuracy of extreme short-term prediction of the nonlinear motion of the USV, a VMD-CNN-LSTM combined prediction model is applied. Its applicability is evaluated by comparing its time–history diagrams, R² histograms, error assessment criteria, and scatter plots of prediction results with those of the CNN-LSTM and LSTM prediction models. The conclusions are as follows:

(1) The numerical simulation method considering viscous effects is validated against the GPPH model and captures the nonlinear motion of the USV well.

(2) The results indicate that, compared to the CNN-LSTM and LSTM models, the VMD-CNN-LSTM combined model more effectively captures the peaks and troughs of nonlinear motion. Its predictions are the closest to the true values, demonstrating its better ability to predict the nonlinear motion of the USV over extended PAT periods.

(3) The study demonstrates that the VMD-CNN-LSTM combined model effectively predicts the nonlinear motion of the USV across various significant wave heights. At a Prediction Advance Time (PAT) of 3.7 s, its R² for pitch exceeds that of the CNN-LSTM model by 13.3% for a wave height of 1.015 m (Case I) and by 54.9% for a wave height of 1.998 m (Case II). When the PAT increases to 5.6 s, the improvements reach 32.9% for Case I and 94.6% for Case II, highlighting the better performance of the model in extended prediction scenarios.

This study applied the VMD-CNN-LSTM combined prediction model to the extreme short-term prediction of the nonlinear motion of the USV and verified its effectiveness. Under both significant wave heights, its prediction accuracy is markedly better than that of the traditional LSTM and CNN-LSTM models. The study nevertheless has limitations that need further exploration. First, the research is based on numerical simulation data. Although such data characterize nonlinear motion effectively, highly irregular wave conditions at sea (such as multi-directional waves and transient wind–wave coupling effects) may challenge the generalization ability of the model. Future work should use data measured at sea to verify the robustness of the model in complex hydrodynamic scenarios (such as unsteady wave spectra and combined wind–current action) and to quantify how its prediction error correlates with parameters such as wave height and wave direction. Second, this study considers only a specific type of USV configuration and specific environments, and differences in the hydrodynamic response of other USV types may affect the performance of the model. Future research will establish a standardized test dataset covering multiple types of USVs and systematically analyze the influence of USV geometric parameters and load distribution on prediction accuracy, so as to improve the robustness, applicability, and accuracy of the model and provide a basis for the control of USVs in real sea conditions.

Author Contributions

Conceptualization, Y.W., J.L. and S.W.; methodology, Y.W. and J.L.; software, Y.W., J.L. and H.Z.; validation, Y.W. and J.L.; formal analysis, J.L.; investigation, L.Y.; resources, Y.W. and W.W.; data curation, H.Z. and L.Y.; writing—original draft, Y.W., J.L. and S.W.; writing—review and editing, Y.W., J.L. and S.W.; visualization, H.Z.; project administration, Y.W., S.W. and W.W.; funding acquisition, Y.W. All authors have read and agreed to the published version of the manuscript.

Funding

This project was supported by Wuhan Natural Science Foundation Exploration Program (Shuguang Program): 2023010201020318.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

Author Yiwen Wang was employed by Green & Smart River-Sea-Going Ship Cruise and Yacht Research Center, Wuhan University of Technology. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as potential conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

1. Yang, P.; Xue, J.; Hu, H. A bibliometric analysis and overall review of the new technology and development of Unmanned Surface Vessels. J. Mar. Sci. Eng. 2024, 12, 146.
2. Yang, T.; Sun, N.; Chen, H.; Fang, Y. Neural network-based adaptive antiswing control of an underactuated ship-mounted crane with roll motions and input dead zones. IEEE Trans. Neural Netw. Learn. Syst. 2019, 31, 901–914.
3. Zhang, M.; Hao, S.; Wu, D.; Wu, D.; Yuan, Z. Time-optimal obstacle avoidance of autonomous ship based on nonlinear model predictive control. Ocean Eng. 2022, 266, 112591.
4. Ma, C.; Zhang, X.; Yang, G.P. Improved nonlinear control for ship course-keeping based on Lyapunov stability. Chin. J. Ship Res. 2019, 14, 150–155+161.
5. Fossen, T.I.; Perez, T. Kalman filtering for positioning and heading control of ships and offshore rigs. Control Syst. IEEE 2009, 29, 32–46.
6. Jiang, H.; Duan, S.; Huang, L.; Han, Y.; Ma, Q. Scale effects in AR model real-time ship motion prediction. Ocean Eng. 2020, 203, 107202.
7. John, J.S.; Arunachalam, V.; Coronado, F.K.; Romero, O.L.; Ramirez, Y.Y. Time-series modeling of fishery landings in the Colombian Pacific Ocean using an ARIMA model. Reg. Stud. Mar. Sci. 2020, 39, 101477.
8. Huang, L.; Duan, W.; Han, Y.; Chen, Y. A review of short-term prediction techniques for ship motions in seaway. J. Ship Mech. 2014, 18, 1534–1542.
9. Zhou, H.; Chen, Y.; Zhang, S. Ship trajectory prediction based on BP neural network. J. Artif. Intell. 2019, 1, 29–36.
10. Duan, S.; Ma, Q.; Huang, L.; Ma, X. A LSTM deep learning model for deterministic ship motions estimation using wave-excitation inputs. In Proceedings of the 29th International Ocean and Polar Engineering Conference, Honolulu, HI, USA, 16–21 June 2019.
11. Duan, W.; Huang, L.; Han, Y. A hybrid AR-EMD-SVR model for the short-term prediction of nonlinear and non-stationary ship motion. J. Zhejiang Univ. Sci. A 2015, 16, 562–576.
12. Wang, W.; Li, M. A Short-time Prediction Method of Ship Motion Attitude Based on EEMD-LSSVM. Int. J. Sci. 2020, 7, 66–74.
13. Khan, A.; Bil, C.; Marion, K.; Crozier, M. Real time prediction of ship motions and attitudes using advantage prediction techniques. In Proceedings of the 24th International Congress of the Aeronautical Sciences, Yokohama, Japan, 29 August–3 September 2004.
14. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
15. Chong, Y.; Xiong, H. A prediction method of ship motion based on LSTM neural network with variable step-variable sampling frequency characteristics. J. Mar. Sci. Eng. 2023, 11, 919.
16. Zhang, D.; Zhou, X.; Wang, Z.; Yan, P.; Xie, S. A data driven method for multi-step prediction of ship roll motion in high sea states. Ocean Eng. 2023, 276, 114230.
17. Wang, Y.; Wang, H.; Zhou, B.; Fu, H. Multi-dimensional prediction method based on Bi-LSTMC for ship roll. Ocean Eng. 2021, 242, 110106.
18. Dong, L.; Ma, X.; Feng, J.; Kong, L.; Liu, Y.; Wang, H. Online prediction method of ship maneuvering motion based on improved long-short term memory neural network. Shipbuild. China 2023, 64, 184–198.
19. Zhan, K.; Zhu, R. A CNN-LSTM ship motion extreme value prediction model. J. Shanghai Jiao Tong Univ. 2023, 57, 963–971.
20. Zhang, B.; Peng, X.; Gao, J. Ship motion attitude prediction based on ELM-EMD-LSTM integrated model. J. Ship Mech. 2020, 24, 1413–1421.
21. Zuo, S.; Zhao, Q.; Zhang, B.; Pang, M.; Zhou, M. Prediction of ship motion attitude based on VMD-SSA-GRU model. Ship Sci. Technol. 2022, 44, 60–65.
22. Gong, X.; Datla, R.J.; Xiang, X. Seakeeping evaluation of a Tri-SWACH based on CFD calculations and model tests. In Proceedings of the SNAME 26th Offshore Symposium, Virtual, 6–7 April 2021.
23. Wang, X.; Sun, S.; Zhao, X. Research on model test of thousands-tons class high seakeeping performance hybrid monohull. J. Ship Mech. 2011, 15, 342–349.
24. Wu, Q.; Zhang, B. Calculation methods of added resistance and ship motion response based on potential flow and viscous flow theory. China Ocean Eng. 2022, 36, 488–499.
25. Yu, L.; Wu, S.; Gu, Z.; Wu, C.; Li, C. Research on ship motion characteristics in a cross sea based on computational fluid dynamics and potential flow theory. Eng. Appl. Comput. Fluid Mech. 2023, 17, 2164618.
26. Lee, E.J.; Schleicher, C.C.; Merrill, C.F.; Fullerton, A.M.; Geiser, J.S.; Weil, C.R.; Morin, J.R.; Jiang, J.; Stern, F.; Mousaviraad, S.M.; et al. Benchmark testing of Generic Prismatic Planing Hull (GPPH) for validation of CFD tools. In Proceedings of the SNAME 30th American Towing Tank Conference, West Bethesda, MD, USA, 4 October 2017.
27. Li, J. Verification and Validation Study of OpenFOAM on the Generic Prismatic Planing Hull Form. Master’s Thesis, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA, 2019.
28. Dragomiretskiy, K.; Zosso, D. Variational Mode Decomposition. IEEE Trans. Signal Process. 2014, 62, 531–544.
29. Diez, M.; Lee, E.J.; Harrison, E.L.; Stern, F. Experimental and computational fluid-structure interaction analysis and optimization of deep-V planing-hull grillage panels subject to slamming loads – Part I: Regular waves. Mar. Struct. 2022, 85, 103256.
30. Lee, E.J.; Diez, M.; Harrison, E.L.; Jiang, J.; Snyder, L.A.; Powres, A.R.; Bay, R.J.; Serani, A.; Nadal, M.L.; Kubina, E.R.; et al. Experimental and computational fluid-structure interaction analysis and optimization of deep-V planing-hull grillage panels subject to slamming loads – Part II: Irregular waves. Ocean Eng. 2024, 292, 116346.
31. ITTC. ITTC Quality System Manual Recommended Procedures and Guidelines: Practical Guidelines for Ship CFD Application. In Proceedings of the 27th International Towing Tank Conference, Copenhagen, Denmark, 31 August–5 September 2014.
32. Deng, S.; Ning, D.; Mayon, R. The motion forecasting study of floating offshore wind turbine using self-attention long short-term memory method. Ocean Eng. 2024, 310, 118709.
33. Shi, W.; Hu, L.; Lin, Z.; Zhang, L.; Wu, J.; Chai, W. Short-term motion prediction of floating offshore wind turbine based on muti-input LSTM neural network. Ocean Eng. 2023, 280, 114558.

Figure 1. VMD diagram of pitch and heave.

Figure 2. The working principle of Convolutional Neural Network.

Figure 3. Calculation process of convolution kernel.

Figure 4. Internal structure of LSTM.

Figure 5. Numerical calculation model of GPPH.

Figure 6. Size of computational domain.

Figure 7. Motion under head waves and mesh division of GPPH.

Figure 8. Comparison of motion of GPPH under head waves [29].

Figure 9. Time series data of pitch and heave of GPPH in Case I.

Figure 10. Time series data of pitch and heave of GPPH in Case II.

Figure 11. Comparison of LSTM and VMD-CNN-LSTM model structures.

Figure 12. Prediction results of different PATs in Case I, pitch (a,b); heave (c,d).

Figure 13. Prediction results of different PATs in Case II, pitch (a,b); heave (c,d).

Figure 14. R² of prediction results of different PATs in Case I.

Figure 15. R² of prediction results of different PATs in Case II.

Figure 16. The discrete scatter plot for prediction results in Case I, pitch (a,b); heave (c,d).

Figure 17. The discrete scatter plot for prediction results in Case II, pitch (a,b); heave (c,d).

Table 1. Main parameters of GPPH.

Parameter	Units	Full Scale	Model Scale (λ = 5.4)
Overall length	m	13.036	2.414
Breadth	m	4.001	0.741
Displacement	kg	15,000.984	101.510
Draft	m	0.788	0.146
Longitudinal center of gravity	m	4.644	0.860
Vertical center of gravity	m	0.745	0.138
Pitch radius of gyration	m	2.452	0.454

Table 2. Boundary conditions of computational domain.

Domain Type	Boundary	Boundary Condition
Background domain	Inlet	Velocity inlet
	Outlet	Velocity inlet
	Back	Velocity inlet
	Symmetry	Symmetry plane
	Bottom	Velocity inlet
	Top	Pressure outlet
Overset domain	Hull surface	Wall
	Vertical boundary	Overset mesh
	Symmetry	Symmetry plane

Table 3. Environment parameters.

Case	Hs (m)	Tp (s)
Case I	1.015	5.577
Case II	1.998	6.507

Table 4. Training configuration parameters of the LSTM model.

Parameters	Value
Max epochs	150
Initial learn rate	0.01
Loss function	Root Mean Squared Error
Optimizer	Adam

Table 5. Evaluation indicators for Case I.

Dataset	PAT (s)	Model	MSE	RMSE	MAE	R²
Pitch (°)	3.7	LSTM	2.968	1.723	1.418	0.648
		CNN-LSTM	1.590	1.261	0.993	0.811
		VMD-CNN-LSTM	0.678	0.824	0.639	0.919
	5.6	LSTM	4.672	2.161	1.720	0.452
		CNN-LSTM	2.934	1.713	1.409	0.656
		VMD-CNN-LSTM	1.097	1.048	0.834	0.872
Heave (m)	3.7	LSTM	0.002	0.045	0.036	0.711
		CNN-LSTM	0.001	0.033	0.026	0.844
		VMD-CNN-LSTM	0.0005	0.022	0.019	0.927
	5.6	LSTM	0.003	0.055	0.044	0.565
		CNN-LSTM	0.002	0.041	0.032	0.764
		VMD-CNN-LSTM	0.0007	0.027	0.021	0.890

Table 6. Evaluation indicators for Case II.

Dataset	PAT (s)	Model	MSE	RMSE	MAE	R²
Pitch (°)	3.7	LSTM	14.952	3.867	3.058	0.410
		CNN-LSTM	9.788	3.128	2.231	0.610
		VMD-CNN-LSTM	1.383	1.176	0.944	0.945
	5.6	LSTM	20.530	4.531	3.727	0.219
		CNN-LSTM	13.712	3.703	2.787	0.459
		VMD-CNN-LSTM	2.712	1.647	1.321	0.893
Heave (m)	3.7	LSTM	0.010	0.101	0.084	0.770
		CNN-LSTM	0.005	0.067	0.051	0.897
		VMD-CNN-LSTM	0.002	0.043	0.034	0.959
	5.6	LSTM	0.019	0.137	0.108	0.583
		CNN-LSTM	0.008	0.089	0.071	0.820
		VMD-CNN-LSTM	0.004	0.068	0.054	0.894

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

