Ship Pitch Prediction Based on Bi-ConvLSTM-CA Model

Fu, Huixuan; Gu, Zhiqiang; Wang, Yuchao

doi:10.3390/jmse10070840


by Huixuan Fu, Zhiqiang Gu and Yuchao Wang *

College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China

* Author to whom correspondence should be addressed.

J. Mar. Sci. Eng. 2022, 10(7), 840; https://doi.org/10.3390/jmse10070840

Submission received: 18 May 2022 / Revised: 15 June 2022 / Accepted: 17 June 2022 / Published: 21 June 2022

(This article belongs to the Section Ocean Engineering)


Abstract:

When a ship is sailing at sea, its pitch angle is affected by factors such as turning angle, relative wind speed, relative wind direction, velocity in surge and velocity in sway. Because ship motion attitude is random and its motion rules are difficult to capture, traditional machine learning models, statistical learning models and single deep learning models cannot accurately capture the correlation information between multiple variables, which results in poor prediction accuracy. To solve this problem, this paper combines a bidirectional convolutional long short-term memory neural network (Bi-ConvLSTM) with channel attention (CA) to build a Bi-ConvLSTM-CA model for ship pitch prediction. The Bi-ConvLSTM-CA prediction model simultaneously extracts the temporal and spatial information of the ship motion data, and uses the channel attention mechanism to process the output of different time steps to obtain the corresponding weight of each channel. The weights are multiplied with the output of the Bi-ConvLSTM, and the resulting attention output is passed through a fully connected layer to produce the predicted value. Compared with other models, the RMSE index of the Bi-ConvLSTM-CA model decreased by at least 28.20%, the MAPE index by at least 29.39% and the MAE index by at least 22.68%. The experimental results on real ship data show that the proposed Bi-ConvLSTM-CA model achieves a significant reduction in mean absolute percentage error (MAPE), mean square error (MSE) and mean absolute error (MAE) compared with other advanced models, which verifies the effectiveness of the Bi-ConvLSTM-CA model in predicting ship pitch angle.

Keywords:

ship motion prediction; ship pitch angle; Bi-ConvLSTM; channel attention mechanism; multivariate prediction

1. Introduction

With the rapid development of the marine industry, the safe navigation and operation of ships in a harsh marine environment are very important. Ships sailing at sea are susceptible to the influence of waves and currents and undergo coupled motion in six degrees of freedom [1], in which the roll and pitch angles have the greatest impact on ship navigation. Ship motion prediction uses the historical data of the ship for modeling and analysis to predict the future trend, so that the ship’s navigation attitude can be adjusted in time to maintain the overall stability of the ship, which effectively reduces the risk of ship navigation operations [2]. It is also significant for applications such as landing a UAV on a moving ship [3,4]. Ship motion is recorded by sensors installed on the hull [5]. The recorded data are time series with the following characteristics: they are usually large and high-dimensional, and they contain noise, redundant information and human factors introduced by the measurement. Since the 1960s, the Kalman filter has been applied to ship motion prediction [6]. With the development of statistical theory, artificial intelligence and neural networks, more new methods have been applied to the prediction of ship motion. The main prediction methods are as follows: statistical learning prediction, machine learning prediction, gray prediction, and recurrent neural network (RNN) combined models.

Traditional statistical prediction methods include Auto Regressive (AR), Auto Regressive Moving Average (ARMA) and Auto Regressive Integrated Moving Average (ARIMA). These methods were applied to ship motion prediction in early years, and are suitable for very short-term prediction due to many constraints and computational complexity [7]. Chen et al., carried out an experimental study on ship motion prediction using an auto-regression model, and proved that the AR model can predict the ship motion in a short time [8]. Wang et al., built a multi-step prediction method combining an error correction and auto-regression (AR) model [9]. The experimental results showed that the proposed model has a better prediction effect than the original AR model. Due to the limitations of the prediction of a single statistical model, in recent years many scholars have begun to combine data decomposition with statistical models to improve the effect of ship motion prediction. Qin et al., used a combination of empirical mode decomposition (EMD) and discrete wavelet transform (DWT) on the AR model to process the ship motion attitude of nonlinear and non-stationary signals [10]. Compared with a single model, the AR-DWT-EMD model performs better on ship motion short-term prediction. Wang et al., established a dual auto-regression model to predict the ship pitch angle, resulting in better prediction results [11]. Pelevin presented a ship motion prediction approach based on the identified dynamic model and the disturbance and wind action models, whose parameters depend on the current conditions [12].

Machine learning algorithms used for regression prediction include support vector machines (SVM), ensemble learning algorithms, random forest regression (RFR), etc. The main machine learning algorithm used in ship motion attitude prediction is SVM, and many variants of SVM models have evolved. Nie et al., established a hybrid MSEMD-SVR model for ship motion short-term prediction, using mirror symmetry (MS) to eliminate the boundary effects of empirical mode decomposition (EMD) together with the support vector regression (SVR) algorithm, and achieved good prediction results [13]. Nie et al., used four different kernel functions to analyze the ship attitude prediction results of the SVR algorithm, and compared the practicability of the kernel functions for prediction [14]. Wang et al., used ensemble empirical mode decomposition (EEMD) and least squares (LS) to build an EEMD-LSSVM model, which effectively improved the accuracy and effectiveness of ship motion attitude prediction [15]. Li et al., built a new dynamic seasonal robust v-support vector regression forecasting model named DSR-vSVR, and used a Chaos Adaptive Efficient Fruit Fly Optimization Algorithm (CAEFOA) to optimize the model parameters, achieving better prediction results than the single model in ship motion prediction [16]. Ensemble learning algorithms and RFR algorithms have not yet been applied to ship motion attitude prediction, but have recently been used in multivariate time series forecasting in other fields. Jain et al., used the XGBoost algorithm to address the problem of telecommunication network traffic forecasting with time series data [17]. Kombo et al., established the KNN-RF model, which effectively predicted the groundwater level under a variety of predictive factors [18].

With the development of neural network and artificial intelligence technology, neural network models can process nonlinear and non-stationary data and have already been applied to time series predicting, and have achieved better results than traditional prediction methods [19,20,21]. Sun et al., used a recurrent neural network to predict ship behavior and optimize the parameters [22]. Yin et al., combined the discrete wavelet transform (DWT) method with the variable-structure radial basis function (RBF) network, and DWT can effectively improve the RBF prediction accuracy [23]. Yang et al., used the back propagation (BP) neural network to predict ship motion attitude and solved the problem of hidden layers [24]. Li et al., optimized neural network structure through sensitivity analysis and compared the prediction performance of neural networks under offline, online and hybrid learning, respectively [25]. Yao et al., proposed the particle swarm algorithm (PSO) and Long Short-Term Memory (LSTM) model for the parameter setting problem of an artificial neural network, which improved the accuracy of ship motion prediction [26]. Zhang et al., used wavelet transform to decompose ship motion into multiple frequency scales, which enabled LSTM to capture the inherent pattern of ship motion from each frequency scale and improve the prediction accuracy more effectively [27]. Sun proposed a new hybrid prediction model based on LSTM and Gaussian Process Regression (GPR) to predict ship motion [28]. Su et al., adopted a recurrent neural network for univariate and multivariate ship motion prediction, respectively [29]. Liu et al., optimized input vector space based on Impulse Response Function (IRF) and Auto-correlation Function (ACF), and combined LSTM to predict ship motion [30]. Rashid et al., combined the convolutional neural network (CNN) with LSTM and Gate Recurrent Unit (GRU) to predict ship motion [31]. 
Li et al., used a hybrid genetic cloud whale optimization algorithm (GCWOA) to optimize the hyper-parameters of the CNN-GRU-AM model to predict ship motion [32]. Zhang et al., selected an LSTM auto-encoder to extract key ship trajectory features and reconstruct the model input, and then used the attention mechanism and dual LSTM model to achieve trajectory prediction, which effectively improved the accuracy of trajectory prediction [33].

With the amount of time series data increasing, parameters of deep learning models used for prediction also increase, which is particularly prominent with recurrent neural network train time series. Many researchers have found that combining CNN with RNN can achieve the goal of reducing training parameters and making predictions more accurate. Kim et al., built a CNN-LSTM neural network that can extract spatial and temporal features to effectively predict housing energy consumption [34]. The proposed multivariate CNN-LSTM model has the advantages of low error and short training time. Mehtab et al., built four CNN- and five LSTM-based deep learning models, validated them and finally tested them on their performance [35]. Inspired by CNN and attention mechanism, Xiao et al., built a convolutional LSTM network model based on multivariate time series (MTS) prediction with two-stage attention [36]. Although the prediction performance of the CNN-LSTM model is better than that of a single model, its prediction accuracy can be further improved. In recent years, attention mechanisms have been widely used in natural language processing and computer vision. Choi et al., built a fine-grained (or 2D) attention mechanism in which each dimension of a context vector receives a separate attention score [37]. Usama et al., built an attention mechanism based on RNN and CNN for sentence sentiment analysis [38]. Woo et al., built a Convolutional Block Attention Module (CBAM), a simple yet effective attention module for feed-forward CNN [39]. CBAM sequentially infers attention maps along two separate dimensions, channel and spatial, then the attention maps are multiplied to the input feature map for adaptive feature refinement. Wang et al., introduced CBAM as an attention module into a novel VGG-based network to effectively identify COVID-19 disease [40]. Lu et al., built a laterality-reduction 3D CBAM-Resnet with a balanced-sampler strategy to improve the performance of the CNN [41]. 
Li et al., built a deep learning method named AC-LSTM, which includes one-dimensional CNN, LSTM and attention-based network for urban PM_2.5 concentration prediction [42]. Wu et al., built an attention-based CNN combining LSTM and bidirectional LSTM (Bi-LSTM) models for short-term load prediction of integrated energy system (IES) [43]. Zhu et al., built an attention-based CNN-LSTM model for gait trajectory prediction that predicts human knee and ankle joint trajectories based on upper and lower limb collaborative data [44]. Shi et al., proposed a ConvLSTM neural network that enables a convolution operation to act simultaneously with the three gates of LSTM, effectively capturing spatiotemporal features from the data [45].

In recent years, attention mechanisms such as multi-scale attention, temporal pattern attention and multivariate inter-attention have commonly been used for multivariate ship motion prediction and have achieved good prediction results. However, multivariate ship motion prediction remains a very complex problem. When predicting ship motion, the relationship between different variables at the same time step and the relationship between variables at different time steps are usually difficult to obtain. To solve these problems, a bidirectional convolutional long short-term memory neural network and channel attention combined model (Bi-ConvLSTM-CA) is proposed for multivariate ship motion prediction in this paper. The main neural networks used by researchers to predict ship motion are RNNs and CNNs. Usually, all ship motion data at one moment are input as a single time step of the RNN. In the proposed model, the input of the Bi-ConvLSTM for a single time step is multivariate ship motion data at multiple moments. The Bi-ConvLSTM can process the temporal and spatial information of motion data at the same time, and can extract the information between multiple variables in multiple time steps. Usually, most researchers use the fully connected layer to directly process the output to obtain the final predicted value. However, it is impossible to know which time step is more beneficial to prediction accuracy, and this may even cause temporal information redundancy. To solve this problem, the channel attention mechanism (CA) is used to process the output of different time steps of the Bi-ConvLSTM, so that the information of more important time steps can be selected. Finally, a fully connected layer is used to obtain the final predicted value. The experimental results on real ship data show that the proposed Bi-ConvLSTM-CA model achieves a significant reduction in error indexes compared with other advanced models.

2. Neural Network Feature Extraction Structure

2.1. Convolutional Neural Network

Convolutional neural networks (CNNs) have excellent feature extraction ability for unstructured data [46]. The CNN is a classic perception model and has been applied to the processing and analysis of image, speech and video data. It is a feedforward neural network with convolution as its core. Compared with other feedforward neural networks, a CNN has fewer parameters and strong generalization ability. A trained CNN model can be regarded as a nonlinear mapping, for example from a time series vector to the predicted value at the next moment, or from image pixels to the category of the image.

2.1.1. Convolutional Layer

The convolutional layer is one of the main components of the CNN. The convolution kernel computes a small-scale weighted sum at each position of the multi-channel input data in the form of a sliding window. The convolutional layer is a linear computational layer that reduces the number of parameters of the entire model by taking advantage of the locality and position independence of features in the input data. The specific process of the single-channel convolution operation is shown in Figure 1. The specific process of the multi-channel convolution operation is shown in Figure 2.

In Figure 1, a two-dimensional input array and a two-dimensional kernel array output a two-dimensional array through a cross-correlation operation. In the two-dimensional cross-correlation operation, the convolution window starts from the top left of the input array and slides over the input array from left to right and top to bottom. When the convolution window slides to a certain position, the input sub-array in the window and the kernel array are multiplied element-wise and summed to obtain the element at the corresponding position in the output array.
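The sliding-window computation described above can be sketched in a few lines of plain Python. This is only an illustration: the function name `cross_correlate2d` and the small example arrays are assumptions made here, not values taken from Figure 1.

```python
def cross_correlate2d(x, k):
    """Valid-mode 2D cross-correlation of input x with kernel k (lists of lists)."""
    kh, kw = len(k), len(k[0])
    out_h = len(x) - kh + 1
    out_w = len(x[0]) - kw + 1
    out = []
    for r in range(out_h):            # window slides top to bottom
        row = []
        for c in range(out_w):        # window slides left to right
            s = 0.0
            for i in range(kh):
                for j in range(kw):
                    # element-wise product of the input sub-array and the kernel
                    s += x[r + i][c + j] * k[i][j]
            row.append(s)
        out.append(row)
    return out


# Hypothetical 3x3 input and 2x2 kernel, chosen only for illustration.
x = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
k = [[0, 1], [2, 3]]
y = cross_correlate2d(x, k)
print(y)  # [[19.0, 25.0], [37.0, 43.0]]
```

For instance, the top-left output element is 0·0 + 1·1 + 3·2 + 4·3 = 19.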

In Figure 2, the input data of the convolution layer are n pieces of two-dimensional data $\{I(i) \mid i = 1, 2, \ldots, n\}$, and the output data are m pieces of two-dimensional data $\{O(i) \mid i = 1, 2, \ldots, m\}$. The middle part of Figure 2 represents the parameters of the convolutional layer $\{C(i, j), b(i) \mid i = 1, 2, 3, \ldots, m;\ j = 1, 2, 3, \ldots, n\}$. Among them, C is the convolution kernel parameter; b is the threshold parameter; i is the number of channels of the corresponding output data or the number of convolution kernels; j corresponds to the number of channels of the input data. The blue box represents the convolution kernel set. The green box represents the convolution kernel corresponding to the multi-channel input data. The purple box represents the convolution operation threshold corresponding to the convolution kernel set. The convolution kernel set performs a convolution operation on the input data, and each channel is linearly processed by the convolutional feature to obtain the output O.
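A minimal sketch of the multi-channel case follows. It assumes the standard formulation in which output channel i is the sum over input channels j of the cross-correlation of I(j) with C(i, j), plus the threshold b(i); the text does not write this sum out explicitly, so treat it as an assumption. The helper names and the example numbers are illustrative only.

```python
def corr2d(x, k):
    """Valid-mode 2D cross-correlation of one channel with one kernel."""
    kh, kw = len(k), len(k[0])
    return [[sum(x[r + i][c + j] * k[i][j] for i in range(kh) for j in range(kw))
             for c in range(len(x[0]) - kw + 1)]
            for r in range(len(x) - kh + 1)]


def conv_layer(inputs, kernels, biases):
    """inputs: n channels; kernels[i][j]: kernel C(i, j); biases[i]: threshold b(i)."""
    outputs = []
    for kernel_set, b in zip(kernels, biases):      # one kernel set per output channel
        acc = None
        for x, k in zip(inputs, kernel_set):        # accumulate over input channels
            y = corr2d(x, k)
            if acc is None:
                acc = y
            else:
                acc = [[a + v for a, v in zip(ra, rv)] for ra, rv in zip(acc, y)]
        outputs.append([[v + b for v in row] for row in acc])
    return outputs


# Hypothetical example: n = 2 input channels, m = 2 output channels.
inputs = [[[0, 1, 2], [3, 4, 5], [6, 7, 8]],
          [[1, 2, 3], [4, 5, 6], [7, 8, 9]]]
kernels = [[[[0, 1], [2, 3]], [[1, 2], [3, 4]]],    # C(1, 1), C(1, 2)
           [[[0, 0], [0, 0]], [[0, 0], [0, 0]]]]    # C(2, 1), C(2, 2)
biases = [0.0, 1.0]
outputs = conv_layer(inputs, kernels, biases)
print(outputs[0])  # [[56.0, 72.0], [104.0, 120.0]]
print(outputs[1])  # [[1.0, 1.0], [1.0, 1.0]]
```

The second output channel uses all-zero kernels, so it shows only the effect of the threshold b(2).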

2.1.2. Activation and Pooling Layers

The activation layer is a nonlinear layer that uses an activation function to process the input data, which improves the nonlinear modeling ability of neural networks. Commonly used activation functions include Sigmoid, Tanh and ReLU [47].

The pooling layer has two purposes: reducing the computational complexity of the network and extracting the main feature information of the data. There are two types of pooling layers: max pooling and average pooling. The max-pooling layer operation selects the largest element in the defined input kernel as output, and the average-pooling layer calculates the average value in the defined input kernel. Figure 3 is a diagram of the pooling operation. In Figure 3, the pooling operation is selected as the average or maximum value of the pooled kernel size region on the two-dimensional array.
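The two pooling variants can be sketched as below. The function name `pool2d`, the 4 × 4 input and the 2 × 2 kernel with stride 2 are assumptions chosen for illustration and are not taken from Figure 3.

```python
def pool2d(x, size, stride, mode="max"):
    """Max or average pooling over a 2D list with the given kernel size and stride."""
    ph, pw = size
    sh, sw = stride
    out = []
    for r in range(0, len(x) - ph + 1, sh):
        row = []
        for c in range(0, len(x[0]) - pw + 1, sw):
            window = [x[r + i][c + j] for i in range(ph) for j in range(pw)]
            if mode == "max":
                row.append(max(window))                 # largest element in the window
            else:
                row.append(sum(window) / len(window))   # mean of the window
        out.append(row)
    return out


x = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
max_out = pool2d(x, (2, 2), (2, 2), mode="max")
avg_out = pool2d(x, (2, 2), (2, 2), mode="avg")
print(max_out)  # [[6, 8], [14, 16]]
print(avg_out)  # [[3.5, 5.5], [11.5, 13.5]]
```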

2.2. Recurrent Neural Network

RNNs are special neural networks primarily used to process sequential data, such as natural language text or stock market prices. To handle time series, a traditional feedforward neural network sets a separate parameter set for each time step, so that it obtains information about each position and learns the pattern at that position. However, the memory requirement is high, which slows training. In contrast, RNNs share the same set of parameters over time and combine them with updated state variables to predict the next value in the sequence. Since RNNs are trained by backpropagation through time, they can only store limited memory by forcibly updating the hidden state at each training step, and vanishing and exploding gradients occur during training. LSTM and GRU were proposed to better solve these problems. Compared with a plain RNN, LSTM has more parameters and a more complex structure, which gives better control over which memory is stored and which is discarded at a given time step. As a special RNN, LSTM can learn long-term dependency information, and performs well in processing time series data with temporal autocorrelation. The LSTM structure can be divided into state variables and gating mechanisms as follows:

(1): Cell state: the internal cell state (memory) of the LSTM.
(2): Hidden state: the outer hidden state used to calculate the prediction result.
(3): Input gate: determines the current input sent to the cell state.
(4): Forget Gate: determines the previous cell state that is sent to the current cell state.
(5): Output gate: determines the cell state that is output to the hidden state.

LSTM uses a gating mechanism to make the cell state change rapidly and make the final hidden state change relatively slowly, which helps solve the problem of vanishing and exploding gradients. To reduce the number of parameters, GRU was proposed.

2.2.1. ConvLSTM Structure

Srivastava et al. [48] proposed the FC-LSTM network in 2015. The input-to-state and state-to-state transfer adopts full connection. Input, output and state variables of FC-LSTM network are all one-dimensional. The ConvLSTM network was proposed on the basis of FC-LSTM in 2016, and changed the full connection to convolution operation during information transfer, which can extract data spatial feature information. The ConvLSTM network has the ability to prevent the gradient from exploding and vanishing, and can simultaneously process the temporal and spatial information of multiple temporal variables. The main calculation equations of ConvLSTM are shown in Equations (1)–(5):

$$i_{t} = \sigma(W_{xi} * x_{t} + W_{hi} * h_{t-1} + W_{ci} \circ c_{t-1} + b_{i}) \tag{1}$$

$$f_{t} = \sigma(W_{xf} * x_{t} + W_{hf} * h_{t-1} + W_{cf} \circ c_{t-1} + b_{f}) \tag{2}$$

$$o_{t} = \sigma(W_{xo} * x_{t} + W_{ho} * h_{t-1} + W_{co} \circ c_{t-1} + b_{o}) \tag{3}$$

$$c_{t} = f_{t} \circ c_{t-1} + i_{t} \circ \tanh(W_{xc} * x_{t} + W_{hc} * h_{t-1} + b_{c}) \tag{4}$$

$$h_{t} = o_{t} \circ \tanh(c_{t}) \tag{5}$$

where $*$ is the convolution operation, $\circ$ is the Hadamard product and $\sigma$ is the Sigmoid activation function; $W_{x}$, $W_{h}$, $W_{c}$ and $b$ are the input data weight, the hidden state data weight, the memory state data weight and the threshold, respectively.

Figure 4 shows the internal structure of ConvLSTM, showing the gating part and memory unit.

In Figure 4, the blue dashed box represents the forget gate; the pink dashed box represents the input gate; the green dashed box represents the output gate; the red solid line represents the reweighted memory state.
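To make the gate equations concrete, the following is a minimal single-channel sketch of one ConvLSTM step in plain Python. Several things here are assumptions rather than details from the paper: zero "same" padding so that the state keeps its spatial size, scalar thresholds, a separate input weight `Wxc` for the candidate memory inside the tanh of Equation (4), and the dictionary keys used to store the parameters.

```python
import math


def conv_same(x, k):
    """Zero-padded 2D cross-correlation that keeps the input size (odd-sized kernel)."""
    h, w, kh, kw = len(x), len(x[0]), len(k), len(k[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            s = 0.0
            for i in range(kh):
                for j in range(kw):
                    rr, cc = r + i - ph, c + j - pw
                    if 0 <= rr < h and 0 <= cc < w:
                        s += x[rr][cc] * k[i][j]
            out[r][c] = s
    return out


def emap(fn, *grids):
    """Apply fn element-wise across same-shaped 2D lists."""
    return [[fn(*vals) for vals in zip(*rows)] for rows in zip(*grids)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def convlstm_step(x, h_prev, c_prev, p):
    """One ConvLSTM step following Equations (1)-(5); p holds hypothetical parameter names."""
    def gate(name, act, peephole=True):
        pre = emap(lambda a, b: a + b + p["b" + name],
                   conv_same(x, p["Wx" + name]), conv_same(h_prev, p["Wh" + name]))
        if peephole:  # Hadamard term W_c o c_{t-1}
            pre = emap(lambda a, wc, cp: a + wc * cp, pre, p["Wc" + name], c_prev)
        return emap(act, pre)

    i = gate("i", sigmoid)                       # Eq. (1), input gate
    f = gate("f", sigmoid)                       # Eq. (2), forget gate
    o = gate("o", sigmoid)                       # Eq. (3), output gate (uses c_{t-1} as in the text)
    g = gate("c", math.tanh, peephole=False)     # candidate memory inside Eq. (4)
    c = emap(lambda f_, cp, i_, g_: f_ * cp + i_ * g_, f, c_prev, i, g)   # Eq. (4)
    h = emap(lambda o_, c_: o_ * math.tanh(c_), o, c)                     # Eq. (5)
    return h, c


# Illustrative check with all-zero weights: every gate equals 0.5 and the candidate is 0.
H, W = 2, 3
zeros = [[0.0] * W for _ in range(H)]
ones = [[1.0] * W for _ in range(H)]
k0 = [[0.0] * 3 for _ in range(3)]
params = {}
for name in "ifoc":
    params["Wx" + name], params["Wh" + name], params["b" + name] = k0, k0, 0.0
for name in "ifo":
    params["Wc" + name] = zeros
x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
h, c = convlstm_step(x, zeros, ones, params)
print(c[0][0], h[0][0])  # 0.5 and 0.5 * tanh(0.5)
```

With zero weights the memory is simply halved by the forget gate, which makes the step easy to verify by hand before real kernels are plugged in.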

2.2.2. Bi-ConvLSTM Structure

The ConvLSTM network usually extracts time series information in the forward direction only, and cannot extract the correlation between the current time step and later time steps when dealing with complex and changing time series. On the basis of the forward ConvLSTM network, a reverse ConvLSTM network is added to form a bidirectional ConvLSTM network, namely Bi-ConvLSTM. The Bi-ConvLSTM network can extract the correlation between forward and reverse time steps, and can effectively extract the time information from forward and reverse time steps in time series. Figure 5 shows the Bi-ConvLSTM network structure.

In Figure 5, Bi-ConvLSTM adds a reverse ConvLSTM to the forward ConvLSTM network. For the input sequence $X_{0}, X_{1}, \ldots, X_{t}$, the Bi-ConvLSTM model generates the forward hidden output $\overrightarrow{h}_{t}$ and the reverse hidden output $\overleftarrow{h}_{t}$, which together form the final output $Y_{t}$.

Bi-ConvLSTM can introduce the temporal correlation of ship motion and better learn the temporal features of ship motion.
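The bidirectional idea can be sketched independently of the cell type. In the sketch below, `toy_step` is a hypothetical stand-in for a ConvLSTM cell, and pairing the forward and reverse outputs at each time step is an assumption, since the text does not state how the two hidden outputs are merged into the final output.

```python
def run_bidirectional(seq, step, h0):
    """Run a recurrent step forward and backward over seq and pair the outputs per time step."""
    fwd, h = [], h0
    for x in seq:                      # forward pass: X_0 ... X_t
        h = step(x, h)
        fwd.append(h)
    bwd, h = [], h0
    for x in reversed(seq):            # reverse pass: X_t ... X_0
        h = step(x, h)
        bwd.append(h)
    bwd.reverse()                      # re-align reverse outputs with the forward time index
    return list(zip(fwd, bwd))


def toy_step(x, h):
    """Hypothetical recurrent update used only to make the example runnable."""
    return 0.5 * h + 0.5 * x


out = run_bidirectional([1.0, 2.0, 3.0], toy_step, 0.0)
print(out)  # [(0.5, 1.375), (1.25, 1.75), (2.125, 1.5)]
```

At the first time step the forward state has seen only the first input, while the reverse state has already seen the whole sequence, which is the extra context a bidirectional network provides.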

2.3. Channel Attention Mechanism

The channel attention mechanism is used to explore connections between channels within features. After the three-dimensional data are convolved with multiple convolution kernels, three-dimensional data of multiple channels will be generated, and the attention mechanism will pay attention to the channel features that are meaningful relative to the generated three-dimensional data. To compute channel attention weights, the spatial data on each channel needs to be compressed. The basic operation of the channel attention mechanism is the pooling operation, which performs average-pooling and max-pooling on the feature data $F \in \mathbb{R}^{H \times W \times C}$, where C represents the number of data channels, and H and W represent height and width of each channel data, respectively. The two pooling outputs are input into a multi-layer perceptron (to avoid overfitting, the perceptron in this paper is set to one layer, that is, a fully connected layer), sharing the same parameters to generate each descriptor. Finally, the perceptron output is passed through the Sigmoid function to obtain the channel attention map $M_{c} \in \mathbb{R}^{1 \times 1 \times C}$. Figure 6 shows the structure of the channel attention model.

In Figure 6, input data of the model contain multiple channels, and each channel has the same dimension. The red part (MaxPool) extracts the maximum value of each channel data of the input feature. The green part (AvgPool) extracts the average value of each channel data of the input feature. The Shared Dense Layer is a shared perceptron, which performs nonlinear processing on the outputs of MaxPool and AvgPool to obtain corresponding outputs. Then, the output parts of the Shared Dense Layer are added and normalized using the Sigmoid function to obtain the channel attention weights. Carry out element-wise multiplication between the channel attention weights and the input feature to obtain weighted feature data. Finally, the newly generated weighted feature data are added to the values at the corresponding position of each channel to obtain the output feature. The channel attention mechanism process is shown in Equations (6)–(10).

$$F_{Avg}^{C} = \{E(F^{1}), E(F^{2}), \ldots, E(F^{i}) \mid i = C\} \tag{6}$$

$$F_{Max}^{C} = \{\mathrm{Max}(F^{1}), \mathrm{Max}(F^{2}), \ldots, \mathrm{Max}(F^{i}) \mid i = C\} \tag{7}$$

$$M_{C}(F) = \sigma\left(\mathrm{Relu}(W \otimes F_{Avg}^{C} + b) + \mathrm{Relu}(W \otimes F_{Max}^{C} + b)\right) \tag{8}$$

$$\bar{F} = M_{C}(F) \otimes F \tag{9}$$

$$\hat{F} = \sum_{t=1}^{C} \bar{F}^{t} \tag{10}$$

where $\otimes$ is the element-wise multiplication; $\sigma$ is the Sigmoid function; $W \in \mathbb{R}^{C \times C}$ and b are the weights and thresholds of the fully connected layer, respectively; the fully connected layer adopts the Relu activation function; $F^{i}$ is the i-th channel data of $F \in \mathbb{R}^{H \times W \times C}$; $F_{Avg}^{C} \in \mathbb{R}^{1 \times 1 \times C}$ is the channel average-pooling output; $F_{Max}^{C} \in \mathbb{R}^{1 \times 1 \times C}$ is the channel max-pooling output; $\bar{F}^{t}$ is the t-th channel data of $\bar{F} \in \mathbb{R}^{H \times W \times C}$; $\hat{F} \in \mathbb{R}^{H \times W}$ is the output feature of the attention mechanism.
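The channel attention computation can be sketched as follows. One interpretation is assumed here: the shared fully connected layer is applied to the pooled length-C vectors as an ordinary matrix-vector product with the C × C weight matrix. The function name and the tiny example tensor are also illustrative choices, not values from the paper.

```python
import math


def channel_attention(F, W, b):
    """F: list of C channels, each an H x W grid; W: C x C weights; b: length-C thresholds."""
    C = len(F)
    flat = [[v for row in ch for v in row] for ch in F]
    f_avg = [sum(ch) / len(ch) for ch in flat]      # Eq. (6): per-channel average
    f_max = [max(ch) for ch in flat]                # Eq. (7): per-channel maximum

    def dense_relu(v):
        # Shared fully connected layer with ReLU (assumed matrix-vector product)
        return [max(0.0, sum(W[i][j] * v[j] for j in range(C)) + b[i]) for i in range(C)]

    a, m = dense_relu(f_avg), dense_relu(f_max)
    weights = [1.0 / (1.0 + math.exp(-(ai + mi))) for ai, mi in zip(a, m)]   # Eq. (8)

    rows, cols = len(F[0]), len(F[0][0])
    # Eq. (9) weights each channel; Eq. (10) sums the weighted channels position by position.
    out = [[sum(weights[c] * F[c][r][k] for c in range(C)) for k in range(cols)]
           for r in range(rows)]
    return weights, out


# Hypothetical input with C = 2 channels of size 1 x 2 and an all-zero dense layer.
F = [[[1.0, 3.0]], [[2.0, 2.0]]]
W = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]
weights, out = channel_attention(F, W, b)
print(weights)  # [0.5, 0.5]
print(out)      # [[1.5, 2.5]]
```

With a zero dense layer every channel receives the neutral weight 0.5, so the output is half the channel-wise sum; trained weights would shift this balance toward the more informative channels.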

3. Bi-ConvLSTM-CA Model for Ship Pitch Prediction

3.1. Supervised Learning Data Processing

The model used in this paper is a deep learning model based on supervised learning, which requires labeled data. A sliding time window is used to process the ship motion data: the data of previous moments are used as samples, and the data of future moments are used as labels. During ship motion, the pitch of the ship is easily affected by environmental factors. According to the factors that affect the pitch of the ship, six kinds of data (pitch angle, turning angle, relative wind speed, relative wind direction, velocity in surge and velocity in sway) are selected as inputs to predict the future pitch angle of the ship. Based on the given data $Y = \{(y_{1}, y_{2}, \ldots, y_{n}) \mid y \in \mathbb{R}^{1 \times 6}\}$, the ship motion at historical times $Y_{t} = \{(y_{t-d+1}, y_{t-d+2}, \ldots, y_{t}) \mid y \subseteq Y\}$ is used to predict $Y_{t+m} = \{(x_{t+1}, x_{t+2}, \ldots, x_{t+m}) \mid x \in \mathbb{R}\}$, where d is the size of the sliding window and m is the prediction horizon. Unlike the LSTM model, ConvLSTM processes both temporal and spatial information of time series. Therefore, the input of ConvLSTM is three-dimensional (Height, Width, Channel) data, so the windowed data need to be converted. The data conversion flowchart is shown in Figure 7.

In Figure 7, the red box on the left represents the sliding window, and the green curve represents ship motion data. The yellow box on the right represents multivariate data in multi-time steps in the sliding window. T₁, T₂, T₃ and T₄ represent multiple time data composed of 4 time steps, that is, 4 inputs of ConvLSTM, and the data of one time step are the input of a ConvLSTM cell. The step size of the sliding window is set to 1, the window size is set to $6 \times 12$, and the prediction time is set to 1 s. The 4 time steps contain 6 variables in 12 s, and each time step contains 6 variables in 3 s. The multivariate data have time steps of 4, and the size of multivariate data at each time step is $6 \times 3 \times 1$. This paper uses single ship data, so the number of data channels per time step is 1.
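The window-to-tensor conversion described above can be sketched as follows. The function name `make_samples`, its parameter names, the placement of pitch angle in column 0 and the synthetic series are all assumptions for illustration; only the sizes (window of 12 s, 4 time steps of 6 × 3 × 1, stride 1, 1 s horizon) come from the text.

```python
def make_samples(series, window=12, steps=4, horizon=1, target=0):
    """Turn a multivariate series (list of rows, one row per second) into ConvLSTM samples.

    Each sample has `steps` time steps; each time step has shape
    (n_vars, window // steps, 1), i.e. variables x seconds x channel.
    The label is the target variable `horizon` seconds after the window.
    """
    per_step = window // steps
    n_vars = len(series[0])
    X, y = [], []
    for end in range(window, len(series) - horizon + 1):   # window stride of 1
        win = series[end - window:end]
        sample = []
        for s in range(steps):
            chunk = win[s * per_step:(s + 1) * per_step]
            sample.append([[[row[v]] for row in chunk] for v in range(n_vars)])
        X.append(sample)
        y.append(series[end + horizon - 1][target])
    return X, y


# Synthetic 20-second series with 6 variables; the value encodes (second, variable).
series = [[t * 10 + v for v in range(6)] for t in range(20)]
X, y = make_samples(series)
print(len(X), len(X[0]), len(X[0][0]), len(X[0][0][0]), len(X[0][0][0][0]))  # 8 4 6 3 1
print(X[0][0][0], y[0])  # [[0], [10], [20]] 120
```

The first sample covers seconds 0 to 11, split into four 3-second blocks, and its label is the target variable at second 12.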

Neural networks are used to learn the mapping between the ship motion historical data $Y_{t}$ and the corresponding target value $Y_{t+m}$. The parameters of the model are continuously updated to minimize the loss function. The prediction equation is shown in (11).

$$\hat{y}_{t+m} = G(Y_{t}, Y_{t+m} \mid w, b) \tag{11}$$

3.2. Bi-ConvLSTM-CA Prediction Model Structure for Ship Pitch Angle

The Bi-ConvLSTM-CA model is proposed for multivariate ship pitch prediction. Firstly, the 6 kinds of multivariate data are converted into three-dimensional data, that is, the data from several consecutive historical seconds are grouped into the input of the same ConvLSTM cell. Secondly, the Bi-ConvLSTM module is used to learn the temporal information and the correlation between the variables. A convolution layer and a pooling layer are used to reduce redundant information learned by the Bi-ConvLSTM network. The channel attention mechanism module processes the output of the convolution and pooling layers to obtain the weight of each channel. Then, a flatten layer converts the multi-dimensional data into one-dimensional data. Finally, a fully connected layer is used to produce the final ship pitch angle prediction. The overall structure of the Bi-ConvLSTM-CA ship pitch angle prediction model is shown in Figure 8.

In Figure 8, the specific processes are described as follows:

(1): Supervised Data: A sliding window is moved over the multivariable data to form three-dimensional data of multiple time steps as the input and the one-dimensional ship pitch angle as the label.
(2): Bi-ConvLSTM Layer: The processed data are input into the Bi-ConvLSTM network layer. The Bi-ConvLSTM network layer performs 4 convolution operations on data, which can extract the information from data in multiple previous seconds. The Bi-ConvLSTM network layer is set as follows: the convolution kernel size of each ConvLSTM cell is $[k_{1}, k_{2}]$, the number of convolution kernels is $m_{1}$, and the number of layers of the Bi-ConvLSTM network is 2.
(3): Convolution and pooling operation layer: The output of the Bi-ConvLSTM layer is input into the convolution layer to reduce redundant information and prevent model overfitting. The settings of the convolution layer are as follows: the size of the convolution kernel is $[k_{3}, k_{4}]$, the number of convolution kernels is $m_{2}$, and the sliding step of the convolution kernel is $[s_{3}, s_{4}]$. The output of the convolutional layer is the input of the pooling layer, for which the pooling kernel size is set to $[p_{1}, p_{2}]$ and the sliding step is $[q_{1}, q_{2}]$. The pooling layer aggregates the two-dimensional data information of each channel and uses a nonlinear activation function to extract useful information. Finally, the concatenate layer aggregates the output of the pooling layer in the data channel dimension to extract information from different ConvLSTM cells.
(4): Channel Attention Mechanism: The output of the concatenate layer is the input of the channel attention module. The maximum and average values of each channel data are extracted by the pooling layer. The sum of the extracted values is normalized by the Sigmoid function to obtain the weight of each channel. The input features of the module are multiplied by the corresponding weights to form new data features. Finally, the data features of multiple channels are added to obtain single-channel data features.
(5): The fully connected layer: First, the single-channel data of the attention module are compressed into one-dimensional data using the Flatten Layer. The output of the Flatten Layer is the input of the fully connected layer. In order to prevent overfitting, the number of nodes in the fully connected layer is set to 1. The linear function is used to achieve the output $H = \phi(W_{fc} x + b_{fc})$, where $W_{fc}$ is the weight matrix of the fully connected layer, $b_{fc}$ is the threshold vector and $x$ is the input data.
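As a small illustration of the last step, the flatten operation and the single-node fully connected layer can be sketched as below. The function names, weights and input grid are hypothetical, and the activation is taken to be the identity, which is one reading of "linear function" in step (5).

```python
def flatten(grid):
    """Compress a 2D single-channel feature map into a one-dimensional list."""
    return [v for row in grid for v in row]


def dense_single(x, w, b, phi=lambda z: z):
    """Single-node fully connected layer H = phi(w . x + b); phi defaults to identity."""
    return phi(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Hypothetical attention output and layer parameters.
feature = [[1.0, 2.0], [3.0, 4.0]]
x = flatten(feature)
pred = dense_single(x, w=[0.5, 0.5, 0.5, 0.5], b=1.0)
print(x, pred)  # [1.0, 2.0, 3.0, 4.0] 6.0
```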

4. Experimental Results and Analysis of Ship Pitch Prediction

4.1. Data Preprocessing and Experimental Environment Configuration

In order to prove the effectiveness of the proposed method, real ship motion data are used as the dataset. The dataset contains six kinds of data: pitch angle, turning angle, relative wind speed, relative wind direction, velocity in surge and velocity in sway. The number of samples in the dataset is 10,000, and the sampling interval is 1 s. The first 80% of the data is used for model training, and the last 20% of the data is used for testing. Figure 9 shows part of the raw data of the ship motion. Turning angle, relative wind speed, relative wind direction, velocity in surge and velocity in sway all have certain effects on pitch angle, and their changes will also lead to corresponding changes in pitch angle. All the data are collected by the sensors during the navigation of the real ship.

It can be seen from Figure 9 that the data in each motion sequence fluctuate greatly with time. In order to avoid the model falling into a local optimum, the raw data are normalized, as shown in Equation (12).

$$x_{norm} = \frac{(y_{max} - y_{min}) (x - x_{min})}{x_{max} - x_{min}} + y_{min} \tag{12}$$

where $y_{max} = 1$, $y_{min} = 0$, $x_{max}$ and $x_{min}$ are the maximum and minimum values of each sequence, respectively, $x$ is the raw data, and $x_{norm}$ is the normalized value of $x$.
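As a hedged illustration of Equation (12), the following NumPy sketch rescales each sequence (column) to the range $[y_{min}, y_{max}]$. The function name `min_max_normalize` and the sample values are hypothetical, and the sketch assumes that no sequence is constant, since a constant sequence would make the denominator zero.

```python
import numpy as np


def min_max_normalize(x, y_min=0.0, y_max=1.0):
    """Apply Equation (12) column by column (one column per motion sequence)."""
    x = np.asarray(x, dtype=float)
    x_min = x.min(axis=0)   # minimum of each sequence
    x_max = x.max(axis=0)   # maximum of each sequence
    return (y_max - y_min) * (x - x_min) / (x_max - x_min) + y_min


# Hypothetical raw data: three samples of two variables.
raw = np.array([[0.0, 10.0],
                [5.0, 20.0],
                [10.0, 30.0]])
print(min_max_normalize(raw))
```

With $y_{min} = 0$ and $y_{max} = 1$, the smallest value of each sequence maps to 0 and the largest to 1.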

The configuration of the computer used in the experiment is as follows: the processor is an Intel Core i5-5200U CPU with a frequency of 2.20 GHz; the memory is 4 GB; the graphics card is a GTX950M; the operating system is Ubuntu18.0 (64-bit). The experimental software environment is as follows: PyCharm Community Edition 2019 with Python 3.6; the deep learning framework is Keras 2.2.2 with TensorFlow 1.10; the driver version is NVIDIA Driver Version 460.91.03 with CUDA Version 11.2.

4.2. Performance Evaluation Indexes

In order to evaluate the prediction performance of the model, root mean square error (RMSE), mean absolute percentage error (MAPE), mean square error (MSE) and mean absolute error (MAE) are used as evaluation indexes for the model prediction results. RMSE directly reflects the deviation between real values and predicted values. MSE has a squared form that is easy to differentiate, so it is often used as a loss function for deep learning models. The equations are shown in (13)–(16).

$$RMSE = \sqrt{\frac{\sum_{i = 1}^{n} ({\hat{y}}_{i} - y_{i})^{2}}{n}} \tag{13}$$

$$MAPE = \frac{100\%}{n} \sum_{i = 1}^{n} \left| \frac{{\hat{y}}_{i} - y_{i}}{y_{i}} \right| \tag{14}$$

$$MAE = \frac{1}{n} \sum_{i = 1}^{n} \left| {\hat{y}}_{i} - y_{i} \right| \tag{15}$$

$$MSE = \frac{1}{n} \sum_{i = 1}^{n} ({\hat{y}}_{i} - y_{i})^{2} \tag{16}$$

where $n$ is the number of samples, $y_{i}$ is the real value, and ${\hat{y}}_{i}$ is the predicted value.
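Equations (13)–(16) translate directly into NumPy. The sketch below is illustrative only; the function name `evaluate` and the sample values are hypothetical, and MAPE is undefined whenever a real value $y_{i}$ equals zero.

```python
import numpy as np


def evaluate(y_true, y_pred):
    """Return RMSE, MAPE (in percent), MAE and MSE as defined in (13)-(16)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mse = np.mean(err ** 2)
    return {
        "RMSE": float(np.sqrt(mse)),
        "MAPE": float(100.0 * np.mean(np.abs(err / y_true))),
        "MAE": float(np.mean(np.abs(err))),
        "MSE": float(mse),
    }


# Hypothetical real and predicted values.
scores = evaluate([1.0, 2.0, 4.0], [1.5, 2.0, 3.0])
print(scores)
```

Note that RMSE is simply the square root of MSE, so the two indexes always rank models in the same order.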

In order to compare the prediction effects of various models, 9 models commonly used in time series prediction were selected for comparison with the Bi-ConvLSTM-CA model proposed in this paper. These models were Gradient Boost Regression Tree (GBRT), Random Forest Regression (RFR), LSTM, Bi-LSTM, Convolutional LSTM (ConvLSTM), CNN-LSTM, GRU, Bi-LSTM-TPA [49] and Bi-LSTMC [50]. Among them, Bi-LSTM, LSTM, GRU and CNN-LSTM are traditional deep learning methods for time series prediction, while Bi-LSTM-TPA and Bi-LSTMC were recently proposed for ship roll angle prediction. In order to reflect the improvement of the proposed model compared with the other models, promoting mean absolute error (PMAE), promoting mean square error (PMSE), promoting root mean square error (PRMSE) and promoting mean absolute percentage error (PMAPE) were used in the experiments; the equations are shown in (17)–(20):

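$$PMAE = \frac{MAE_{1} - MAE_{2}}{MAE_{1}} \tag{17}$$

$$PMAPE = \frac{MAPE_{1} - MAPE_{2}}{MAPE_{1}} \tag{18}$$

$$PRMSE = \frac{RMSE_{1} - RMSE_{2}}{RMSE_{1}} \tag{19}$$

$$PMSE = \frac{MSE_{1} - MSE_{2}}{MSE_{1}} \tag{20}$$

where the subscript 1 denotes the comparison model and the subscript 2 denotes the Bi-ConvLSTM-CA model, which is the convention consistent with the values reported in Tables 2 and 3.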
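The promotion indexes in (17)–(20) all share one form, so a single helper covers them. The sketch below assumes that subscript 1 refers to the comparison model and subscript 2 to the Bi-ConvLSTM-CA model; the function name `promotion` is hypothetical. The inputs are the GBRT and Bi-ConvLSTM-CA rows of Table 2, and under that assumption the results match the GBRT row of Table 3.

```python
def promotion(err_comparison, err_proposed):
    """Relative error reduction of the proposed model over a comparison model."""
    return (err_comparison - err_proposed) / err_comparison


# Table 2 values: (GBRT, Bi-ConvLSTM-CA) for each index.
table2 = {
    "PMSE": (4.625, 1.995),      # MSE in units of 1e-3
    "PMAE": (0.0466, 0.03603),
    "PMAPE": (6.288, 4.440),     # MAPE in percent
    "PRMSE": (0.0680, 0.04466),
}

for name, (gbrt, proposed) in table2.items():
    print(name, round(100 * promotion(gbrt, proposed), 2))
# → PMSE 56.86, PMAE 22.68, PMAPE 29.39, PRMSE 34.32
```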

4.3. Model Training

4.3.1. Hyper-Parameters Setting

In order to optimize the performance of the Bi-ConvLSTM-CA model, the grid search method was used to optimize the hyper-parameters of the model. The optimal learning algorithm was obtained by optimizing the main model training parameters through the cross-validation method. The parts that need to be optimized are: Bi-ConvLSTM network layer, CNN network layer and training parameters. The specific hyper-parameter settings of each part are shown in Table 1.
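The general shape of a grid search can be sketched as follows. This is a generic illustration under stated assumptions, not the procedure used in the paper: the candidate values in `grid` are hypothetical, and `toy_score` is only a stand-in for the expensive step of training the model and returning a cross-validated error.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every combination in param_grid and keep the lowest score."""
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical candidate values for three hyper-parameters.
grid = {"filters": [20, 40, 60], "batch_size": [32, 64, 128], "epochs": [40, 80]}


def toy_score(params):
    """Placeholder for 'train the model, return cross-validated MSE'."""
    return (abs(params["filters"] - 40)
            + abs(params["batch_size"] - 64)
            + abs(params["epochs"] - 80))


best, score = grid_search(grid, toy_score)
print(best, score)
```

The cost grows with the product of the candidate list lengths (here 3 × 3 × 2 = 18 evaluations), which is why grid search is usually restricted to a few key hyper-parameters.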

4.3.2. Bi-ConvLSTM-CA Model Training

The input data of the model include pitch angle, turning angle, relative wind speed, relative wind direction, velocity in surge and velocity in sway, and the output is pitch angle. Six kinds of variables over a 12 s sampling window are input, and the output is the pitch angle for 1 s. The dimension of the input data is $6 \times 3 \times 1$. The mini-batch gradient descent method with the Adam optimizer was used to iteratively update the parameters to minimize the error between the predicted values and the label values, and continuous prediction is achieved in a sliding-window manner. Ship motion prediction is thus treated as a regression task, and MSE was used as the loss function of the model. The expression of the objective function is:

$$J = \frac{1}{n} \sum_{i = 1}^{n} (y_{i} - {\hat{y}}_{i})^{2} \tag{21}$$

where $n$ is the number of samples, $y_{i}$ is the real value, and ${\hat{y}}_{i}$ is the predicted value.
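A sliding-window construction of supervised samples, together with the chronological 80%/20% split described in Section 4.1, might look like the sketch below. It is an assumption-laden illustration: the function names are hypothetical, pitch angle is assumed to be stored in column 0, the target is assumed to be the pitch value immediately after each 12 s window, and the further reshaping into the input dimension expected by the Bi-ConvLSTM layer is not shown.

```python
import numpy as np


def chronological_split(data, train_fraction=0.8):
    """Keep the first part for training and the last part for testing."""
    split = int(len(data) * train_fraction)
    return data[:split], data[split:]


def make_windows(data, window=12, target_col=0):
    """Build (window of all variables) -> (next value of target) samples."""
    data = np.asarray(data, dtype=float)
    n = data.shape[0] - window
    inputs = np.stack([data[i:i + window] for i in range(n)])
    targets = data[window:, target_col]
    return inputs, targets


# Synthetic stand-in for 10,000 samples of 6 variables at a 1 s interval.
series = np.random.default_rng(0).normal(size=(10000, 6))
train, test = chronological_split(series)
x_train, y_train = make_windows(train)
print(x_train.shape, y_train.shape)
```

Splitting before windowing keeps test-set values out of every training window, which matters for time series because adjacent windows overlap heavily.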

Figure 10 shows changes of loss function values during training of different deep learning models.

In Figure 10, the proposed model training is divided into three parts: Bi-ConvLSTM parameter training, CNN parameter training and attention layer parameter training in CA. The loss value of Bi-ConvLSTM-CA model decreased the fastest, which indicates that it converged fastest on the training data.

4.4. Prediction Results of Ship Pitch Angle

In order to better compare the prediction effects of the models, this section presents the predicted pitch values of the models and the real pitch values in graphs and tables. Figure 11 shows the ship pitch angle prediction results of the Bi-ConvLSTM-CA model. In Figure 11, the black solid line represents the real pitch angle, and the red solid line represents the pitch angle predicted by the Bi-ConvLSTM-CA model. In the peak region, the Bi-ConvLSTM-CA model is able to accurately predict the real pitch angle. Figure 11a shows the prediction results of the Bi-ConvLSTM-CA model over the full 2000 s. Figure 11b shows the prediction results of the Bi-ConvLSTM-CA model in 800–1000 s, where the Bi-ConvLSTM-CA prediction curve fits the real ship pitch angle curve well. When the direction of the pitch angle changes, the Bi-ConvLSTM-CA model can capture the change of the real ship pitch angle. Even near the extreme value, between 925 s and 975 s, it closely tracks the change of the ship pitch angle.

The Bi-ConvLSTM-CA prediction curve has a good tracking effect on the real ship pitch angle when the ship pitch angle curve is in the region of consistent rise or consistent decline, such as 825–925 s or 950–1000 s.

Figure 12 shows the comparison of the prediction results between the Bi-ConvLSTM-CA model and the other 9 models. The solid lines with different colors represent the prediction results of different models.

As shown in Figure 11 and Figure 12, the Bi-ConvLSTM-CA model can fit the real data well. Figure 12a shows the ship pitch angle prediction results of all models over the full 2000 s. Figure 12b shows the ship pitch angle prediction results of all models in 700–1000 s. In the test set, the pitch angle within 2000 s changes rapidly, and can be divided into four parts according to the change frequency: 0–750 s, 750–1000 s, 1000–1750 s and 1750–2000 s. During the 0–750 s period, the range of pitch change is relatively small but the change is fast, and most models can fit the real data well. When the first peak is reached (around 780 s), the prediction curve of the Bi-ConvLSTM-CA model almost coincides with the real data, but the other models have larger errors. Approaching 1000 s, the amplitude of the pitch angle begins to change rapidly, the randomness is also large, and the peak begins to gradually decrease. As shown in Figure 12, in 1000–1750 s, ConvLSTM, GRU, LSTM, Bi-LSTM, Bi-LSTM-TPA and Bi-LSTMC do not predict well, while the CNN-LSTM, GBRT, RFR and Bi-ConvLSTM-CA models predict better. There are two parts of the pitch curve in the test set that show a downward trend: 750–1000 s and 1750–2000 s. All models predict well during the downward trend, but only the Bi-ConvLSTM-CA model predicts well around the peak at the end of that part. In 1750–2000 s, GRU, ConvLSTM, CNN-LSTM, Bi-LSTM-TPA and Bi-LSTMC have poor prediction effects, and the remaining models have better prediction effects.

Figure 13 shows the prediction errors of the 10 models. The prediction error value is $error = \hat{y} - y$, where $y$ is the real value and $\hat{y}$ is the predicted value.

In Figure 13, the prediction error of the GBRT model is in the range of −0.1940 rad to 0.2483 rad; the prediction error of the RFR model is in the range of −0.2192 rad to 0.2289 rad; the prediction error of the GRU model is in the range of −0.764 rad to 0.327 rad; the prediction error of the LSTM model is in the range of −0.3253 rad to 0.2748 rad; the prediction error of Bi-LSTM model is in the range of −0.4077 rad to 0.1477 rad; the prediction error of CNN-LSTM model is in the range of −0.5320 rad to 0.1905 rad; the prediction error of ConvLSTM model is in the range of −0.7432 rad to 0.1192 rad; the prediction error of the Bi-LSTM-TPA model is in the range of −0.4021 rad to 0.2667 rad; the prediction error of the Bi-LSTMC model is in the range of −0.2672 rad to 0.4939 rad; the prediction error of Bi-ConvLSTM-CA model is in the range of −0.1452 rad to 0.0796 rad. It can be concluded that the Bi-ConvLSTM-CA model has the smallest prediction error.
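Error ranges of this kind can be obtained from the signed error $\hat{y} - y$ by taking its minimum and maximum over the test set. The sketch below is illustrative only; the function name `error_range` and the sample values are hypothetical and are not taken from the paper's data.

```python
import numpy as np


def error_range(y_true, y_pred):
    """Return (min, max) of the signed prediction error y_pred - y_true."""
    error = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(error.min()), float(error.max())


# Hypothetical real and predicted pitch angles in rad.
low, high = error_range([0.10, 0.20, 0.30], [0.15, 0.10, 0.30])
print(low, high)
```

A negative lower bound means the model under-predicts at its worst point, and a positive upper bound means it over-predicts; a narrow interval around zero indicates small errors in both directions.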

To further explore the prediction performance of the model, MSE, MAE, MAPE and RMSE were used to evaluate the prediction effect of all models. Table 2 lists the prediction performance index results of each model.

MSE and RMSE measure the deviation between predicted pitch values and real pitch values, and are sensitive to outliers in pitch. MAE is the mean of the absolute error between predicted pitch values and real pitch values, and can better reflect the actual situation of prediction error. MAPE expresses the absolute error as a percentage of the real pitch value, and is suitable for measuring data with a large degree of deviation. The smaller these indexes are, the better the prediction performance of the model is. In Table 2, the MSE, MAPE, MAE and RMSE of the Bi-ConvLSTM-CA model are 1.995 × 10⁻³, 4.440%, 0.03603 and 0.04466, respectively. The MSE of the ConvLSTM model is 0.04795, which is the highest value of all models. The MAPE and MAE of ConvLSTM are 14.61% and 0.1398, respectively, and those of LSTM are 11.72% and 0.07265, respectively. The MSE, MAPE and MAE of the Bi-LSTM-TPA model are 0.01672, 13.96% and 0.08948, respectively, and those of Bi-LSTMC are 0.02853, 19.13% and 0.1367. All four error indexes of Bi-LSTM-TPA are lower than those of Bi-LSTMC, which suggests that the attention mechanism is better than CNN for ship pitch prediction. The MSE, MAPE and MAE of the CNN-LSTM model are 7.060 × 10⁻³, 8.194% and 0.0678, respectively, and those of the Bi-LSTM model are 0.01502, 11.48% and 0.09344, respectively. All four error indexes of CNN-LSTM are lower than those of Bi-LSTM, which suggests that CNN is better than reverse LSTM for ship pitch prediction. Compared with the other models, the Bi-ConvLSTM-CA model has the smallest value on all four error indexes, indicating that its prediction effect is the best.

In order to better compare the prediction effects of the models, Table 3 lists the PMAE, PMSE, PRMSE and PMAPE of the Bi-ConvLSTM-CA model relative to each of the other nine models. The promotion percentage indexes describe the relative difference between two models on the same error index.

In Table 3, compared with the ConvLSTM model, the MSE index of the Bi-ConvLSTM-CA model decreased by 95.84%, the MAPE index decreased by 69.61%, and the MAE index decreased by 74.23%. The Bi-ConvLSTM-CA model has the best performance in MSE, MAPE and MAE, since Bi-ConvLSTM can extract spatial and temporal features in both the forward and reverse directions, and CA can focus on extracting deep features beneficial for prediction. The prediction data and statistical results of real ship data show that the Bi-ConvLSTM-CA model has good prediction performance, and its predictions are more accurate, stable and reliable than those of the other models. Predicting ship motion in advance can provide a basis for controlling the ship ahead of time to ensure the safety of ship navigation.

5. Conclusions

To address the nonlinearity and high randomness of ship pitch changes, a Bi-ConvLSTM-CA model for ship pitch prediction was proposed and achieved good prediction results. The Bi-ConvLSTM-CA model can simultaneously extract temporal and spatial features from multivariate data, and can extract the time step features of ship motion data in both the forward and reverse directions. The CNN part reduces the output channels of the Bi-ConvLSTM network, extracts important information and reduces the number of parameters. Before the fully connected layer, the data processed in the neural network are kept three-dimensional, and the channel attention module can effectively capture the more important channels. Experiments were carried out with real ship data, and the experimental results show that the proposed Bi-ConvLSTM-CA model has better prediction accuracy than the other prediction algorithms. Given ship motion and related data, the model proposed in this paper can effectively predict the short-term future pitch angle, thereby improving the safety and stability of the ship's navigation. Single-step ship pitch angle prediction is discussed in this paper, and multi-step ship pitch angle prediction will be studied in the future.

Author Contributions

Conceptualization, H.F. and Y.W.; methodology, H.F., Z.G. and Y.W.; software, H.F. and Z.G.; validation, H.F. and Z.G.; writing—original draft preparation, H.F. and Z.G.; writing—review and editing, H.F., Z.G. and Y.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Natural Science Foundation of China, grant number 52071112.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study.

Abbreviations

The following abbreviations are used in this manuscript:

Bi-ConvLSTM	Bidirectional convolutional long short-term memory neural network
MAPE	Mean absolute percentage error
MSE	Mean square error
MAE	Mean absolute error
RNN	Recurrent neural network
AR	Auto Regressive
ARMA	Auto Regressive Moving Average
ARIMA	Auto Regressive Integrated Moving Average
EMD	Empirical mode decomposition
DWT	Discrete wavelet transform
SVM	Support vector machines
RFR	Random forest regression
MS	Mirror symmetry
SVR	Support vector regression
LS	Least squares
LSSVM	Least squares support vector machines
CAEFOA	Chaos Adaptive Efficient Fruit Fly Optimization Algorithm
XGBoost	Extreme gradient boosting
RBF	Radial basis function
PSO	Particle swarm algorithm
LSTM	Long Short-Term Memory
GPR	Gaussian Process Regression
IRF	Impulse Response Function
ACF	Auto-correlation Function
CNN	Convolutional neural network
GRU	Gate Recurrent Unit
GCWOA	Genetic cloud whale optimization algorithm
MTS	Multivariate time series
2D	Two Dimensional
3D	Three Dimensional
CBAM	Convolutional Block Attention Module
VGG	Visual Geometry Group
IES	Integrated energy system
Relu	Rectified Linear Units
FC-LSTM	Fully connected LSTM
Bi-LSTM-TPA	Bidirectional long short-term memory network and temporal pattern attention mechanism
PMAE	Promoting mean absolute error
PMSE	Promoting mean square error
PRMSE	Promoting root mean square error
PMAPE	Promoting mean absolute percentage error

References

1. Chang, B.C. On the parametric rolling of ships using a numerical simulation method. Ocean Eng. 2008, 35, 142–149.
2. Takami, T.; Tomoki, U.D.; Nielsen, J.J. Real-time deterministic prediction of wave-induced ship responses based on short-time measurements. Ocean Eng. 2021, 221, 108503.
3. Sharov, S.N.; Tolmachev, S.G. A decision making algorithm for an integrated system of UAV landing on the moving ship gripper. In Proceedings of the 22nd Saint Petersburg International Conference on Integrated Navigation Systems (ICINS 2015), Saint Petersburg, Russia, 25–27 May 2015; pp. 41–44.
4. Sharov, S.N.; Tolmachev, S.G. Prediction of the gripping device position in case of UAV landing on a moving ship in the conditions of ship motions. In Proceedings of the 20th Saint Petersburg International Conference on Integrated Navigation Systems (ICINS 2013), Saint Petersburg, Russia, 27–29 May 2013; pp. 256–259.
5. Cheng, X.; Li, G.; Skulstad, R.; Major, P.; Chen, S.; Hildre, H.P.; Zhang, H. Data-driven uncertainty and sensitivity analysis for ship motion modeling in offshore operations. Ocean Eng. 2019, 179, 261–272.
6. Kaplan, P. A study of prediction techniques for aircraft carrier motions at sea. J. Hydronautics 1969, 3, 121–131.
7. Peng, X.Y.; Zhao, X.R.; Wei, N.X.; Xie, N. AR algorithm for extremely short-term prediction of large ship’s motion. Ship Eng. 2001, 5, 5–7.
8. Chen, Y.M.; Ye, J.W.; Zhang, X.L. Experiment of extremely short-term prediction of ship motion. Ship Ocean Eng. 2010, 39, 13–15.
9. Wang, X.; Tong, M.; Du, L. Multi-step Prediction AR Model of Ship Motion Based On Constructing and Correcting Error. In Proceedings of the 2018 IEEE CSAA Guidance, Navigation and Control Conference (CGNCC), Xiamen, China, 10–12 August 2018; pp. 1–4.
10. Qin, S.Q.; Wu, W. A hybrid AR-DWT-EMD model for the short-term prediction of nonlinear and non-stationary ship motion. In Proceedings of the 2016 Chinese Control and Decision Conference (CCDC), Yinchuan, China, 28–30 May 2016; pp. 4042–4047.
11. Wang, W.C.; Qin, S.Q.; Wu, W.; Zheng, J.X. Prediction of ship pitch motion by dual autoregressive model. In Proceedings of the 27th Chinese Control and Decision Conference (CCDC), Qingdao, China, 23–25 May 2015; pp. 4046–4849.
12. Pelevin, A.E. Prediction of ship deck inclination angle. Gyroscopy Navig. 2017, 8, 165–171.
13. Nie, Z.; Shen, F.; Xu, D.; Li, Q. An EMD-SVR model for short-term prediction of ship motion using mirror symmetry and SVR algorithms to eliminate EMD boundary effect. Ocean Eng. 2020, 217, 107927.
14. Nie, Z.; Yue, Y.; Xu, D.; Feng, S. Research on Support Vector Regression Model Based on Different Kernels for Short-term Prediction of Ship Motion. In Proceedings of the 12th International Symposium on Computational Intelligence and Design (ISCID), Hangzhou, China, 14–15 December 2019; pp. 61–64.
15. Wang, W.; Li, M. A short-time prediction method of ship motion attitude based on EEMD-LSSVM. Int. J. Sci. 2020, 7, 66–74.
16. Li, M.W.; Geng, J.; Han, D.F.; Zheng, D.F. Ship motion prediction using dynamic seasonal RvSVR with phase space reconstruction and the chaos adaptive efficient FOA. Neurocomputing 2016, 174, 661–680.
17. Jain, G.; Prasad, R.R. Machine learning, Prophet and XGBoost algorithm: Analysis of Traffic Forecasting in Telecom Networks with time series data. In Proceedings of the 8th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), Noida, India, 4–5 June 2020; pp. 893–897.
18. Kombo, O.H.; Kumaran, S.; Sheikh, Y.; Bovim, A.; Jayavel, K. Long-Term Groundwater Level Prediction Model Based on Hybrid KNN-RF Technique. Hydrology 2020, 7, 35–59.
19. Xiao, Y.; Tian, X.; Xiao, M. Tourism Traffic Demand Prediction Using Google Trends Based on EEMD-DBN. Engineering 2020, 12, 194–215.
20. Xin, B.; Peng, W. Prediction for Chaotic Time Series-Based AE-CNN and Transfer Learning. Complex 2020, 2020, 2680480.
21. Widiasari, I.R.; Nugroho, L.E. Deep learning multilayer perceptron (MLP) for flood prediction model using wireless sensor network based hydrology time series data mining. In Proceedings of the 2017 International Conference on Innovative and Creative Information Technology (ICITech), Salatiga, Indonesia, 2–4 November 2017; pp. 1–5.
22. Sun, Y. An approach to ship behavior prediction based on AIS and RNN optimization model. Int. J. Transp. Eng. Technol. 2020, 6, 16–21.
23. Yin, J.C.; Perakis, A.N.; Wang, N. A real-time ship roll motion prediction using wavelet transform and variable RBF network. Ocean Eng. 2018, 160, 10–19.
24. Yang, G.; Qin, M.; Jie, Q.; Tao, N.Q. Prediction of ship motion attitude based on BP network. In Proceedings of the 29th Chinese Control and Decision Conference (CCDC), Chongqing, China, 28–30 May 2017; pp. 1596–1600.
25. Li, G.; Kawan, B.; Wang, H.; Zhang, H. Neural-network-based modelling and analysis for time series prediction of ship motion. Ship Technol. Res. 2017, 64, 30–39.
26. Yao, Y.; Han, L.; Wang, J. LSTM-PSO: Long Short-Term Memory Ship Motion Prediction Based on Particle Swarm Optimization. In Proceedings of the 2018 IEEE CSAA Guidance, Navigation and Control Conference (CGNCC), Xiamen, China, 10–12 August 2018; pp. 1–5.
27. Zhang, T.; Zheng, X.; Liu, M. Multiscale attention-based LSTM for ship motion prediction. Ocean Eng. 2021, 230, 109066.
28. Sun, Q.; Tang, Z.; Gao, J.; Zhang, G. Short-term ship motion attitude prediction based on LSTM and GPR. Appl. Ocean Res. 2022, 118, 102927.
29. Su, Y.; Lin, J.; Zhao, D.; Guo, C.; Wang, C.; Guo, H. Real-Time Prediction of Large-Scale Ship Model Vertical Acceleration Based on Recurrent Neural Network. J. Mar. Sci. Eng. 2021, 8, 777.
30. Liu, Y.; Duan, W.; Huang, L.; Duan, S.; Ma, X. The input vector space optimization for LSTM deep learning model in real-time prediction of ship motions. Ocean Eng. 2020, 213, 107681.
31. Rashid, M.H.; Zhang, J.; Zhao, M. Real-Time Ship Motion Forecasting Using Deep Learning. In Proceedings of the 2nd International Conference on Computing and Data Science, Stanford, CA, USA, 28–30 January 2021; pp. 1–5.
32. Li, M.W.; Xu, D.Y.; Geng, J.; Hong, W.C. A hybrid approach for forecasting ship motion using CNN-GRU-AM and GCWOA. Appl. Soft Comput. 2022, 114, 108084.
33. Zhang, S.; Wang, L.; Zhu, M.; Chen, S.; Zhang, H.; Zeng, Z. A Bi-directional LSTM Ship Trajectory Prediction Method based on Attention Mechanism. In Proceedings of the IEEE 5th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Chongqing, China, 12–14 March 2021; pp. 1987–1993.
34. Kim, T.Y.; Cho, S.B. Predicting residential energy consumption using CNN-LSTM neural networks. Energy 2020, 182, 72–81.
35. Mehtab, S.; Sen, J. Analysis and Forecasting of Financial Time Series Using CNN and LSTM-Based Deep Learning Models. In Advances in Distributed Computing and Machine Learning; Sahoo, J.P., Tripathy, A.K., Mohanty, M., Li, K.C., Nayak, A.K., Eds.; Lecture Notes in Networks and Systems; Springer: Singapore, 2022; Volume 302, pp. 405–423.
36. Xiao, Y.; Yin, H.; Zhang, Y.; Qi, H.; Zhang, Y.; Liu, Z. A dual-stage attention-based Conv-LSTM network for spatio-temporal correlation and multivariate time series prediction. Int. J. Intell. Syst. 2021, 36, 2036–2057.
37. Choi, H.; Cho, K.; Bengio, Y. Fine-grained attention mechanism for neural machine translation. Neurocomputing 2018, 284, 171–176.
38. Usama, M.; Ahmad, B.; Song, E.; Hossain, M.S.; Alrashoud, M.; Muhammad, G. Attention-based sentiment analysis using convolutional and recurrent neural network. Future Gener. Comput. Syst. 2020, 113, 571–578.
39. Woo, S.; Park, J.; Lee, J.Y. CBAM: Convolutional block attention module. In Proceedings of the 2018 European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
40. Wang, S.; Fernandes, S.; Zhu, Z.; Zhang, Y.D. AVNC: Attention-based VGG-style network for COVID-19 diagnosis by CBAM. IEEE Sens. J. 2021.
41. Lu, X.; Chang, E.; Liu, Z.; Hsu, C.N.; Du, J.; Gentili, A. ImageCLEF2020: Laterality-Reduction Three-Dimensional CBAM-Resnet with Balanced Sampler for Multi-Binary Classification of Tuberculosis and CT Auto Reports. In Proceedings of the 11th Conference and Labs of the Evaluation Forum (CLEF2020), Thessaloniki, Greece, 22–25 September 2020; p. 2696.
42. Li, S.; Xie, G.; Ren, J.; Guo, L.; Yang, Y.; Xu, X. Urban PM2.5 Concentration Prediction via Attention-Based CNN–LSTM. Appl. Sci. 2020, 10, 1953.
43. Wu, K.; Wu, J.; Feng, L.; Yang, B.; Liang, R.; Ren, Y.; Zhao, R. An attention-based CNN-LSTM-BiLSTM model for short-term electric load forecasting in integrated energy system. Int. Trans. Electr. Energy Syst. 2021, 31, e12637.
44. Zhu, C.; Liu, Q.; Meng, W.; Ai, Q.; Xie, S.Q. An Attention-Based CNN-LSTM Model with Limb Synergy for Joint Angle Prediction. In Proceedings of the 2021 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM), Aula Conference Centre of TU Delft, Delft, The Netherlands, 12–16 July 2021; pp. 747–752.
45. Shi, X.; Chen, Z.; Wang, H.; Yeung, D.Y.; Wong, W.K.; Woo, W.C. Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting. In Proceedings of the 28th International Conference on Neural Information Processing Systems, Montreal, QC, Canada, 8–9 December 2015; pp. 802–810.
46. LeCun, Y.; Bottou, L.; Bengio, Y. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
47. Shiv, R.D.; Satish, K.S.; Bidyut, B.C. A comprehensive survey and performance analysis of activation functions in deep learning. arXiv 2021, arXiv:2109.14545.
48. Srivastava, N.; Mansimov, E.; Salakhutdinov, R. Unsupervised learning of video representations using LSTMs. In Proceedings of the 32nd International Conference on Machine Learning (ICML 2015), Lille, France, 6–11 July 2015; pp. 843–852.
49. Wang, Y.; Wang, H.; Zou, D.; Fu, H. Ship Roll Prediction Algorithm Based on Bi-LSTM-TPA Combined Model. J. Mar. Sci. Eng. 2021, 9, 387.
50. Wang, Y.; Wang, H.; Zhou, B.; Fu, H. Multi-dimensional prediction method based on Bi-LSTMC for ship roll. Ocean Eng. 2021, 242, 110106.

Figure 1. Schematic diagram of 2D convolution operation.

Figure 2. Schematic diagram of multi-channel convolution operations.

Figure 3. Schematic diagram of pooling operation.

Figure 4. ConvLSTM Cell structure.

Figure 5. Bidirectional ConvLSTM network structure.

Figure 6. Channel attention model structure.

Figure 7. Flowchart of feature data transformation.

Figure 8. Ship pitch angle prediction based on Bi-ConvLSTM-CA model.

Figure 9. Partial raw data of the ship motion.

Figure 10. Training error plot of each deep learning model.

Figure 11. Ship pitch angle prediction results of Bi-ConvLSTM-CA model. (a) Ship pitch angle prediction results of Bi-ConvLSTM-CA model in 0–2000 s; (b) Ship pitch angle prediction results of Bi-ConvLSTM-CA model in 810–1000 s.

Figure 12. Ship pitch angle prediction results of 10 different models. (a) Ship pitch angle prediction results of 10 different models in 0–2000 s; (b) Ship pitch angle prediction results of 10 different models in 700–1000 s.

Figure 13. Prediction errors of various models.

Table 1. Parameters setting table of each part of Bi-ConvLSTM-CA network.

Network Layer	Parameter Setting
Bi-ConvLSTM Layer_1	filters = 40, kernel size = (2, 4)
Bi-ConvLSTM Layer_2	filters = 40, kernel size = (2, 4)
Conv2D Layer	filters = 4, kernel size = (2, 3)
AvgPooling Layer	pool size = (2, 2)
Attention Layer	units = 12
Training Part	batch size = 64, epochs = 80

Table 2. Prediction performance indexes of each model.

Model	MSE ($10^{-3}$)	MAE	MAPE (%)	RMSE
GBRT	4.625	0.0466	6.288	0.0680
RFR	3.867	0.0494	8.052	0.0622
CNN-LSTM	7.060	0.0678	8.194	0.0840
GRU	37.94	0.1220	13.49	0.1947
ConvLSTM	47.95 *	0.1398 *	14.61	0.2190 *
LSTM	10.24	0.07265	11.72	0.1012
Bi-LSTM	15.02	0.09344	11.48	0.1226
Bi-LSTM-TPA	16.72	0.08948	13.96	0.1293
Bi-LSTMC	28.53	0.1367	19.13 *	0.1689
Bi-ConvLSTM-CA	1.995 *	0.03603 *	4.440 *	0.04466 *

The numbers marked with asterisks in the table are the maximum or minimum values of the indexes.

Table 3. Promotion percentage of Bi-ConvLSTM-CA compared with other models.

Model	PMSE (%)	PMAE (%)	PMAPE (%)	PRMSE (%)
GBRT	56.86	22.68 *	29.39 *	34.32
RFR	48.41 *	27.06	44.86	28.20 *
CNN-LSTM	71.74	46.86	45.81	46.83
GRU	94.74	70.47	67.09	77.06
ConvLSTM	95.84 *	74.23 *	69.61	79.61 *
LSTM	80.52	50.40	62.11	55.87
Bi-LSTM	86.72	61.44	61.32	63.57
Bi-LSTM-TPA	88.06	59.73	68.48	65.43
Bi-LSTMC	93.00	73.63	76.79 *	73.56

The numbers marked with asterisks in the table are the maximum or minimum values of the indexes.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
