1. Introduction
To help maritime supervisors track ships and ensure navigation safety, the International Maritime Organization (IMO) requires ships with a gross tonnage of more than 300, and cargo ships of more than 500 gross tons on non-international voyages, to be equipped with an automatic identification system (AIS) [1,2]. Meanwhile, with the growth of AIS data and the continuous development of artificial intelligence technology in recent years [3,4], the intelligence and behavioral autonomy of ships have improved significantly [5,6]. However, the level of intelligence of existing AISs is still far from meeting maritime management requirements. Ship trajectory prediction is vital to the intelligence of an AIS: the more accurate the trajectory prediction from AIS data, the more space and time are available to respond and avoid ship collisions [7,8].
This paper focuses on the ship trajectory prediction problem. In the past, ship path prediction relied on mathematical models such as the one proposed by Sutulo [9], in which the structure of the generic maneuvering mathematical model leads naturally to two basic approaches, based on dynamic and purely kinematic prediction models, and an analytical scheme is proposed for short-term kinematic prediction that accounts for the current values of accelerations. However, such mathematical models require a large number of input parameters, such as ship shape, current, wind direction, and maneuvering, and it is difficult to obtain all of them. Inference-based trajectory prediction methods form a second group. For example, a Markov chain approach [10] based on the hidden Markov model (HMM) proposes a spatio-temporal predictor and a next-place predictor, in which living habits are analyzed in terms of entropy and users are clustered into distinct groups accordingly. Such methods are subjected to unbiased statistics, resulting in poor scalability. Recently, neural network technologies such as the Recurrent Neural Network (RNN) [11] and the Long Short-Term Memory (LSTM) network [12] have been applied to route data to predict a ship’s navigation trajectory, and they perform well when sufficient samples are available. However, most of these techniques focus only on optimizing the method itself; the density of ships and the density of ship routes are not considered. According to the distribution of real-time ship trajectory data, not only is there a great difference in ship density between offshore areas with high vessel density and the open sea with low ship density [13], but human factors also increase the complexity of ship trajectory prediction, especially in offshore areas. Traditional machine learning algorithms can be used to predict a ship’s navigation trajectory, but they are limited in both accuracy and flexibility, because ship trajectory prediction differs considerably between offshore areas and the open sea.
This paper proposes a new ship trajectory prediction method based on a neural network. The main innovation is to embed the optimized algorithm into the discriminant learning method by combining an optimized KNN algorithm with an LSTM (Long Short-Term Memory) neural network. The LSTM neural network predicts the ship trajectory in the open sea area, where ship density is low. In offshore areas, where ship density is high, current methods based on distance-trajectory similarity do not fully consider the speed characteristics of ship trajectories, and they do not measure distance in a way that reflects the spherical geometry of the nautical domain, which makes the measurement less accurate. This paper therefore uses a new similarity distance formula in the KNN algorithm to predict ship tracks. As a result, the influence of differing trajectory data characteristics on trajectory prediction can be eliminated effectively.
The main contributions of this paper are as follows: (1) In view of the poor performance of the traditional KNN algorithm in low-density areas, the sea areas where ships travel are divided according to ship density, and different trajectory prediction methods are adopted in sea areas with different vessel densities, so as to limit as far as possible the influence of differing trajectory data characteristics on prediction accuracy. (2) The similarity distance formula in the traditional KNN algorithm is optimized, because the Euclidean distance is not suitable for measuring the similarity between ship tracks and therefore degrades the KNN algorithm; this further improves the prediction results in sea areas with high ship density. (3) The improved KNN algorithm and the LSTM neural network are applied to areas of different ship density, respectively, which avoids the drop in LSTM prediction performance caused by insufficient data.
The remainder of the paper is organized as follows. In Section 2 we discuss related work, in Section 3 we describe our algorithm, in Section 4 we illustrate the experiments we did to test the algorithm. Finally, in Section 5 we give our conclusions and future work.
3. The Proposed Method
The model architecture in this paper is divided into two layers according to ship density: ship trajectory prediction in the offshore area, and ship trajectory prediction in the open sea area. Observations of the offshore area show that the data available for any single ship are small in size, but the overall number of ships is high. Furthermore, the ships influence each other’s trajectories. Thus, a classification method can be used for this prediction. When a ship enters the open sea, the density of ships and the density of routes change, and the offshore prediction method can no longer consistently achieve the desired results. Therefore, a different trajectory prediction method is proposed for the open sea.
The overall structure of the model in this paper is shown in Figure 1. The first step is to preprocess the AIS data to remove null values and to divide the data according to sea area [37]. Then, the optimized KNN algorithm is used to separate the sea into offshore and distant sea areas by ship density; the details of the sea area division are described in Section 3.2. After the label and feature data are selected, the final optimal hyperparameters are obtained by the retention method. The KNN algorithm then yields the classification of ships; it is described in detail in Section 3.1. For the distant sea, the preprocessed data must first be serialized and the dataset divided, before the LSTM neural network is trained and used for prediction. The LSTM neural network is described in detail in Section 3.3.
3.1. Trajectory Prediction Method for the Offshore Areas
Because ships are highly concentrated in an offshore area, the approach here is to classify ships by their characteristics and to predict a ship’s next positions from the positions of ships of the same type. Whereas the traditional KNN algorithm classifies the trajectory points of the ship, this paper uses the KNN algorithm to classify the characteristics of ships, which can reduce the influence of other factors on the ship’s navigation trajectory.
The main idea of the KNN algorithm is that if most of the K samples closest to a given sample in the feature space belong to a certain category, then that sample also belongs to the category and shares the characteristics of the samples in it [38,39,40]. The KNN method determines the category mainly from a limited number of adjacent samples, rather than by discriminating the class domain. Therefore, the KNN method is more suitable than other methods for classifying sample sets with overlapping class domains.
The KNN algorithm commonly used to predict ship trajectories takes the Euclidean distance as both the similarity distance and the classification criterion. However, many features are involved in classifying ship trajectories, such as latitude, longitude, and speed over ground. Since the distance between ships is a spherical distance rather than a straight-line distance on a plane, the Euclidean distance is prone to overfitting, so the Euclidean distance, Frechet distance, Manhattan distance, and similar measures are no longer applicable. Therefore, the similarity distance in the KNN algorithm needs to be optimized: the relationship between ship position and speed is integrated and dynamic weights are allocated, so that the optimized KNN algorithm can be better applied to ship trajectory classification.
Based on the above ideas, the KNN algorithm is formulated as follows. The input is the training data set given as Formula (1), in which each instance feature vector is n-dimensional as in Formula (2) and is paired with the category of that instance, where i = 1, 2, 3, …, n, together with the prediction instance x. The output is the category y to which the prediction instance x belongs.
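In the standard KNN notation (an assumption here; the paper's own symbols in Formulas (1) and (2) may differ), the training set and the majority-vote decision rule can be written as follows. The symbols T, N, c_j, and N_k(x), the set of the k training instances nearest to x, are illustrative choices.

```latex
T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}, \qquad
x_i \in \mathbb{R}^{n}, \quad y_i \in \{c_1, c_2, \ldots, c_K\}

y = \arg\max_{c_j} \sum_{x_i \in N_k(x)} I(y_i = c_j)
```

Here I(·) is the indicator function, equal to 1 when its argument holds and 0 otherwise.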
Distance equation: the Euclidean distance adopted by the traditional KNN algorithm cannot measure the similarity between ship tracks. In the actual movement of a ship, its speed over ground has a great influence on its track, and the Euclidean distance does not involve this factor.
The distance measure adopted in this paper is given as Formula (3).
In Formula (3), a is the weight value. In the new similarity distance, a is treated as a hyperparameter, and the distance follows the idea of weighted voting, so a more reasonable weight value can lead to better prediction results.
Assume that the latitude of point A is lat1 and its longitude is lon1, that the latitude of point B is lat2 and its longitude is lon2, and that the radius of the Earth is R. To find the central angle between the two points, first convert their coordinates to points in a Cartesian coordinate system whose origin is the center of the Earth. The coordinates of points A and B after the transformation are shown in Formulas (4) and (5).
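Under the usual spherical-Earth convention (assumed here: latitude measured from the equator, longitude from the prime meridian, both in radians), the transformation would take the following form. The axis orientation is an illustrative choice and may differ from the paper's.

```latex
A = (x_1, y_1, z_1)
  = (R\cos lat_1 \cos lon_1,\; R\cos lat_1 \sin lon_1,\; R\sin lat_1)

B = (x_2, y_2, z_2)
  = (R\cos lat_2 \cos lon_2,\; R\cos lat_2 \sin lon_2,\; R\sin lat_2)
```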
Then, the cosine of the angle is found with the vector angle formula. Let A be (x1, y1, z1) and B be (x2, y2, z2); the formula is given as Formula (6).
Substituting the latitude and longitude coordinates gives Formula (7).
The actual distance between the two ships is the length of the arc subtended by this angle, obtained from Formulas (8) and (9); the specific calculation formula is given as Formula (10).
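As a sketch of this derivation under the spherical-Earth assumption, with θ denoting the central angle and d the arc length (both symbols are assumed here, not taken from the paper), the vector angle formula and the resulting distance can be written as:

```latex
\cos\theta = \frac{x_1 x_2 + y_1 y_2 + z_1 z_2}{R^{2}}
           = \cos lat_1 \cos lat_2 \cos(lon_1 - lon_2) + \sin lat_1 \sin lat_2

\theta = \arccos\bigl(\cos lat_1 \cos lat_2 \cos(lon_1 - lon_2)
         + \sin lat_1 \sin lat_2\bigr)

d = R\,\theta
```

The denominator is R squared because both position vectors have length R.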
The difference in ground speed between the two ships is computed as in Formula (11).
The parameter description is shown in Table 1.
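The pieces of the similarity distance can be put together in a minimal Python sketch. It assumes that Formula (3) is a weighted sum of the spherical distance and the speed difference with weight a, which is one plausible reading of the description rather than the paper's confirmed formula. The function names, the Earth radius value, and the (latitude, longitude, speed) tuple layout are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed value


def great_circle_km(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Arc length between two points given in degrees, via the central angle."""
    p1, l1, p2, l2 = map(math.radians, (lat1, lon1, lat2, lon2))
    cos_theta = (math.cos(p1) * math.cos(p2) * math.cos(l1 - l2)
                 + math.sin(p1) * math.sin(p2))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return radius * math.acos(cos_theta)


def similarity_distance(ship_a, ship_b, a=0.8):
    """Weighted sum of position distance and speed difference (assumed form).

    Each ship is a hypothetical (lat_deg, lon_deg, speed_over_ground) tuple.
    """
    d_pos = great_circle_km(ship_a[0], ship_a[1], ship_b[0], ship_b[1])
    d_speed = abs(ship_a[2] - ship_b[2])
    return a * d_pos + (1.0 - a) * d_speed
```

Because distance and speed are in different units, the two terms would normally be scaled to comparable ranges before weighting; that step is omitted here.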
The steps of the method to predict the ship trajectory based on the KNN algorithm are as follows.
Step 1: prepare and preprocess data;
Step 2: calculate the similarity distance between the test sample point and every other sample point;
Step 3: sort all distances and select k points with the smallest similarity distance;
Step 4: compare the categories to which the k track points belong, and assign the test sample point to the category with the highest proportion among the k points according to the classification decision rule;
Step 5: take the track point of a similar ship as the next position of the ship to be predicted.
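The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the sample layout, the function names, and the toy distance are hypothetical, and Step 5 is interpreted as borrowing the next position of the nearest neighbour in the winning category.

```python
from collections import Counter


def knn_predict_next(query, samples, k, distance):
    """Classify `query` by majority vote and borrow a neighbour's next position.

    samples: list of (features, category, next_position) tuples.
    distance: callable taking two feature tuples and returning a float.
    """
    # Steps 2-3: rank every sample by similarity distance, keep the k nearest.
    nearest = sorted(samples, key=lambda s: distance(query, s[0]))[:k]
    # Step 4: majority vote over the categories of the k nearest samples.
    category = Counter(s[1] for s in nearest).most_common(1)[0][0]
    # Step 5 (assumed reading): next position of the closest same-category ship.
    donor = next(s for s in nearest if s[1] == category)
    return category, donor[2]


def toy_distance(u, v):
    """Stand-in for the similarity distance; plain sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(u, v))


# Hypothetical samples: (features, category, next position).
samples = [
    ((0.0, 0.0), "cargo", (0.1, 0.1)),
    ((0.2, 0.1), "cargo", (0.3, 0.2)),
    ((5.0, 5.0), "tanker", (5.1, 5.0)),
    ((0.1, 0.3), "tanker", (0.2, 0.4)),
]
category, next_pos = knn_predict_next((0.1, 0.0), samples, k=3,
                                      distance=toy_distance)
```

Passing the distance as a callable keeps the voting logic independent of the particular similarity measure.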
3.2. Division of the Sea Areas
In a traditional trajectory prediction model for a coastal area with a high density of ships, the trajectory of one ship is affected by other ships. For example, mutual blocking and collision avoidance between ships affect a ship’s navigation trajectory. Therefore, the trajectory prediction error can be large if only the ship itself is considered. Furthermore, in offshore waters, the initial amount of ship data (such as latitude and longitude information) may be insufficient to support the training of the neural network model. Therefore, in offshore waters, prediction with the LSTM neural network may not achieve good results.
Because the number of relevant samples available for classification is limited, the prediction accuracy decreases as the ship density decreases. In addition, a ship track in a low-density area is less affected by other ships. In the open sea, where ship density is low, continuing to use the offshore trajectory prediction method leads to poor classification results. Therefore, according to the density of the ship distribution, the method in this paper takes the time at which the KNN prediction accuracy peaks as the boundary between the sea areas: the waters a ship crosses before this point in time are defined as offshore waters, in which the optimized KNN algorithm is used.
The steps of the method to divide the sea areas based on the KNN algorithm are as follows.
Step 1: prepare and preprocess the data;
Step 2: select from the experimental data set the trajectory data of ships sailing from the offshore area to the open sea, and obtain the trajectory data of the surrounding ships in the different time intervals corresponding to each trajectory;
Step 3: run the KNN algorithm separately on the data of the different time intervals obtained in Step 2, optimize it using the leave-one-out method, and obtain the classification accuracy;
Step 4: analyze the results obtained in Step 3 and select the peak point of the KNN classification accuracy as the dividing point between the offshore and distant sea areas.
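Steps 3 and 4 can be illustrated with a short Python sketch. It assumes that leave-one-out evaluation means classifying each sample from all remaining samples, and that the dividing point is simply the time interval with the highest accuracy. The function names and the accuracy figures in the usage lines are hypothetical.

```python
def leave_one_out_accuracy(samples, k, distance):
    """Leave-one-out accuracy of a k-nearest-neighbour majority vote.

    samples: list of (features, label) pairs.
    """
    correct = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]  # hold out sample i
        nearest = sorted(rest, key=lambda s: distance(features, s[0]))[:k]
        votes = {}
        for _, neighbour_label in nearest:
            votes[neighbour_label] = votes.get(neighbour_label, 0) + 1
        predicted = max(votes, key=votes.get)
        correct += predicted == label
    return correct / len(samples)


def dividing_interval(accuracy_by_interval):
    """Return the time interval whose classification accuracy is highest."""
    return max(accuracy_by_interval, key=accuracy_by_interval.get)


# Hypothetical accuracies per time interval (minutes), for illustration only.
example_accuracy = {10: 0.71, 20: 0.83, 30: 0.78}
boundary = dividing_interval(example_accuracy)
```

In practice each accuracy would come from calling leave_one_out_accuracy on the samples of the corresponding time interval.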
Given the actual environment and the ship types found in the offshore area, the reference data available for a single ship are rich, which suits trajectory prediction based on a machine learning classification method. For a ship in the open sea, its track can be regarded as the only ship track in the region. Because other ships have little influence on the predicted ship’s trajectory, little can be inferred from their trajectories, and the accuracy of the offshore trajectory prediction method decreases. Moreover, the amount of track data for a single ship in the open sea is large, and the fluctuation of the ship’s track is small. Thus, it is suitable to adopt deep learning methods to predict the position based on historical track data.
3.3. Trajectory Prediction Method for the Open Sea Area
The track of a ship in the open sea can be regarded as unaffected by the tracks of other ships. The neural network method can implicitly account for the weather and sea-area factors that influence the ship track in the open sea. In addition, the ship track correlates strongly with time.
LSTM is usually suitable for dealing with problems that are sensitive to time series [41]. LSTM can learn long-term dependencies and, like other recurrent neural networks, has the form of a chain of repeating neural network modules, but its repeating module has a different structure [42,43]. The module has a four-layer structure in which the layers interact in a special way, and its selective memory-forget mechanism makes it a powerful tool for sequence generation and prediction [44]. As shown in Figure 2, the key to LSTM is the cell state, represented by the line running horizontally through the top of Figure 2B.
The LSTM can delete information from and add information to the cell state through a structure called a gate. As shown in Figure 2C, a gate is a way of optionally letting information through. It consists of a sigmoid neural network layer and a pointwise multiplication operation.
The ship trajectory prediction method for the open sea area based on LSTM performs the following steps.
Step 1: determine what information should be discarded from the cell state, which is implemented by the Sigmoid layer called the “forget gate” (ft). It looks at ht−1 (the previous output) and xt (the current input), and outputs a number between 0 and 1 for each number in the cell state Ct−1 (the previous state), as in Formula (12). Here, 1 represents complete retention and 0 represents complete deletion.
Step 2: decide what information is to be stored in the cell state. The Sigmoid layer, called the input gate layer (it), determines which values will be updated. The tanh layer then creates a candidate vector C̃t, which will be added to the state of the cell, as in Formulas (13) and (14).
Step 3: update the previous state value, Ct−1, to the new state value, Ct. Multiply the previous state value by ft to drop the part expected to be forgotten, and then add it · C̃t, the candidate values scaled by how much each state value is to be updated, as in Formula (15).
Step 4: run a Sigmoid layer (ot) that determines which parts of the cell state to output. The cell state is then passed through tanh (normalizing the values to between −1 and 1) and multiplied by the output of the Sigmoid gate, as in Formula (16).
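The four steps correspond to the standard LSTM cell update, which can be sketched in plain Python as follows. The sketch assumes the common formulation in which each gate applies a weight matrix and bias to the concatenation of the previous output and the current input. The function names and the params layout are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(weights, bias, vec):
    """Return weights @ vec + bias for a list-of-rows weight matrix."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; params maps a gate name to (weights, bias)."""
    z = h_prev + x_t  # list concatenation [h_{t-1}, x_t]
    # Step 1: forget gate, one value in (0, 1) per cell-state entry.
    f_t = [sigmoid(v) for v in affine(*params["f"], z)]
    # Step 2: input gate and candidate cell state.
    i_t = [sigmoid(v) for v in affine(*params["i"], z)]
    c_hat = [math.tanh(v) for v in affine(*params["c"], z)]
    # Step 3: new cell state = kept part of old state + scaled candidates.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    # Step 4: output gate applied to the squashed cell state.
    o_t = [sigmoid(v) for v in affine(*params["o"], z)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t
```

With all weights and biases set to zero, every gate outputs 0.5 and the candidate is 0, so the new cell state is half the previous one, which is a convenient sanity check.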
The tunable elements of the LSTM model can be divided into two broad categories: parameters and hyperparameters. Parameters are model elements learned directly from the training data. Hyperparameters, by contrast, cannot be estimated directly from the training data, because no analytical formula is available to calculate appropriate values, and they are usually specified manually based on heuristic methods [45].
The main hyperparameters that affect the performance of neural networks are the number of hidden layers, the number of nodes in each layer, the activation function in each layer, the batch size, and the dropout rate in each layer. Dropout is applied to deep artificial neural networks by randomly deleting nodes (and their connections) during training to reduce model over-fitting. The dropout rate controls the probability that each node is deleted.
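The effect of the dropout rate can be illustrated with a minimal Python sketch. It uses the "inverted dropout" convention, in which the surviving activations are rescaled so that their expected value is unchanged. That convention is an assumption here, since the text does not say which variant is used, and the function name is hypothetical.

```python
import random


def dropout(activations, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
dropped = dropout([1.0] * 10, rate=0.3, rng=rng)
```

Dropout is applied only during training; at prediction time the activations are passed through unchanged.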
The number of hidden layers controls the depth of the neural network. Increasing the depth of the network increases its ability to learn features at different levels of abstraction, but increasing the depth excessively leads to over-fitting of the model. The number of nodes in each layer controls its width. Increasing the width increases the memory capacity of the network. If the propagation depth is excessively increased, the gradient magnitude shrinks sharply, which slows the weight updates of the shallow neurons and results in vanishing gradients.
Given a neural network with input layer X, a hidden layer Z with M nodes, and a regression output layer composed of a single node, the form of each node is given as Formulas (17) and (18), where Z = (Z1, Z2, …, ZM), σ(*) is the activation function, and g(*) is the optional output function. The sigmoid, tanh, and ReLU functions are the main activation functions.
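A forward pass through such a network can be sketched in Python as follows. It assumes the usual single-hidden-layer form, in which each hidden node applies σ to a weighted sum of the inputs plus a bias and the output node applies g to a weighted sum of the hidden values plus a bias. The function and argument names are hypothetical.

```python
import math


def relu(v):
    return max(0.0, v)


def forward(x, hidden_weights, hidden_bias, out_weights, out_bias,
            sigma=relu, g=lambda v: v):
    """Single-hidden-layer regressor with one output node.

    Hidden node m computes sigma(bias_m + weights_m . x); the output node
    computes g(out_bias + out_weights . Z).
    """
    z = [sigma(b + sum(w * xi for w, xi in zip(row, x)))
         for row, b in zip(hidden_weights, hidden_bias)]
    return g(out_bias + sum(w * zm for w, zm in zip(out_weights, z)))


# Two inputs, two hidden nodes, identity output function.
y_relu = forward([1.0, 2.0], [[1.0, 1.0], [-1.0, 0.0]], [0.0, 0.0],
                 [2.0, 3.0], 1.0)
# The same network with tanh as the hidden activation.
y_tanh = forward([1.0, 2.0], [[1.0, 1.0], [-1.0, 0.0]], [0.0, 0.0],
                 [2.0, 3.0], 1.0, sigma=math.tanh)
```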
The batch size specifies the number of training instances fed into the model before the model parameters are updated. Larger batches reduce the computational cost required, but may cause training to settle in a local optimum.
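To make the role of the batch size concrete, the following Python sketch splits a training set into consecutive batches. The number of batches equals the number of parameter updates per epoch. The data set size of 200 is hypothetical, and the helper name is illustrative.

```python
def iter_batches(instances, batch_size):
    """Yield consecutive slices of at most `batch_size` training instances."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(instances), batch_size):
        yield instances[start:start + batch_size]


# 200 hypothetical training instances split into batches of 64.
batches = list(iter_batches(list(range(200)), batch_size=64))
updates_per_epoch = len(batches)
```

A larger batch size therefore means fewer, but individually more expensive, updates per epoch.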
To sum up, predicting the trajectory of a ship in the open sea requires first preprocessing the ship’s trajectory data, then selecting the time-sensitive data as features, converting these features into time series, and training the model while adjusting the hyperparameters of the neural network. The ship trajectory can then be predicted once training is complete.
3.3.1. Parameter Settings
Hyperparameter selection for the KNN algorithm: the weight a takes the values 0.7, 0.8, and 0.9, and the value range of k is [11, 24]. The hyperparameters are tuned with a grid search method; the parameter settings are shown in Table 2.
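A grid search over these values can be sketched in Python as follows. The candidate values of a and k come from the text. The scoring function is a hypothetical stand-in for the validation accuracy of the KNN classifier, and its peak is arbitrary.

```python
from itertools import product


def grid_search(a_values, k_values, score):
    """Return the (a, k) pair with the highest score, and that score."""
    best = max(product(a_values, k_values), key=lambda ak: score(*ak))
    return best, score(*best)


A_VALUES = (0.7, 0.8, 0.9)
K_VALUES = range(11, 25)  # k in [11, 24], inclusive


def toy_score(a, k):
    # Hypothetical stand-in for validation accuracy; peak chosen arbitrarily.
    return -abs(a - 0.8) - abs(k - 15) / 100.0


best_params, best_score = grid_search(A_VALUES, K_VALUES, toy_score)
```

The grid contains 3 × 14 = 42 combinations, so an exhaustive search is cheap at this size.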
The LSTM network parameters are set as follows. The neural network has 3 layers. The LSTM network width of layer 1 is set to 64, the dropout rate is set to 0.3, and the activation function is set to ReLU. The LSTM network width of layer 2 is set to 128, the dropout rate is set to 0.3, and the activation function is set to ReLU. The width of the output layer is set to 2, and its activation function is set to ReLU. The optimizer of the neural network is set to Adam, and the batch size for gradient descent is 64. Training terminated after 100 epochs. The structure of the LSTM neural network is shown in Table 3.
3.3.2. Evaluation Criteria
In this paper, three evaluation criteria are used to evaluate the effect of the method on ship trajectory prediction: accuracy, mean square error, and coefficient of determination. Accuracy (ACC) refers to the degree to which the average of several measurements agrees with the actual value under given experimental conditions. It is expressed in terms of error and indicates the size of the systematic error.
Mean-square error (MSE) can be used to evaluate the degree of data change. The smaller the MSE value, the more accurately the prediction model describes the experimental data. It is computed by taking the difference between the real value and the predicted value, squaring it, and then summing and averaging over all samples. The calculation formula is given as Formula (19).
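The verbal description of the MSE translates directly into a few lines of Python. The function name is illustrative, and the inputs are assumed to be equal-length sequences of real and predicted values.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between real and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# One prediction off by 2 out of three samples: (0 + 0 + 4) / 3.
example_mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```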
The R2 coefficient, also known as the coefficient of determination, measures the overall goodness of fit of the regression equation and expresses the overall relationship between the dependent variable and all independent variables. The closer R2 is to 1, the better the regression fit. Its calculation formula is given as Formula (20).
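A minimal Python sketch of the coefficient of determination follows. It assumes the usual definition, one minus the ratio of the residual sum of squares to the total sum of squares. The function name is illustrative.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all real values are equal")
    return 1.0 - ss_res / ss_tot


perfect = r2_score([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])   # exact fit
baseline = r2_score([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])  # always predicts the mean
```

A model that always predicts the mean of the real values scores 0, and a perfect fit scores 1, which matches the reading that values closer to 1 indicate a better fit.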