1. Introduction
A ship automatic identification system (AIS) is an open data transmission system widely used in ship traffic information collection and analysis, ship navigation monitoring, and water traffic planning. The ship trajectory data collected by AIS are abundant and cover a large geographical area, but the time interval between records is often long and the data quality is low, which makes ship trajectory classification challenging.
At present, ship trajectory classification methods at home and abroad are mainly applied to two scenarios: the identification of ship types and the classification of ship motion patterns. The realization process is divided into three parts: feature extraction, transformation of ship trajectory data, and construction of the classification model. Chen et al. [1] realized the classification of AIS ship trajectories based on the sparse representation classification algorithm and conducted experiments in the waters of the Yangtze River. However, the cubic spline method used to approximate the ship trajectory may destroy the characteristics of the trajectory. Kraus et al. [2] used the random forest algorithm to classify ship types by extracting geographic features (navigation route, stay area, etc.) and behavior features (heading, speed, etc.) of the ship’s trajectory and achieved 97.51% recognition accuracy. Based on AIS ship trajectories, Sánchez et al. [3,4,5,6,7] used SVM and a decision tree to achieve binary classification of fishing boats and preprocessed the trajectories by data cleaning, data filtering, trajectory segmentation, feature extraction and other methods to improve the accuracy of classification. Liu et al. [8] used a semi-supervised deep learning model (SCEDN) for classification in ship encounter situations; the model uses an encoder–decoder convolutional structure with four channels for each segment (distance, speed, time to closest point of approach (TCPA), and distance to closest point of approach (DCPA)). Sheng et al. [9,10,11,12,13,14,15] divided the ship’s trajectory into three motion modes: anchoring, going straight, and turning. According to factors such as speed and heading, the behavior characteristics of the three modes were extracted, and a ship trajectory feature classification model was established by logistic regression. Cui Tong et al. [16,17,18,19,20,21,22] combined LSTM and CNN to establish a hybrid classification model that takes feature vectors of speed, acceleration, heading and curvature as input and gives the ship shape as output. In this method, CNN is used to extract the spatial features of the trajectory data, and LSTM is used to extract the temporal features. Because ship trajectory data are spatial data, in this paper, we draw on methods for trajectory image classification.
From the relevant research results at home and abroad, the following research trends and directions are observed. Research on ship trajectory clustering is gradually developing towards efficient execution and the extraction of diversified trajectory data features, and research on trajectory classification is gradually developing towards accurate feature extraction and the establishment of mathematical models based on deep learning. With respect to the main research objects of this paper, the current research has the following shortcomings:
Most of the current ship trajectory clustering methods are based on the DBSCAN density clustering algorithm. The complexity of the algorithm is high, so there is room for improvement in execution efficiency, and it is difficult to select the two parameters of DBSCAN.
When domestic and foreign scholars use supervised algorithms for ship trajectory classification, there is still room for improvement in the use of the spatial feature information of ship trajectories and in the process of extracting features such as heading and speed.
The main work of this paper:
In this paper, we take ship trajectory data as the research object and investigate a fast, efficient and accurate ship trajectory clustering method for waters with dense and complex traffic flow in order to obtain the ship trajectory data of the various clusters in the water area. We then use the clustered ship trajectory data as the basis to study ship trajectory anomaly detection and channel classification so as to provide decision support for the intelligent risk management and control carried out by ship traffic control departments. Specifically, the main research work of this paper is as follows:
The main task of ship trajectory preprocessing is to eliminate interference trajectories, that is, ship trajectories that are concentrated in a small area of water with little movement and ship trajectories with a sampling interval that is too long to characterize continuous motion. This removes the interference of ship anchor points in the trajectory analysis of moving ships and reduces the complexity of ship trajectories. After ship trajectory preprocessing, in this paper, we use the QuickBundles algorithm as the basic method to carry out ship trajectory clustering research. First, we analyze the performance of three trajectory similarity measurement methods: MDF [23], DTW [24], and Hausdorff [25]. Then, to address the insufficient sampling of local features of ship trajectories by the QuickBundles algorithm, a sampling method based on heading is used to improve it, and an improved QuickBundles ship trajectory clustering algorithm is proposed. We use the improved QuickBundles algorithm [26] to establish a clustering model of ship trajectories, determine appropriate thresholds according to a variety of evaluation indicators, complete the task of ship trajectory clustering, and conduct experiments comparing the improved QuickBundles algorithm with the traditional DBSCAN [27] algorithm.
When ship trajectories are classified directly from latitude and longitude data, the spatial characteristics of the data are not obvious, and the classification effect is not ideal. In this paper, we propose a ship trajectory classification method based on a deep convolutional neural network to classify the channel to which a ship trajectory belongs, achieving the recognition of ship trajectories and waterways. Based on the clustering results, the latitude and longitude coordinates are mapped to image pixel coordinates according to the scale, the spatial characteristics of the ship trajectory data are extracted, and the ship trajectory image dataset is established. The ship trajectory classification model is built on the ResNet50 [28] model and trained on the training set. On the test set, a fully connected neural network and a multi-class SVM classifier [29] with latitude and longitude data as input are compared with the deep convolutional model with trajectory image data as input.
The main contributions include:
An improved QuickBundles ship trajectory clustering algorithm is proposed.
A method of ship trajectory classification based on a deep convolutional neural network is proposed that realizes the classification and identification of the waterway to which a ship trajectory belongs.
The contents of this paper are organized as follows: Section 2 provides details of the proposed scheme, the result analysis is shown in Section 3, and conclusions are presented in Section 4.
2. Methods
The working process of the proposed methodology is shown in Figure 1. This method takes specific ship trajectory AIS data as the research object and focuses on ship trajectory clustering, ship trajectory anomaly detection, and channel identification of ship trajectories in dense-traffic waters. Through the identification of abnormal trajectories and the classification of the channel to which a trajectory belongs, the method provides the ship supervision department with technical support for targeted ship trajectory data analysis. Ship trajectory clustering research is carried out based on the QuickBundles clustering algorithm. The sampling method of QuickBundles is improved according to the local heading changes of the ship trajectory, and a fast, accurate, and efficient ship trajectory clustering method is proposed. Ship trajectory clustering also provides the number of clusters as a parameter for anomaly detection models and data support for ship trajectory classification.
2.1. The Improved QuickBundles Algorithm Module
The trajectory of a ship can be of any length. Before the task of clustering the trajectory of the ship, data need to be divided and filtered so that the subtrajectory segments with similar motion characteristics can be retained and some important information can be obtained; therefore, it is very important to properly divide the original trajectory. Commonly used methods of ship trajectory division are based on time interval and speed changes.
The data used in this paper come from the US Coastal AIS Vessel Traffic Data (
https://marinecadastre.gov/ais/, accessed on 1 March 2022), which are collected by the US Coast Guard through on-board navigation and positioning equipment to monitor the location of large ships in the United States, as well as characteristics of coastal waters. In this section, we take the AIS dataset from January to March 2019 as the experimental object and use two methods to process ship trajectory data. The specific parameter settings are shown in
Table 1, and the processing results are shown in
Table 2.
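The division of a trajectory by time interval and speed described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `split_trajectory`, the tuple layout, and the thresholds `max_gap_s` and `min_speed_kn` are hypothetical and are not the parameter values of Table 1.

```python
from typing import List, Tuple

# One AIS fix: (timestamp in seconds, latitude, longitude, speed over ground in knots).
# This layout is an assumption made for the sketch.
Point = Tuple[float, float, float, float]


def split_trajectory(points: List[Point],
                     max_gap_s: float = 600.0,   # hypothetical threshold
                     min_speed_kn: float = 0.5,  # hypothetical threshold
                     min_points: int = 2) -> List[List[Point]]:
    """Split a time-ordered trajectory at long time gaps and at near-stationary fixes."""
    segments: List[List[Point]] = []
    current: List[Point] = []

    def close_segment() -> None:
        # Keep the running segment only if it has enough points to describe motion.
        if len(current) >= min_points:
            segments.append(list(current))
        current.clear()

    for p in points:
        t, _, _, sog = p
        if sog < min_speed_kn:
            # Near-stationary fix (for example at anchor): end the segment and skip the fix.
            close_segment()
            continue
        if current and t - current[-1][0] > max_gap_s:
            # The sampling gap is too long to describe continuous motion.
            close_segment()
        current.append(p)
    close_segment()
    return segments
```

With these assumed thresholds, a track that contains one anchored fix and one long reporting gap is returned as three separate subtrajectories.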
The QuickBundles algorithm was originally designed for use with nerve bundles in the medical field. The local changes of nerve bundles are not complicated. Therefore, the QuickBundles algorithm uses only simple linear interpolation as the sampling method. However, if the clustering object is a ship trajectory with moving characteristics and the local heading changes are more complicated, then the characteristic changes of these local headings cannot be ignored, e.g., the 20 ship trajectory points shown in
Figure 2a,b. In the original trajectory, the ship’s course changes considerably due to reasons such as avoidance, and the changed trajectory is curved and smooth. After sampling by the QuickBundles algorithm, the local features of this heading change are replaced by simple polylines; the ship in the original trajectory in
Figure 2c has a short, straight line at the turn. After being sampled by the QuickBundles algorithm, this short straight line is ignored.
In order to overcome the above shortcomings, we improve the sampling method of the QuickBundles algorithm. First, the ship’s trajectory is compressed, with the heading as a factor, and the key position points of the ship’s trajectory are extracted. Then, the ship trajectory is interpolated based on the distance between the trajectory points.
Ship trajectory compression serves two purposes in this paper: one is to reduce the number of trajectory points of all ship trajectories so that the number of trajectory points can be unified more conveniently in a later step; the other is to improve the efficiency of the similarity calculation between trajectories.
The heading indicates the direction of a ship’s trajectory and the trend of the ship’s movement. Figure 3 shows the difference in heading angle. The heading angle difference (AD) represents the difference between the direction angles of adjacent ship trajectory segments, which clearly illustrates the change in the current trajectory segment compared to the previous trajectory segment. By calculating the heading angle difference, the key position points in the trajectory of a ship can be accurately obtained, and the ship trajectory can be compressed accordingly. The detailed calculation process is shown in Figure 4. The input is the angle threshold and the ship trajectory. For each trajectory point, the heading angle difference relative to the previous trajectory segment is calculated. If the heading angle difference is greater than the threshold, the current trajectory point is retained; otherwise, the current trajectory point is deleted.
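The compression step described above can be sketched in a few lines. This is a hedged illustration rather than the procedure of Figure 4 itself: it assumes that the heading of a segment is the initial great-circle bearing between its two end points, that the heading angle difference is taken between the two segments meeting at a point, and that the first and last points are always kept. The names `bearing_deg`, `angle_diff_deg`, and `compress_by_heading` are hypothetical.

```python
import math
from typing import List, Tuple

LatLon = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def bearing_deg(a: LatLon, b: LatLon) -> float:
    """Initial great-circle bearing from a to b, in degrees in [0, 360)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    dlon = lon2 - lon1
    east = math.sin(dlon) * math.cos(lat2)
    north = (math.cos(lat1) * math.sin(lat2)
             - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    return math.degrees(math.atan2(east, north)) % 360.0


def angle_diff_deg(h1: float, h2: float) -> float:
    """Smallest absolute difference between two headings, in [0, 180]."""
    d = abs(h1 - h2) % 360.0
    return min(d, 360.0 - d)


def compress_by_heading(track: List[LatLon], threshold_deg: float) -> List[LatLon]:
    """Keep the end points and every interior point where the heading turns
    by more than threshold_deg between the incoming and outgoing segment."""
    if len(track) < 3:
        return list(track)
    kept = [track[0]]
    for i in range(1, len(track) - 1):
        incoming = bearing_deg(track[i - 1], track[i])
        outgoing = bearing_deg(track[i], track[i + 1])
        if angle_diff_deg(incoming, outgoing) > threshold_deg:
            kept.append(track[i])
    kept.append(track[-1])
    return kept
```

For a track that runs north and then turns east, only the start point, the turning point, and the end point survive, which is the behavior the text describes for key position points.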
The QuickBundles clustering algorithm requires that the trajectories to be clustered have the same number of trajectory points. To meet this requirement after compressing the ship’s trajectory, in this section, we adopt a segmented interpolation method based on the distance between the trajectory points to unify the number of ship trajectory points. The specific process is shown in Figure 5. First, the number of track points to be inserted is obtained, and then the distance between each pair of adjacent track points is calculated. The points to be inserted are then allocated to the segments in proportion to these distances, so that longer segments receive more inserted points.
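The proportional allocation described above can be sketched as follows. This is only an illustration under stated assumptions: distances are planar Euclidean distances between coordinate pairs (a geodesic distance could be substituted), rounding is resolved with a largest-remainder rule so that the counts sum exactly to the number of points to insert, and inserted points are spaced evenly inside each segment. The names `allocate_points` and `resample_to_n` are hypothetical.

```python
import math
from typing import List, Tuple

XY = Tuple[float, float]


def allocate_points(seg_lengths: List[float], n_insert: int) -> List[int]:
    """Distribute n_insert points over segments in proportion to their lengths."""
    total = sum(seg_lengths)
    if total == 0:
        # Degenerate track (all points coincide): put everything in the first segment.
        return [n_insert] + [0] * (len(seg_lengths) - 1)
    raw = [n_insert * length / total for length in seg_lengths]
    counts = [int(r) for r in raw]  # floor of each share
    leftover = n_insert - sum(counts)
    # Give the remaining points to the segments with the largest fractional parts.
    order = sorted(range(len(raw)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


def resample_to_n(track: List[XY], n_points: int) -> List[XY]:
    """Insert points by linear interpolation until the track has n_points points."""
    if len(track) < 2 or n_points < len(track):
        raise ValueError("need at least 2 points and n_points >= len(track)")
    pairs = list(zip(track, track[1:]))
    seg_lengths = [math.dist(a, b) for a, b in pairs]
    counts = allocate_points(seg_lengths, n_points - len(track))
    out: List[XY] = []
    for (a, b), k in zip(pairs, counts):
        out.append(a)
        for j in range(1, k + 1):
            t = j / (k + 1)  # evenly spaced positions strictly inside the segment
            out.append((a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])))
    out.append(track[-1])
    return out
```

Because the original points are kept and only new points are added between them, the key position points found by the compression step remain in the resampled trajectory.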
2.2. Ship Trajectory Classification Module
In key monitoring areas such as ports and harbors, as the flow of ships increases, regulatory agencies need an efficient ship trajectory classification algorithm to classify the ships in their jurisdiction, improve the level of intelligent management and the efficiency of supervision, and reduce the risk of major and catastrophic traffic accidents in busy waters. In this section, we use the trajectory clustering results as the training dataset to investigate the classification of ship trajectories and propose a ship trajectory classification method based on deep convolutional neural networks.
2.2.1. Longitude and Latitude Mapping and Coordinate Conversion
The latitude range of the water area where the experimental data in this article are located is 48 degrees 9 min 7.28 s north latitude to 49 degrees 6 min 44.28 s north latitude, and the longitude range is 123 degrees 3 min 43.33 s west longitude to 123 degrees 42 min 2.71 s west longitude, as shown in
Figure 6. In this section, we assume that the area is the key monitoring area of the ship supervision department, model the area and convert the latitude and longitude data into image data according to the length and width ratio of the water area where the experimental data are located.
2.2.2. Calculation of the Aspect Ratio of the Water Area
The water area where the experimental data are located is a rectangular area. Its aspect ratio, which determines the image resolution, is obtained by calculating the lengths of the sides of the rectangle. The Haversine formula [30] is used to calculate the distance between two points whose longitude and latitude coordinates are known, as given in Formula (1):
$$d = 2R \arcsin\left(\sqrt{\sin^{2}\left(\frac{\varphi_{2}-\varphi_{1}}{2}\right) + \cos\varphi_{1}\cos\varphi_{2}\sin^{2}\left(\frac{\Delta\lambda}{2}\right)}\right) \qquad (1)$$
where $d$ is the distance between the two points; $R$ is the radius of the earth, with an average value of 6371 km; $\varphi_{1}$ and $\varphi_{2}$ represent the latitudes of the two points; and $\Delta\lambda$ represents the difference between the longitudes of the two points. According to this calculation, the length of the experimental area is 28.41 km, the width is 17.82 km, and the approximate ratio is 14:9.
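As a rough check of the distance calculation, the Haversine formula can be evaluated for the sides of the rectangle. The sketch below assumes the rounded decimal-degree bounds quoted for the pixel mapping (latitude 48.90 to 49.06, longitude −123.42 to −123.03) and assumes that the length is measured along the northern edge; with these rounded inputs the results come out at roughly 28.4 km and 17.8 km, close to but not identical with the reported values. The function name `haversine_km` and the constant names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean earth radius used in the text


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Rounded bounds of the experimental area (assumed from the pixel-mapping ranges).
LAT_MIN, LAT_MAX = 48.90, 49.06
LON_MIN, LON_MAX = -123.42, -123.03

# East-west side, measured along the northern edge (an assumption of this sketch).
length_km = haversine_km(LAT_MAX, LON_MIN, LAT_MAX, LON_MAX)
# North-south side, measured along the western edge.
width_km = haversine_km(LAT_MIN, LON_MIN, LAT_MAX, LON_MIN)
aspect_ratio = length_km / width_km
```

The resulting aspect ratio is about 1.6, which is consistent with rounding to a convenient 14:9 image shape.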
The higher the image resolution, the higher the computational cost and the lower the computational efficiency of the deep convolutional neural network. Considering this, in this paper, we set the resolution to 112 × 72, keeping the ratio of the image unchanged at 14:9, so that latitude values in the range (49.06, 48.90) are mapped to pixel coordinates in the range (0, 71) and longitude values in the range (−123.42, −123.03) are mapped to pixel coordinates in the range (0, 111), as shown in
Figure 7.
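The mapping from latitude and longitude to pixel coordinates can be sketched as a linear rescaling. The sketch assumes a 112 × 72 image, which is inferred here from the stated pixel ranges (0, 111) and (0, 71), and it marks only the trajectory points themselves rather than drawing connecting lines, which is a simplification. The names `to_pixel` and `rasterize` are hypothetical.

```python
from typing import List, Tuple

# Bounds of the water area and image size (image size inferred from the pixel ranges).
LAT_MAX, LAT_MIN = 49.06, 48.90
LON_MIN, LON_MAX = -123.42, -123.03
WIDTH, HEIGHT = 112, 72


def to_pixel(lat: float, lon: float) -> Tuple[int, int]:
    """Map (lat, lon) to (row, col): latitude 49.06 -> row 0, 48.90 -> row 71;
    longitude -123.42 -> col 0, -123.03 -> col 111."""
    row = round((LAT_MAX - lat) / (LAT_MAX - LAT_MIN) * (HEIGHT - 1))
    col = round((lon - LON_MIN) / (LON_MAX - LON_MIN) * (WIDTH - 1))
    # Clamp points that fall slightly outside the area onto the image border.
    row = min(max(row, 0), HEIGHT - 1)
    col = min(max(col, 0), WIDTH - 1)
    return row, col


def rasterize(track: List[Tuple[float, float]]) -> List[List[int]]:
    """Return a HEIGHT x WIDTH binary image with trajectory points set to 1."""
    image = [[0] * WIDTH for _ in range(HEIGHT)]
    for lat, lon in track:
        row, col = to_pixel(lat, lon)
        image[row][col] = 1
    return image
```

Placing the northernmost latitude at row 0 follows the usual image convention in which the row index grows downward, so the rasterized trajectory keeps the same orientation as a north-up chart.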
Figure 8 shows the ship trajectory image data after the latitude and longitude data of the ship trajectory are converted. According to the clustering results in
Section 3, there are five types of ship trajectories in the waters where the experimental data are located based on the channel category division, so the labels of the dataset are set to 0, 1, 2, 3, 4. The resolution of each ship trajectory image is 112 × 72, which corresponds to the latitude and longitude range of the water area. The specific dataset details, as well as the division into training set and test set, are shown in
Table 3.
2.2.3. Deep Convolutional Network Model Construction
The residual network (ResNet) [31] is widely used in target classification and other fields. It is one of the classic backbone networks for computer vision tasks. Typical networks include ResNet50, ResNet101, etc. ResNet shows that convolutional neural networks can be made deeper (with more hidden layers) and verifies that deeper convolutional neural networks can achieve better performance.
ResNet50 has a unique residual structure, as shown in Figure 9. One of the core techniques of the residual structure is the use of a shortcut connection. The gradient vanishes mainly when two conditions coincide: the network is very deep, so that for a parameter in a layer close to the input the derivative chain is very long, and some of the intermediate results in that chain have small values. After multiplication along the chain, the final gradient value is close to zero, and the parameters are not updated. In the residual structure, the input is added directly to the output obtained through the convolution operation, which can avoid the vanishing gradient problem and allows small perturbations to be captured. In addition, the first and last layers of the residual structure use convolution to reduce and then restore the data dimensions. The time complexity of the two structures is similar, but the residual structure allows the network to be deepened and alleviates network degradation during training. As shown in Figure 10 and Figure 11, in the actual processing step, jump connections are divided into two types according to the size of the input and output of the residual block. One is the identity block (ID BLOCK), used when the input and output dimensions are consistent, and the other is the convolutional block (CONV BLOCK), used when they are inconsistent; in the latter, the jump connection is processed by a convolution so that the input and output dimensions match. ResNet50 adopts small-size convolution kernels and uses batch normalization [32]. In this paper, we build a ship trajectory and channel classification model using ResNet50 as the deep convolutional neural network framework.
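The effect of the shortcut connection on the gradient can be made explicit with a short derivation. The notation is an assumption introduced for illustration: $x$ is the input of a residual block, $F(x)$ is the output of its convolutional branch, $y$ is the block output, $\mathcal{L}$ is the loss, and $W_s$ is the convolution applied on the shortcut of a CONV BLOCK. The activation applied after the addition is ignored for simplicity.

```latex
% Identity block: the input is added directly to the convolutional branch.
y = F(x) + x
\quad\Longrightarrow\quad
\frac{\partial \mathcal{L}}{\partial x}
  = \frac{\partial \mathcal{L}}{\partial y}
    \left( \frac{\partial F(x)}{\partial x} + I \right)

% Convolutional block: the shortcut is a convolution W_s that matches dimensions.
y = F(x) + W_s x
\quad\Longrightarrow\quad
\frac{\partial \mathcal{L}}{\partial x}
  = \frac{\partial \mathcal{L}}{\partial y}
    \left( \frac{\partial F(x)}{\partial x} + W_s \right)
```

In the identity case, even if $\partial F(x) / \partial x$ is close to zero, the term $I$ passes the upstream gradient through unchanged, so the product of many such factors along a deep chain does not collapse to zero in the way a product of small factors would.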
2.2.4. Model Building
The ship trajectory classification network structure proposed in this paper is shown in
Figure 12 and
Table 4. The structure is composed of five convolution blocks stacked in sequence. Each convolution block contains the residual network substructure shown in
Figure 11. The residual network substructure in different convolution blocks has different numbers of convolution kernels. The input layer dimension parameter of the network model is set to
, the mini batch size is set to 64, and the output layer category is set to 5.