**1. Introduction**

About 90% of global trade is carried by maritime transportation [1]. With the continuing growth of international trade, modern maritime transportation calls for more intelligent transportation management methods to achieve larger capacity, faster traveling speeds and higher safety levels. To achieve these goals, accurate predictions of vessels' future movements are important and can be used in many maritime applications, such as port management, anomaly detection and collision avoidance [2].

Despite the importance of vessel trajectory prediction, it remains a challenging task due to the diverse navigation environments and the stochastic nature of vessel movements. Most existing works on vessel trajectory prediction have focused on short-term predictions [3–7], which are particularly useful for collision warning and avoidance. In contrast, this work investigates the problem of long-term vessel trajectory prediction. Combined with additional information such as weather and current forecasts along the route, long-term trajectory predictions can help captains operate ships in a more fuel-efficient way and hence reduce carbon dioxide emissions. Port management agencies can also use long-term vessel trajectory predictions to obtain more accurate estimates of the remaining navigation distance and, subsequently, more accurate predictions of vessel arrivals.

**Citation:** Xu, X.; Liu, C.; Li, J.; Miao, Y.; Zhao, L. Long-Term Trajectory Prediction for Oil Tankers via Grid-Based Clustering. *J. Mar. Sci. Eng.* **2023**, *11*, 1211. https://doi.org/10.3390/jmse11061211

Academic Editor: Sergei Chernyi

Received: 19 May 2023 Revised: 7 June 2023 Accepted: 8 June 2023 Published: 11 June 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The prevalence of maritime transportation data makes it possible to take a data-driven approach to developing high-accuracy vessel trajectory prediction algorithms [2,8]. For instance, the Automatic Identification System (AIS), a global autonomous tracking system that has been made compulsory for ships exceeding 300 tons, provides abundant and near real-time information about ships. Apart from static information such as MMSI, ship name and ship type, AIS messages also contain the location of the ship (longitude (Lon) and latitude (Lat)) and information about its motion (e.g., speed over ground (SOG), course over ground (COG) and heading) [9]. This information, when gathered at large scale, can be used to characterize the voyage patterns in a given area and can further be exploited to predict vessel trajectories.
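To make the fields listed above concrete, the following Python sketch shows one possible in-memory representation of a decoded AIS message. The class name `AISRecord`, the field names, the `timestamp` field, the range checks and the sample values (including the MMSI) are all illustrative assumptions; they are not taken from the AIS standard or from the dataset used in this paper.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AISRecord:
    # Static information
    mmsi: int          # Maritime Mobile Service Identity
    ship_name: str
    ship_type: str
    # Dynamic information
    timestamp: str     # assumed ISO 8601 reception time (not listed in the text)
    lon: float         # longitude in degrees
    lat: float         # latitude in degrees
    sog: float         # speed over ground in knots
    cog: float         # course over ground in degrees
    heading: float     # heading in degrees


def is_plausible(rec: AISRecord) -> bool:
    """Basic range checks before a record is used for pattern extraction."""
    return (
        -180.0 <= rec.lon <= 180.0
        and -90.0 <= rec.lat <= 90.0
        and rec.sog >= 0.0
        and 0.0 <= rec.cog < 360.0
    )


# Made-up sample values, for illustration only.
record = AISRecord(
    mmsi=219000000,
    ship_name="EXAMPLE TANKER",
    ship_type="Tanker",
    timestamp="2023-01-01T00:00:00Z",
    lon=10.5,
    lat=56.2,
    sog=11.8,
    cog=87.0,
    heading=88.0,
)
corrupted = replace(record, cog=400.0)  # same record with an out-of-range course
```

Here `is_plausible(record)` holds while `is_plausible(corrupted)` does not, which is the kind of screening one might apply before extracting voyage patterns from raw messages.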

This paper presents an AIS-data-based long-term vessel trajectory prediction algorithm, which aims to predict the trajectory of an oil tanker from any location to a pre-defined destination, e.g., a port. The proposed algorithm takes a data-driven approach. First, the traveling patterns of tankers in an area of interest are extracted from historical AIS data via key point clustering. The output of the clustering is a set of *waymark points*, each corresponding to a cluster. Each waymark point is characterized by the average Lon, Lat and COG of the key points in the cluster that it represents, together with the number of those key points. In real-time trajectory prediction, the algorithm uses the Lon and Lat to filter the waymark points and uses the COG and the number of points to calculate a weighted distance between each filtered waymark point and the current reference point. A segment of the predicted trajectory is generated by connecting the reference point to the waymark point with the smallest weighted distance, which then becomes the new reference point. The complete long-term trajectory prediction is obtained by repeating this process until the reference point reaches the destination. For the proposed algorithm, the key point clustering only needs to be performed once. Once the set of waymark points is obtained, it can be reused to predict the trajectories of all subsequent oil tankers heading to the same destination. We verify the effectiveness of the proposed algorithm using real AIS data provided by the Danish Maritime Authority (DMA) [10].
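The prediction loop described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the implementation evaluated in this paper: the filtering rule (a search radius plus a "closer to the destination" test), the form of `weighted_distance`, and the parameters `alpha`, `beta`, `radius_km` and `arrive_km` are hypothetical choices made so that the example runs end to end.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0


@dataclass(frozen=True)
class Waymark:
    lon: float   # average longitude of the cluster (degrees)
    lat: float   # average latitude of the cluster (degrees)
    cog: float   # average course over ground (degrees)
    count: int   # number of key points in the cluster


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance between two (lon, lat) points in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def cog_diff(a, b):
    """Smallest absolute difference between two courses, in degrees (0 to 180)."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def weighted_distance(ref, wp, alpha=1.0, beta=0.1):
    """Hypothetical weighting: penalise course mismatch, favour populous clusters."""
    d = haversine_km(ref.lon, ref.lat, wp.lon, wp.lat)
    course_penalty = 1.0 + alpha * cog_diff(ref.cog, wp.cog) / 180.0
    support_bonus = 1.0 + beta * math.log(wp.count)
    return d * course_penalty / support_bonus


def predict_route(start, dest, waymarks, radius_km=30.0, arrive_km=5.0):
    """Chain waymark points from `start` until the destination is reached."""
    ref = start
    route = [(ref.lon, ref.lat)]
    while haversine_km(ref.lon, ref.lat, dest[0], dest[1]) > arrive_km:
        to_dest = haversine_km(ref.lon, ref.lat, dest[0], dest[1])
        # Assumed position filter: nearby waymarks that reduce the distance to go.
        candidates = [
            wp for wp in waymarks
            if haversine_km(ref.lon, ref.lat, wp.lon, wp.lat) <= radius_km
            and haversine_km(wp.lon, wp.lat, dest[0], dest[1]) < to_dest
        ]
        if not candidates:
            break  # no usable waymark left: go straight to the destination
        ref = min(candidates, key=lambda wp: weighted_distance(ref, wp))
        route.append((ref.lon, ref.lat))
    route.append(dest)
    return route


# Made-up waymarks: three on an eastbound lane and one sparse northbound cluster.
waymarks = [
    Waymark(10.3, 56.0, 90.0, 50),
    Waymark(10.6, 56.0, 90.0, 50),
    Waymark(10.9, 56.0, 90.0, 50),
    Waymark(10.3, 56.1, 0.0, 2),
]
start = Waymark(10.0, 56.0, 90.0, 1)   # current vessel state, reusing the same type
route = predict_route(start, (11.0, 56.0), waymarks)
```

Requiring every candidate to be strictly closer to the destination than the current reference point guarantees that the loop terminates, since each waymark can be selected at most once. It is an assumption of this sketch rather than a rule stated in the text.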

The rest of this paper is organized as follows. Section 2 provides a brief review of related work in the area of trajectory prediction. Section 3 describes the proposed trajectory prediction algorithm. In Section 4, the proposed algorithm is applied to real AIS data and is compared to state-of-the-art algorithms. Finally, Section 5 draws conclusions and discusses possible future research directions. For ease of exposition, Table 1 lists the key notation used in this paper.




#### **2. Related Work**

With the advances in high-precision positioning technologies such as the Global Positioning System (GPS) and real-time radars, it is possible to acquire accurate position information for vehicles, aircraft, ships and pedestrians. The availability of high-accuracy positioning data, together with other information such as speed and acceleration, makes it possible to automatically predict the trajectories of targets of interest, which could find applications in many areas, including terrestrial navigation [11,12], autonomous driving [13,14], and maritime traffic management [2,8,15,16]. This paper focuses on the trajectory prediction problem for vessels, for which the existing works can be classified into two categories, i.e., short-term trajectory prediction [3–7] and long-term trajectory prediction [17,18].

For short-term vessel trajectory prediction, the time horizon over which the predictions are made usually ranges from a few seconds to a few tens of minutes. Such prediction algorithms are often developed for collision avoidance to ensure navigation safety. Hence, apart from prediction accuracy, timeliness is also important [19]. In this category, classical algorithms rely on mathematical models of mobility and statistical techniques. Examples of such methods include the Kalman filter-based algorithm [20], the Gaussian process-based algorithm [3], and the Markov process-based algorithm [21]. However, due to the complicated nature of vessel trajectories, these methods may not be able to capture the characteristics of the trajectories to be predicted and may thus fail to provide accurate predictions.
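As a minimal illustration of what a mobility-model-based short-term predictor looks like, the Python sketch below projects a position forward under a constant-speed, constant-course assumption (dead reckoning). It is not any of the cited algorithms [3,20,21]; the function name `dead_reckon` and the sample values are assumptions, and the course is treated as the initial bearing of a great-circle step, which is a reasonable approximation only over short horizons.

```python
import math

EARTH_RADIUS_NM = 3440.065  # mean Earth radius in nautical miles


def dead_reckon(lon, lat, sog_knots, cog_deg, minutes):
    """Project a position forward at constant speed along the current course."""
    dist_nm = sog_knots * minutes / 60.0
    delta = dist_nm / EARTH_RADIUS_NM          # angular distance in radians
    theta = math.radians(cog_deg)
    phi1, lam1 = math.radians(lat), math.radians(lon)
    phi2 = math.asin(
        math.sin(phi1) * math.cos(delta)
        + math.cos(phi1) * math.sin(delta) * math.cos(theta)
    )
    lam2 = lam1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2),
    )
    return math.degrees(lam2), math.degrees(phi2)


# Made-up vessel state: 12 knots, predicted 10 minutes ahead.
north_lon, north_lat = dead_reckon(10.0, 56.0, 12.0, 0.0, 10.0)
east_lon, east_lat = dead_reckon(10.0, 56.0, 12.0, 90.0, 10.0)
```

Because the model ignores turns and speed changes, its error grows with the horizon, which is consistent with the limitation of classical methods noted above.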

The recent advances in machine learning techniques, along with the availability of abundant AIS data, have stimulated a burst of research on machine-learning-based vessel trajectory prediction algorithms [5–7]. For instance, Zhang et al. [5] used a hybrid method based on long short-term memory (LSTM) networks and k-nearest neighbors (KNN) for short-term trajectory prediction. The algorithm switches between LSTM and KNN according to the trajectory density of the area: in dense areas, KNN is used for trajectory prediction, while in sparse areas, LSTM is adopted. On publicly available AIS data collected near Xiamen Port, Fujian Province, from 2018 to 2019 [5], this method achieved better prediction accuracy than classical prediction algorithms based on Kalman filtering. You et al. [6] proposed a sequence-to-sequence model based on gated recurrent units (GRU) to predict vessel trajectories at a time horizon of 5 min. The model first encodes the trajectory as a context vector to preserve the temporal relationship between trajectory positions, and then uses a GRU as a decoder to output the future trajectory. Numerical results based on data collected near Chongqing and Wuhan in the Yangtze River channel demonstrate good short-term trajectory predictions. Murray et al. [7] proposed a data-driven trajectory prediction method that first applies Principal Component Analysis (PCA) to each trajectory to generate feature vectors and then uses the extracted feature vectors as inputs to a Gaussian Mixture Model to predict multiple possible trajectories at the same time. The prediction time horizon is about 30 min. All of the above works use deep learning or other data-driven methods and report improved results.

Compared to the problem of short-term trajectory prediction, progress on the long-term trajectory prediction problem is much less reported, partly due to its more challenging nature. Existing short-term prediction algorithms may be used to produce long-term predictions through recursion; however, the accuracy is expected to decrease as the number of prediction steps increases [6]. One relevant work that tackles long-term trajectory prediction can be found in [22], where DBSCAN was used to cluster historical trajectories and the trajectory predictions were produced by pre-trained deep learning models. This work was tested using the publicly available AIS data from MarineCadastre in Zones 15 and 16. While [22] also adopted DBSCAN, the overall methodology is completely different from our proposed approach. For instance, in [22], DBSCAN was used to cluster trajectories based on trajectory statistics such as the trimmed mean of the longitude (latitude). In contrast, our work adopts DBSCAN to cluster the key points from many historical trajectories, where key points sharing similar characteristics are grouped into the same cluster. Two other relevant works on long-term vessel trajectory prediction can be found in [17,18]. Both works attempt to predict the remaining path of a ship to its destination in order to better estimate the remaining traveling distance and to achieve a more accurate estimate of the arrival time. Ogura et al. [17] evaluated the difference between the predicted weather and the weather of historical trajectories, and then adopted the historical trajectory with the smallest difference as the prediction. AIS data from four cargo ships traveling to and from Japan were used to test the weather-based algorithm [17]. Alessandrini et al. [18] adopted a graph-based approach by calculating rasters of density and directionality from all the historical data and applying a path-finding algorithm to identify a path from the current location to the destination. This method was tested using AIS data recorded in October 2015 near the port of Trieste, an Italian city on the northeastern Adriatic Sea, with the assistance of historical LRIT data from 2009 to 2014 [18].
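To illustrate the difference between clustering whole trajectories and clustering key points, the Python sketch below runs a plain DBSCAN over individual key points and summarizes each cluster by its mean position, mean course and size. It is a hedged sketch rather than the procedure used in this paper: the feature encoding (course mapped to a scaled cosine and sine pair), the weight `cog_weight`, the circular mean for COG, the parameters `eps` and `min_pts`, and the sample key points are all assumptions made for the example.

```python
import math
from collections import deque


def region_query(points, i, eps):
    """Indices of all points within `eps` of point i (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, with -1 marking noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = -1           # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(neighbours)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = region_query(points, j, eps)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)   # j is a core point: keep expanding
    return labels


def to_feature(lon, lat, cog, cog_weight=0.05):
    """Assumed encoding: course as a scaled (cos, sin) pair to avoid the 0/360 jump."""
    r = math.radians(cog)
    return (lon, lat, cog_weight * math.cos(r), cog_weight * math.sin(r))


def summarise(key_points, labels):
    """Per cluster: mean Lon, mean Lat, circular-mean COG and number of key points."""
    groups = {}
    for point, label in zip(key_points, labels):
        if label != -1:
            groups.setdefault(label, []).append(point)
    waymarks = []
    for members in groups.values():
        n = len(members)
        s = sum(math.sin(math.radians(c)) for _, _, c in members)
        c = sum(math.cos(math.radians(c)) for _, _, c in members)
        mean_cog = math.degrees(math.atan2(s, c)) % 360.0
        mean_lon = sum(m[0] for m in members) / n
        mean_lat = sum(m[1] for m in members) / n
        waymarks.append((mean_lon, mean_lat, mean_cog, n))
    return waymarks


# Made-up key points (lon, lat, cog): two groups and one isolated point.
key_points = [
    (10.30, 56.00, 88.0), (10.31, 56.01, 92.0), (10.29, 56.00, 90.0), (10.30, 55.99, 91.0),
    (10.60, 56.20, 358.0), (10.61, 56.20, 2.0), (10.60, 56.21, 0.0), (10.59, 56.19, 1.0),
    (11.50, 57.00, 200.0),
]
labels = dbscan([to_feature(*p) for p in key_points], eps=0.05, min_pts=3)
waymarks = summarise(key_points, labels)
```

The second group shows why a circular mean is used in this sketch: courses of 358°, 2°, 0° and 1° all point roughly north, yet their arithmetic mean would be about 90°.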
