**1. Introduction**

In the last two decades, research on automatic collision avoidance and optimal path planning for surface vessels has intensified, driven by the ever-growing density of maritime traffic in narrow waterways, such as gulfs, ports, and canals [1]. Motivated by the design of autonomous surface vehicles (ASV) controllers, but also aiming for the decision support of officers on watch in conventional vessels [2], control and optimization tools that ensure the safety and the cost effectiveness of navigational actions are being intensively developed. These tools are perceptive of the surrounding environment through arrays of sensors, radars, and other positioning and communication aids. In this context, the automatic identification system (AIS) encompasses most of the aforementioned technologies in order to gather positioning and other vessel data. The already vast AIS comprises an everexpanding worldwide maritime trajectory dataset, which is made available by vessels, port authorities, and other platforms in charge of efficient and safe maritime path planning. Given the fact that the majority of vessel accidents are related to erroneous handling rather than equipment failure or environmental conditions [3], these tools aim to phase out the human officers on watch as vessel controllers, or at least augment their navigational decision-making using optimization- and prediction-based methods.

The formulation of the trajectory optimization problem used in collision avoidance controllers must take multiple aspects of vessel navigation into account, while being perceptive of their surrounding environment in real time. The generation of control actions that will result in a trajectory remaining sufficiently clear from any stationary or moving objects is not the sole objective; an efficient controller should also ensure the

**Citation:** Papadimitrakis, M.; Stogiannos, M.; Sarimveis, H.; Alexandridis, A. Multi-Ship Control and Collision Avoidance Using MPC and RBF-Based Trajectory Predictions. *Sensors* **2021**, *21*, 6959. https:// doi.org/10.3390/s21216959

Academic Editors: Baochang Zhang, Reza Ghabcheloo and Antonio M. Pascoal

Received: 3 August 2021 Accepted: 16 October 2021 Published: 20 October 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

economy of the control actions, as well as the adherence to the collision avoidance rules, commonly known as the COLREGs [4]. Multiple collision avoidance controllers have been proposed that fulfil the aforementioned specifications; in [5], a hierarchical multiobjective optimization problem is formulated, which generates an intermediate waypoint for the controlled vessel while accounting for the good seamanship rules. In [6], a fuzzy-Bayesian collision avoidance controller is formulated capable of addressing multiple obstacle vessels at once. In [7], optimal trajectories for the collision avoidance problem are generated using a B-Spline-based search algorithm. Lastly, in [8], a collision avoidance controller utilizes a probabilistic method in order to infer the one-step-ahead position of obstacle vessels, while also accounting for non-COLREG-compliant obstacle vessels.

In general, it has been observed that controllers that are not model-based can have trouble incorporating crucial aspects of the trajectory optimization problem, thus compromising practicality. Without a working model of the controlled vessel, its maneuvering capabilities cannot be easily included in the formulation, and neither can the effect of environmental conditions be quantified [9,10]. For these reasons, model predictive control (MPC) emerges as an effective control method for the problem at hand because it utilizes a model of the plant in order to compute an optimal control trajectory based on the predicted trajectory of other ships in the vicinity. As a framework, MPC can account for the uncertainties of both the utilized model of the plant and the trajectory prediction models of other ships, while also incorporating all possible control objectives (such as navigational risk, course smoothness, or deviation from the original path) in a single cost function. Some collision avoidance controllers based on MPC have been proposed in the literature; a robust MPC controller utilizing straight-line obstacle vessel trajectory predictions is proposed in [9], capable of COLREG compliance and handling of multiple obstacles. In [11], motion planning for an autonomous vessel using a sampling-based MPC method takes place. In [12], an MPC controller for the collision avoidance task is built by approximating the behavior of an LQR controller, thus ensuring asymptotic stability of the system. In [13], a neural network used to approximate the MPC response for the generation of COLREG compliant trajectories for multi ship encounters is presented. In addition, MPC has been integrated in distributed control frameworks of multi-ship schemes; for example, a distributed MPC scheme has been employed for a multi-vessel formation controller with collision avoidance capabilities [14], or for the robust distributed control of multiple vessels operating for the inter-terminal transport of containers [15].

It becomes apparent that for the scope of the collision avoidance task, information about the future trajectories of other ships plays a central role. Prevalent in non-data driven methodologies already used for the vessel trajectory prediction (VTP) problem is the first principles-based modeling technique [16], carrying a number of significant shortcomings, such as their inherent complexity, which has a greater negative impact due to the fact that the model is usually employed multiple times within the duration of each MPC sample. In order to simplify the solution of the employed kinematic differential equations and facilitate the real-time prediction of future states, these types of models are usually created using several assumptions that try to approximate real-world conditions, but also make the final model far less accurate. Therefore, one should employ a more sophisticated, data-driven approach for the creation of effective trajectory prediction models that are included in MPC controllers. Machine learning has answered the call of producing highly accurate models, which may be easily integrated in predictive frameworks through the use of black-box modeling, and more specifically, artificial neural network (NN) approaches [17]. NNs employ different architectures in order to remap the original non-linear problem to a higher-dimensional input space and approximate its dynamics utilizing standard functions. In this context, various NN techniques have been successfully utilized in control frameworks solving the vessel trajectory prediction problem.

Feedforward NN architectures, most commonly represented by the multilayer perceptrons (MLPs), have been employed to solve the vessel trajectory prediction problem as in [18,19], where MLP NNs are trained using the well-established backpropagation algorithm outperforming rival methodologies, i.e., linear models and Kalman filters. In [19], a real AIS dataset gathered from the confined space of a river waterway is used to approximate the vessel dynamics in such environments. Backpropagation has been the baseline of more efficient training methods as in [20], where different computational intelligence approaches like differential evolution, genetic algorithms, and swarm-based techniques are used to modify the original backpropagation algorithm in order to create more accurate feedforward NN models. Other NN architectures, like support vector machines (SVMs), have been employed in conjunction with computational intelligence optimization techniques, i.e., the particle swarm optimization (PSO) algorithm, on AIS datasets to solve the vessel trajectory prediction problem [21]. In most cases, the inherent abilities of NN architectures that can meet the standard of high accuracy are limited to a one-step ahead prediction horizon, in the sense that multi-step ahead predictions would require an approximation of unknown future states to be made and present an error enlarged through propagation to the end of the prediction horizon. Such an error would become critically high after a small number of steps, rendering the control framework useless.

To overcome this problem, long-term trajectory prediction approaches have been devised with the inclusion of memory features, such as the recurrent neural networks (RNNs), with their most notable representative, i.e., the long short-term memory (LSTM) NNs already used in the context of the vessel trajectory prediction problem [22–25]. Besides trajectory modeling and prediction in open waters, advances have also been made in crowded port waters as in [26], where another modification of the RNNs, namely the bidirectional gated recurrent unit, is used to address the vessel trajectory prediction problem, outperforming standard NN methods in such scenarios. Gated recurrent units are promising candidates for predicting the collective behavior of vessel fleets [27].

Radial basis function (RBF) networks form a unique NN architecture belonging to the general feedforward NN category. RBFs differ from other NN architectures, having simpler structures, employing faster training algorithms, and usually producing more accurate models than MLPs. Within the context of vessel trajectory prediction, RBFs have been integrated in control frameworks by approximating unknown vessel parameters [28–30]. Recently, RBFs have been applied on real AIS data in order to produce highly accurate models for one-step and multi-step ahead predictions [31], showing their potential in being integrated to receding horizon control methodologies.

Remarkably, in the collision avoidance research literature, there are no instances where the multi-step-ahead trajectory prediction of moving obstacles is addressed in such a systematic manner; usually these trajectories are either known a priori, or there are no obstacle ships present whatsoever. An exception occurs in [9], where straight line trajectory predictions are employed, based on estimates of current course and speed for the moving obstacle. In this work, a multi-ship MPC controller utilizing RBF obstacle trajectory prediction models trained using real AIS data is presented for the collision avoidance task. The main contributions of this work are as follows: first, we introduce a novel MPC scheme for collision avoidance control, where nonlinear data-driven models are used to predict the trajectories of obstacle ships; to the authors' best knowledge, this is the first such instance in the literature. The previous state-of-the-art approach of using straight-line obstacle trajectory predictions may have yielded satisfactory approximation results in open sea case studies, where ships are expected to travel in a straight line, but is of limited practical use for the cases of narrow gulfs, ports, or canals, where ships are expected to maneuver while navigating through, thus resulting to nonlinear trajectories. Secondly, the aforementioned nonlinear models are trained using an AIS data-driven methodology, which, once again, constitutes an original approach in the context of vessel collision avoidance. Using real trajectories as training data increases the practicality of the proposed scheme, since the resulting trajectory prediction models capture the dynamics of real vessels. The proposed method is tested in a collision avoidance case study at the Miami port area, and its performance is illustrated by the comparison with an MPC controller

employing straight-line obstacle prediction models, which corresponds to the current state-of-the-art approach [9].

The paper is structured as follows. In Section 2, the AIS-data-driven methodology for the creation of the RBF trajectory prediction models is presented. In Section 3, some preliminaries on maritime collision avoidance and optimal trajectory generation are described, and later, the proposed method is presented. In Section 4, the case study based on the port of Miami is outlined, and the simulation results are discussed in depth. Lastly, in Section 5, concluding remarks are made.

#### **2. AIS-Data-Based Trajectory Prediction Models**

#### *2.1. Radial Basis Function Neural Networks*

RBF NNs have been successfully employed to approximate nonlinear system dynamics in order to predict future system states in numerous diverse applications [32,33]. Their success can be mainly attributed to their structure, which is simpler when compared to other NN architectures, as they comprise a single hidden layer, attached linearly to the network output. This property not only allows for using very fast training algorithms, but also makes RBF NNs suitable for integration in MPC schemes, as (a) it facilitates the controller design [34], and (b) helps to solve the MPC online optimization problem in shorter computational times [35], thus rendering such schemes applicable to systems with fast dynamics [36]. Another property that makes RBF NNs a popular modeling method in predictive control schemes is related to their increased approximation capabilities [37,38]. Indeed, notwithstanding their simple structure, RBF NNs have proven to be more accurate in modeling various nonlinear systems when compared to other machine learning methods, including MLPs, support vector machines (SVMs), random forests, etc. [39]. Especially within the context of the vessel modeling and control problems, RBFs have already shown great potential in modeling unknown vessel parameters [28,29], in order to create models capable to be integrated in trajectory tracking control algorithms. Furthermore, in a recent publication [31], it has been shown that RBF NNs trained with the fuzzy means (FM) algorithm outperform other data-driven techniques such as MLPs when modeling vessel trajectories based on AIS data.

Training an RBF NN is a process consisting mainly of two phases. The first phase is performed by applying a clustering technique on the training dataset in order to identify the centerpoint and optimized parameters of a number of radially symmetric basis functions called nodes. The incorporation of radial basis functions (e.g., Gaussian, quadratic, etc.) is the first main difference between RBFs and other feedforward NNs. The linear combination of these nodes produces the output prediction of the network. Finding the node weights is the goal of the second phase, a problem trivially solved by least squares.

A typical RBF network can be seen in Figure 1. The structure comprises three distinct layers, the first of which is the input layer and has the sole purpose of distributing the *N* inputs to the *L* nodes of the hidden layer. The second point differentiating RBFs from other architectures is the existence of only one hidden layer of *N*-dimensional nodes. In order to produce a prediction <sup>ˆ</sup> *y*(*k*) given an input datapoint **x**(*k*), at first the Euclidean norm is used to calculate the activity *μl*(**x**(*k*)) for every node *l* by using the difference between the *<sup>k</sup>*-th input vector **<sup>x</sup>**(*k*) and the *<sup>l</sup>*-th node center **^ u***l*, such that

$$\mu\_l(\mathbf{x}(k)) = \|\mathbf{x}(k) - \hat{\mathbf{u}}\_l\| = \sqrt{\sum\_{i=1}^{N} \left(\mathbf{x}\_i(k) - \hat{\mathbf{u}}\_{i,l}\right)^2}, k = 1, \dots, K \tag{1}$$

The activity acts as input to the free parameters of each node according to the chosen RBF. The hidden node response vector **z**(*k*) for the *k*-th datapoint is given by

$$\mathbf{z}(k) = \left| \lg(\mu\_1(\mathbf{x}(k))), \lg(\mu\_2(\mathbf{x}(k))), \dots, \lg(\mu\_L(\mathbf{x}(k))) \right| \tag{2}$$

where *g* corresponds to the chosen activation function. Note that in this work, a thin plate spline function is employed

$$\log(\mu\_l(\mathbf{x}(k))) = \mu\_l^2(\mathbf{x}(k)) \cdot \log \mu\_l(\mathbf{x}(k)) \tag{3}$$

due to the fact that it is an established choice as an RBF kernel producing models of high accuracy [40], but also because there are no tunable parameters other than the actual input to the function. Such parameters would require optimization techniques to be included in the training process, e.g., employment of the Gaussian function would need kernel width optimization, which is usually performed by cpu-intensive iterative algorithms or suboptimal trial-and-error techniques.

Finally, the network's prediction is calculated by linearly combining the hidden note responses such that

$$
\hat{y}(k) = \mathbf{z}(k) \cdot \mathbf{w} \tag{4}
$$

where **w** is a vector containing the node weights.

**Figure 1.** A typical radial basis function network structure using Gaussian RBFs.

#### *2.2. The Fuzzy Means (Fm) Training Algorithm*

For a given training dataset where **Y** denotes the real outputs, and after the hidden node responses **Z** are formulated, the weight vector **w** can be trivially calculated by least squares in matrix form

$$\mathbf{w}^T = \mathbf{Y}^T \cdot \mathbf{Z} \cdot \left(\mathbf{Z}^T \cdot \mathbf{Z}\right)^{-1} \tag{5}$$

thus completing the second training phase in one easy step.

The first phase of training requires a clustering algorithm to be applied to the training dataset, in which case the fuzzy means (FM) algorithm is a great candidate for this task [39]. Following the notation of the previous example, let us suppose a system with *N* normalized input variables *xi*. At first, each input variable domain must be partitioned into *s* 1-D fuzzy sets (FS). Each fuzzy subspace **A***<sup>l</sup>* , where *l* = 1, 2, ... ,*sN*, is formed by the selected *s<sup>N</sup>* fuzzy sets according to the respective input variable. This process creates a *N*-dimensional grid, where all intersection points, also called nodes, are candidates to become RBF centerpoints. The FM algorithm undertakes the task of selecting the appropriate candidate nodes that will be assigned as RBF centers. To perform the selection procedure, a membership function

$$\mu\_{\mathbf{A}^l}(\mathbf{x}(k)) = \begin{cases} 1 - d\_r^l(\mathbf{x}(k)), & \text{if } d\_r^l(\mathbf{x}(k)) \le 1 \\ 0, & \text{if } other\text{wise} \end{cases} \tag{6}$$

determines whether an input vector lies within the domain of an RBF centered around a candidate node. In the simple case where all input variable spaces are equally partitioned, the following distance metric can be used to assign *N*-dimensional spherical domains to each candidate node

$$d\_r^l(\mathbf{x}(k)) = \sqrt{\sum\_{i=1}^N \left(a\_{i, \overline{i}i}^l - \mathbf{x}\_i(k)\right)^2} / \sqrt{N} \delta a \tag{7}$$

where **x**(*k*) is the *k*-th input vector, *a<sup>l</sup> <sup>i</sup>*,*ji* is the centerpoint of fuzzy subspace **<sup>A</sup>***<sup>l</sup>* , and *δa* is the sphere radius, which is the same for each input. The FM algorithm uses a fast non-iterative procedure to find a subset of the subspaces, so that the final RBF NN's hidden layer comprises only the fuzzy sets, which sufficiently cover all training datapoints, in the sense that each datapoint is included in at least one fuzzy set. For an in-depth description of the FM algorithm, the reader may refer to [39].

#### *2.3. Data Preprocessing*

Best modeling practices mandate that a training dataset should be error- and noisefree, a case that is far from truth when using data from AIS transceivers [41–45]. AIS data are irregularly sampled and contain heavy noise, missing data, and erroneous values. Thus, before employing any modeling technique, rigorous preprocessing is in order.

The Marine Cadastre service (www.marinecadastre.gov, accessed on 25 July 2021) has been the source of all data used in this work. MarineCadastre.gov is a is a service that gathers and publicly provides AIS data to marine planning initiatives. In this work, data from all days between 1 July 2019 and 30 June 2020 have been included and filtered to keep vessels sailing an area around the port of Miami covered by the geolocation rectangle defined by the latitudes of 25.720◦ through 25.840◦ and the longitudes of −80.145◦ through −80.042◦. To conform to the initial assumption of similar size and similar dynamics, we allowed only cargo ships sailing on engine power into the dataset, further filtering the dataset to yield a total of 180 vessels.

To address the problems of sample irregularity, noise, and erroneous values, the dataset was resampled to 120 s, which was deemed enough to capture the high inertia dynamics of large cargo ships. The interpolation technique applied on the data to perform the resampling was the Akima piecewise cubic interpolation [46], which is quite effective on geolocation data, performing a mild denoising as well. A heuristic that rejects very far-off outlier values due to GPS errors was also applied. The trajectories were split in data samples, with each one containing ten consecutive vessel positions. Note that each trajectory's starting point should be the last point of the previous one resulting in an overlap of one point, but this final position will be used as the model's output, so no actual overlapping exists within the input data. The resampling and splitting process yielded a total of about 14 k samples from 3.1 k resampled trajectories of the initial 180 vessels. Algorithm 1 depicts the step-by-step procedure of preprocessing.

**Algorithm 1** Preprocessing Stage

1: Process each entry in the common dataset so that it contains only the following: Vessel ID, timestamp, latitude, and longitude. Reject all other information.

### *2.4. Modeling Procedures*

AIS transceiver equipped vessels are able to record and exchange timestamped information, including geolocation, speed, direction, vessel identification, and specifications

<sup>2:</sup> Sort dataset by vessel ID and sort each vessel data by date.

<sup>3:</sup> Apply resampling and outlier filtering on the data of each vessel to achieve a resampling of 120 s.

<sup>4:</sup> Split vessel data into trajectories containing ten consecutive vessel positions each.

<sup>5:</sup> Create final preprocessed dataset, which should contain the vessel ID and final

<sup>10-</sup>position trajectories.

data. Vessel trajectory prediction algorithms integrated with collision avoidance techniques can incorporate trajectory prediction models in order to identify imminent threats and navigate safely and efficiently within heavily crowded port areas or open seas.

Let us suppose an available AIS dataset, comprising an arbitrary number of *T<sup>v</sup>* trajectories for a total of *V* vessels, where *v* = 1, 2, ... , *V*. Let us also suppose that the included trajectories contain an arbitrary number of *Kv*,*<sup>t</sup>* AIS messages *AISmv*,*<sup>t</sup> <sup>k</sup>* (timestamped geolocation and other data). In this work, for simplicity reasons, we employ the following format in AIS messages

$$ASm\_k^{v,t} = \left\{ \begin{array}{cc} Ts\_k^{v,t} & y\_k^{v,t} & x\_k^{v,t} \end{array} \right\} \tag{8}$$

where *k* = 1, 2, ... , *Kv*,*<sup>t</sup>* , and *Tsv*,*<sup>t</sup> <sup>k</sup>* denotes the message timestamp, while *<sup>y</sup>v*,*<sup>t</sup> <sup>k</sup>* and *<sup>x</sup>v*,*<sup>t</sup> <sup>k</sup>* are the respective latitude and longitude contained in the *k*-th AIS message for the *t*-th trajectory of the *v*-th vessel. The fact that there are unknown parameters, e.g., the state and controls of the vessels, prohibits the use of kinematics in calculating future vessel states. Nevertheless, the vessel dynamics exist in the information hidden within the dataset and can be extracted and, in most cases, approximated by using a black-box modeling technique such as RBF NNs. We can assume that a common underlying pattern exists in the dynamics of samesize vessels executing similar maneuvers, for example, when approaching or leaving a port, when berthing, when crossing waterway paths, etc. Thus, if a suitable dataset of sufficient size is made available, an RBF NN can be trained to perform one-step-ahead predictions about a vessel's future geolocation by using past AIS messages as seen in the following equations

$$\left\{ \begin{array}{c} \Delta y\_{k+1}^{\upsilon, t} \\ \stackrel{\scriptstyle \upsilon, t}{\end{array}} \right\} = RBF \text{ NN} \left( \begin{array}{c} AISm\_{k}^{\upsilon, t} \\ \end{array} \dots \quad AISm\_{k-N}^{\upsilon, t} \right) \tag{9}$$

$$\begin{cases} \overset{\circ}{y}\_{k+1}^{v,t} = y\_k^{v,t} + \Delta \overset{\circ}{y}\_{k+1}^{v,t} \\ \mathbf{x}\_{k+1} = \mathbf{x}\_k^{v,t} + \Delta \mathbf{x}\_{k+1} \end{cases} \tag{10}$$

where *N* is the number of past AIS messages given as inputs to the RBF NN.

The accuracy and simple structures of FM-trained RBF NNs make them ideal to be integrated in multi-step-ahead predictive control formulations. Receding horizon techniques require models of very high accuracy due to the inevitable error enlargement through propagation. This effect appears when the incorporated model is expected to recurrently make future predictions based on its own previous output throughout the prediction horizon. The problem is further worsened with the increase of the prediction horizon and can ultimately drive the control algorithm to failure. Another important point to be noted is that the linear combination used to produce an RBF NN's prediction is a simple and very fast calculation, a fact that benefits model predictive control (MPC) frameworks [36], which require an optimization problem to be solved iteratively and expect a significant number of model predictions to be made in order to converge to the optimal solution at each timestep.

Delta values of the last position of each sample were used as the model's output, while the first nine positions were the model's input

$$\left\{ \begin{array}{cccc} \text{\reflectbox{ $\upsilon$ }}\_{\text{v},t} \\ \text{\reflectbox{ $\upsilon$ }}\_{\text{x},t} \\ \text{\reflectbox{ $\upsilon$ }}\_{k+1} \end{array} \right\} = RBF \text{ NN} \left( \begin{array}{cccc} \text{\reflectbox{ $\upsilon$ }}\_{k} & \text{\reflectbox{ $\upsilon$ }}\_{k} & \text{\reflectbox{ $\upsilon$ }}\_{k} & \dots & \text{\reflectbox{ $\upsilon$ }}\_{k-8} & \text{\reflectbox{ $\upsilon$ }}\_{k-8} \end{array} \right) \tag{11}$$

The <sup>Δ</sup><sup>ˆ</sup> *y v*,*t <sup>k</sup>*+<sup>1</sup> and <sup>Δ</sup><sup>ˆ</sup> *x v*,*t <sup>k</sup>*+<sup>1</sup> values may be added to the last input position to calculate the final predicted vessel position. Based on the above procedure, the results of the modeling process produced an RBF model of very high accuracy [31]. The step-by-step procedure of the modeling stage can be seen in Algorithm 2.

Note that the number of past inputs was determined after a trial-and-error procedure, where several RBF models were trained using a different number of inputs. After testing inputs in the range of 3 to 15 past vessel positions, data obtained on model performance showed that using less than nine inputs produced models with reduced prediction accuracy, while using more than nine inputs increased the model's complexity without any significant accuracy gain compared to the model using nine inputs.

**Algorithm 2** Modeling Stage

1: Load final preprocessed dataset.

2 : Replace the final value of all included 10 −

position trajectories with the respective delta value according to - Δ*yv*,*<sup>t</sup>* 10 Δ*xv*,*<sup>t</sup>* 10 . = *yv*,*<sup>t</sup>* <sup>10</sup> <sup>−</sup> *<sup>y</sup>v*,*<sup>t</sup>* 9 *xv*,*<sup>t</sup>* <sup>10</sup> <sup>−</sup> *<sup>x</sup>v*,*<sup>t</sup>* 9 .

,

so that each trajectory sample is in the form /

*yv*,*<sup>t</sup>* <sup>1</sup> *<sup>x</sup>v*,*<sup>t</sup>* <sup>1</sup> *<sup>y</sup>v*,*<sup>t</sup>* <sup>2</sup> *<sup>x</sup>v*,*<sup>t</sup>* <sup>2</sup> ... *<sup>y</sup>v*,*<sup>t</sup>* <sup>9</sup> *<sup>x</sup>v*,*<sup>t</sup>* <sup>9</sup> <sup>Δ</sup>*yv*,*<sup>t</sup>* <sup>Δ</sup>*xv*,*<sup>t</sup>*

3: Randomly permute the trajectory samples of each vessel.

4: Split the trajectory samples of each vessel into training, validation, and testing subsets (in this work a 50%–5%–25% percentage split is used). Do this so that all vessels contribute to all three subsets according to the chosen splitting.

0 .

5: Merge all subset samples, e.g., all training samples of all vessels together in one single dataset that will be used for training. Do the same for the validation and testing subsets.

6: Normalize the inputs and outputs of the training subset. Apply the normalization coefficients to the validation and testing subsets.

7 : Apply the fuzzy means algorithm on the training and validation dataset using the nine first sets of *yv*,*<sup>t</sup>*

<sup>−</sup> *<sup>x</sup>v*,*<sup>t</sup>* values as inputs and the last set of <sup>Δ</sup>*yv*,*<sup>t</sup>* <sup>−</sup> <sup>Δ</sup>*xv*,*<sup>t</sup>* values as output.

8: Final model is in the form of Equation (11).

Moreover, a series of tests has been performed by the recurrent application of this model based on a horizon of 5 timesteps for all trajectories of the testing subset, where, at each successive timestep, the model had to use an increasing number of its own previous predictions. As the model uses more of its past predictions, accuracy decreases due to the enlargement of the propagated prediction error. Such a test can provide intuition on the models' ability to be incorporated in receding horizon predictive frameworks. The quality metrics used for these tests were the root mean squared error (RMSE) and the root mean squared haversine formula distance (RMSHFD). The haversine formula is commonly used to measure great circle distances on spherical surfaces. Table 1 presents the performance metrics obtained after the recurrent application of the chosen model in order to make predictions for the full length of the trajectories included in the testing subset of the training procedure. Mean RMSE values for the two outputs of the model, namely the latitude and longitude, are provided in degrees, wherein can be seen that the error lies in the order of 1.5 thousandth of a degree. The mean RMSHFD metric shows the respective error margin in meters when combining the two model outputs to get the actual predicted future vessel position for all tested trajectories. More details on the modeling procedure for the one-step ahead models, including detailed results and comparison with other machine learning approaches, can be found in [31].

**Table 1.** Performance metrics of the produced RBF NN model.


#### **3. MPC for Multi-Ship Collision Avoidance**

#### *3.1. Preliminaries on Maritime Collision Avoidance and Trajectory Generation*

The objective of maritime collision avoidance is the generation of a risk-free trajectory that the controlled vessel should follow. A well-defined and effective method of assessing collision risk in the near future is the closest point of approach (CPA). Stemming from the concept of the CPA, two metrics are defined: time to CPA (TCPA) and distance to CPA (DCPA) (see Figure 2). A discussion regarding the quick calculation of TCPA and DCPA using the line-of-sight (LOS) distance between the controlled vessel and the obstacle ship is presented in [5]. These metrics depict the urgency of the collision danger of vessel *i* with another vessel *j* as well as its magnitude, and by specifying lowest acceptable thresholds *dmin* and *tmin* concerning the minimum DCPA and minimum TCPA, respectively, one can construct a risk cost function, as presented in [5].

**Figure 2.** Illustration of the CPA metrics, as well as the LOS angle concept.

Here, *a*<sup>0</sup> is a scaling parameter, and *T<sup>i</sup>* denotes the trajectory matrix containing the x-y position of the *i* vessel for every timestep

$$T\_i = \left[ \begin{array}{cc} x\_1 & y\_1 \\ \vdots & \vdots \\ x\_n & y\_n \end{array} \right]. \tag{13}$$

By combining TCPA and DCPA, the spatial-temporal nature of a maritime collision risk with vessel *i* is successfully reflected. The physical interpretation of Equation (12) is that a candidate trajectory with larger minimum distance from an obstacle ship occurring at an earlier time will always be safer than a path with a smaller minimum distance and/or earlier time of occurrence. Common values for *tmin* and *dmin* are 10 min and 0.6 nm; because the present paper is concerned with collision avoidance in busy waterways such as ports, a lower *dmin* value of 0.4 nm is used. In any case, Equation (12) can be readily incorporated in the cost function of an MPC optimization problem formulation.

A second item in the domain of trajectory generation is efficiency. Vessels should strive to not deviate too much from their original course when addressing a collision risk with another vessel. The efficiency of the generated trajectory *T<sup>i</sup>* for vessel *i* can be reflected by calculating the sum of absolute deviations from the original trajectory *TOG*,*<sup>i</sup>*

$$f\_{d,i} = \left\|{T\_{OG,i} - T\_i}\right\|.\tag{14}$$

Next, an important requirement to be fulfilled when addressing the problem of collision avoidance are the COLREGs [4]. The implementation of the COLREGs restricts the domain of possible candidate paths according to the type of encounter, for example 'head-on,' 'crossing,' and 'overtaking.' Head-on vessels should pass each other on the port side, while a vessel crossing from the starboard side should be given the right of way. A visual depiction of the encounter rules takes place in Figure 3. Multiple approaches for

the modeling of the COLREG rules have been made in the literature [2,5,8], although these are usually concerned with a one-step-ahead calculation. However, for the case of an MPC controller, in order to ensure COLREG compliance for a candidate trajectory, all of its waypoints must be taken into account. By assuming that the LOS angle is increasing in the anti-clockwise direction, one needs to evaluate whether the LOS angles of each sequential trajectory timestep position are increasing monotonically, in order to confirm the compliance of the trajectory for the 'head-on' and 'give-way' situations.

**Figure 3.** (**a**) A head-on situation between two ships; (**b**) a crossing situation between two ships (give-way); the orange ship must give way to the crossing ship on its starboard side.

The idea is depicted in Figure 4, where a head-on encounter between vessels *i* and *j* occurs; here, the LOS angles for trajectory *T<sup>i</sup>* monotonically increase, and therefore, it is deemed as compliant. In contrast, the monotonically decreasing LOS angles of the *T i* trajectory confirm its non-compliance as per the COLREGs intentions. A penalty for noncompliance of a vessel *i* encountering a vessel *j* in a 'head-on' or 'give-way' situation can therefore be formulated,

$$P\_{ij} = \begin{cases} \quad 1, & \text{if } \mathfrak{a}\_{LOS\_{ij}} \searrow \\ \quad 0, & \text{if } \mathfrak{otherwise} \end{cases} \tag{15}$$

where *aLOSij* is the LOS angle vector, calculated for each trajectory point of encountering vessel *i* and the current position of encountered vessel *j*.

**Figure 4.** Head-on situation between vessels *i* and *j*; the LOS angle can be used to assess the COLREG compliance of a candidate trajectory.

Next, the generated vessel trajectory, apart from being safe and COLREG compliant, should also take into account the maneuvering capabilities of the controlled vessel, i.e., it should be guaranteed that the trajectory is technically possible to be tracked by the vessel. The feasible search domain of the trajectory optimization problem can be constructed by a purely geometric approach in the case of a one-step-ahead calculation, such as in [5], where the design variables are the vessel's next position and course. However, the extension of this geometric approach to multiple-steps-ahead requires the application of nonlinear

constraints that would bound every sequential vessel position with its previous one, in order to enforce technical feasibility. For this reason, a model-based approach is preferred. The Nomoto models constitute a class of vessel course models that are tailored for this task, and have been widely adopted, not only for the design of collision avoidance schemes [47], but also for path tracking controllers [48]. The 1st order linear Nomoto model is shown as follows:

$$T\_s \frac{d\omega}{dt} + \omega = K\_s a. \tag{16}$$

Here, *ω* is the angular velocity of the vessel, while *a* is the control input to the vessel's rudder. The maneuvering capabilities of the vessel are reflected by the *Ts* and *Ks* constants, called time constants and rudder gain constants, respectively, while typical values are in the [0.5, 2] range for both. Solving the differential Equation (16) by assuming constant rudder angle input for a *t* time interval, the 1st order linear Nomoto model can be discretized as follows [8]:

$$
\Delta\theta(t) = K\_s \left( t - T\_s + T\_s \exp\left(\frac{t}{T\_s}\right) \right) \tag{17}
$$

Here, Δ*θ* is the course change that would occur if a control input of *a* was applied and held for a time period of *t*. By setting this time period *t* as the discretization interval Δ*t*, a course model can be used to create a discrete vessel position model as follows

$$\begin{aligned} \theta\_{k+1} &= \theta\_{k-1} + \Delta\theta\_k(a\_k) \\ x\_{k+1} &= x\_k + \cos(\theta\_{k+1}) V\_k \,\Delta t \\ y\_{k+1} &= y\_k + \sin(\theta\_{k+1}) V\_k \,\Delta t \end{aligned} \tag{18}$$

Here, *θk*, *xk*, *yk* is the current course, horizontal displacement, and vertical displacement according to a global reference frame, respectively, while *Vk* is the vessel velocity. The discretization interval Δ*t* can be set according to the simulation resolution required. Equation (18) constitute a discrete position model *Li* for the *i*-th vessel,

$$\mathbf{x}\_{i}(k+1) = L\_{i}(\mathbf{u}\_{i}, \mathbf{x}\_{i}(k)),\tag{19}$$

with input vector *u<sup>i</sup>* = *a V* and state vector *x<sup>i</sup>* = *θ x y* . By evaluating the discrete vessel position model *Li* for {1, 2, . . . , *n*} consecutive timesteps, where *n* the total timesteps, a trajectory *T<sup>i</sup>* can be created for the i-th vessel, as shown in Equation (13).

#### *3.2. Collision Avoidance with Mpc and Obstacle Trajectory Prediction Models*

The MPC framework has demonstrated its aptitude in handling the uncertainties and nonlinearities of the collision avoidance problem multiple times in the literature [9,49]; however, no other works have incorporated a nonlinear data-driven obstacle trajectory prediction model in their formulation. In MPC, the optimal moves of the controlled vessels are calculated for multiple steps ahead by solving a constrained optimization problem, with constraints in real time, for each controller sample time *tcst*. The cost function of the optimization problem is constituted by two horizons, namely the prediction horizon *hp* and the control horizon *hc*; the first accounts for the total discrete timesteps ahead that the model can be evaluated, while the second for the number of timesteps that the control variables can be modified. Given a set of controlled vessels *Vc* = {1, 2, . . . , *Nc*} and a set of non-controlled or obstacle vessels *Vo* = {1, 2, . . . , *No*} where *Nc* and *No* are the total number of controlled and non-controlled vessels, respectively, the optimization problem's cost function can be formulated as the summation of all the cost functions of the respective controlled vessels for the *k*th timestep:

$$\begin{array}{lcl}\min\_{\mathbf{U}(k)}\sum\_{i\in V} \mathbf{C}\_{i}(\mathbf{X}\_{\varepsilon}(k), \mathbf{X}\_{o}(k))\\ \text{s.t. } \mathbf{U}\_{\mathsf{u}} \le \mathbf{U}(k) \le \mathbf{U}\_{l} \\\ \mathbf{N}(\mathbf{X}\_{\varepsilon}(k), \mathbf{X}\_{o}(k)) \ge \mathbf{d}\_{\varepsilon} \\\ \mathbf{P}(\mathbf{X}\_{\varepsilon}(k), \mathbf{X}\_{o}(k)) = 0 \end{array} \tag{20}$$

Here, *U*(*k*) is the input matrix and is created by the horizontal concatenation of the input vectors of all controlled vessels *Vc*, up until the control horizon *hc*:

$$\mathbf{U}(k) = \begin{bmatrix} \mathfrak{u}\_1(k) & \mathfrak{u}\_1(k+1) & \cdots & \mathfrak{u}\_1(k+h\_c-1) \\ \vdots & \ddots & \vdots \\ \mathfrak{u}\_{N\_c}(k) & \mathfrak{u}\_{N\_c}(k+1) & \cdots & \mathfrak{u}\_{N\_c}(k+h\_c-1) \end{bmatrix}. \tag{21}$$

Next *Xc*(*k*) and *Xo*(*k*) are the controlled and non-controlled vessel state matrices, respectively, and are created by the horizontal concatenation of the state vectors of all controlled and non-controlled vessels *Vc* and *Vo*, respectively, up to the prediction horizon *hp*.

$$\begin{aligned} \mathbf{X}\_{\varepsilon}(k) &= \begin{bmatrix} \mathbf{x}\_{\varepsilon,1}(k) & \mathbf{x}\_{\varepsilon,1}(k+1) & \cdots & \mathbf{x}\_{\varepsilon,1}\left(k+h\_{p}-1\right) \\ \vdots & \ddots & \vdots \\ \mathbf{x}\_{\varepsilon,N\_{\varepsilon}}(k) & \mathbf{x}\_{\varepsilon,N\_{\varepsilon}}(k+1) & \cdots & \mathbf{x}\_{\varepsilon,N\_{\varepsilon}}\left(k+h\_{p}-1\right) \\ \mathbf{x}\_{o,1}(k) & \mathbf{x}\_{o,1}(k+1) & \cdots & \mathbf{x}\_{o,1}\left(k+h\_{p}-1\right) \\ \vdots & \ddots & \vdots \\ \mathbf{x}\_{o,N\_{\varepsilon}}(k) & \mathbf{x}\_{o,N\_{o}}(k+1) & \cdots & \mathbf{x}\_{o,N\_{\varepsilon}}\left(k+h\_{p}-1\right) \end{bmatrix} \end{aligned} \tag{22}$$

For simplicity, because consecutive states *xi*(*k*) up to *x<sup>i</sup> k* + *hp* − 1 constitute a single trajectory *Ti*(*k*), one can write *Xc*(*k*) and *Xo*(*k*) as the concatenation of the trajectories of the respective vessel sets *Vc*, *Vo* as per Equation (13):

$$\mathbf{X}\_{\mathcal{E}}(k) = \begin{bmatrix} \ & T\_{c,1}(k) \\ \vdots \\ \ & T\_{c,N\_{\mathcal{E}}}(k) \end{bmatrix} \quad \mathbf{X}\_{\mathcal{o}}(k) = \begin{bmatrix} \ & T\_{o,1}(k) \\ \vdots \\ \ & T\_{o,N\_{\mathcal{B}}}(k) \end{bmatrix}. \tag{23}$$

Next, *Ci*(*k*) is the cost function of the *i*-th controlled vessel, formulated as follows:

$$\mathbf{C}\_{i}(k) = F\_{i}(\mathbf{X}(k)) + a\_{G}\mathbf{G}^{2}(\mathbf{U}\_{i}(k)) \tag{24}$$

Here, *X*(*k*) is the vertical concatenation of the two state matrices *Xc*(*k*), *Xo*(*k*), containing the trajectories of all vessels *V* = *Vc* 1 *Vo*

$$X(k) = \begin{bmatrix} \ T\_{c,1}(k) & \dots & \ T\_{c,N\_c}(k) & T\_{o,1}(k) & \dots & T\_{o,N\_0}(k) & \end{bmatrix} \tag{25}$$

The cost function is comprised by two terms *Fi* and *G*, each concerned with the prediction and control horizon, respectively. The presence of *G* term, weighted by the *aG* parameter, encourages the smoothness of the control actions and, consequently, the generated trajectories of the controlled vessels

$$G(\mathcal{U}\_i(k)) = \sum\_{j=1}^{h\_\epsilon - 1} \left\| \mathcal{U}\_{i, j+1}(k) - \mathcal{U}\_{i, j}(k) \right\|. \tag{26}$$

Term *Fi* consolidates the collision avoidance and course keeping objectives, and is specific to the *i*-th vessel

$$F\_{\bar{i}}(\mathbf{X}(k)) = a\_r \sum\_{j \in V \backslash i} \left( f\_{r, \bar{i}j} ^2(\mathbf{X}\_i(k), \mathbf{X}\_j(k)) \right) \frac{1}{|V \backslash i|} + a\_d \, f\_{d, \bar{i}} ^2(\mathbf{X}\_i(k)). \tag{27}$$

In Equation (26), *fr*,*ij* is the collision risk between the *i*-th and the *j*-th vessel, as calculated using their respective trajectories *Xi*(*k*), *Xj*(*k*) by applying Equation (12), and *fd*,*<sup>i</sup>* is the deviation from the original trajectory *TOG*, *<sup>i</sup>*, as expressed in Equation (14). Both terms are weighted by the *ar* and *ad* weighting parameters, respectively. Since we are concerned with the safety of the generated trajectory throughout the whole prediction horizon, the mean

collision risk from all vessels in *V*\*i* is evaluated, in contrast to other approaches [5], where only the maximum collision risk at time *k* is minimized. This way, all possible collision risks are addressed and reduced simultaneously, thus avoiding the adverse possibility of evading one collision risk and increasing another. Moreover, the reason that risk avoidance is used as a control objective in Equation (27) and not as a hard optimization constraint is to ensure that the MPC optimization problem of Equation (20) will not fail in the case of the existence of an inescapable collision risk; as shown in Equation (12), risk is a function of distance to CPA, meaning that the controller will continue to attempt to maximize that distance, thus fulfilling the control intention in such an encounter. However, in order to guarantee that collisions will be avoided, one more constraint to the MPC optimization problem is added by setting an emergency distance d*<sup>e</sup>* (where d*<sup>e</sup>* < *dmin*); to be more specific, the vector *N* contains the DCPAs of all controlled vessels *Vc*, which are required to be above the emergency distance.

At this point, it must be noted that since the state matrix *X*(*k*) consolidates all vessel trajectories, controlled and non-controlled alike, a degree of cooperation is induced between the respective controlled vessels *Vc*. Lastly, returning to the optimization problem denoted in Equation (20), the *U*(*k*) input matrix is bounded by the upper and lower matrices *Uu***,** *Ul*, respectively. The vector *P* contains the COLREG non-compliance penalties for the controlled vessels *Vc* as calculated in Equation (15), which are required to be zero via an equality constraint.

The next item to be addressed regarding the MPC formulation is the used model that maps the input variables *U* to the state variables of the controlled vessels *Xc*. Here, the 1st order linear Nomoto model is used, as described in Equation (18), with the addition of input noise that accounts for modeling error *e* and environmental parameters:

$$\begin{aligned} \mathbf{x}\_{\varepsilon,i}(k+1) &= \hat{L}\_i(\mathbf{u}\_{i\prime}, \mathbf{x}\_{\varepsilon,i}(k)), \text{ where } i \in V\_{\varepsilon} \\ \hat{L}\_i(\mathbf{u}\_{i\prime}, \mathbf{x}\_{\varepsilon,i}(k)) &= L\_i(\mathbf{u}\_i + \varepsilon(\mathbf{u}\_i), \mathbf{x}\_{\varepsilon,i}(k)), \text{where } \varepsilon(\mathbf{u}\_i) = \mathbf{u}\_i \odot (0, \sigma^2). \end{aligned} \tag{28}$$

Here, *G* is a random variable sampled from a Gaussian distribution with a standard deviation of *σ*. Finally, the state matrix of the non-controlled vessels *Xo*(*k*) is assumed to be unknown for the scope of this research, and thus, an estimation is required, based on past positions. For this task, the RBF prediction model presented in Section 2.3 is employed for each non-controlled vessel *j*, and its trajectory for the *k*-th timestep is estimated using its past nine positions:

$$\hat{T}\_{o,j}(k) = RBF\{\mathbf{x}\_{o,j}(k), \mathbf{x}\_{o,j}(k-1), \dots, \mathbf{x}\_{o,j}(k-9)\}, \text{ where } j \in V\_o. \tag{29}$$

Next, in order to alleviate a possible computational burden for the MPC optimization problem, an important assumption should be made. The formulation of the control scheme as-presented would give rise to a high-dimensional search space for the MPC optimization problem, thus greatly hindering its effective solution. It is assumed then that all vessels retain their initial speed, with the only controllable variable being the vessel's rudder angle; this way, the total number of control variables is reduced by half. This approach to the collision avoidance problem has occurred in the literature [5], and is not simplistic for two reasons: first, good seafaring practice dictates that course change maneuvers are preferred over speed ones, not only because they conserve energy, but also because they better emphasize the intentions of the vessel to outside observers, such as other vessels in the vicinity. Second, since large container ships will be examined in the scope this case study, their large longitudinal inertia [48] confirms the assumption that the speed remains almost constant during the timeframe of a typical collision avoidance maneuvering scenario. Therefore, for the scope of this paper, the input matrix *U* at timestep *k* is formulated as follows:

$$\mathbf{U}(k) = \begin{bmatrix} a\_1(k) & a\_1(k+1) & \cdots & a\_1(k+h\_c-1) \\ \vdots & \ddots & \vdots \\ a\_{N\_c}(k) & a\_{N\_c}(k+1) & \cdots & a\_{N\_c}(k+h\_c-1) \end{bmatrix}' \tag{30}$$

where *ai*(*k*) is the rudder angle of vessel *i* at timestep *k.*

Having defined all aspects of the MPC optimization problem, a reiteration of the challenges of the collision avoidance control problem and how they are addressed by the controller is in order: firstly, the goal of the control design is to generate trajectories for the controlled vessels that are risk-free (Equation (12)), smooth (Equation (26)), COLREGcompliant (Equation (15)), and do not deviate from the original course (Equation (14)). Possible collision risks are assessed by utilizing trajectory predictions for non-controlled (obstacle) vessels in the vicinity. The controllable variables are the rudder angles of the vessels (vessel speed is considered constant), while a discrete 1st order Nomoto model (Equation (28)) is used for the modeling of the vessel dynamics, which was also infused with a noise signal for the purpose of accounting for uncertainties and environmental factors. The aforementioned vessel dynamics model has been compared to its higher-order nonlinear counterparts in [50], and it was shown that vessel course inaccuracies occur only for high yaw rates. Given the fact that the proposed collision avoidance method is concerned with large vessels with slow dynamics, the used vessel dynamics model is deemed adequate for the case. In addition, MPC has shown to be robust against model uncertainties or input noise [36]. Finally, the constraints that must be adhered to when searching for the optimal solution (Equation (20)) are the technical bounds on the controlled variables (i.e., maximum and minimum rudder angles) and the COLREG compliance of the result trajectory.

#### *3.3. Control Framework*

Having presented the proposed MPC controller, this section describes its integration within a general control framework. As shown in Figure 5, the framework is comprised by an offline and an online process. The offline process corresponds to the RBF trajectory prediction model training, using data from a specific area of interest (for example, a port) naturally then, it could be undertaken by the port authority. The online process corresponds to the real-time control of autonomous vessels in the presence of obstacle vessels in the area of interest. The MPC collision avoidance controller, as described in Section 3.2, is integrated here and is supplied with real time trajectory predictions of all obstacle vessels in order to calculate the optimal control actions for the controlled vessels. Since the RBF trajectory prediction model has been trained offline in the port authority premises, it is sensible to place the MPC controller there too, and to communicate the computed control actions per control timestep via a communications link with the controlled vessels. Figure 6 demonstrates this concept.

The MPC optimization problem described in Equation (20) is solved using the sequential quadratic programming (SQP) algorithm, which involves iterative calls to the objective function [35]. As shown in Figure 5, the integration of the MPC controller in the control framework requires the calculation of the obstacle vessel trajectory predictions for every controller timestep. Therefore, two main sources of computational complexity arise: the first is the evaluation of the RBF trajectory prediction model, which is shown to be in the order of magnitude of milliseconds [31], meaning that multiple obstacle vessels can be accounted for by the control scheme. The second is the solution of the optimization problem (Equation (20)) by the SQP algorithm, which is known to converge quickly and with few objective function calls [51]. It is concluded that a typical controller timestep duration, comprised by the two aforementioned sources, will not exceed the order of magnitude of seconds, which is considered reasonable given the slow dynamics of large vessels.

**Figure 5.** The proposed control framework.

**Figure 6.** Communications within the proposed control framework.

#### **4. Case Study**

In this section, the performance of the proposed multi-ship MPC controller is assessed using real-life obstacle ship trajectories, which were sourced and preprocessed as described in Section 2.4. In order to underline the importance of using sophisticated trajectory prediction models in the context of collision avoidance controller design, the proposed method is compared to an MPC controller that uses straight-line predictions for the trajectories of obstacle ships based on their current course and speed [9]. To this end, two crossing scenarios are examined, while performance indicators of the generated trajectories are extracted and discussed in detail. The simulations were coded and executed on Matlab 2020b, on a computer with an Intel i7 processor and 16 GB RAM. The simulation sample time is 30". Lastly, the tuning and parameters of the methods are shown in Table 2, while the vessel parameters are shown in Table 3.


**Table 2.** MPC tuning parameters.

**Table 3.** Vessel parameters.

