Determination of Ship Collision Avoidance Timing Using Machine Learning Method

Zhou, Yu; Du, Weijie; Liu, Jiao; Li, Haoqing; Grifoll, Manel; Song, Weijun; Zheng, Pengjun

doi:10.3390/su16114626

by Yu Zhou ^1,2,3, Weijie Du ^4, Jiao Liu ^1,2,3, Haoqing Li ^1,2,3, Manel Grifoll ^5, Weijun Song ^6,* and Pengjun Zheng ^1,2,3,*

^1 Faculty of Maritime and Transportation, Ningbo University, Ningbo 315211, China

^2 Collaborative Innovation Center of Modern Urban Traffic Technologies, Southeast University, Nanjing 211189, China

^3 National Traffic Management Engineering & Technology Research Center Ningbo University Sub-Center, Ningbo 315832, China

^4 Ningbo Pilot Station, Ningbo 315040, China

^5 Barcelona School of Nautical Studies, Universitat Politècnica de Catalunya (UPC—BarcelonaTech), 08034 Barcelona, Spain

^6 Ningbo Liwan New Material Co., Ltd., Ningbo 315812, China

^* Authors to whom correspondence should be addressed.

Sustainability 2024, 16(11), 4626; https://doi.org/10.3390/su16114626

Submission received: 22 April 2024 / Revised: 26 May 2024 / Accepted: 27 May 2024 / Published: 29 May 2024

(This article belongs to the Special Issue Sustainable Maritime Transportation)

Abstract

Accurate timing of collision avoidance actions is crucial for preventing maritime collisions. Traditional methods often rely on collision risk assessments, using quantitative indicators like the Distance to the Closest Point of Approach (DCPA) and the Time to the Closest Point of Approach (TCPA). Ship Officers on Watch (OOWs) are required to execute avoidance maneuvers once these indicators reach or exceed preset safety thresholds. However, the effectiveness of these indicators is limited by uncertainties in the maritime environment and the human behaviors of OOWs. To address these limitations, this study introduces a machine learning method to learn collision avoidance behavior from empirical data of ship collision avoidance, particularly in cross-encounter situations. The research utilizes Automatic Identification System (AIS) data from the open waters around Ningbo Zhoushan Port. After data preprocessing and applying spatio-temporal constraints, this study identifies ship trajectory pairs in crossing scenarios and calculates their relative motion parameters. The Douglas–Peucker algorithm is used to identify the timing of ship collision avoidance actions, and a collision avoidance decision dataset is constructed. The Random Forest algorithm is then used to analyze the factors affecting the timing of collision avoidance, and six key factors are identified: the distance, relative speed, relative bearing, DCPA, TCPA, and the ratio of the lengths of the give-way and stand-on ships. These factors serve as inputs for an XGBoost model enhanced with Particle Swarm Optimization (PSO), which together form the ship collision avoidance decision model. In addition, considering the inherent errors in any model and the dynamic nature of the ship collision avoidance process, an action time window for collision avoidance is introduced, which provides a more flexible time range for ships to make timely collision avoidance responses based on actual conditions and the specific encounter environment. This model provides OOWs with accurate timing for taking collision avoidance decisions. Case studies have validated the practicality and effectiveness of this model, offering new theoretical foundations and practical guidance for maritime collision avoidance.

Keywords: maritime transportation; collision avoidance time window; AIS data; PSO-XGBoost; COLREGS; machine learning

1. Introduction

Statistical analyses identify many causes of ship collisions, including human factors, environmental factors, ship factors, and factors related to the cargo carried on board, among which human factors are the leading cause of maritime collision accidents. This is mainly due to inaccurate decision making by Officers on Watch (OOWs) and issues related to the timing of collision avoidance [1]. Given these findings, there is significant practical importance in exploring new ideas and methods for determining collision avoidance timing.

Collision avoidance behavior is a dynamic decision-making process that involves several critical steps [2]: deciding when to take action, determining the appropriate action, choosing a new course, deciding when to return to the original course, and selecting a course for repositioning. Among these steps, timing the collision avoidance maneuver is particularly crucial. If an OOW initiates avoidance too early, the vessel may deviate unnecessarily from its course, leading to a longer journey and economic inefficiencies. Conversely, delaying avoidance action can result in emergencies, dangers, or even collisions.

Determining the precise moment for avoidance is challenging, especially for less experienced OOWs who may struggle with making such judgments under pressure. Consequently, ship OOWs often prefer to rely on specific, quantifiable guidelines to inform their actions. However, research on the optimal timing for ship avoidance maneuvers is limited, making it difficult to establish precise and universally applicable rules for this critical aspect of maritime safety.

The primary aim of this study is to enhance maritime navigation safety by reducing the incidence of ship collisions and mitigating the impact of human factors in the collision avoidance decision-making process. To achieve this, this study seeks to provide Officers on Watch (OOWs) with rational and data-driven collision avoidance timing. By focusing on a data-driven approach, this research aims to improve the precision and efficiency of decision-making processes related to ship collision avoidance. Ultimately, the goal is to support OOWs with reliable timing information that can significantly enhance their ability to make timely and effective collision avoidance decisions, thereby contributing to safer maritime navigation.

This paper introduces a novel method to determine the timing of ship collision avoidance actions for give-way ships in crossing situations, utilizing the Particle Swarm Optimization and eXtreme Gradient Boosting (PSO-XGBoost) model to dynamically establish collision avoidance timing. Considering the myriad of factors affecting ships during navigation, this research does not pinpoint a single time point for collision avoidance; instead, it introduces the concept of an “action time window.” This time window accommodates the operational characteristics of most OOWs during the collision avoidance decision-making process and provides a buffer for OOWs to select the most appropriate timing for collision avoidance within this window.

Unlike most methods, this study utilizes a combination of extensive Automatic Identification System (AIS) data and machine learning techniques to investigate the timing of ship collision avoidance. This approach not only considers various factors that influence the timing of collision avoidance, but also places special emphasis on the key variable of the lengths of the give-way and stand-on ships, which has often been overlooked in previous research. By considering these factors comprehensively, the model developed in this study can more accurately assist navigators in determining the optimal timing for collision avoidance, thereby reducing the uncertainty introduced by subjective judgments. Furthermore, the collision avoidance timing window proposed by this study provides navigators with a reasonable operational buffer period. This not only enhances the model’s practicality, but also offers additional safety assurance for maritime navigation.

The structure of the paper is organized as follows: The introduction is followed by Section 2, which reviews studies on the timing of ship collision avoidance. Section 3 describes the research methodology employed in this study. Section 4 details the practical application results of the model. Section 5 discusses the strengths and weaknesses of the approach taken in this paper. Finally, Section 6 concludes with the main findings and suggests directions for future research.

2. Literature Review

2.1. Identification of Collision Avoidance Behavior from AIS Data

The widespread use of AIS in the nautical field has led many scholars to analyze historical ship voyage data to study ship collision avoidance. As a result, various methods for identifying collision avoidance behaviors have been developed.

Yang Zhou et al. [3] have categorized collision avoidance behaviors into two types: steering and deceleration. They introduced a sliding window method applicable to different scenarios of ship encounters, such as pursuit, crossing, and pair encounters. This method identifies the key moment for taking collision avoidance action based on changes in the ship’s heading and speed, with thresholds set at 0.4 times the ship’s beam for heading changes, and 10% of the original speed for speed changes. Bin You et al. [4] focused on the AIS data from ships near the Zhoushan Islands in the Zhejiang Province. They determined collision avoidance behaviors based on the spatio-temporal distribution of encounter scenarios and changes in the heading and speed. The method uses the moment a ship changes its heading and speed when the Time to Closest Point of Approach (TCPA) is greater than zero as the start of collision avoidance, and the change when TCPA is less than zero as the end of the maneuver. Po-Ruey Lei et al. [5] developed a linear regression-based feature extraction method for collision avoidance maneuvers, which smooths the heading differences between two adjacent points using linear regression and then identifies potential maneuvers using the zero-center method.

Jinfen Zhang et al. [6] proposed a two-stage method using the Dynamic Programming algorithm. The first stage processes trajectory points to identify approximate steering points of the ship. In the second stage, they calculate the distance and bearing between each neighboring point to determine the approximate and then the exact timing and amplitude of collision avoidance. Rong et al. [7] introduced a sliding window algorithm that combines three variables: the relative distance between colliding ships, their turning speed, and the derivative of the turning speed. Collision avoidance behavior is identified starting from the moment of the shortest distance between ships, then moving the time window forward to detect any changes in behavior. Lei Du et al. [8] presented a method that combines the Dynamic Programming algorithm with conflict detection to recognize collision avoidance behavior. This method identifies the turning point of a ship’s trajectory, marking it as the point of action, typically a course change, and then checks for conflicts at these turning points to confirm collision avoidance actions.

Overall, these methods employ techniques like trajectory compression or sliding time windows to pinpoint changes in heading and speed during a voyage, marking them as critical times for initiating collision avoidance behaviors.

2.2. Timing of Collision Avoidance

Current studies on ship avoidance timing are categorized into two main approaches: ship domain-based approaches and indicator-based approaches. Each method offers a different perspective on how and when OOWs should engage in avoidance maneuvers, addressing the need to understand and effectively respond to hazardous situations.

2.2.1. Ship Domain-Based Approach

The concept of a ship’s domain, defined as a specific area around a ship that other vessels should avoid to prevent collisions, has been extensively explored and diversified in maritime research. This domain acts as a buffer zone, and its violation indicates a potential collision scenario. Researchers have proposed various shapes for these domains, such as elliptical, circular, and polygonal, each tailored to different navigational needs and scenarios.

Fujii and Tanaka [9] proposed an elliptical ship domain based on statistical methods, defining the long axis as eight times the length of the ship and the short axis as 3.2 times, particularly useful for analyzing traffic flow in specific channels or waterways. Conversely, Goodwin [10] used statistical methods to propose circular ship domains divided into three asymmetric sectors to study required encounter distances under the International Regulations for the Prevention of Collisions at Sea (IRPCS), focusing on ships traveling in distant seas.

Pietrzykowski [11] introduced a polygonal ship domain based on the dynamic functions of ship size and speed, offering another perspective on spatial navigation safety. Ning Wang [12] expanded on these ideas with the Quaternion Ship Domain (QSD) model, which considers four radii—forward, aft, starboard, and port—to encompass various factors affecting navigational safety, such as ship maneuverability and environmental conditions.

The application of AIS has led to further innovations in ship domain modeling. Hansen [13] developed an empirical minimum ship domain model by analyzing the southern Danish waters’ sailing distances over four years, aimed at ensuring comfortable sailing. Wang, Y [14] created a model for ships in confined waters with dynamically increasing polygonal regions around the ships based on their speed, enhancing the model’s responsiveness to real-time conditions.

Fan Zhang et al. [15] proposed a method for dynamically determining the collision avoidance domain of ships in inland waters by analyzing ship trajectories under different hydrological conditions and months using AIS data. The aim was to improve ship collision avoidance decisions and inland waterway traffic management. Dinh et al. [16] introduced a dual-part ship domain consisting of a quadrilateral “blocking zone” and a circular “action zone”, delineating areas for immediate action and heightened caution. Rafal Szlapczynski [17], using a large amount of ship encounter data from 36 locations, proposed a method for predicting the safe space (i.e., ship domain) required by ships in coastal waters. The aim is to provide decision support for ship operations and enhance maritime safety. Dongqin Liu et al. [18] proposed a Dynamic Quaternion Ship Domain (DQSD) model based on AIS data and the navigator’s state, which dynamically captures changes in the shape and size of the ship domain by considering factors such as ship maneuverability, the physical and mental states of the navigator, and environmental conditions. Silveira [19] improved the quaternion ship domain model by allowing each quadrant to have a different shape, aligning better with empirical data for cargo ships and tankers of various lengths. This model uses violations of the quaternion domain as indicators of collision risk.

These diverse approaches to defining ship domains reflect ongoing efforts to enhance maritime safety through empirical, knowledge-based, and analysis-driven methods. Szlapczynski et al. [20] noted that while ship domains are typically shaped by local traffic densities and navigational conditions, they are subject to the limitations of empirical data, subjective expert assessments, and the specific assumptions of each research approach.

2.2.2. Indicator-Based Approach

Indicator-based methods for determining the timing of ship collision avoidance primarily assess the risk of collision using key factors such as the relative distance, relative speed, TCPA, and distance to closest point of approach (DCPA). Depending on which factors are prioritized, various methods have been developed to accurately predict and implement collision avoidance maneuvers. Table 1 summarizes the specific factors selected by different authors along with the corresponding methods they developed for collision avoidance decisions.

The indicator-based approach to determining ship collision avoidance timing primarily relies on factors influencing ship collision avoidance actions. It employs various methods to construct a collision risk indicator, thereby assisting the OOWs in collision avoidance decisions. While this approach offers a concise and rapid method for selecting collision avoidance timing, it relies on predetermined threshold values to trigger the ship’s collision avoidance behavior. These threshold values vary across different methods and researchers. Additionally, the selection and number of influencing factors used in the models contribute to their precision—the more factors considered, the higher the potential accuracy.

Despite these methodologies, more research is needed on learning collision avoidance behavior from historical AIS data. Analyzing historical AIS data provides a realistic reflection of OOW operating characteristics and avoids the subjective biases often present in simulated data. However, most indicator-based methods do not consider ship length, a significant factor evident from various ship domain models, which also affects collision avoidance timing.

This paper proposes a machine learning-based method to determine a ship collision avoidance timing window using extensive data from successful avoidance maneuvers in open water cross-encounter scenarios. This data captures the actual decision-making processes of ship OOWs, including their subjective judgment, experience, and decision-making dynamics, thus providing a more holistic view of human factors in collision avoidance.

The method involves reconstructing cross-encounter scenarios from AIS data and using the Douglas–Peucker algorithm to identify critical turning points in the ship’s trajectory. These points mark crucial timing for collision avoidance decisions. A dataset was then constructed to recognize collision avoidance timing, and Random Forest (RF) analysis was used to evaluate the influencing factors, highlighting the effectiveness of the selected features. Finally, the PSO-XGBoost model was applied to a multi-input system to determine the optimal timing for collision avoidance. This paper also establishes a dynamic collision avoidance timing window rather than a fixed point, allowing for a more comprehensive consideration of the complex factors affecting a ship during its voyage. This method provides ship OOWs with more scientifically grounded and flexible support for making collision avoidance maneuvers.

3. Methods

The research methodology, illustrated in Figure 1 through a detailed flowchart, includes several key steps. Initially, to ensure data reliability, AIS data are preprocessed to create a dataset of ship trajectories. From this dataset, pairs of trajectories that represent potential collision scenarios of crossing situations are extracted based on their spatio-temporal characteristics. Following this extraction, the relative motion parameters of these ships are calculated. These parameters play a critical role in identifying the avoidance behavior of the ships, which is crucial for determining the timing window within which collision avoidance actions should occur. Once the ship collision avoidance timing is identified, these relative motion parameters are then utilized as input variables for the PSO-XGBoost model. This model is responsible for determining the timing for collision avoidance actions.

3.1. Data Preprocessing

The information recorded in AIS data includes: (1) static information, such as the Maritime Mobile Service Identity (MMSI) number, IMO number, ship name, radio call sign, ship type, overall length, and beam; (2) dynamic information, such as the UTC time, ship position, speed over ground (SOG), course over ground (COG), heading, and navigational status; and (3) voyage-related information, which encompasses the draught and destination. Because this study focuses on ship behavior, it uses the dynamic data, namely the UTC time, SOG, COG, heading, and position. Regarding static information, this study incorporates only the ship length. Given the inherent inaccuracies in the dataset’s meteorological and hydrological conditions, we have elected not to consider these factors in our analysis. COG indicates the direction in which the ship is moving over the ground, whereas heading indicates the direction in which the ship’s bow is pointing.

The reporting interval of AIS messages varies based on the ship’s speed and course alterations. In busy ports and waterways, AIS data are typically reported every 6 s, or at least every 10 s when ships are sailing at lower speeds. This frequent reporting allows for a detailed record of the ship’s behavior.

However, since each vessel transmits AIS messages at its own specific intervals, the data from different ships in the same area are not always synchronized. To analyze encounter situations accurately, it is necessary to standardize the AIS data by synchronizing ship trajectories to the same time stamp. Given the usual reporting intervals in waterways, all AIS data used in this study are resampled using linear interpolation at an interval of 10 s.
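The synchronization step can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `resample_track`, the use of seconds as the time unit, and the choice to align the grid to multiples of the step (so that all ships share the same time stamps) are assumptions. The sketch suits positions and speeds; angular quantities such as COG and heading would need wrap-around handling at 0°/360° that is not shown here.

```python
import math
from bisect import bisect_right

def resample_track(times, values, step=10.0):
    """Linearly interpolate one AIS quantity onto a regular time grid.

    times  : strictly increasing report times in seconds (assumed unit)
    values : the quantity reported at each time (e.g., x, y, or SOG)
    step   : grid spacing in seconds; 10 s is the interval used in the text
    Returns a list of (t, value) pairs at multiples of `step`.
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two matching time/value samples")
    out = []
    k = math.ceil(times[0] / step)  # first grid index inside the track
    while k * step <= times[-1]:
        t = k * step
        # Index of the report just after t, clamped to a valid segment.
        j = min(max(bisect_right(times, t), 1), len(times) - 1)
        t0, t1 = times[j - 1], times[j]
        w = (t - t0) / (t1 - t0)  # interpolation weight in [0, 1]
        out.append((t, values[j - 1] + w * (values[j] - values[j - 1])))
        k += 1
    return out
```

Because the grid is anchored to multiples of the step instead of each ship's first report, two independently resampled trajectories can be compared sample by sample.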

3.2. Ship Encounter Trajectory Data Extraction

3.2.1. Ship Encounter Scenario

The International Regulations for Preventing Collisions at Sea (COLREGS) provide detailed guidelines for vessels to avoid collisions in various navigational contexts. According to these regulations, when a vessel encounters potential collision risks with other ships, the encounter situation must be determined using relative bearing and heading information between the own ship and the approaching vessel. The COLREGS define three primary collision scenarios: head-on, overtaking, and crossing. In particular, a crossing situation occurs when the courses of the own vessel and the target vessel intersect with a course difference of 67.5°–174° or 186°–292.5°, together with a speed ratio between the vessels that leads to a heightened risk of collision. This paper specifically focuses on crossing situations involving two ships.
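The course-difference criterion quoted above can be written as a small predicate. This is only a sketch under stated assumptions: the function name `is_crossing` is hypothetical, the interval boundaries are treated as inclusive, the difference is taken as target course minus own course, and the speed-ratio condition mentioned in the text is not modeled because its range is not specified here.

```python
def is_crossing(course_own_deg, course_target_deg):
    """Return True if the course difference falls in the crossing ranges.

    The difference is normalized to [0, 360) degrees before testing the
    two ranges given in the text (67.5-174 and 186-292.5 degrees).
    """
    diff = (course_target_deg - course_own_deg) % 360.0
    return 67.5 <= diff <= 174.0 or 186.0 <= diff <= 292.5
```

For instance, an own course of 350° and a target course of 80° give a normalized difference of 90°, which falls in the first range.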

3.2.2. Extraction of Encounters in AIS Data

First, select a study area and note that ship A enters at t_{A}^{i} and exits at t_{A}^{j}, while ship B enters at t_{B}^{i} and exits at t_{B}^{j}. If the entry and exit times of the two ships satisfy Equation (1), that is, if the time intervals they spend in the study area overlap, it can be concluded that the two ships have had an encounter.

[t_{A}^{i}, t_{A}^{j}] \cap [t_{B}^{i}, t_{B}^{j}] \neq \emptyset

(1)
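Equation (1) is an interval-overlap test, which can be sketched in a few lines. The function and argument names are illustrative, and the intervals are assumed to be closed, so two ships whose presence in the area touches at a single instant still count as overlapping.

```python
def time_windows_overlap(a_enter, a_exit, b_enter, b_exit):
    """Check Equation (1): [a_enter, a_exit] and [b_enter, b_exit] intersect.

    Two closed intervals intersect exactly when the later of the two
    entry times is not after the earlier of the two exit times.
    """
    return max(a_enter, b_enter) <= min(a_exit, b_exit)
```

Pairs that pass this temporal test are only candidates; the spatial and course-difference filtering described in the text still has to be applied to isolate crossing situations.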

In the study of crossing situations, the focus is on extracting trajectory pairs from two ships. To accomplish this, the AIS data within the study area are filtered to isolate trajectories specific to the ship crossing situation. This refined dataset will then be utilized in the subsequent stages of the analysis.

3.3. Relative Motion Parameters

When ships encounter each other, the decision-making process to avoid a collision relies heavily on evaluating the relative motion parameters between the vessels. These parameters include the DCPA, TCPA, relative distance (D), relative bearing (a_T), and relative speed (Vr).

In terms of data presentation and analysis, each ship views itself as “own ship (S₀)”, while any other ships are considered “target ships (S_T)”. In the coordinate system described in Figure 2, the positions of the own ship (S₀) and the target ship (S_T) are represented by the coordinates (x₀, y₀) and (x_T, y_T), respectively. Additionally, the heading and speed of each vessel are denoted by θ₀, v₀ for the own ship, and θ_T, v_T for the target ship:

\left\{\begin{array}{l} v_{x0} = v_{0} \cdot \sin θ_{0} \\ v_{y0} = v_{0} \cdot \cos θ_{0} \end{array}\right.

(2)

\left\{\begin{array}{l} v_{xt} = v_{T} \cdot \sin θ_{T} \\ v_{yt} = v_{T} \cdot \cos θ_{T} \end{array}\right.

(3)

\left\{\begin{array}{l} v_{xr} = v_{xt} - v_{x0} \\ v_{yr} = v_{yt} - v_{y0} \end{array}\right.

(4)

v_{0T} = \sqrt{v_{xr}^{2} + v_{yr}^{2}}

(5)

θ_{0T} = \arctan \frac{v_{xr}}{v_{yr}} + α

(6)

α = \left\{\begin{array}{ll} 0^{\circ} & v_{xr} \geq 0, v_{yr} \geq 0 \\ 180^{\circ} & v_{xr} < 0, v_{yr} < 0 \\ 180^{\circ} & v_{xr} \geq 0, v_{yr} < 0 \\ 360^{\circ} & v_{xr} < 0, v_{yr} \geq 0 \end{array}\right.

(7)

D = \sqrt{(x_{T} - x_{0})^{2} + (y_{T} - y_{0})^{2}}

(8)

a_{T} = \arctan \frac{x_{T} - x_{0}}{y_{T} - y_{0}} + γ

(9)

γ = \left\{\begin{array}{ll} 0^{\circ} & x_{T} - x_{0} \geq 0, y_{T} - y_{0} \geq 0 \\ 180^{\circ} & x_{T} - x_{0} < 0, y_{T} - y_{0} < 0 \\ 180^{\circ} & x_{T} - x_{0} \geq 0, y_{T} - y_{0} < 0 \\ 360^{\circ} & x_{T} - x_{0} < 0, y_{T} - y_{0} \geq 0 \end{array}\right.

(10)

DCPA and TCPA can be calculated as:

\mathrm{DCPA} = D \sin (θ_{0T} - a_{T} - π)

(11)

\mathrm{TCPA} = D \frac{\cos (θ_{0T} - a_{T} - π)}{v_{0T}}

(12)
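Equations (2)–(12) can be sketched as one function. The names below are illustrative, and positions are assumed to be in a planar coordinate system in metres with x pointing east and y pointing north, speeds in m/s, and headings in degrees clockwise from north. `math.atan2` replaces the arctangent plus the quadrant corrections α and γ of Equations (7) and (10); after reduction to [0, 2π) it yields the same angles. The case of zero relative speed, which the equations leave undefined, is handled here by returning an infinite TCPA, which is an assumption.

```python
import math

def relative_motion(x0, y0, theta0_deg, v0, xt, yt, thetat_deg, vt):
    """Return D, a_T, v_0T, theta_0T, DCPA and TCPA for own and target ship."""
    th0, tht = math.radians(theta0_deg), math.radians(thetat_deg)
    # Equations (2)-(4): velocity components and relative velocity.
    vxr = vt * math.sin(tht) - v0 * math.sin(th0)
    vyr = vt * math.cos(tht) - v0 * math.cos(th0)
    # Equations (5)-(7): relative speed and its direction in [0, 2*pi).
    v_rel = math.hypot(vxr, vyr)
    theta_rel = math.atan2(vxr, vyr) % (2.0 * math.pi)
    # Equations (8)-(10): distance and bearing of the target.
    dist = math.hypot(xt - x0, yt - y0)
    bearing = math.atan2(xt - x0, yt - y0) % (2.0 * math.pi)
    # Equations (11)-(12): closest point of approach.
    angle = theta_rel - bearing - math.pi
    dcpa = dist * math.sin(angle)
    tcpa = dist * math.cos(angle) / v_rel if v_rel > 0.0 else math.inf
    return {
        "D": dist,
        "a_T_deg": math.degrees(bearing),
        "v_0T": v_rel,
        "theta_0T_deg": math.degrees(theta_rel),
        "DCPA": dcpa,
        "TCPA": tcpa,
    }
```

As a check, an own ship at the origin heading north at 5 m/s and a target at (1000, 1000) heading west at 5 m/s both reach the point (0, 1000) after 200 s, so the function should return a DCPA of approximately zero and a TCPA of approximately 200 s.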

3.4. Recognition of Timing for Collision Avoidance

Figure 3 illustrates an example of collision avoidance between two vessels in a crossing situation. The trajectory of the give-way ship reveals significant changes in heading during maneuvers. This study employs the Douglas–Peucker (DP) algorithm to identify these critical points in ship collision avoidance behavior.

The DP algorithm, developed by Douglas and Peucker, is a well-known vector data compression algorithm that effectively simplifies linear data. It achieves this by reducing the number of trajectory positions while preserving only the significant locations, making it ideal for compressing ship trajectory data and extracting key points [40,41]. This algorithm is particularly adept at maintaining the overall shape of the curve and avoiding local distortions that other methods may introduce due to the perpendicular distance limit. The compression steps of the DP algorithm are as follows:

(1): Set a compression threshold, d_max, and connect the first and last points of the curve with a straight line.
(2): Calculate the perpendicular distance from each intermediate point to this line, let P be the intermediate point with the largest distance d, and compare d with d_max.
(3): If d is greater than d_max, point P is retained and used as the reference point that splits the curve into two subsets for the next judgment; otherwise, the intermediate points are discarded.
(4): Repeat steps (2) and (3) on each subset until all the d values in the subsets are less than d_max, at which point the operation is terminated.

Figure 4 illustrates this algorithm. In part (a) of the figure, the original ship trajectory is shown with points from P₁ to P₁₁. In (b), a simplified trajectory is formed by directly connecting the start (P₁) and end (P₁₁) points, represented by a yellow line. The perpendicular distances from intermediate points P₂ to P₁₀ to this line are calculated. Point P₇, being the farthest, exceeds the threshold and is identified as a critical breakpoint. In part (c), the trajectory is split at P₇ into two sub-trajectories. This procedure is repeated to pinpoint new breakpoints, culminating in the final simplified trajectory shown in (d), which includes points P₁, P₄, P₇, P₁₀, and P₁₁.

In this research, we determined the ship collision avoidance decision timing using the following procedure:

(1): Identify the give-way vessel.
(2): Apply the DP compression algorithm to the trajectory of the give-way ship with a compression threshold of 60 m, and identify the turning points in the ship’s trajectory.
(3): Determine the collision avoidance timing: the first turning point is designated as the timing point for the ship’s collision avoidance behavior.
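The compression steps and the procedure above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the trajectory is assumed to be a list of (x, y) positions in metres in a planar projection, and an explicit stack is used in place of recursion. The 60 m default comes from the text.

```python
import math

def perpendicular_distance(p, a, b):
    """Distance from point p to the straight line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:  # degenerate chord: fall back to point distance
        return math.hypot(px - ax, py - ay)
    return abs(dx * (py - ay) - dy * (px - ax)) / length

def douglas_peucker(points, d_max):
    """Return the sorted indices of the points kept by DP compression."""
    if len(points) < 3:
        return list(range(len(points)))
    keep = {0, len(points) - 1}
    stack = [(0, len(points) - 1)]
    while stack:
        first, last = stack.pop()
        best_d, best_i = 0.0, None
        for i in range(first + 1, last):
            d = perpendicular_distance(points[i], points[first], points[last])
            if d > best_d:
                best_d, best_i = d, i
        # Keep the farthest point only if it exceeds the threshold,
        # then process the two sub-trajectories it creates.
        if best_i is not None and best_d > d_max:
            keep.add(best_i)
            stack.append((first, best_i))
            stack.append((best_i, last))
    return sorted(keep)

def first_turning_point(points, d_max=60.0):
    """Index of the first retained interior point, or None if there is none."""
    kept = douglas_peucker(points, d_max)
    interior = [i for i in kept if 0 < i < len(points) - 1]
    return interior[0] if interior else None
```

With time-stamped, resampled trajectories, the returned index maps directly to the time at which the give-way ship is taken to have started its avoidance action.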

3.5. Particle Swarm Optimization eXtreme Gradient Boosting

XGBoost uses the gradient boosting algorithm for higher performance, scalability, and robustness with respect to other models, and is able to handle large-scale datasets and provide accurate predictions. It is widely used in many practical applications and has achieved significant success in the field of data science [42]. Particle swarm optimization algorithms can effectively perform feature selection and parameter tuning through the process of searching for optimal solutions. Combining the two can automatically identify and select the most relevant features and find the best combination of parameters to improve the effectiveness of the model through better generalization.

3.5.1. XGBoost Algorithm

The eXtreme Gradient Boosting algorithm (XGBoost) was proposed by Dr. Tianqi Chen as an extension of the Gradient Boosting Decision Tree (GBDT) algorithm [43]. XGBoost is a variant of GBDT with improved computational speed and efficiency. At its core, it constructs the model by optimizing an objective function that combines a loss function with regularization terms characterizing the model complexity.

As an ensemble tree model, the XGBoost model consists of multiple Classification and Regression Trees (CART). The model’s output is the sum of the predictions from these trees, which serves as the final prediction of the XGBoost model [44]. The principles of the XGBoost model are as follows:

Assuming there are K trees, the ensemble model can be represented as:

\hat{y}_{i} = \sum_{k = 1}^{K} f_{k} (x_{i}), f_{k} \in F

(13)

where f_{k} is the k-th generated tree model function, F is the space of possible CART trees, and \hat{y}_{i} is the output of the XGBoost model for the i-th sample. For a given dataset with n samples and m features, we have:

D = \{(x_{i}, y_{i})\} (| D | = n, x_{i} \in R^{m}, y_{i} \in R)

(14)

where x_{i} represents the input of the i-th sample and y_{i} represents the corresponding output. The space F of CART trees can be represented as:

F = \{f (x) = w_{q (x)}\} (q : R^{m} \to T, w \in R^{T})

(15)

Here, q represents the structure of the CART tree, which maps a sample to a leaf node, T denotes the number of leaf nodes in the CART tree, and f (x) is determined by the tree structure q and the weights w of its leaf nodes. The establishment of the XGBoost model involves learning CART trees to determine their structures and weights.

By incorporating regularization, we can obtain the objective function of XGBoost, which can be expressed as:

O b j = \sum_{i = 1}^{n} l (y_{i}, y_{i}) + \sum_{k = 1}^{k} Ω (f_{k})

(16)

Here,

l (y_{i}, y_{i})

represents the loss function, which typically includes the mean squared error and logistic regression.

Ω (f_{k})

is the regularization term given by the following equation, which mainly constrains the depth of CART trees to reduce their complexity and prevent overfitting.

After t rounds of iterations, the loss function becomes:

O b j^{(t)} = \sum_{i = 1}^{n} l (y_{i}, y_{i}^{(i)}) + \sum_{i = 1}^{l} Ω (f_{i}) = \sum_{i = 1}^{n} l (y_{i}, y_{i}^{(i - 1)} + f_{i} (x_{i})) + Ω (f_{i}) + C

(17)

Expanding the above equation using Taylor series, we can obtain:

O b j^{(i)} \approx \sum_{i = 1}^{n} [l (y_{i}, y_{i}^{(i - 1)}) + g_{i} f_{i} (x_{i}) + \frac{1}{2} h_{i} f_{i}^{2} (x_{i})] + Ω (f_{i}) + C

(18)

g_{i} = \partial_{g^{(t - 1)}} l ({\hat{y}}_{i}^{(t - 1)}, y_{i})

(19)

where

g_{i}

is the first-order derivative of the Taylor expansion with respect to the sample and

h_{i}

is the second-order derivative of the Taylor expansion with respect to the sample. C is a constant term that does not affect optimization and can be omitted. After removing it, the new objective function becomes:

{\bar{O b j}}^{(t)} \approx \sum_{i = 1}^{n} [g_{t} f_{t} (x_{i}) + \frac{1}{2} h_{t} f_{t}^{2} (x_{t})] + Ω (f_{t})

(20)

The function defining the sum of squares of leaf node weights (w) is defined as a regularization term, as shown in the following equation:

Ω (f_{k}) = γ T + \frac{1}{2} λ \sum_{t = 1}^{T} w_{j}^{2}

(21)

From the above equation, we can obtain the objective function as follows:

{\bar{O b j}}^{(t)} = [G_{j} w_{j} + \frac{1}{2} (H_{j} + λ) w_{j}^{2}] + γ T

(22)

G_{j} = \sum_{i \in I_{j}} g_{i}, H_{j} = \sum_{i \in I_{j}} h_{i}

(23)

By setting the derivative with respect to

w_{j}

to zero, we can obtain the optimal leaf node weights and the corresponding value of the objective function:

w_{j}^{*} = - \frac{G_{j}}{H_{j} + λ}

(24)

Obj^{*} = - \frac{1}{2} \sum_{j = 1}^{T} \frac{G_{j}^{2}}{H_{j} + λ} + γ T

(25)
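As an illustration of Equations (23)–(25), the following minimal sketch computes the optimal leaf weights and the resulting objective value for a toy tree. The squared-error loss l = ½(y − ŷ)², the three samples, the two-leaf structure, and the values of λ and γ are all assumptions chosen for illustration; they are not taken from this study.

```python
def leaf_weight(grads, hess, lam):
    """Optimal weight of one leaf, w* = -G / (H + lambda), following Equation (24)."""
    G, H = sum(grads), sum(hess)
    return -G / (H + lam)


def structure_score(leaves, lam, gamma):
    """Objective of Equation (25) for a tree whose leaves are given as
    (gradients, hessians) pairs, one pair per leaf."""
    quality = sum(sum(g) ** 2 / (sum(h) + lam) for g, h in leaves)
    return -0.5 * quality + gamma * len(leaves)


# Toy data (hypothetical): three samples, all with previous prediction 0.5.
y_true = [1.0, 1.0, 0.0]
y_prev = [0.5, 0.5, 0.5]

# For the assumed loss l = 1/2 (y - y_hat)^2: g_i = y_hat - y, h_i = 1.
grads = [p - y for y, p in zip(y_true, y_prev)]
hess = [1.0] * len(y_true)

# Assumed tree structure: samples 0 and 1 fall in leaf A, sample 2 in leaf B.
leaf_a = (grads[:2], hess[:2])
leaf_b = (grads[2:], hess[2:])

w_a = leaf_weight(*leaf_a, lam=1.0)
w_b = leaf_weight(*leaf_b, lam=1.0)
score = structure_score([leaf_a, leaf_b], lam=1.0, gamma=0.0)
print(w_a, w_b, score)
```

Leaf A, whose samples are under-predicted, receives a positive weight, while leaf B receives a negative one; a larger λ shrinks both weights towards zero.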

3.5.2. Particle Swarm Optimization

Particle Swarm Optimization (PSO) is a global search algorithm proposed by Eberhart and Kennedy in 1995 [45,46]. It is a stochastic search algorithm that simulates the collective intelligence observed in biological populations in nature. The movement of each individual is shaped both by its own cognition and by the social influence of the group, which makes PSO a swarm intelligence algorithm that relies on cooperation among individuals to solve complex problems. The particles represent the individuals in the population and are distributed in a D-dimensional space. Each particle moves through the space with velocity V and occupies position X, which denotes a potential solution to the problem.

During each iteration, the fitness of each particle can be obtained by calculating the fitness function. The individual optimal solution and the optimal global solution can be found according to fitness. Then, the velocity and position are adjusted as follows:

V_{id} = ω V_{id} + C_{1} \, \mathrm{random}(0, 1) (P_{id} - X_{id}) + C_{2} \, \mathrm{random}(0, 1) (P_{gd} - X_{id})

(26)

X_{id} = X_{id} + V_{id}

(27)

where C₁ and C₂ are learning factors representing the individual and social influences on particle velocity, respectively, random(0, 1) is a random number drawn uniformly from [0, 1], V_id and X_id denote the current velocity and position of the ith particle in dimension d, P_id denotes the best position found so far by the ith particle, P_gd denotes the best position found so far by the whole swarm, and ω is the inertia weight.
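The update rules in Equations (26) and (27) can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this study: the function name `pso_minimize`, the swarm size, the iteration count, the values of ω, C₁, and C₂, the clamping of positions to the search bounds, and the quadratic test function are all assumptions. Fitness is treated here as a cost to be minimized.

```python
import random


def pso_minimize(fitness, bounds, n_particles=20, n_iter=100,
                 w=0.729, c1=1.49445, c2=1.49445, seed=0):
    """Minimize `fitness` over the box `bounds` with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    X = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    P = [x[:] for x in X]                      # personal best positions P_id
    P_fit = [fitness(x) for x in X]
    g = min(range(n_particles), key=P_fit.__getitem__)
    G, G_fit = P[g][:], P_fit[g]               # global best position P_gd

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                # Equation (26): inertia + cognitive term + social term.
                V[i][d] = (w * V[i][d]
                           + c1 * rng.random() * (P[i][d] - X[i][d])
                           + c2 * rng.random() * (G[d] - X[i][d]))
                # Equation (27), with the position clamped to the bounds
                # (the clamping is an added assumption).
                lo, hi = bounds[d]
                X[i][d] = min(max(X[i][d] + V[i][d], lo), hi)
            f = fitness(X[i])
            if f < P_fit[i]:
                P[i], P_fit[i] = X[i][:], f
                if f < G_fit:
                    G, G_fit = X[i][:], f
    return G, G_fit


# Hypothetical test function with its minimum at (1, -2).
best_x, best_f = pso_minimize(lambda x: (x[0] - 1) ** 2 + (x[1] + 2) ** 2,
                              bounds=[(-5.0, 5.0), (-5.0, 5.0)])
print(best_x, best_f)
```

When PSO is used for hyperparameter tuning, each position X would encode one candidate parameter combination and the fitness would be a validation error of the model trained with it.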

3.5.3. Collision Avoidance Decision Model

In the process of collision avoidance, the give-way vessel must continuously assess the need for evasive action until a decision is made. This decision-making process is described with a collision avoidance decision model. This paper utilizes parameters such as DCPA, TCPA, D (distance), αT (bearing angle), Vr (relative velocity), and the length ratio (LR; L/L1) as input vectors. The specific process for determining these parameters is described in Section 4.2. The output of the model is the decision variable indicating whether a collision avoidance maneuver should be initiated.
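To make the input vector concrete, the sketch below computes the six features from the states of two ships using the standard closest-point-of-approach formulas. It is a sketch under stated assumptions: positions are given in a local flat east/north frame in nautical miles, speeds in knots, and courses in degrees clockwise from north; the give-way vessel is taken as "own" and the stand-on vessel as "target"; the function name, dictionary keys, and example values are hypothetical. The exact parameter definitions used in this paper may differ.

```python
import math


def relative_motion(own, target):
    """Return D, Vr, relative bearing, DCPA, TCPA and length ratio.

    Each ship is a dict with x, y (nm, east/north), sog (kn),
    cog (deg clockwise from north) and length (m).
    """
    def velocity(ship):
        c = math.radians(ship["cog"])
        return ship["sog"] * math.sin(c), ship["sog"] * math.cos(c)

    px, py = target["x"] - own["x"], target["y"] - own["y"]   # relative position
    (ox, oy), (tx, ty) = velocity(own), velocity(target)
    vx, vy = tx - ox, ty - oy                                 # relative velocity
    vr = math.hypot(vx, vy)
    dist = math.hypot(px, py)

    # Time (h) at which the separation is smallest, and that separation (nm).
    tcpa = 0.0 if vr == 0 else -(px * vx + py * vy) / vr ** 2
    dcpa = math.hypot(px + vx * tcpa, py + vy * tcpa)

    true_bearing = math.degrees(math.atan2(px, py)) % 360
    rel_bearing = (true_bearing - own["cog"]) % 360

    return {
        "D": dist,
        "Vr": vr,
        "alphaT": rel_bearing,
        "DCPA": dcpa,
        "TCPA": tcpa,
        "LR": target["length"] / own["length"],   # L / L1
    }


# Hypothetical crossing encounter: the target approaches from starboard.
give_way = {"x": 0.0, "y": 0.0, "sog": 10.0, "cog": 0.0, "length": 200.0}
stand_on = {"x": 5.0, "y": 5.0, "sog": 10.0, "cog": 270.0, "length": 100.0}
features = relative_motion(give_way, stand_on)
print(features)
```

In this example the two ships are on a collision course, so the DCPA is essentially zero and the TCPA is half an hour.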

Figure 5 illustrates the training structure of the collision avoidance decision model. The input data for this model include the features corresponding to each collision avoidance decision point along with the action values, where 0 indicates that no collision avoidance action is required and 1 indicates that an action is necessary. During the training phase, time points T1 through Tn are split into a training set and a validation set using an 8:2 ratio. The training employs a particle swarm algorithm to optimize three key parameters of the model: num_boost_round, max_depth, and the learning rate. This optimization process aims to establish the most effective decision model by finding the parameter combination that best captures the relationship between the features and the action values. To limit overfitting, 10-fold cross-validation is used. After training, the PSO-XGBoost model is evaluated on the validation set to assess whether collision avoidance action is required at each time point. The model’s predictions are then compared with actual data to evaluate its accuracy and generalizability.
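The data partitioning described above can be sketched as follows. The random shuffling, the seed, and the helper names are assumptions; the source states only the 8:2 ratio and the use of 10-fold cross-validation, not how the time points are assigned to each part.

```python
import random


def train_val_split(n_samples, val_ratio=0.2, seed=0):
    """Shuffle sample indices and split them into training and validation sets."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_val = round(n_samples * val_ratio)
    return idx[n_val:], idx[:n_val]


def k_fold(indices, k=10):
    """Yield (fit, held_out) index lists; each index is held out exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield fit, folds[i]


# 10,332 decision points, as in the dataset described in Section 4.3.
train_idx, val_idx = train_val_split(10332)
folds = list(k_fold(train_idx, k=10))
print(len(train_idx), len(val_idx), len(folds))
```

In a tuning loop, each candidate parameter combination would be scored by averaging a metric over the ten held-out folds of the training set, with the validation set kept aside for the final evaluation.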

4. Results

4.1. Data

The research area of this study is the adjacent waters of Ningbo-Zhoushan Port, one of the world’s largest ports in terms of cargo throughput. The geographical coordinates of the area range from 122.40° E to 122.53° E in longitude and from 29.65° N to 29.75° N in latitude. This location is strategic as it connects the main shipping routes, making it a crucial point for large ships entering and exiting Ningbo-Zhoushan Port. Consequently, the area is a hotspot for shipping activities, where vessels on the north–south routes off the coast of China, including traditional shipping lanes off the coast of Zhejiang and in the East China Sea, frequently pass through. Due to the heavy traffic, collisions are relatively common in this region. To minimize the impact of fishing activities on the results of the study, we analyzed AIS data from June to August 2020, during the national fishing ban period. This temporal selection ensures that the data reflect only shipping movements, providing clearer insights into navigation patterns and risks associated with maritime traffic.

A total of 5 GB of AIS data was used for analysis. Due to potential issues such as equipment problems, sensor failures, and signal interference, some data may exhibit anomalies like erratic speeds or positions. To ensure the integrity of our analysis, this study excludes any vessels with such abnormal data, focusing only on those in a continuous state of navigation. As a result of this initial data cleansing, the dataset included 30,000 ship trajectories. All data were pre-processed as outlined in Section 3 of the paper. These data were manually reviewed, and 574 valid ship track pairs of collision avoidance behavior under a crossing situation were identified.
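One possible way to screen trajectories for the anomalies mentioned above is sketched below. The study-area bounds are those given in Section 4.1; everything else is an assumption for illustration, including the record layout (time in seconds, latitude, longitude, speed over ground in knots), the 40 kn plausibility threshold, and the function names. The actual cleaning rules used in this study are those described in Section 3.

```python
import math

EARTH_RADIUS_NM = 6371.0 / 1.852   # mean Earth radius converted to nautical miles

# Study area from Section 4.1.
LON_MIN, LON_MAX = 122.40, 122.53
LAT_MIN, LAT_MAX = 29.65, 29.75


def in_study_area(lat, lon):
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX


def haversine_nm(lat1, lon1, lat2, lon2):
    """Great-circle distance between two positions in nautical miles."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi, dlmb = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def has_anomaly(track, max_speed_kn=40.0):
    """track: list of (t_seconds, lat, lon, sog_kn) tuples sorted by time.

    Flags a reported speed above the threshold, non-increasing timestamps,
    or a position jump that implies an implausible speed.
    """
    if any(sog > max_speed_kn for _, _, _, sog in track):
        return True
    for (t0, la0, lo0, _), (t1, la1, lo1, _) in zip(track, track[1:]):
        dt_h = (t1 - t0) / 3600.0
        if dt_h <= 0:
            return True
        if haversine_nm(la0, lo0, la1, lo1) / dt_h > max_speed_kn:
            return True
    return False


# Hypothetical tracks: a steady northbound ship and one with a position jump.
steady = [(0, 29.700, 122.45, 7.2), (30, 29.701, 122.45, 7.2), (60, 29.702, 122.45, 7.2)]
jumpy = [(0, 29.700, 122.45, 7.2), (30, 29.800, 122.45, 7.2), (60, 29.702, 122.45, 7.2)]
print(has_anomaly(steady), has_anomaly(jumpy))
```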

4.2. Analysis of Influencing Factors on the Timing of Ship Collision Avoidance Actions

This paper incorporates both dynamic and static factors as key elements influencing the timing of ship collision avoidance decision making. Specifically, the paper selects the DCPA, TCPA, D (distance), αT (bearing angle), Vr (relative velocity), L (length of the stand-on vessel), L1 (length of the give-way vessel), and the length ratio (LR; L/L1) as the influencing factors. To assess the impact of these eight factors on the timing of ship collision avoidance, this study uses the Random Forest (RF) method to rank their importance. Unlike traditional multiple linear regression models, RF can capture complex nonlinear relationships and interactions between variables, and it quantifies the contribution of each factor to the model’s overall predictive power. Such analysis is crucial for pinpointing the key factors that significantly influence the timing of ship collision avoidance.

Figure 6 illustrates the influence of the eight specified factors on the timing of ship collision avoidance. The figure reveals that TCPA and distance are the most influential factors. Conversely, the length of the stand-on ship has the least impact. Despite this, the ship length remains a crucial factor in the study of ship collision avoidance and should not be disregarded. To improve model training efficiency and account for inter-variable correlations, the top six factors have been selected as input features for the model. These include the distance between the two ships, the relative speed, relative bearing, DCPA, TCPA, and the ratio of the lengths of the give-way ship and the stand-on ship. This selection aims to refine the input for the collision avoidance decision model, ensuring a more targeted and effective analysis.

4.3. Determination of the Timing of Collision Avoidance

In this study, we use the Scikit-Learn toolkit to implement the XGBoost model in Python. The dataset consists of 574 trajectory pairs of collision avoidance behavior in crossing situations. A total of 10,332 time points were selected as collision avoidance decision points: for each trajectory pair, the actual timing of the collision avoidance action, nine points before the action, and eight points after it, assuming both vessels maintain their speed and course (18 points per pair). The model is initially trained using the particle swarm optimization algorithm to fine-tune the XGBoost parameters. This process continues until the iteration criteria are met, at which point the algorithm terminates and the optimal parameter values are established, as detailed in Table 2. The model is then trained further with these parameters to enhance its robustness and improve its predictive accuracy for collision avoidance decisions.
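The size of the dataset follows directly from this sampling scheme: each trajectory pair contributes 9 + 1 + 8 = 18 decision points, and 574 × 18 = 10,332. The sketch below builds such a window around a detected action index. The labeling rule (0 before the action, 1 from the action onwards) is an assumption made for illustration, as are the function name and the example index.

```python
def decision_points(action_idx, n_before=9, n_after=8):
    """Return (time_index, label) pairs around the detected action index.

    Assumed labeling: 0 for points before the action, 1 for the action
    point and the points after it.
    """
    points = []
    for k in range(action_idx - n_before, action_idx + n_after + 1):
        points.append((k, 1 if k >= action_idx else 0))
    return points


window = decision_points(action_idx=20)   # hypothetical action index
n_total = 574 * len(window)
print(len(window), n_total)
```

Under the assumed labeling rule the two classes are balanced, with nine points of each class per trajectory pair.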

To evaluate the accuracy of the model on both the training and validation sets, this study employs a 10-fold cross-validation approach. The ROC curve is displayed in Figure 7 and shows an area under the curve (AUC) of about 0.90, indicating good performance. Further insights are provided by the confusion matrix in Table 3. The matrix details that, within the training set, there are 1036 data points belonging to Class 1, with 140 of these being misclassified as Class 0. Additionally, there are 1038 data points categorized as Class 0, of which 176 are misclassified as Class 1. The recall is 83%, the accuracy is 84%, the precision is 84%, and the F1-score is 84%. The results indicate that the model is both accurate and reliable.
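The relationship between the confusion matrix counts and the four metrics can be made explicit with a short calculation. The counts below are those quoted in the text; treating Class 1 as the positive class is an assumption, and the function name is hypothetical. Because the reported percentages are rounded and may follow a different positive-class or averaging convention, the values computed this way can differ from them by a few percentage points.

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall and F1 from confusion matrix counts."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Counts quoted in the text.
class1_total, class1_missed = 1036, 140    # Class 1 points, misclassified as 0
class0_total, class0_missed = 1038, 176    # Class 0 points, misclassified as 1

# Assumption: Class 1 (action required) is the positive class.
metrics = binary_metrics(tp=class1_total - class1_missed,
                         fn=class1_missed,
                         fp=class0_missed,
                         tn=class0_total - class0_missed)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```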

4.4. Comparative Analysis

To verify the performance of the proposed PSO-XGBoost model, it is compared in this section with other classification models, namely the Support Vector Machine (SVM), eXtreme Gradient Boosting (XGBoost), Random Forest (RF), and Adaptive Boosting (AdaBoost). These classification algorithms are briefly described as follows.

SVM is a linear classifier that aims to find an optimal hyperplane that separates the different classes of samples as widely as possible while generalizing well to new samples. It is used for binary and multi-class classification tasks. Its core idea is based on the structural risk minimization principle of statistical learning theory [47].

RF is composed of multiple independent decision trees. Each decision tree in the forest classifies the samples separately. The category with the most votes among all the results of the decision trees is used as the classification result of RF [48].

AdaBoost is an ensemble learning method that enhances classification accuracy or regression performance by combining multiple weak learners to form a strong learner. A key advantage of AdaBoost is its high robustness to outliers and noisy data. Additionally, it demonstrates good performance across various types of datasets, especially when the dataset is small or contains noise [49].

Accuracy, precision, the F1 score, and recall are selected as evaluation metrics, and the results for each model are shown in Figure 8. AdaBoost scores the lowest among all models on these metrics. Although AdaBoost is an ensemble learning method that combines multiple weak learners to improve classification performance, it performs worse than the other ensemble learning models, such as RF and XGBoost. Since this paper deals with the nonlinear relationship between collision avoidance decisions and parameters, and the SVM is a linear classifier that cannot handle nonlinear problems well, it also exhibits a weaker performance. On the other hand, the XGBoost algorithm incorporates regularization terms to improve model accuracy and prevent overfitting, and the PSO-XGBoost model additionally uses particle swarm optimization (PSO) to optimize its parameters. Therefore, among all the algorithms, the PSO-XGBoost model performs the best.

The ROC curves of the models are displayed in Figure 9. The ROC curve of the PSO-XGBoost model, which approaches the top-left corner with an AUC of 0.90, indicates good prediction accuracy. Comparatively, the performance enhancement in PSO-XGBoost over the standard XGBoost model suggests that integrating the Particle Swarm Optimization algorithm improves the predictive accuracy.
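For reference, the AUC of a ROC curve equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it directly from this definition; the labels and scores are hypothetical and the function name is an assumption. It is intended only to show what the reported AUC measures, not how the figures in this paper were produced.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly;
    ties between a positive and a negative score count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical model outputs for four decision points.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
auc = roc_auc(labels, scores)
print(auc)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.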

Considering that the ROC curve alone is insufficient to fully evaluate the quality of a model, the Precision–Recall (P-R) curve is utilized as an alternative, focusing on recall and precision rates. In the context of multi-class classification scenarios, the area under the P-R curve is referred to as the mean Average Precision (mAP), which quantifies the accuracy of the classification. The larger the AUC value for the P-R curve, the higher the average precision across various recall levels, indicating a superior model performance. Figure 10 presents the P-R curves for all of the models, with the PSO-XGBoost model achieving the highest mAP value of 0.90. The lower performance of the SVM model is attributed to the skewed data distribution. Based on the aforementioned analysis, it can be concluded that the proposed PSO-XGBoost model exhibits optimal predictive and discriminative accuracy in the context of ship collision avoidance decision making.
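The area under a P-R curve is commonly summarized as the average precision, that is, the precision at each rank where a positive sample is retrieved, weighted by the corresponding increase in recall. A minimal sketch for the binary case is given below; the labels and scores are hypothetical, the function name is an assumption, and tied scores are simply processed in sorted order.

```python
def average_precision(labels, scores):
    """Sum over ranks of (increase in recall) x (precision at that rank)."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    n_pos = sum(labels)
    tp, ap = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            tp += 1
            # Recall rises by 1 / n_pos here; precision at this rank is tp / rank.
            ap += (tp / rank) / n_pos
    return ap


# Hypothetical model outputs for four decision points.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
ap = average_precision(labels, scores)
print(ap)
```

For multi-class problems, the mean of the per-class average precision values gives the mAP.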

4.5. Case Analysis

On 6 January 2018, a maritime collision accident occurred in the East China Sea involving the Panamanian oil tanker “Sanchi” and the Hong Kong-registered cargo ship “CF Crystal”. The accident led to the explosion and sinking of the “Sanchi”, while the “CF Crystal” sustained severe damage [50,51]. According to the accident investigation report, at the time of the collision, the “Sanchi” was en route to South Korea, and the “CF Crystal” was traveling from the United States to China. As shown in Figure 11, the report highlights that the “Sanchi”, as the give-way vessel, was responsible for initiating avoidance maneuvers between 19:32 and 19:44 to ensure the safe passage of both ships. However, the OOWs of both vessels failed to effectively identify and address the situation, leading to the development of a dangerous crossing situation and the subsequent collision.

In this study, we selected this collision case to test the proposed collision avoidance decision model. The change in distance between the two ships over time and the collision avoidance decision output by the model are shown in Figure 12. The distance between the vessels gradually decreased over time. At 19:38:50, when the two ships were approximately 3.9 nautical miles apart, the model’s output changed, suggesting that the give-way ship should initiate collision avoidance maneuvers. This moment falls within the crucial interval of 19:32 to 19:44 suggested by experts in the accident report, which supports the model’s effectiveness in identifying the timing for initiating collision avoidance maneuvers.

4.6. Collision Avoidance Time Window

Given the inherent errors in any model and the dynamic nature of ship collision avoidance processes, which are influenced by various factors including environmental conditions and OOW characteristics, relying solely on a specific timing for collision avoidance action may not be sufficiently accurate. Therefore, this study introduces the concept of a “give-way ship action time window”. This concept provides a more flexible time range for ships to make timely collision avoidance responses based on the actual conditions and the specific encounter environment. This approach better accommodates the uncertainty and diversity of ship behavior, enhancing the applicability and practicality of collision avoidance decisions.

The results of a model, which are derived from data training, tend to reflect the collision avoidance timing recognized by the majority. By statistically analyzing the discrepancies between the model output and the actual timing across different collision avoidance cases, we can gain a more comprehensive understanding of the decisions made by different OOWs during actual operations. Consequently, we can develop a more inclusive collision avoidance time window that reflects not only the consensus of the majority, but also the specific properties of individual cases.

The timings of the decision points for the 574 sets of data in the dataset of this research were analyzed, and the discrepancies between the model-identified timings and the actual decision points are presented in Figure 13. The figure shows that most of the model’s output aligns with the actual outcomes, indicating the model’s overall accuracy in identifying the appropriate timing for the give-way ship to take action. The distribution of the timing discrepancies between the actual decisions and the model output is also shown in the figure. When the discrepancies are fitted to a normal distribution, the time window for the give-way ship to take collision avoidance action at the 95% confidence level is about 60 s, which means that, in actual navigation, OOWs may choose to act within this time period based on their own experience and specific circumstances.
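The derivation of such a window can be sketched as follows: fit a normal distribution to the timing discrepancies and take the central interval that covers 95% of the fitted distribution. The discrepancy values below are hypothetical placeholders rather than data from this study, and the function name is an assumption.

```python
from statistics import NormalDist, mean, stdev


def action_time_window(discrepancies_s, coverage=0.95):
    """Central interval of a normal distribution fitted to the discrepancies (s)."""
    mu, sigma = mean(discrepancies_s), stdev(discrepancies_s)
    z = NormalDist().inv_cdf(0.5 + coverage / 2)   # about 1.96 for 95%
    return mu - z * sigma, mu + z * sigma


# Hypothetical discrepancies (model timing minus actual timing, in seconds).
sample = [-20.0, -10.0, 0.0, 10.0, 20.0]
lower, upper = action_time_window(sample)
print(lower, upper)
```

The width of the resulting interval depends only on the spread of the discrepancies, so a model whose timings scatter more widely around the actual decisions yields a wider action window.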

5. Discussions

In maritime operations, OOWs make the collision avoidance decisions. This study uses machine learning methods to learn the timing of collision avoidance in crossing situations from empirical data. It examines factors influencing the timing of ship collision avoidance and introduces a collision avoidance time window to provide more practical guidance.

This study uses the Douglas–Peucker (DP) algorithm to analyze the heading and trajectory change characteristics of the give-way ship during collision avoidance maneuvers. By extracting critical inflection points from the trajectories, this method identifies key moments for initiating collision avoidance actions. For the AIS data, this study only considered dynamic information and ship length; ship performance parameters, human factors, sea conditions, and weather were not included. The ship’s loading conditions were also ignored. Ignoring this information may affect the comprehensiveness of our findings. Therefore, future research should incorporate these factors into the analysis to improve the accuracy and applicability of the model, enriching the AIS dataset with parameters such as the vessel type, loading configuration, weather conditions, and the knowledge and experience level of the OOWs.
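The idea behind extracting inflection points with the DP algorithm can be sketched as follows. This is a generic version of the algorithm, not the exact procedure of this study: the track is given as planar (x, y) points, the distance is measured to the straight line through the segment end points, and the tolerance and example track are hypothetical.

```python
import math


def _perp_dist(p, a, b):
    """Distance from point p to the straight line through a and b."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / math.hypot(dx, dy)


def douglas_peucker(points, eps):
    """Return the indices of the points retained by the DP simplification."""
    keep = {0, len(points) - 1}
    stack = [(0, len(points) - 1)]
    while stack:
        i, j = stack.pop()
        best, best_d = None, eps
        for k in range(i + 1, j):
            d = _perp_dist(points[k], points[i], points[j])
            if d > best_d:
                best, best_d = k, d
        if best is not None:
            # The farthest point exceeds the tolerance: keep it and recurse.
            keep.add(best)
            stack += [(i, best), (best, j)]
    return sorted(keep)


# Hypothetical track: steady course east, then an alteration at index 3.
track = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 1), (5, 2), (6, 3)]
kept = douglas_peucker(track, eps=0.1)
print(kept)
```

The retained interior index marks the point where the course changes, which is the kind of inflection point that can be interpreted as the start of an avoidance maneuver.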

Navigational safety depends heavily on the knowledge and experience of the OOWs, who are key in making and executing collision avoidance decisions. Therefore, some scholars have constructed a knowledge base from the COLREGs, good seamanship, and the practical collision avoidance experience of OOWs [52]. This knowledge base forms the foundation of an expert system designed to assist in ship collision avoidance decision making. However, this approach may be limited by the experts’ fields of expertise, their experience, and the speed at which the knowledge is updated. Moreover, the expert system may only partially simulate complex human factors such as the crew’s judgment, communication, and collaboration. This study captures the essence of ship collision avoidance experience from empirical data, offering a method that reflects the OOW’s influence more accurately and introduces a timing window to enhance the practicality of the model.

Existing domain-based methods and indicator-based methods typically involve the quantification of collision risk, followed by the navigator’s determination of the appropriate collision avoidance timings based on their experience. The collision avoidance model proposed in this study can assist navigators in determining whether a specific moment is the right time to make collision avoidance decisions. OOWs can execute the corresponding collision avoidance actions based on the model’s analytical results, thereby reducing the impact of subjective judgments in ship navigation. Furthermore, in selecting factors affecting the timing of ship collision avoidance, this study comprehensively considers various factors, with particular emphasis on the length ratio between the give-way ship and the stand-on ship. This factor has not been adequately considered in most existing research. This integrated approach not only enhances the model’s predictive accuracy, but also strengthens its applicability and effectiveness in actual maritime operations.

The study has several limitations. It focuses only on crossing encounters and does not account for environmental disturbances such as other vessels or weather conditions, nor does it consider changes in the ship speed, focusing only on course alterations. The current study focuses primarily on determining the timing of collision avoidance and does not address the specific planning of collision avoidance paths for ships. Our current work provides a crucial foundation by establishing the optimal timing for collision avoidance maneuvers, which is a significant aspect of maritime safety. However, to enhance the practical application of our findings, we recognize the need to extend our research to include the planning of precise collision avoidance paths.

In future research, we aim to expand the capabilities of our current algorithm by integrating it with more extensive collision avoidance path generation models. This integration will enable the development of a comprehensive system that not only determines the optimal timing for collision avoidance, but also plans and executes the necessary maneuvers automatically. By linking the algorithm to the ship’s automatic control systems, we can significantly reduce the reliance on human intervention and the associated risks of human error or negligence. This progression towards fully automated collision avoidance holds the potential to greatly enhance maritime safety, ensuring more precise and timely responses to potential collision threats. As such, this future research direction represents a critical step towards achieving a safer and more reliable navigation system in the maritime industry.

6. Conclusions

In this study, we introduced a PSO-XGBoost method to determine the optimal timing for ship collision avoidance from empirical data. This model utilizes ship relative motion variables as inputs to calculate the timing for collision avoidance maneuvers. The research results indicate that the model’s accuracy and other evaluation metrics are all at or above 83%. A Random Forest analysis was conducted to investigate the factors influencing the timing of ship collision avoidance. It was found that the TCPA and the distance between the vessels are critical factors. Additionally, this study revealed that the lengths of both the give-way and the stand-on ships impact the timing of collision avoidance maneuvers, a factor usually overlooked in previous research. The proposed PSO-XGBoost model has proven effective in identifying the appropriate moments for the give-way ship to initiate collision avoidance actions. A collision avoidance time window of approximately 60 s was also established to facilitate practical application.

Although this model’s decisions are based on extensive experience and aim to provide reasonable outcomes, they may not be sufficient to influence the subjective judgments of operators with distinct individual personalities. Our algorithm is still being improved and has not yet been integrated directly with the ship’s automatic control system. This integration will be a core focus of our future research, with the aim of achieving the seamless integration of the algorithm with the ship’s control system and promoting further development in the automation of ship navigation. In addition, this model mainly determines the timing of ship collision avoidance and cannot yet generate specific collision avoidance paths. In the future, we will combine this model with a collision avoidance behavior generation model to generate specific collision avoidance paths, so as to better provide collision avoidance schemes for OOWs and promote maritime navigation safety.

In conclusion, despite its limitations, this study offers a new perspective on ship collision avoidance decision making. It provides valuable insights that can assist both OOWs and unmanned ships in making informed decisions, thereby enhancing maritime navigation safety.

Author Contributions

Conceptualization, Y.Z., J.L. and P.Z.; methodology, Y.Z., J.L. and W.D.; software, H.L.; validation, J.L., Y.Z. and W.D.; formal analysis, H.L.; investigation, W.S.; resources, Y.Z.; data curation, Y.Z.; writing—original draft preparation, Y.Z.; writing—review and editing, J.L., H.L., M.G. and P.Z.; visualization, W.D.; supervision, P.Z.; project administration, P.Z.; funding acquisition, P.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded in part by the National Natural Science Foundation of China (52272334), the Key R&D Program of the Zhejiang Province (2024C01180), the International Scientific and Technological Cooperation Projects of Ningbo (2023H020), the National Key Research and Development Program of China (2017YFE0194700), and the EC H2020 Project (690713).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Acknowledgments

We would like to thank the National “111” Center on the Safety and Intelligent Operation of Sea Bridges (D21013) and the Zhejiang 2011 Collaborative Innovation Center for Port Economy for the financial support in publishing this paper. The authors would like to thank the K.C. Wong Magna Fund in Ningbo University for sponsorship.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Moriarty, M.J. The United Nations Conference on Trade and Development 1964. Pac. Viewp. 1965, 6, 1–14. [Google Scholar] [CrossRef]
Tong, Y.; Weng, J.; Zhou, Y.; Liu, K. Quantitative Analysis of Collision Avoidance Actions Timing for Stand-on Vessels. In Proceedings of the 2023 7th International Conference on Transportation Information and Safety (ICTIS), Xi’an, China, 4–6 August 2023; pp. 1–9. [Google Scholar]
Zhou, Y.; Daamen, W.; Vellinga, T.; Hoogendoorn, S.P. Ship Behavior during Encounters in Ports and Waterways Based on AIS Data: From Theoretical Definitions to Empirical Findings. Ocean Eng. 2023, 272, 113879. [Google Scholar] [CrossRef]
You, B.; Zhang, J.; Hirdaris, S.; Liu, R. Ship Collision Avoidance Behavior Analysis Based on AIS Data. In Proceedings of the 2021 6th International Conference on Transportation Information and Safety (ICTIS), Wuhan, China, 22–24 October 2021; pp. 787–793. [Google Scholar]
Lei, P.-R.; Xiao, L.-P.; Wen, Y.-T.; Peng, W.-C. CAPatternMiner: Mining Ship Collision Avoidance Behavior from AIS Trajectory Data. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, Torino, Italy, 22–26 October 2018; Association for Computing Machinery: New York, NY, USA, 2018; pp. 1875–1878. [Google Scholar]
Zhang, J.; Liu, J.; Hirdaris, S.; Zhang, M.; Tian, W. An Interpretable Knowledge-Based Decision Support Method for Ship Collision Avoidance Using AIS Data. Reliab. Eng. Syst. Saf. 2023, 230, 108919. [Google Scholar] [CrossRef]
Rong, H.; Teixeira, A.P.; Soares, C.G. Ship Collision Avoidance Behaviour Recognition and Analysis Based on AIS Data. Ocean Eng. 2022, 245, 110479. [Google Scholar] [CrossRef]
Du, L.; Goerlandt, F.; Valdez Banda, O.A.; Huang, Y.; Wen, Y.; Kujala, P. Improving Stand-on Ship’s Situational Awareness by Estimating the Intention of the Give-Way Ship. Ocean Eng. 2020, 201, 107110. [Google Scholar] [CrossRef]
Fujii, Y.; Yamanouchi, H.; Matui, T. Survey on Vessel Traffic Management Systems and Brief Introduction to Marine Traffic Studies. Electron. Navig. Res. Inst. Pap. 1984, 1984, 1E-131. [Google Scholar] [CrossRef] [PubMed]
Goodwin, E.M. A Statistical Study of Ship Domains. J. Navig. 1973, 26, 130. [Google Scholar] [CrossRef]
Pietrzykowski, Z.; Uriasz, J. The Ship Domain–A Criterion of Navigational Safety Assessment in an Open Sea Area. J. Navig. 2009, 62, 93–108. [Google Scholar] [CrossRef]
Wang, N. An Intelligent Spatial Collision Risk Based on the Quaternion Ship Domain. J. Navig. 2010, 63, 733–749. [Google Scholar] [CrossRef]
Hansen, M.G.; Jensen, T.K.; Lehn-Schiøler, T.; Melchild, K.; Rasmussen, F.M.; Ennemark, F. Empirical Ship Domain Based on AIS Data. J. Navig. 2013, 66, 931–940. [Google Scholar] [CrossRef]
Wang, Y.; Chin, H.-C. An Empirically-Calibrated Ship Domain as a Safety Criterion for Navigation in Confined Waters. J. Navig. 2016, 69, 257–276. [Google Scholar] [CrossRef]
Zhang, F.; Peng, X.; Huang, L.; Zhu, M.; Wen, Y.; Zheng, H. A Spatiotemporal Statistical Method of Ship Domain in the Inland Waters Driven by Trajectory Data. J. Mar. Sci. Eng. 2021, 9, 410. [Google Scholar] [CrossRef]
Dinh, G.H.; Im, N. The Combination of Analytical and Statistical Method to Define Polygonal Ship Domain and Reflect Human Experiences in Estimating Dangerous Area. Int. J. E-Navig. Marit. Econ. 2016, 4, 97–108. [Google Scholar] [CrossRef]
Kundakçı, B.; Nas, S.; Gucma, L. Prediction of Ship Domain on Coastal Waters by Using AIS Data. Ocean Eng. 2023, 273, 113921. [Google Scholar] [CrossRef]
Liu, D.; Zheng, Z.; Liu, Z. Research on Dynamic Quaternion Ship Domain Model in Open Water Based on AIS Data and Navigator State. J. Mar. Sci. Eng. 2024, 12, 516. [Google Scholar] [CrossRef]
Silveira, P.; Teixeira, A.P.; Soares, C.G. A Method to Extract the Quaternion Ship Domain Parameters from AIS Data. Ocean Eng. 2022, 257, 111568. [Google Scholar] [CrossRef]
Szlapczynski, R.; Szlapczynska, J. Review of Ship Safety Domains: Models and Applications. Ocean Eng. 2017, 145, 277–289. [Google Scholar] [CrossRef]
Kouzuki, A.; Hasegawa, K. Automatic Collision Avoidance System for Ships Using Fuzzy Control. J. Kansai Soc. Nav. Arch. Jpn. 1987, 205, 1–10. [Google Scholar]
Perera, L.P.; Carvalho, J.P.; Guedes Soares, C. Fuzzy Logic Based Decision Making System for Collision Avoidance of Ocean Navigation under Critical Collision Conditions. J. Mar. Sci. Technol. 2011, 16, 84–99. [Google Scholar] [CrossRef]
Chauvin, C.; Lardjane, S. Decision Making and Strategies in an Interaction Situation: Collision Avoidance at Sea. Transp. Res. Part F Traffic Psychol. Behav. 2008, 11, 259–269. [Google Scholar] [CrossRef]
Su, C.-M.; Chang, K.-Y.; Cheng, C.-Y. Fuzzy decision on optimal collision avoidance measures for ships in vessel traffic service. J. Mar. Sci. Technol. 2012, 20, 5. [Google Scholar] [CrossRef]
Chin, H.C.; Debnath, A.K. Modeling Perceived Collision Risk in Port Water Navigation. Saf. Sci. 2009, 47, 1410–1416. [Google Scholar] [CrossRef]
Mou, J.M.; Tak, C.V.D.; Ligteringen, H. Study on Collision Avoidance in Busy Waterways by Using AIS Data. Ocean Eng. 2010, 37, 483–490. [Google Scholar] [CrossRef]
Ren, Y.; Mou, J.; Yan, Q.; Zhang, F. Study on Assessing Dynamic Risk of Ship Collision. In Proceedings of the ICTIS 2011, Wuhan, China, 30 June 30–2 July 2011; American Society of Civil Engineers: Wuhan, China, 2011; pp. 2751–2757. [Google Scholar]
Li, B.; Pang, F.-W. An Approach of Vessel Collision Risk Assessment Based on the D–S Evidence Theory. Ocean Eng. 2013, 74, 16–21.
Ahn, J.-H.; Rhee, K.-P.; You, Y.-J. A Study on the Collision Avoidance of a Ship Using Neural Networks and Fuzzy Logic. Appl. Ocean Res. 2012, 37, 162–173.
Zhang, W.; Goerlandt, F.; Montewka, J.; Kujala, P. A Method for Detecting Possible near Miss Ship Collisions from AIS Data. Ocean Eng. 2015, 107, 60–69.
Gang, L.; Wang, Y.; Sun, Y.; Zhou, L.; Zhang, M. Estimation of Vessel Collision Risk Index Based on Support Vector Machine. Adv. Mech. Eng. 2016, 8, 1687814016671250.
Chen, P.; Shi, G.; Liu, S.; Gao, M. Pattern Knowledge Discovery of Ship Collision Avoidance Based on AIS Data Analysis. Int. J. Perform. Eng. 2018, 14, 2449.
Nguyen, M.; Zhang, S.; Wang, X. A Novel Method for Risk Assessment and Simulation of Collision Avoidance for Vessels Based on AIS. Algorithms 2018, 11, 204.
Shi, J.; Liu, Z. Deep Learning in Unmanned Surface Vehicles Collision-Avoidance Pattern Based on AIS Big Data with Double GRU-RNN. J. Mar. Sci. Eng. 2020, 8, 682.
Zheng, M.; Xie, S.; Chu, X.; Zhu, T.; Tian, G. Research on Autonomous Collision Avoidance of Merchant Ship Based on Inverse Reinforcement Learning. Int. J. Adv. Robot. Syst. 2020, 17, 172988142096908.
Zhao, Y.; Li, W.; Shi, P. A Real-Time Collision Avoidance Learning System for Unmanned Surface Vessels. Neurocomputing 2016, 182, 255–266.
Ożoga, B.; Montewka, J. Towards a Decision Support System for Maritime Navigation on Heavily Trafficked Basins. Ocean Eng. 2018, 159, 88–97.
Kim, J.-K.; Park, D.J. Determining the Proper Times and Sufficient Actions for the Collision Avoidance of Navigator-Centered Ships in the Open Sea Using Artificial Neural Networks. J. Mar. Sci. Eng. 2023, 11, 1384.
Ohn, S.W.; Namgung, H. Interval Type-2 Fuzzy Inference System Based on Closest Point of Approach for Collision Avoidance between Ships. Appl. Sci. 2020, 10, 3919.
Mou, J.; Chen, P.; He, Y.; Zhang, X.; Zhu, J.; Rong, H. Fast Self-Tuning Spectral Clustering Algorithm for AIS Ship Trajectory. Harbin Gongcheng Daxue Xuebao/J. Harbin Eng. Univ. 2018, 39, 428–432.
Zhao, L.; Shi, G. A Method for Simplifying Ship Trajectory Based on Improved Douglas–Peucker Algorithm. Ocean Eng. 2018, 166, 37–46.
Bentéjac, C.; Csörgő, A.; Martínez-Muñoz, G. A Comparative Analysis of Gradient Boosting Algorithms. Artif. Intell. Rev. 2021, 54, 1937–1967.
Qu, Y.; Lin, Z.; Li, H.; Zhang, X. Feature Recognition of Urban Road Traffic Accidents Based on GA-XGBoost in the Context of Big Data. IEEE Access 2019, 7, 170106–170115.
Dong, X.; Lei, T.; Jin, S.; Hou, Z. Short-Term Traffic Flow Prediction Based on XGBoost. In Proceedings of the 2018 IEEE 7th Data Driven Control and Learning Systems Conference (DDCLS), Enshi, China, 25–27 May 2018; IEEE: New York, NY, USA, 2018; pp. 854–859.
Luo, C.; Huang, C.; Cao, J.; Lu, J.; Huang, W.; Guo, J.; Wei, Y. Short-Term Traffic Flow Prediction Based on Least Square Support Vector Machine with Hybrid Optimization Algorithm. Neural Process. Lett. 2019, 50, 2305–2322.
Zemmal, N.; Azizi, N.; Sellami, M.; Cheriguene, S.; Ziani, A.; AlDwairi, M.; Dendani, N. Particle Swarm Optimization Based Swarm Intelligence for Active Learning Improvement: Application on Medical Data Classification. Cogn. Comput. 2020, 12, 991–1010.
Huang, W.; Liu, H.; Zhang, Y.; Mi, R.; Tong, C.; Xiao, W.; Shuai, B. Railway Dangerous Goods Transportation System Risk Identification: Comparisons among SVM, PSO-SVM, GA-SVM and GS-SVM. Appl. Soft Comput. 2021, 109, 107541.
Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
Yang, S.; Chen, L.-F.; Yan, T.; Zhao, Y.-H.; Fan, Y.-J. An Ensemble Classification Algorithm for Convolutional Neural Network Based on AdaBoost. In Proceedings of the 2017 IEEE/ACIS 16th International Conference on Computer and Information Science (ICIS), Wuhan, China, 24–26 May 2017; pp. 401–406.
Dyngvold, G. A STAMP and HFACS-MA Analysis of the Sanchi Oil Tanker Disaster: Lessons Learned and Ways Forward. Master’s Thesis, University of South-Eastern Norway, Notodden, Norway, 2021.
Li, M.; Mou, J.; Chen, L.; He, Y.; Huang, Y. A Rule-Aware Time-Varying Conflict Risk Measure for MASS Considering Maritime Practice. Reliab. Eng. Syst. Saf. 2021, 215, 107816.
Lyu, H.; Hao, Z.; Li, J.; Li, G.; Sun, X.; Zhang, G.; Yin, Y.; Zhao, Y.; Zhang, L. Ship Autonomous Collision-Avoidance Strategies—A Comprehensive Review. J. Mar. Sci. Eng. 2023, 11, 830.

Figure 1. Flowchart for determining timing of ship collision avoidance action.

Figure 2. Ship relative motion.

Figure 3. Trajectories of ships in crossing situation.

Figure 4. Illustration of Douglas–Peucker (DP) algorithm.

Figure 5. Structure of the collision avoidance decision model.

Figure 6. Ranking of the importance of factors considered.

Figure 7. ROC curve.

Figure 8. The evaluation metrics of all models.

Figure 9. The ROC curves of all models.

Figure 10. The P-R curves of all models.

Figure 11. The encounter situation of Sanchi (going up) and CF Crystal (going down) (Blue star: collision point).

Figure 12. Changes in distance and model output with time.

Figure 13. Time window.

Table 1. Factors and methods of collision avoidance decisions.

References	Factors Considered	Methods
Kouzuki and Hasegawa [21]	DCPA, TCPA	Fuzzy control
Perera, L. P. [22]	Relative distance, Speed and Course, Position	Fuzzy logic
Chauvin, C. and Lardjane [23]	Speed and Course, Distance, DCPA, TCPA	Logistic regression models
Su, C. M. [24]	Speed, Ship type and Size, Traffic flow, Fuel cost	Fuzzy logic theory
Chin, H.C. [25]	DCPA, TCPA	Ordered probability regression model
Mou [26]	DCPA, TCPA, Encounter angle	SAMSON system
Ren YaLei [27]	DCPA, TCPA, Encounter angle	Fuzzy logic method
Li Bo [28]	DCPA, TCPA, Relative distance	D–S evidence theory
Ahn, J.-H. [29]	TCPA, DCPA, Ship’s maneuverability	Multilayer Perceptron neural network
Zhang Weibin [30]	Distance, Relative speed, Phase defined by course	Vessel Conflict Ranking Operator
Gang, L. [31]	DCPA, TCPA, Speed and Course, Visibility conditions	Support Vector Machine
Chen, P. [32]	DCPA, TCPA	AIS statistical analysis
Nguyen, M. [33]	DCPA, TCPA, Relative distance, Relative bearing	Collision-risk index
Shi, J. [34]	Trajectory information	Deep learning
Zheng, M. [35]	DCPA and Maximum heading angle	Inverse reinforcement learning
Zhao Yuxin [36]	DCPA, TCPA, Relative distance, Relative bearing, Relative velocity, COLREGS	Optimal reciprocal collision avoidance algorithm
Bartosz Ożoga [37]	DCPA, TCPA, Hydro-meteo conditions	Multi-ARPA (MARPA) system
Kim, J. K. [38]	Speed and Course, Ship Length	Artificial neural networks
Ohn [39]	DCPA, TCPA	Type-2 Fuzzy Inference System

Table 2. Optimized XGBoost parameters.

Parameters	Value
Num_boost_round	628
Max_depth	12
Learning rate	0.207
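
As a rough illustration of how the Table 2 values might be handed to a gradient boosting library, the sketch below maps them onto the argument names used by the xgboost Python package. The helper `to_xgboost_args`, the choice of a `binary:logistic` objective, and the mapping of "Learning rate" to `eta` are assumptions made for this example; the paper does not list the remaining settings.

```python
# Values copied from Table 2. Everything else here is an assumption for illustration.
TABLE_2 = {"Num_boost_round": 628, "Max_depth": 12, "Learning rate": 0.207}


def to_xgboost_args(table):
    """Split the Table 2 values into a params dict and a boosting-round count."""
    params = {
        # Assumed objective: a binary "action / no action" probability output.
        "objective": "binary:logistic",
        "max_depth": int(table["Max_depth"]),
        # "eta" is the xgboost name for the learning rate.
        "eta": float(table["Learning rate"]),
    }
    return params, int(table["Num_boost_round"])


params, num_boost_round = to_xgboost_args(TABLE_2)
print(params, num_boost_round)
```

With xgboost installed and a training matrix `dtrain` available, these values would typically be passed as `xgboost.train(params, dtrain, num_boost_round=num_boost_round)`.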

Table 3. The confusion matrix.

Prediction	Truth: No action (0)	Truth: Action (1)	Total
No action (0)	862	176	1038
Action (1)	166	870	1036
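
The usual classification metrics can be recomputed from the counts in Table 3. The sketch below is a minimal example that assumes rows are predictions and columns are ground truth, as the table header suggests, and treats "Action (1)" as the positive class. The function name `binary_metrics` is chosen for this example and does not come from the paper.

```python
def binary_metrics(tn, fn, fp, tp):
    """Compute accuracy, precision, recall and F1 for the positive class."""
    total = tn + fn + fp + tp
    precision = tp / (tp + fp)  # share of predicted actions that were true actions
    recall = tp / (tp + fn)     # share of true actions that were predicted
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Counts read from Table 3, assuming rows = prediction and columns = truth:
# predicted "No action" row: 862 true no-action (TN), 176 true action (FN)
# predicted "Action" row:    166 true no-action (FP), 870 true action (TP)
metrics = binary_metrics(tn=862, fn=176, fp=166, tp=870)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

If the table were instead read with rows as truth and columns as prediction, the precision and recall values would swap while accuracy and F1 would stay the same.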

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
