1.2. Related Work
Traditional ship control is usually based on PID and related methods. Fossen et al. [10] proposed a line-of-sight (LOS) guidance-based path tracking control method for underactuated ships, which remains one of the earliest and most widely used strategies in ship motion control; their team later introduced a nonlinear Lyapunov function into the controller design, effectively addressing issues posed by the model [11]. Within this framework, Do [12] likewise employed a nonlinear Lyapunov function [13] in the controller design to resolve model-induced difficulties. Shen et al. [14] modeled fully actuated ships and, to counter external environmental disturbances, proposed an adaptive dynamic surface sliding mode control method with a disturbance observer, verifying its feasibility through simulation experiments. Jiao et al. [15] introduced a constrained performance function for controller design and used radial basis function (RBF) neural networks to correct chattering parameters; their simulation results showed the trajectory tracking error converging within the specified range, verifying the effectiveness and superiority of the proposed control strategy. Qi [16] used a high-gain observer to estimate the system's velocity vector and, combining the approximation capability of RBF neural networks with the backstepping method, designed a controller that successfully accomplished ship trajectory tracking. Shen et al. [17], under input constraints, designed a sliding mode recursive control law that constrains the ship's trajectory while accounting for both position and velocity errors, enhancing the system's robustness; they then integrated neural networks to complete the trajectory tracking control. Liu et al. [18] employed a disturbance observer to estimate low-frequency disturbances in the dynamics, designed adaptive laws to estimate unknown time-varying current velocities, and used an auxiliary dynamic system to compensate for input saturation constraints in the actuation system. Chen et al. [19] proposed a fixed-time fractional-order sliding mode control method to address uncertainties in the model and environment; the method tracks effectively while attenuating sliding mode chattering. He et al. [20] proposed an autonomous ship collision-avoidance path planning method suitable for multi-ship encounter scenarios, achieving real-time course adjustment through fuzzy adaptive PID control.
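Several of the methods surveyed above build on LOS guidance. As a point of reference only, the classical lookahead-based LOS law can be sketched as follows; the lookahead distance, sign conventions, and function name are illustrative assumptions, not details taken from the cited works:

```python
import math

def los_desired_heading(x, y, wx1, wy1, wx2, wy2, lookahead=50.0):
    """Lookahead-based LOS guidance sketch: desired heading (rad) that
    steers the ship toward the path segment (wx1, wy1) -> (wx2, wy2)."""
    # Angle of the path segment in the horizontal plane
    pi_p = math.atan2(wy2 - wy1, wx2 - wx1)
    # Cross-track error: signed lateral distance from the ship to the path
    ye = -(x - wx1) * math.sin(pi_p) + (y - wy1) * math.cos(pi_p)
    # Desired heading: path angle plus a correction that drives ye to zero
    return pi_p + math.atan2(-ye, lookahead)
```

When the ship lies on the path, the correction term vanishes and the desired heading coincides with the path tangent; a lateral offset rotates the command back toward the path, with the lookahead distance setting how aggressively.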
In recent years, research on trajectory control has primarily focused on deep reinforcement learning (DRL) methods. Zhao L et al. [21] designed a trajectory controller based on the PPO algorithm, achieving good tracking performance in unknown environments. Song et al. [22] developed a trajectory controller combining carrot-chasing guidance with the PPO algorithm, which avoids complex parameter calculations and demonstrates high tracking accuracy in disturbed environments. Zhang et al. [23] and Zhu et al. [24] designed trajectory controllers combining line-of-sight (LOS) guidance with the DDPG algorithm; simulation results showed good tracking performance, although disturbances were not considered in their tests. Wang [25] designed a trajectory controller based on DDPG-H, achieving precise tracking under both disturbance-free and disturbed conditions, with relatively smooth rudder angle outputs. Zhao Y et al. [26] proposed a trajectory control method based on DQN with smooth convergence. Chen et al. [27] proposed a ship route planning method that integrates the A* algorithm with a Double Deep Q-Network (A-DDQN), demonstrating reductions in fuel consumption and carbon emissions. Cui et al. [28] incorporated LSTM and multi-head attention (MHA) mechanisms into the TD3 algorithm network, strengthening its attention to historical state information and achieving superior ship performance compared with conventional methods in complex encounter scenarios. Wang et al. [29] proposed a SAC-based multi-path tracking controller that significantly improves path-following accuracy and success rate; experimental results showed faster convergence and better control performance than the traditional DQN algorithm, which is of practical significance.
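The SAC algorithm referenced in [29], and adopted later in this paper, differs from DQN- and DDPG-style methods chiefly through its entropy-regularized Bellman backup. A minimal scalar sketch of that critic target is given below; the parameter values and function name are illustrative, not taken from any cited implementation:

```python
def sac_target(reward, q1_next, q2_next, logp_next,
               gamma=0.99, alpha=0.2, done=False):
    """Soft Actor-Critic critic target (scalar sketch):
    y = r + gamma * (1 - done) * (min(Q1', Q2') - alpha * log pi(a'|s')).
    The min over twin Q-values curbs overestimation; the -alpha*log pi
    term rewards policy entropy, encouraging exploration."""
    soft_value = min(q1_next, q2_next) - alpha * logp_next
    return reward + gamma * (0.0 if done else 1.0) * soft_value
```

The entropy term is what distinguishes the backup from a standard DDPG/TD3 target: a more stochastic policy (lower log-probability of the sampled action) yields a higher soft value, which in ship control helps the agent keep exploring rudder commands early in training.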
In practical ship control applications, vessel trajectory tracking primarily relies on PID control and Model Predictive Control (MPC). Due to the inherent limitations of these two control algorithms, some real-world vessels adopt a hybrid “PID+MPC” architecture. For instance, Kongsberg’s Dynamic Positioning (DP) system, deployed on thousands of offshore vessels, traditionally uses PID for thrust allocation. Meanwhile, Hyundai Heavy Industries has applied MPC technology in LNG ship berthing systems, where its core functionality lies in multi-step prediction and constrained optimization to ensure safe docking in complex environments. However, these methods exhibit limited adaptability in harsh sea conditions and require high parameter-tuning costs, prompting the industry to explore smarter solutions.
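The PID half of the hybrid architecture described above follows the textbook control law u = Kp·e + Ki·∫e dt + Kd·de/dt. A minimal discrete-time sketch is shown below; the gains and the heading-control use case are illustrative assumptions, not parameters of any deployed system:

```python
class PID:
    """Textbook discrete-time PID, e.g. mapping heading error to a
    rudder command. Gains are illustrative and untuned."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt            # accumulate error
        derivative = (error - self.prev_error) / self.dt  # error rate
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

The three fixed gains are precisely the limitation noted above: they must be retuned whenever the vessel's loading or the sea state changes, which is what motivates hybrid and learning-based alternatives.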
Deep reinforcement learning, owing to its advantages in adaptive decision-making, has gradually transitioned from academic research to practical applications in recent years. Researchers at the Norwegian University of Science and Technology developed a DRL-based ship navigation controller and conducted experiments using a 1:75.5 scale physical ship model. Results showed that under wind-free conditions, the controller achieved precise tracking of a 40 m square trajectory, with measured position and heading angles aligning closely with simulation data. Even under strong wind disturbances, the system demonstrated robustness by completing complex maneuvers such as “figure-8” paths, despite initial deviations, thereby validating the feasibility of DRL.
A review of current research in ship motion control reveals that while traditional trajectory control algorithms can accomplish tracking tasks, they suffer from limited adaptability and high tuning costs. DRL algorithms effectively address these shortcomings and complement other parallel lines of research. However, as studies progress, challenges such as prolonged training cycles and underfitting or overfitting of the learned policy have become apparent. Given these characteristics, this study develops a Soft Actor–Critic-based algorithm to enhance training efficiency in ship control. This approach not only addresses a key challenge in ship trajectory tracking control but also constitutes the core contribution of this paper.