Article

Output Feedback Adaptive Optimal Control of Multiple Unmanned Marine Vehicles with Unknown External Disturbance

by Liang-En Yuan 1, Yang Xiao 2,*, Tieshan Li 1,3,4 and Dalin Zhou 5
1 Navigation College, Dalian Maritime University, Dalian 116026, China
2 Department of Computer Science, The University of Alabama, Tuscaloosa, AL 35487-0290, USA
3 College of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
4 Yangtze Delta Region Institute, University of Electronic Science and Technology of China, Huzhou 313001, China
5 School of Computing, University of Portsmouth, Portsmouth PO1 3HE, UK
* Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2024, 12(10), 1697; https://doi.org/10.3390/jmse12101697
Submission received: 29 July 2024 / Revised: 9 September 2024 / Accepted: 15 September 2024 / Published: 25 September 2024
(This article belongs to the Special Issue Unmanned Marine Vehicles: Navigation, Control and Sensing)

Abstract

This paper addresses the optimal output-feedback tracking control problem for multiple unmanned marine vehicles (UMVs) that must track a desired trajectory. To achieve the control objective in an optimal manner, adaptive dynamic programming (ADP) with an optimal compensation term is adopted. A neural velocity observer based on a neural network (NN) is designed to estimate the unmeasured system states and the unknown system dynamics. Furthermore, a disturbance observer (DO) is proposed to counteract the unknown external disturbances of the sea environment. It is proved that the proposed controller guarantees that all signals in the closed-loop system are bounded. Simulation results are given to demonstrate the effectiveness of the proposed control algorithm.

1. Introduction

In recent years, unmanned marine vehicles (UMVs) have attracted increasing attention for the exploration of natural resources in oceanic spaces due to their unique advantages, such as low energy costs and advanced intelligence. Furthermore, UMVs can complete dangerous tasks without putting human lives at risk [1,2]. However, a single UMV may not suffice to execute complex tasks in some situations. Thus, the coordinated control of multiple UMVs has become a burgeoning research topic [3]. Some important theoretical results on the coordinated control of multiple vehicles have been reported [4,5,6,7,8]. In [8], a time-varying formation control problem was presented. Connectivity preservation and collision avoidance were investigated with position-heading measurements. In [5], the path-tracking control of under-actuated unmanned surface vehicles was investigated. Model uncertainties and unknown disturbances over a wireless network were considered. This research has far-reaching implications; however, because the aforementioned control design strategies do not consider the optimization problem, the resulting control effort may consume a large amount of energy.
Optimal control is a fundamental design principle that can be adopted to enhance the control performance of dynamic systems. Given that ocean transportation and deep-sea exploration frequently require a substantial energy supply, it becomes imperative to incorporate optimization into vessel control design to reduce energy consumption. However, the optimal control of UMVs is a difficult problem due to the inherent nonlinearities of the UMV system. Optimization problems for nonlinear systems often require solving the Hamilton–Jacobi–Bellman (HJB) equation, which in general does not have a closed-form solution. To handle this difficulty, a promising adaptive optimal method based on reinforcement learning (RL) was presented, namely, adaptive dynamic programming (ADP) [9,10,11,12], in which an RL system is designed to approximate the solution of the HJB equation [13,14,15]. Many studies on vehicle control using the ADP method have been reported [16,17,18]. In [16], the optimal tracking control problem for the dynamic positioning of marine vessels was investigated. An observer based on a fuzzy logic system (FLS) was given to handle the unmeasured states of the vessels. A disturbance observer (DO) was given to estimate the external disturbances. In [17], a data-driven RL-based controller was introduced to address the optimal control problem of the vehicle. Using a data-driven approach, a model-free control method was formulated to achieve control optimality and prescribed tracking accuracy concurrently. In [18], an RL-based optimal tracking control scheme for UMVs was presented. The unknown dead-zone input nonlinearities and unknown disturbances were considered and handled by using a neural network (NN)-based identifier. This research shows how the optimal control problem of a single UMV can be solved. However, the optimal tracking control problem of multiple UMVs cannot be handled directly using these methods.
In practical applications, accurately obtaining velocity information from shipborne sensors is often challenging due to the potential impact of noise interference and sensor signal loss on its effectiveness [19]. To handle this difficulty, some research based on the observer has been reported [3,20,21,22]. In [20], a study on the distributed containment maneuvering problem was conducted. Adopting the input–output data from each vehicle, a novel approach utilizing an echo state network-based observer was introduced to address the issue of unmeasured states. In [3], the flocking control problem of vehicles was studied. The velocity information of the vehicles was obtained by employing an extended state observer. In [22], the path-following problem of vehicles was studied. Unmeasured states were handled by giving a state observer based on FLS. From the studies mentioned above, establishing an observer to handle the unmeasured velocity information of vehicles is an effective approach.
When working in a sea environment, the control effectiveness of a UMV can be influenced by external disturbances such as wind and waves, which may lead to a failure to achieve the control target [23,24]. Thus, it is important to adopt disturbance rejection techniques to reduce the effect of unknown external disturbances. In recent years, many disturbance observer (DO)-based controllers have been reported [24,25,26]. In [24], robust leader–follower synchronization navigation for UMVs was presented. The problem of unknown external disturbance was solved by adopting a DO. In [26], the trajectory tracking control problem of UMVs was solved using the dynamic surface control technique. The time-varying disturbances were considered and estimated by a proposed DO. These results are fruitful and inspiring.
Motivated by the observations described above, the formation tracking problem of multiple UMVs is studied in this paper. An ADP algorithm with an optimal compensation term is adopted to guarantee the control objective optimally. An adaptive controller is designed using the backstepping technique; the optimal tracking control problem is transformed into an equivalent optimal regulation problem. Subsequently, an optimal compensation term is designed by using the policy iteration method. The overall optimal control input is the adaptive controller plus the optimal compensation term. To handle the unknown time-varying external disturbances of the sea environment, a DO is designed. It is proved that the proposed controller can guarantee that all signals in the closed-loop system are bounded. Simulation results are given to demonstrate the effectiveness of the proposed control algorithm.
The main contribution of this work can be summarized as follows.
(1) Unlike the references [4,5,6,7,8] investigating the tracking control problem of multiple UMVs without consideration of optimality, this paper considers optimality for designing the consensus controller. This means that the control method proposed in this paper can achieve the tracking control target with less energy consumption.
(2) Compared with the works in [17,18], which only investigated the optimal tracking control problem of a single UMV, this paper investigates the optimal tracking control problem of multiple UMVs. Therefore, the task addressed here is more challenging than those in the existing studies.
The rest of the paper is organized as follows: Section 2 provides the problem formulation, Section 3.1 presents the neural observer, Section 3.2 presents the DO, Section 3.3 and Section 3.4 introduce the adaptive controller and the optimal compensation term, Section 3.5 provides a stability analysis, Section 4 provides the simulation results, and Section 5 presents concluding remarks.

2. Problem Formulation

Two reference frames are adopted: a body-fixed reference frame (BF) and a north-east-down reference frame (NED), which can be found in Figure 1. These two reference frames are generally adopted in ship motion control, and readers can find more details in [27,28,29].
Considering the leader–follower formation control problem of $m$ UMVs, the underactuated system dynamics of 3 degrees of freedom (3-DOF) with uncertain dynamics can be described as follows [28,29,30]:
$$
\begin{aligned}
\dot{\eta}_i &= R(\psi_i)\,\nu_i \\
M_i \dot{\nu}_i &= -C_i(\nu_i)\nu_i - D_i(\nu_i)\nu_i + \Delta(\eta_i,\nu_i) + d_i + \tau_i, \qquad i = 1, 2, \ldots, m
\end{aligned}
\tag{1}
$$
where $\eta_i = [x_i, y_i, \psi_i]^\top$, $(x_i, y_i)$ is the position of the UMV in the earth-fixed frame, $\psi_i$ is the heading angle in the earth-fixed frame, $\nu_i = [u_i, v_i, r_i]^\top$ denotes the velocities of the UMV in the body-fixed frame, and $\tau_i = [\tau_{i1}, \tau_{i2}, \tau_{i3}]^\top$ denotes the control input. $\Delta(\eta_i,\nu_i) \in \mathbb{R}^{3\times 1}$ denotes the uncertain dynamics. $d_i$ is the unknown disturbance from the sea environment. $R(\psi_i) \in \mathbb{R}^{3\times 3}$ is the rotation matrix from the body-fixed frame to the earth-fixed frame, which is given as follows:
$$
R(\psi_i) = \begin{bmatrix} \cos(\psi_i) & -\sin(\psi_i) & 0 \\ \sin(\psi_i) & \cos(\psi_i) & 0 \\ 0 & 0 & 1 \end{bmatrix}
\tag{2}
$$
$M_i \in \mathbb{R}^{3\times 3}$ is an inertia matrix including hydrodynamic added inertia; $D_i(\nu_i) \in \mathbb{R}^{3\times 3}$ is the damping matrix; $C_i(\nu_i) \in \mathbb{R}^{3\times 3}$ is a matrix of the centripetal and Coriolis terms. These three terms are shown as follows:
$$
M_i = \begin{bmatrix} m_{i11} & 0 & 0 \\ 0 & m_{i22} & m_{i23} \\ 0 & m_{i32} & m_{i33} \end{bmatrix}, \quad
C_i(\nu_i) = \begin{bmatrix} 0 & 0 & c_{13i}(\nu_i, r_i) \\ 0 & 0 & m_{i11}u_i \\ -c_{13i}(\nu_i, r_i) & -m_{i11}u_i & 0 \end{bmatrix}, \quad
D_i(\nu_i) = \begin{bmatrix} d_{i11} & 0 & 0 \\ 0 & d_{i22} & d_{i23} \\ 0 & d_{i32} & d_{i33} \end{bmatrix}
\tag{3}
$$
where $m_{i11} = m_i - X_{i\dot{u}}$, $m_{i22} = m_i - Y_{i\dot{v}}$, $m_{i32} = m_i x_{iG} - N_{i\dot{v}}$, $m_{i33} = I_{iz} - N_{i\dot{r}}$, and $c_{13i}(\nu_i, r_i) = -m_{i22}v_i - m_{i23}r_i$. $m_i$ denotes the mass of the vessel, and $x_{iG}$ denotes the distance between the center of gravity of the vessel and the origin of the body-fixed frame. $I_{iz}$ is the moment of inertia. $d_{i11} = -X_{iu} - X_{i|u|u}|u_i|$, $d_{i22} = -Y_{iv} - Y_{i|v|v}|v_i| - Y_{i|r|v}|r_i|$, $d_{i23} = -Y_{ir} - Y_{i|v|r}|v_i| - Y_{i|r|r}|r_i|$, $d_{i32} = -N_{iv} - N_{i|v|v}|v_i| - N_{i|r|v}|r_i|$, and $d_{i33} = -N_{ir} - N_{i|v|r}|v_i| - N_{i|r|r}|r_i|$. $X$, $Y$, and $N$ are the corresponding hydrodynamic derivatives. Consider a virtual leader moving along a desired trajectory $\eta_{0d} = [x_{0d}, y_{0d}, \psi_{0d}]^\top$.
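The kinematic half of System (1) can be illustrated with a short sketch. It only assumes the standard planar rotation matrix of Equation (2); the helper names `rotation`, `matvec`, and `eta_dot` are hypothetical and chosen for illustration.

```python
import math

def rotation(psi):
    """Rotation matrix R(psi) of Equation (2) for a 3-DOF surface vessel."""
    c, s = math.cos(psi), math.sin(psi)
    return [[c, -s, 0.0],
            [s,  c, 0.0],
            [0.0, 0.0, 1.0]]

def matvec(A, x):
    """Plain matrix-vector product for lists of lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def eta_dot(eta, nu):
    """Kinematics eta_dot = R(psi) nu, with eta = [x, y, psi], nu = [u, v, r]."""
    return matvec(rotation(eta[2]), nu)

# Pure surge at a heading of 90 degrees moves the vessel along the y axis.
rate = eta_dot([0.0, 0.0, math.pi / 2], [1.0, 0.0, 0.2])
```

With a heading of 90 degrees, a unit surge velocity shows up entirely in the earth-fixed $y$ rate, while the yaw rate passes through unchanged.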
For simplicity, we can obtain the following equation by using System (1):
$$
\begin{aligned}
\dot{\eta}_i &= \upsilon_i \\
\dot{\upsilon}_i &= f_i(\eta_i,\upsilon_i) + R_i(\psi_i) M_i^{-1} (\tau_i + d_i), \qquad i = 1, 2, \ldots, m
\end{aligned}
\tag{4}
$$
where
$$
\upsilon_i = R_i(\psi_i)\,\nu_i
\tag{5}
$$
and
$$
f_i(\eta_i,\upsilon_i) = R_i(\psi_i) M_i^{-1} \big( -C_i(\nu_i)\nu_i - D_i(\nu_i)\nu_i + \Delta(\eta_i,\nu_i) \big) + \dot{R}_i(\psi_i)\,\nu_i
\tag{6}
$$
$f_i(\eta_i,\upsilon_i)$ is an unknown function since $\Delta(\eta_i,\nu_i)$ is unknown. To handle this problem, an NN is adopted to approximate it.
Control Problem Statement: This study aims to design an output feedback control algorithm that can handle the optimal tracking control problem of multiple UMVs with unknown external disturbance and uncertain dynamics. The controller can ensure that all the signals in the closed-loop are bounded.

3. Main Results

3.1. Neural Observer Design

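A neural state observer (NSO) is adopted to handle the unmeasured velocities and unknown system dynamics. From (4), we can obtain the following equations:
$$
\begin{aligned}
\dot{X}_i &= A_i X_i + B_{i1} f_i(\eta_i,\upsilon_i) + D_i B_{i1} (\tau_i + d_i) \\
\dot{Y}_i &= A_i Y_i + B_{i2} f_i(\eta_i,\upsilon_i) + D_i B_{i2} (\tau_i + d_i) \\
\dot{\Psi}_i &= A_i \Psi_i + B_{i3} f_i(\eta_i,\upsilon_i) + D_i B_{i3} (\tau_i + d_i)
\end{aligned}
\tag{7}
$$
where $X_i = [x_i, u_i]^\top$, $Y_i = [y_i, v_i]^\top$, $\Psi_i = [\psi_i, r_i]^\top$, $A_i = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$, $B_{i1} = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}$, $B_{i2} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$, $B_{i3} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$, and $D_i = R_i(\psi_i) M_i^{-1}$. Since $f_i(\eta_i,\upsilon_i)$ is unknown, an NN is adopted to obtain the approximation, which can be described as
$$
f_i(\eta_i,\upsilon_i) = \theta_i^{*\top} \varphi_i(\hat{\eta}_i,\hat{\upsilon}_i) + \varepsilon_i
\tag{8}
$$
$\varepsilon_i$ denotes the minimum approximation error, i.e., $\|\varepsilon_i\| \le \varepsilon_{im}$, where $\varepsilon_{im} \in \mathbb{R}^{3\times 1}$ is a constant vector, and $\|\varphi_i\| \le \varphi_{im}$. We can obtain the approximation of $f_i(\eta_i,\upsilon_i)$ as
$$
\hat{f}_i(\hat{\eta}_i,\hat{\upsilon}_i) = \hat{\theta}_i^\top \varphi_i(\hat{\eta}_i,\hat{\upsilon}_i)
\tag{9}
$$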
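As a minimal sketch of the approximation $\hat{\theta}_i^\top \varphi_i$ in Equation (9), the block below evaluates a linear-in-the-weights network. The paper does not state which basis functions are used, so the Gaussian radial basis functions, the centers, the width, and the weight values here are all assumptions made for illustration.

```python
import math

def rbf_basis(x, centers, width):
    """Gaussian basis vector phi(x); the Gaussian form is an assumption."""
    return [math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / width ** 2)
            for c in centers]

def nn_output(theta, phi):
    """theta^T phi, with theta stored as n_basis rows of n_out weights."""
    n_out = len(theta[0])
    return [sum(theta[k][j] * phi[k] for k in range(len(phi)))
            for j in range(n_out)]

# Hypothetical centers and weights; only the middle basis has nonzero weights.
centers = [[-1.0, -1.0], [0.0, 0.0], [1.0, 1.0]]
theta_hat = [[0.0, 0.0, 0.0], [1.0, -2.0, 0.5], [0.0, 0.0, 0.0]]
phi = rbf_basis([0.0, 0.0], centers, width=1.0)
f_hat = nn_output(theta_hat, phi)
```

Because each Gaussian basis value lies in $(0, 1]$, a bound of the form $\|\varphi_i\| \le \varphi_{im}$ holds automatically for this choice of basis.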
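We can design the neural velocity observer as follows:
$$
\begin{aligned}
\dot{\hat{X}}_i &= A_{i1} \hat{X}_i + C_i X_i + B_{i1} \hat{\theta}_i^\top \varphi_i(\hat{\eta}_i,\hat{\upsilon}_i) + D_i B_{i1} (\tau_i + \hat{d}_i) \\
\dot{\hat{Y}}_i &= A_{i2} \hat{Y}_i + C_i Y_i + B_{i2} \hat{\theta}_i^\top \varphi_i(\hat{\eta}_i,\hat{\upsilon}_i) + D_i B_{i2} (\tau_i + \hat{d}_i) \\
\dot{\hat{\Psi}}_i &= A_{i3} \hat{\Psi}_i + C_i \Psi_i + B_{i3} \hat{\theta}_i^\top \varphi_i(\hat{\eta}_i,\hat{\upsilon}_i) + D_i B_{i3} (\tau_i + \hat{d}_i)
\end{aligned}
\tag{10}
$$
where $\hat{X}_i = [\hat{x}_i, \hat{u}_i]^\top$, $\hat{Y}_i = [\hat{y}_i, \hat{v}_i]^\top$, and $\hat{\Psi}_i = [\hat{\psi}_i, \hat{r}_i]^\top$. $A_{i1} = \begin{bmatrix} -k_{i11} & 1 \\ -k_{i12} & 0 \end{bmatrix}$, $A_{i2} = \begin{bmatrix} -k_{i21} & 1 \\ -k_{i22} & 0 \end{bmatrix}$, $A_{i3} = \begin{bmatrix} -k_{i31} & 1 \\ -k_{i32} & 0 \end{bmatrix}$, and $C_i = [1, 0]$. $A_{i1}$, $A_{i2}$, and $A_{i3}$ are made Hurwitz by choosing suitable parameters $k_{i11}$, $k_{i12}$, $k_{i21}$, $k_{i22}$, $k_{i31}$, and $k_{i32}$. $P_{i1}$, $P_{i2}$, $P_{i3}$, $Q_{i1}$, $Q_{i2}$, and $Q_{i3}$ are positive definite matrices that satisfy $A_{i1}^\top P_{i1} + P_{i1} A_{i1} = -Q_{i1}$, $A_{i2}^\top P_{i2} + P_{i2} A_{i2} = -Q_{i2}$, and $A_{i3}^\top P_{i3} + P_{i3} A_{i3} = -Q_{i3}$, where $Q_{i1} = Q_{i1}^\top > 0$, $Q_{i2} = Q_{i2}^\top > 0$, and $Q_{i3} = Q_{i3}^\top > 0$. $\hat{d}_i$ is the disturbance estimate provided by the DO. Let $\tilde{d}_i = d_i - \hat{d}_i$ be the disturbance estimation error. Define the NSO error dynamics as $\dot{\tilde{X}}_i = \dot{X}_i - \dot{\hat{X}}_i$, $\dot{\tilde{Y}}_i = \dot{Y}_i - \dot{\hat{Y}}_i$, and $\dot{\tilde{\Psi}}_i = \dot{\Psi}_i - \dot{\hat{\Psi}}_i$. From (7) and (10), we can obtain the error dynamics of the NSO as follows:
$$
\begin{aligned}
\dot{\tilde{X}}_i &= A_{i1} \tilde{X}_i + B_{i1} \tilde{\theta}_i^\top \varphi_i(\hat{\eta}_i,\hat{\upsilon}_i) + D_i B_{i1} \tilde{d}_i + B_{i1} \varepsilon_i \\
\dot{\tilde{Y}}_i &= A_{i2} \tilde{Y}_i + B_{i2} \tilde{\theta}_i^\top \varphi_i(\hat{\eta}_i,\hat{\upsilon}_i) + D_i B_{i2} \tilde{d}_i + B_{i2} \varepsilon_i \\
\dot{\tilde{\Psi}}_i &= A_{i3} \tilde{\Psi}_i + B_{i3} \tilde{\theta}_i^\top \varphi_i(\hat{\eta}_i,\hat{\upsilon}_i) + D_i B_{i3} \tilde{d}_i + B_{i3} \varepsilon_i
\end{aligned}
\tag{11}
$$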
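The Hurwitz requirement on the observer matrices can be checked numerically. The sketch below assumes the companion-type form $A = \begin{bmatrix} -k_1 & 1 \\ -k_2 & 0 \end{bmatrix}$, whose characteristic polynomial is $s^2 + k_1 s + k_2$; the function names are hypothetical.

```python
import cmath

def observer_poles(k1, k2):
    """Eigenvalues of A = [[-k1, 1], [-k2, 0]], i.e. roots of s^2 + k1 s + k2."""
    disc = cmath.sqrt(k1 * k1 - 4.0 * k2)
    return ((-k1 + disc) / 2.0, (-k1 - disc) / 2.0)

def is_hurwitz(k1, k2):
    """True when both eigenvalues have strictly negative real parts."""
    return all(p.real < 0.0 for p in observer_poles(k1, k2))

# Gains of 50, the value used for all observer gains in the simulation section.
poles = observer_poles(50.0, 50.0)
```

For this $2 \times 2$ structure, any pair of strictly positive gains gives a Hurwitz matrix, so the choice of gains mainly sets how fast the observer error decays.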

3.2. Disturbance Observer Design

To implement the DO, we begin by defining the auxiliary vector $q_i$ for each vehicle as follows:
$$
q_i = d_i - K_{id}\,\upsilon_i
\tag{12}
$$
where $q_i = [q_{i1}, q_{i2}, q_{i3}]^\top$, and $K_{id} \in \mathbb{R}^{3\times 3}$ is a positive definite design matrix. We can obtain the time derivative of $q_i$ as
$$
\dot{q}_i = \dot{d}_i - K_{id}\,\dot{\upsilon}_i = \dot{d}_i - K_{id} \big( f_i(\eta_i,\upsilon_i) + R_i(\psi_i) M_i^{-1} (\tau_i + q_i + K_{id}\upsilon_i) \big)
\tag{13}
$$
Since $d_i$ is unknown, $q_i$ is also unknown. We can obtain an approximation of $q_i$ using the following equation:
$$
\dot{\hat{q}}_i = -K_{id} \big( \hat{f}_i(\hat{\eta}_i,\hat{\upsilon}_i) + R_i(\psi_i) M_i^{-1} (\tau_i + \hat{q}_i + K_{id}\hat{\upsilon}_i) \big)
\tag{14}
$$
Thus, we obtain $\hat{d}_i$ as follows:
$$
\hat{d}_i = \hat{q}_i + K_{id}\,\hat{\upsilon}_i
\tag{15}
$$
Let $\tilde{q}_i = q_i - \hat{q}_i$. The time derivative of $\tilde{q}_i$ is described as follows:
$$
\dot{\tilde{q}}_i = \dot{q}_i - \dot{\hat{q}}_i = \dot{d}_i - K_{id}\,\tilde{\theta}_i^\top \varphi_i(\hat{\eta}_i,\hat{\upsilon}_i) - K_{id} R_i(\psi_i) M_i^{-1} \tilde{q}_i
\tag{16}
$$
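The mechanism of the DO can be seen in a scalar toy problem. The sketch below is not the paper's simulation: it assumes a one-dimensional plant $\dot{\upsilon} = f + b(\tau + d)$ with known $f = -\upsilon$, a measured velocity, and a constant disturbance, so that the estimation error obeys $\dot{\tilde{q}} = -Kb\,\tilde{q}$. All numerical values are hypothetical.

```python
def simulate_do(d=2.0, K=10.0, b=1.0, dt=1e-3, steps=2000):
    """Euler simulation of a scalar disturbance observer with constant d."""
    v = 0.0        # velocity, assumed measurable in this toy example
    q_hat = 0.0    # observer state, estimate of q = d - K v
    tau = 0.0      # control input held at zero
    for _ in range(steps):
        f = -v                                    # known dynamics (assumption)
        v_dot = f + b * (tau + d)                 # true plant
        q_hat_dot = -K * (f + b * (tau + q_hat + K * v))
        v += dt * v_dot
        q_hat += dt * q_hat_dot
    return q_hat + K * v                          # d_hat = q_hat + K v

d_hat = simulate_do()
```

In the full design the velocity and $f_i$ are replaced by their estimates, which is why the extra terms in $\tilde{\theta}_i$ and $\dot{d}_i$ appear in the error dynamics and the estimate is only guaranteed to be bounded rather than exact.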

3.3. Adaptive Controller Design

The proposed adaptive controller is designed utilizing the backstepping technique. First, the change of coordinates is given as follows:
$$
z_{i1} = \eta_i - \eta_{id}
\tag{17}
$$
$$
z_{i2} = \hat{\upsilon}_i - \alpha_i
\tag{18}
$$
where $\eta_{id} = \eta_d + R_i(\psi_i)\,p_i$, and $\eta_d = [x_d, y_d, \psi_d]^\top$ represents the tracking trajectory of the leader. Here, $p_i = [p_{ix}, p_{iy}, 0]^\top$, where $p_{ix}$ and $p_{iy}$ denote the relative position between the $i$-th UMV and the leader in the $X_E$ and $Y_E$ directions, respectively. $\alpha_i$ serves as the virtual control for design purposes, where $\alpha_{ia}$ and $\alpha_i^*$ are the adaptive virtual control and the optimal compensation term, respectively. The actual virtual control combines these two terms, i.e., $\alpha_i = \alpha_{ia} + \alpha_i^*$.
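The desired pose of each follower, $\eta_{id} = \eta_d + R_i(\psi_i)\,p_i$, can be sketched as follows. The function name and the numerical leader pose and offset are hypothetical; only the formula itself is taken from the text.

```python
import math

def formation_reference(eta_d, psi_i, p_i):
    """Desired pose eta_id = eta_d + R(psi_i) p_i for follower i."""
    c, s = math.cos(psi_i), math.sin(psi_i)
    px, py, _ = p_i            # third component of p_i is zero by definition
    return [eta_d[0] + c * px - s * py,
            eta_d[1] + s * px + c * py,
            eta_d[2]]

# Leader at (10, 5) with heading 0.3 rad; follower offset of 1 along its own x axis.
ref = formation_reference([10.0, 5.0, 0.3], psi_i=math.pi / 2, p_i=[1.0, 0.0, 0.0])
```

Since the last entry of $p_i$ is zero, the heading reference of each follower equals the heading of the leader, and only the position is shifted.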
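From (17), we can obtain the following:
$$
\dot{z}_{i1} = \dot{\eta}_i - \dot{\eta}_{id} = \upsilon_i - \dot{\eta}_{id} = z_{i2} + \alpha_{ia} + \alpha_i^* - \dot{\eta}_{id}
\tag{19}
$$
To achieve the control objective, we design the following Lyapunov function:
$$
V_{i1} = \tfrac{1}{2} \tilde{X}_i^\top P_{i1} \tilde{X}_i + \tfrac{1}{2} \tilde{Y}_i^\top P_{i2} \tilde{Y}_i + \tfrac{1}{2} \tilde{\Psi}_i^\top P_{i3} \tilde{\Psi}_i + \tfrac{1}{2} z_{i1}^\top z_{i1}
\tag{20}
$$
We can obtain the following:
$$
\begin{aligned}
\dot{V}_{i1} ={}& \tfrac{1}{2} \dot{\tilde{X}}_i^\top P_{i1} \tilde{X}_i + \tfrac{1}{2} \tilde{X}_i^\top P_{i1} \dot{\tilde{X}}_i + \tfrac{1}{2} \dot{\tilde{Y}}_i^\top P_{i2} \tilde{Y}_i + \tfrac{1}{2} \tilde{Y}_i^\top P_{i2} \dot{\tilde{Y}}_i + \tfrac{1}{2} \dot{\tilde{\Psi}}_i^\top P_{i3} \tilde{\Psi}_i + \tfrac{1}{2} \tilde{\Psi}_i^\top P_{i3} \dot{\tilde{\Psi}}_i + z_{i1}^\top \dot{z}_{i1} \\
\le{}& -\tfrac{1}{2} \lambda_{\min}(Q_{i1}) \|\tilde{X}_i\|^2 - \tfrac{1}{2} \lambda_{\min}(Q_{i2}) \|\tilde{Y}_i\|^2 - \tfrac{1}{2} \lambda_{\min}(Q_{i3}) \|\tilde{\Psi}_i\|^2 \\
& + \tilde{X}_i^\top P_{i1} \big( B_{i1} \tilde{\theta}_i^\top \varphi_i(\hat{\eta}_i,\hat{\upsilon}_i) + B_{i1} \varepsilon_i + D_i B_{i1} \tilde{d}_i \big) \\
& + \tilde{Y}_i^\top P_{i2} \big( B_{i2} \tilde{\theta}_i^\top \varphi_i(\hat{\eta}_i,\hat{\upsilon}_i) + B_{i2} \varepsilon_i + D_i B_{i2} \tilde{d}_i \big) \\
& + \tilde{\Psi}_i^\top P_{i3} \big( B_{i3} \tilde{\theta}_i^\top \varphi_i(\hat{\eta}_i,\hat{\upsilon}_i) + B_{i3} \varepsilon_i + D_i B_{i3} \tilde{d}_i \big) \\
& + z_{i1}^\top \big( z_{i2} + \alpha_{ia} + \alpha_i^* - \dot{\eta}_{id} \big)
\end{aligned}
\tag{21}
$$
By using Young’s inequality, we can obtain the following:
$$
\tilde{X}_i^\top P_{i1} B_{i1} \tilde{\theta}_i^\top \varphi_i(\hat{\eta}_i,\hat{\upsilon}_i) \le \tfrac{1}{2} \lambda_{\max}^2(P_{i1}) \|\tilde{X}_i\|^2 + \tfrac{1}{2} \|\tilde{\theta}_i\|^2
\tag{22}
$$
$$
\tilde{Y}_i^\top P_{i2} B_{i2} \tilde{\theta}_i^\top \varphi_i(\hat{\eta}_i,\hat{\upsilon}_i) \le \tfrac{1}{2} \lambda_{\max}^2(P_{i2}) \|\tilde{Y}_i\|^2 + \tfrac{1}{2} \|\tilde{\theta}_i\|^2
\tag{23}
$$
$$
\tilde{\Psi}_i^\top P_{i3} B_{i3} \tilde{\theta}_i^\top \varphi_i(\hat{\eta}_i,\hat{\upsilon}_i) \le \tfrac{1}{2} \lambda_{\max}^2(P_{i3}) \|\tilde{\Psi}_i\|^2 + \tfrac{1}{2} \|\tilde{\theta}_i\|^2
\tag{24}
$$
$$
\tilde{X}_i^\top P_{i1} B_{i1} \varepsilon_i \le \tfrac{1}{2} \lambda_{\max}^2(P_{i1}) \|\tilde{X}_i\|^2 + \tfrac{1}{2} \varepsilon_{im}^2
\tag{25}
$$
$$
\tilde{Y}_i^\top P_{i2} B_{i2} \varepsilon_i \le \tfrac{1}{2} \lambda_{\max}^2(P_{i2}) \|\tilde{Y}_i\|^2 + \tfrac{1}{2} \varepsilon_{im}^2
\tag{26}
$$
$$
\tilde{\Psi}_i^\top P_{i3} B_{i3} \varepsilon_i \le \tfrac{1}{2} \lambda_{\max}^2(P_{i3}) \|\tilde{\Psi}_i\|^2 + \tfrac{1}{2} \varepsilon_{im}^2
\tag{27}
$$
$$
\tilde{X}_i^\top P_{i1} D_i B_{i1} \tilde{d}_i \le \tfrac{1}{2} \lambda_{\max}^2(P_{i1} D_i B_{i1}) \|\tilde{X}_i\|^2 + \tfrac{1}{2} \|\tilde{d}_i\|^2
\tag{28}
$$
$$
\tilde{Y}_i^\top P_{i2} D_i B_{i2} \tilde{d}_i \le \tfrac{1}{2} \lambda_{\max}^2(P_{i2} D_i B_{i2}) \|\tilde{Y}_i\|^2 + \tfrac{1}{2} \|\tilde{d}_i\|^2
\tag{29}
$$
$$
\tilde{\Psi}_i^\top P_{i3} D_i B_{i3} \tilde{d}_i \le \tfrac{1}{2} \lambda_{\max}^2(P_{i3} D_i B_{i3}) \|\tilde{\Psi}_i\|^2 + \tfrac{1}{2} \|\tilde{d}_i\|^2
\tag{30}
$$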
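Each of the nine bounds above follows the same pattern. As a worked illustration (a sketch of the standard argument, not a step taken verbatim from the paper), consider the term involving $\varepsilon_i$. It uses Young's inequality in the form $ab \le \tfrac{1}{2}a^2 + \tfrac{1}{2}b^2$, the fact that $\|P_{i1}\| = \lambda_{\max}(P_{i1})$ for a symmetric positive definite $P_{i1}$, and $\|B_{i1}\| = 1$:

```latex
\begin{aligned}
\tilde{X}_i^\top P_{i1} B_{i1} \varepsilon_i
  &\le \|\tilde{X}_i\| \, \|P_{i1}\| \, \|B_{i1}\| \, \|\varepsilon_i\|
      && \text{(Cauchy--Schwarz)} \\
  &\le \big( \lambda_{\max}(P_{i1}) \|\tilde{X}_i\| \big) \, \varepsilon_{im}
      && (\|B_{i1}\| = 1,\ \|\varepsilon_i\| \le \varepsilon_{im}) \\
  &\le \tfrac{1}{2} \lambda_{\max}^2(P_{i1}) \|\tilde{X}_i\|^2 + \tfrac{1}{2} \varepsilon_{im}^2
      && \text{(Young, } a = \lambda_{\max}(P_{i1}) \|\tilde{X}_i\|,\ b = \varepsilon_{im}\text{)}
\end{aligned}
```

The remaining bounds are obtained in the same way, with $\varepsilon_i$ replaced by $\tilde{\theta}_i^\top \varphi_i$ or $\tilde{d}_i$.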
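The adaptive virtual controller $\alpha_{ia}$ can be designed as follows:
$$
\alpha_{ia} = -r_{i1} z_{i1} + \dot{\eta}_{id}
\tag{31}
$$
where $r_{i1} = \operatorname{diag}(r_{i11}, r_{i12}, r_{i13})$ is a positive design parameter matrix. Thus, we can obtain the following:
$$
\dot{V}_{i1} \le -\sigma_{i1} \|\tilde{X}_i\|^2 - \sigma_{i2} \|\tilde{Y}_i\|^2 - \sigma_{i3} \|\tilde{\Psi}_i\|^2 - r_{i1} z_{i1}^\top z_{i1} + z_{i1}^\top z_{i2} + z_{i1}^\top \alpha_i^* + \tfrac{3}{2} \|\tilde{\theta}_i\|^2 + \tfrac{3}{2} \|\tilde{d}_i\|^2 + \tfrac{3}{2} \varepsilon_{im}^2
\tag{32}
$$
where $\sigma_{i1} = \tfrac{1}{2} \lambda_{\min}(Q_{i1}) - \lambda_{\max}^2(P_{i1}) - \tfrac{1}{2} \lambda_{\max}^2(P_{i1} D_i B_{i1})$, $\sigma_{i2} = \tfrac{1}{2} \lambda_{\min}(Q_{i2}) - \lambda_{\max}^2(P_{i2}) - \tfrac{1}{2} \lambda_{\max}^2(P_{i2} D_i B_{i2})$, and $\sigma_{i3} = \tfrac{1}{2} \lambda_{\min}(Q_{i3}) - \lambda_{\max}^2(P_{i3}) - \tfrac{1}{2} \lambda_{\max}^2(P_{i3} D_i B_{i3})$.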
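We design the following Lyapunov function:
$$
V_{i2} = V_{i1} + \tfrac{1}{2} z_{i2}^\top z_{i2} + \tfrac{1}{2} \tilde{\theta}_i^\top \tilde{\theta}_i + \tfrac{1}{2} \tilde{q}_i^\top \tilde{q}_i
\tag{33}
$$
Utilizing the fact that $\tilde{\theta}_i = \theta_i^* - \hat{\theta}_i$, we can obtain the following equation:
$$
\begin{aligned}
\dot{V}_{i2} ={}& \dot{V}_{i1} + z_{i2}^\top \dot{z}_{i2} + \tilde{\theta}_i^\top \dot{\tilde{\theta}}_i + \tilde{q}_i^\top \dot{\tilde{q}}_i \\
={}& \dot{V}_{i1} + z_{i2}^\top \big( \hat{f}_i(\hat{\eta}_i,\hat{\upsilon}_i) + R_i(\psi_i) M_i^{-1} (\tau_i + d_i) - \dot{\alpha}_i + \tilde{\theta}_i^\top \varphi_i(\hat{\eta}_i,\hat{\upsilon}_i) \big) - \tilde{\theta}_i^\top \dot{\hat{\theta}}_i \\
& + \tilde{q}_i^\top \big( \dot{d}_i - K_{id}\,\tilde{\theta}_i^\top \varphi_i(\eta_i,\upsilon_i) - K_{id} R_i(\psi_i) M_i^{-1} \tilde{q}_i \big)
\end{aligned}
\tag{34}
$$
By applying Young’s inequality, the following inequalities can be obtained:
$$
\tilde{q}_i^\top \dot{d}_i \le \tfrac{1}{2} \|\tilde{q}_i\|^2 + \tfrac{1}{2} d_{im}^2
\tag{35}
$$
and
$$
-\tilde{q}_i^\top K_{id}\,\tilde{\theta}_i^\top \varphi_i(\eta_i,\upsilon_i) \le \tfrac{1}{2} \|\tilde{q}_i\|^2 + \tfrac{1}{2} \|K_{id}\|^2 \varphi_{im}^2 \|\tilde{\theta}_i\|^2
\tag{36}
$$
We define $h_i(Z_{i2}) \triangleq f_i(\eta_i,\upsilon_i) - f_i(\alpha_i)$, where $Z_{i2} = [\eta_i, \alpha_i]^\top$, and one has
$$
\begin{aligned}
\dot{V}_{i2} \le{}& \dot{V}_{i1} + z_{i2}^\top \big( h_i(Z_{i2}) + \hat{\theta}_i^\top \varphi_i(\alpha_i) + \tilde{\theta}_i^\top \varphi_i(\alpha_i) + \varepsilon_i + R_i(\psi_i) M_i^{-1} (\tau_{ia} + \tau_i^* + d_i) - \dot{\alpha}_i \big) - \tilde{\theta}_i^\top \dot{\hat{\theta}}_i \\
& - \big( K_{id} R_i(\psi_i) M_i^{-1} - 1 \big) \|\tilde{q}_i\|^2 + \tfrac{1}{2} d_{im}^2 + \tfrac{1}{2} \|K_{id}\|^2 \varphi_{im}^2 \|\tilde{\theta}_i\|^2
\end{aligned}
\tag{37}
$$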
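We can design the adaptive controller and adaptive law as follows:
$$
\tau_{ia} = M_i R_i^\top(\psi_i) \big( -z_{i1} - z_{i2} - r_{i2} z_{i2} - \hat{\theta}_i^\top \varphi_i(\alpha_i) + \dot{\alpha}_i \big) - \hat{q}_i - K_{id}\,\upsilon_i
\tag{38}
$$
and
$$
\dot{\hat{\theta}}_i = z_{i2}\,\varphi_i(\alpha_i) - \hat{\theta}_i - \hat{\theta}_i \|\hat{\theta}_i\|^2
\tag{39}
$$
where $r_{i2} = \operatorname{diag}(r_{i21}, r_{i22}, r_{i23})$ is a positive design matrix. According to Young’s inequality, we can obtain that
$$
\tilde{\theta}_i^\top \hat{\theta}_i \le -\tfrac{1}{2} \|\tilde{\theta}_i\|^2 + \tfrac{1}{2} \|\theta_i^*\|^2
\tag{40}
$$
and
$$
\tilde{\theta}_i^\top \hat{\theta}_i \|\hat{\theta}_i\|^2 \le -\tfrac{1}{10} \|\tilde{\theta}_i\|^4 + \tfrac{1}{2} \|\theta_i^*\|^4
\tag{41}
$$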
From (17) and (18), the tracking error dynamics can be described as follows:
$$
\dot{Z}_i = F_i(Z_i) + G_i U_i^*
\tag{42}
$$
where $Z_i = [z_{i1}, z_{i2}]^\top$, $F_i(Z_i) = [0_{3\times 3}, h_i(Z_{i2})]^\top$, $G_i = \operatorname{diag}(I_{3\times 3}, R_i(\psi_i) M_i^{-1})$, and $U_i^* = [\alpha_i^*, \tau_i^*]^\top$. The cost function of the tracking error system can be described as
$$
J_i = \int_{t_0}^{\infty} r_i\big( Z_i(t), U_i^*(t) \big)\, dt
\tag{43}
$$
where $r_i(Z_i, U_i^*) = Q_i(Z_i) + U_i^{*\top} R_i U_i^*$ is the optimal index, $Q_i(Z_i)$ is a positive definite matrix satisfying $Q_i(Z_i) = 0$ only if $Z_i = 0$ and $\Gamma_{i\min} \le Q_i(Z_i) \le \Gamma_{i\max}$ for $\bar{Z}_{i\min} \le \|Z_i\| \le \bar{Z}_{i\max}$, and $\Gamma_{i\min}$, $\Gamma_{i\max}$, $\bar{Z}_{i\min}$, and $\bar{Z}_{i\max}$ are positive bounds. $R_i = R_i^\top$ is a positive definite weighting matrix. An infinitesimal equivalent of (43) can be written as follows [31]:
$$
\dot{J}_i = \nabla J_i^\top \big( F_i(Z_i) + G_i U_i^* \big) = -Q_i(Z_i) - U_i^{*\top} R_i U_i^*
\tag{44}
$$
We define a Hamiltonian function as
$$
H_i(Z_i, U_i) = r_i(Z_i, U_i) + \big( \nabla J_i(Z_i) \big)^\top \big( F_i(Z_i) + G_i U_i \big)
\tag{45}
$$
where $U_i$ is the associated admissible control, and $\nabla J_i(Z_i)$ is the gradient of $J_i(Z_i)$ with regard to $Z_i$. The optimal controller $U_i^*(Z_i)$ can be obtained by applying the condition $\partial H_i(Z_i, U_i) / \partial U_i = 0$, and one has
$$
U_i^*(Z_i) = -\tfrac{1}{2} R_i^{-1} G_i^\top \nabla J_i^*(Z_i)
\tag{46}
$$
where $\nabla J_i^*(Z_i)$ represents the gradient of $J_i^*(Z_i)$ with respect to $Z_i$. The HJB equation can be described as follows:
$$
Q_i(Z_i) + \big( \nabla J_i^*(Z_i) \big)^\top F_i(Z_i) - \tfrac{1}{4} \big( \nabla J_i^*(Z_i) \big)^\top G_i R_i^{-1} G_i^\top \nabla J_i^*(Z_i) = 0
\tag{47}
$$
with $J_i^*(0) = 0$. It can be concluded that $J_i$ is a continuously differentiable and radially unbounded Lyapunov function, and its derivative under the optimal controller can be described as
$$
\dot{J}_i = \nabla J_i^\top \dot{Z}_i = \nabla J_i^\top \big( F_i(Z_i) + G_i U_i^* \big) \le 0
\tag{48}
$$
By choosing a suitable $Q_i(Z_i)$ satisfying $\lim_{\|Z_i\| \to \infty} Q_i(Z_i) = \infty$ and $\nabla J_i^{*\top} Q_i(Z_i) \nabla J_i = Q_i(Z_i) + U_i^{*\top} R_i U_i^*$, the following equation can be obtained [32]:
$$
\nabla J_i^\top \big( F_i(Z_i) + G_i U_i^* \big) = -\nabla J_i^\top Q_i(Z_i) \nabla J_i
\tag{49}
$$
From (37), (40), and (41), we can obtain that
$$
\begin{aligned}
\dot{V}_{i2} \le{}& -\sigma_{i1} \|\tilde{X}_i\|^2 - \sigma_{i2} \|\tilde{Y}_i\|^2 - \sigma_{i3} \|\tilde{\Psi}_i\|^2 - \gamma_i \|Z_i\|^2 - \tfrac{1}{10} \|\tilde{\theta}_i\|^4 - \big( \tfrac{1}{2} \|K_{id}\|^2 \varphi_{im}^2 - \tfrac{1}{2} \big) \|\tilde{\theta}_i\|^2 \\
& - \big( K_{id} R_i(\psi_i) M_i^{-1} - \tfrac{1}{2} R_i(\psi_i) M_i^{-1} M_i R_i^\top(\psi_i) \big) \|\tilde{q}_i\|^2 + \tfrac{1}{2} \|\theta_i^*\|^2 + \tfrac{1}{2} \|\theta_i^*\|^4 + \tfrac{3}{2} \varepsilon_{im}^2 + \tfrac{1}{2} d_{im}^2 \\
& + Z_i^\top \big( F_i(Z_i) + G_i U_i^* \big)
\end{aligned}
\tag{50}
$$
where $\gamma_i = [r_{i1}, r_{i2}]$. According to (49), it can be seen that $Z_i^\top \big( F_i(Z_i) + G_i U_i^* \big)$ becomes negative when the controller $U_i^*$ is designed by adopting the optimal control method to stabilize the tracking error system (42), which implies that the tracking error $Z_i$ remains bounded and the cost function $J_i$ is minimized. However, the HJB Equation (47) is difficult to solve, so the ADP method is applied in the next section to obtain $U_i^*$.
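For intuition about the HJB equation and the resulting controller $U_i^* = -\tfrac{1}{2} R_i^{-1} G_i^\top \nabla J_i^*$, it helps to look at a case where a closed-form solution exists. The sketch below assumes a scalar linear error system $\dot{z} = az + bu$ with running cost $qz^2 + ru^2$; this special case is an illustration only and is not the UMV error system. With $J^*(z) = pz^2$, the HJB equation reduces to $q + 2ap - (b^2/r)p^2 = 0$.

```python
import math

def scalar_hjb_solution(a, b, q, r):
    """Positive root p of q + 2 a p - (b^2 / r) p^2 = 0, so that J*(z) = p z^2."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)

def optimal_control(z, a, b, q, r):
    """U* = -1/2 R^{-1} G^T grad J*, specialised to the scalar case."""
    p = scalar_hjb_solution(a, b, q, r)
    grad_j = 2.0 * p * z
    return -0.5 * (1.0 / r) * b * grad_j

def hjb_residual(z, a, b, q, r):
    """Left-hand side of the HJB equation evaluated at J* = p z^2."""
    p = scalar_hjb_solution(a, b, q, r)
    grad_j = 2.0 * p * z
    return q * z * z + grad_j * a * z - 0.25 * grad_j * (b * b / r) * grad_j

p = scalar_hjb_solution(-1.0, 1.0, 1.0, 1.0)
```

For the nonlinear, partly unknown dynamics $F_i(Z_i)$ of the UMV error system no such closed form is available, which is the motivation for approximating $J_i^*$ with an NN.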

3.4. Optimal Compensation Term Design

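To obtain the approximation of the optimal cost function, an NN is adopted as
$$
J_i^*(Z_i) = \theta_{ib}^{*\top} \varphi_{ib}(Z_i) + \varepsilon_{ib}
\tag{51}
$$
Here, $\theta_{ib}^*$ represents the ideal parameter, $\varphi_{ib}(Z_i)$ denotes the basis function, and $\varepsilon_{ib}$ signifies the minimum approximation error. The gradient of the optimal cost function is given by
$$
\frac{\partial J_i^*(Z_i)}{\partial Z_i} = \nabla \varphi_{ib}^\top(Z_i)\, \theta_{ib}^* + \nabla \varepsilon_{ib}
\tag{52}
$$
where $\nabla \varphi_{ib}(Z_i)$ and $\nabla \varepsilon_{ib}$ are the gradients of $\varphi_{ib}(Z_i)$ and $\varepsilon_{ib}$, respectively. The optimal controller and the Hamiltonian function are expressed below:
$$
U_i^*(Z_i) = -\tfrac{1}{2} R_i^{-1} G_i^\top \big( \nabla \varphi_{ib}^\top(Z_i)\, \theta_{ib}^* + \nabla \varepsilon_{ib} \big)
\tag{53}
$$
$$
H_i(Z_i, \theta_{ib}^*) = Q_i(Z_i) + \theta_{ib}^{*\top} \nabla \varphi_{ib}(Z_i) F_i(Z_i) + \varepsilon_{i\mathrm{HJB}} - \tfrac{1}{4} \theta_{ib}^{*\top} \nabla \varphi_{ib}(Z_i) G_i R_i^{-1} G_i^\top \nabla \varphi_{ib}^\top(Z_i)\, \theta_{ib}^*
\tag{54}
$$
We can obtain the following equation:
$$
\varepsilon_{i\mathrm{HJB}} = (\nabla \varepsilon_{ib})^\top \big( F_i(Z_i) + G_i U_i^* \big) + \tfrac{1}{4} (\nabla \varepsilon_{ib})^\top G_i R_i^{-1} G_i^\top \nabla \varepsilon_{ib}
\tag{55}
$$
The approximation of the optimal cost function achieved by the NN is
$$
\hat{J}_i(Z_i) = \hat{\theta}_{ib}^\top \varphi_{ib}(Z_i)
\tag{56}
$$
where $\hat{\theta}_{ib}$ is the approximation of $\theta_{ib}^*$. The optimal controller estimation can be reformulated as follows:
$$
\hat{U}_i^*(Z_i) = -\tfrac{1}{2} R_i^{-1} G_i^\top \nabla \varphi_{ib}^\top(Z_i)\, \hat{\theta}_{ib}
\tag{57}
$$
The approximate Hamiltonian function can be obtained as follows:
$$
\hat{H}_i(Z_i, \hat{\theta}_{ib}) = Q_i(Z_i) + \hat{\theta}_{ib}^\top \nabla \varphi_{ib}(Z_i) \hat{F}_i(Z_i | \hat{\theta}_i) - \tfrac{1}{4} \hat{\theta}_{ib}^\top \nabla \varphi_{ib}(Z_i) G_i R_i^{-1} G_i^\top \nabla \varphi_{ib}^\top(Z_i)\, \hat{\theta}_{ib}
\tag{58}
$$
where $\hat{F}_i(Z_i | \hat{\theta}_i) = [\hat{f}_{i1}(z_{i1} | \hat{\theta}_{i1}), \hat{f}_{i2}(z_{i2} | \hat{\theta}_{i2}), \ldots, \hat{f}_{in}(z_{in} | \hat{\theta}_{in})]^\top$, $\hat{f}_{i1}(z_{i1} | \hat{\theta}_{i1}) = \hat{f}_{i1}(x_{i1} | \hat{\theta}_{i1}) - \hat{f}_{i1}(x_{i1d} | \hat{\theta}_{i1})$, and $\hat{f}_{ij}(z_{ij} | \hat{\theta}_{ij}) = \hat{f}_{ij}(x_{ij} | \hat{\theta}_{ij}) - \hat{f}_{ij}(x_{ijd} | \hat{\theta}_{ij})$, $j = 2, 3, \ldots, n$.
We choose the weight updating law of $\hat{\theta}_{ib}$ as follows:
$$
\begin{aligned}
\dot{\hat{\theta}}_{ib} = -&\Big( \nabla \varphi_{ib}(Z_i) \hat{F}_i(Z_i | \hat{\theta}_i) - \tfrac{1}{2} \nabla \varphi_{ib}(Z_i) G_i R_i^{-1} G_i^\top \nabla \varphi_{ib}^\top(Z_i)\, \hat{\theta}_{ib} \Big) \\
&\times \Big( Q_i(Z_i) + \hat{\theta}_{ib}^\top \nabla \varphi_{ib}(Z_i) \hat{F}_i(Z_i | \hat{\theta}_i) - \tfrac{1}{4} \hat{\theta}_{ib}^\top \nabla \varphi_{ib}(Z_i) G_i R_i^{-1} G_i^\top \nabla \varphi_{ib}^\top(Z_i)\, \hat{\theta}_{ib} \Big)
\end{aligned}
\tag{59}
$$
The estimation error of the optimal cost function parameter is defined as $\tilde{\theta}_{ib} = \theta_{ib}^* - \hat{\theta}_{ib}$. From this definition, we can obtain that
$$
\begin{aligned}
\hat{H}_i(Z_i, \hat{\theta}_{ib}) ={}& \tfrac{1}{2} \tilde{\theta}_{ib}^\top \nabla \varphi_{ib}(Z_i) G_i R_i^{-1} G_i^\top \nabla \varphi_{ib}^\top(Z_i)\, \theta_{ib}^* - \tilde{\theta}_{ib}^\top \nabla \varphi_{ib}(Z_i) \tilde{F}_i(Z_i) - \varepsilon_{i\mathrm{HJB}} \\
& - \tfrac{1}{4} \tilde{\theta}_{ib}^\top \nabla \varphi_{ib}(Z_i) G_i R_i^{-1} G_i^\top \nabla \varphi_{ib}^\top(Z_i)\, \tilde{\theta}_{ib}
\end{aligned}
\tag{60}
$$
The error dynamics of Equation (59) can be expressed as follows:
$$
\begin{aligned}
\dot{\tilde{\theta}}_{ib} = -&\Big( \nabla \varphi_{ib}(Z_i) \dot{Z}_i - \nabla \varphi_{ib}(Z_i) \tilde{F}_i(Z_i) + \nabla \varphi_{ib}(Z_i) G_i R_i^{-1} G_i^\top \nabla \varphi_{ib}^\top(Z_i)\, \tilde{\theta}_{ib} + \tfrac{1}{2} \nabla \varphi_{ib}(Z_i) G_i R_i^{-1} G_i^\top \nabla \varepsilon_{ib}(Z_i) \Big) \\
&\times \Big( \tilde{\theta}_{ib}^\top \nabla \varphi_{ib}(Z_i) \dot{Z}_i + \hat{\theta}_{ib}^\top \nabla \varphi_{ib}(Z_i) \tilde{F}_i(Z_i) + \tfrac{1}{4} \tilde{\theta}_{ib}^\top \nabla \varphi_{ib}(Z_i) G_i R_i^{-1} G_i^\top \nabla \varphi_{ib}^\top(Z_i)\, \tilde{\theta}_{ib} \\
&\qquad + \tfrac{1}{2} \tilde{\theta}_{ib}^\top \nabla \varphi_{ib}(Z_i) G_i R_i^{-1} G_i^\top \nabla \varepsilon_{ib}(Z_i) + \varepsilon_{i\mathrm{HJB}} \Big)
\end{aligned}
\tag{61}
$$
where $\tilde{F}_i(Z_i) = F_i(Z_i) - \hat{F}_i(Z_i | \hat{\theta}_i)$.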
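The structure of the weight updating law, the gradient of the approximate Hamiltonian multiplied by the Hamiltonian itself, can be tried out on a scalar toy problem. The sketch below is an illustration under strong assumptions: a known scalar linear system $\dot{z} = az + bu$, a single basis function $\varphi(z) = z^2$, one fixed sample state, and a hypothetical learning rate `lr`. It performs gradient descent on $\tfrac{1}{2}\hat{H}^2$, so the weight should settle where $\hat{H} = 0$.

```python
def train_critic(a=-1.0, b=1.0, q=1.0, r=1.0, z=1.0, lr=0.01, steps=2000):
    """Gradient descent on the squared Hamiltonian residual for a scalar critic."""
    theta = 0.0                                    # critic weight, J_hat = theta z^2
    for _ in range(steps):
        grad_phi = 2.0 * z                         # gradient of the basis z^2
        g = grad_phi * b * (1.0 / r) * b * grad_phi
        # Approximate Hamiltonian: Q + theta grad_phi F - 1/4 theta g theta
        h = q * z * z + theta * grad_phi * a * z - 0.25 * theta * g * theta
        # Its derivative with respect to the weight
        dh = grad_phi * a * z - 0.5 * g * theta
        theta -= lr * dh * h                       # theta_dot = -(dH/dtheta) H
    return theta

theta_hat = train_critic()
```

With $a = -1$ and $b = q = r = 1$, the weight converges to $\sqrt{2} - 1$, the positive root of $1 - 2\theta - \theta^2 = 0$, which is the exact cost coefficient for this linear-quadratic case. In the paper's setting $F_i$ is replaced by its NN estimate $\hat{F}_i$, which is where the $\tilde{F}_i$ terms in the error dynamics come from.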

3.5. Stability Analysis

Theorem 1.
For the multiple UMV system (System (1)), the weight updating law is given by (39), the adaptive controller is defined by (38), the optimal compensation term is provided by (57), and the update law for the cost function weights is specified by (59). By selecting the design parameters appropriately, the entire control scheme ensures the boundedness of all signals in the closed-loop system, and the system outputs can optimally track the reference signal.
Proof. 
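Consider the following Lyapunov function:
$$
V_i = \tfrac{1}{2} \tilde{X}_i^\top P_{i1} \tilde{X}_i + \tfrac{1}{2} \tilde{Y}_i^\top P_{i2} \tilde{Y}_i + \tfrac{1}{2} \tilde{\Psi}_i^\top P_{i3} \tilde{\Psi}_i + \tfrac{1}{2} z_{i1}^\top z_{i1} + \tfrac{1}{2} z_{i2}^\top z_{i2} + \tfrac{1}{2} \tilde{\theta}_i^\top \tilde{\theta}_i + \tfrac{1}{2} \tilde{\theta}_{ib}^\top \tilde{\theta}_{ib} + \tfrac{1}{2} \tilde{q}_i^\top \tilde{q}_i
\tag{62}
$$
We can obtain that
$$
\begin{aligned}
\dot{V}_i ={}& \tfrac{1}{2} \dot{\tilde{X}}_i^\top P_{i1} \tilde{X}_i + \tfrac{1}{2} \tilde{X}_i^\top P_{i1} \dot{\tilde{X}}_i + \tfrac{1}{2} \dot{\tilde{Y}}_i^\top P_{i2} \tilde{Y}_i + \tfrac{1}{2} \tilde{Y}_i^\top P_{i2} \dot{\tilde{Y}}_i + \tfrac{1}{2} \dot{\tilde{\Psi}}_i^\top P_{i3} \tilde{\Psi}_i + \tfrac{1}{2} \tilde{\Psi}_i^\top P_{i3} \dot{\tilde{\Psi}}_i \\
& + z_{i1}^\top \dot{z}_{i1} + z_{i2}^\top \dot{z}_{i2} + \tilde{\theta}_i^\top \dot{\tilde{\theta}}_i + \tilde{\theta}_{ib}^\top \dot{\tilde{\theta}}_{ib} + \tilde{q}_i^\top \dot{\tilde{q}}_i
\end{aligned}
\tag{63}
$$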
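Assume that $\|\nabla \varphi_{ib}(Z_i) R_i^{-1} \nabla \varphi_{ib}^\top(Z_i)\| \le \ell_{i2}$, $\|F_i(Z_i) + G_i U_i^*\| \le c_i \|Z_i\|$, $\|\nabla \varepsilon_{ib}(Z_i)\| \le \varepsilon_{ibm}$, and $\|\nabla \varphi_{ib}(Z_i)\| \le \varphi_{ibm}$, where $\ell_{i2}$, $c_i$, $\varepsilon_{ibm}$, and $\varphi_{ibm}$ are positive constants. By applying Young’s inequality and the Cauchy–Schwarz inequality to (63), we can obtain the following inequality:
$$
\begin{aligned}
\dot{V}_i \le{}& -\ell_{i1} \|Z_i\|^2 + \ell_{i2} \|Z_i\| - \ell_{i3} \|\tilde{\theta}_i\|^4 + \ell_{i4} \|\tilde{\theta}_i\|^2 - \ell_{i5} \|\tilde{\theta}_{ib}\|^4 + \ell_{i6} \|\tilde{\theta}_{ib}\|^2 \\
& - \ell_{i7} \|\tilde{q}_i\|^2 - \ell_{i8} \|\tilde{X}_i\|^2 - \ell_{i9} \|\tilde{Y}_i\|^2 - \ell_{i10} \|\tilde{\Psi}_i\|^2 + \ell_{i11}
\end{aligned}
\tag{64}
$$
where
$$
\ell_{i1} = \gamma_i - \tfrac{11}{4} c_i^4 - \tfrac{1}{2} c_i^4 \varphi_{ibm}^4
\tag{65}
$$
$$
\ell_{i2} = 2 c_i^2 \varphi_{ibm}^2
\tag{66}
$$
$$
\ell_{i3} = \tfrac{1}{10} - \tfrac{7}{2} \varphi_{im}^4
\tag{67}
$$
$$
\ell_{i4} = \tfrac{1}{4} \varphi_{im}^2 + 2 \|\theta_{ib}^*\| \varphi_{im}^4 - \tfrac{1}{2}
\tag{68}
$$
$$
\ell_{i5} = \tfrac{1}{4} \pi_{i5}^2 - \tfrac{99}{32} \varphi_{im}^4 \|G_i\|^4 \|R_i^{-1}\|^2 - \tfrac{77}{16} \varphi_{im}^4
\tag{69}
$$
$$
\ell_{i6} = \tfrac{3}{8} \varphi_{im}^2 \|G_i\|^4 \|R_i^{-1}\|^2 \varepsilon_{ibm}^2 + \tfrac{1}{8} \varphi_{im}^2 \|G_i\|^4 \|R_i^{-1}\|^2
\tag{70}
$$
$$
\ell_{i7} = K_{id} R_i(\psi_i) M_i^{-1} - \tfrac{1}{2} - 2 \|K_{id}\|^2 \varphi_{im}^2 - \tfrac{1}{2} \|A_i\|
\tag{71}
$$
$$
\ell_{i8} = \sigma_{i1}
\tag{72}
$$
$$
\ell_{i9} = \sigma_{i2}
\tag{73}
$$
$$
\ell_{i10} = \sigma_{i3}
\tag{74}
$$
$$
\ell_{i11} = \tfrac{3}{32} \|G_i\|^8 \|R_i^{-1}\|^4 \varepsilon_{ibm}^4 + \tfrac{3}{32} \|G_i\|^4 \|R_i^{-1}\|^2 \varepsilon_{ibm}^4 + \tfrac{1}{2} \varepsilon_{im}^2 + \tfrac{1}{2} d_{im}^2 + \tfrac{1}{2} \|\theta_i^*\|^2 + \tfrac{1}{2} \|\theta_i^*\|^4
\tag{75}
$$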
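Assume that one of the following inequalities holds:
$$
\|Z_i\| > \frac{\ell_{i2} + \sqrt{\ell_{i2}^2 + 4 \ell_{i1} \ell_{i11}}}{2 \ell_{i1}}
\tag{76}
$$
or
$$
\|\tilde{\theta}_i\| > \sqrt{\frac{\ell_{i4} + \sqrt{\ell_{i4}^2 + 4 \ell_{i3} \ell_{i11}}}{2 \ell_{i3}}}
\tag{77}
$$
or
$$
\|\tilde{\theta}_{ib}\| > \sqrt{\frac{\ell_{i6} + \sqrt{\ell_{i6}^2 + 4 \ell_{i5} \ell_{i11}}}{2 \ell_{i5}}}
\tag{78}
$$
or
$$
\|\tilde{X}_i\| > \sqrt{\frac{\ell_{i11}}{\ell_{i8}}}
\tag{79}
$$
or
$$
\|\tilde{Y}_i\| > \sqrt{\frac{\ell_{i11}}{\ell_{i9}}}
\tag{80}
$$
or
$$
\|\tilde{\Psi}_i\| > \sqrt{\frac{\ell_{i11}}{\ell_{i10}}}
\tag{81}
$$
or
$$
\|\tilde{q}_i\| > \sqrt{\frac{\ell_{i11}}{\ell_{i7}}}
\tag{82}
$$
Thus, according to the standard Lyapunov extension [31], we can obtain $\dot{V}_i < 0$, and all signals within the closed-loop system remain bounded. □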
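The thresholds used at the end of the proof come from elementary inequalities in one variable. As a worked sketch with generic positive constants $a$, $b$, $c$ (placeholders introduced here for illustration, not symbols from the paper) and a nonnegative scalar $x$ standing for one of the error norms:

```latex
\begin{aligned}
-a x^2 + b x + c < 0
  &\iff x > \frac{b + \sqrt{b^2 + 4ac}}{2a}, \\
-a x^4 + b x^2 + c < 0
  &\iff x^2 > \frac{b + \sqrt{b^2 + 4ac}}{2a}, \\
-a x^2 + c < 0
  &\iff x > \sqrt{\frac{c}{a}} .
\end{aligned}
```

Whenever one error norm exceeds its threshold, the corresponding negative term dominates the constant residual, which is the usual mechanism behind uniform ultimate boundedness arguments.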

4. Simulations and Comparison Study

4.1. Simulation Experiment

The simulation is carried out in MATLAB R2018a. The parameters of CyberShip II are adopted in the simulation; the details of this simulation model can be found in [33]. Three UMVs are adopted in the simulation experiment, which are named UMV$_1$, UMV$_2$, and UMV$_3$, respectively. $R_i$ is designed as $0.01 \times I_{3\times 3}$. The initial positions of the UMVs are $x_1(0) = 0$, $y_1(0) = 0.7$, $\psi_1 = 10$, $x_2(0) = 0.3$, $y_2(0) = 0.5$, $\psi_2 = 20$, $x_3(0) = 0.3$, $y_3(0) = 0.5$, $\psi_3 = 10$, and $x_4(0) = 0.4$, $y_4(0) = 0$, $\psi_4 = 20$, respectively. The parameters of the relative positions are given as follows: $p_{1x} = 2/2$, $p_{1y} = 0$, $p_{2x} = 1/2$, $p_{2y} = 1/2$, $p_{3x} = 1/2$, $p_{3y} = 1/2$, $p_{4x} = 1/5$, and $p_{4y} = 0$. The observer gains are $k_{i11} = k_{i12} = k_{i21} = k_{i22} = k_{i31} = k_{i32} = 50$, and $k_{id} = 10$. The applied disturbances are chosen as follows: $d_{i1} = 20\,(0.5\cos(0.25t) + 0.5\sin(0.15t))$, $d_{i2} = 20\,(0.5\cos(0.15t) - 0.5\sin(0.25t))$, and $d_{i3} = 2\,(0.5\cos(0.25t) + 0.5\sin(0.15t))$. The desired tracking signal of the virtual leader is given as $\eta_d = [10\sin(0.02t),\ 10(1 - \cos(0.02t)),\ 0.02t]^\top$.
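The disturbance and reference signals above are easy to reproduce outside MATLAB. The sketch below is a plain Python transcription for illustration; the function names are hypothetical, and the minus sign in the second disturbance channel is an assumption about a sign that is not legible in the extracted text.

```python
import math

def disturbance(t):
    """Disturbance vector [d_i1, d_i2, d_i3] at time t."""
    d1 = 20.0 * (0.5 * math.cos(0.25 * t) + 0.5 * math.sin(0.15 * t))
    d2 = 20.0 * (0.5 * math.cos(0.15 * t) - 0.5 * math.sin(0.25 * t))  # sign assumed
    d3 = 2.0 * (0.5 * math.cos(0.25 * t) + 0.5 * math.sin(0.15 * t))
    return [d1, d2, d3]

def leader_reference(t):
    """Virtual leader pose eta_d(t) = [x_d, y_d, psi_d]."""
    return [10.0 * math.sin(0.02 * t),
            10.0 * (1.0 - math.cos(0.02 * t)),
            0.02 * t]

d0 = disturbance(0.0)
eta0 = leader_reference(0.0)
```

The leader reference traces a circle of radius 10 centered at $(0, 10)$ with a heading that grows at 0.02 rad/s, and the three disturbance channels stay within $\pm 20$, $\pm 20$, and $\pm 2$, respectively.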
The simulation results are presented in Figures 2–14. The tracking trajectories of the UMVs are shown in Figure 2, where it can be seen that the tracking control objective is achieved and the UMVs maintain the desired formation. The tracking errors in the x and y directions are shown in Figure 3 and Figure 4, respectively. Initially, the errors are large due to the distance between the UMVs’ initial positions and the desired positions. However, once the UMVs approach the desired positions, the tracking errors tend to zero. Figure 5, Figure 6 and Figure 7 present the observer errors of the NSO. Since the initial value of the NSO differs from the actual value, the initial error is large. However, the designed updating law helps the NSO’s estimates converge to the true velocity values. The disturbance observer (DO) errors are shown in Figure 8, Figure 9 and Figure 10. As mentioned before, the applied disturbances $d_{i1}$, $d_{i2}$, and $d_{i3}$ oscillate within the ranges (−20, 20), (−20, 20), and (−2, 2), respectively. From Figure 8, Figure 9 and Figure 10, it can be observed that the DO errors stabilize within the ranges (−2, 2), (−4, 4), and (−0.5, 0.5), respectively, demonstrating that the designed DO can effectively reduce the impact of disturbances. The controller outputs are shown in Figure 11, Figure 12 and Figure 13, while the designed optimal index is shown in Figure 14. It can be seen that the proposed controller minimizes the optimal index.
In conclusion, the simulation results demonstrate that the proposed optimal algorithm successfully handles the control task, and the NSO and DO can estimate the unmeasured velocities and unknown disturbances.

4.2. Comparison Study

To further demonstrate the disturbance rejection capability of the proposed algorithm, a comparative study is presented in this part. The controller adopted for comparison is a fuzzy adaptive optimal controller without a disturbance rejection technique [34]. The experiment conditions, initial conditions, and applied disturbances are the same as those given in Section 4.1.
The simulation results are presented in Figures 15–21. As shown in Figure 15, the controller proposed in [34] achieves the tracking control objective. However, in Figure 16 and Figure 17, it is evident that the tracking errors are larger than those in Figure 3 and Figure 4. From Figure 18, Figure 19 and Figure 20, it can be observed that the controller output is larger than that in Figure 11, Figure 12 and Figure 13. This suggests that, under the applied disturbances, the lack of a disturbance rejection technique leads to larger tracking errors, which in turn require a greater controller output to reduce the tracking errors and achieve the control objective. This increased output results in a higher optimal index, as seen in Figure 21. Comparing Figure 21 with Figure 14, it becomes clear that the proposed controller achieves a lower optimal index. Therefore, the simulation results in Figures 15–21 show that, under the disturbances $d_{i1}$, $d_{i2}$, and $d_{i3}$ given in Section 4.1, the proposed control method with the DO achieves better control performance than the existing optimal control method without a disturbance rejection technique [34].

5. Conclusions

This paper has presented an observer-based adaptive optimal controller for tracking control problems of multiple UMVs with uncertain dynamics and unmeasured velocities. NSOs have been developed to handle the unmeasured velocities and the uncertain dynamics of the UMVs. An adaptive controller with an optimal compensation term has been implemented to achieve the control objective optimally. Additionally, a DO has been proposed to address unknown external disturbances. It has been proven that the proposed controller ensures that all signals in the closed-loop system remain bounded. Simulation results have been provided to demonstrate the effectiveness of the proposed algorithm. From these results, it can be concluded that the proposed control algorithm has achieved the control objectives and minimized the optimal index, and that the NSO and DO have effectively estimated the unmeasured velocities and unknown disturbances. To further illustrate the disturbance rejection capability of the proposed algorithm, a comparison study has been conducted using an existing method [34]. The results of this comparison have shown that, under the applied external disturbances, the proposed control algorithm achieves better control performance by adopting the disturbance rejection technique.

Author Contributions

Methodology, L.-E.Y. and T.L.; software, L.-E.Y.; resources, Y.X. and T.L.; writing—original draft, L.-E.Y.; writing—review & editing, Y.X. and D.Z.; supervision, Y.X., T.L. and D.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grants 51939001, 52301418, 61751202, and 61976033, and in part by the Liaoning Revitalization Talents Program under Grant XLYC1908018. This work was also supported by the China Scholarship Council (Fund No. 202206570022).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Peng, Z.; Wang, J.; Wang, D.; Han, Q.L. An Overview of Recent Advances in Coordinated Control of Multiple Autonomous Surface Vehicles. IEEE Trans. Ind. Inform. 2021, 17, 732–745.
  2. Fossen, T.I. Marine Control Systems—Guidance, Navigation, and Control of Ships, Rigs and Underwater Vehicles; Marine Cybernetics: Trondheim, Norway, 2002.
  3. Peng, Z.; Liu, L.; Wang, J. Output-Feedback Flocking Control of Multiple Autonomous Surface Vehicles Based on Data-Driven Adaptive Extended State Observers. IEEE Trans. Cybern. 2021, 51, 4611–4622.
  4. Peng, B.; Gu, N.; Wang, D.; Peng, Z. Model-Free Adaptive Disturbance Rejection Control of An RSV With Hardware-in-The-Loop Experiments. IEEE Trans. Ind. Electron. 2023, 70, 7507–7510.
  5. Wu, W.; Peng, Z.; Wang, D.; Liu, L.; Han, Q.L. Network-Based Line-of-Sight Path Tracking of Underactuated Unmanned Surface Vehicles With Experiment Results. IEEE Trans. Cybern. 2022, 52, 10937–10947.
  6. Peng, Z.; Wang, D.; Chen, Z.; Hu, X.; Lan, W. Adaptive Dynamic Surface Control for Formations of Autonomous Surface Vehicles With Uncertain Dynamics. IEEE Trans. Control Syst. Technol. 2013, 21, 513–520.
  7. Gao, S.; Peng, Z.; Liu, L.; Wang, D.; Han, Q.L. Fixed-Time Resilient Edge-Triggered Estimation and Control of Surface Vehicles for Cooperative Target Tracking Under Attacks. IEEE Trans. Intell. Veh. 2023, 8, 547–556.
  8. Peng, Z.; Wang, D.; Li, T.; Han, M. Output-Feedback Cooperative Formation Maneuvering of Autonomous Surface Vehicles With Connectivity Preservation and Collision Avoidance. IEEE Trans. Cybern. 2020, 50, 2527–2535.
  9. Wang, F.; Zhang, H.; Liu, D. Adaptive Dynamic Programming: An Introduction. IEEE Comput. Intell. Mag. 2009, 4, 39–47.
  10. Zhang, G.; Zhu, Q. Event-triggered optimal control for nonlinear stochastic systems via adaptive dynamic programming. Nonlinear Dyn. 2021, 105, 387–401.
  11. Yuan, L.; Li, T.; Tong, S.; Xiao, Y.; Shan, Q. Broad Learning System Approximation-Based Adaptive Optimal Control for Unknown Discrete-Time Nonlinear Systems. IEEE Trans. Syst. Man Cybern. Syst. 2022, 52, 5028–5038.
  12. Huang, Z.; Bai, W.; Li, T.; Long, Y.; Chen, C.P.; Liang, H.; Yang, H. Adaptive reinforcement learning optimal tracking control for strict-feedback nonlinear systems with prescribed performance. Inf. Sci. 2023, 621, 407–423.
  13. Werbos, P.J. Approximate dynamic programming for realtime control and neural modelling. In Handbook of Intelligent Control: Neural, Fuzzy and Adaptive Approaches; Van Nostrand Reinhold: New York, NY, USA, 1992; pp. 493–525.
  14. Werbos, P.J. Consistency of HDP applied to a simple reinforcement learning problem. Neural Netw. 1990, 3, 179–189.
  15. Jiang, Z.; Jiang, Y. Robust adaptive dynamic programming for linear and nonlinear systems: An overview. Eur. J. Control 2013, 19, 417–425.
  16. Gao, X.; Long, Y.; Li, T.; Hu, X.; Chen, C.L.P.; Sun, F. Optimal Fuzzy Output Feedback Control for Dynamic Positioning of Vessels With Finite-Time Disturbance Rejection Under Thruster Saturations. IEEE Trans. Fuzzy Syst. 2023, 31, 3447–3458.
  17. Wang, N.; Gao, Y.; Zhang, X. Data-driven performance-prescribed reinforcement learning control of an unmanned surface vehicle. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 5456–5467.
  18. Wang, N.; Gao, Y.; Zhao, H.; Ahn, C.K. Reinforcement Learning-Based Optimal Tracking Control of an Unknown Unmanned Surface Vehicle. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 3034–3045.
  19. Bellman, R.E. Dynamic Programming; Princeton Univ. Press: Princeton, NJ, USA, 1957.
  20. Peng, Z.; Wang, J.; Wang, D. Distributed Containment Maneuvering of Multiple Marine Vessels via Neurodynamics-Based Output Feedback. IEEE Trans. Ind. Electron. 2017, 64, 3831–3839.
  21. Jiang, Y.; Peng, Z.; Wang, D.; Yin, Y.; Han, Q.L. Cooperative Target Enclosing of Ring-Networked Underactuated Autonomous Surface Vehicles Based on Data-Driven Fuzzy Predictors and Extended State Observers. IEEE Trans. Fuzzy Syst. 2022, 30, 2515–2528.
  22. Deng, Y.; Zhang, X. Event-Triggered Composite Adaptive Fuzzy Output-Feedback Control for Path Following of Autonomous Surface Vessels. IEEE Trans. Fuzzy Syst. 2021, 29, 2701–2713.
  23. Chen, W.H.; Yang, J.; Guo, L.; Li, S. Disturbance-Observer-Based Control and Related Method: An Overview. IEEE Trans. Ind. Electron. 2016, 63, 1083–1095.
  24. Hu, X.; Wei, X.; Kao, Y.; Han, J. Robust Synchronization for Under-Actuated Vessels Based on Disturbance Observer. IEEE Trans. Intell. Transp. Syst. 2022, 23, 5470–5479.
  25. Do, K. Practical control of underactuated ships. Ocean Eng. 2010, 37, 1111–1119.
  26. von Ellenrieder, K.D. Dynamic surface control of trajectory tracking marine vehicles with actuator magnitude and rate limits. Automatica 2019, 105, 433–442.
  27. Li, Z.; Sun, J.; Oh, S. Design, analysis and experimental validation of a robust nonlinear path following controller for marine surface vessels. Automatica 2009, 45, 1649–1658.
  28. Gao, X.; Li, T.; Yuan, L.; Bai, W. Robust Fuzzy Adaptive Output Feedback Optimal Tracking Control for Dynamic Positioning of Marine Vessels with Unknown Disturbances and Uncertain Dynamics. Int. J. Fuzzy Syst. 2021.
  29. Gao, X.; Bai, W.; Li, T.; Yuan, L.; Long, Y. Broad learning system-based adaptive optimal control design for dynamic positioning of marine vessels. Nonlinear Dyn. 2021, 105, 1593–1609.
  30. Wondergem, M.; Lefeber, E.; Pettersen, K.Y.; Nijmeijer, H. Output Feedback Tracking of Ships. IEEE Trans. Control Syst. Technol. 2011, 19, 442–448.
  31. Sarangapani, J. Neural Network Control of Nonlinear Discrete-Time Systems; CRC Press: Boca Raton, FL, USA, 2018.
  32. Vrabie, D.; Lewis, F. Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems. Neural Netw. 2009, 22, 237–246.
  33. Skjetne, R.; Smogeli, Ø.; Fossen, T.I. Modeling, identification, and adaptive maneuvering of Cybership II: A complete design with experiments. IFAC Proc. Vol. 2004, 37, 203–208.
  34. Sun, K.; Li, Y.; Tong, S. Fuzzy Adaptive Output Feedback Optimal Control Design for Strict-Feedback Nonlinear Systems. IEEE Trans. Syst. Man Cybern. Syst. 2017, 47, 33–44.
Figure 1. Illustration of the coordinate frames NED ($O_E X_E Y_E$) and BF ($O_B X_B Y_B$).
Figure 2. Tracking trajectory of UMVs.
Figure 3. Tracking error in the x-direction.
Figure 4. Tracking error in the y-direction.
Figure 5. Observer error of NSO $\tilde{u}_i$.
Figure 6. Observer error of NSO $\tilde{v}_i$.
Figure 7. Observer error of NSO $\tilde{r}_i$.
Figure 8. Observer error of DO $\tilde{d}_{i1}$.
Figure 9. Observer error of DO $\tilde{d}_{i2}$.
Figure 10. Observer error of DO $\tilde{d}_{i3}$.
Figure 11. Controller output $\tau_{i1}$.
Figure 12. Controller output $\tau_{i2}$.
Figure 13. Controller output $\tau_{i3}$.
Figure 14. Optimal index $r_i(Z_i(t), U_i^*(t))$.
Figure 15. Tracking trajectory of the existing control method [34].
Figure 16. Tracking error in the x-direction of the existing control method [34].
Figure 17. Tracking error in the y-direction of the existing control method [34].
Figure 18. Controller output $\tau_{i1}$ of the existing control method [34].
Figure 19. Controller output $\tau_{i2}$ of the existing control method [34].
Figure 20. Controller output $\tau_{i3}$ of the existing control method [34].
Figure 21. Optimal index $r_i(Z_i(t), U_i^*(t))$ of the existing control method [34].
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
