Output Feedback Adaptive Optimal Control of Multiple Unmanned Marine Vehicles with Unknown External Disturbance

Yuan, Liang-En; Xiao, Yang; Li, Tieshan; Zhou, Dalin

doi:10.3390/jmse12101697


by Liang-En Yuan ¹, Yang Xiao ²,*, Tieshan Li ¹,³,⁴ and Dalin Zhou ⁵

¹ Navigation College, Dalian Maritime University, Dalian 116026, China

² Department of Computer Science, The University of Alabama, Tuscaloosa, AL 35487-0290, USA

³ College of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China

⁴ Yangtze Delta Region Institute, University of Electronic Science and Technology of China, Huzhou 313001, China

⁵ School of Computing, University of Portsmouth, Portsmouth PO1 3HE, UK

* Author to whom correspondence should be addressed.

J. Mar. Sci. Eng. 2024, 12(10), 1697; https://doi.org/10.3390/jmse12101697

Submission received: 29 July 2024 / Revised: 9 September 2024 / Accepted: 15 September 2024 / Published: 25 September 2024

(This article belongs to the Special Issue Unmanned Marine Vehicles: Navigation, Control and Sensing)


Abstract:

This paper addresses the optimal output-feedback tracking control problem for multiple unmanned marine vehicles (UMVs) that are required to track a desired trajectory. To achieve the control objective in an optimal manner, adaptive dynamic programming (ADP) with an optimal compensation term is adopted. A neural velocity observer based on a neural network (NN) is designed to estimate the unmeasured system states and the unknown system dynamics. Furthermore, a disturbance observer (DO) is proposed to counteract the effect of unknown external disturbances from the sea environment. It is proved that the proposed controller guarantees that all signals in the closed-loop system are bounded. Simulation results are given to demonstrate the effectiveness of the proposed control algorithm.

Keywords:

unmanned marine vehicles (UMVs); adaptive dynamic programming (ADP); optimal tracking control; neural state observer (NSO)

1. Introduction

In recent years, unmanned marine vehicles (UMVs) have garnered increasing attention for the exploration of natural resources in oceanic spaces due to their unique advantages, such as low energy costs and advanced intelligence. Furthermore, UMVs can complete dangerous tasks without putting human lives at risk [1,2]. However, a single UMV may not suffice to execute complex tasks in some situations. Thus, the coordinated control of multiple UMVs has become a burgeoning research topic [3]. Some important theoretical results on the coordinated control of multiple vehicles have been reported [4,5,6,7,8]. In [8], a time-varying formation control problem was presented, and connectivity preservation and collision avoidance were investigated with position-heading measurements. In [5], the path-tracking control of under-actuated unmanned surface vehicles was investigated, with model uncertainties and unknown disturbances over a wireless network taken into account. These results have far-reaching implications; however, the resulting controllers may consume a large amount of energy because the aforementioned design strategies do not consider optimality.

Optimal control is a fundamental design principle that can be adopted to enhance the control performance of dynamic systems. Given that ocean transportation or deep-sea exploration frequently necessitates substantial energy provision, it becomes imperative to incorporate optimization into vessel control design to reduce energy consumption. However, the optimal control of UMVs is a difficult problem due to the inherent nonlinearities of the UMV system. The optimization problems of nonlinear systems often need to solve the Hamilton–Jacobi–Bellman (HJB) equation, which does not have a closed-form solution. To handle this difficulty, a promising adaptive optimal method was presented based on the reinforcement learning (RL) method, namely, adaptive dynamic programming (ADP) [9,10,11,12], in which an RL system was designed to approximate the HJB equation [13,14,15]. Many studies on the control problems of vehicles by using the ADP method have been reported [16,17,18]. In [16], the optimal tracking control problem for the dynamic positioning of marine vessels was investigated. The observer based on a fuzzy logic system (FLS) was given to handle the problem of unmeasured states of the vessels. A disturbance observer (DO) was given to estimate the external disturbances. In [17], a data-driven RL-based controller was introduced to address the optimal control problem of the vehicle. Using a data-driven approach, a model-free control method was formulated to achieve control optimality and prescribed tracking accuracy concurrently. In [18], an optimal control scheme was presented for RL-based optimal tracking control applied to UMVs. The unknown dead-zone input nonlinearities and unknown disturbances were considered and handled by using a neural network (NN)-based identifier. This research shows how the optimal control problem of a single UMV can be solved. However, the optimal tracking control problem of multiple UMVs cannot be handled directly using these methods.

In practical applications, accurately obtaining velocity information from shipborne sensors is often challenging due to the potential impact of noise interference and sensor signal loss on its effectiveness [19]. To handle this difficulty, some research based on the observer has been reported [3,20,21,22]. In [20], a study on the distributed containment maneuvering problem was conducted. Adopting the input–output data from each vehicle, a novel approach utilizing an echo state network-based observer was introduced to address the issue of unmeasured states. In [3], the flocking control problem of vehicles was studied. The velocity information of the vehicles was obtained by employing an extended state observer. In [22], the path-following problem of vehicles was studied. Unmeasured states were handled by giving a state observer based on FLS. From the studies mentioned above, establishing an observer to handle the unmeasured velocity information of vehicles is an effective approach.

When working in a sea environment, the control effectiveness of a UMV can be influenced by external disturbances such as wind and waves, which may lead to a failure to achieve the control target [23,24]. Thus, it is important to adopt disturbance rejection techniques to reduce the effect of unknown external disturbances. In recent years, many disturbance observer (DO)-based controllers have been reported [24,25,26]. In [24], robust leader–follower synchronization navigation for UMVs was presented, and the problem of unknown external disturbances was solved by adopting a DO. In [26], the trajectory tracking control problem of UMVs was solved using the dynamic surface control technique, and the time-varying disturbances were estimated by a proposed DO. These results are fruitful and inspiring.

Motivated by the observations described above, the formation tracking problem of multiple UMVs is studied in this paper. An ADP algorithm with an optimal compensation term is adopted to guarantee the control objective optimally. An adaptive controller is designed using the backstepping technique; the optimal tracking control problem is transformed into an equivalent optimal regulation problem. Subsequently, an optimal compensation term is designed by using the policy iteration method. The overall optimal control input is the adaptive controller plus the optimal compensation term. To handle the unknown time-varying external disturbances of the sea environment, a DO is designed. It is proved that the proposed controller can guarantee that all signals in the closed-loop system are bounded. Simulation results are given to demonstrate the effectiveness of the proposed control algorithm.

The main contributions of this work are summarized as follows.

(1) Unlike [4,5,6,7,8], which investigate the tracking control problem of multiple UMVs without considering optimality, this paper takes optimality into account when designing the consensus controller. As a result, the proposed control method can achieve the tracking control objective with less energy consumption.

(2) Compared with the works in [17,18], which only investigated the optimal tracking control problem of a single UMV, this paper investigates the optimal tracking control problem of multiple UMVs. Therefore, the task addressed here is more challenging than those in the existing studies.

The rest of the paper is organized as follows: Section 2 provides the problem formulation, Section 3.1 presents the neural observer, Section 3.2 presents the DO, Section 3.3 and Section 3.4 introduce the adaptive controller and the optimal compensation term together with the stability analysis, Section 4 provides the simulation results, and Section 5 presents concluding remarks.

2. Problem Formulation

Two reference frames are adopted: a body-fixed reference frame (BF) and a north-east-down reference frame (NED), which can be found in Figure 1. These two reference frames are generally adopted in ship motion control, and readers can find more details in [27,28,29].

Considering the leader–follower formation control problem of $m$ UMVs, the underactuated 3-degrees-of-freedom (3-DOF) system dynamics with uncertain dynamics can be described as follows [28,29,30]:

$$\begin{aligned} \dot{η}_{i} &= R(ψ_{i}) ν_{i} \\ M_{i} \dot{ν}_{i} &= -C_{i}(ν_{i}) ν_{i} - D_{i}(ν_{i}) ν_{i} + Δ(η_{i}, ν_{i}) + d_{i} + τ_{i}, \quad i = 1, 2, \dots, m \end{aligned} \tag{1}$$

where $η_{i} = [x_{i}, y_{i}, ψ_{i}]^{⊤}$, $(x_{i}, y_{i})$ is the position of the UMV in the earth-fixed frame, $ψ_{i}$ is the heading angle in the earth-fixed frame, $ν_{i} = [u_{i}, v_{i}, r_{i}]^{⊤}$ denotes the velocities of the UMV in the body-fixed frame, and $τ_{i} = [τ_{i1}, τ_{i2}, τ_{i3}]^{⊤}$ denotes the control input. $Δ(η_{i}, ν_{i}) \in \mathbb{R}^{3 \times 1}$ denotes the uncertain dynamics, and $d_{i}$ is the unknown disturbance from the sea environment. $R(ψ_{i}) \in \mathbb{R}^{3 \times 3}$ is the rotation matrix that maps body-fixed velocities to the earth-fixed frame, which is given as follows:

$$R(ψ_{i}) = \begin{bmatrix} \cos(ψ_{i}) & -\sin(ψ_{i}) & 0 \\ \sin(ψ_{i}) & \cos(ψ_{i}) & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{2}$$

$M_{i} \in \mathbb{R}^{3 \times 3}$ is the inertia matrix including hydrodynamic added inertia, $D_{i}(ν_{i}) \in \mathbb{R}^{3 \times 3}$ is the damping matrix, and $C_{i}(ν_{i}) \in \mathbb{R}^{3 \times 3}$ is the matrix of centripetal and Coriolis terms. These three terms are given as follows:

$$\begin{aligned} M_{i} &= \begin{bmatrix} m_{i11} & 0 & 0 \\ 0 & m_{i22} & m_{i23} \\ 0 & m_{i32} & m_{i33} \end{bmatrix} \\ C_{i}(ν_{i}) &= \begin{bmatrix} 0 & 0 & c_{13i}(v_{i}, r_{i}) \\ 0 & 0 & m_{i11} u_{i} \\ -c_{13i}(v_{i}, r_{i}) & -m_{i11} u_{i} & 0 \end{bmatrix} \\ D_{i}(ν_{i}) &= \begin{bmatrix} d_{i11} & 0 & 0 \\ 0 & d_{i22} & d_{i23} \\ 0 & d_{i32} & d_{i33} \end{bmatrix} \end{aligned} \tag{3}$$

where $m_{i11} = m_{i} - X_{i\dot{u}}$, $m_{i22} = m_{i} - Y_{i\dot{v}}$, $m_{i32} = m_{i} x_{iG} - N_{i\dot{v}}$, $m_{i33} = I_{iz} - N_{i\dot{r}}$, and $c_{13i}(v_{i}, r_{i}) = -m_{i22} v_{i} - m_{i23} r_{i}$. $m_{i}$ denotes the mass of the vessel, $x_{iG}$ denotes the distance between the center of gravity of the vessel and the origin of the body-fixed frame, and $I_{iz}$ is the moment of inertia. The damping entries are $d_{i11} = -X_{iu} - X_{i|u|u} |u_{i}|$, $d_{i22} = -Y_{iv} - Y_{i|v|v} |v_{i}| - Y_{i|r|v} |r_{i}|$, $d_{i23} = -Y_{ir} - Y_{i|v|r} |v_{i}| - Y_{i|r|r} |r_{i}|$, $d_{i32} = -N_{iv} - N_{i|v|v} |v_{i}| - N_{i|r|v} |r_{i}|$, and $d_{i33} = -N_{ir} - N_{i|v|r} |v_{i}| - N_{i|r|r} |r_{i}|$, where $X_{•}$, $Y_{•}$, and $N_{•}$ are the corresponding hydrodynamic derivatives. Consider a virtual leader moving along a desired trajectory $η_{0d} = [x_{0d}, y_{0d}, ψ_{0d}]^{⊤}$.
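The matrices in (2) and (3) can be assembled numerically as in the following minimal sketch. The numerical inertia values and the function names `rotation`, `inertia`, and `coriolis` are hypothetical and serve only to illustrate the structure; they are not parameters reported in this paper.

```python
import numpy as np

def rotation(psi):
    """Rotation matrix R(psi) of Eq. (2)."""
    c, s = np.cos(psi), np.sin(psi)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def inertia(m11, m22, m23, m32, m33):
    """Inertia matrix M_i of Eq. (3)."""
    return np.array([[m11, 0.0, 0.0],
                     [0.0, m22, m23],
                     [0.0, m32, m33]])

def coriolis(nu, m11, m22, m23):
    """Coriolis/centripetal matrix C_i(nu_i) of Eq. (3)."""
    u, v, r = nu
    c13 = -m22 * v - m23 * r
    return np.array([[0.0, 0.0, c13],
                     [0.0, 0.0, m11 * u],
                     [-c13, -m11 * u, 0.0]])

# Hypothetical inertia values, chosen only for illustration.
m11, m22, m23, m32, m33 = 25.8, 33.8, 1.09, 1.09, 2.76
nu = np.array([0.5, 0.1, 0.2])   # [u, v, r] in the body-fixed frame
psi = np.pi / 6                  # heading angle

R = rotation(psi)
M = inertia(m11, m22, m23, m32, m33)
C = coriolis(nu, m11, m22, m23)
eta_dot = R @ nu                 # kinematics, first line of Eq. (1)
```

Two structural properties are easy to confirm from this sketch: $R(ψ_{i})$ is orthogonal, so the rotation preserves the speed of the vehicle, and $C_{i}(ν_{i})$ is skew-symmetric.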

For simplicity, System (1) can be rewritten in the following form:

$$\begin{aligned} \dot{η}_{i} &= υ_{i} \\ \dot{υ}_{i} &= f_{i}(η_{i}, υ_{i}) + R_{i}(ψ_{i}) M_{i}^{-1} (τ_{i} + d_{i}), \quad i = 1, 2, \dots, m \end{aligned} \tag{4}$$

where

$$υ_{i} = R_{i}(ψ_{i}) ν_{i} \tag{5}$$

and

$$f_{i}(η_{i}, υ_{i}) = R_{i}(ψ_{i}) M_{i}^{-1} \left( -C_{i}(ν_{i}) ν_{i} - D_{i}(ν_{i}) ν_{i} + Δ(η_{i}, ν_{i}) \right) + \dot{R}_{i}(ψ_{i}) ν_{i} \tag{6}$$

Here, $R_{i}(ψ_{i})$ denotes the rotation matrix $R(ψ_{i})$ in (2). $f_{i}(η_{i}, υ_{i})$ is an unknown function since $Δ(η_{i}, ν_{i})$ is unknown. To handle this problem, an NN is adopted to approximate it.
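The step from (1) to (4) can be made explicit by differentiating (5) with the product rule. The derivation below is a sketch under the model (1); the constant matrix $S$ is introduced here only for illustration, and $\dot{ψ}_{i} = r_{i}$ follows from the third row of the kinematics in (1). Note that the input gain appears in the order $R_{i}(ψ_{i}) M_{i}^{-1}$, which is the product used in the disturbance observer equations (13) and (14).

```latex
\begin{aligned}
\dot{\upsilon}_{i} &= \frac{d}{dt}\big(R_{i}(\psi_{i})\nu_{i}\big)
  = \dot{R}_{i}(\psi_{i})\nu_{i} + R_{i}(\psi_{i})\dot{\nu}_{i} \\
 &= \underbrace{\dot{R}_{i}(\psi_{i})\nu_{i}
    + R_{i}(\psi_{i}) M_{i}^{-1}\big(-C_{i}(\nu_{i})\nu_{i} - D_{i}(\nu_{i})\nu_{i}
    + \Delta(\eta_{i},\nu_{i})\big)}_{f_{i}(\eta_{i},\upsilon_{i})}
    + R_{i}(\psi_{i}) M_{i}^{-1}(\tau_{i} + d_{i}), \\
\dot{R}_{i}(\psi_{i}) &= r_{i}\, R_{i}(\psi_{i})\, S, \qquad
S = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.
\end{aligned}
```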

Control Problem Statement: This study aims to design an output-feedback control algorithm that solves the optimal tracking control problem of multiple UMVs with unknown external disturbances and uncertain dynamics, while ensuring that all signals in the closed-loop system are bounded.

3. Main Results

3.1. Neural Observer Design

A neural state observer (NSO) is adopted to handle the unmeasured velocities and unknown system dynamics. From (4), we can obtain the following equations:

$$\begin{aligned} \dot{X}_{i} &= A_{i} X_{i} + B_{i1} f_{i}(η_{i}, υ_{i}) + D_{i} B_{i1} (τ_{i} + d_{i}) \\ \dot{Y}_{i} &= A_{i} Y_{i} + B_{i2} f_{i}(η_{i}, υ_{i}) + D_{i} B_{i2} (τ_{i} + d_{i}) \\ \dot{Ψ}_{i} &= A_{i} Ψ_{i} + B_{i3} f_{i}(η_{i}, υ_{i}) + D_{i} B_{i3} (τ_{i} + d_{i}) \end{aligned} \tag{7}$$

where $X_{i} = [x_{i}, u_{i}]^{⊤}$, $Y_{i} = [y_{i}, v_{i}]^{⊤}$, $Ψ_{i} = [ψ_{i}, r_{i}]^{⊤}$, $A_{i} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$, $B_{i1} = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}$, $B_{i2} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$, $B_{i3} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$, and $D_{i} = R_{i}(ψ_{i}) M_{i}^{-1}$. Since $f_{i}(η_{i}, υ_{i})$ is unknown, an NN is adopted to approximate it, which can be described as

$$f_{i}(η_{i}, υ_{i}) = θ_{i}^{*} φ_{i}(\hat{η}_{i}, \hat{υ}_{i}) + ε_{i} \tag{8}$$

where $ε_{i}$ denotes the minimum approximation error, i.e., $\| ε_{i} \| \leq ε_{im}$, where $ε_{im} \in \mathbb{R}^{3 \times 1}$ is a constant vector, and $\| φ_{i} \| \leq φ_{im}$. We can obtain the approximation of $f_{i}(η_{i}, υ_{i})$ as

$$\hat{f}_{i}(\hat{η}_{i}, \hat{υ}_{i}) = \hat{θ}_{i} φ_{i}(\hat{η}_{i}, \hat{υ}_{i}) \tag{9}$$
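The structure of the approximator in (9) can be illustrated with a short sketch. The paper does not fix the basis here, so Gaussian radial basis functions are an assumption of this example, as are the number of nodes, the centers, the width, and the names `rbf_basis` and `nn_estimate`.

```python
import numpy as np

def rbf_basis(x, centers, width):
    """Gaussian basis vector phi(x); every entry lies in (0, 1]."""
    diff = centers - x                       # shape (N, n)
    return np.exp(-np.sum(diff**2, axis=1) / width**2)

def nn_estimate(theta_hat, x, centers, width):
    """f_hat = theta_hat @ phi(x), the structure of Eq. (9)."""
    return theta_hat @ rbf_basis(x, centers, width)

rng = np.random.default_rng(0)
n_nodes = 11                                 # hypothetical number of hidden nodes
centers = rng.uniform(-2.0, 2.0, size=(n_nodes, 6))
width = 2.0                                  # hypothetical basis width
theta_hat = np.zeros((3, n_nodes))           # weight estimate, initialised to zero

# NN input: observer estimates [eta_hat, upsilon_hat] (illustrative values).
x_hat = np.concatenate([np.array([1.0, -0.5, 0.3]),
                        np.array([0.2, 0.0, 0.05])])
phi = rbf_basis(x_hat, centers, width)
f_hat = nn_estimate(theta_hat, x_hat, centers, width)
```

With a Gaussian basis, each entry of $φ_{i}$ is at most one, so the bound $\| φ_{i} \| \leq φ_{im}$ holds with $φ_{im} = \sqrt{N}$ for $N$ nodes.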

The neural velocity observer is designed as follows:

$$\begin{aligned} \dot{\hat{X}}_{i} &= -A_{i1} \hat{X}_{i} + C_{i} X_{i} + B_{i1} \hat{θ}_{i} φ_{i}(\hat{η}_{i}, \hat{υ}_{i}) + D_{i} B_{i1} (τ_{i} + \hat{d}_{i}) \\ \dot{\hat{Y}}_{i} &= -A_{i2} \hat{Y}_{i} + C_{i} Y_{i} + B_{i2} \hat{θ}_{i} φ_{i}(\hat{η}_{i}, \hat{υ}_{i}) + D_{i} B_{i2} (τ_{i} + \hat{d}_{i}) \\ \dot{\hat{Ψ}}_{i} &= -A_{i3} \hat{Ψ}_{i} + C_{i} Ψ_{i} + B_{i3} \hat{θ}_{i} φ_{i}(\hat{η}_{i}, \hat{υ}_{i}) + D_{i} B_{i3} (τ_{i} + \hat{d}_{i}) \end{aligned} \tag{10}$$

where $\hat{X}_{i} = [\hat{x}_{i}, \hat{u}_{i}]^{⊤}$, $\hat{Y}_{i} = [\hat{y}_{i}, \hat{v}_{i}]^{⊤}$, and $\hat{Ψ}_{i} = [\hat{ψ}_{i}, \hat{r}_{i}]^{⊤}$. Furthermore, $A_{i1} = \begin{bmatrix} k_{i11} & 1 \\ k_{i12} & 0 \end{bmatrix}$, $A_{i2} = \begin{bmatrix} k_{i21} & 1 \\ k_{i22} & 0 \end{bmatrix}$, $A_{i3} = \begin{bmatrix} k_{i31} & 1 \\ k_{i32} & 0 \end{bmatrix}$, and $C_{i} = [1, 0]^{⊤}$. The matrices $A_{i1}$, $A_{i2}$, and $A_{i3}$ are made Hurwitz by choosing suitable parameters $k_{i11}$, $k_{i12}$, $k_{i21}$, $k_{i22}$, $k_{i31}$, and $k_{i32}$. $P_{i1}$, $P_{i2}$, $P_{i3}$, $Q_{i1}$, $Q_{i2}$, and $Q_{i3}$ are positive definite matrices that satisfy $A_{i1}^{⊤} P_{i1} + P_{i1}^{⊤} A_{i1} = -Q_{i1}$, $A_{i2}^{⊤} P_{i2} + P_{i2}^{⊤} A_{i2} = -Q_{i2}$, and $A_{i3}^{⊤} P_{i3} + P_{i3}^{⊤} A_{i3} = -Q_{i3}$, where $Q_{i1} = Q_{i1}^{⊤} > 0$, $Q_{i2} = Q_{i2}^{⊤} > 0$, and $Q_{i3} = Q_{i3}^{⊤} > 0$.
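As an illustration of how the gains and the Lyapunov matrices might be obtained in practice, the sketch below picks one gain pair, checks that the resulting matrix is Hurwitz, and solves the Lyapunov equation for a symmetric $P_{i1}$. The gain values, the choice $Q_{i1} = I$, and the helper `solve_lyapunov` are assumptions of this example rather than values used in the paper.

```python
import numpy as np

def solve_lyapunov(A, Q):
    """Solve A.T @ P + P @ A = -Q for P via the Kronecker (vectorised) form."""
    n = A.shape[0]
    I = np.eye(n)
    K = np.kron(I, A.T) + np.kron(A.T, I)
    P = np.linalg.solve(K, -Q.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)                   # symmetrise against round-off

# Hypothetical gains: the characteristic polynomial of A1 is
# s^2 - k11*s - k12, so k11 < 0 and k12 < 0 give a Hurwitz matrix.
k11, k12 = -4.0, -4.0
A1 = np.array([[k11, 1.0],
               [k12, 0.0]])
Q1 = np.eye(2)

P1 = solve_lyapunov(A1, Q1)
eigs_A = np.linalg.eigvals(A1)               # both should have negative real part
eigs_P = np.linalg.eigvalsh(P1)              # both should be positive
```

The quantities $λ_{\min}(Q_{i1})$ and $λ_{\max}(P_{i1})$ that appear later in the stability bounds can be read off from `Q1` and `eigs_P`.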

$\hat{d}_{i}$ is the disturbance estimate provided by the DO. Let $\tilde{d}_{i} = d_{i} - \hat{d}_{i}$ be the disturbance estimation error. Define the NSO estimation errors as $\tilde{X}_{i} = X_{i} - \hat{X}_{i}$, $\tilde{Y}_{i} = Y_{i} - \hat{Y}_{i}$, and $\tilde{Ψ}_{i} = Ψ_{i} - \hat{Ψ}_{i}$, so that $\dot{\tilde{X}}_{i} = \dot{X}_{i} - \dot{\hat{X}}_{i}$, $\dot{\tilde{Y}}_{i} = \dot{Y}_{i} - \dot{\hat{Y}}_{i}$, and $\dot{\tilde{Ψ}}_{i} = \dot{Ψ}_{i} - \dot{\hat{Ψ}}_{i}$. From (7) and (10), we can obtain the error dynamics of the NSO as follows:

$$\begin{aligned} \dot{\tilde{X}}_{i} &= A_{i1} \tilde{X}_{i} + B_{i1} \tilde{θ}_{i} φ_{i}(\hat{η}_{i}, \hat{υ}_{i}) + D_{i} B_{i1} \tilde{d}_{i} + B_{i1} ε_{i} \\ \dot{\tilde{Y}}_{i} &= A_{i2} \tilde{Y}_{i} + B_{i2} \tilde{θ}_{i} φ_{i}(\hat{η}_{i}, \hat{υ}_{i}) + D_{i} B_{i2} \tilde{d}_{i} + B_{i2} ε_{i} \\ \dot{\tilde{Ψ}}_{i} &= A_{i3} \tilde{Ψ}_{i} + B_{i3} \tilde{θ}_{i} φ_{i}(\hat{η}_{i}, \hat{υ}_{i}) + D_{i} B_{i3} \tilde{d}_{i} + B_{i3} ε_{i} \end{aligned} \tag{11}$$

3.2. Disturbance Observer Design

To implement the DO, we begin by defining the auxiliary vector $q_{i}$ for each vehicle as follows:

$$q_{i} = d_{i} - K_{id} υ_{i} \tag{12}$$

where $q_{i} = [q_{i1}, q_{i2}, q_{i3}]^{⊤}$ and $K_{id} \in \mathbb{R}^{3 \times 3}$ is a positive definite design matrix. The time derivative of $q_{i}$ is obtained as

$$\begin{aligned} \dot{q}_{i} &= \dot{d}_{i} - K_{id} \dot{υ}_{i} \\ &= \dot{d}_{i} - K_{id} \left( f_{i}(η_{i}, υ_{i}) + R_{i}(ψ_{i}) M_{i}^{-1} (τ_{i} + q_{i} + K_{id} υ_{i}) \right) \end{aligned} \tag{13}$$

Since $d_{i}$ is unknown, $q_{i}$ is also unknown. An estimate $\hat{q}_{i}$ of $q_{i}$ is obtained from the following equation:

$$\dot{\hat{q}}_{i} = -K_{id} \left( \hat{f}_{i}(\hat{η}_{i}, \hat{υ}_{i}) + R_{i}(ψ_{i}) M_{i}^{-1} (τ_{i} + \hat{q}_{i} + K_{id} \hat{υ}_{i}) \right) \tag{14}$$

Thus, we obtain $\hat{d}_{i}$ as follows:

$$\hat{d}_{i} = \hat{q}_{i} + K_{id} \hat{υ}_{i} \tag{15}$$
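The observer (14) and (15) can be exercised on a toy problem, as sketched below. To keep the example short, several simplifying assumptions are made that are not part of the paper: $f_{i} = 0$, $R_{i}(ψ_{i}) M_{i}^{-1} = I$, zero control input, a constant disturbance, a perfectly known velocity, and forward-Euler integration. The gain, step size, and the name `do_step` are likewise hypothetical.

```python
import numpy as np

def do_step(q_hat, upsilon_hat, tau, f_hat, RMinv, K_d, dt):
    """One forward-Euler step of Eq. (14); returns the updated q_hat."""
    q_hat_dot = -K_d @ (f_hat + RMinv @ (tau + q_hat + K_d @ upsilon_hat))
    return q_hat + dt * q_hat_dot

# Toy setting (assumptions): f_i = 0, R_i M_i^{-1} = I, tau_i = 0,
# constant disturbance, velocity known exactly.
K_d = 2.0 * np.eye(3)
RMinv = np.eye(3)
d_true = np.array([1.0, -0.5, 0.2])
tau = np.zeros(3)
f_hat = np.zeros(3)
dt, steps = 0.01, 500

upsilon = np.zeros(3)
q_hat = np.zeros(3)
for _ in range(steps):
    q_hat = do_step(q_hat, upsilon, tau, f_hat, RMinv, K_d, dt)
    upsilon = upsilon + dt * (RMinv @ (tau + d_true))   # plant, Eq. (4) with f_i = 0

d_hat = q_hat + K_d @ upsilon                           # Eq. (15)
```

In this idealized setting the estimation error obeys $\dot{\tilde{q}}_{i} = -K_{id} \tilde{q}_{i}$, so `d_hat` approaches `d_true` exponentially at a rate set by $K_{id}$.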

Let $\tilde{q}_{i} = q_{i} - \hat{q}_{i}$. The time derivative of $\tilde{q}_{i}$ is described as follows:

$$\begin{aligned} \dot{\tilde{q}}_{i} &= \dot{q}_{i} - \dot{\hat{q}}_{i} \\ &= \dot{d}_{i} - K_{id} \tilde{θ}_{i} φ_{i}(\hat{η}_{i}, \hat{υ}_{i}) - K_{id} R_{i}(ψ_{i}) M_{i}^{-1} \tilde{q}_{i} \end{aligned} \tag{16}$$

3.3. Adaptive Controller Design

The proposed adaptive controller is designed utilizing the backstepping technique. First, the change of coordinates is given as follows:

$$z_{i1} = η_{i} - η_{id} \tag{17}$$

$$z_{i2} = \hat{υ}_{i} - α_{i} \tag{18}$$

where $η_{id} = η_{d} + R_{i}(ψ_{i}) p_{i}$, and $η_{d} = [x_{d}, y_{d}, ψ_{d}]^{⊤}$ represents the tracking trajectory of the leader. Here, $p_{i} = [p_{ix}, p_{iy}, 0]^{⊤}$, where $p_{ix}$ and $p_{iy}$ denote the relative position between the $i$-th UMV and the leader in the $X_{E}$ and $Y_{E}$ directions, respectively. $α_{i}$ serves as the virtual control, which combines the adaptive virtual control $α_{i}^{a}$ and the optimal compensation term $α_{i}^{*}$, i.e., $α_{i} = α_{i}^{a} + α_{i}^{*}$.
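The reference $η_{id}$ and the first error variable in (17) can be computed as in the following sketch. The leader pose, the offset $p_{i}$, the measured pose, and the function name `formation_reference` are hypothetical values and names used only for illustration.

```python
import numpy as np

def rotation(psi):
    """Rotation matrix R(psi) of Eq. (2)."""
    c, s = np.cos(psi), np.sin(psi)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def formation_reference(eta_d, psi_i, p_i):
    """Desired pose of the i-th vehicle: eta_id = eta_d + R(psi_i) p_i."""
    return eta_d + rotation(psi_i) @ p_i

eta_d = np.array([10.0, 5.0, 0.0])     # leader pose [x_d, y_d, psi_d] (illustrative)
p_i = np.array([-2.0, 1.0, 0.0])       # hypothetical formation offset [p_ix, p_iy, 0]
eta_i = np.array([7.5, 6.2, 0.1])      # measured pose of the i-th vehicle (illustrative)

eta_id = formation_reference(eta_d, eta_i[2], p_i)
z_i1 = eta_i - eta_id                  # tracking error, Eq. (17)
```

Because the third entry of $p_{i}$ is zero, the offset changes only the position reference; the heading reference remains $ψ_{d}$.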

From (17), we can obtain the following:

$$\begin{aligned} \dot{z}_{i1} &= \dot{η}_{i} - \dot{η}_{id} \\ &= υ_{i} - \dot{η}_{id} \\ &= z_{i2} + α_{i}^{a} + α_{i}^{*} - \dot{η}_{id} \end{aligned} \tag{19}$$

To achieve the control objective, we consider the following Lyapunov function candidate:

$$V_{i1} = \frac{1}{2} \tilde{X}_{i}^{⊤} P_{i1} \tilde{X}_{i} + \frac{1}{2} \tilde{Y}_{i}^{⊤} P_{i2} \tilde{Y}_{i} + \frac{1}{2} \tilde{Ψ}_{i}^{⊤} P_{i3} \tilde{Ψ}_{i} + \frac{1}{2} z_{i1}^{⊤} z_{i1} \tag{20}$$

Its time derivative satisfies

$$\begin{aligned} \dot{V}_{i1} ={}& \frac{1}{2} \dot{\tilde{X}}_{i}^{⊤} P_{i1} \tilde{X}_{i} + \frac{1}{2} \tilde{X}_{i}^{⊤} P_{i1} \dot{\tilde{X}}_{i} + \frac{1}{2} \dot{\tilde{Y}}_{i}^{⊤} P_{i2} \tilde{Y}_{i} + \frac{1}{2} \tilde{Y}_{i}^{⊤} P_{i2} \dot{\tilde{Y}}_{i} \\ &+ \frac{1}{2} \dot{\tilde{Ψ}}_{i}^{⊤} P_{i3} \tilde{Ψ}_{i} + \frac{1}{2} \tilde{Ψ}_{i}^{⊤} P_{i3} \dot{\tilde{Ψ}}_{i} + z_{i1}^{⊤} \dot{z}_{i1} \\ \leq{}& -\frac{1}{2} λ_{\min}(Q_{i1}) \| \tilde{X}_{i} \|^{2} - \frac{1}{2} λ_{\min}(Q_{i2}) \| \tilde{Y}_{i} \|^{2} - \frac{1}{2} λ_{\min}(Q_{i3}) \| \tilde{Ψ}_{i} \|^{2} \\ &+ \tilde{X}_{i}^{⊤} P_{i1} \left( B_{i1} \tilde{θ}_{i}^{⊤} φ_{i}(\hat{η}_{i}, \hat{υ}_{i}) + B_{i1} ε_{i} + D_{i} B_{i1} \tilde{d}_{i} \right) \\ &+ \tilde{Y}_{i}^{⊤} P_{i2} \left( B_{i2} \tilde{θ}_{i}^{⊤} φ_{i}(\hat{η}_{i}, \hat{υ}_{i}) + B_{i2} ε_{i} + D_{i} B_{i2} \tilde{d}_{i} \right) \\ &+ \tilde{Ψ}_{i}^{⊤} P_{i3} \left( B_{i3} \tilde{θ}_{i}^{⊤} φ_{i}(\hat{η}_{i}, \hat{υ}_{i}) + B_{i3} ε_{i} + D_{i} B_{i3} \tilde{d}_{i} \right) \\ &+ z_{i1}^{⊤} \left( z_{i2} + α_{i}^{a} + α_{i}^{*} - \dot{η}_{id} \right) \end{aligned} \tag{21}$$

By using Young's inequality, we can obtain the following:

$$\tilde{X}_{i}^{⊤} P_{i1} B_{i1} \tilde{θ}_{i}^{⊤} φ_{i}(\hat{η}_{i}, \hat{υ}_{i}) \leq \frac{1}{2} λ_{\max}^{2}(P_{i1}) \| \tilde{X}_{i} \|^{2} + \frac{1}{2} \| \tilde{θ}_{i} \|^{2} \tag{22}$$

$$\tilde{Y}_{i}^{⊤} P_{i2} B_{i2} \tilde{θ}_{i}^{⊤} φ_{i}(\hat{η}_{i}, \hat{υ}_{i}) \leq \frac{1}{2} λ_{\max}^{2}(P_{i2}) \| \tilde{Y}_{i} \|^{2} + \frac{1}{2} \| \tilde{θ}_{i} \|^{2} \tag{23}$$

$$\tilde{Ψ}_{i}^{⊤} P_{i3} B_{i3} \tilde{θ}_{i}^{⊤} φ_{i}(\hat{η}_{i}, \hat{υ}_{i}) \leq \frac{1}{2} λ_{\max}^{2}(P_{i3}) \| \tilde{Ψ}_{i} \|^{2} + \frac{1}{2} \| \tilde{θ}_{i} \|^{2} \tag{24}$$

$$\tilde{X}_{i}^{⊤} P_{i1} B_{i1} ε_{i} \leq \frac{1}{2} λ_{\max}^{2}(P_{i1}) \| \tilde{X}_{i} \|^{2} + \frac{1}{2} ε_{im}^{2} \tag{25}$$

$$\tilde{Y}_{i}^{⊤} P_{i2} B_{i2} ε_{i} \leq \frac{1}{2} λ_{\max}^{2}(P_{i2}) \| \tilde{Y}_{i} \|^{2} + \frac{1}{2} ε_{im}^{2} \tag{26}$$

$$\tilde{Ψ}_{i}^{⊤} P_{i3} B_{i3} ε_{i} \leq \frac{1}{2} λ_{\max}^{2}(P_{i3}) \| \tilde{Ψ}_{i} \|^{2} + \frac{1}{2} ε_{im}^{2} \tag{27}$$

$$\tilde{X}_{i}^{⊤} P_{i1} D_{i} B_{i1} \tilde{d}_{i} \leq \frac{1}{2} λ_{\max}^{2}(P_{i1} D_{i} B_{i1}) \| \tilde{X}_{i} \|^{2} + \frac{1}{2} \| \tilde{d}_{i} \|^{2} \tag{28}$$

$$\tilde{Y}_{i}^{⊤} P_{i2} D_{i} B_{i2} \tilde{d}_{i} \leq \frac{1}{2} λ_{\max}^{2}(P_{i2} D_{i} B_{i2}) \| \tilde{Y}_{i} \|^{2} + \frac{1}{2} \| \tilde{d}_{i} \|^{2} \tag{29}$$

$$\tilde{Ψ}_{i}^{⊤} P_{i3} D_{i} B_{i3} \tilde{d}_{i} \leq \frac{1}{2} λ_{\max}^{2}(P_{i3} D_{i} B_{i3}) \| \tilde{Ψ}_{i} \|^{2} + \frac{1}{2} \| \tilde{d}_{i} \|^{2} \tag{30}$$
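Each of the bounds (22)–(30) follows the same pattern. As a worked instance, the steps behind (25) can be spelled out as below. This sketch assumes the spectral norm and a symmetric $P_{i1}$, so that $\| P_{i1} \| = λ_{\max}(P_{i1})$, and it uses $\| B_{i1} \| = 1$ together with $\| ε_{i} \| \leq ε_{im}$.

```latex
\begin{aligned}
a^{\top} b &\leq \|a\|\,\|b\| \leq \tfrac{1}{2}\|a\|^{2} + \tfrac{1}{2}\|b\|^{2}, \\
\tilde{X}_{i}^{\top} P_{i1} B_{i1} \varepsilon_{i}
  &\leq \big(\lambda_{\max}(P_{i1})\,\|\tilde{X}_{i}\|\big)\,\|\varepsilon_{i}\| \\
  &\leq \tfrac{1}{2}\lambda_{\max}^{2}(P_{i1})\,\|\tilde{X}_{i}\|^{2}
      + \tfrac{1}{2}\varepsilon_{im}^{2}.
\end{aligned}
```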

The adaptive virtual control $α_{i}^{a}$ can be designed as follows:

$$α_{i}^{a} = -r_{i1} z_{i1} + \dot{η}_{id} \tag{31}$$

where $r_{i1} = \mathrm{diag}(r_{i11}, r_{i12}, r_{i13})$ is a positive definite diagonal design matrix. Thus, we can obtain the following:

$$\begin{aligned} \dot{V}_{i1} \leq{}& -σ_{i1} \| \tilde{X}_{i} \|^{2} - σ_{i2} \| \tilde{Y}_{i} \|^{2} - σ_{i3} \| \tilde{Ψ}_{i} \|^{2} - z_{i1}^{⊤} r_{i1} z_{i1} + z_{i1}^{⊤} z_{i2} + z_{i1}^{⊤} α_{i}^{*} \\ &+ \frac{3}{2} \| \tilde{θ}_{i} \|^{2} + \frac{3}{2} \| \tilde{d}_{i} \|^{2} + \frac{3}{2} ε_{im}^{2} \end{aligned} \tag{32}$$

where, collecting the coefficients of (22)–(30), $σ_{i1} = \frac{1}{2} λ_{\min}(Q_{i1}) - λ_{\max}^{2}(P_{i1}) - \frac{1}{2} λ_{\max}^{2}(P_{i1} D_{i} B_{i1})$, $σ_{i2} = \frac{1}{2} λ_{\min}(Q_{i2}) - λ_{\max}^{2}(P_{i2}) - \frac{1}{2} λ_{\max}^{2}(P_{i2} D_{i} B_{i2})$, and $σ_{i3} = \frac{1}{2} λ_{\min}(Q_{i3}) - λ_{\max}^{2}(P_{i3}) - \frac{1}{2} λ_{\max}^{2}(P_{i3} D_{i} B_{i3})$.

We design the following Lyapunov function:

$$V_{i2} = V_{i1} + \frac{1}{2} z_{i2}^{⊤} z_{i2} + \frac{1}{2} \tilde{θ}_{i}^{⊤} \tilde{θ}_{i} + \frac{1}{2} \tilde{q}_{i}^{⊤} \tilde{q}_{i} \tag{33}$$

Utilizing the fact that $\tilde{θ}_{i} = θ_{i}^{*} - \hat{θ}_{i}$, we can obtain the following equation:

$$\begin{aligned} \dot{V}_{i2} ={}& \dot{V}_{i1} + z_{i2}^{⊤} \dot{z}_{i2} + \tilde{θ}_{i}^{⊤} \dot{\tilde{θ}}_{i} + \tilde{q}_{i}^{⊤} \dot{\tilde{q}}_{i} \\ ={}& \dot{V}_{i1} + z_{i2}^{⊤} \left( \hat{f}_{i}(\hat{η}_{i}, \hat{υ}_{i}) + R_{i}(ψ_{i}) M_{i}^{-1} (τ_{i} + d_{i}) - \dot{α}_{i} + \tilde{θ}_{i} φ_{i}(\hat{η}_{i}, \hat{υ}_{i}) \right) - \tilde{θ}_{i}^{⊤} \dot{\hat{θ}}_{i} \\ &+ \tilde{q}_{i}^{⊤} \left( \dot{d}_{i} - K_{id} \tilde{θ}_{i} φ_{i}(η_{i}, υ_{i}) - K_{id} R_{i}(ψ_{i}) M_{i}^{-1} \tilde{q}_{i} \right) \end{aligned} \tag{34}$$

By applying Young's inequality, the following inequalities can be obtained:

$$\tilde{q}_{i}^{⊤} \dot{d}_{i} \leq \frac{1}{2} \| \tilde{q}_{i} \|^{2} + \frac{1}{2} d_{im}^{2} \tag{35}$$

and

$$\tilde{q}_{i}^{⊤} K_{id} \tilde{θ}_{i} φ_{i}(η_{i}, υ_{i}) \leq \frac{1}{2} \| \tilde{q}_{i} \|^{2} + \frac{1}{2} K_{id}^{2} φ_{im}^{2} \| \tilde{θ}_{i} \|^{2} \tag{36}$$

where $d_{im}$ denotes an upper bound on $\| \dot{d}_{i} \|$.

We define $h_{i}(Z_{i2}) \triangleq f_{i}(η_{i}, υ_{i}) - f_{i}(α_{i})$, where $Z_{i2} = [η_{i}, α_{i}]^{⊤}$. With the control input decomposed as $τ_{i} = τ_{i}^{a} + τ_{i}^{*}$, one has

$$\begin{aligned} \dot{V}_{i2} \leq{}& \dot{V}_{i1} + z_{i2}^{⊤} \left( h_{i}(Z_{i2}) + \hat{θ}_{i} φ_{i}(α_{i}) + \tilde{θ}_{i} φ_{i}(α_{i}) + ε_{i} + R_{i}(ψ_{i}) M_{i}^{-1} (τ_{i}^{a} + τ_{i}^{*} + d_{i}) - \dot{α}_{i} \right) \\ &- \tilde{θ}_{i}^{⊤} \dot{\hat{θ}}_{i} - \left( K_{id} R_{i}(ψ_{i}) M_{i}^{-1} - 1 \right) \| \tilde{q}_{i} \|^{2} + \frac{1}{2} d_{im}^{2} + \frac{1}{2} K_{id}^{2} φ_{im}^{2} \| \tilde{θ}_{i} \|^{2} \end{aligned} \tag{37}$$

We can design the adaptive controller and adaptive law as follows:

$$τ_{i}^{a} = M_{i} R_{i}^{⊤}(ψ_{i}) \left( -z_{i1} - z_{i2} - r_{i2} z_{i2} - \hat{θ}_{i} φ_{i}(α_{i}) + \dot{α}_{i} \right) - \hat{q}_{i} - K_{id} υ_{i} \tag{38}$$

and

$$\dot{\hat{θ}}_{i} = z_{i2}^{⊤} φ_{i}(α_{i}) - \hat{θ}_{i} - \hat{θ}_{i} \| \hat{θ}_{i} \|^{2} \tag{39}$$

where $r_{i2} = \mathrm{diag}(r_{i21}, r_{i22}, r_{i23})$ is a positive definite diagonal design matrix. According to Young's inequality, we can obtain that

$$\tilde{θ}_{i}^{⊤} \hat{θ}_{i} \leq -\frac{1}{2} \| \tilde{θ}_{i} \|^{2} + \frac{1}{2} \| θ_{i}^{*} \|^{2} \tag{40}$$

and

$$\tilde{θ}_{i}^{⊤} \hat{θ}_{i} \| \hat{θ}_{i} \|^{2} \leq -\frac{1}{10} \| \tilde{θ}_{i} \|^{4} + \frac{1}{2} \| θ_{i}^{*} \|^{4} \tag{41}$$

From (17) and (18), the tracking error dynamics can be described as follows:

$$\dot{Z}_{i} = F_{i}(Z_{i}) + G_{i} U_{i}^{*} \tag{42}$$

where $Z_{i} = [z_{i1}^{⊤}, z_{i2}^{⊤}]^{⊤}$, $F_{i}(Z_{i}) = [0_{3 \times 3}, h_{i}(Z_{i2})]^{⊤}$, $G_{i} = \mathrm{diag}(I_{3 \times 3}, R_{i}(ψ_{i}) M_{i}^{-1})$, and $U_{i}^{*} = [α_{i}^{*⊤}, τ_{i}^{*⊤}]^{⊤}$. The cost function of the tracking error system can be described as

$$J_{i} = \int_{t_{0}}^{\infty} r_{i}(Z_{i}(t), U_{i}^{*}(t)) \, dt \tag{43}$$

where $r_{i}(Z_{i}, U_{i}^{*}) = Q_{i}(Z_{i}) + U_{i}^{*⊤} R_{i} U_{i}^{*}$ is the performance index, and $Q_{i}(Z_{i})$ is a positive definite function satisfying $\| Q_{i}(Z_{i}) \| = 0$ only if $Z_{i} = 0$ and $Γ_{i\min} \leq \| Q_{i}(Z_{i}) \| \leq Γ_{i\max}$ for $\bar{Z}_{i\min} \leq \| Z_{i} \| \leq \bar{Z}_{i\max}$, where $Γ_{i\min}$, $Γ_{i\max}$, $\bar{Z}_{i\min}$, and $\bar{Z}_{i\max}$ are positive bounds. $R_{i} = R_{i}^{⊤}$ is a positive definite weighting matrix (not to be confused with the rotation matrix $R_{i}(ψ_{i})$). An infinitesimal equivalent of (43) can be written as follows [31]:

$$\dot{J}_{i} = \nabla J_{i}^{⊤} \left( F_{i}(Z_{i}) + G_{i} U_{i}^{*} \right) = -Q_{i}(Z_{i}) - U_{i}^{*⊤} R_{i} U_{i}^{*} \tag{44}$$

We define a Hamiltonian function as

$$H_{i}(Z_{i}, U_{i}') = r_{i}(Z_{i}, U_{i}') + (\nabla J_{i}(Z_{i}))^{⊤} \left( F_{i}(Z_{i}) + G_{i} U_{i}' \right) \tag{45}$$

where $U_{i}'$ is the associated admissible control, and $\nabla J_{i}(Z_{i})$ is the gradient of $J_{i}(Z_{i})$ with respect to $Z_{i}$. The optimal controller $U_{i}^{*}(Z_{i})$ can be obtained by applying the condition $\partial H_{i}(Z_{i}, U_{i}') / \partial U_{i}' = 0$, and one has

$$U_{i}^{*}(Z_{i}) = -\frac{1}{2} R_{i}^{-1} G_{i}^{⊤} \nabla J_{i}^{*}(Z_{i}) \tag{46}$$

where $\nabla J_{i}^{*}(Z_{i})$ represents the gradient of $J_{i}^{*}(Z_{i})$ with respect to $Z_{i}$. The HJB equation can be described as follows:

$$Q_{i}(Z_{i}) + (\nabla J_{i}^{*}(Z_{i}))^{⊤} F_{i}(Z_{i}) - \frac{1}{4} (\nabla J_{i}^{*}(Z_{i}))^{⊤} G_{i} R_{i}^{-1} G_{i}^{⊤} \nabla J_{i}^{*}(Z_{i}) = 0 \tag{47}$$

with $J_{i}^{*}(0) = 0$. It can be concluded that $J_{i}$ is a continuously differentiable and radially unbounded Lyapunov function, and its derivative under the optimal controller can be described as

$$\dot{J}_{i} = \nabla J_{i}^{⊤} \dot{Z}_{i} = \nabla J_{i}^{⊤} \left( F_{i}(Z_{i}) + G_{i} U_{i}^{*} \right) \leq 0 \tag{48}$$

By choosing a suitable $Q_{i}(Z_{i})$ satisfying $\lim_{Z_{i} \to \infty} Q_{i}(Z_{i}) = \infty$ and $\nabla J_{i}^{*⊤} Q_{i}(Z_{i}) \nabla J_{i} = Q_{i}(Z_{i}) + U_{i}^{*⊤} R_{i} U_{i}^{*}$, the following equation can be obtained [32]:

$$\nabla J_{i}^{⊤} \left( F_{i}(Z_{i}) + G_{i} U_{i}^{*} \right) = -\nabla J_{i}^{⊤} Q_{i}(Z_{i}) \nabla J_{i} \tag{49}$$

From (37), (40), and (41), we can obtain that

$$\begin{aligned} \dot{V}_{i2} \leq{}& -σ_{i1} \| \tilde{X}_{i} \|^{2} - σ_{i2} \| \tilde{Y}_{i} \|^{2} - σ_{i3} \| \tilde{Ψ}_{i} \|^{2} - γ_{i} \| Z_{i} \|^{2} - \frac{1}{10} \| \tilde{θ}_{i} \|^{4} - \left( \frac{1}{2} K_{id}^{2} φ_{im}^{2} - \frac{1}{2} \right) \| \tilde{θ}_{i} \|^{2} \\ &- \left( K_{id} R_{i}(ψ_{i}) M_{i}^{-1} - \frac{1}{2} R_{i}(ψ_{i}) M_{i}^{-1} M_{i} R_{i}^{⊤}(ψ_{i}) \right) \| \tilde{q}_{i} \|^{2} \\ &+ \frac{1}{2} \| θ_{i}^{*} \|^{2} + \frac{1}{2} \| θ_{i}^{*} \|^{4} + \frac{3}{2} ε_{im}^{2} + \frac{1}{2} d_{im}^{2} + Z_{i}^{⊤} \left( F_{i}(Z_{i}) + G_{i} U_{i}^{*} \right) \end{aligned} \tag{50}$$

where $γ_{i} = [r_{i1}^{⊤}, r_{i2}^{⊤}]^{⊤}$. According to (49), it can be seen that $Z_{i}^{⊤} (F_{i}(Z_{i}) + G_{i} U_{i}^{*})$ becomes negative when the controller $U_{i}^{*}$ is designed by adopting the optimal control method to stabilize the tracking error system (42), which implies that the tracking error $Z_{i}$ remains bounded and the cost function $J_{i}$ is minimized. However, the HJB Equation (47) is difficult to solve, so the ADP method is applied in the next section to obtain $U_{i}^{*}$.
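A case in which the HJB equation can be solved by hand helps to make (46) and (47) concrete. The sketch below uses a scalar linear stand-in for (42), $\dot{Z} = aZ + gU$, with cost integrand $qZ^{2} + rU^{2}$ and the trial solution $J^{*}(Z) = pZ^{2}$. The scalar system, its parameter values, and the function names are assumptions of this example; the HJB left-hand side is written with the product $G R^{-1} G^{⊤}$, matching the gain structure of (46).

```python
import numpy as np

# Scalar stand-in for (42): Z_dot = a*Z + g*U, cost integrand q*Z^2 + r*U^2.
a, g, q, r = 1.0, 1.0, 1.0, 1.0       # hypothetical values

# With J*(Z) = p*Z^2 the HJB equation reduces to q + 2*a*p - (g^2/r)*p^2 = 0.
p = r * (a + np.sqrt(a**2 + g**2 * q / r)) / g**2   # positive root

def grad_J(Z):
    """Gradient of the trial cost J*(Z) = p*Z^2."""
    return 2.0 * p * Z

def U_star(Z):
    """Optimal control of Eq. (46), scalar case."""
    return -0.5 * (1.0 / r) * g * grad_J(Z)

def hjb_residual(Z):
    """Left-hand side of the HJB equation (47), scalar case."""
    return (q * Z**2 + grad_J(Z) * a * Z
            - 0.25 * grad_J(Z) * g * (1.0 / r) * g * grad_J(Z))

Z_samples = np.linspace(-2.0, 2.0, 9)
residuals = hjb_residual(Z_samples)    # should vanish at every sample
closed_loop_pole = a - g**2 * p / r    # Z_dot = closed_loop_pole * Z under U_star
```

For these values the open-loop system is unstable ($a > 0$), while the closed loop under `U_star` is stable. For the nonlinear dynamics (42) no such closed form is available, which is why an approximation scheme is needed.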

3.4. Optimal Compensation Term Design

To approximate the optimal cost function, an NN is adopted as

$$J_{i}^{*}(Z_{i}) = θ_{ib}^{*⊤} φ_{ib}(Z_{i}) + ε_{ib} \tag{51}$$

Here, $θ_{ib}^{*}$ represents the ideal parameter, $φ_{ib}(Z_{i})$ denotes the basis function, and $ε_{ib}$ signifies the minimum approximation error. The gradient of the optimal cost function is given by

$$\partial J_{i}^{*}(Z_{i}) / \partial Z_{i} = \nabla φ_{ib}^{⊤}(Z_{i}) θ_{ib}^{*} + \nabla ε_{ib} \tag{52}$$

where $\nabla φ_{ib}^{⊤}(Z_{i})$ and $\nabla ε_{ib}$ are the gradients of $φ_{ib}^{⊤}(Z_{i})$ and $ε_{ib}$, respectively. The optimal controller and the Hamiltonian function are expressed below:

$$U_{i}^{*}(Z_{i}) = -\frac{1}{2} R_{i}^{-1} G_{i}^{⊤} \left( \nabla φ_{ib}^{⊤}(Z_{i}) θ_{ib}^{*} + \nabla ε_{ib} \right) \tag{53}$$

$$\begin{aligned} H_{i}(Z_{i}, θ_{ib}^{*}) ={}& Q_{i}(Z_{i}) + θ_{ib}^{*⊤} (\nabla φ_{ib}^{⊤}(Z_{i}))^{⊤} F_{i}(Z_{i}) + ε_{iHJB} \\ &- \frac{1}{4} θ_{ib}^{*⊤} (\nabla φ_{ib}^{⊤}(Z_{i}))^{⊤} G_{i} R_{i}^{-1} G_{i}^{⊤} \nabla φ_{ib}^{⊤}(Z_{i}) θ_{ib}^{*} \end{aligned} \tag{54}$$

where the residual $ε_{iHJB}$ is given by

$$\begin{aligned} ε_{iHJB} ={}& (\nabla ε_{ib})^{⊤} \left( F_{i}(Z_{i}) + G_{i} U_{i}^{*} \right) \\ &+ \frac{1}{4} (\nabla ε_{ib})^{⊤} G_{i} R_{i}^{-1} G_{i}^{⊤} \nabla ε_{ib} \end{aligned} \tag{55}$$

The NN approximation of the optimal cost function is

$$\hat{J}_{i}(Z_{i}) = \hat{θ}_{ib}^{⊤} φ_{ib}(Z_{i}) \tag{56}$$

where

{\hat{θ}}_{i b}

is the approximation of

θ_{i b}^{*}

. The optimal controller estimation can be reformulated as follows:

{\hat{U_{i}}}^{*} (Z_{i}) = - \frac{1}{2} R_{i}^{- 1} G_{i}^{⊤} \nabla φ_{i b}^{⊤} (Z_{i}) {\hat{θ}}_{i b}

(57)
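
As a brief check on the structure of (57), the following sketch assumes that the utility has the standard quadratic-in-control form Q_{i} (Z_{i}) + U_{i}^{⊤} R_{i} U_{i} with a symmetric positive definite R_{i}. This form is an assumption made here for illustration, since the cost definition lies outside this section. Under that assumption, the stationarity condition of the Hamiltonian gives

```latex
\begin{aligned}
H_i &= Q_i(Z_i) + U_i^{\top} R_i U_i
      + (\nabla J_i^{*})^{\top}\bigl(F_i(Z_i) + G_i U_i\bigr), \\
0 &= \frac{\partial H_i}{\partial U_i}
   = 2 R_i U_i + G_i^{\top} \nabla J_i^{*}
   \;\Longrightarrow\;
   U_i^{*} = -\tfrac{1}{2} R_i^{-1} G_i^{\top} \nabla J_i^{*}, \\
H_i\big|_{U_i = U_i^{*}} &= Q_i(Z_i) + (\nabla J_i^{*})^{\top} F_i(Z_i)
   - \tfrac{1}{4} (\nabla J_i^{*})^{\top} G_i R_i^{-1} G_i^{\top} \nabla J_i^{*}.
\end{aligned}
```

Replacing \nabla J_{i}^{*} by the NN gradient \nabla φ_{i b}^{⊤} (Z_{i}) {\hat{θ}}_{i b} recovers the form of (57), and the factor 1/4 in the last line is the one that appears in the quadratic term of (54).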

The approximate Hamiltonian function can be obtained as follows:

\begin{matrix} {\hat{H}}_{i} (Z_{i}, {\hat{θ}}_{i b}) = & Q_{i} (Z_{i}) + {\hat{θ}}_{i b}^{⊤} {(\nabla φ_{i b}^{⊤} (Z_{i}))}^{⊤} {\hat{F}}_{i} (Z_{i} | {\hat{θ}}_{i}) \\ - \frac{1}{4} {\hat{θ}}_{i b}^{⊤} {(\nabla φ_{i b}^{⊤} (Z_{i}))}^{⊤} G_{i} R_{i}^{- 1} G_{i}^{⊤} \nabla φ_{i b}^{⊤} (Z_{i}) {\hat{θ}}_{i b} \end{matrix}

(58)

where

{\hat{F}}_{i} (Z_{i} | {\hat{θ}}_{i}) = {[{\hat{f}}_{i 1} (z_{i 1} | {\hat{θ}}_{i 1}), {\hat{f}}_{i 2} (z_{i 2} | {\hat{θ}}_{i 2}), \dots, {\hat{f}}_{i n} (z_{i n} | {\hat{θ}}_{i n})]}^{⊤}

,

{\hat{f}}_{i 1} (z_{i 1} | {\hat{θ}}_{i 1}) = {\hat{f}}_{i 1} (x_{i 1} | {\hat{θ}}_{i 1}) - {\hat{f}}_{i 1} (x_{i 1 d} | {\hat{θ}}_{i 1})

,

{\hat{f}}_{i j} (z_{i j} | {\hat{θ}}_{i j}) = {\hat{f}}_{i j} (x_{i j} | {\hat{θ}}_{i j}) - {\hat{f}}_{i j} (x_{i j d} | {\hat{θ}}_{i j}), \quad j = 2, 3, \dots, n

.
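
The quantities in (56)–(58) can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the quadratic basis `phi`, the choice Q_{i} (Z_{i}) = Z_{i}^{⊤} Q Z_{i}, and all function names are assumptions introduced here, and `F_hat` stands for a value of {\hat{F}}_{i} supplied by the observer.

```python
import numpy as np

def phi(z):
    """Hypothetical quadratic basis: all products z_a * z_b with a <= b."""
    n = z.size
    return np.array([z[a] * z[b] for a in range(n) for b in range(a, n)])

def grad_phi_T(z):
    """n x p matrix in the role of grad(phi)^T: column k is the gradient of basis k."""
    n = z.size
    cols = []
    for a in range(n):
        for b in range(a, n):
            g = np.zeros(n)
            g[a] += z[b]   # d(z_a z_b)/dz_a
            g[b] += z[a]   # d(z_a z_b)/dz_b (adds up to 2 z_a when a == b)
            cols.append(g)
    return np.stack(cols, axis=1)

def cost_hat(theta, z):
    """Approximate cost (56): theta^T phi(z)."""
    return theta @ phi(z)

def control_hat(theta, z, G, R):
    """Approximate optimal control (57): -1/2 R^{-1} G^T grad(phi)^T theta."""
    return -0.5 * np.linalg.solve(R, G.T @ grad_phi_T(z) @ theta)

def hamiltonian_hat(theta, z, Q, F_hat, G, R):
    """Approximate Hamiltonian (58), assuming Q_i(Z) = Z^T Q Z."""
    grad_J = grad_phi_T(z) @ theta
    M = G @ np.linalg.solve(R, G.T)          # G R^{-1} G^T
    return z @ Q @ z + grad_J @ F_hat - 0.25 * grad_J @ M @ grad_J
```

For the toy case of zero drift with identity Q, G, and R, the weights that encode J = ∥Z∥^2 give a zero Hamiltonian residual, which is a convenient sanity check for the signs and the factor 1/4.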

We choose the weight updating law of

{\hat{θ}}_{i b}

as follows:

\begin{matrix} {\dot{\hat{θ}}}_{i b} = & - [{(\nabla φ_{i b}^{⊤} (Z_{i}))}^{⊤} {\hat{F}}_{i} (Z_{i} | {\hat{θ}}_{i}) \\ - \frac{1}{2} {(\nabla φ_{i b}^{⊤} (Z_{i}))}^{⊤} G_{i} R_{i}^{- 1} G_{i}^{⊤} \nabla φ_{i b}^{⊤} (Z_{i}) {\hat{θ}}_{i b}] \\ \times [Q_{i} (Z_{i}) + {\hat{θ}}_{i b}^{⊤} {(\nabla φ_{i b}^{⊤} (Z_{i}))}^{⊤} {\hat{F}}_{i} (Z_{i} | {\hat{θ}}_{i}) \\ - \frac{1}{4} {\hat{θ}}_{i b}^{⊤} {(\nabla φ_{i b}^{⊤} (Z_{i}))}^{⊤} G_{i} R_{i}^{- 1} G_{i}^{⊤} \nabla φ_{i b}^{⊤} (Z_{i}) {\hat{θ}}_{i b}] \end{matrix}

(59)
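
The updating law (59) has the form of a gradient descent on half the squared approximate Hamiltonian: the first bracket is the derivative of {\hat{H}}_{i} with respect to {\hat{θ}}_{i b}, and the second bracket is {\hat{H}}_{i} itself. A minimal sketch of one evaluation is given below. The function names, the explicit Euler step, and the argument `q_val` (the scalar value of Q_{i} (Z_{i})) are assumptions for illustration only; `dphi` is the n × p matrix in the role of \nabla φ_{i b}^{⊤} (Z_{i}).

```python
import numpy as np

def critic_weight_rate(theta, dphi, q_val, F_hat, G, R):
    """Right-hand side of (59): -(dH_hat/dtheta) * H_hat."""
    M = G @ np.linalg.solve(R, G.T)                     # G R^{-1} G^T
    # First bracket of (59): derivative of H_hat with respect to theta.
    sigma = dphi.T @ F_hat - 0.5 * dphi.T @ M @ dphi @ theta
    # Second bracket of (59): the approximate Hamiltonian (58).
    grad_J = dphi @ theta
    H_hat = q_val + grad_J @ F_hat - 0.25 * grad_J @ M @ grad_J
    return -sigma * H_hat

def euler_step(theta, rate, dt):
    """Hypothetical explicit Euler integration of the weight dynamics."""
    return theta + dt * rate
```

In a scalar toy problem with basis z^2 evaluated at z = 1, zero drift, and unit Q, G, and R, the rate is positive for a weight of 0.5 and vanishes at a weight of 1, where the Hamiltonian residual is zero.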

The estimation error of the optimal cost function parameter is defined as

{\tilde{θ}}_{i b} = θ_{i b}^{*} - {\hat{θ}}_{i b}

. From this definition, we can obtain that

\begin{matrix} {\hat{H}}_{i} (Z_{i}, {\hat{θ}}_{i b}) = & \frac{1}{2} {\tilde{θ}}_{i b}^{⊤} {(\nabla φ_{i b}^{⊤} (Z_{i}))}^{⊤} G_{i} R_{i}^{- 1} G_{i}^{⊤} \nabla φ_{i b}^{⊤} (Z_{i}) θ_{i b}^{*} \\ - {\tilde{θ}}_{i b}^{⊤} {(\nabla φ_{i b}^{⊤} (Z_{i}))}^{⊤} {\tilde{F}}_{i} (Z_{i}) - ε_{i HJB} \\ - \frac{1}{4} {\tilde{θ}}_{i b}^{⊤} {(\nabla φ_{i b}^{⊤} (Z_{i}))}^{⊤} G_{i} R_{i}^{- 1} G_{i}^{⊤} \nabla φ_{i b}^{⊤} (Z_{i}) {\tilde{θ}}_{i b} \end{matrix}

(60)

The error dynamics of Equation (59) can be expressed as follows:

\begin{matrix} {\dot{\tilde{θ}}}_{i b} = & - [{(\nabla φ_{i b}^{⊤} (Z_{i}))}^{⊤} \dot{Z_{i}} - {(\nabla φ_{i b}^{⊤} (Z_{i}))}^{⊤} \tilde{F_{i}} (Z_{i}) \\ + {(\nabla φ_{i b}^{⊤} (Z_{i}))}^{⊤} G_{i} R_{i}^{- 1} G_{i}^{⊤} \nabla φ_{i b}^{⊤} (Z_{i}) {\tilde{θ}}_{i b} \\ + \frac{1}{2} {(\nabla φ_{i b}^{⊤} (Z_{i}))}^{⊤} G_{i} R_{i}^{- 1} G_{i}^{⊤} \nabla ε_{i b} (Z_{i})] \\ \times [{\tilde{θ}}_{i b}^{⊤} {(\nabla φ_{i b}^{⊤} (Z_{i}))}^{⊤} \dot{Z_{i}} + {\hat{θ}}_{i b}^{⊤} {(\nabla φ_{i b}^{⊤} (Z_{i}))}^{⊤} \tilde{F_{i}} (Z_{i}) \\ + \frac{1}{4} {\tilde{θ}}_{i b}^{⊤} {(\nabla φ_{i b}^{⊤} (Z_{i}))}^{⊤} G_{i} R_{i}^{- 1} G_{i}^{⊤} \nabla φ_{i b}^{⊤} (Z_{i}) {\tilde{θ}}_{i b} \\ + \frac{1}{2} {\tilde{θ}}_{i b}^{⊤} {(\nabla φ_{i b}^{⊤} (Z_{i}))}^{⊤} G_{i} R_{i}^{- 1} G_{i}^{⊤} \nabla ε_{i b} (Z_{i}) + ε_{i HJB}] \end{matrix}

(61)

where

{\tilde{F}}_{i} (Z_{i}) = F_{i} (Z_{i}) - {\hat{F}}_{i} (Z_{i} | {\hat{θ}}_{i})

.

3.5. Stability Analysis

Theorem 1.

Consider the multiple-UMV system (1) with the weight updating law (39), the adaptive controller (38), the optimal compensation term (57), and the updating law (59) for the cost function weights. If the design parameters are selected appropriately, then all signals in the closed-loop system are bounded, and the system outputs optimally track the reference signal.

Proof.

Consider the following Lyapunov function:

\begin{matrix} V_{i} = & \frac{1}{2} {\tilde{X}}_{i}^{⊤} P_{i 1} {\tilde{X}}_{i} + \frac{1}{2} {\tilde{Y}}_{i}^{⊤} P_{i 2} {\tilde{Y}}_{i} + \frac{1}{2} {\tilde{Ψ}}_{i}^{⊤} P_{i 3} {\tilde{Ψ}}_{i} + \frac{1}{2} z_{i 1}^{⊤} z_{i 1} \\ + \frac{1}{2} z_{i 2}^{⊤} z_{i 2} + \frac{1}{2} {\tilde{θ}}_{i}^{⊤} {\tilde{θ}}_{i} + \frac{1}{2} {\tilde{θ}}_{i b}^{⊤} {\tilde{θ}}_{i b} + \frac{1}{2} {\tilde{q}}_{i}^{⊤} {\tilde{q}}_{i} \end{matrix}

(62)

Differentiating (62) with respect to time yields

\begin{matrix} {\dot{V}}_{i} = & \frac{1}{2} {\dot{\tilde{X}}}_{i}^{⊤} P_{i 1} {\tilde{X}}_{i} + \frac{1}{2} {\tilde{X}}_{i}^{⊤} P_{i 1} {\dot{\tilde{X}}}_{i} + \frac{1}{2} {\dot{\tilde{Y}}}_{i}^{⊤} P_{i 2} {\tilde{Y}}_{i} + \frac{1}{2} {\tilde{Y}}_{i}^{⊤} P_{i 2} {\dot{\tilde{Y}}}_{i} + \frac{1}{2} {\dot{\tilde{Ψ}}}_{i}^{⊤} P_{i 3} {\tilde{Ψ}}_{i} \\ + \frac{1}{2} {\tilde{Ψ}}_{i}^{⊤} P_{i 3} {\dot{\tilde{Ψ}}}_{i} + z_{i 1}^{⊤} {\dot{z}}_{i 1} + z_{i 2}^{⊤} {\dot{z}}_{i 2} + {\tilde{θ}}_{i}^{⊤} {\dot{\tilde{θ}}}_{i} + {\tilde{θ}}_{i b}^{⊤} {\dot{\tilde{θ}}}_{i b} + {\tilde{q}}_{i}^{⊤} {\dot{\tilde{q}}}_{i} \end{matrix}

(63)

Assume that

∥{(\nabla φ_{i b}^{⊤} (Z_{i}))}^{⊤} R_{i}^{- 1} \nabla φ_{i b}^{⊤} (Z_{i})∥ \leq ℓ_{i 2}

,

∥F_{i} (Z_{i}) + G_{i} U_{i}^{*}∥ \leq c_{i} \sqrt{∥Z_{i}∥}

,

∥\nabla ε_{i b} (Z_{i})∥ \leq ε_{i b m}

, and

∥\nabla φ_{i b}^{⊤} (Z_{i})∥ \leq φ_{i b m}

, where

ℓ_{i 2}

,

c_{i}

,

ε_{i b m}

, and

φ_{i b m}

are positive constants. By applying Young’s inequality and the Cauchy–Schwarz inequality to (63), we can obtain the following inequality:

\begin{matrix} {\dot{V}}_{i} \leq & - ℓ_{i 1} {∥Z_{i}∥}^{2} + ℓ_{i 2} ∥Z_{i}∥ - ℓ_{i 3} {∥{\tilde{θ}}_{i}∥}^{4} + ℓ_{i 4} {∥{\tilde{θ}}_{i}∥}^{2} \\ - ℓ_{i 5} {∥{\tilde{θ}}_{i b}∥}^{4} + ℓ_{i 6} {∥{\tilde{θ}}_{i b}∥}^{2} - ℓ_{i 7} {∥{\tilde{q}}_{i}∥}^{2} - ℓ_{i 8} {∥ {\tilde{X}}_{i} ∥}^{2} \\ - ℓ_{i 9} {∥ {\tilde{Y}}_{i} ∥}^{2} - ℓ_{i 10} {∥ {\tilde{Ψ}}_{i} ∥}^{2} + ℓ_{i 11} \end{matrix}

(64)

where

\begin{matrix} ℓ_{i 1} = γ_{i} - \frac{11}{4} c_{i}^{4} - \frac{1}{2} c_{i}^{4} φ_{i b m}^{4} \end{matrix}

(65)

\begin{matrix} ℓ_{i 2} = 2 c_{i}^{2} φ_{i b m}^{2} \end{matrix}

(66)

\begin{matrix} ℓ_{i 3} = \frac{1}{10} - \frac{7}{2} φ_{i m}^{4} \end{matrix}

(67)

\begin{matrix} ℓ_{i 4} = \frac{1}{4} φ_{i m}^{2} + 2 θ_{i b}^{*} φ_{i m}^{4} - \frac{1}{2} \end{matrix}

(68)

ℓ_{i 5} = \frac{1}{4} π_{i 5}^{2} - \frac{99}{32} φ_{i m}^{4} {∥G_{i}∥}^{4} {∥R_{i}^{- 1}∥}^{2} - \frac{77}{16} φ_{i m}^{4}

(69)

ℓ_{i 6} = \frac{3}{8} φ_{i m}^{2} {∥G_{i}∥}^{4} {∥R_{i}^{- 1}∥}^{2} ε_{i b m}^{2} + \frac{1}{8} φ_{i m}^{2} {∥G_{i}∥}^{4} {∥R_{i}^{- 1}∥}^{2}

(70)

ℓ_{i 7} = K_{i d} R_{i} (ψ_{i}) M_{i}^{- 1} (ψ_{i}) - \frac{1}{2} - 2 K_{i d}^{2} φ_{i m}^{2} - \frac{1}{2} A_{i}

(71)

ℓ_{i 8} = σ_{i 1}

(72)

ℓ_{i 9} = σ_{i 2}

(73)

ℓ_{i 10} = σ_{i 3}

(74)

\begin{matrix} ℓ_{i 11} = & \frac{3}{32} {∥G_{i}∥}^{8} {∥R_{i}^{- 1}∥}^{4} ε_{i b m}^{4} + \frac{3}{32} {∥G_{i}∥}^{4} {∥R_{i}^{- 1}∥}^{2} ε_{i b m}^{4} \\ + \frac{1}{2} ε_{i m}^{2} + \frac{1}{2} d_{i m}^{2} + \frac{1}{2} ∥ θ_{i}^{*} ∥^{2} + \frac{1}{2} {∥ θ_{i}^{*} ∥}^{4} \end{matrix}

(75)

Assume that at least one of the following inequalities holds:

∥Z_{i}∥ > \frac{- ℓ_{i 2} + \sqrt{ℓ_{i 2}^{2} + 4 ℓ_{i 1} ℓ_{i 11}}}{2 ℓ_{i 1}}

(76)

or

∥{\tilde{θ}}_{i}∥ > \sqrt{\frac{- ℓ_{i 4} + \sqrt{ℓ_{i 4}^{2} + 4 ℓ_{i 3} ℓ_{i 11}}}{2 ℓ_{i 3}}}

(77)

or

∥{\tilde{θ}}_{i b}∥ > \sqrt{\frac{- ℓ_{i 6} + \sqrt{ℓ_{i 6}^{2} + 4 ℓ_{i 5} ℓ_{i 11}}}{2 ℓ_{i 5}}}

(78)

or

∥{\tilde{X}}_{i}∥ > \sqrt{\frac{ℓ_{i 11}}{ℓ_{i 8}}}

(79)

or

∥{\tilde{Y}}_{i}∥ > \sqrt{\frac{ℓ_{i 11}}{ℓ_{i 9}}}

(80)

or

∥{\tilde{Ψ}}_{i}∥ > \sqrt{\frac{ℓ_{i 11}}{ℓ_{i 10}}}

(81)

or

∥{\tilde{q}}_{i}∥ > \sqrt{\frac{ℓ_{i 11}}{ℓ_{i 7}}}

(82)

Thus, according to the standard Lyapunov extension theorem [31], we can obtain

{\dot{V}}_{i} < 0

, and all signals within the closed-loop system remain bounded. □
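
For a given set of design constants, the thresholds on the right-hand sides of (77)–(82) can be evaluated directly. The sketch below only evaluates those printed expressions; the function names and the numerical values in the usage note are hypothetical and are not taken from the simulation.

```python
import math

def quadratic_threshold(l_k, l_11):
    """Right-hand side of (79)-(82): sqrt(l_11 / l_k), for l_k > 0 and l_11 >= 0."""
    if l_k <= 0 or l_11 < 0:
        raise ValueError("requires l_k > 0 and l_11 >= 0")
    return math.sqrt(l_11 / l_k)

def quartic_threshold(l_a, l_b, l_11):
    """Right-hand side of (77)-(78):
    sqrt((-l_b + sqrt(l_b^2 + 4 l_a l_11)) / (2 l_a)), for l_a > 0 and l_11 >= 0."""
    if l_a <= 0 or l_11 < 0:
        raise ValueError("requires l_a > 0 and l_11 >= 0")
    inner = (-l_b + math.sqrt(l_b ** 2 + 4.0 * l_a * l_11)) / (2.0 * l_a)
    return math.sqrt(inner)
```

For example, `quadratic_threshold(4.0, 1.0)` evaluates the bound in (79) for the hypothetical values ℓ_{i 8} = 4 and ℓ_{i 11} = 1. Larger gains in the denominators shrink the thresholds, which is the usual trade-off when tuning for a smaller ultimate bound.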

4. Simulations and Comparison Study

4.1. Simulation Experiment

The simulation is carried out in MATLAB R2018a. The parameters of CyberShip II are adopted in the simulation; the details of this model can be found in [33]. Three UMVs are used in the simulation experiment, which are named

{UMV}_{1}

,

{UMV}_{2}

, and

{UMV}_{3}

, respectively.

R_{i}

is designed as

0.01 \times I_{3 \times 3}

. The initial positions of the three UMVs are

x_{1} (0) = 0, y_{1} (0) = 0.7, ψ_{1} = 10

,

x_{2} (0) = - 0.3, y_{2} (0) = - 0.5, ψ_{2} = - 20

,

x_{3} (0) = 0.3, y_{3} (0) = - 0.5, ψ_{3} = 10

, and

x_{4} (0) = 0.4, y_{4} (0) = 0, ψ_{4} = - 20

, respectively. The relative position parameters are given as follows:

p_{1 x} = \sqrt{2} / 2

,

p_{1 y} = 0

,

p_{2 x} = - 1 / 2

,

p_{2 y} = - 1 / 2

,

p_{3 x} = - 1 / 2

,

p_{3 y} = 1 / 2

,

p_{4 x} = - 1 / 5

, and

p_{4 y} = 0

. The design parameters are chosen as

k_{i 11} = k_{i 12} = k_{i 21} = k_{i 22} = k_{i 31} = k_{i 32} = 50

, and

k_{i d} = 10

. The applied disturbances are chosen as follows:

d_{i 1} = 20 (0.5 \cos (0.25 t) + 0.5 \sin (0.15 t))

,

d_{i 2} = 20 (- 0.5 \cos (0.15 t) - 0.5 \sin (0.25 t))

, and

d_{i 3} = 2 (0.5 \cos (0.25 t) + 0.5 \sin (0.15 t))

. The desired tracking signal of the virtual leader is given as

η_{d} = {[10 \sin (0.02 t), 10 (1 - \cos (0.02 t)), 0.02 t]}^{⊤}

.
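
The disturbance and reference signals above can be reproduced with a short script, for instance to check their amplitudes before running a simulation. This is a sketch in Python rather than the MATLAB code used for the paper; the function names are hypothetical, and the second component of the reference is read as 10 (1 − cos (0.02 t)).

```python
import numpy as np

def disturbance(t):
    """Applied disturbances [d_i1, d_i2, d_i3] at time t (scalar or array)."""
    t = np.asarray(t, dtype=float)
    d1 = 20.0 * (0.5 * np.cos(0.25 * t) + 0.5 * np.sin(0.15 * t))
    d2 = 20.0 * (-0.5 * np.cos(0.15 * t) - 0.5 * np.sin(0.25 * t))
    d3 = 2.0 * (0.5 * np.cos(0.25 * t) + 0.5 * np.sin(0.15 * t))
    return np.stack([d1, d2, d3])

def reference(t):
    """Virtual-leader reference eta_d = [x_d, y_d, psi_d] at time t."""
    t = np.asarray(t, dtype=float)
    return np.stack([
        10.0 * np.sin(0.02 * t),
        10.0 * (1.0 - np.cos(0.02 * t)),
        0.02 * t,
    ])
```

The position part of the reference traces a circle of radius 10 centered at (0, 10), and the heading 0.02 t stays tangent to that circle. Each disturbance is a sum of two sinusoids with equal weights of 0.5, so its magnitude cannot exceed the leading scale factor (20, 20, and 2).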

The simulation results are presented in Figures 2–14. The tracking trajectory of the UMVs is shown in Figure 2, where it can be seen that the tracking control objective is achieved and the UMVs maintain the desired formation. The tracking errors in the x and y directions are shown in Figures 3 and 4, respectively. Initially, the errors are large because of the distance between the UMVs’ initial positions and the desired positions. However, once the UMVs approach the desired positions, the tracking errors tend to zero. Figures 5–7 present the observer errors of the NSO. Since the initial value of the NSO differs from the actual value, the initial error is large. However, the designed updating law drives the NSO’s estimates toward the true velocity values. The disturbance observer (DO) errors are shown in Figures 8–10. As mentioned before, the applied disturbances

d_{i 1}

,

d_{i 2}

, and

d_{i 3}

oscillate within the ranges (−20, 20), (−20, 20), and (−2, 2), respectively. From Figures 8–10, it can be observed that the DO errors stabilize within the ranges (−2, 2), (−4, 4), and (−0.5, 0.5), respectively, demonstrating that the designed DO can effectively reduce the impact of the disturbances. The controller outputs are shown in Figures 11–13, while the designed optimal index is shown in Figure 14. It can be seen that the proposed controller minimizes the optimal index.

In conclusion, the simulation results demonstrate that the proposed optimal algorithm successfully handles the control task, and the NSO and DO can estimate the unmeasured velocities and unknown disturbances.

4.2. Comparison Study

To further demonstrate the disturbance rejection capability of the proposed algorithm, a comparative study is presented in this section. The controller used for comparison is the fuzzy adaptive optimal controller of [34], which has no disturbance rejection technique. The experimental conditions, initial conditions, and applied disturbances are the same as those given in Section 4.1.

The simulation results are presented in Figures 15–21. As shown in Figure 15, the controller proposed in [34] achieves the tracking control objective. However, Figures 16 and 17 show that the tracking errors are larger than those in Figures 3 and 4. From Figures 18–20, it can be observed that the controller output is larger than that in Figures 11–13. This suggests that, under the applied disturbances, the lack of a disturbance rejection technique leads to larger tracking errors, which in turn require a greater controller output to reduce the tracking errors and achieve the control objective. This increased output results in a higher optimal index, as seen in Figure 21. Comparing Figure 21 with Figure 14, it becomes clear that the proposed controller achieves better performance in minimizing the optimal index. Therefore, the simulation results in Figures 15–21 show that, under the disturbances

d_{i 1}

,

d_{i 2}

, and

d_{i 3}

given in Section 4.1, the proposed control method with the DO achieves better control performance than the existing optimal control method without a disturbance rejection technique [34].

5. Conclusions

This paper has presented an observer-based adaptive optimal controller for tracking control problems of multiple UMVs with uncertain dynamics and unmeasured velocities. NSOs have been developed to handle the unmeasured velocities and the uncertain dynamics of the UMVs. An adaptive controller with an optimal compensation term has been implemented to achieve the control objective optimally. Additionally, a DO has been proposed to address unknown external disturbances. It has been proven that the proposed controller ensures that all signals in the closed-loop system remain bounded. Simulation results have been provided to demonstrate the effectiveness of the proposed algorithm. From these results, it can be concluded that the proposed control algorithm has achieved the control objectives and minimized the optimal index. The NSO and DO have effectively estimated the unmeasured velocities and unknown disturbances. To further illustrate the disturbance rejection capability of the proposed algorithm, a comparison study has been conducted using an existing method [34]. The simulation results of this comparison study have shown that, under the applied external disturbances, the proposed control algorithm can achieve better control performance by adopting the disturbance rejection technique.

Author Contributions

Methodology, L.-E.Y. and T.L.; software, L.-E.Y.; resources, Y.X. and T.L.; writing—original draft, L.-E.Y.; writing—review & editing, Y.X. and D.Z.; supervision, Y.X., T.L. and D.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grants 51939001, 52301418, 61751202, and 61976033, and in part by the Liaoning Revitalization Talents Program under Grant XLYC1908018. This work was also supported by the China Scholarship Council (Fund No. 202206570022).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Peng, Z.; Wang, J.; Wang, D.; Han, Q.L. An Overview of Recent Advances in Coordinated Control of Multiple Autonomous Surface Vehicles. IEEE Trans. Ind. Inform. 2021, 17, 732–745.
2. Fossen, T.I. Marine Control Systems: Guidance, Navigation, and Control of Ships, Rigs and Underwater Vehicles; Marine Cybernetics: Trondheim, Norway, 2002.
3. Peng, Z.; Liu, L.; Wang, J. Output-Feedback Flocking Control of Multiple Autonomous Surface Vehicles Based on Data-Driven Adaptive Extended State Observers. IEEE Trans. Cybern. 2021, 51, 4611–4622.
4. Peng, B.; Gu, N.; Wang, D.; Peng, Z. Model-Free Adaptive Disturbance Rejection Control of An RSV With Hardware-in-The-Loop Experiments. IEEE Trans. Ind. Electron. 2023, 70, 7507–7510.
5. Wu, W.; Peng, Z.; Wang, D.; Liu, L.; Han, Q.L. Network-Based Line-of-Sight Path Tracking of Underactuated Unmanned Surface Vehicles With Experiment Results. IEEE Trans. Cybern. 2022, 52, 10937–10947.
6. Peng, Z.; Wang, D.; Chen, Z.; Hu, X.; Lan, W. Adaptive Dynamic Surface Control for Formations of Autonomous Surface Vehicles With Uncertain Dynamics. IEEE Trans. Control Syst. Technol. 2013, 21, 513–520.
7. Gao, S.; Peng, Z.; Liu, L.; Wang, D.; Han, Q.L. Fixed-Time Resilient Edge-Triggered Estimation and Control of Surface Vehicles for Cooperative Target Tracking Under Attacks. IEEE Trans. Intell. Veh. 2023, 8, 547–556.
8. Peng, Z.; Wang, D.; Li, T.; Han, M. Output-Feedback Cooperative Formation Maneuvering of Autonomous Surface Vehicles With Connectivity Preservation and Collision Avoidance. IEEE Trans. Cybern. 2020, 50, 2527–2535.
9. Wang, F.; Zhang, H.; Liu, D. Adaptive Dynamic Programming: An Introduction. IEEE Comput. Intell. Mag. 2009, 4, 39–47.
10. Zhang, G.; Zhu, Q. Event-triggered optimal control for nonlinear stochastic systems via adaptive dynamic programming. Nonlinear Dyn. 2021, 105, 387–401.
11. Yuan, L.; Li, T.; Tong, S.; Xiao, Y.; Shan, Q. Broad Learning System Approximation-Based Adaptive Optimal Control for Unknown Discrete-Time Nonlinear Systems. IEEE Trans. Syst. Man Cybern. Syst. 2022, 52, 5028–5038.
12. Huang, Z.; Bai, W.; Li, T.; Long, Y.; Chen, C.P.; Liang, H.; Yang, H. Adaptive reinforcement learning optimal tracking control for strict-feedback nonlinear systems with prescribed performance. Inf. Sci. 2023, 621, 407–423.
13. Werbos, P.J. Approximate dynamic programming for realtime control and neural modelling. In Handbook of Intelligent Control: Neural, Fuzzy and Adaptive Approaches; Van Nostrand Reinhold: New York, NY, USA, 1992; pp. 493–525.
14. Werbos, P.J. Consistency of HDP applied to a simple reinforcement learning problem. Neural Netw. 1990, 3, 179–189.
15. Jiang, Z.; Jiang, Y. Robust adaptive dynamic programming for linear and nonlinear systems: An overview. Eur. J. Control 2013, 19, 417–425.
16. Gao, X.; Long, Y.; Li, T.; Hu, X.; Chen, C.L.P.; Sun, F. Optimal Fuzzy Output Feedback Control for Dynamic Positioning of Vessels With Finite-Time Disturbance Rejection Under Thruster Saturations. IEEE Trans. Fuzzy Syst. 2023, 31, 3447–3458.
17. Wang, N.; Gao, Y.; Zhang, X. Data-driven performance-prescribed reinforcement learning control of an unmanned surface vehicle. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 5456–5467.
18. Wang, N.; Gao, Y.; Zhao, H.; Ahn, C.K. Reinforcement Learning-Based Optimal Tracking Control of an Unknown Unmanned Surface Vehicle. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 3034–3045.
19. Bellman, R.E. Dynamic Programming; Princeton Univ. Press: Princeton, NJ, USA, 1957.
20. Peng, Z.; Wang, J.; Wang, D. Distributed Containment Maneuvering of Multiple Marine Vessels via Neurodynamics-Based Output Feedback. IEEE Trans. Ind. Electron. 2017, 64, 3831–3839.
21. Jiang, Y.; Peng, Z.; Wang, D.; Yin, Y.; Han, Q.L. Cooperative Target Enclosing of Ring-Networked Underactuated Autonomous Surface Vehicles Based on Data-Driven Fuzzy Predictors and Extended State Observers. IEEE Trans. Fuzzy Syst. 2022, 30, 2515–2528.
22. Deng, Y.; Zhang, X. Event-Triggered Composite Adaptive Fuzzy Output-Feedback Control for Path Following of Autonomous Surface Vessels. IEEE Trans. Fuzzy Syst. 2021, 29, 2701–2713.
23. Chen, W.H.; Yang, J.; Guo, L.; Li, S. Disturbance-Observer-Based Control and Related Method: An Overview. IEEE Trans. Ind. Electron. 2016, 63, 1083–1095.
24. Hu, X.; Wei, X.; Kao, Y.; Han, J. Robust Synchronization for Under-Actuated Vessels Based on Disturbance Observer. IEEE Trans. Intell. Transp. Syst. 2022, 23, 5470–5479.
25. Do, K. Practical control of underactuated ships. Ocean Eng. 2010, 37, 1111–1119.
26. von Ellenrieder, K.D. Dynamic surface control of trajectory tracking marine vehicles with actuator magnitude and rate limits. Automatica 2019, 105, 433–442.
27. Li, Z.; Sun, J.; Oh, S. Design, analysis and experimental validation of a robust nonlinear path following controller for marine surface vessels. Automatica 2009, 45, 1649–1658.
28. Gao, X.; Li, T.; Yuan, L.; Bai, W. Robust Fuzzy Adaptive Output Feedback Optimal Tracking Control for Dynamic Positioning of Marine Vessels with Unknown Disturbances and Uncertain Dynamics. Int. J. Fuzzy Syst. 2021.
29. Gao, X.; Bai, W.; Li, T.; Yuan, L.; Long, Y. Broad learning system-based adaptive optimal control design for dynamic positioning of marine vessels. Nonlinear Dyn. 2021, 105, 1593–1609.
30. Wondergem, M.; Lefeber, E.; Pettersen, K.Y.; Nijmeijer, H. Output Feedback Tracking of Ships. IEEE Trans. Control Syst. Technol. 2011, 19, 442–448.
31. Sarangapani, J. Neural Network Control of Nonlinear Discrete-Time Systems; CRC Press: Boca Raton, FL, USA, 2018.
32. Vrabie, D.; Lewis, F. Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems. Neural Netw. 2009, 22, 237–246.
33. Skjetne, R.; Smogeli, Ø.; Fossen, T.I. Modeling, identification, and adaptive maneuvering of Cybership II: A complete design with experiments. IFAC Proc. Vol. 2004, 37, 203–208.
34. Sun, K.; Li, Y.; Tong, S. Fuzzy Adaptive Output Feedback Optimal Control Design for Strict-Feedback Nonlinear Systems. IEEE Trans. Syst. Man Cybern. Syst. 2017, 47, 33–44.

Figure 1. Illustration of the coordinate frames NED (

O_{E} X_{E} Y_{E}

) and BF (

O_{B} X_{B} Y_{B}

).

Figure 2. Tracking trajectory of UMVs.

Figure 3. Tracking error in the x-direction.

Figure 4. Tracking error in the y-direction.

Figure 5. Observer error of NSO

{\tilde{u}}_{i}

.

Figure 6. Observer error of NSO

{\tilde{v}}_{i}

.

Figure 7. Observer error of NSO

{\tilde{r}}_{i}

.

Figure 8. Observer error of DO

{\tilde{d}}_{i 1}

.

Figure 9. Observer error of DO

{\tilde{d}}_{i 2}

.

Figure 10. Observer error of DO

{\tilde{d}}_{i 3}

.

Figure 11. Controller output

τ_{i 1}

.

Figure 12. Controller output

τ_{i 2}

.

Figure 13. Controller output

τ_{i 3}

.

Figure 14. Optimal index

r_{i} (Z_{i} (t), U_{i}^{*} (t))

.

Figure 15. Tracking trajectory of the existing control method [34].

Figure 16. Tracking error in the x-direction of the existing control method [34].

Figure 17. Tracking error in the y-direction of the existing control method [34].