1. Introduction
Underwater manipulators have become an important tool for performing seabed operations in various underwater missions [1]. Examples include the “Kraft Raptor”, a hydraulic underwater manipulator developed by Kraft TeleRobotics [2]; the “Eca robotics 7E”, an electrically driven underwater manipulator developed by ECA Group; and the “Ansaldo Maris 7080”, a six-DOF electrical manipulator developed by autonomous systems laboratory [3]. In the complex seabed environment, high-precision control and dynamic modeling of underwater manipulators are current research hotspots [4]. At present, the main factors that degrade the control system of underwater manipulators are model uncertainties such as hydrodynamic components, highly nonlinear dynamics, and ocean current interference. To address these problems, researchers have proposed control algorithms such as proportional-integral-derivative (PID) control, adaptive robust control, sliding mode control (SMC), model-based control, and model predictive control (MPC).
An adaptive robust controller combined with the backstepping method has been proposed, in which the adaptive robust controller deals with external interference and model uncertainty of the manipulator, while the backstepping method simplifies its dynamic model [5]. An adaptive non-singular fast terminal SMC with an extended state observer (ESO) has been proposed to solve the trajectory tracking problem when the system is subjected to a lumped disturbance of parameter uncertainty and external interference [6]. An adaptive PID control method has been proposed, in which a nonlinear disturbance observer compensates for ocean current interference and torsional flexibility [7]. A comprehensive framework based on PID and fuzzy logic control has been proposed for the control of an underwater vehicle-manipulator system (UVMS) [8]. An adaptive controller based on a neural network has been proposed, in which the neural network compensates for the underwater disturbance torque to improve the robustness of the manipulator control system [9]. An adaptive controller based on observer compensation has been proposed, in which the observer estimates the friction torque of the joint and compensates the controller [10]. A disturbance observer-based control framework for an autonomous underwater biomimetic submersible robotic arm system has been proposed for underwater maneuvering under unknown external disturbances [11]. An adaptive fuzzy sliding mode control strategy has been proposed to improve the robustness of underwater robotic arms and solve the chattering problem of traditional SMC [12]. A continuous non-singular finite-time control method has been proposed, in which a non-singular fast terminal sliding mode ensures finite-time convergence of the controller, and a high-order super-twisting disturbance observer estimates the ocean current interference and parameter uncertainty [13].
Because it can incorporate input and state constraints, MPC has received considerable attention from researchers and engineers in the field of underwater manipulators [14]. In recent decades, MPC has been shown to provide an effective control strategy for underwater manipulator control systems [15].
An adaptive robust MPC method has been proposed, in which an extended Kalman filter (EKF) estimates and compensates for unmodeled uncertainties, sensor noise, ocean currents, and other disturbances of the underwater manipulator; this method can easily constrain the control output and system state variables [16]. A tube-based robust MPC method has been proposed, in which SMC serves as a robust control term to eliminate the error between the nominal model and the actual model, that is, the lumped internal and external interference. The advantage of this method is that the interference part and the nominal part of the dynamic equation can be treated separately [17]. MPC using machine learning algorithms has also been extensively studied. An MPC method based on neural networks has been proposed, in which double radial basis function neural networks (RBFNNs) perform online model estimation to handle the uncertainty of the underwater manipulator model [18]. An MPC controller based on reinforcement learning has been proposed, in which reinforcement learning optimizes the motion trajectory to improve the working efficiency of the manipulator [19]. An MPC method using neural networks has been proposed that ensures recursive feasibility and asymptotic stability [20]. To address the degradation of control performance caused by external interference and different payloads, a neural network-based adaptive MPC was proposed for the underwater manipulator [21], in which the neural network fits the dynamic model from data so that the MPC controller can update its dynamic parameters.
Traditional MPC relies heavily on the established nominal model. A large difference between the nominal model and the actual system significantly degrades the control performance and can even cause the entire control system to diverge. Gaussian process regression (GPR) is a non-parametric regression model that uses the properties of Gaussian processes (GPs) to perform regression analysis on known data. GPR can model the behavior of other systems through an appropriate combination of GPs, and makes predictions based on the Bayesian framework and prior knowledge. The GPR model has few parameters and strong flexibility, which gives it great advantages in small-sample data analysis [22,23,24]. However, its use is limited to a few thousand training samples because its time complexity is cubic in the number of training samples [24]. Therefore, this paper addresses the online estimation of external interference in underwater robotic arms, and improves the original GPR by using a sliding window to sample the state values of nearby moments. A small amount of data around the current moment is used as the GPR training set, and the sampling window moves over time to extract current environmental information under small-data conditions, ensuring both the real-time performance and the accuracy of the estimation algorithm.
In recent years, many scholars have also studied the combination of GPR and MPC. An MPC-based control method that uses a GP to estimate additional nonlinear dynamics terms and compensates the nominal model with the estimated results was proposed in [25]. MPC combined with GPR has also been applied to mobile robots, where GPR compensates the dynamic model of the robot in real time from state measurement data to address model inaccuracy [26]. Therefore, this paper uses GPR to estimate the unknown dynamic part of the actual system online and to compensate the nominal model. This paper is the first to apply a combination of GPR and MPC in the field of underwater robotic arm control. The proposed method combines the advantages of both: the high accuracy and easily specified output constraints of MPC, and the ability of GPR to achieve good training results with only small samples. For the specific problem in which the underwater manipulator is disturbed by fluid forces when working underwater, a new solution is proposed that enables the manipulator to adapt to the real underwater environment after its parameters are tuned only once on land. This avoids the difficulty of tuning and observation in the actual underwater environment, and improves the environmental adaptability and robustness of traditional MPC.
Moreover, on the basis of the dynamic model established by the Lagrange equation [27], the influence of water resistance and added mass on the motion of the underwater manipulator is further considered [28], so that the established dynamic model is as consistent as possible with real underwater conditions and the simulation experiments better reflect them. To summarize, the main contributions of this paper are as follows:
An accurate dynamic model of a 6-DOF underwater manipulator in a hydrostatic environment is established by combining the Lagrange equation with the Morison equation.
An improved MPC method is proposed, in which the GPR algorithm is embedded into the trajectory tracking control of the underwater manipulator. GPR is used to predict the water resistance, added mass, buoyancy, and external interference in real time, and to compensate for the unmodeled part of the nominal dynamic model.
Through numerical experiments and comparison with traditional MPC and SMC, the effectiveness and efficiency of the improved MPC algorithm are demonstrated.
The rest of this paper is organized as follows. The dynamic model of the 6-DOF underwater manipulator in the still water environment is established in Section 2. The proposed adaptive model predictive control method combined with Gaussian process regression is introduced in Section 3, and simulation and results analysis are provided in Section 4. Finally, Section 5 concludes this paper.
3. Adaptive Model Predictive Control Using Gaussian Process Regression
In this section, an adaptive MPC using GPR method is proposed and its stability is proven. To compensate for the difference between the nominal model and the actual system, GPR is used to estimate the unmodeled part of the nominal model, and the sliding window method is used to reduce the computational complexity of GPR.
3.1. Gaussian Process Regression Based on Sliding Window
In this paper, GPR is used to estimate the water resistance, added mass, buoyancy, and external disturbances mentioned above in real time. In addition, the sliding window method is adopted to select the training data, which maintains the estimation accuracy while reducing the computational load that GPR incurs on a large training data set.
Assume that there is a training set
to be regressed. The input of the training set is
X and the output is
Y, the input of the test set is
and the output is
, and the function
follows a normal distribution, that is,
∼
. Without loss of generality, assume its mean value
, then:
where:
and according to the conditional Gaussian distribution formula, under the condition
X,
Y,
,
follows the Gaussian distribution
, namely:
According to the formula, the mean value of Gaussian process can be taken as the estimated value of
:
It can be seen from Equation (9) that the prediction result of Gaussian process regression depends entirely on the mean function and covariance function. This paper uses the widely used squared exponential (SE) kernel (Equation (10)) to calculate the covariance matrix,
where,
is the signal variance of the kernel function;
is a hyperparameter symmetric matrix;
is the variance of noise. For convenience, all hyperparameters are described as
.
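For reference, the textbook form of zero-mean GP prediction with an SE kernel is sketched here. The notation is an assumption and may differ from the symbols used in Equations (9) and (10): $x_*$ denotes the test input, $f_*$ the test output, $\sigma_f^2$ the signal variance, $M$ the symmetric hyperparameter matrix, and $\sigma_n^2$ the noise variance.

```latex
% Joint prior over the training outputs Y and the test output f_*
\begin{bmatrix} Y \\ f_* \end{bmatrix}
\sim \mathcal{N}\!\left( 0,\;
\begin{bmatrix}
K(X,X) + \sigma_n^2 I & K(X,x_*) \\
K(x_*,X) & k(x_*,x_*)
\end{bmatrix} \right)

% Conditional (predictive) mean and variance given X, Y, x_*
\bar{f}_* = K(x_*,X)\,\bigl[K(X,X) + \sigma_n^2 I\bigr]^{-1} Y
\qquad
\operatorname{var}(f_*) = k(x_*,x_*) - K(x_*,X)\,\bigl[K(X,X) + \sigma_n^2 I\bigr]^{-1} K(X,x_*)

% Squared exponential kernel
k(x_i,x_j) = \sigma_f^2 \exp\!\left( -\tfrac{1}{2}\,(x_i - x_j)^{\top} M\,(x_i - x_j) \right)
```

The predictive mean is the quantity used as the estimate, which is why the result depends only on the mean function (zero here) and the covariance function.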
After the kernel function is selected, the hyperparameters need to be optimized. A common approach is to transform the maximum a posteriori solution of the hyperparameters into the maximum marginal likelihood solution by Bayes' theorem. First, establish the logarithmic likelihood function of the Gaussian process model [29], as shown in Equation (11):
Then, the steepest descent method is used to maximize the marginal likelihood function and determine the optimal hyperparameters. Here, the partial derivative of L with respect to the hyperparameters is given.
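As a rough illustration of this optimization step, the following sketch evaluates the log marginal likelihood of a zero-mean GP and its gradient, then climbs the gradient. Several choices here are assumptions rather than the paper's settings: a single length scale `ell` replaces the hyperparameter matrix, the hyperparameters are optimized in log space to keep them positive, and a backtracking step size is used so that the likelihood never decreases.

```python
import numpy as np

def _sq_dists(X):
    """Pairwise squared Euclidean distances between the rows of X."""
    return np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)

def log_marginal_likelihood(theta, X, y):
    """log p(y | X, theta) with theta = log([sigma_f, ell, sigma_n])."""
    sigma_f, ell, sigma_n = np.exp(theta)
    n = len(y)
    K = sigma_f**2 * np.exp(-0.5 * _sq_dists(X) / ell**2) + sigma_n**2 * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ alpha - np.sum(np.log(np.diag(L))) - 0.5 * n * np.log(2 * np.pi)

def lml_gradient(theta, X, y):
    """Gradient of the log marginal likelihood with respect to theta."""
    sigma_f, ell, sigma_n = np.exp(theta)
    n = len(y)
    d2 = _sq_dists(X)
    Kf = sigma_f**2 * np.exp(-0.5 * d2 / ell**2)
    K_inv = np.linalg.inv(Kf + sigma_n**2 * np.eye(n))
    alpha = K_inv @ y
    W = np.outer(alpha, alpha) - K_inv
    # Derivatives of K with respect to the log-hyperparameters.
    dK = [2.0 * Kf, Kf * d2 / ell**2, 2.0 * sigma_n**2 * np.eye(n)]
    return np.array([0.5 * np.trace(W @ d) for d in dK])

def fit_hyperparameters(theta0, X, y, lr=0.5, tol=1e-3, max_steps=50):
    """Steepest ascent with backtracking; stops on a small gradient or max_steps."""
    theta = np.array(theta0, dtype=float)
    value = log_marginal_likelihood(theta, X, y)
    for _ in range(max_steps):
        grad = lml_gradient(theta, X, y)
        norm = np.linalg.norm(grad)
        if norm < tol:
            break
        direction = grad / max(1.0, norm)  # cap the step length in log space
        step, improved = lr, False
        while step > 1e-8 and not improved:
            candidate = theta + step * direction
            try:
                cand_value = log_marginal_likelihood(candidate, X, y)
            except np.linalg.LinAlgError:
                cand_value = -np.inf
            if cand_value > value:
                theta, value, improved = candidate, cand_value, True
            step *= 0.5
        if not improved:
            break
    return theta, value
```

A fixed learning factor, as in a plain steepest descent scheme, can replace the backtracking loop; the backtracking variant is used here only because it is harder to destabilize on a small window of data.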
In order to reduce the computational load of GPR for large training data sets, the sliding window method is used to select training data. Select
sampling data before the current time
k as the training set
, update the hyperparameters
in GPR and estimate the output
corresponding to the current input
. The calculation process of GPR based on sliding window is shown in Algorithm 1.
Algorithm 1 GPR based on sliding window (, , )
1: Initialize learning factor , training accuracy , training steps N, maximum steps of training
2: Calculate the partial derivative of the maximum marginal likelihood function L with respect to the hyperparameters
3: while or do
4:
5: Calculate again with the updated hyperparameters
6: end while
7: Update hyperparameters
8: Estimate
9: return ,
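The sliding-window idea can be sketched as a regressor that keeps only the most recent samples and solves the GP system on that small set at every step. This is a minimal sketch under stated assumptions, not the paper's implementation: the class name `SlidingWindowGP` is hypothetical, the hyperparameters are held fixed instead of being re-optimized as in Algorithm 1, and the kernel uses a single length scale.

```python
from collections import deque

import numpy as np

class SlidingWindowGP:
    """Zero-mean GP regressor trained only on the most recent `window` samples."""

    def __init__(self, window=10, sigma_f=1.0, ell=1.0, sigma_n=0.01):
        self.inputs = deque(maxlen=window)   # oldest sample is dropped automatically
        self.targets = deque(maxlen=window)
        self.sigma_f, self.ell, self.sigma_n = sigma_f, ell, sigma_n

    def _kernel(self, A, B):
        d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
        return self.sigma_f**2 * np.exp(-0.5 * d2 / self.ell**2)

    def add(self, x, y):
        """Append one (input, output) pair to the window."""
        self.inputs.append(np.atleast_1d(np.asarray(x, dtype=float)))
        self.targets.append(float(y))

    def predict(self, x):
        """Predictive mean at x; falls back to the prior mean when the window is empty."""
        if not self.inputs:
            return 0.0
        X = np.stack(self.inputs)
        y = np.array(self.targets)
        K = self._kernel(X, X) + self.sigma_n**2 * np.eye(len(y))
        k_star = self._kernel(np.atleast_2d(np.asarray(x, dtype=float)), X)
        return float((k_star @ np.linalg.solve(K, y))[0])

# Usage: stream samples of a slowly varying signal and predict one step ahead.
gp = SlidingWindowGP(window=10)
dt = 0.05
for k in range(40):
    gp.add(k * dt, np.sin(k * dt))
next_estimate = gp.predict(40 * dt)
```

Because the window holds at most 10 samples, each prediction solves a 10 x 10 linear system, so the cost per step stays constant however long the manipulator runs.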
3.2. Adaptive Model Predictive Control Using Gaussian Process Regression
This section presents the adaptive MPC using GPR method, whose structure is shown in Figure 2. First, the terminal MPC controller calculates the output torque of each joint
, with the desired trajectory
and status feedback value
as inputs. Then, the actual manipulator and the nominal model are evaluated at the same time, and their outputs are the state value
and nominal state value
, respectively. Finally, taking
and
as the input of GPR, GPR will predict the external interference value
at this moment, and
will be used to compensate the nominal model at the next moment. At the same time,
and
will be saved into the training set and the hyperparameters will be updated. The specific calculation process is as follows:
The actual system is compensated by the estimated value
of the unknown term and interference term through GPR. Subtracting Equation (3) from Equation (14) gives
, that is, the interference value
at the previous time can be obtained by sampling the actual state
and the nominal state
. These data are added to the training set and used to update the GPR hyperparameters. The input of the GPR training set is the actual state
and nominal state
of the previous n moments, and the output of the training set is the interference value
at the corresponding time. After the GPR is trained, input the actual state
and nominal state
at the current time to predict the interference value
at the next time.
When
, rearranging the above formula gives:
In this way, using the estimated external interference force in the dynamic model of the underwater robotic arm within the MPC controller is equivalent to adding a feedforward correction to the control loop. It compensates for the influence of external interference on each joint motor while preserving the stability of the traditional terminal MPC, and the quadratic programming (QP) optimization in the subsequent MPC calculation keeps the output torque within the constraint range. The controlled system then becomes:
where:
,
Therefore, the problem can be described as a nominal terminal MPC problem, with the finite-horizon cost function given in Equation (15)
P is from the Lyapunov function (
16)
To ensure that
P is solvable,
K is calculated by the infinite horizon linear quadratic regulator (LQR) of cost function
J, shown as (
17)
where:
Then, the first term of the optimal solution sequence of the optimization problem is the closed-loop control quantity, shown as (
18). The calculation process of the adaptive MPC using GPR method is shown in Algorithm 2.
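One common way to obtain a terminal penalty matrix and an infinite-horizon LQR gain is to iterate the discrete Riccati equation; the resulting pair then satisfies a Lyapunov equation for the closed-loop matrix. The sketch below assumes a linear discrete-time model x(k+1) = A x(k) + B u(k) with control u = -K x; the double-integrator `A`, `B` and the weights are hypothetical values chosen only for illustration, not the linearized manipulator model.

```python
import numpy as np

def dlqr(A, B, Q, R, iters=1000):
    """Infinite-horizon discrete LQR by fixed-point iteration of the Riccati equation.

    Returns the gain K (for u = -K x) and the cost matrix P.
    """
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P

# Hypothetical single-joint double integrator sampled at 0.1 s.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt**2], [dt]])
Q = np.eye(2)
R = np.array([[0.1]])

K, P = dlqr(A, B, Q, R)
A_cl = A - B @ K
# Residual of the closed-loop Lyapunov equation A_cl' P A_cl - P = -(Q + K' R K).
lyapunov_residual = A_cl.T @ P @ A_cl - P + Q + K.T @ R @ K
```

With P chosen this way, the terminal cost x' P x equals the cost-to-go of the unconstrained LQR law, which is the property that the stability argument for terminal MPC relies on.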
Here, we can propose:
Theorem 1. If the constrained optimization problem (15) has a solution at the initial time, then it has a solution at any time greater than 0, and the closed-loop system is asymptotically stable.
Proof. Assume that the optimal control sequence with prediction horizon N at time k is:
The corresponding set of optimal state sequences and loss functions are:
Select the closed-loop controller as
Then the state at time
is
Select a control sequence at time
as
Then the corresponding state sequence is:
□
Algorithm 2 The adaptive MPC using GPR method (J,)
1: Initialize hyper-parameters of GPR , the size of the sliding window , terminal constraint , state weight matrix Q, output weight matrix R, output constraint U, state constraint X, prediction time domain N, distance length
2: Linearize and discretize the nominal model
3: while true do
4: Get the current state of the real system
5: Calculate the nominal state value
6: Calculate the difference between the nominal state value and the actual state value
7: Use Algorithm 1 to estimate the interference at the next moment, and update the GPR hyperparameters
8: Update the nominal model
9: Calculate the error between the current state and the desired trajectory
10: Calculate the cost function J to get the optimal control output
11: end while
12: return
The corresponding loss function is then given by Equation (20). Since
Q and
R are both positive definite,
, we can see that
is a Lyapunov function of the closed-loop system that decreases monotonically, so the system is asymptotically stable.
In summary, the sliding-window Gaussian process regression can utilize both actual and nominal state variables during the underwater manipulator’s operation to estimate real-time external disturbances such as fluid forces and unknown resistances that the manipulator will experience in future time steps. The estimated external disturbances are then used to compensate for the effects on joint torques in the output torque of MPC. This enables the underwater manipulator to adaptively adjust and compensate for various fluid-induced forces without manually readjusting parameters in the underwater environment, thereby enhancing the robustness of the controller and improving the adaptability of the underwater manipulator to its surroundings.
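The compensation loop summarized above can be illustrated on a single joint. This is a deliberately simplified sketch under loudly stated assumptions: a PD tracking law stands in for the constrained MPC, the most recent residual between the actual and nominal state stands in for the GPR prediction, and the disturbance model, gains, and reference trajectory are hypothetical.

```python
import numpy as np

def simulate(compensate, dt=0.01, steps=2000, kp=100.0, kd=20.0):
    """Track q_ref(t) = sin(t) with a unit-inertia joint; return the RMS position error."""
    q, v = 0.0, 1.0      # start on the reference: sin(0) and cos(0)
    d_hat = 0.0          # disturbance estimate used for feedforward
    sq_err = 0.0
    for k in range(steps):
        t = k * dt
        q_ref, v_ref, a_ref = np.sin(t), np.cos(t), -np.sin(t)
        sq_err += (q_ref - q) ** 2
        # Nominal-model control law (nominal model: acceleration = u).
        u = a_ref + kp * (q_ref - q) + kd * (v_ref - v)
        if compensate:
            u -= d_hat   # feedforward correction with the estimated disturbance
        # Actual system: unknown drag-like term plus a slow current-like term.
        d = -0.5 * v * abs(v) - 0.8 * np.sin(0.5 * t)
        v_next = v + dt * (u + d)
        q_next = q + dt * v + 0.5 * dt**2 * (u + d)
        # The nominal prediction ignores d, so the residual recovers the disturbance.
        v_nominal = v + dt * u
        d_hat = (v_next - v_nominal) / dt
        q, v = q_next, v_next
    return float(np.sqrt(sq_err / steps))

rms_without = simulate(compensate=False)
rms_with = simulate(compensate=True)
```

The controller gains are identical in both runs; only the feedforward term differs, which mirrors the claim that the manipulator adapts to the fluid forces without retuning the controller underwater.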
4. Simulation Results
In this paper, the trajectory tracking problem of the six-degrees-of-freedom manipulator is used to verify the effectiveness of the adaptive MPC using GPR method. In this section, the effectiveness of GPR for real-time estimation of the hydrodynamic term and time-varying external interference is analyzed. In addition, the trajectory tracking control effect of the adaptive MPC using GPR method under the complete dynamic model of the underwater six-degrees-of-freedom manipulator is given.
The structure diagram of the 6-DOF manipulator is shown in
Figure 3. The first joint of the manipulator will be fixed on a fixed platform. The specific kinematics and dynamics parameters are shown in
Table A1 and
Table A2 in
Appendix A. The manipulator parameters are derived from the UR5, a product manufactured by Universal Robots [30]. The parameter equations of the hydrodynamic model of the underwater manipulator are described in Appendix B. The code runs on a computer with an Intel(R) Core(TM) i7-12700F CPU @ 2.1 GHz and 16 GB of RAM. The simulation is run in MATLAB R2021a, and the differential equations are integrated with ode45, an explicit Runge–Kutta (4,5) solver. The simulation task is trajectory tracking of the manipulator: the desired joint trajectory is given by (21), and the working disturbance is given by (22). Here, trigonometric functions varying within a certain range represent the influence of unknown ocean currents on the joints of the underwater manipulator, and are added to the joint velocity term in the actual state value. Moreover, the influence of measurement noise on the control performance is considered: Gaussian white noise with a mean of 0 and a variance of 0.001 is added to the output state variable at each simulation cycle.
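When reproducing the noise injection, note that the text specifies a variance, while most random number generators take a standard deviation. The sketch below shows one way to add such noise to a state vector; the function name `measure` and the 12-element state layout (six positions and six velocities) are assumptions for illustration.

```python
import numpy as np

NOISE_VARIANCE = 0.001
NOISE_STD = np.sqrt(NOISE_VARIANCE)   # about 0.0316; normal() expects the standard deviation

def measure(state, rng):
    """Return the state corrupted by zero-mean Gaussian white noise."""
    state = np.asarray(state, dtype=float)
    return state + rng.normal(loc=0.0, scale=NOISE_STD, size=state.shape)

rng = np.random.default_rng(seed=0)
true_state = np.zeros(12)             # hypothetical layout: 6 joint positions, 6 joint velocities
noisy_state = measure(true_state, rng)
```

Passing 0.001 directly as `scale` would give a variance of 1e-6, roughly a thousand times less noise power than described.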
The parameters of the adaptive MPC using GPR method are selected as follows. Firstly, the initialization process of GPR is described in
Section 3.1, the initial value of hyperparameters is set as
, the size of the sliding window
is set to 10. Secondly, for the parameters of the MPC cost function, the terminal penalty matrix P, the infinite horizon LQR control gain K, and the terminal region
will be determined according to the steps described in
Section 3.2. Here, the control parameters are selected as
,
,
, and
.
4.1. The Simulation of Gaussian Process Regression
The external interference is considered as the sum of water resistance, added mass, buoyancy and ocean current interference. Sliding-window Gaussian process regression is used to predict the external interference in real time and compensate for the unmodeled part of the nominal dynamic model.
The actual external interference and the predicted external interference through GPR based on sliding window are shown in
Figure 4; it can be seen that external interferences have a great impact on joint torque, especially for the first joint and the second joint, as both exceed 200
at the beginning of the simulation. In addition, the external interference is strongly nonlinear with time, which has a great impact on the prediction performance and control performance.
The external interference force on each joint mainly comes from the water resistance and added mass force caused by the upstream surface of the connecting rod during the movement of the manipulator. The first and second joints are located at the base, and the interference force received by each connecting rod will be transmitted to the motor of these two joints. The vertical force will be superimposed on the first joint, and the horizontal force will be superimposed on the second joint, so the first two joints will receive the maximum external interference force. Similarly, it can be concluded that the joint motor closer to the end of the robotic arm is subjected to less interference force, and the simulation results are also consistent with the actual situation, indicating the correctness of the established dynamic model of the underwater robotic arm.
Nevertheless, the predicted value tracks the real value well; the change of the prediction error over time is shown in Figure 5. The predicted value converges to the actual value within 1 s, and the errors of the six joints remain within 0.1
. After the estimated value converges, the error between the estimated value and the real value fluctuates within a small range, because the gradient descent used for GPR hyperparameter optimization may not reach the optimum in a limited number of steps. This does not materially affect the estimation accuracy, and the estimation error stays within the acceptable range.
After measurement noise is introduced, the actual value and the estimated value of the external interference are shown in Figure 6; the interference estimate jitters within a small range around the true value. However, it can be seen from Figure 7 that the estimation error is still within an acceptable range, so the estimate remains effective for the interference compensation of the controller.
4.2. The Simulation of Adaptive Model Predictive Control Using Gaussian Process Regression
In this section, we take the trajectory tracking of the 6-DOF underwater manipulator as the task, and verify the effectiveness of the proposed control method by displaying its position-tracking curve, speed-tracking curve, and output torque curve. The simulation results show that the proposed control method achieves better control accuracy than the traditional MPC and sliding mode control methods.
In the simulation, the six-degrees-of-freedom underwater manipulator starts with all six joint angles at 0 rad and all six joint angular velocities at 0 rad/s. Each joint of the underwater manipulator starts to track the desired trajectory expressed in Equation (21) at the initial time. The six desired trajectories are linear combinations of trigonometric functions, so their position, speed, and acceleration terms are continuously differentiable. The trajectories are within the capability of the joint motors, and all six joints move at the same time, so the forces between the joints interfere with one another, which thoroughly tests the control performance of the underwater manipulator. In the underwater environment, the fluid resistance and added mass force on a link increase with its speed and acceleration. A trajectory with high joint speed therefore increases the fluid resistance and added mass force on the underwater manipulator, which is a further challenge for the controller designed in this paper.
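To make the dependence on speed and acceleration concrete, the sketch below evaluates the Morison-type force per unit length on a cylindrical link moving through still water. It is an illustration only: the function name and the coefficient values (`c_d`, `c_a`, water density) are assumed for the example and are not the parameters of the paper's hydrodynamic model.

```python
import math

def morison_force_per_length(velocity, acceleration, diameter,
                             rho=1000.0, c_d=1.0, c_a=1.0):
    """Hydrodynamic force per unit length (N/m) opposing the motion of a cylinder.

    The drag term is quadratic in velocity; the added-mass term is linear in acceleration.
    """
    drag = 0.5 * rho * c_d * diameter * velocity * abs(velocity)
    added_mass = rho * c_a * math.pi * diameter**2 / 4.0 * acceleration
    return -(drag + added_mass)

# Doubling the link speed quadruples the drag load on the joint.
slow = morison_force_per_length(velocity=0.5, acceleration=0.0, diameter=0.1)
fast = morison_force_per_length(velocity=1.0, acceleration=0.0, diameter=0.1)
```

The quadratic drag term is the reason fast joint trajectories are disproportionately harder to track underwater than slow ones.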
The desired joint position trajectory and the actual simulated joint position trajectory are shown in Figure 8. It can be seen that the actual value tracks the desired value well. The actual value catches up with the desired value within 1 s, and the error always remains within a small range; the two-norm of the error is shown in Figure 9. The figure also shows the errors of traditional MPC and sliding mode control, and the proposed method is clearly superior to both in terms of convergence rate and steady-state error. The mean square error of each joint is shown in Table 1.
The position control accuracies of joint 1 to joint 6 were increased by 44.85%, 7.68%, 25.87%, 55.97%, 41.16%, and 28.79%, respectively, and the average control accuracy was increased by 34.05%.
The desired joint speed trajectory and the actual simulated joint speed trajectory are shown in Figure 10, and the speed tracking is also good. The joint torque curves are shown in Figure 11. The torque changes smoothly without obvious jitter, and the output torque of each joint is within the constraint range. Comparison with Figure 4 shows that a large part of the output torque goes to overcoming external interference during the movement, which also shows the good robustness of the adaptive MPC using GPR method.
Gaussian white noise with mean 0 and variance 0.001 is introduced into the feedback values of joint position and joint velocity, and the obtained results are presented in Figure 12, Figure 13 and Figure 14.
When measurement noise is considered in the simulation, the actual joint position shown in Figure 12 still tracks the desired trajectory quickly and stably. As shown in Figure 13, the actual joint speed contains a small vibration, but it tracks the desired joint speed and remains stable throughout the tracking process. Meanwhile, the actual output torque shown in Figure 14 is relatively stable, with no large vibration.
Figure 15 shows the trajectory-tracking errors of each joint of the underwater manipulator in the presence of measurement noise. It can be seen that the position control accuracy of the proposed control method is still better than that of the SMC method and the traditional MPC method. The mean square error of each joint is shown in Table 2.
Compared with SMC, the proposed method improves the control accuracy. Compared with MPC, the position control accuracy of joints 1 to 6 is increased by 38.58%, 36.28%, 60.95%, 45.95%, 91.55%, and 40.31%, respectively, and the average control accuracy is increased by 52.27%. Compared with the results presented in Table 1, the average improvement is even larger when measurement noise is taken into account.
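The reported averages can be checked directly from the per-joint percentages quoted in the text. The helper `relative_improvement` shows one plausible definition of the per-joint figure, a relative reduction in mean square error; that definition is an assumption, since the text does not state the formula.

```python
def mean_improvement(percentages):
    """Arithmetic mean of per-joint improvement percentages."""
    return sum(percentages) / len(percentages)

def relative_improvement(baseline_mse, proposed_mse):
    """Assumed definition: percentage reduction in MSE relative to the baseline."""
    return 100.0 * (baseline_mse - proposed_mse) / baseline_mse

# Per-joint values quoted in the text (joints 1 to 6).
without_noise = [44.85, 7.68, 25.87, 55.97, 41.16, 28.79]
with_noise = [38.58, 36.28, 60.95, 45.95, 91.55, 40.31]

avg_without_noise = mean_improvement(without_noise)
avg_with_noise = mean_improvement(with_noise)
```

Both averages agree with the figures stated in the text to two decimal places. The per-joint comparison is mixed, since joints 1 and 4 improve less with noise than without, so the larger gain under noise holds for the average rather than for every joint.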