**Visual Servoing in Robotics**

Editor

**Jorge Pomares**

MDPI • Basel • Beijing • Wuhan • Barcelona • Belgrade • Manchester • Tokyo • Cluj • Tianjin

*Editor* Jorge Pomares Department of Physics, Systems Engineering and Signal Theory, University of Alicante Spain

*Editorial Office* MDPI St. Alban-Anlage 66 4052 Basel, Switzerland

This is a reprint of articles from the Special Issue published online in the open access journal *Electronics* (ISSN 2079-9292) (available at: https://www.mdpi.com/journal/electronics/special issues/Visual Servoing).

For citation purposes, cite each article independently as indicated on the article page online and as indicated below:

LastName, A.A.; LastName, B.B.; LastName, C.C. Article Title. *Journal Name* **Year**, *Volume Number*, Page Range.

**ISBN 978-3-0365-0344-8 (Hbk) ISBN 978-3-0365-0345-5 (PDF)**

Cover image courtesy of Jorge Pomares.

© 2021 by the authors. Articles in this book are Open Access and distributed under the Creative Commons Attribution (CC BY) license, which allows users to download, copy and build upon published articles, as long as the author and publisher are properly credited, which ensures maximum dissemination and a wider impact of our publications.

The book as a whole is distributed by MDPI under the terms and conditions of the Creative Commons license CC BY-NC-ND.

## **Contents**


## **About the Editor**

**Jorge Pomares** (Prof. Dr.) obtained his degree in Computer Engineering and his PhD at the University of Alicante. He has been a member of the Department of Physics, Systems Engineering and Signal Theory at the University of Alicante since 2003. Since December 2017, he has been Full Professor in the aforementioned department in the area of systems engineering and automatics, being founder of the Human Robotics research group. His research career has focused on the field of robotics, visual servoing for guiding robots, space robotics, and robot control and manipulation. Within these fields, Prof. Dr. Pomares has participated in more than 15 research projects. His projects have covered robot guidance by vision, visual servoing, and the control of assistive robotics. These projects have allowed not only the development of robot guidance strategies but also the design of new approaches for the control of robotic systems. In addition, he has worked on the design of new strategies for guiding robots with different characteristics: robot manipulators, robotic hands, mobile robots, mobile manipulators, space robots, exoskeletons, and so on. Within this field, he has authored more than 40 JCR articles and presented over 100 contributions at national and international conferences. Throughout his research career, Prof. Dr. Pomares has collaborated with more than ten national and international research groups in the field of robotics, robot control systems, and space robotics. Most notably, regarding collaborations with foreign research groups, during the last few years, he has been collaborating with the Institute of Industrial Systems of Greece, the University of Lulea, and the Centre for Autonomous and Cyberphysical Systems at the University of Cranfield. These collaborations have been focused on the development of new methods of nonlinear control and filtering for robotic applications. This research aims to solve control, estimation, and filtering problems in advanced models of robotic manipulators and vehicles. The methods which have been developed are generic and applicable to a wide range of robotic systems. These control methods apply to to the following types of robots: multi-DOF rigid-link robots, manipulators subject to input/output delays, underactuated robots and redundant manipulators, closed-chain robotic systems, exoskeletons, flexible-link robots, and space robotics. Additionally, Prof. Dr. Pomares is collaborating with the Centre for Autonomous and Cyberphysical Systems at the University of Cranfield, within the research field of space robotics.

## **Preface to "Visual Servoing in Robotics"**

Today, computer vision systems are integrated into a wide range of industrial applications as a means of detecting defects in production lines, of object/pattern recognition, and as a surface scanning method, among other uses. Within these industrial applications, it is important to highlight visual servoing systems for the guidance of robotic manipulators. The main goal of visual servoing systems is the use of computer vision as a method for the guidance of robots. These control systems are fed with the information acquired by one or several cameras; this controller then determines the control actions to move the manipulator in order to carry out its task. Visual servoing approaches can be applied for the guidance of not only robot manipulators but also different kinds of robots, such as mobile robots, aerial robots, or parallel robots. This book includes different areas of research where visual servoing approaches are applied for the guidance of mobile manipulators, parallel robots, or even teleoperated robots.

As is evident throughout the book, visual servoing systems are widespread and extensively applied, not only in research laboratories but also in a wide range of additional applications, from industrial robotics to service robotics. This book also includes papers that describe new and interesting applications of visual servoing systems. One notable application is the use of visual servoing in apple-picking robots. The image-based visual servo control method is adopted to control the manipulator in order to improve the robot's grasping accuracy during the picking process. Additionally, a spatial trajectory optimization method for a spray-painting robot is presented in this book.

One factor which has favored the growth of visual servoing systems, their diffusion, and the extension of their application is the increase in the capture and processing capabilities of today's cameras (such as the RGBD cameras presented in this book), as well as the equipment used for processing data. This has allowed for the computation of vast amounts of information in less time, avoiding possible delays and helping to guide the robot in a smooth manner. Despite the efforts of the last decade dedicated to improving different aspects of the development of these visual servoing systems, even nowadays, there is considerable ongoing research to investigate means of increasing the robustness of such systems. Consequently, this book includes an image feature reconstruction algorithm based on the Kalman filter to handle feature loss during tracking.

> **Jorge Pomares** *Editor*

## *Editorial* **Visual Servoing in Robotics**

#### **Jorge Pomares**

Department of Physics, Systems Engineering and Signal Theory, University of Alicante, 03690 Alicante, Spain; jpomares@ua.es

Received: 4 October 2019; Accepted: 1 November 2019; Published: 6 November 2019

#### -- **-**

#### **1. Introduction**

Visual servoing is a well-known approach to guide robots using visual information. Image processing, robotics and control theory are combined in order to control the motion of a robot depending on the visual information extracted from the images captures by one or several cameras. With respect to vision issues, different problems are currently under research such as the use of different kinds of image features (or different kinds of cameras), image processing at high velocity, convergence properties, etc. Furthermore, the use of new control schemes allows the system to behave more robustly, efficiently, or compliantly with less delays. Related issues such as optimal and robust approaches, direct control, path tracking or sensor fusion allows the application of the visual servoing systems in different domains.

In so-called image-based visual servoing systems, the control law is calculated using directly visual information. These last systems do not need a complete 3D reconstruction of the environment. For tasks that require high precision, speed or response, several works suggest that it may be beneficial to take into account the dynamics of the robot when designing visual servoing control laws. The type of visual control systems that consider the dynamics of the robot in the control law are often referred to as direct or dynamic visual control systems. However, for simplicity, indirect visual servoing schemes are mostly used in the literature.

Nowadays, the application fields of the visual servoing systems are very wide, and include research and application fields such as navigation and localization of mobile robots, guidance of humanoid robots, robust and optimal control of robots, manipulation, intelligent transportation, deep learning and machine learning in visual servoing and visual guidance of field robotics (aerial robots, assistive Robots, medical robots, etc.).

#### **2. The Present Issue**

This special issue consists of eight papers covering important topics in the field of visual servoing. In [1], an enhanced switch image-based visual servoing system for a 6 degrees of freedom industrial robot is proposed. An image feature reconstruction algorithm based on the Kalman filter is presented to handle feature loss during the tracking. Visual servoing approaches can be applied for the guidance of different kinds of robots such as mobile robots, aerial robots or parallel robots. The latter is the case described in [2], where an optical coordinate measuring machine is employed to stabilize parallel robots. In this case, the dynamic model parameters are identified by using a non-linear optimisation technique. In [3], a direct image-based visual servoing system is used for the guidance of a mobile manipulator. This approach considers not only kinematic properties of the robot, but also dynamic ones for guiding both the robot base and the manipulator arm. An optimal control approach is used in this paper.

This special issue also includes papers that describe new and interesting applications of the visual servoing systems such as the ones described in [4] or [5]. In [4], a visual servoing system is applied to an apple-picking robot. The image-based visual servo control method is adopted to control the manipulator in order to improve the grasping accuracy in the picking process. The joint control performance of the control system has been improved by the proposed adaptive fuzzy neural network sliding-mode control algorithm. Additionally, in [5], a spatial trajectory optimization method of a spray-painting robot is proposed.

Additionally, visual servoing approaches are very related with the necessity of estimating parameters or variables used in the control process. For example, in [6], a visual-based method to estimate robot orientation with RGB-D cameras is proposed. In [7], the reaction force of the end effector and second link of a three-degree of freedom hydraulic servo system with master–slave manipulators sliding mode control is determined with a sliding perturbation observer. Also, bilateral control is used to estimate the reaction force of the master device which is provided to the operator to handle the master device. Finally, in [8] a novel measurement system to visualize the motion-to-photon latency with time-series data in real time is proposed.

#### **3. Future**

While visual servoing systems have been an important field of research in the last years, several major challenges still remain. Tasks such as tracking, positioning, detection, segmentation, and localization play a critical role in visual servoing and different research is currently ongoing to increase the robustness of visual controllers. Additionally, new control approaches such as optimal control, robust control, dynamic control or predictive control will provide this kind of system with new dynamic properties. Furthermore, new computer vision systems, electronics, computers, etc., will offer new and interesting capabilities for the application of visual servoing in new kinds of robotics systems such as autonomous driving cars, humanoid robots, aerial robots, service robotics, UAV, parallel robots, space robotics, etc.

**Acknowledgments:** First of all, we would like to thank all researchers who submitted articles to this special issue for their excellent contributions. We are also grateful to all reviewers who helped in the evaluation of the manuscripts and made very valuable suggestions to improve the quality of contributions. We would like to acknowledge the editorial board of Electronics, who invited us to guest edit this special issue. We are also grateful to the Electronics Editorial Office staff who worked thoroughly to maintain the rigorous peer-review schedule and timely publication.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **References**


© 2019 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Enhanced Switch Image-Based Visual Servoing Dealing with Features Loss**

#### **Ahmad Ghasemi 1, Pengcheng Li 1, Wen-Fang Xie 1,\* and Wei Tian <sup>2</sup>**


Received: 18 July 2019; Accepted: 9 August 2019; Published: 15 August 2019

**Abstract:** In this paper, an enhanced switch image-based visual servoing controller for a six-degreeof-freedom (DOF) robot with a monocular eye-in-hand camera configuration is presented. The switch control algorithm separates the rotating and translational camera motions and divides the image-based visual servoing (IBVS) control into three distinct stages with different gains. In the proposed method, an image feature reconstruction algorithm based on the Kalman filter is proposed to handle the situation where the image features go outside the camera's field of view (FOV). The combination of the switch controller and the feature reconstruction algorithm improves the system response speed and tracking performance of IBVS, while ensuring the success of servoing in the case of the feature loss. Extensive simulation and experimental tests are carried out on a 6-DOF robot to verify the effectiveness of the proposed method.

**Keywords:** image-based visual servoing; image feature loss; industrial robots; switch control

#### **1. Introduction**

Visual servoing has been employed to increase the deftness and intelligence of industrial robots, especially in unstructured environments [1–4]. Based on how the image data are used to control the robot, visual servoing is classified into two categories: position-based visual servoing (PBVS) and image-based visual servoing (IBVS). A comprehensive analysis of the advantages and drawbacks of the aforementioned methods can be found in [5]. This paper focuses on addressing some issues in IBVS.

Many studies have been conducted to overcome the weaknesses of IBVS and improve its efficiency [6–9]. However, the performance of most reported IBVS is not sufficiently high to meet the requirements of industrial applications [10]. An efficient IBVS feasible for practical robotic operations requires a fast response with strong robustness to feature loss. One obvious way to increase the speed of IBVS is to increase the gain values in the control law. However, there is a limitation on the application of this strategy because the high gain in the IBVS controller tends to create shakiness and instability in the robotic system [11]. Moreover, the stability of the traditional IBVS system is proven only in an area around the desired position [5,12]. Furthermore, when the initial feature configuration is distant from the desired one, the converging time is long, and possible image singularities may lead to IBVS failure. To address this issue, a switching scheme is proposed to switch the control signal between low-level visual servo controllers, i.e., homography-based controller [13] and affine-approximation controller [14]. In our previous work [15,16], the idea of switch control in IBVS was proposed to switch the controller between end-effector's rotating and translational movements. Although it has been demonstrated that the switch control can improve the speed and tracking performance of IBVS and avoid some of its inherent drawbacks, feature loss caused by the camera's limited FOV still prevents the method from being fully efficient and being applicable to real industrial robots.

The visual features contain much information such as the robots' pose information, the tasks' states, the influence of the environment, the disturbance to the robots, etc. The features are directly related to the motion screw of the end-effector of the robot. The completeness of the feature set during visual servoing is key to fulfilling the task successfully. Many features have been used in visual servoing such as feature points, image moments, lines, etc. The feature points are known for the ease of image processing and extraction. It is shown that at least three image points are needed for controlling a 6-DOF robot [17]. Hence, four image points are usually used for visual servoing. However, the feature points tend to leave the FOV during the process of visual servoing. A strategy is needed to handle the situation where the features are lost.

There are two main approaches to handle feature loss and/or occlusion caused by the limited FOV of the camera [18]. In the first approach, the controller is designed to avoid occlusion or feature loss, while in the second one, the controller is designed to handle the feature loss.

In the first approach, several techniques have been developed to avoid the feature loss or occlusion. In [19], occlusion avoidance was considered as the second task besides the primary visual servoing task. In [20], a reactive unified convex optimization-based controller was designed to avoid occlusion during teleoperation of a dual-arm robot. Some studies have been carried out in visual trajectory planning considering feature loss avoidance [21–23]. Model predictive control methods have been adopted in visual servoing to prevent feature loss due to its ability to deal with constraints [24–28]. In [29], predictive control was employed to handle visibility, workspace, and actuator constraints. Despite the success of the studies on preventing feature loss, they suffered from the limited maneuvering workspace of the robot, due to the conservative design required to satisfy many constraints.

In the second approach, the controller tries to handle the feature loss instead of avoiding it. When the loss or occlusion of features occurs, if the remaining visible features are sufficient to generate the non-singular inverse of the image Jacobian matrix, the visual servoing task can still be carried out successfully. In this situation, the rank of the relative Jacobian matrix must be the same as the degrees of freedom [30]. However, this method is no longer effective when the number of remaining visible features become too small to guarantee the full-rankness of the image Jacobian matrix. As studied in [31], another solution is to foresee the position of the lost features and to continue the control process using the predicted features until they become visible again. This method allows partial or complete loss or occlusion of the features. In the second approach [30,31], the classical IBVS control is employed as the control method, which does not usually provide a fast response.

In this paper, an enhanced switch image-based visual servoing (ESIBVS) method is presented in which a Kalman filter-based feature prediction algorithm is proposed and is combined with our previous work [15,16] to make the switch IBVS control robust in reaction to feature loss. The feature prediction algorithm can predict the lost feature points based on the previously-estimated points. The switch control with the improved tracking performance along with the robustness to feature loss makes it more feasible for industrial robotic applications. To validate the proposed controller, extensive simulations and experiments have been conducted on a 6-DOF Denso robot with a monocular eye-in-hand vision system.

The structure of the paper is given as follows. Section 2 gives a description of the problem. In Section 3, the feature reconstruction algorithm is presented. In Section 4, the controller design algorithm is developed. In Section 5, the simulation results are given. Experimental results are presented in Section 6, and finally, the concluding remarks are given in Section 7.

#### **2. Problem Statement**

In IBVS, the object with (*X*,*Y*, *Z*) coordinates with respect to camera has the projected image coordinates (*x*, *y*) in the camera image (Figure 1). The feature's positions and the desired ones for the *n*th feature in the image plane can be denoted by:

$$\mathbf{s}\_{\rm ll} = \begin{bmatrix} \mathbf{x}\_{\rm ll} \ y\_{\rm n} \end{bmatrix}^{T}, \qquad \mathbf{s}\_{\rm dn} = \begin{bmatrix} \mathbf{x}\_{\rm dn} \ y\_{\rm dn} \end{bmatrix}^{T} \tag{1}$$

Thus, the vector of *s* and *sd* is defined as:

$$\mathbf{s} = \begin{bmatrix} \mathbf{s}\_1 \\ \vdots \\ \mathbf{s}\_n \end{bmatrix} = \begin{bmatrix} \mathbf{x}\_1 \\ \mathbf{y}\_1 \\ \vdots \\ \mathbf{x}\_n \\ \mathbf{x}\_n \\ \mathbf{y}\_n \end{bmatrix}, \quad \mathbf{s}\_d = \begin{bmatrix} \mathbf{s}\_{d1} \\ \vdots \\ \mathbf{s}\_{dn} \end{bmatrix} = \begin{bmatrix} \mathbf{x}\_{d1} \\ \mathbf{y}\_{d1} \\ \vdots \\ \mathbf{x}\_{dn} \\ \mathbf{y}\_{dn} \end{bmatrix}. \tag{2}$$

The goal of the IBVS task is to generate camera velocity commands such that the actual features and the desired ones are matched in the image plane. The velocity of the camera is defined as *Vc*(*t*). The camera and image feature velocities are related by:

$$
\dot{s} = f\_{\text{imgl}} V\_{\text{c}} \tag{3}
$$

where,

$$J\_{\rm imp} = \begin{bmatrix} J\_{\rm imp}(s\_1, Z\_1) \\ \vdots \\ J\_{\rm imp}(s\_n, Z\_n) \end{bmatrix} \tag{4}$$

which is called the image Jacobian matrix and *Z*1, ... , *Zn* are the depths of the features *s*1, ...,*sn*. In this study, the system configuration is set as eye-in-hand, and the number of features is *n* = 4. Furthermore, it is assumed that all the features share the same depth *Z*. Considering these assumptions, the image Jacobian matrix for the *n*th feature is given in [17]:

$$f\_{\rm{img}}(s\_n) = \begin{bmatrix} \frac{f}{Z} & 0 & -\frac{\mathbf{x}\_n}{Z} & -\frac{\mathbf{x}\_n y\_n}{f} & \frac{f^2 + \mathbf{x}\_n^2}{f} & -y\_n \\ 0 & \frac{f}{Z} & -\frac{y\_n}{Z} & \frac{-f^2 - y\_n^2}{f} & \frac{\mathbf{x}\_n y\_n}{f} & \mathbf{x}\_n \end{bmatrix} \tag{5}$$

where *f* is the focal length of the camera.

The velocity of the camera can be calculated by manipulating (3):

$$\mathbf{V\_{c}} = \boldsymbol{I}\_{\mathrm{imp}}^{+} \boldsymbol{s}\_{\star} \tag{6}$$

where <sup>+</sup> *Jimg* is the pseudo-inverse of the image Jacobian matrix. The error signal is defined as *e* = *s* − *sd*. If we let *e*˙ = −*Kae*, the traditional IBVS control law may be designed as:

$$V\_c = -K\_a f\_{img}^+ \varepsilon\_r \tag{7}$$

where *Ka* is the proportional gain.

**Figure 1.** Schematic of the camera model.

While guiding the robot end-effector to make the desired image features match the actual ones, some unexpected situations may occur in IBVS. The first case is feature loss: i.e., some or all of the image features may go beyond the camera's FOV (Figure 2). The second case is feature occlusion: i.e., some or all of the image features temporarily become invisible to the camera due to obstacles. The goal of this paper is to improve the performance of IBVS in terms of response time and tracking performance, while dealing with the feature loss situation. To reach this goal, the performance of the switch method in our previous work [15,16] is enhanced when it is combined with the proposed feature reconstruction algorithm.

**Figure 2.** Desired and initial feature positions inside and outside the camera's field of view.

#### **3. Feature Reconstruction Algorithm**

The velocity of the camera *Vc* <sup>∈</sup> <sup>R</sup>(6×1) can be divided into the translational velocity *<sup>V</sup>* <sup>∈</sup> <sup>R</sup>(3×1) and rotating velocity *<sup>ω</sup>* <sup>∈</sup> <sup>R</sup>(3×1). Therefore, it can be expressed as:

$$V\_{\mathbb{C}} = \begin{bmatrix} V \\ \omega \end{bmatrix} = \begin{bmatrix} V\_x \\ V\_y \\ V\_z \\ \omega\_x \\ \omega\_y \\ \omega\_z \end{bmatrix},\tag{8}$$

Furthermore, for the *n*th feature (*n* = 1, 2, .., 4), the image Jacobian matrix in (5) can be divided into the translational part *Jt*(*sn*) and the rotating part *Jr*(*sn*):

$$J\_{\rm limg}(\mathbf{s}\_{\rm u}) = \begin{bmatrix} J\_{\rm l}(\mathbf{s}\_{\rm u}) & J\_{\rm l}(\mathbf{s}\_{\rm u}) \end{bmatrix}\_{\rm \prime} \tag{9}$$

where,

$$J\_t(s\_n) = \begin{bmatrix} \frac{f}{Z} & 0 & -\frac{\chi\_n}{Z} \\ 0 & \frac{f}{Z} & -\frac{\chi\_n}{Z} \end{bmatrix} \tag{10}$$

and:

$$f\_{\mathbf{f}}(\mathbf{s}\_{\mathbf{n}}) = \begin{bmatrix} -\frac{\chi\_{\mathbf{n}} y\_{\mathbf{n}}}{f} & \frac{f^2 + \mathbf{x}\_{\mathbf{n}}^2}{f} & -y\_{\mathbf{n}} \\ \frac{-f^2 - y\_{\mathbf{n}}^2}{f} & \frac{\chi\_{\mathbf{n}} y\_{\mathbf{n}}}{f} & \mathbf{x}\_{\mathbf{n}} \end{bmatrix}' \tag{11}$$

where *xn* and *yn* are the feature coordinates in the image space.

In the design of the switch controller, the movement of the camera during the control task is divided into three different stages [15,16]. In the first stage, the camera has only pure rotation. In the second stage, the camera has only translational movement. Finally, in the third stage, both camera rotation and translation are used to carry out the fine-tuning.

Considering (3), (8), (10) and (11), the feature velocity in the image plane can be expressed as: In the pure translational stage (first stage):

$$\begin{cases} \dot{X}\_{\text{il}} = \frac{f}{Z} V\_{\text{x}} - \frac{\chi\_{\text{n}}}{Z} V\_{\text{z}} \\\\ \dot{y}\_{\text{n}} = \frac{f}{Z} V\_{\text{y}} - \frac{y\_{\text{n}}}{Z} V\_{\text{z}} \end{cases} \tag{12}$$

In the pure rotating stage (second stage):

$$\begin{cases} \dot{\mathbf{x}}\_{\rm{il}} = -\frac{\mathbf{x}\_{\rm{n}y\_{\rm{n}}}}{f} \boldsymbol{\omega}\_{\rm{x}} + \frac{f^{2} + \mathbf{x}\_{\rm{n}}^{2}}{f} \boldsymbol{\omega}\_{\rm{y}} - y\_{\rm{n}} \boldsymbol{\omega}\_{\rm{z}}\\ \dot{\mathbf{y}}\_{\rm{n}} = -\frac{f^{2} + \mathbf{y}\_{\rm{n}}^{2}}{f} \boldsymbol{\omega}\_{\rm{x}} + \frac{\mathbf{x}\_{\rm{n}} y\_{\rm{n}}}{f} \boldsymbol{\omega}\_{\rm{y}} + \mathbf{x}\_{\rm{n}} \boldsymbol{\omega}\_{\rm{z}} \end{cases} \tag{13}$$

and in the fine-tuning stage (third stage):

$$\begin{cases} \dot{\mathbf{x}}\_{\text{ll}} = \frac{f}{Z}V\_{\text{x}} - \frac{\mathbf{x}\_{\text{n}}}{Z}V\_{z} - \frac{\mathbf{x}\_{\text{n}}y\_{\text{n}}(t\_{0})}{f}\omega\_{\text{x}} + \frac{f^{2} + \mathbf{x}\_{\text{n}}^{2}}{f}\omega\_{\text{y}} - y\_{\text{n}}\omega\_{\text{z}} \\\\ \dot{y}\_{\text{n}} = \frac{f}{Z}V\_{\text{y}} - \frac{y\_{\text{n}}}{Z}V\_{z} - \frac{f^{2} + y\_{\text{n}}^{2}}{f}\omega\_{\text{x}} + \frac{\mathbf{x}\_{\text{n}}y\_{\text{n}}}{f}\omega\_{\text{y}} + \mathbf{x}\_{\text{n}}\omega\_{\text{z}} \end{cases} \tag{14}$$

To remove the noise in the image processing and feature extraction, a feature state estimator is designed based on the Kalman filter algorithm.

In the formulations below, *k* denotes the current time instant and *k* + 1 the next time instant, while *Ts* represents the sampling time. The estimated states are denoted by ˆ notation. Considering four features, the feature state at the current instant (*k*th sample) is defined as:

$$X(k) = [\mathbf{x}\_1(k), \mathbf{y}\_1(k), \dots \mathbf{x}\_4(k), \mathbf{y}\_4(k), \dot{\mathbf{x}}\_1(k), \dot{\mathbf{y}}\_1(k), \dots, \dot{\mathbf{x}}\_4(k), \dot{\mathbf{y}}\_4(k)]^T,\tag{15}$$

or with consideration of (2):

$$X(k) = [s(k), \quad \quad \dot{s}(k)],\tag{16}$$

where the elements of the vector can be obtained from (12), (13), or (14). Furthermore, the measurement vector represents the vector of the image feature points' coordinates extracted from the images of the camera:

$$M(k) = [\mathbf{x}\_{m1}(k), \mathbf{y}\_{m1}(k), \dots \\ \mathbf{x}\_{m4}(k), \mathbf{y}\_{m4}(k), \dot{\mathbf{x}}\_{m1}(k), \dot{\mathbf{y}}\_{m1}(k), \dots, \dot{\mathbf{x}}\_{m4}(k), \dot{\mathbf{y}}\_{m4}(k)]^T \tag{17}$$

First, the prediction equations are:

$$\begin{aligned} \hat{X}(k|k-1) &= A\hat{X}(k-1|k-1) \\ P(k|k-1) &= AP(k-1|k-1)A^T + Q(k-1), \end{aligned} \tag{18}$$

where *A* is a 16 × 16 matrix whose diagonal elements equal one, *Ai*,*i*+8(*i* = 1, 2..., 8) are equal to sampling time *Ts*, and the rest of the elements are zero, *P*(*k*|*k* − 1) represents the current prediction of the error covariance matrix, which gives a measure of the state estimate accuracy, while *P*(*k* − 1|*k* − 1) is the previous error covariance matrix, and *Q*(*k* − 1) represents the process noise covariance computed using the information of the time instant (*k* − 1).

Second, the Kalman filter gain *D*(*K*) is:

$$D(k) = P(k|k-1)(P(k|k-1) + R(k-1)^{-1})\tag{19}$$

where *R*(*k* − 1) is the previous measurement covariance matrix.

Third, the estimation update is given as follows:

$$\begin{aligned} \hat{X}(k|k) &= \hat{X}(k|k-1) + D(k)(M(k) - \hat{X}(k|k-1) \\ P(k|k) &= P(K|k-1) - D(k)P(k|k-1), \end{aligned} \tag{20}$$

When the features are out of the FOV of the camera (i.e. *xmj*(*k*) = 0, *ymj*(*k*) = 0, *j* = 1, 2...4), the feature reconstruction algorithm is proposed to provide the updated estimation vector under this circumstance. Since the features are out of FOV, the measurement vector will have some elements with zero values. This measurement vector will not lead to a satisfactory performance of switch IBVS. In order to improve the performance, instead of having zero values of the elements of *M*(*k*) in (17), it is reasonable to assume that the *n*th feature that goes outside of FOV keeps its velocity at the moment (*t*0) of leaving (*s*˙*n*(*t*0)) during the period of feature loss. Hence, its position (i.e., point coordinates *sn*(*t*0)=[*xmn*(*t*0), *ymn*(*t*0)]) can be generated by integrating the velocity over the time. This means that the elements of *M*(*k*) can be represented by this formulation:

$$M(k) = [(K\_{\rm ad} \sum\_{l=0}^{b} \dot{s}\_{n}(t\_{0})T\_{\rm s} + s\_{n}(t\_{0})), \qquad \dot{s}\_{n}(t\_{0})], \tag{21}$$

where (*l* = 0, 1, 2, .., *b*) represents the number of time samples during the feature loss period, *Ts* is the sampling period, and *kad* is an adjusting coefficient. Once the feature is visible to the camera again, the actual value of *M*(*k*) provided by the camera is used to replace the state estimation (21).

#### **4. Controller Design**

The IBVS controller was designed using the switch scheme. This method can set distinct gain values for the stages of the control law to achieve a fast response system while preserving the system stability.

In order to design the switch controller, the movement of the camera during the control task was divided into three different stages [15,16]. A criterion was needed for the switch condition between stages. In [15], the norm of feature errors was defined as the switching criterion. In this paper, a more intuitive and effective criterion is used [16]. As is shown in Figure 3, the switch angle criterion *α* is introduced as the angle between actual features and the desired ones. As soon as the angle *α* meets the predefined value, the controller law switches to the next stage.

**Figure 3.** Switch angle criteria *α*: the angle between the desired and actual features.

Based on this criterion, the switching control law is presented as follows:

$$\begin{cases} V\_{\rm cs1} = -K\_1 f\_r^+ \mathcal{e}(\mathbf{s}), & |\boldsymbol{\alpha}| \ge \boldsymbol{\alpha}\_0 \\ V\_{\rm cs2} = -K\_2 f\_t^+ \mathcal{e}(\mathbf{s}), & \boldsymbol{\alpha}\_1 \le |\boldsymbol{\alpha}| < \boldsymbol{\alpha}\_0 \\ V\_{\rm cs3} = -K\_3 f\_{\rm inv}^+ \mathcal{e}(\mathbf{s}), & \text{otherwise} \end{cases} \tag{22}$$

where *Vcsi* (*i* = 1, 2, 3) is the velocity of the camera in the *i* th stage, *Ki* is the symmetric positive definite gain matrix at each stage, and *α*<sup>0</sup> and *α*<sup>1</sup> are two predefined thresholds for the control law to switch to the next stage. The block diagram of the proposed algorithm is shown in Figure 4. Furthermore, the flowchart of the whole process of feature reconstruction and control is illustrated in Figure 5.

It was expected that in comparison with switch IBVS, the proposed method would ensure the smooth transition of the visual servoing task in the case of the feature loss and provide a better convergence performance.

**Figure 4.** Block diagram of the proposed enhanced switch image-based visual servoing (IBVS) controller.

**Figure 5.** Flowchart of the Kalman filter feature reconstruction and control algorithm.

#### **5. Simulation Results**

To evaluate the performance of the proposed method, simulation tests were carried out by using MATLAB/SIMULINK software with the Vision and Robotic Toolbox. A 6-DOF DENSO robot with a camera installed in eye-in-hand configuration was simulated. The coordinates of the initial and desired features in the image space are given in Table 1. The camera parameters are as shown in Table 2.

**Table 1.** Test 1: simulation. Initial (I) and desired (D) feature point positions in pixels.

**Point 1 Point 2 Point 3 Point 4 (x, y) (x, y) (x, y) (x, y)** Test 1 I 376 757 202 621 20 814 218 969 D 612 312 612 512 812 512 812 312


**Table 2.** Camera parameters.

The task was to guide the end-effector to match the actual features with the desired ones in the camera image space. To simulate the condition where the features go outside FOV of the camera in real applications, the FOV of the camera was defined as the limited area shown in Figure 6a,b. When the features were in the defined FOV, they had actual position coordinates, and when they went outside FOV, the position coordinates of the features were set to zero. In this case, the proposed feature reconstruction algorithm was activated, and an estimate of the feature positions was generated. The norm of feature errors (NFE) is defined as below,

$$NFE = \sum\_{n=1}^{4} \sqrt{(\mathbf{x}\_n - \mathbf{x}\_{dn})^2 + (y\_n - y\_{dn})^2},\tag{23}$$

where *xn* and *yn* are the *n*th feature coordinates and *xdn* and *ydn* are the *n*th desired feature coordinates in the image plane.

In the simulation test, we set the initial feature coordinates and the desired ones in a way that the image features were out of FOV. Figures 6 and 7 demonstrate the performance comparison of the two methods. The paths of image features in the image space are given in Figure 6a,b. Figure 7a,b shows how the feature errors change with time in the proposed ESIBVSand switch method. Figure 7c,d demonstrates the norm of the feature errors' change with time in both methods. As shown in the figures, ESIBVS was able to reduce the norm of the errors to the preset threshold, while in the switch method, the norm of the errors did not converge. The summary of the simulation test is shown in Table 3. The results demonstrate how the proposed method was able to handle the situation in which the features went outside of the camera's FOV and completed the task successfully, while the switch method was unable to do so.

**Table 3.** Test 1: Comparison of simulation results between ESIBVSand switch IBVS.

**Figure 6.** Test 1: simulation. Image space feature trajectory comparison of enhanced switch IBVS and switch IBVS. (**a**) Image space feature trajectory in enhanced switch IBVS; (**b**) image space feature trajectory in switch IBVS.

**Figure 7.** *Cont.*

**Figure 7.** Test 1: simulation. Performance comparison of enhanced switch IBVS vs. switch IBVS. (**a**) Feature errors in enhanced switch IBVS; (**b**) feature errors in switch IBVS; (**c**) norm of feature errors in enhanced switch IBVS; (**d**) norm of feature errors in switch IBVS.

#### **6. Experimental Results**

In this section, to further verify the effectiveness of the proposed method, some experiments were carried out, and the results are presented. The experimental testbed included a 6-DOF DENSOrobot with a camera (specifications shown in Table 2) installed on its end-effector (Figure 8a). The camera model was a Logitech Webcam HD 720p, which captures the video with a resolution of 1280 × 720 pixels.

Two computers were used for the experimental tests. One computer carried out the image processing (PC2 in Figure 9) and sent the extracted feature coordinates to the other computer (PC1 in Figure 9), where the control algorithm was executed. Then, the control command (velocity of the end-effector) was sent to the robot controller. The image data taken by the camera were sent to an image processing program written by using the Computer Vision Toolbox of MATLAB. This program extracted the center coordinates of the features and sent them as feedback signals to the visual servoing controller in the sampling period of 0.001 s. Four feature points were used in the control task. The detailed information of the image processing and feature extraction algorithm can be seen in our previous work [32]. The goal was to control the end-effector so that the actual features matched the desired ones (Figure 8b).

To evaluate the efficiency of ESIBVS, its performance was compared to that of the switch IBVS method. In all the tests, the threshold value of NFE was set to 0.005 (equivalent to four pixels). When NFE reached this value, the robot stopped, and the servoing task was fulfilled. The initial angle *α* between the actual and desired features (Figure 3) was 50◦. *K*1, *K*2, and *K*<sup>3</sup> in (22) were set to 1, 0.4, and 0.3, respectively.

Test 2: In this test (The video can be found in the Supplementary Materials), the initial and desired features were set such that they went outside of the FOV of the camera during the test. The initial and desired feature coordinates in the test are given in Table 4. Figure 10 demonstrates the movement of actual features during the test of ESIBVS. It illustrates how the features went outside of FOV, then were reconstructed, went back to FOV, and finally matched the desired features.

**Figure 8.** (**a**) Experimental testbed, 6-DOF DENSO robot. (**b**) Actual and desired image features.

**Figure 9.** Structure of the experimental testbed.

**Figure 10.** Test 2: Snap shots of the camera image during the enhanced switch IBVS test: (**a**) Desired and actual feature positions at the start. (**b**) Actual features are out of FOV. (**c**–**e**) Features are reconstructed and returned to FOV. (**f**) Final match of the desired and actual features.


Figures 11–13 show the comparison results between ESIBVS and switch IBVS. Figure 11 shows the paths of features in the image space from the initial positions to the desired ones, as well as the camera trajectory in Cartesian space. In the proposed method, the actual and desired features matched, while in switch IBVS, the actual features did not converge to the desired ones. Figure 12 demonstrates the robot joint angles in ESIBVS and switch IBVS. Figure 13 shows the comparison regarding the feature errors. The feature errors and the norm of feature errors in the proposed method successfully converged to the desired values (Figure 13a,c), while in the switch IBVS, the task could not be completed, and thus, the feature errors did not converge (Figure 13b,d).

In order to further validate the performance of ESIBVS regarding the repeatability, the same test was repeated in 10 trials. The time of convergence and the final norms of feature error are shown in Table 5. The variations of feature error norms with time in 10 trials of ESIBVS are illustrated in Figure 14. As shown in the results, ESIBVS was able to overcome the feature loss and complete the task in each trial, while Switch IBVS was stuck in a point and did not converge.

**Figure 11.** Test 2: experiment: Image space feature trajectory and 3D camera trajectory in enhanced switch IBVS and switch IBVS. (**a**) Image space feature trajectory in enhanced switch IBVS; (**b**) image space feature trajectory in switch IBVS; (**c**) camera 3D trajectory in enhanced switch IBVS; (**d**) camera 3D trajectory in switch IBVS.

**Figure 12.** Test 2: experiment. Robot joint angles in enhanced switch IBVS and switch IBVS. (**a**) Joint angles (degree) in enhanced switch IBVS; (**b**) joint angles (degree) in switch IBVS.

**Figure 13.** Test 2: experiment. Comparison of the feature errors and the norm of feature errors in enhanced switch IBVS and switch IBVS. (**a**) Feature errors in enhanced switch IBVS; (**b**) feature errors in switch IBVS; (**c**) norm of feature errors in enhanced switch IBVS; (**d**) norm of feature errors in switch IBVS.


**Table 5.** Test 2: experiment. Repeatability comparison results.

**Figure 14.** Test 2: experiment. The time variations of feature error norms in 10 trials of ESIBVS.

Test 3: In this test, the performance of ESIBVS was compared with that of switch IBVS in the situation where the features did not leave the FOV of the camera. The initial and desired features were set in a way such that the features did not go outside of FOV (Table 6). Similar to the previous tests, ESIBVS and switch IBVS were compared, and the results are shown in Figures 15–17 and Table 7. As shown in the figures, ESIBVS had a 38% shorter convergence time than switch IBVS did, which was owed to the superior noise-filtering ability of the designed Kalman filter.


**Table 6.** Test 3: Experiment. Initial (I) and desired (D) feature point positions in pixels.

**Table 7.** Test 3: Comparison of experimental resuts between ESIBVS and Switch IBVS.


The experimental results showed the efficiency of ESIBVS in dealing with feature loss while keeping the superior performance of the switch IBVS over traditional IBVS. As already shown in our previous work [15,16], the switch method was proven to have a better performance in its response time and its tracking performance, making it more feasible for industrial applications in comparison with the conventional IBVS. However, it suffered the drawback of weakness in dealing with feature loss. The proposed ESIBVS solved this problem and made switch IBVS more robust by using the Kalman filter to reconstruct the lost features.

**Figure 15.** Test 3: experiment. Image space feature trajectory and 3D camera trajectory in enhanced switch IBVS and switch IBVS. (**a**) Image space feature trajectory in enhanced switch IBVS; (**b**) image space feature trajectory in switch IBVS; (**c**) camera 3D trajectory in enhanced switch IBVS; (**d**) camera 3D trajectory in switch IBVS.

**Figure 16.** Test 3: experiment. Robot joint angles in enhanced switch IBVS and switch IBVS. (**a**) Feature errors in enhanced switch IBVS; (**b**) feature errors in switch IBVS.

**Figure 17.** Test 3: experiment. Comparison of feature errors and the norm of feature errors in enhanced switch IBVS and switch IBVS. (**a**) Feature errors in enhanced switch IBVS; (**b**) feature errors in switch IBVS; (**c**) norm of feature errors in enhanced switch IBVS; (**d**) norm of feature errors in switch IBVS.

#### **7. Conclusions**

This paper proposed an enhanced switch IBVS for a 6-DOF industrial robot. An image feature reconstruction algorithm based on the Kalman filter was proposed to handle feature loss during the process of IBVS. The combination of a three-stage switch controller and feature reconstruction algorithm improved the system response speed and tracking performance of IBVS and simultaneously overcame the problem of feature loss during the task. The proposed method was simulated and then tested on a 6-DOF robotic system with the camera installed in an eye-in-hand configuration. Both simulation and experimental results verified the efficiency of the method. In the future, we may extend the method to make it more robust to uncertainties such as the depth of features and camera parameters. In addition, the effect of different sampling periods on the performance of the proposed ESIBVS will be investigated.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2079-9292/8/8/903/s1.

**Author Contributions:** The authors' individual contributions are provided as follow: conceptualization, A.G., P.L., W.-F.X. and W.T.; methodology, A.G., P.L., W.-F.X. and W.T.; software, A.G.; validation, A.G., P.L. and W.-F.X.; resources, W.-F.X.; writing–original draft preparation, A.G.; writing–review and editing, A.G., P.L., W.-F.X. and W.T.; supervision, W.-F.X. and W.T.; project administration, W.-F.X.; funding acquisition, W.-F.X.

**Funding:** This research was funded by Natural Sciences and Engineering Research Council of Canada grant number N00892.

**Conflicts of Interest:** The authors declare no conflict of interest. The funder had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**



© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Visual Closed-Loop Dynamic Model Identification of Parallel Robots Based on Optical CMM Sensor**

**Pengcheng Li 1, Ahmad Ghasemi 1, Wenfang Xie 1,\* and Wei Tian <sup>2</sup>**


Received: 27 June 2019; Accepted: 22 July 2019; Published: 26 July 2019

**Abstract:** Parallel robots present outstanding advantages compared with their serial counterparts; they have both a higher force-to-weight ratio and better stiffness. However, the existence of closed-chain mechanism yields difficulties in designing control system for practical applications, due to its highly coupled dynamics. This paper focuses on the dynamic model identification of the 6-DOF parallel robots for advanced model-based visual servoing control design purposes. A visual closed-loop output-error identification method based on an optical coordinate-measuring-machine (CMM) sensor for parallel robots is proposed. The main advantage, compared with the conventional identification method, is that the joint torque measurement and the exact knowledge of the built-in robot controllers are not needed. The time-consuming forward kinematics calculation, which is employed in the conventional identification method of the parallel robot, can be avoided due to the adoption of optical CMM sensor for real time pose estimation. A case study on a 6-DOF RSS parallel robot is carried out in this paper. The dynamic model of the parallel robot is derived based on the virtual work principle, and the built dynamic model is verified through Matlab/SimMechanics. By using an outer loop visual servoing controller to stabilize both the parallel robot and the simulated model, a visual closed-loop output-error identification method is proposed and the model parameters are identified by using a nonlinear optimization technique. The effectiveness of the proposed identification algorithm is validated by experimental tests.

**Keywords:** parallel robot; dynamic model; visual servoing; closed-loop output-error identification; optical CMM sensor

#### **1. Introduction**

Parallel robots are closed-chain mechanisms in which the end-effector is supported by a series of independent computer-controlled serial chains linked to the base platform. Parallel robots present some outstanding advantages in higher force-to-weight ratio and better stiffness compared with serial manipulators. Specifically, 6-DOF parallel robots have been used in various applications (e.g., flight simulators [1], manufacturing lines [2] and medical tools [3]). Due to the manufacturing tolerances and deflection in the robot structure, the typical positioning discrepancy between a virtual robot in simulation and a real robot can be 8–15 mm [4], which does not meet the precision requirement of many potential applications. The low absolute accuracy of the robot is the main problem for the off-line programming based applications where tens of thousand points or continuous trajectories are to be reached or tracked.

The existence of closed-chain mechanism and multiple moving parts in the parallel robots, for example, in a 6-DOF Gough-Stewart platform consisting of 13 moving bodies (12 legs and one

end-effector), yields difficulties in dynamic analysis of parallel robots. Moreover, the dynamic model plays an important role in the model-based controller designs, especially in applications where high positioning and tracking accuracy is needed. In contrast to serial robots, where the joint angles are used as the state variables for modeling, the parallel robots are preferably modeled and controlled in the operational (workspace) coordinates where the pose of the end-effector and its derivatives are used as the state variables [5]. The main reason is that there is almost no analytical expression for the forward kinematic model of 6-DOF parallel robots. For example, there are 40 forward kinematics solutions [6] mapping the joint coordinates to the pose of the end-effector platform in a Gough-Stewart platform. In addition, to this day, there is no effective method to determine the pose of the platform from multiple solutions. The dynamic models of 6-DOF Gough-Stewart parallel robot have been built by several approaches such as Newton-Euler [7], Lagrangian formulation [8] and the principle of virtual work [9]. Newton-Euler formulation can provide the internal forces for each individual body of the parallel robot, which can benefit the mechanical design process of the parallel robot. However, the computational load is high due to the large amount of equations. In contrast, the Lagrangian and virtual work methods are more efficient and suitable for the control design purpose since the reaction forces between the bodies of the parallel robot are not considered. The published research work on 6-DOF RSS parallel robots is considerably scarcer compared with that on Gough-Stewart parallel robot. The dynamic models of one type of 6-DOF RSS parallel robot, in which the active rotation axes are coplanar, are built based on Newton-Euler equations [10] or Lagrangian formulation [11] for dynamic analysis and tracking control purpose respectively. In this paper, the dynamic model of a 6-DOF RSS parallel robot, where the active rotation axes are parallel to each other, is built based on the virtual work principle, and the explicit form of the dynamic model is derived for identification and dynamic model-based visual servoing purposes.

The dynamic parameters are normally unknown or approximately derived from manufacturer specifications, which are not accurate enough for the dynamic model-based controller design. System identification is an effective method to perceive the uncertain parameters in the dynamic model of the system, and has been applied to many engineering practices [12]. As a highly coupled multi-input/multi-output (MIMO) nonlinear system, industrial robots aroused great interest and challenge for the identification method. The literatures on the state-of-the-art identification methods can be found in [13–15]. For the industrial robots, the dynamic identification is normally performed in closed-loop, since the robotic system is open loop unstable. In [16], a MIMO closed-loop identification based on weighted least square estimation has been applied to an industrial serial robot used in a planar configuration. In addition, other closed-loop identification methods with maximum likelihood, instrumental variable and related implementation issues on industrial serial robots are addressed in [17,18]. A new closed-loop output-error identification scheme has been adopted for the serial robots [19]. The output-error identification method aims at finding the dynamic model parameters by minimizing the output deviation between the actual and simulated systems subjecting to the same input [20]. In [19], the identification procedure is implemented in a closed-loop control structure and the joint torque is the measured output, which avoids the estimation of the velocity and acceleration from the measured joint position.

One potential issue of the above-mentioned identification methods is that joint torque measurement or a related control signal is needed for identification, which is not always available for the industrial robots, since the built-in controllers of many industrial robots are unaccessible and do not provide the torque actuation mode [21]. The input of built-in controllers is the position or velocity command, and the output is the joint torque which is unaccessible to the users. Hence the torque and current of the motors cannot be derived directly and it is not easy to install additional torque sensors to get the direct measurement. The unknown controller can be identified along with the dynamic parameters as introduced in [22]. However, joint torque measurement is still needed for identification purposes.

In [13,23–25], identification issues in parallel robots have been discussed. Most research works on dynamic identification of parallel robots are based on a simplified dynamic model, such as in [26–28]. Nevertheless, the systematic derivation of the full inverse dynamic model is proposed based on Jourdain's principle in [24]. In addition, the identification procedure is carried out in two steps: 1. identifying inertial parameters and 2. estimating the friction coefficients. In [23], the full parameters of robots' dynamic model and the joint drive gains are identified based on the total least square method, and the method is tested on a 3 DOF Orthoglide parallel robot. Many identification methods for 6 DOF parallel robots in [23–25] come directly from those of serial robots. By rewriting the inverse dynamic model and friction model into a linear form with respect to the dynamic and friction parameters, the identification of the unknown parameters can be done through least squares technique. However for 6 DOF parallel robots, to avoid solving forward kinematics, the regression matrix of the dynamic model is constituted of the pose state variables in the operational space. In [25], only joint space state variables are measured and estimated. Therefore the state variables in operational space should be calculated through numerical computation of the forward kinematics, which is quite time-consuming.

One way to avoid computing the forward kinematics in the identification process is to directly measure the pose of the end-effector in the operational space based on vision sensors. In [28], a camera-based dynamic identification procedure is given for a 4-DOF parallel robot with a heavily simplified dynamic model by considering only the inertia of the end-effector. Also, a vision sensor based dynamic identification has been carried out on serial robots [29] and cable robots [30], but seldom on parallel robots. The optical CMM sensor is a dual camera based vision sensor, which can provide the real-time pose information of the targets. The coordinates of target reflectors in the field of view (FOV) can be directly measured by the sensor. By observing four non-collinear reflectors on the platform, the pose of the end-effector can be measured. The optical CMM sensor has been applied to the kinematics calibration [31] and the path tracking controller design [32] for the robots. To increase the flexibility and tracking accuracy, vision can also be incorporated into the feedback control loop of the parallel robot systems to form the so-called visual servoing control. The pose of the end-effector can be acquired on-line by using the optical CMM sensor, i.e., C-track from Creaform Inc. (Levis, QC, Canada). The visual servoing controller for parallel robot is superior to the joint space controller due to the fact that the kinematic errors introduced from transforming the desired trajectory in the operational space into the one in the joint space in the joint controller can be avoided. The measured pose together with the visual servoing controller allows using the closed-loop output-error identification method [20] to identify the dynamic model of the parallel robot.

Upon the above discussions, a closed-loop output-error identification method based on a CMM sensor is proposed for parallel robots in this paper. The end-effector pose is measured by the optical CMM and served as the output of the real plant. The same outer loop visual servoing controller and reference trajectory are employed in both actual robot and simulation model for model identification. The forward kinematics of parallel robots, which is usually solved by using time-consuming numerical algorithm, can be avoided. The exact knowledge of the built-in controller and the joint torque are not needed. The dynamic model parameters are identified by using nonlinear optimization technique. The experimental tests validate the identification results.

This paper is organized as follows. Section 2 describes the dynamic model of a 6-RSS parallel robot. The closed-loop output-error identification method is proposed and the procedure of the identification is presented in Section 3. The dynamic model validation based on simulation and the experiment results of the identification are given in Section 4. Finally, the conclusion is drawn in Section 5.

#### **2. Dynamic Modeling**

In this section, the kinematic analysis of a 6 RSS parallel robot is conducted and the dynamic model is derived based on the virtual work principle.

#### *2.1. Kinematic Analysis*

The motivation of kinematic analysis is to determine geometry mapping from the motion of end-effector frame w.r.t the base frame (operational space motion) to the rotation of the actuators, as regarding the revolute joint frames (joint space motion). Based on the geometry mapping, the link Jacobian matrix is derived for building the dynamic model. As shown in Figure 1a, the 6-RSS parallel robot consists of six identical serial branch chains. Each serial branch, illustrated in Figure 1b, consists of a wrench, a link, an active revolute joint (R) and two passive spherical joints (S). One spherical joint is used to connect the wrench and the link. The revolute joint is driven by actuators and connects the wrench and the base platform, while the spherical joint is employed between the link and the end effector. The base frame *ΣO* is assigned at the symmetric center of the base platform and the end-effector frame *ΣE* is also attached at the symmetric center, while *E* denotes the coordinate vector of the frame origin regarding the base frame. The coordinate vectors of revolute joint centers regarding the base frame are marked by *Bi*(*i* = 1, 2, ... , 6) while the rotation centers of spherical joints are represented as *Ti* and *Ai* respectively. In the subsequent kinematic and dynamic analysis, the coordinates of the parts of parallel robot are defined regarding the base frame by default. The moving wrench and link frames *ΣWi* and *ΣLi* are attached to the wrenches and links respectively as depicted in Figure 1b. The coordinate vectors of the centers of mass of the wrenches and links are denoted as *cwi* and *cli* respectively. The vector *θ* = [*θ*1, *θ*2, ... , *θ*6] *<sup>T</sup>* represents the rotation angles of the actuators. The coordinate vector from *E* pointing to *Ai* regarding the end-effector frame is denoted as *ai*. And the coordinate vectors *wi* and *li* represent the directions and length of the wrench and link. The pose vector of the end-effector frame is expressed as *χ<sup>E</sup>* = [*h*, *φ*] *<sup>T</sup>*, where *h* = [*x*, *y*, *z*] *<sup>T</sup>* represents the position of the end-effector frame origin, while *φ* = [*α*, *β*, *γ*] *<sup>T</sup>* represents the Euler-angle rotation of the frame. The rotation matrix, R, is given by

$$R = R\_{\mathfrak{X}}(a)R\_{\mathfrak{Y}}(\mathfrak{f})R\_{\mathfrak{z}}(\gamma) = \begin{pmatrix} c\mathfrak{f}c\gamma & -c\mathfrak{f}s\gamma & s\mathfrak{f} \\ c\alpha s\gamma + c\gamma s\mathfrak{f}s\alpha & -s\mathfrak{f}s\alpha s\gamma + c\alpha c\gamma & -c\mathfrak{f}s\alpha \\ s\alpha s\gamma - c\alpha c\gamma s\mathfrak{f} & c\alpha s\mathfrak{f}s\gamma + c\gamma s\alpha & c\mathfrak{f}c\alpha \end{pmatrix},\tag{1}$$

where *Rx*, *Ry*, *Rz* are the orthonormal rotation matrices for the rotation about *X*,*Y*, *Z* axes respectively. The vector *χ***˙***<sup>E</sup>* = *h***˙** *<sup>T</sup>*, *φ***˙** *<sup>T</sup> T* and *χ***¨***<sup>E</sup>* = *h***¨** *<sup>T</sup>*, *φ***¨** *<sup>T</sup> T* are the first and second order time derivatives of *χE*. The vector *vE* = *h***˙** *<sup>T</sup>*, *ω<sup>T</sup> T* ∈ *<sup>R</sup>*6×<sup>1</sup> denotes the linear and angular velocities of the end-effector frame. Then *v***˙***<sup>E</sup>* is the acceleration vector. The relationship between the Euler angle rate *φ***˙** and angular velocity *ω* is expressed as follows

$$
\omega = \restriction{\varrho},
\tag{2}
$$

where *Je* = ⎡ ⎢ ⎣ 1 0 *sβ* 0 *cα* −*cβ sα* 0 *sα cβ cα* ⎤ ⎥ <sup>⎦</sup> is the analytical Jacobian matrix, *<sup>s</sup>* and *<sup>c</sup>* stand for *sin* and

*cos* respectively.

The following assumptions are made for kinematic and dynamic analysis of a 6-RSS parallel robot:


As shown in Figure 1a,b, the closure loop position relationship between the end-effector frame and the base frame can be expressed as the following:

$$E + a\_l - A\_l = 0 \quad \text{and} \tag{3}$$

$$B\_i + \mathfrak{z}w\_i + l\_i - A\_i = 0, i = 1, 2, \dots, 6. \tag{4}$$

Then the linear velocity can be derived by combining Equations (3) and (4) and taking the time derivative:

$$\mathbf{h} + \boldsymbol{\omega} \times \mathbf{a}\_{\mathrm{i}} = \boldsymbol{\omega}\_{\mathrm{1}} \times \mathbf{w}\_{\mathrm{i}} + \boldsymbol{\omega}\_{\mathrm{2}} \times \mathbf{l}\_{\mathrm{i}}, \mathrm{i} = \mathbf{1}, \mathbf{2}, \dots, \mathbf{6}, \tag{5}$$

where *ω***<sup>1</sup>** and *ω***<sup>2</sup>** are the angular velocity of frame *ΣWi* and *ΣLi* regarding *ΣO* respectively.

Then, dot multiplying *li* on both sides of Equation (5) yields:

$$d\_i \dot{\mathbf{h}} + (a\_i \times l\_i)\omega = (w\_l \times l\_i) \cdot \omega \mathbf{u}\_l, i = 1, 2, \dots, 6. \tag{6}$$

Since the wrench rotates around a fixed axis which is denoted as *s***ˆ** = [0, 0, 1] *<sup>T</sup>*, the Jacobian matrix mapping from the end-effector Cartesian velocity to joint velocity can be derived by the following:

$$J\_{a1}\boldsymbol{\theta} = J\_{a2}\boldsymbol{\upsilon}\_{\rm E} \tag{7}$$

where *Ja*<sup>1</sup> = *diag*((*w***<sup>1</sup>** × *l***1**) · *s***ˆ**,(*w***<sup>2</sup>** × *l***2**) · *s***ˆ**,...,(*w***<sup>6</sup>** × *l***6**) · *s***ˆ**), *Ja*<sup>2</sup> = ⎡ ⎢ ⎣ *lT* **<sup>1</sup>** (*a***<sup>1</sup>** × *<sup>l</sup>***1**)*<sup>T</sup>* . . . . . . *lT* **<sup>6</sup>** (*a***<sup>6</sup>** × *<sup>l</sup>***6**)*<sup>T</sup>* ⎤ ⎥ ⎦.

When the robot works in the singularity-free operational space, the Jacobian matrix *Jad* can be derived as follows:

$$
\dot{\boldsymbol{\theta}} = \boldsymbol{J}\_{a1}^{-1} \boldsymbol{J}\_{d2} \boldsymbol{\nu} \boldsymbol{\varepsilon} = \boldsymbol{J}\_{ad} \boldsymbol{\nu} \boldsymbol{\varepsilon}.\tag{8}
$$

Then the translational velocity of the center of mass of the wrench *cw*˙ *<sup>i</sup>* can be obtained from Equation (9).

$$
\mathfrak{c}\_{\mathfrak{w}\_i} = \theta\_i \mathfrak{k} \times \mathfrak{c}\_{\mathfrak{w}\_i} = f\_{\mathfrak{a}\mathfrak{v}} \mathfrak{v}\_{\mathbb{E}}.\tag{9}
$$

where *Jau* = (*s***<sup>ˆ</sup>** × *cwi*)*Jadi* ∈ *<sup>R</sup>*3×6, and *Jadi* is the *<sup>i</sup>*th row of *Jad*. Then the link Jacobian *Ja* mapping *vE* to the velocity of the center of mass of the *i*th wrench *v***<sup>1</sup>** = [*cw*˙ *<sup>i</sup> <sup>T</sup>*, *ω<sup>T</sup>* 1*i* ] *<sup>T</sup>* can be derived as

$$
\boldsymbol{\sigma}\_{\mathbf{1}} = \begin{bmatrix} \boldsymbol{\sigma}\_{\mathbf{w}\_{i}} \\ \boldsymbol{\omega}\_{\mathbf{1}\boldsymbol{i}} \end{bmatrix} = \begin{bmatrix} \boldsymbol{J}\_{d\boldsymbol{u}} \\ \mathfrak{F}\_{d\boldsymbol{d}i} \end{bmatrix} \boldsymbol{\sigma}\_{\mathbf{E}} = \boldsymbol{J}\_{d} \boldsymbol{\sigma}\_{\mathbf{E}}.\tag{10}
$$

Equation (11) can be deduced by right cross multiplying *li* on the both sides of Equation (5) and using Lagrange's rule.

$$(\omega\_2 \cdot l\_i) \cdot l\_i - (l\_i \cdot l\_i) \cdot \omega\_2 = h \times l\_i + (\omega \times \mathfrak{a}\_l) \times l\_l - (\omega\_1 \times \mathfrak{w}\_l) \times l\_l. \tag{11}$$

Since the link does not rotate about its longitudinal axis, *ω***<sup>2</sup>** · *li* = 0 holds. Rearranging Equation (11), the following equation can be derived:

$$\|\|\|l\_{l}\|\|^{2}\,\omega\_{2} = [l\_{l}]\_{X}\hat{\mathfrak{h}} - [l\_{l}]\_{X}\left[\mathfrak{a}\_{l}\right]\_{X}\,\omega + [l\_{l}]\_{X}\left[\mathfrak{w}\_{l}\right]\_{X}\,\omega\_{1\prime} \tag{12}$$

where the operator [·]*<sup>X</sup>* and · represents the cross product operation and Euclidean norm respectively.

By combining Equations (8) and (12), the following Jacobian matrix *Jbd* mapping from the end-effector Cartesian velocity to the angular velocity of the link frame can be deduced:

$$
\omega\_2 = f\_{bd} \mathbf{v}\_{E'} \\
$$

$$
J\_{bd} = \frac{1}{||\,I\_l||^2} \{ \left[ \begin{array}{c} [I\_l]\_X - [I\_l]\_X[\mathbf{a}\_l]\_X \end{array} \right] + [I\_l]\_X[\mathbf{w}\_l]\_X \\$f\_{adi} \}. \tag{13}
$$

The translational velocity of the center of mass of the link ˙ *cli* can be obtained as follows:

$$
\sigma\_{l\_l} = -[\mathfrak{w}\_l]\_X \omega\_\mathbf{1} - [\mathfrak{c}\_{l\_l}]\_X \omega\_\mathbf{2} \tag{14}
$$

By substituting Equations (10) and (13) into Equation (14), the following velocity relationship can be derived:

$$\mathfrak{c}\_{l\_i} = (-[\mathfrak{w}\_l]\_X \mathfrak{s})\_{adi} - [\mathfrak{c}\_{l\_i}]\_X l\_{bd} \, \mathfrak{w}\_E = f\_{bu} \mathfrak{w}\_E \tag{15}$$

Hence the link Jacobian *Jb* mapping *vE* to the velocity of the center of mass of the *i*th link *v***<sup>2</sup>** can be derived.

$$
\sigma \mathbf{v}\_2 = \begin{bmatrix} \mathbf{c}\_{l\_l} \\ \omega\_{2i} \end{bmatrix} = \begin{bmatrix} J\_{bu} \\ J\_{bd} \end{bmatrix} \mathbf{v}\_E = J\_b \mathbf{v}\_E \tag{16}
$$

**Figure 1.** (**a**) The sketch of 6-RSS parallel robot, (**b**) Single serial branch.

#### *2.2. Dynamic Modeling*

In contrast to serial robots, the dynamic modeling of parallel robots is more complicated due to its closure geometrical structure and difficulties in deriving the forward kinematics. The principle of virtual work is employed to derive the explicit form of the dynamic model in terms of the operational coordinates and their time derivatives as shown in Equation (17) for 6-DOF RSS robot, which is useful for dynamic model-based controller design.

$$
\pi\_{\mathcal{X}} = M(\chi\_E)\dot{\omega}\_E + C(\chi\_E, \omega\_E)\omega\_E + G(\chi\_E) + \pi\_{f,} \tag{17}
$$

where *τ<sup>g</sup>* denotes the general force acting on the end-effector frame, *τ<sup>f</sup>* is the friction, *M*(*χE*) is the mass matrix, *C*(*χE*, *vE*) is Coriolis and centrifugal matrix, and *G*(*χE*) denotes the gravity. In order to avoid solving the forward kinematics of the parallel robot which may not have analytical solutions, the pose in the coordinates of the end-effector and its time derivatives are employed in Equation (17).

The balance equation of virtual work principle for a moving rigid body, ∗, can be expressed as follows:

$$
\mathcal{F}\_\* \cdot \delta \chi\_\* = (\mathcal{F}\_{\text{ext}\_\*} + \mathcal{F}\_\*) \cdot \delta \chi\_\* = 0,\tag{18}
$$

in which *F*¯ <sup>∗</sup> contains the static balance force and torque, *Fext<sup>∗</sup>* = [*<sup>f</sup> <sup>T</sup> ext<sup>∗</sup>* , *<sup>τ</sup><sup>T</sup> ext<sup>∗</sup>* ] *<sup>T</sup>* is the external force (*fext<sup>∗</sup>* ) and torque (*τext<sup>∗</sup>* ) exerted on the center of mass of the body respectively, *δχ*<sup>∗</sup> denotes the virtual displacement, and the fictitious force and torque are:

$$\mathbf{f}\_{\*} = \begin{bmatrix} m\_{\*}\mathbf{g} - m\_{\*}\ddot{\mathbf{h}}\_{\*} \\ -(I\_{\*}\dot{\boldsymbol{\omega}}\_{\*} + \boldsymbol{\omega}\_{\*} \times I\_{\*}\boldsymbol{\omega}\_{\*}) \end{bmatrix} \tag{19}$$

where *<sup>m</sup>*<sup>∗</sup> is the mass of the body, *<sup>g</sup>* is the gravity vector, *<sup>h</sup>*¨ <sup>∗</sup> is the linear acceleration of center of mass of the body, *I*<sup>∗</sup> is the moment of inertia, *ω*<sup>∗</sup> and *ω*˙<sup>∗</sup> denote the angular velocity and acceleration of the moving body frame. Further, *F*¯ <sup>∗</sup> can be represented in the form similar to Equation (17):

$$\begin{aligned} \vec{F}\_{\*} &= F\_{\text{ext}\_{\*}} + M\_{\*} \vec{\boldsymbol{\nu}}\_{\*} + \mathbb{C}\_{\*} \boldsymbol{\nu}\_{\*} + G\_{\*}, \text{ where} \\ M\_{\*} &= \begin{bmatrix} -m\_{\*} E\_{3 \times 3} & \mathbf{0}\_{3 \times 3} \\ \mathbf{0}\_{3 \times 3} & -I\_{\*} \end{bmatrix}, \mathsf{C}\_{\*} = \begin{bmatrix} \mathbf{0}\_{3 \times 3} & \mathbf{0}\_{3 \times 3} \\ \mathbf{0}\_{3 \times 3} & -\omega\_{\*} \times I\_{\*} \end{bmatrix}, \\ G\_{\*} &= \begin{bmatrix} m\_{\*} \mathbf{g} \\ \mathbf{0}\_{3 \times 1} \end{bmatrix}, \vec{\boldsymbol{\nu}}\_{\*} = \begin{bmatrix} \ddot{\boldsymbol{h}}\_{\*} \\ \dot{\boldsymbol{\omega}}\_{\*} \end{bmatrix}, \vec{\nu}\_{\*} = \begin{bmatrix} \dot{\boldsymbol{h}}\_{\*} \\ \boldsymbol{\omega}\_{\*} \end{bmatrix} \end{aligned} \tag{20}$$

in which *<sup>E</sup>*3×<sup>3</sup> <sup>∈</sup> *<sup>R</sup>*3×<sup>3</sup> denotes the identity matrix. For a 6-DOF RSS parallel robot, there are 13 moving bodies including the end-effector, 6 wrenches and 6 links. Therefore the balance equation of the 6-DOF parallel robot can be rewritten as Equation (21):

$$
\vec{F}\_p \cdot \delta \chi\_{\mathbf{f}} + \sum\_{i=1}^6 \vec{F}\_{l\_i} \cdot \delta \chi\_{I\_i} + \sum\_{i=1}^6 \vec{F}\_{w\_i} \cdot \delta \chi\_{w\_i} = 0,\tag{21}
$$

where *F*¯ *<sup>p</sup>*, *F*¯ *li* and *<sup>F</sup>*¯ *wi* contain the static balance force and torque exerted on the centers of mass of the platform, links and wrenches respectively and can be represented in the same form as Equation (20), *δχe*, *δχli* , and *δχwi* are the virtual displacements accordingly.

In addition, the following relations hold for the velocity analysis:

$$
\delta\theta = \mathfrak{J}\_{\text{ad}} \delta \mathfrak{x}\_{\mathfrak{e}}, \ \delta \mathfrak{x}\_{\mathfrak{U}^{\mathfrak{U}}} = \mathfrak{J}\_{\text{d}} \delta \mathfrak{x}\_{\mathfrak{e}}, \ \delta \mathfrak{X}\_{\mathfrak{l}i} = \mathfrak{J}\_{\text{b}} \delta \mathfrak{X}\_{\mathfrak{e}}.\tag{22}
$$

Substituting Equation (22) into Equation (21), the terms in Equation (17) can be derived as the following:

$$\begin{split} \mathcal{M}(\chi\_{E}) &= \mathcal{M}\_{\mathcal{P}} + \sum\_{i=1}^{6} (J\_{a\_{i}}^{T} \mathcal{M}\_{\overline{u}\_{i}} I\_{d\_{i}} + J\_{b\_{i}}^{T} \mathcal{M}\_{l\_{i}} I\_{b\_{i}}), \\ \mathcal{C}(\chi\_{E}, \mathbf{\sigma}\_{E}) &= \mathbf{C}\_{\mathcal{P}} + \sum\_{i=1}^{6} (J\_{a\_{i}}^{T} \mathcal{C}\_{\overline{u}\_{i}} I\_{d\_{i}} + J\_{a\_{i}}^{T} \mathcal{M}\_{\overline{u}\_{i}} I\_{a\_{i}} + J\_{b\_{i}}^{T} \mathcal{C}\_{l\_{i}} I\_{b\_{i}} + J\_{b\_{i}}^{T} \mathcal{M}\_{l\_{i}} I\_{b\_{i}}), \\ \mathcal{G}(\chi\_{E}) &= \mathbf{G}\_{p} + \sum\_{i=1}^{6} (J\_{a\_{i}}^{T} \mathcal{G}\_{w\_{i}} + J\_{b\_{i}}^{T} \mathcal{G}\_{l\_{i}}), \\ \mathbf{\tau}\_{g} &= \mathbf{J}\_{\text{all}}^{T} \mathbf{\tau}\_{\mathbf{u}\_{f}} \end{split} \tag{23}$$

where *τ<sup>a</sup>* = [*τa*<sup>1</sup> , *τa*<sup>2</sup> ...*τa*<sup>6</sup> ] *<sup>T</sup>* is the actuator torque vector applying on the revolute joints, and the details of Equation (23) are given in Appendix A. The joint friction is described by Coulomb model [33] that has been used in the modelings of both parallel robots [24] and serial robots. Based on this friction model, the friction *τ<sup>f</sup>* in Equation (17) can be represented as:

$$\mathbf{r}\_{f} = J\_{ad}^{T} \begin{bmatrix} f\_{c\_1} \operatorname{sign}(f\_{ad\_1} \mathbf{v}\_E) + f\_{v\_1} f\_{ad\_1} \mathbf{v}\_E \\ f\_{c\_2} \operatorname{sign}(f\_{ad\_2} \mathbf{v}\_E) + f\_{v\_2} f\_{ad\_2} \mathbf{v}\_E \\ \cdots \\ f\_{c\_6} \operatorname{sign}(f\_{ad\_6} \mathbf{v}\_E) + f\_{v\_6} f\_{ad\_6} \mathbf{v}\_E \end{bmatrix} \tag{24}$$

where *fci* and *fvi* are the Coulomb and viscous friction parameters of the *i*th revolute joint.

#### *2.3. Dynamic Model Simplification*

Since the kinematic parameters of parallel robots can be obtained from the manufacturer specifications or by calibration [31], only inertia and friction parameters are considered for the model identification of parallel robots. Considering the heavy computation load of solving inverse dynamic model for the dynamic identification and visual servoing purpose, a reduction of the dynamic parameters with the trade-off between the computation load and accuracy should be implemented. The geometry feature of the parallel robot can be considered for simplification. For the 6-RSS parallel robot, it is assumed that the wrenches, links and end-effector are symmetric. Furthermore, the center of mass is assumed to be located in the symmetric center. Therefore, the dynamic parameters for each body of the wrenches, links and end-effector can be reduced to *ξ<sup>∗</sup>* = [*m*∗, *Ix*<sup>∗</sup> , *Iy*<sup>∗</sup> , *Iz*<sup>∗</sup> ]. Then Equation (17) can be rewritten as the linear form w.r.t. the dynamic parameters and the friction coefficients:

$$
\pi\_{\mathcal{X}} = \Gamma(\chi\_{E'} \upsilon\_{E'} \dot{\upsilon}\_{E}) \Xi,\tag{25}
$$

where **Ξ** = [*ξ<sup>T</sup> <sup>p</sup>*, *ξ<sup>T</sup> <sup>w</sup>***1**, *<sup>ξ</sup><sup>T</sup> <sup>w</sup>***2**, ... , *<sup>ξ</sup><sup>T</sup> <sup>w</sup>***6**, *<sup>ξ</sup><sup>T</sup> <sup>l</sup>***1**, *<sup>ξ</sup><sup>T</sup> <sup>l</sup>***2**, ... , *<sup>ξ</sup><sup>T</sup> <sup>l</sup>***6**, *fc*<sup>1</sup> , *fc*<sup>2</sup> , ... , *fc*<sup>6</sup> , *fv*<sup>1</sup> , *fv*<sup>2</sup> , ... , *fv*<sup>6</sup> ] *<sup>T</sup>* is a *R*64×<sup>1</sup> vector of dynamic parameters and the friction coefficients, and Γ(*χE*, *vE*, *v***˙***E*) is the regressor matrix, which consists of the kinematic parameters, state variables and their derivatives. Γ(*χE*, *vE*, *v***˙***E*) can be derived using the Symbolic Math Toolbox of Matlab. Given an exciting trajectory as a reference input to the robot, which will be introduced in Section 3.3, Equation (26) can be obtained by reorganizing Equation (25).

$$
\begin{bmatrix}
\mathsf{T}\_{\mathfrak{g}\_{1}} \\
\mathsf{T}\_{\mathfrak{g}\_{2}} \\
\vdots \\
\mathsf{T}\_{\mathfrak{g}\_{n}}
\end{bmatrix} = \begin{bmatrix}
\Gamma(\chi\_{E\_{1}\prime}\upsilon\_{E\_{1}\prime}\dot{\upsilon}\_{E\_{1}}) \\
\Gamma(\chi\_{E\_{2}\prime}\upsilon\_{E\_{2}\prime}\dot{\upsilon}\_{E\_{2}}) \\
\vdots \\
\Gamma(\chi\_{E\_{n}\prime}\upsilon\_{E\_{n}\prime}\dot{\upsilon}\_{E\_{n}})
\end{bmatrix} \mathsf{\Xi} = H\mathsf{\Xi},\tag{26}
$$

where *n* is the number of the sampled poses from the given trajectory. By feeding various testing trajectories to the robot, the regression matrix *H* is of full rank which means all elements of **Ξ** can be identified.

#### **3. Closed-Loop Output-Error Identification Based on Vision Feedback**

#### *3.1. Pose Estimation Using Optical CMM*

As shown in Figure 2, a dual-camera optical CMM C-track 780 is employed to measure the pose of end-effector for the identification of the dynamic model in this research. The pose measurement principle of the optical CMM sensor is presented in this subsection. The target reflectors are taken as the point features. The homogeneous coordinates of the reflectors w.r.t. the sensor frame can be obtained by using triangulation principle [34]. Given a group of non-collinear reflectors *pi*(*i* = 1, 2, ... , *n*) attached on the end-effector platform, the homogeneous coordinates in the sensor frame can be measured and denoted as *<sup>C</sup>***P***<sup>i</sup>* = [*xpi*, *ypi*, *zpi*, 1] *<sup>T</sup>*. In addition, the homogeneous coordinates of the reflectors w.r.t. the end-effector frame *ΣE* denoted as *<sup>E</sup>***P***<sup>i</sup>* = (*Exi*, *Eyi*, *Ezi*, 1) are known from the definition of the end-effector frame *ΣE*. Correspondingly, the transformation equation of *i*th feature point can be written as:

$$\prescript{C}{}{\mathbf{P}}\_i = \prescript{C}{}{\mathbf{T}}\_E \prescript{E}{}{\mathbf{P}}\_{i\nu} \tag{27}$$

where *<sup>C</sup>***T***<sup>E</sup>* is the homogeneous transformation matrix mapping from *ΣE* to *ΣC*. In order to derive *<sup>C</sup>***T***E*, at least three non-collinear feature points are required [35]. However, as indicated in [36], at least four coplanar feature points are needed for obtaining a unique solution while additional non-coplanar feature points can be used to improve the estimation accuracy with measurement noise. Similarly, the homogeneous transformation matrix mapping from *ΣO* to *ΣC*, *<sup>C</sup>***T***O*, can be derived. Then the

pose of the end-effector frame w.r.t. the base frame, *χE*, can be obtained from the homogeneous transformation matrix *<sup>O</sup>***T***<sup>E</sup>* given by Equation (28).

$$\prescript{O}{}{\mathbf{T}}\_{E} = \prescript{C}{}{\mathbf{T}}\_{O}^{-1} \prescript{C}{}{\mathbf{T}}\_{E}.\tag{28}$$

By using the proprietary software VXelements provided by Creaform Inc., the target frames can be defined based on the selected reflectors on the surface of the end-effector and base frame respectively. The real-time position and rotation information of the end-effector frame w.r.t. the base frame can be acquired, recorded or displayed simultaneously. In addition, the computation associated with the pose estimation of the target frame is carried out by VXelements.

**Figure 2.** The pose measurement system of a 6-RSS parallel robot.

#### *3.2. Closed-Loop Output-Error Identification Method*

The basic idea of output error identification is to use nonlinear optimization technique to minimize the squared error between the output of the real plant and the simulated model. Since the motor torque is unaccessible, the pose of the end-effector, *χE*, measured by optical CMM is used as the output of the system. Hence the closed-loop output-error identification method is adopted [19]. In conventional closed-loop output-error identification method, the controllers should be exactly known and applied to both the real plant and the simulated model. However for an industrial robot, the built-in controllers are usually unknown and need to be identified.

The closed-loop output-error identification approach for vision based robotic system, as depicted in Figure 3, is proposed in this paper. For the built-in controller of the 6-RSS parallel robot, a PID controller is used to control the joint angle of each revolute joint. However the three gains of PID controller are unknown and needed to be identified. During the process of identification, the gains and dynamic parameters are updated in each iteration of nonlinear optimization, which may make the simulated model unstable. Hence, an outer loop visual servoing controller is added to stabilize both real robot and simulated model. The visual servoing controller and the built-in controller construct a cascade PID controller. As stated in [37,38], the cascade controller yields better dynamic performance in terms of stability and working frequency compared with single loop controller. With a well-tuned outer loop visual servoing controller, both the real and simulation systems can have a better performance and stability.

**Figure 3.** The block diagram of closed-loop control system for model identification of 6-RSS parallel robot.

In Figure 3, for both the real plant and simulated model, the same exciting trajectory is given as the input *χd***(***t***)**. Then the outer loop visual servoing controller is designed as Equation (30). The inverse kinematic Jacobian *J<sup>θ</sup>* is used to transform the velocity in operational space to that in joint space. By combining Equations (2) and (8), Jacobian matrix *J<sup>θ</sup>* is derived as shown in Equation (29) and the joint position control signal *Uθ***(***t***)** is generated as shown in Equation (30).

$$
\boldsymbol{\theta} = \boldsymbol{J}\_{\text{ad}} \left[ \begin{array}{cc} \boldsymbol{E}\_{3 \times 3} & \mathbf{0}\_{3 \times 3} \\ \mathbf{0}\_{3 \times 3} & \boldsymbol{J}\_{\text{c}} \end{array} \right] \dot{\mathbf{x}} = \boldsymbol{J}\_{\boldsymbol{\theta}} \dot{\mathbf{x}}\_{\text{E}}.\tag{29}
$$

$$\mathcal{U}\_{\theta}(\mathbf{t}) = l\_{\theta} [k\_{\mathcal{P}}(\chi\_{d}(\mathbf{t}) - \chi\_{\mathfrak{m}}(\mathbf{t})) + k\_{d}(\dot{\chi}\_{d}(\mathbf{t}) - \dot{\chi}\_{\mathfrak{m}}(\mathbf{t})) + k\_{i} \int (\chi\_{d}(\mathbf{t}) - \chi\_{\mathfrak{m}}(\mathbf{t}))dt],\tag{30}$$

where *ki*, *kp*, *kd* are constants, and *χm***(***t***)** is the visual feedback obtained from VXelements software, while in the simulation *χm***(***t***)** is replaced by *χs***(***t***)** which is the pose calculated by using the forward dynamic model.

In addition, then the PID controller, given in Equation (31), is used to describe the built-in joint controller in each joint and is employed in the simulated model.

$$\pi\_{\mathfrak{a}}(t) = l\_p(\mathcal{U}\_{\mathfrak{b}}(\mathfrak{t}) - \theta\_{\mathfrak{s}}(\mathfrak{t})) + l\_{\mathfrak{d}}(\mathcal{U}\_{\mathfrak{b}}(\mathfrak{t}) - \theta\_{\mathfrak{s}}(\mathfrak{t})) + l\_i \int (\mathcal{U}\_{\mathfrak{b}}(\mathfrak{t}) - \theta\_{\mathfrak{s}}(\mathfrak{t}))dt,\tag{31}$$

where *li*, *lp*, *ld* are the PID parameters to be identified, and *θs***(***t***)** is the joint positions, which can be obtained by analytically solving the inverse kinematics.

The real plant output *Ym* = [*χm***(1)**, *χm***(2)**, ··· , *χm***(***k***)**, ] *<sup>T</sup>* and the simulation output *Ys* = [*χs***(1)**, *χs***(2)**, ··· , *χs***(***k***)**] *<sup>T</sup>* are fed to the optimization process. The parameters to be identified, **Λ**, can be denoted as **Λ** = [**Ξ***T*, *lp*, *li*, *ld*] *<sup>T</sup>*. Accordingly the identification of **Λ** can be converted to solving the following nonlinear optimization problem:

$$Minimiz \, \Phi(\mathbf{A}) = \left\| \begin{array}{c} \mathbf{Y\_m} - \mathbf{Y\_s} \, \right\|\, \right\|^2. \tag{32}$$

Then the updating formula for **Λ** is given as follows:

$$\mathbf{A}^{r+1} = (I\_{\Phi}^T I\_{\Phi})^{-1} I\_{\Phi}^T \Phi(\mathbf{A}^r) + \mathbf{A}^r \tag{33}$$

where **Λ***<sup>r</sup>* is the value of **Λ** in the *r*th iteration and *J*<sup>Φ</sup> is the Jacobian matrix of Φ(**Λ**) w.r.t. **Λ** given as:

$$J\boldsymbol{\phi} = \begin{bmatrix} \frac{\partial \Phi(\Lambda)}{\partial \Lambda\_1} & \frac{\partial \Phi(\Lambda)}{\partial \Lambda\_2} & \cdots & \frac{\partial \Phi(\Lambda)}{\partial \Lambda\_{\mathfrak{U}}} \end{bmatrix},\tag{34}$$

where **Λ***<sup>i</sup>* denotes the *i*th column of **Λ**. The terminate criteria is given as:

$$\begin{aligned} \frac{\|\:\Phi(\Lambda^{r+1}) - \Phi(\Lambda^r)\|\:\|}{\|\:\Phi(\Lambda^r)\|\:\|} &\le tol\_1\\ \max\_{i=1,\cdots,n} \left| \frac{\Lambda\_i^{r+1} - \Lambda\_i^r}{\Lambda\_i^r} \right| &\le tol\_{2'} \end{aligned} \tag{35}$$

where |·| denotes the absolute-value norm operation, *tol*<sup>1</sup> and *tol*<sup>2</sup> are the thresholds to be chosen for tuning the accuracy. A compromise should be made between the convergence speed and accuracy when choosing thresholds. To achieve good results in solving the nonlinear optimization problem, a proper initial guess of **Λ** is needed. For the dynamic parameters of the parallel robots, the initial guess can be calculated from manufacturer specifications. Then a priori PID parameters are obtained based on the simulation model.

#### *3.3. Modified Exciting Trajectory*

The exciting trajectories for the dynamic model identification of the serial robots based on the inverse dynamic model have been well studied in [39]. The Finite Fourier series-based exciting trajectory has been tested in a large amount of research works for identification purpose [40]. For serial robots, it can be represented by the following:

$$\begin{aligned} \theta\_i(t) &= \sum\_{l=1}^n [\frac{\sin(2\pi f\_0 l t)}{2\pi f\_0 l} s\_i^l - \frac{\cos(2\pi f\_0 l t)}{2\pi f\_0 l} c\_i^l] + \theta\_{0\_i} \\ \dot{\theta}\_i(t) &= \sum\_{l=1}^n [\cos(2\pi f\_0 l t) s\_i^l + \sin(2\pi f\_0 l t) c\_i^l] \\ \ddot{\theta}\_i(t) &= \sum\_{l=1}^n [-2\pi f\_0 l \sin(2\pi f\_0 l t) s\_i^l + 2\pi f\_0 l t \cos(2\pi f\_0 l t) c\_i^l]\_l \end{aligned} \tag{36}$$

where *θi*(*t*) is the *i*th joint angle trajectory of serial robots, *n* is the harmonics number, *f*<sup>0</sup> is the fundamental frequency, and *s<sup>l</sup> i* , *cl i* , *θ*0*<sup>i</sup>* are the trajectory parameters to be optimized. Instead of choosing the joint space states (*θ*, *θ*˙, *θ*¨) for serial robots, the pose in the operational space is used for the dynamic identification of parallel robots. A modified Finite Fourier series-based exciting trajectory for parallel robots is proposed as:

$$\chi\_i(t) = \sum\_{l=1}^{n} [\frac{\sin(2\pi\omega\_0 l t)}{2\pi\omega\_0 l} s\_i^l - \frac{\cos(2\pi\omega\_0 l t)}{2\pi\omega\_0 l} c\_i^l] + \chi\_{0\_i} \tag{37}$$

where *χi*(*t*) is the *i*th column of the pose trajectory. Therefore 2*n* + 1 parameters *δ<sup>i</sup>* = [*s*<sup>1</sup> *<sup>i</sup>* , *<sup>c</sup>*<sup>1</sup> *<sup>i</sup>* , ··· ,*s<sup>n</sup> <sup>i</sup>* , *<sup>c</sup><sup>n</sup> <sup>i</sup>* , *χ*0*<sup>i</sup>* ] *<sup>T</sup>* can be estimated by solving a nonlinear optimization problem. The singularity check should be implemented for each sampled pose. According to previous research work [31,41], the maximum wrench rotation range inside the singularity-free domain of the 6-RSS parallel robot is (−0.9948 rad, 0.9948 rad). The inverse kinematic model is used to map the poses into the joint space and to check if the joint angles stay inside the singularity-free domain. To obtain the operational space states in the regression matrix *H*, the time derivatives of the Euler-angle should be converted to the angular velocity and acceleration, as shown in Equations (2) and (38).

$$
\dot{\omega} = \begin{bmatrix}
\ddot{\alpha} + \dot{\gamma}\varsigma\beta + \dot{\beta}\dot{\gamma}\varsigma\beta \\
\alpha(\dot{\beta} - \dot{\alpha}\gamma\varsigma\beta) + \varsigma a(\dot{\beta}\dot{\gamma}\varsigma\beta - \dot{\beta}\dot{\alpha} - \dot{\gamma}\varsigma\beta) \\
\kappa(\dot{\beta} - \dot{\alpha}\gamma\varsigma\beta) + \epsilon a(\dot{\beta}\dot{\alpha} + \dot{\gamma}\varsigma\beta - \dot{\beta}\dot{\gamma}\varsigma\beta)
\end{bmatrix}.
\tag{38}
$$

As shown in Equation (26), the observability index of *H* should be maximized to achieve a good identification result for given the exciting trajectories. The observability index *Oin* used in [31] is chosen as the criteria. Therefore the optimal exciting trajectory can be obtained by solving the following nonlinear optimization problem:

$$\text{Maximize } O\_{in}(\delta) = \frac{\,^{\xi}\sqrt{\sigma\_1 \sigma\_2 \cdots \sigma\_\xi}}{\sqrt{m}} = \frac{\,^{\xi}\sqrt{\det(\sqrt{H^T H})}}{\sqrt{m}} \tag{39}$$

where the singular values of *H* are denoted by *σ*<sup>1</sup> ≥ *σ*<sup>2</sup> ≥···≥ *σς*, *m* is the number of sampled poses of the trajectory, *ς* is the number of dynamic parameters to be identified.

#### *3.4. The Procedure of Identification*

The whole procedure of the proposed closed-loop output-error identification method is given in Figure 4. Firstly, the dynamic model of the parallel robot is derived as Equations (17) and (25). Then the optimized exciting trajectory, *χd*(*t*), can be generated by using the method mentioned in previous subsection. By using *χ*(*t*) as reference input signal to the outer loop visual servoing control systems of both the real plant and the simulated model, the measured output pose *χm*(*t*) of the parallel robot by the optical CMM sensor is compared with that of the simulated model. The identification of the parameters **Λ** is carried out by solving the nonlinear optimization problem. Lastly, the identified model can be validated by feeding several testing trajectories to the systems.

**Figure 4.** Sketch of the identification procedure.

#### **4. Simulation and Experiment Results**

In this section, the dynamic model is verified by the simulation using Matlab/SimMechanics. In addition, the closed-loop output-error identification is carried out on a 6-RSS parallel robot. An outer loop visual servoing controller is implemented on the real plant and the simulated model individually. The C-track 780 from Creaform Inc. is adopted to measure the pose of the end-effector of parallel robot.

#### *4.1. Model Validation*

The analytical dynamic model is rather complex and it is a non-trival task to ensure that the code of dynamic model is mistake free. A simulation verification method is used to verify the built mathematical dynamic model. A mechanical model of the 6-RSS parallel robot is built by using the Multibody SimMechanics Toolbox of Matlab/Simulink. The SimMechanics model is constructed by choosing the parts from SimMechanics library as shown in Figure 5. The coordinates frames and physical parameters of rigid body blocks can be specified in the setting option. The revolute and spherical joints are used to connect the rigid bodies. The body sensor part can provide the position

and orientation of the coordinate frames. The entire model is directly actuated by the torque of the motors, and 3D animation shown in Figure 6 can be provided by SimMechanics. It should be noted that SimMechanics model can only simulate the forward dynamics and cannot be used for controller design. However, it is relatively easy and intuitive to build SimMechanics model with high fidelity [42]. To validate the mathematical dynamic model Equation (17), the explicit form of the forward dynamic model obtained by Equation (40) is employed to compare with the SimMechanics model.

$$
\dot{\boldsymbol{\sigma}}\_{\rm E} = M(\boldsymbol{\chi}\_{\rm E})^{-1} (\boldsymbol{\pi}\_{\rm \mathcal{S}} - \mathbb{C}(\boldsymbol{\chi}\_{\rm E}, \boldsymbol{\upsilon}\_{\rm E}) \boldsymbol{\upsilon}\_{\rm E} - G(\boldsymbol{\chi}\_{\rm E}) - \boldsymbol{\pi}\_{\rm f}), \tag{40}
$$

The mathematical dynamic model is built by using S-function of Simulink. The initial dynamic model parameters of both SimMechanics and mathematical models are derived from manufacturer specifications and are given in Table 1.

**Figure 5.** Mechanical model of 6-RSS parallel robot built by SimMechanics.

**Figure 6.** 3D animation of 6-RSS parallel robot. https://youtu.be/HXtCvgkn2jw.


**Table 1.** Initial dynamic model parameters of 6-RSS parallel robot.

As shown in Figure 7, a simple PID controller is used to stabilize both mathematical and SimMechanic models with the same PID gains. The exciting trajectory *χd***(***t***)** derived from Equation (36) is used as the reference input signal. The outputs of the SimMechanics and mathematical dynamic model (Equation (40)) are *χ*\_*s***(***t***)** and *χ*\_*m***(***t***)** respectively. The difference between *χ*\_*s***(***t***)** and *χ*\_*m***(***t***)** is shown in Figure 8. The maximum position and angle errors are around 1.25 mm and 3.25 × <sup>10</sup>−<sup>3</sup> rad, which occur at the beginning of the simulation, and are often caused by the kinematic error. The largest steady-state errors are about 0.1 mm in the position and 0.25 × <sup>10</sup>−<sup>3</sup> rad in the angle, which can prove the correctness of the mathematical model. The validated mathematical model can be used in the simulation part for the subsequent identification.

**Figure 7.** Dynamic model verification block diagram of 6-RSS parallel robot.

**Figure 8.** Simulation results of SimMechanics model. (**a**) Position error; (**b**) Angle error.

#### *4.2. Identification Experiment*

The experiment setup is shown in Figure 9. The 6-RSS parallel robot with 6 built-in joint controllers is provided by Servo & Simulation Inc. (Sanford, FL, USA). The built-in controllers communicate with the robot control computer through two Quanser MultiQ-PCI (Sensoray Model 626) data acquisition cards provided by Quanser Inc. (Markham, ON, Canada). Quanser's QUARC*TM* software is running on the robot control computer with Windows 7.0 32-bit operating system and Intel Core Processor i7-3770 3.4 GHz. QUARC*TM* software is capable of generating real-time application though Simulink based controllers and implementing the application in real time on the Windows target. The C-track 780 provided by Creaform Inc. (Levis, QC, Canada) is used to obtain the image data of the reflectors attached on the robot. The reflectors provided by Creaform Inc. are magnetic stickers which are easily fixed on the robots and are used as the feature points. In another Windows 7.0 64-bit computer with Intel Xeon Processor E5-1650 v3 3.5 GHz and NVIDIA Quadro K2200 (Santa Clara, CA, USA) professional graphics board, Vxelements software is used to process the image data and transmit the pose of the end-effector to the robot control computer.

**Figure 9.** Architecture of the Experiment System.

As shown in Figure 10, the reflectors are stuck on the surface of the moving platform. At least three non-collinear points on each plane of Plane A, Plane B and Plane C are employed to build up the equations of planes based on Cramer's rule. Then the intersection lines and points of three planes can be used to define the *x* direction of *ΣE*, and *z* direction is aligned with the norm of Plane A. The origin point of *ΣE* is derived from the intersection point of *l*<sup>1</sup> and *l*2. Then, the obtained *ΣE* in the optical CMM sensor frame is directly used as the target frame. The base frame of the parallel robot is defined by following the similar procedure.

To eliminate the high frequency noise of the pose measurement from the optical CMM, measurement data is filtered by the zero-phase forward and reverse 8th order Butterworth filter with the cut-off frequency 60 Hz. The filtering process is carried out by the Zero-phase digital filtering function of Maltab, *filtfilt*.

**Figure 10.** Measurement of *ΣE*.

The optimization of the exciting trajectory is carried out by using the *GA* function of the Optimization Toolbox of Matlab. The GA algorithm uses binary code to represent the harmonic parameters. The fundamental frequency *f*<sup>0</sup> is selected as 0.1 Hz and the harmonics number *n* is chosen as 5. By taking the observation index as the fitness function, the binary code is updated to maximize the fitness value in each step. The starting value of the observation index *Oin* is 0.38 and the stop criteria is set as 10−10. The algorithm stops after 324 iterations with the maximum *Oin* 1.345. The derived optimal exciting trajectory is given in Figure 11.

**Figure 11.** Optimal exciting trajectory. (**a**) Positional trajectory; (**b**) Angular trajectory.

The identification procedure is implemented off-line through simulation illustrated in Section 3.2. The optimization procedure is carried out by using optimization functions in Matlab R2016a. The minimal performance requirements of the computation platform are given as: 4 cores Inter or AMD Processor; 6 GB disk space; 4 GB RAM. The same outer loop visual servoing controller, Equation (30), is employed in the simulation. The gains of the visual servoing controller are obtained through trial and error in the extensive experimental tests. The well-tuned gains are given as *kp* = 0.3, *kd* = 0.001, *ki* = 2.4. and the built-in joint PID controllers, Equation (31), are also implemented in the simulation. The dynamic parameters derived from manufacturer specifications are used as initial values, and the initial values of the PID gains in Equation (31) are obtained through trial and error based on the simulation model. During the tuning, in the simulation system, the inertial and friction parameters are set as the values based on the manufacturer specifications, and the visual servoing controller gains are set as the same values of the controller of the real system. The identified parameters, given in Table 2, are derived by using *lsqnonlin* function of the Optimization Toolbox

of Matlab after 8 iterations. A total of 50 out of 67 parameters are identified, and are used in the simulation model, Equation (40), to capture the dynamic characteristic of the parallel robot. The other parameters either do not contribute to or have slight impact on the dynamics of the parallel robot. Those parameters can be eliminated by using QR decomposition on the regression matrix *H* [23,43]. If any diagonal elements of *R* are smaller than the pre-defined small number, i.e., *Rii* < *ε*, where *ε* is chosen as 10−<sup>3</sup> in the paper, the corresponding columns of the regression matrix *H* are deleted. By doing so, the matrix *H* is better conditioned and the identification procedure is sped up.

Then, the pose trajectories are generated by using the identified parameters in the simulation, and are compared with the pose measurement, as shown in Figure 12. Table 3 shows the root-mean-square (RMS) levels of the pose trajectory errors.


**Table 2.** Identified parameters of 6-RSS parallel robot.

**Figure 12.** *Cont*.

**Figure 12.** The pose trajectories of the parallel robot: the measurement of the real plant (**black dot**), the output of the simulation with initial parameters (**green line**), the output of the simulation with identified parameters (**blue line**), (**a**) Along X Direction; (**b**) Along Y Direction; (**c**) Along Z Direction; (**d**) Around *α* Axis; (**e**) Around *β* Axis; (**f**) Around *γ* Axis.


**Table 3.** The RMS levels of the pose trajectory errors.

#### *4.3. Identified Results Validation*

To verify the identified parameters, ten more trajectories are generated according to Equation (36) with random harmonic parameters under the singularity constraint. The generated trajectories are used as desired trajectories, and are fed to the parallel robot and the identified model in the simulation respectively. The RMS levels of the pose trajectory errors are given in Table 4, and the measurement and the simulated pose trajectories are given in Figure 13 according to the 1st desired trajectory. The RMS of the position and orientation errors for all ten trajectories are below 0.8 mm and 1.4 × <sup>10</sup>−<sup>3</sup> rad respectively, which validate the identified results of previous subsection. In addition, the proposed identification procedure is implemented based on the ten trajectories to analyze the statistic property of the identification results. After deriving ten more groups of identified parameters, the variation measure of the identification results are given in Table 5. The highest relative variation of the parameter is below 25%, which is acceptable. It has been stated that less than 30 percent in the variation measure of the parameters gives a good match to the real system [44].


**Table 4.** The RMS levels of the ten validation trajectory errors.

**Figure 13.** The pose trajectories of the parallel robot: the measurement of the real plant (**black dot**), the output of the simulation with identified parameters (**blue line**), (**a**) Along X Direction; (**b**) Along Y Direction; (**c**) Along Z Direction;(**d**) Around *α* Axis; (**e**) Around *β* Axis; (**f**) Around *γ* Axis.


**Table 5.** Variation measure of the identification result.

Therefore by using the proposed visual closed-loop output-error identification method, the identified dynamic model can approximate the real plant with acceptable accuracy.

#### **5. Conclusions and Further Works**

In this paper, a visual closed-loop output-error identification method based on an optical CMM sensor for parallel robots is proposed. An outer loop visual servoing controller is employed in both the real plant and the simulation model to stabilize the two systems. The benefits of the proposed method are summarized as follows: elimination of the need for the joint and torque measurements, the exact knowledge of the built-in joint controller of the industrial robots, and the time-consuming forward kinematics calculation. The correctness and accuracy of the built dynamic model are validated by the Matlab/SimMechanics simulation. The experimental test results show that the identified dynamic model can capture the dynamics of the real parallel robot with satisfactory accuracy. The proposed method can be easily applied to other types of industrial parallel robots with unknown PID built-in controller or its variant, such as 6 DOF Stewart platforms, 6 UPS and 6 RUS parallel robots etc. The complexity of those dynamic models is similar to that of the 6-RSS parallel robot. Since the analytical solution of the forward kinematics of those 6 DOF parallel robots does not exist, the proposed visual identification method does not need the forward kinematic model and hence has a lot of advantages. The proposed identification method can also be applied to parallel robots with less DOF than 6 DOF. Taking the advantages of the visual sensor, the dynamic model can be identified for the visual servoing purpose. In the future, the advanced model-based visual servoing control method will be further studied to improve the tracking performance of parallel robots based on the identification results.

**Author Contributions:** The authors' individual contributions are provided as follow: conceptualization, P.L., A.G., W.X. and W.T.; methodology, P.L., A.G., W.X. and W.T.; software, P.L.; validation, P.L., A.G., and W.X.; resources, W.X.; writing–original draft preparation, P.L.; writing–review and editing, P.L., A.G., W.X. and W.T.; supervision, W.X. and W.T.; project administration, W.X.; funding acquisition, W.X.

**Funding:** This research was funded by Natural Sciences and Engineering Research Council of Canada grant number N00892.

**Conflicts of Interest:** The authors declare no conflict of interest. The funder had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **Appendix A**

The derivative of the wrench Jacobian, ˙ *Jai* , is given as follows:

$$f\_{a i\_i} = \begin{bmatrix} f\_{a u\_i} \\ \mathfrak{sl}\_{a d\_i} \end{bmatrix} \tag{A1}$$

where

$$\|I\|\_{\text{ad}\_i} = -\frac{\dot{m}}{m^2} \begin{bmatrix} I\_i^T & (\boldsymbol{a}\_i \times I\_i)^T \end{bmatrix} + \frac{1}{m} \begin{bmatrix} (\boldsymbol{\omega}\_{2l} \times I\_l)^T & ([\boldsymbol{a}\_l]\_X[\boldsymbol{\omega}\_2]\_X I\_l)^T - ([I\_i]\_X[\boldsymbol{\omega}]\_X \boldsymbol{a}\_i)^T \end{bmatrix},\tag{A2}$$

and

$$\begin{aligned} m &= (\boldsymbol{w}\_{l} \times l\_{l}) \cdot \ $\\ \dot{m} &= ([\boldsymbol{w}\_{l}]\_{X}[\boldsymbol{\omega}\_{2}]\_{X}l\_{l} - [l\_{l}]\_{X}[\boldsymbol{\omega}\_{1}]\_{X}\boldsymbol{w}\_{l}) \cdot \$ \\ \dot{j}\_{d\boldsymbol{w}\_{i}} &= ([\boldsymbol{\mathfrak{s}}]\_{X}[\boldsymbol{\omega}\_{1}]\_{X}\boldsymbol{c}\_{\boldsymbol{w}\_{i}})l\_{d\boldsymbol{d}\_{i}} + ([\boldsymbol{\mathfrak{s}}]\_{X}\boldsymbol{c}\_{\boldsymbol{w}\_{i}})\dot{l}\_{d\boldsymbol{d}\_{i}} \end{aligned} \tag{A3}$$

In addition, the link Jacobian ˙ *Jbi* is obtained by:

$$
\dot{J}\_{b\_i} = \begin{bmatrix}
\dot{J}\_{bu\_i} \\
\dot{J}\_{bd\_i}
\end{bmatrix} \tag{A4}
$$

in which

$$\begin{split} \|I\_{hd\_{i}} - \frac{1}{||I\_{l}||^{2}} \{ [\omega\_{2} \times I\_{l}]\_{X} [\![\![w\_{l}]\!]\_{X} \\$\!]\_{ad\_{i}} + \\ \qquad \qquad \qquad \left[ \begin{array}{l} [\![\omega\_{2} \times I\_{l}]\_{X} \quad [I\_{l}]\_{X} [\![\omega \times a\_{l}]\!]\_{X} - [\![\omega\_{2} \times I\_{l}]\_{X} [a\_{l}]\_{X}] \end{array} \right] \\ \qquad \qquad + [I\_{l}]\_{X} [\![\omega\_{1} \times \omega\_{l}]\_{X} \\$]\_{ad\_{i}} + [I\_{l}]\_{X} [\![\omega\_{l}]\_{X} \\$]\_{ad\_{i}} \} , \end{split} \tag{A5}$$

and

$$f\_{\rm bu\_i} = -[\boldsymbol{\omega}\_1 \times \boldsymbol{\omega}\_i]\_X \ $f\_{\rm ad\_i} - [\boldsymbol{\omega}\_1]\_X \$ f\_{\rm ad\_i} - [\boldsymbol{\omega}\_2 \times I\_i]\_X f\_{\rm bd\_i} - [\boldsymbol{\sigma}\_{l\_i}]\_X f\_{\rm bd\_i} \tag{A6}$$

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Optimal Image-Based Guidance of Mobile Manipulators Using Direct Visual Servoing**

#### **Álvaro Belmonte, José L. Ramón, Jorge Pomares \*, Gabriel J. Garcia and Carlos A. Jara**

Department of Physics, Systems Engineering and Signal Theory, University of Alicante, Alicante 03690, Spain; bb35@alu.ua.es (Á.B.); jl.ramon@ua.es (J.L.R.); gjgg@ua.es (G.J.G.); carlos.jara@ua.es (C.A.J.)

**\*** Correspondence: jpomares@ua.es; Tel.: +34-965903400

Received: 20 February 2019; Accepted: 23 March 2019; Published: 27 March 2019

**Abstract:** This paper presents a direct image-based controller to perform the guidance of a mobile manipulator using image-based control. An eye-in-hand camera is employed to perform the guidance of a mobile differential platform with a seven degrees-of-freedom robot arm. The presented approach is based on an optimal control framework and it is employed to control mobile manipulators during the tracking of image trajectories taking into account robot dynamics. The direct approach allows us to take both the manipulator and base dynamics into account. The proposed image-based controllers consider the optimization of the motor signals sent to the mobile manipulator during the tracking of image trajectories by minimizing the control force and torque. As the results show, the proposed direct visual servoing system uses the eye-in-hand camera images for concurrently controlling both the base platform and robot arm. The use of the optimal framework allows us to derive different visual controllers with different dynamical behaviors during the tracking of image trajectories.

**Keywords:** visual servoing; optimal control; mobile manipulator; dynamic control

#### **1. Introduction**

This paper proposes a direct image-based visual servoing system for the guidance of a mobile manipulator. As described throughout the paper, the visual information is employed to guide both the robot manipulator and the base platform for the tracking of image trajectories. A common classification for visual servoing systems based on position and image is established [1]. Within the field of position-based visual servoing systems, it is worth mentioning some classic works [2] as well as some more recent ones such as the works of Huang et al. [3]. Besides these two works, it is worth mentioning other remarkable ones such as the works of Park et al. [4], which describes the use of vision for a robot's positioning with regard to an object whose position was previously determined. Position-based systems are those where the reference of the control system is in a desired tridimensional position and orientation. In contrast to position-based approaches, image-based visual servoing systems determine the reference in the image space. This strategy is the one used in this paper. Although the literature regarding image-based visual servoing is quite extensive, works, such as the presented in [1,5], perform a review of the classical approaches and problems related to indirect image-based controllers. One of the elements which most affects the behavior of image-based visual servoing controllers is the interaction matrix, which is employed by the controller and depends on several parameters. The interaction matrix *J<sup>s</sup>* is employed by the image-based visual servoing controllers for relating velocities in a tridimensional point with velocities of a corresponding point in the image space. Deep learning approaches are currently in development for image-based control of robot manipulators (see e.g. [6,7]). When this kind of approaches has to be applied to a mobile manipulator, it is necessary to consider not only the manipulator but also the mobile platform, having a redundant system. When dealing with stability problems, it is worth mentioning those that stem

from dealing with a redundant robot [8]. In this case, it is not only necessary to correctly perform the end effector guidance, but also to solve the redundancy by establishing the most appropriate joint configuration for each situation.

Another useful classification depending on the control strategy of visual servoing systems divides them in direct and indirect visual servoing. In indirect visual servoing, the control action is specified as velocities to be applied at the robot's end effector and does not take robot dynamics into account. In this case, the internal robot controller is used and translates these velocities into torques and forces for the joints. When applying an indirect controller to a task only kinematic aspects of the robot are taken into account. However, in direct controllers, the control action is usually given as forces and torques applied directly to the manipulator joints. By using this approach, the internal feedback control loop for the servomotors is removed, and the visual information is used to directly generate the currents or torques to be applied to the motors of the robot's every joint. Some direct visual control systems for redundant robots with chaos compensation have been developed to avoid robotic chaotic behaviors with new architectures to improve systems maintainability and traceability [9,10]. Within the field of direct image-based visual servoing systems it is worth mentioning works such as [11] that considers a FPGA (Field Programmable Gate Arrays) implementation, or [12] that extends the previous direct visual servoing for the optimal control of robot manipulators.

Within the field of indirect visual servoing, several approaches can be found for the guidance of mobile robots or mobile manipulators. In [13], a homography-based 2D approach for visual control of a mobile manipulator that does not need any measure of the 3D structure of the observed target has been proposed. A task-space sensing and control system designed to control the end effector motion of a mobile manipulator in the presence of dynamic and unknown base motion is proposed in [14]. In [15], an image-based approach with redundancy resolution is proposed for the guidance of a mobile manipulator. An indirect position-based visual control approach is described in [16] for the guidance of a mobile manipulator. In [17], a control scheme that uses image-based visual servoing was developed along with a functional mobile manipulator platform, where a hybrid camera configuration composed of monocular and stereo vision cameras was integrated into the system.

In contrast with previous approaches, the one proposed in this paper employs a direct visual servoing system for guiding mobile manipulators. In this case, the controller directly generates the torques, forces, and moments to be applied to the manipulator and base platform taking into account the robot dynamics. In the previous indirect approaches, the controller assumes that the guided device is a perfect positioning system, and therefore its dynamics is not considered. However, the use of a direct visual servo approach allows for the manipulator and base dynamics to be taken into account. The result is a faster and more accurate control that reacts more quickly to abrupt changes in the image trajectories. Additionally, an optimal control approach is used for the definition of the proposed controllers. This control approach considers the optimization of the motor signals sent to the mobile manipulator during the tracking of image trajectories. Using this optimal approach, different visual controllers with different dynamic properties can be derived.

The paper is divided into the following sections. First, Section 2 describes the robotic system and main coordinate frames. The system kinematics is described in Section 3 and the dynamics in Section 4. Section 5 describes the optimal controller and the visual servoing of the mobile manipulator. Finally, in Section 6, the main results obtained in the application of the proposed optimal controller are described.

#### **2. System Architecture and Nomenclature**

Figure 1 represents the main components of the mobile manipulator (TIAGo robot from PAL Robotics). The robot is composed of a mobile differential platform with a 7-DOF (degrees-of-freedom) arm, a RGB-D camera which is not employed in this paper, and an additional eye-in-hand camera. Frame F*<sup>p</sup>* is attached to the mobile robot platform and F*<sup>m</sup>* is the manipulator arm coordinate system attached to the base of the robot arm. The origin of F*<sup>m</sup>* is translated with respect to the frame F*<sup>p</sup>* a vector (*xpm*, *ypm*, *zpm*). The world coordinate frame is called F*o*. The end effector frame F*<sup>e</sup>* is attached

to the manipulator end effector, and frame F*<sup>c</sup>* is the eye-in-hand camera frame. In this paper we consider an eye-in-hand camera system where F*<sup>c</sup>* = F*e*. The orientation of the mobile platform is considered as α, i.e., the angle between de X-axis of the frame F*<sup>p</sup>* and the X-axis of the frame F*<sup>o</sup>* and the origin of F*<sup>p</sup>* is at the position (*xop*, *yop*, *zop*) with respect to F*o*. From the manipulator kinematics we can obtain the position of F*<sup>e</sup>* with respect to F*<sup>m</sup>* as (*xme*, *yme*, *zme*) and (*φxme*, *φyme*, *φzme*), respectively.

**Figure 1.** System components.

Regarding the visual system, an eye-in-hand camera observes a set of four visual features located on the surface of the tracked object. As Figure 1 shows, a black pattern with four white points is located in the tracked object. Therefore, the computer vision extracts four visual feature points, *s* = *<sup>f</sup>*1,*f*2,*f*3,*f*4,<sup>T</sup> that correspond to the center of these white points. Figure 2 shows the different modules that compose the computer vision system. The first step preprocesses the image in order to obtain a single format of the image no matter what camera is employed (e.g., image conversion from RGB to gray scale, image resizing, noise filtering, etc.). In the second stage, the image is binarized through a thresholding process. The result is a binary image with all the essential data of the pattern. The third step is an erosion operation. This is a morphological operation on the image that removes the pixels located on the edge of any object in the image. This step is mainly used to reduce the noise of the image. The fourth phase is the most important stage of the vision module: blob detection. The purpose of this module is to detect the connected pixels which form an object in the image and then assign a label to them. Finally, the vision module has to provide the pixel coordinates of the center of gravity of the visual features. The fifth submodule (area and centroid computation) accomplishes this task. In [12], a detailed description of the computer vision components required to determine the set of visual features is presented.

**Figure 2.** Computer vision components.

#### **3. System Kinematics**

The main components and coordinate frames of the robotic system are presented in Section 2. This section describes the kinematic formulation required for the definition of the proposed controllers. *qT <sup>m</sup>* ∈ <sup>m</sup> represent the generalized joint coordinates of the manipulator (in our case, m = 7) and *qT <sup>p</sup>* ∈ np are the generalized coordinates of the platform. The generalized coordinates of both manipulator and platform can be represented as *q* = *qT <sup>p</sup> <sup>q</sup><sup>T</sup> m* . In a general case, the computer vision system extracts k visual feature points from the observed object *s* = *f*1,*f*2, ..., *f<sup>k</sup>* T <sup>∈</sup> <sup>R</sup>*<sup>k</sup>* (k = 4 in the experiments, see Figure 1). Therefore, the image-based direct visual controller must perform the guidance of the mobile manipulator to track the desired trajectory in the image space, *s*d(*t*). The visual servoing control approach allows the control of the manipulator joints and base using k visual features, using an eye-in-hand configuration, where a camera is held by the end effector. Considering an image feature point, *fi*, , the relationship between velocities in the camera image space, . *fi* , and the velocity twist . *r*<sup>c</sup> = *v<sup>c</sup> ω<sup>c</sup> T* of the camera (expressed in its own frame), is given by:

$$\dot{f}\_i = \mathbf{J}\_{f\_i}(f\_i, Z\_i) \begin{bmatrix} \boldsymbol{\sigma}\_c \\ \boldsymbol{\omega}\_c \end{bmatrix} \tag{1}$$

where *Z* is the depth of the image feature and the value of *Jf<sup>i</sup>* can be found in [1]. From a kinematic point of view, the motor commands are considered as a set of platform command velocities, *<sup>u</sup><sup>p</sup>* <sup>∈</sup> <sup>R</sup>p, and manipulator command velocities, *<sup>u</sup><sup>m</sup>* <sup>∈</sup> <sup>R</sup>m, therefore

$$\boldsymbol{\mu} = \begin{bmatrix} \boldsymbol{\mu}\_p^T & \boldsymbol{\mu}\_m^T \end{bmatrix}^T \tag{2}$$

In the above equation the manipulator motor commands are the joint velocities, i.e., . *q<sup>m</sup>* = *um*, and from the kinematic model of the mobile platform the following relation can be considered:

$$
\dot{q}\_p = \psi \left( q\_p \right) u\_p \tag{3}
$$

where, usually, *u<sup>p</sup>* = *v ω T* is the platform linear and angular velocities and the function *ψ* spans the admissible velocity space at each mobile platform configuration. In this paper, an image-based approach will be employed for the guidance of the mobile manipulator, therefore, an image Jacobian, *Ji*, must be defined which relates the differential mapping between the motor commands *u* and the time derivative of the set of image features *s*:

$$
\dot{s} = f\_i u \tag{4}
$$

where *J<sup>i</sup>* will be defined throughout the paper as the product of the matrices *J<sup>s</sup>* and *Jc*, therefore, *<sup>J</sup><sup>i</sup>* <sup>=</sup> *<sup>J</sup>sJc*. The interaction matrix *<sup>J</sup><sup>s</sup>* <sup>∈</sup> <sup>R</sup>2k×<sup>6</sup> for the set of image features points, **<sup>s</sup>**, is the stack of k matrices *Jf i* for each feature:

$$\dot{\mathbf{s}} = \mathbf{J}\_s(\mathbf{s}, \mathbf{Z}) \begin{bmatrix} \mathbf{v}\_c \\ \boldsymbol{\omega}\_c \end{bmatrix} = \begin{bmatrix} J\_{f1}(\boldsymbol{f}\_{1'}, \boldsymbol{Z}\_1) \\ \vdots \\ \boldsymbol{f}\_{fk}(\boldsymbol{f}\_{k'}, \boldsymbol{Z}\_k) \end{bmatrix} \begin{bmatrix} \mathbf{v}\_c \\ \boldsymbol{\omega}\_c \end{bmatrix} \tag{5}$$

where *Z* = *Z*<sup>1</sup> ... *Zk T* <sup>∈</sup> <sup>R</sup><sup>k</sup> is a depth vector for each image feature. The matrix *<sup>J</sup><sup>c</sup>* relates the end effector velocity with respect to the camera coordinate frame *v<sup>c</sup> ω<sup>c</sup> T* and the motor command *<sup>u</sup>*:

$$
\begin{bmatrix} \boldsymbol{v}\_{\mathcal{E}} \\ \boldsymbol{\omega}\_{\mathcal{E}} \end{bmatrix} = \boldsymbol{J}\_{\mathcal{E}}(\boldsymbol{q})\boldsymbol{u} \tag{6}
$$

As previously described, the mobile manipulator is composed of a differential drive platform and a 7-DOF robot arm with an eye-in-hand camera. In order to derive the value of the matrix *J<sup>c</sup>* , first the position and orientation of the camera with respect to the world coordinate F*<sup>o</sup>* is defined as

$$\mathbf{r} = \begin{bmatrix} \ t^T(\mathbf{q}) & \boldsymbol{\Phi}^T(\mathbf{q}) \end{bmatrix}^T = \begin{bmatrix} \ t\_x & t\_y & t\_z & \boldsymbol{\Phi}\_x & \boldsymbol{\Phi}\_y & \boldsymbol{\Phi}\_z \end{bmatrix}^T \tag{7}$$

where *tT*(*q*) represents the position and *φT*(*q*) is the minimal parametrization of the camera orientation by Euler angles. From the robot kinematics and the system architecture presented in Section 2, it is possible to obtain

$$\mathbf{x}\_x = (\mathbf{x}\_{m\mathbf{e}} + \mathbf{x}\_{pm})\cos\mathfrak{a} - (y\_{m\mathbf{e}} + y\_{pm})\cos\mathfrak{a} + \mathbf{x}\_{\mathcal{O}\mathbf{p}} \tag{8}$$

$$t\_y = (x\_{m\mathfrak{a}} + x\_{pm})\sin\mathfrak{a} + (y\_{m\mathfrak{a}} + y\_{pm})\cos\mathfrak{a} + y\_{op} \tag{9}$$

$$t\_z = z\_{m\text{f}} + z\_{pm} \tag{10}$$

From the previous equations we can obtain the time derivatives of *<sup>r</sup>*, i.e., *v<sup>T</sup> ω<sup>T</sup> T* . First, the components of the translation velocity, . *t T* (*q*), can be defined as

$$\dot{\mathbf{i}}\_x = \dot{\mathbf{x}}\_{\text{m}t} \cos \mathbf{a} - \left(\mathbf{x}\_{\text{m}t} + \mathbf{x}\_{\text{pw}}\right) \sin(a) \omega - \dot{y}\_{\text{m}c} \sin \mathbf{a} - \left(y\_{\text{m}t} + y\_{\text{pw}}\right) \cos(a) \omega + \mathbf{v} \cos a \tag{11}$$

$$\dot{x}\_y = \dot{x}\_{\text{mc}} \sin \alpha + (\mathbf{x}\_{\text{mc}} + \mathbf{x}\_{\text{pm}}) \cos(\alpha)\omega + \dot{y}\_{\text{mc}} \cos \alpha - (y\_{\text{mc}} + y\_{\text{pm}}) \sin(\alpha)\omega + v \sin \alpha \tag{12}$$

$$
\dot{\mathfrak{f}}\_z = \dot{\mathfrak{z}}\_{mc} \tag{13}
$$

Therefore, the motion of the manipulator end depends not only on the joint positions and velocities of the manipulator arm but also on the motion of the mobile platform and its orientation *α*. Additionally, the components of the angular speed, . *φT* (*q*) of the robot can be computed as [15]

$$
\dot{\phi}\_x = \dot{\phi}\_{xm}\cos\alpha - \dot{\phi}\_{ym}\sin\alpha \tag{14}
$$

$$
\dot{\phi}\_y = \dot{\phi}\_{xm\epsilon} \sin \alpha - \dot{\phi}\_{ym\epsilon} \cos \alpha \tag{15}
$$

$$
\dot{\phi}\_z = \dot{\phi}\_{zm\epsilon} + \omega \tag{16}
$$

From Equations (11)–(16) the values of *v<sup>T</sup> ω<sup>T</sup> T* = . *tx* . *ty* . *tz* . *φx* . *φy* . *φz T* are obtained with respect to the motor commands. Using the rotation matrix *Rc*, which represents the orientation frame <sup>F</sup>*<sup>c</sup>* with respect to <sup>F</sup>*o*, the values of *v<sup>T</sup> ω<sup>T</sup> T* can be expressed with respect to the frame F*c*, *vT <sup>c</sup> ω<sup>T</sup> c T* . Appendix A describes the computation of the numerical values of matrix *Jc*.

Another useful kinematic relationship for the definition of the proposed controllers is the image acceleration or second time derivative of *s*. This relationship is obtained by the definition of *J<sup>i</sup>* (Equation (4)) and differentiating w.r.t.

$$
\ddot{s} = J\_i \dot{u} + \dot{J}\_i u \tag{17}
$$

#### **4. System Dynamics**

This section describes the main dynamic equations for the robotic system. The dynamic equations of motion of the mobile manipulator can be written as

$$
\begin{bmatrix} F\_b \\ \pi \end{bmatrix} = \begin{bmatrix} \mathbf{M}\_{bb} & \mathbf{M}\_{bm} \\ \mathbf{M}\_{bm}^T & \mathbf{M}\_{mm} \end{bmatrix} \begin{bmatrix} \dot{u}\_p \\ \ddot{q}\_m \end{bmatrix} + \begin{bmatrix} \mathbf{c}\_b \\ \mathbf{c}\_m \end{bmatrix} \tag{18}
$$

where .. *<sup>q</sup><sup>m</sup>* <sup>∈</sup><sup>m</sup> is the set of joint accelerations of the robot manipulator, . *u<sup>p</sup>* = . *<sup>v</sup>* . *ω T* is the platform linear and angular accelerations, *<sup>M</sup>bb* ∈ *p*×*<sup>p</sup>* is the inertia matrix of the platform, *<sup>M</sup>bm* ∈ *p*×*<sup>m</sup>* is the coupled inertia matrix of the base and the manipulator, and *<sup>M</sup>mm* ∈ *m*×*<sup>m</sup>* is the inertia matrix of the manipulator; *c<sup>b</sup>* and *c<sup>m</sup>* are a velocity/displacement-dependent, nonlinear terms for the base and manipulator, respectively, *F<sup>b</sup>* is the force and moment exerted on the base, and *τ* is the applied joint torque on the robot manipulator. From a dynamic point of view, the motion of the mobile manipulator is governed by the applied torques on the manipulator joints and by the force and moment exerted on the base. Hence, Equation (18) can be written in the following compact form.

$$
\widetilde{\mathbf{r}} = \breve{M}\dot{\mathbf{u}} + \breve{\mathbf{C}} \tag{19}
$$

$$
\text{where } \widetilde{\mathbf{r}} = \begin{bmatrix} \mathbf{F}\_{\mathbf{b}} \\ \mathbf{r} \end{bmatrix}, \breve{\mathbf{M}} = \begin{bmatrix} \mathbf{M}\_{\mathbf{b}b} & \mathbf{M}\_{\mathbf{b}m} \\ \mathbf{M}\_{\mathbf{b}m}^{\mathrm{T}} & \mathbf{M}\_{mm} \end{bmatrix}, \dot{\mathbf{u}} = \begin{bmatrix} \dot{\mathbf{u}}\_{p} \\ \ddot{\mathbf{q}}\_{m} \end{bmatrix}, \text{and } \breve{\mathbf{C}} = \begin{bmatrix} \mathbf{c}\_{\mathbf{b}} \\ \mathbf{c}\_{m} \end{bmatrix}.
$$

#### **5. Optimal Control of the Mobile Manipulator**

#### *5.1. Optimal Control Definition*

In complex high dimensional systems, several tasks can be accomplished simultaneously taking advantage of the robot redundancy. This paper considers a system with *μ* constrains that represents the task for the mobile manipulator to be executed. These constraints can be represented as

$$\mathbf{A} \cdot (t)\dot{\mathbf{u}} = \mathbf{b}(t) \tag{20}$$

where *<sup>A</sup>* (*t*) <sup>∈</sup>*μ*×(*p*+*m*) and *<sup>b</sup>* (*t*) <sup>∈</sup>*μ*×1. An advantage of this task formulation is that nonholonomic constraints can be treated in the same general way. In order to reduce the energy required for performing the visual servoing task, the proposed optimal controller is designed to minimize the control torque for the mobile manipulator, considering the following cost function.

$$\mathbf{M}(t) \; \; \;= \widetilde{\mathbf{r}}^{\mathrm{T}} \mathbf{W}(t) \widetilde{\mathbf{r}} \tag{21}$$

where **W**(*t*) is a time-dependent weight matrix.

Let us assume that the robot model presented in the previous section and a constraint formulation are given. In this case, the optimal controller has to guarantee that the task is perfectly achieved, i.e., that *A* (*t*) . *u* = *b*(*t*) holds at all times. Additionally, the minimization of the control command with respect to some given metric, i.e., **<sup>Ω</sup>**(*t*) <sup>=</sup> *<sup>τ</sup>*<sup>T</sup>**W**(*t*)*<sup>τ</sup>*, is intended at each instant of time. The solution to this pointwise optimal control problem [18] can be derived from a generalization of Gauss' principle, as originally suggested in [19]. It is also a generalization of the propositions in [20,21], where the following control action is obtained considering the general expression for the dynamic model expressed in Equation (19) (for the sake of clarity the time dependences are not indicated):

$$\widetilde{\pi} = \mathbf{W}^{-1/2} \left( A \widetilde{\mathbf{M}}^{-1} \mathbf{W}^{-1/2} \right)^{+} \cdot \left( \mathbf{b} + A \widetilde{\mathbf{M}}^{-1} \widetilde{\mathbf{C}} \right) \tag{22}$$

where the symbol + denotes the pseudo inverse of a general matrix. As it can be seen in Equation (22), the matrix **W** is an important factor in the control law that can be used to determine how the control efforts are distributed over the joints and the robot base.

#### *5.2. Optimal Direct Visual Servoing*

Once the optimal control framework is presented in Section 5.1, this section describes its application for the optimal control of the mobile platform. First, we define .. *s<sup>r</sup>* as the reference image accelerations that will be employed by the optimal controller. The value of these image accelerations can be obtained from Equation (17). This last expression can be expressed into the form of Equation (20) as

$$J\_i \dot{u} = \ddot{\mathbf{s}}\_r - \dot{\mathbf{J}}\_i u \tag{23}$$

This way, the visual servoing task is defined by the following relationships.

$$A = J\_i b = \bar{s}\_r - \dot{J}\_i u \tag{24}$$

Therefore, with the given definition of *A* and *b*, the optimal control will minimize the motor commands while performing the tracking in the image space. The final control law can be obtained replacing these variables into the function that minimizes the motor signals described by Equation (22):

$$\boldsymbol{\Upsilon} = \mathbf{W}^{-1/2} \left( \mathbf{J}\_i \boldsymbol{\tilde{M}}^{-1} \mathbf{W}^{-1/2} \right)^{+} \cdot \left( \ddot{\mathbf{s}}\_r - \dot{\mathbf{J}}\_i \boldsymbol{u} + \mathbf{J}\_i \boldsymbol{\tilde{M}}^{-1} \boldsymbol{\tilde{C}} \right) \tag{25}$$

As it can be seen, the visual controller represented by (25) implicitly depends on the weighting matrix **W,** and different values of this matrix can be used to coordinate the motion of the base and of the manipulator. The diagram of the considered control loop is depicted in Figure 3.

**Figure 3.** Diagram of the control scheme for the mobile manipulator robot.

Considering **<sup>W</sup>** <sup>=</sup> *<sup>M</sup>* <sup>−</sup><sup>2</sup> and replacing this value in the control framework expressed in Equation (25), the result yields

$$\widetilde{\mathbf{T}} = \, \breve{\mathbf{M}} \mathbf{J}\_i^+ \cdot \left( \ddot{\mathbf{s}}\_{\mathbf{r}} - \dot{\mathbf{J}}\_i \boldsymbol{u} + \mathbf{J}\_i \breve{\mathbf{M}}^{-1} \breve{\mathbf{C}} \right) \tag{26}$$

This controller represents a direct visual servo control using inversion of the dynamic model. Additionally, new controllers can be obtained using different values of **W**. For example, an important value for **<sup>W</sup>**, due to its physical interpretation, is **<sup>W</sup>** <sup>=</sup> *<sup>M</sup>* <sup>−</sup><sup>1</sup> , since it is consistent with the principle of d'Alembert:

$$\tilde{\pi} = \tilde{\boldsymbol{M}}^{1/2} \left( \boldsymbol{J}\_i \tilde{\boldsymbol{M}}^{-1} \tilde{\boldsymbol{M}}^{1/2} \right)^+ \cdot \left( \ddot{\mathbf{s}}\_r - \dot{\boldsymbol{J}}\_i \boldsymbol{u} + \boldsymbol{J}\_i \tilde{\boldsymbol{M}}^{-1} \tilde{\mathbf{C}} \right) \tag{27}$$

Now, the definition of the reference control, .. *s*r, is described considering the eye-in-hand camera system that extracts a set of k image feature points. The task description as a constraint is given by the following equation in the image space.

$$\left(\ddot{\mathbf{s}}\_d - \ddot{\mathbf{s}}\right) + \mathbf{K}\_D \left(\dot{\mathbf{s}}\_d - \dot{\mathbf{s}}\right) + \mathbf{K}\_P (\mathbf{s}\_d - \mathbf{s}) = 0\tag{28}$$

where .. *sd*, . *sd*, and *s<sup>d</sup>* are the desired image space accelerations, velocities, and positions, respectively. *K<sup>P</sup>* and *K<sup>D</sup>* are proportional and derivative gain matrices, respectively. This equation can be expressed in regards to image error in the following way.

$$
\ddot{\mathbf{s}}\_d + \mathbf{K}\_D \dot{\mathbf{e}}\_\mathbf{s} + \mathbf{K}\_P \mathbf{e}\_\mathbf{s} = \ddot{\mathbf{s}}\_r \tag{29}
$$

where *<sup>e</sup><sup>s</sup>* and . *es* are the image error and the time derivative of the image error, respectively. As stated, the variable .. *sr* denotes the reference image accelerations of the proposed image space based controller. Substituting this variable into the dynamic visual servo controller in Equation (25), the control law is set by the following relationship.

$$\widetilde{\mathbf{r}} = \mathbf{W}^{-1/2} \left( I\_i \widetilde{\mathbf{M}}^{-1} \mathbf{W}^{-1/2} \right)^+ \cdot \left( \ddot{\mathbf{s}}\_d + \mathbf{K}\_D \dot{\mathbf{e}}\_s + \mathbf{K}\_P \mathbf{e}\_s - \dot{I}\_i u + I\_i \widetilde{\mathbf{M}}^{-1} \widetilde{\mathbf{C}} \right) \tag{30}$$

In order to demonstrate the asymptotic tracking of the control law, some operations must be done. First, the closed-loop behavior is computed from Equation (19) as

$$\delta \mathbf{M} \dot{\mathbf{u}} + \tilde{\mathbf{C}} = \mathbf{W}^{-1/2} \left( I\_i \tilde{\mathbf{M}}^{-1} \mathbf{W}^{-1/2} \right)^{+} \cdot \left( \bar{\mathbf{s}}\_d + \mathbf{K}\_D \dot{\mathbf{e}}\_s + \mathbf{K}\_P \mathbf{e}\_s - \dot{\mathbf{J}}\_i \mu + I\_i \tilde{\mathbf{M}}^{-1} \tilde{\mathbf{C}} \right) \tag{31}$$

Equation (31) can be simplified by premultiplying its left and right sides by *<sup>J</sup>i<sup>M</sup>* <sup>−</sup><sup>1</sup> **W**−1/2 **W**1/2:

$$J\_i \dot{u} = \ddot{\mathbf{s}}\_d + \mathbb{K}\_D \dot{\mathbf{e}}\_\mathbf{s} + \mathbb{K}\_P \mathbf{e}\_\mathbf{s} - \dot{f}\_i u \tag{32}$$

Finally, using the relationship expressed in (23), it can be concluded that

$$
\ddot{\mathbf{e}}\_{\mathfrak{s}} = -\mathbf{K}\_{D}\dot{\mathbf{e}}\_{\mathfrak{s}} - \mathbf{K}\_{P}\mathbf{e}\_{\mathfrak{s}} \tag{33}
$$

Therefore, when *J<sup>i</sup>* is full-rank, an asymptotic tracking is achieved by the visual servo controller expressed in Equation (30) for the tracking of an image trajectory.

#### **6. Results**

This section presents the system behavior during the tracking of different trajectories. The robot is guided by an eye-in-hand camera system. The parameters of a Gigabit Ethernet TM6740GEV camera is considered, which acquires 200 images every second with a resolution of 1280 × 1024 pixels. The eye-in-hand camera extracts four visual features from the four corners of the target.

Three different experiments are presented in this section. In these experiments the value of **W** = *M*−<sup>1</sup> is considered. Figure 4a,b represents the initial configuration employed in all of the experiments. Figure 4a represents the target and the mobile manipulator with the eye-in-hand camera system. The initial position of the visual features is represented in Figure 4b. The desired trajectory employed in the first experiment is a linear trajectory that represents a lateral translation of 200 px. in x and y directions in the image from the initial position of the visual features. As it can be seen in Figure 4d, the image-based controller correctly carries out the tracking and a straight line is obtained in the image space. Figure 4c represents the final pose of the mobile manipulator. The second experiment, represented in Figure 4e,f, evaluates the tracking when a displacement in depth must be performed. As it can be seen in Figure 4e, a displacement in depth is carried out and, once the experiment ends, the mobile manipulator is closer to the target. Figure 4f represents the obtained image trajectory. This trajectory corresponds to the desired trajectory specified as a linear increase in depth from the initial image configuration. Finally, the experiment represented in Figure 4g,h requires a change in orientation of the eye-in-hand camera with respect to the observed target. To do this, the desired image trajectories for the visual features are represented by a straight line between the initial and the final ones defined as *<sup>s</sup>*<sup>∗</sup> <sup>=</sup> [*f*1*<sup>d</sup>* <sup>=</sup> (503, 584), *<sup>f</sup>*2*<sup>d</sup>* <sup>=</sup> (612, 600), *<sup>f</sup>*3*<sup>d</sup>* <sup>=</sup> (523, 472), *<sup>f</sup>*4*<sup>d</sup>* <sup>=</sup> (630, 477)]*<sup>T</sup>* px. As in the previous experiments, Figure 4g represents the final robot pose once the experiment ends and Figure 4h represents the image trajectory. As the experiments show, the proposed controller allows the concurrent guidance of both the manipulator and robotic platform during the tracking of the desired image trajectory.

**Figure 4.** Robot pose and image trajectories in the experiments. (**a**) 3D initial pose of the mobile manipulator and target. (**b**) Initial visual features extracted by the eye-in-hand camera. First experiment: (**c**) 3D final pose. (**d**) Image trajectory. Second experiment: (**e**) 3D final pose. (**f**) Image trajectory. Third experiment: (**g**) 3D Final pose. (**h**) Image trajectory.

Figure 5 represents the joint torques applied to the manipulator (m = 1 ... 7) and force and moments applied to the robot platform during the first and second experiments. As it can be seen in Figure 5a (first experiment), the torques, force, and moment remain low, and a smooth behavior is obtained during the tracking. For the second experiment, Figure 5b represent the joint torques applied to the manipulator and force and moment applied to the base platform during the robot motion in depth. It is worth noting that the reduction of the distance between the two bodies (target and robot) is obtained by modifying both the base platform pose and the robot manipulator joint configuration. Finally, as in the previous experiments, Figure 6 represents the joint torques applied to the manipulator and force and moment applied to the base platform during the third tracking experiment. The desired relative orientation is achieved by controlling both the orientation of the base platform and the robot manipulator joint configuration.

**Figure 5.** (**a**) Experiment 1: Joint torque and moment and force exerted on the base platform. (**b**) Experiment 2: Joint torque, moment, and force exerted on the base platform.

**Figure 6.** Experiment 3: Joint torque, moment, and force exerted on the base platform.

One of the main advantages of the proposed controller is the possibility to define new direct image-based controllers by modifying the value of the matrix **W** of the optimal controller (see Equation (25)). A different controller with different dynamic behavior during the tracking of image trajectories can be obtained by modifying this matrix. In order to evaluate the proposed framework, in Table 1 we represent the obtained image error during the tracking of a repetitive image trajectory defined by the following equation.

$$\mathbf{s}\_d = \begin{bmatrix} f\_{xd} \\ f\_{yd} \end{bmatrix} = \begin{bmatrix} 320 + 166 \cos(\omega t + \pi/4) \\ 265 + 160 \sin(\omega t + \pi/4) \end{bmatrix} \tag{34}$$

**Table 1.** Mean image error in pixels (mm) during the tracking of the image trajectory.


Table 1 represents the mean error in pixels and mm during the tracking of the image trajectory considering different tracking velocities. As it can be seen in this table, different tracking errors are obtained and the best behavior at high velocities is obtained considering **<sup>W</sup>** <sup>=</sup> *<sup>M</sup>* <sup>−</sup><sup>1</sup> . Additionally, the performance of the controllers is compared with respect to previous controllers only based on inverse dynamics [22] (indicated as "previous" in Table 1).

#### **7. Conclusions**

This paper presents a direct image-based visual servoing system for the guidance of mobile manipulators. This approach is based on an optimal framework that considers the optimization of the motor signals sent to the mobile manipulator during the tracking of image trajectories. The stability of the controller has been proved analytically, and different dynamical behavior can be obtained by using different values of the matrix **W**.

As a result, an adaptive and flexible tool has been obtained, allowing for concurrent control of both the base platform and manipulator during the tracking of trajectories. With the given definition of *A* and *b*, the optimal control can minimize the motor commands while performing the tracking of trajectories in the image space. The viability of the controller has been tested in a variety of test case experiments, including approaches to an object, as well as different positioning tasks that require modifying the position and/or the orientation of the mobile manipulator. In all such cases, the experiments and trajectories have been accomplished.

**Author Contributions:** Funding acquisition, C.A.J.; Methodology, J.L.R.; Software, Á.B.; Supervision, J.P.; Validation, G.J.G.

**Funding:** This research was supported by the Valencia Regional Government through project GV/2018/050. **Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**

As described throughout the paper, the motor commands are *u*= *uT <sup>p</sup> u<sup>T</sup> m T* = *<sup>v</sup> <sup>ω</sup>* . *q T m T* . First, the robot Jacobian is used to convert the joint velocities to Cartesian velocities:

$$
\begin{bmatrix}
v \\ \omega \\ \dot{x}\_{mc} \\ \dot{y}\_{mc} \\ \dot{z}\_{mc} \\ \dot{\Phi}\_{xmc} \\ \dot{\Phi}\_{ymc} \\ \dot{\Phi}\_{zmc}
\end{bmatrix} = \begin{bmatrix}
I\_{2\times2} & \mathbf{0}\_{2\times6} \\
\mathbf{0}\_{6\times2} & I\_{6\times6}
\end{bmatrix} \mu. \tag{A1}
$$

The relation between the previous Cartesian velocities and *v<sup>T</sup> ω<sup>T</sup> T* can be obtained by using Equations (11)–(16):

⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣

⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣

$$
\begin{bmatrix}
\dot{t}\_x\\ \dot{t}\_y\\ \dot{t}\_z\\ \dot{\Phi}\_x\\ \dot{\Phi}\_y\\ \dot{\Phi}\_z
\end{bmatrix} = \begin{bmatrix}
v\\ \omega\\ \dot{x}\_{mc}\\ \dot{y}\_{mc}\\ \dot{z}\_{mc}\\ \dot{\Phi}\_{xmc}\\ \dot{\Phi}\_{ymc}\\ \dot{\Phi}\_{zmc}
\end{bmatrix} \tag{A2}
$$

Where,

$$\begin{aligned} \mathbf{J}\_{c1} &= \begin{bmatrix} \cos a & -\mathbf{x}\_{m}\sin a - \mathbf{x}\_{pun}\sin a & \cos a & -\sin a & 0 & 0 & 0 & 0\\ \sin a & \mathbf{x}\_{m}\cos a + \mathbf{x}\_{pun}\cos a & \sin a & \cos a & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \end{bmatrix} \\ \mathbf{J}\_{c2} &= \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & \cos a & -\sin a & 0\\ 0 & 0 & 0 & 0 & 0 & \sin a & -\cos a & 0\\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix} \end{aligned} \tag{A3}$$

Finally, by using the rotation matrix *R<sup>c</sup>* which represents the orientation frame F*<sup>c</sup>* with respect to the fame <sup>F</sup>*o*, the values of *v<sup>T</sup> ω<sup>T</sup> T* can be expressed with respect to the frame F*c*, *vT <sup>c</sup> ω<sup>T</sup> c T* .

$$
\begin{bmatrix} \boldsymbol{\sigma}\_{\boldsymbol{\varepsilon}} \\ \boldsymbol{\omega}\_{\boldsymbol{\varepsilon}} \end{bmatrix} = \begin{bmatrix} \boldsymbol{R}\_{\boldsymbol{\varepsilon}}^{T} & \mathbf{0} \\ \mathbf{0} & \boldsymbol{R}\_{\boldsymbol{\varepsilon}}^{T} \end{bmatrix} \begin{bmatrix} \boldsymbol{\sigma} \\ \boldsymbol{\omega} \end{bmatrix} \tag{A4}
$$

Therefore, the value of *J<sup>c</sup>* is equal to

$$J\_c = \begin{bmatrix} R\_c^T & \mathbf{0} \\ \mathbf{0} & R\_c^T \end{bmatrix} \begin{bmatrix} J\_{c13\times 8} \\ J\_{c23\times 8} \end{bmatrix} \begin{bmatrix} I\_{2\times 2} & \mathbf{0}\_{2\times 2} \\ \mathbf{0}\_{6\times 2} & J\_{6\times 6} \end{bmatrix} \tag{A5}$$

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Picking Robot Visual Servo Control Based on Modified Fuzzy Neural Network Sliding Mode Algorithms**

#### **Wei Chen 1,2,\*, Tongqing Xu 1, Junjie Liu 1, Mo Wang <sup>2</sup> and Dean Zhao <sup>3</sup>**


Received: 23 April 2019; Accepted: 24 May 2019; Published: 29 May 2019

**Abstract:** Through an analysis of the kinematics and dynamics relations between the target positioning of manipulator joint angles of an apple-picking robot, the sliding-mode control (SMC) method is introduced into robot servo control according to the characteristics of servo control. However, the biggest problem of the sliding-mode variable structure control is chattering, and the speed, inertia, acceleration, switching surface, and other factors are also considered when approaching the sliding die surface. Meanwhile, neural network has the characteristics of approaching non-linear function and not depending on the mechanism model of the system. Therefore, the fuzzy neural network control algorithm can effectively solve the chattering problem caused by the variable structure of the sliding mode and improve the dynamic and static performances of the control system. The comparison experiment is carried out through the application of the PID algorithm, the sliding mode control algorithm, and the improved fuzzy neural network sliding mode control algorithm on the picking robot system in the laboratory environment. The result verified that the intelligent algorithm can reduce the complexity of parameter adjustments and improve the control accuracy to a certain extent.

**Keywords:** visual servoing; fuzzy neural network; sliding mode control; picking robot

#### **1. Introduction**

With the development of science and technology, robot technology has acquired unprecedented achievements. In agriculture, the development of fruit-picking robots has become a hotpot project due to the rising labor costs and the huge output of apples and citrus. The apple-picking robot is an integrated system that integrates environmental perception, dynamic decision-making, and behavior control; motion control is therein the most basic and important link. At present, there are two main kinds of visual servo control for picking robots: one is image-based visual servo control, and the other is a position-based visual servo control system [1]. The picking robot manipulator is a typical non-linear system, because it is composed of multiple joints and has time-varying uncertainty during the picking process. This indicates that traditional control methods would not meet the control requirements. Image feedback-based visual servo control is a real-time feedback control for the manipulator according to the visual acquisition image, and it also shows strong self-adaptability to the change of external environment and system parameters so that it reduces the complexity of manipulator in the control process.

At present, most visual servo control systems based on image feedback adopt traditional control schemes such as adaptive control [2], PID control [3], and sliding model control [4], etc., or intelligent control algorithms [5] such as neural network and fuzzy theory [6], etc. There are some defects in adopting a single control method. Adaptive control is a strategy to deal with structural uncertainties, which provides good performance to a certain extent [7]. PID control has better control performance for the position and joint angle parameters of the manipulator end-effector [8]. Sliding-mode control (SMC) has excellent robustness and good anti-jamming ability to the uncertainties of the system, especially in the control of non-linear systems [9]. Neural network (NN) has the ability to learn and approximate non-linear functions [10]. It is often used to compensate unstructured uncertainties in robot controller modeling. Fuzzy control [11] imitates human reasoning and decision-making processes on the basis of not depending on the precise mathematical model of the controlled object. Fuzzy logic is adopted in robot controllers to achieve good control performance under uncertain conditions. In the reference [12], a robust adaptive control method based on the dynamic structure of the fuzzy wavelet neural network is proposed, which is used to track the trajectory of the robot with uncertainties and disturbances by the adaptive sliding-mode control. Mehta and Ton et al. introduced a continuous terminal sliding mode visual servo controller to adjust the camera to the desired position, but this has many requirements for the camera, and requires the correction of the camera position [13]. In the field of robotic citrus picking, there is a non-linear robust image visual servo controller that can adjust the end effector to the fruit position under the condition of unknown fruit movement, but the speed of the picking process is slow, and is not conducive to agricultural application [14]. The fuzzy neural network (FNN) learning algorithm [15,16] is a superior system identification method that utilizes non-linear control. There is no need to get the prior knowledge about the uncertainty and a sufficient amount of observed data. Efe and Devesh et al. used the novel particle swarm optimization (PSO)–SMC hybrid learning algorithm to train the FNN effectively, but instead of using the visual servo control of picking robots, they realized the flattening and rotation control of the four-rotor configuration for agricultural purposes [17].

A great deal of research has been devoted to these intelligent algorithm scholars, but there are few applications in the field of agricultural picking robots. In this paper, a joint visual servo control algorithm based on an improved sliding mode and kinematics and dynamics equation is presented that appears to have good robustness in a non-linear system. However, sliding-mode control always requires detailed system information and corresponding uncertainty bounds to ensure stability [18]. Even though the stability requirements are met, there will still be problems such as jitter [19]. Therefore, fuzzy and adaptive neural network control algorithms are introduced and combined with sliding-mode control to suppress the sliding-mode jitter problem. However, the traditional neural network model is complex and computationally intensive, with a high probability of over-fitting. To prevent over-fitting, we increased the dropout rate for the dropout layer [20].

According to the characteristics of apple-picking robot motion control, the image-based robot visual servo control method is combined with the improved fuzzy neural network sliding mode control algorithm to make use of the non-linear and non-system-dependent mechanism model of the fuzzy neural network. Combined with sliding mode variable structure control, the sliding mode chattering is eliminated, and the stability of the control system is improved. In this paper, the algorithm is applied for the first time in the field of agricultural fruit-picking robots, which improves the extraction speed of apple-picking robots.

The research aim of this paper is mainly to verify the visual servo control of the sliding-mode control algorithm based on the improved fuzzy neural network through the apple-picking robot. The article is divided into three parts, Firstly, we establish the kinematics and dynamics equations of the picking robot. Secondly, visual positioning is introduced to calculate the position of the target point, and an image-based visual control algorithm is used. Finally, the sliding-mode control algorithm and the improved fuzzy neural network control algorithm are combined to carry out simulation analysis, and the algorithm is verified on the picking robot. It is concluded that the improved fuzzy neural network sliding mode control algorithm can improve the efficiency of the robot arm servo control and has higher stability.

#### **2. Picking Robot Structure and Model Establishment**

#### *2.1. Manipulator Structure Design*

Since the apple trees are relatively high and the fruits are disperse, the higher degree of freedom of the picking robot manipulator brings not only the kinematics algorithm complexity, but also the waste of joint freedom degree, which is disadvantageous to the practicality of agriculture. This reduces the requirements of the manipulator and minimizes the usage of freedom degrees. So, this paper introduces the self-designed five degrees of freedom (DOF) manipulator, which is shown in Figure 1. Figure 2 is a schematic view of the picking robot manipulator. The manipulator gripping device adopts a curved design in order to reduce the damage of the apple during the process of picking up the apple. Figure 3 shows a wheeled picking robot.

**Figure 1.** Manipulator structural drawing of picking robot (1: picking robot platform; 2: platform; 3: robot base; 4: lift platform; 5: servo motor; 6: robot lumbar; 7: lumbar rotating shaft; 8: main arm joint; 9: main arm; 10: small arm joint; 11: small arm; 12: rotating shaft; 13: grasp hand installation seat).

**Figure 2.** Schematic diagram of picking robot manipulator.

**Figure 3.** Picking robot physical map.

Fruit-picking robots are mainly composed of two parts. The first part is a chassis platform, and the second part is a manipulator. Therein, the five DOF manipulator is a PRRRP joint structure. The first degree of freedom is the lifting platform. Its main function is to lift the whole manipulator. The second degree of freedom is the waist rotation joint of the manipulator. The third degree of freedom is the swing axis of the arm, the fourth degree of freedom is the swing axis of the arm, and the fifth degree of freedom is the rotating axis of the manipulator with the function of adjusting the grasping posture according to the picking requirement.

#### *2.2. Kinematics Equation of Manipulator*

The manipulator is a complex system, since it includes various disciplines such as kinematics and dynamics. Kinematics analysis is used to model according to the relationship between the spatial posture and the angle of each joint of the manipulator. The designed structural kinematics parameters of the manipulator are shown in Table 1. In the table, α*<sup>i</sup>* is the torsion angle, *ai* is the length of the connecting rod, θ*<sup>i</sup>* is the joint angle, *di* is the offset of the connecting rod, σ = 0 is the rotating joint, σ = 1 is the direct acting joint, and *di* is the other constant. The use of the D–H rule to determine the coordinate position of the connecting rod is shown in Figure 4.


**Table 1.** Kinematic parameters of manipulators.

**Figure 4.** Manipulator D–H model.

The homogeneous coordinate transformation matrix is used in the above-mentioned establishment of connecting the rod D–H coordinate system and kinematics mode.

$$\begin{aligned} \;\_{1}^{0}T &= \begin{bmatrix} 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & d\_{1} \\ 0 & 0 & 0 & 1 \end{bmatrix} \\ \;\_{2}^{1}T &= \begin{bmatrix} \cos\theta\_{2} & 0 & -\sin\theta\_{2} & 0.133\cos\theta\_{2} \\ \sin\theta\_{2} & 0 & \cos\theta\_{2} & 0.133\sin\theta\_{2} \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \\ \;\_{3}^{2}T &= \begin{bmatrix} \cos\theta\_{3} & -\sin\theta\_{3} & 0 & \cos\theta\_{3} \\ 1 & \cos\theta\_{3} & 0 & \sin\theta\_{3} \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \\ \;\_{4}^{3}T &= \begin{bmatrix} \cos\theta\_{4} & 0 & -\sin\theta\_{4} & 0 \\ \sin\theta\_{4} & 0 & \cos\theta\_{4} & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \\ \;\_{5}^{4}T &= \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & d\_{5} \\ 0 & 0 & 0 & 1 \end{bmatrix} \end{aligned} \tag{1}$$

Therein, *<sup>i</sup>*−<sup>1</sup> *<sup>i</sup> T* presents the homogeneous matrix of translation and rotation from the *i* − 1 joint coordinate system to the *i* joint coordinate system. The total transformation matrix of the end effector relative to the robot coordinate system is:

$$\begin{aligned} \,^0\_5 T = \,^0\_1 T^1\_2 T^2\_3 T^3\_4 T^4\_5 T = \begin{bmatrix} n\_x & o\_x & a\_x & p\_x \\ n\_y & o\_y & a\_y & p\_y \\ n\_z & o\_z & a\_z & p\_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{2} \end{aligned} \tag{2}$$

The end effector position equation is:

$$\begin{aligned} p\_x &= d\mathfrak{s}\sin\theta\_2 \sin(\theta\_3 + \theta\_4) - \sin\theta\_2 \cos\theta\_3 - 0.133\sin\theta\_2\\ p\_y &= -d\mathfrak{s}\cos\theta\_2 \sin(\theta\_3 + \theta\_4) + \cos\theta\_2 \cos\theta\_3 + 0.133\cos\theta\_2\\ p\_z &= -d\mathfrak{s}\cos(\theta\_3 + \theta\_4) - \sin\theta\_3 + d\_1 \end{aligned} \tag{3}$$

The end effector posture equation is:

$$\begin{aligned} p\_x &= d\_5 \sin \theta\_2 \sin(\theta\_3 + \theta\_4) - \sin \theta\_2 \cos \theta\_3 - 0.133 \sin \theta\_2 \\ p\_y &= -d\_5 \cos \theta\_2 \sin(\theta\_3 + \theta\_4) + \cos \theta\_2 \cos \theta\_3 + 0.133 \cos \theta\_2 \\ p\_z &= -d\_5 \cos(\theta\_3 + \theta\_4) - \sin \theta\_3 + d\_1 \end{aligned} \tag{4}$$

#### *2.3. Dynamic Model of Manipulator*

According to the analysis of manipulator joints [21], the five degrees of freedom of the manipulator are respectively the lifting base, waist rotation axis, arm joint, and wrist rotation. The lifting base only works when the length of the manipulator is insufficient, while the waist rotation and wrist rotation axis are tentatively defined as not playing a key role in the visual recognition and picking process; therefore, the manipulator is simplified as a two-joint manipulator with only a big arm and a small arm. Meanwhile, a dynamic model is established, as shown in Figure 5.

$$\mathcal{C}M(q)\ddot{q} + \mathcal{C}\{q,\dot{q}\}\dot{q} + \mathcal{G}(q) = \pi \tag{5}$$

**Figure 5.** Simplified schematic diagram of two-joint manipulator.

Herein, *<sup>q</sup>* <sup>=</sup> *q***<sup>1</sup>** *q***<sup>2</sup>** *T* , τ = [τ**<sup>1</sup>** τ**2**] *<sup>T</sup>*. In the formula, *<sup>q</sup>*, . *q*, .. *q* are respectively the joint angle position, velocity, and acceleration vectors of the manipulator. *<sup>M</sup>*(*q*) <sup>∈</sup> *Rn*×*<sup>n</sup>* is a positive definite mass inertia matrix. *C q*, . *q* <sup>∈</sup> *<sup>R</sup>n*×*<sup>n</sup>* represents the centrifugal force and Coriolis force. *<sup>G</sup>*(*q*) <sup>∈</sup> *<sup>R</sup><sup>n</sup>* is gravity vector. <sup>τ</sup> <sup>∈</sup> *Rn* is the control moment input from each part of the joint. <sup>τ</sup>d∈*Rn* is the unknown external disturbances. Therein, Formula (1) satisfies the following characteristics:

(1) Positive definite symmetry: For any *q* matrix, *M*(*q*) is a positive definite symmetric. . .

(2) Boundedness: Matrix functions *M*(*q*) and *C q*, *q* are uniformly bounded for all *q*, *q*.

(3) Skew symmetry: The matrix function *M*(*q*) − 2*C q*, . *q* is a skew symmetric matrix, i.e., for any vector α, there is α*<sup>T</sup> M*(*q*) − 2*C q*, . *q* <sup>α</sup> <sup>=</sup> 0.

#### **3. Visual Servo Control**

The visual servo control structure can be divided into position-based control and image-based control according to different feedback signals. The feedback signal of position-based control is defined in coordinate form in three-dimensional task space. Its basic principle is to estimate the target in a three-dimensional Cartesian coordinate system by extracting the image features and combining the geometric model of the known target, and then planning the trajectory and calculating the control quantity according to the position comparison between the end effector and the target of the manipulator. For the image-based control, the control quantity is calculated directly from the error signal in the image, and the task is completed by driving the manipulator [22]. In this paper, the image-based control method is used to output control signals after image calculation, and the sliding-mode control parameters are adaptively adjusted by the fuzzy neural network algorithm so as to improve the robustness and stability of the manipulator grasping.

#### *3.1. Visual Target Location*

According to the visual servo control system working process of fruit picking robots, the relationship between the position information of the robot and its joints should be determined first. The mapping relationship between the image plane coordinates and the three-dimensional coordinates in the real space can be established by the perspective model of the small hole so that the coordinate position of the target can be calculated [23]. Due to the vision camera adopted in this paper being the target localization of the monocular robot, its camera is installed at the end of the manipulator. According to the principle of the optical imaging of the camera, the model of monocular vision localization is established as shown in Figure 6, in which the mapping relation of two-dimensional image coordinates and coordinates in three-dimensional space is established by the transformed geometric coordinates.

**Figure 6.** Visual model of camera imaging.

Here, we assume that the target fruit coordinate in the coordinate system of the manipulator is (*Xa*, *Ya*, *Za*) and its coordinate in the camera coordinate system is (*Xc*, *Yc*, *Zc*). The relationship between the 3D coordinates (*Xc*, *Yc*, *Zc*) of the target in the camera coordinate system and its two-dimensional (2D) coordinates (*X*1, *Y*1) on the camera imaging plane is obtained.

$$\begin{cases} X\_1 = \frac{X\_\xi f}{Z\_\xi} \\ \ Y\_1 = \frac{Y\_\xi f}{Z\_\xi} \end{cases} \tag{6}$$

In the formula, *f* is the camera focal length. The coordinate transformation formula between the camera coordinate system and manipulator coordinate system can be obtained by the geometric analytic method shown in Figure 2.

$$\begin{cases} X = \sqrt{\frac{l\_2^2 + l\_3^2 - 2L\_2 l\_3 \cos(\pi - \theta\_3) - \left(L\_2 \sin \theta\_2 + L\_3 \sin \left(\theta\_2 + \theta\_3\right)\right)^2}{1 + \tan^2 \theta\_1}} \\\ Y = X \tan \theta\_1 \\\ Z = L\_1 + L\_2 \sin \theta\_2 + L\_3 \sin \left(\theta\_2 + \theta\_3\right) \end{cases} \tag{7}$$

In the formula, *L*1—Waist length

*L*2—Big arm length *L*3—Small arm length θ1—one DOF joint angle of manipulator θ2—two DOF joint angle of manipulator θ3—three DOF joint angle of manipulator

From Formula (3), the relationship between the target fruit coordinate in the manipulator coordinate system (*Xa*, *Ya*, *Za*), and the fruit coordinate in the camera coordinate system (*Xc*, *Yc*, *Zc*) can be obtained:

$$\begin{cases} X\_c = X\_d - X \\ Y\_c = Y\_d - Y \\ Z\_c = Z\_d - Z \end{cases} \tag{8}$$

The relationship between the robot angle change of each joint and the plane coordinate change of the target fruit imaging is obtained.

#### *3.2. Visual Servo Control*

Since the fruit picking robot works in unknown and uncertain environments and the visual servo control system has the characteristics of time-varying, strong coupling, and non-linearity, therefore, the visual servo control system based on images is used due to its advantages regarding insensitivity to errors. The structure of the image-based visual servo system is shown in Figure 7.

**Figure 7.** Image-based visual servo system structure.

In this system, the picking robot captures the target image with a camera installed by eye-in-hand mode, and the position information of the target fruit is obtained by an image processing system. The angle of each joint of the robot can be obtained by coordinate transformation. In order to improve the motion performance of the manipulator, a visual servo algorithm based on fuzzy neural network adaptive sliding mode control is adopted to overcome the uncertainty of the system so as to improve the robustness of the servo control of the manipulator.

#### **4. Complex Fuzzy Neural Network Sliding Mode Control**

#### *4.1. Model Design of Sliding Mode Algorithms*

Sliding-mode [24] variable structure control mainly uses a high-speed control rate under non-linear conditions. It can reach the predetermined sliding mode control track in a short time and move along the track according to the variable dynamic control system. The sliding mode control scheme is shown in Figure 8.

**Figure 8.** Structural diagram of the sliding-mode control (SMC) controller.

First, we set the ideal angle instruction to *qd*(*t*), and track the error of the sliding-mode controller:

$$
\dot{e}(t) = q(t) - q\_d(t) \tag{9}
$$

Then, we design the sliding surface as:

$$s(t) = \ddot{\mathbf{e}} + \Lambda\_1 \dot{\mathbf{e}} + \Lambda\_2 \mathbf{e} \tag{10}$$

Therein, **Λ** = *diag*(λ*i***1**, λ*i***2**, ... , λ*in*), λ*ij* > **0**, *i* = **1**, **2**, *j* = **1**, **2**, ... , *n*. Then, we set the Lyapunov function to:

$$V = \frac{1}{2} \mathbf{s}^T \mathbf{M} \mathbf{s} \tag{11}$$

Therefore: .

$$\dot{V} = \frac{1}{2}\dot{\mathbf{s}}^T \mathbf{M} \mathbf{s} + \frac{1}{2}\mathbf{s}^T \dot{\mathbf{M}} \mathbf{s} + \frac{1}{2}\mathbf{s}^T \mathbf{M} \dot{\mathbf{s}} \tag{12}$$

Bring the characteristics of the manipulator dynamics equation into the formula:

$$\dot{V} = \mathbf{s}^T M \dot{\mathbf{s}} + \mathbf{s}^T \mathbf{C} \dot{\mathbf{s}} = \mathbf{s}^T \left( M \left( \dddot{\overline{\boldsymbol{\eta}}} - \dddot{\boldsymbol{\eta}}\_d + \Lambda\_1 \ddot{\boldsymbol{\varepsilon}} + \Lambda\_2 \dot{\boldsymbol{\varepsilon}} \right) + \mathbf{C} \left( \ddot{\overline{\boldsymbol{\varepsilon}}} + \Lambda\_1 \dot{\boldsymbol{\varepsilon}} + \Lambda\_2 \boldsymbol{\varepsilon} \right) \right) \tag{13}$$

According to:

$$
\pi + \Lambda \pi = \Lambda \mathcal{U} \mathcal{U}\_{\text{SMC}} \tag{14}
$$

Find: .

$$\dot{V} = \mathbf{s}^T (H + \Lambda U I\_{\rm SMC}) \tag{15}$$

Therein:

$$H = M\left(\Lambda\_1 \ddot{e} + \Lambda\_2 \dot{e} - \dddot{\overline{q}}\_d\right) + \mathbb{C}\left(\Lambda\_1 \dot{e} + \Lambda\_2 e - \dddot{\overline{q}}\_d\right) - \left(\dot{M} + \Lambda \Lambda \right) \ddot{q} \tag{16}$$

$$- \left(\dot{\mathbb{C}} + \Lambda \mathbb{C}\right) \dot{q} - \left(\dot{\mathbb{C}} + \Lambda \mathbb{G}\right)$$

Then, we design the SMC system as:

$$
\hbar L\_{\rm SMC} = -\Lambda^{-1} \left( H + \eta \text{sgn}(s) \right) \tag{17}
$$

Find: .

$$\dot{V} = -\eta \mathbf{s}^T \text{sgn}(\mathbf{s}) = -\eta \|\mathbf{s}\| \le 0 \tag{18}$$

According to the LaSalle invariant set theorem, when *<sup>t</sup>* → ∞, *<sup>s</sup>* <sup>→</sup> 0, then *<sup>q</sup>* <sup>→</sup> 0, . *<sup>q</sup>* <sup>→</sup> 0.

#### *4.2. Sliding-Mode Control Based on Complex Fuzzy Neural Network*

The fuzzy neural network is a kind of special neural network that is a hybrid intelligent system formed by the combination of neural network and fuzzy logic. It combines the two kinds of techniques by combining the human-like reasoning of fuzzy systems with the learning and connection structure of neural networks [25,26]. In a nutshell, the fuzzy neural network (FNN) assigns a conventional neural network to fuzzy input signals and fuzzy weights. Its function is to use the fuzzy neural network structure to implement fuzzy logic reasoning.

In order to improve the control accuracy of the two-joint manipulator, a fuzzy neural network control algorithm, the fuzzy neural network (FNN) is constructed to optimize the conventional sliding mode control because of the chattering phenomenon in the conventional sliding-mode control [27–30]. The system consists of an input layer, an adaptive fuzzy rule layer, a rule layer, and an output layer. The structure of the system is shown in Figure 9. The input is a sliding surface, and the output is a control quantity. The following is the main flow chart of the FNN algorithm:

**Figure 9.** Structure of fuzzy neural network system.

Step 1: Input the layer data initialization of the fuzzy neural network, and transfer the sliding-mode vector variable *si*(*i* = 1, 2, ... , *n*) to the next layer.

Step 2: The nodes in each layer represent the membership functions of the members, which use Gauss function as an input:

$$\mu\_{i}^{j}(s\_{i}) = \exp\left[-\frac{\left(s\_{i} - \mathbf{x}\_{i}^{j}\right)^{2}}{\left(\sigma\_{i}^{j}\right)^{2}}\right], (i = 1, 2, \dots, n; \ j = 1, 2, \dots, m) \tag{19}$$

In the formula, *n* represents the number of input variables, and *m* is the number of member functions. *x<sup>j</sup> <sup>i</sup>* and <sup>σ</sup>*<sup>j</sup> <sup>i</sup>* represent the j level node of the i input from *si* to the Gauss function.

Step 3: The rule layer represents the fuzzy reasoning mechanism of the neural network, and the data results are obtained by multiplying the input signals of the layer. The formula is as follows:

$$\mathcal{g}\_k = \prod\_{i=1}^n \omega\_{ji}^k \mu\_i^j(s\_i) \tag{20}$$

Herein, *g<sup>k</sup>* represents the Kth output of the rule layer.

Step 4: The fourth layer is the output layer, in which each node represents an output variable. It integrates the output into the sum of all the input signals. The execution output of this layer deblurring is as follows:

$$y\_0 = \sum\_{k=1}^{N\_y} \omega\_k^0 g\_k \tag{21}$$

By adding FNN online cyclic training, inheriting the traditional sliding mode control system, and relaxing the requirement of system information, the SMC jitter problem can be eliminated. However, the traditional neural network training method mentioned above makes the learning speed of the algorithm slow and may fall into local minimum problem, etc. Therefore, a dropout layer is added on this basis to prevent over-fitting by randomly hiding layer nodes. The improved fuzzy neural network is expressed in vector form:

$$\mathbf{y} = \mathcal{U}\_{\text{NFNN}}(\mathbf{e}\_{\prime}\mathbf{x}\_{\prime}\sigma\_{\prime}\mathcal{W}\_{\prime}l\_{\prime}\mathbf{p}) = \mathcal{W}\mathbf{g} \tag{22}$$

In the formula, *W* is the weight matrix of the hidden layer and the output layer. *e* is the tracking error vector and the input vector of the system. *l* represents the fuzzy rules. *p* represents the dropout layer parameters of Bernoulli distribution.

Therefore, the equivalent control law of the optimal output estimation sliding surface is obtained.

$$\mathcal{U}L\_{\rm SMC} = \mathcal{U}^\*\_{\rm NN}(\varepsilon, x^\*, \sigma^\*, \mathcal{W}^\*, l^\*, p^\*) + \varepsilon = \mathcal{W}^\* \mathcal{g}^\* + \varepsilon \tag{23}$$

Herein, ε represents the error vectors of the network. *W*∗ , *x*∗ , σ∗ , *l* ∗ , *p*∗ are the optimal parameters of network. Then, the sliding mode control law of the improved fuzzy neural network can be expressed as:

$$\mathcal{U}\mathcal{U}(\mathbf{t}) = \mathcal{U}\_{\text{NN}}(\mathbf{e}, \mathbf{\hat{x}}, \mathbf{\hat{e}}, \mathbf{\hat{W}}, \mathbf{\hat{l}}, \mathbf{\hat{p}}) + \mathcal{U}\_{\text{r}} \tag{24}$$

Herein, *U*ˆ *NN* is the improved fuzzy neural network controller, which is used to estimate the sliding mode control law. *Ur* is an additional value of robustness, which ensures that the system would be on the sliding surface when t <sup>≥</sup> 0. ˆ*x*, <sup>σ</sup>ˆ,*W*<sup>ˆ</sup> , <sup>ˆ</sup> *l*, ˆ*p* represent the neural network parameter estimation.

$$
\overline{\mathcal{U}} = \mathcal{U}\_{\text{SMC}} - \mathcal{U} = \mathcal{W}^\* \mathcal{g}^\* + \varepsilon - \mathcal{W} \mathcal{\mathfrak{H}} - \mathcal{U}\_r = \overline{\mathcal{W}} \mathcal{g}^\* + \mathcal{W} \overline{\mathcal{g}} + \varepsilon - \mathcal{U}\_r \tag{25}
$$

In the formula, *<sup>W</sup>* <sup>=</sup> *<sup>W</sup>*<sup>∗</sup> <sup>−</sup> *<sup>W</sup>*<sup>ˆ</sup> , *<sup>g</sup>* <sup>=</sup> *<sup>g</sup>*<sup>∗</sup> <sup>−</sup> *<sup>g</sup>*ˆ, *<sup>g</sup>* can be expressed by Taylor expansion due to *<sup>g</sup>* being a non-linear activation function.

$$
\overline{\mathcal{g}} = \mathcal{g}\_{\overline{x}} \overline{\mathcal{e}} \overline{\mathcal{e}} + \mathcal{g}\_{\mathcal{o}} \overline{\mathcal{e}} \overline{\mathcal{e}} + o\_{\text{m}} \tag{26}
$$

Herein, *<sup>x</sup>* <sup>=</sup> *<sup>x</sup>*<sup>∗</sup> <sup>−</sup> *<sup>x</sup>*ˆ, <sup>σ</sup> <sup>=</sup> <sup>σ</sup><sup>∗</sup> <sup>−</sup> <sup>σ</sup><sup>ˆ</sup> can be expressed as:

$$\mathbf{g}^\* = \mathbf{\hat{g}} + \mathbf{g}\_x \mathbf{\overline{x}} \mathbf{e} + \mathbf{g}\_{\sigma} \mathbf{\overline{\sigma}} \mathbf{e} + o\_m \tag{27}$$

Bringing Formula (22) into Formula (20), then we find:

$$\begin{split} \overline{\mathcal{U}} &= \mathcal{W}^\* \mathcal{g}^\* + \varepsilon - \hat{\mathcal{W}} \hat{\mathcal{g}} - \mathcal{U}\_I \\ &= \mathcal{W}^\* (\mathcal{g} + \mathcal{g}\_x \overline{\mathcal{x}} e + \mathcal{g}\_\sigma \overline{\sigma} e + o\_m) + \varepsilon - \hat{\mathcal{W}} \mathcal{g} - \mathcal{U}\_I \\ &= \left( \mathcal{W}^\* - \hat{\mathcal{W}} \right) \mathcal{g} + \left( \overline{\mathcal{W}} + \hat{\mathcal{W}} \right) \mathbb{g}\_x \overline{\mathcal{x}} e + \left( \overline{\mathcal{W}} + \hat{\mathcal{W}} \right) \mathbb{g}\_\sigma \overline{\sigma} e + \varepsilon - \mathcal{U}\_I + \mathcal{W}^\* o\_m \\ &= \overline{\mathcal{W}} \mathcal{G} + \mathcal{W} \mathbb{g}\_x \overline{\mathcal{x}} e + \mathcal{W} \mathbb{g}\_\sigma \overline{\sigma} e - \mathcal{U}\_I + P \end{split} \tag{28}$$

Herein: *P* = *Wg <sup>x</sup> xe* <sup>+</sup> *Wg* <sup>σ</sup>σ*<sup>e</sup>* <sup>+</sup> *<sup>W</sup>*<sup>∗</sup> *om* + ε, ||*P*|| < *r*, *r* is a minimal positive number.

#### *4.3. Stability Analysis*

Next, we improve the fuzzy neural network sliding mode control algorithm according to the dynamics and kinematics equation of the manipulator to make the manipulator reach the prescribed sliding surface in the closed-loop control of the visual servo system, and reduce and eliminate the chattering problem of sliding-mode control, so as to ensure the long-term stability of the system.

Set:

$$
\hat{\mathcal{W}} = \eta\_1 \left( \mathbb{S}^{\mathcal{S}} \right)^T \tag{29}
$$

$$\pounds = \eta\_2 \Big(eS^T \mathcal{W} g\_x\big)^T \tag{30}$$

$$
\boldsymbol{\sigma} = \eta\_3 \Big( \boldsymbol{e} \boldsymbol{S}^T \boldsymbol{W} \mathbf{g}\_{\mathcal{S}} \Big)^T \tag{31}
$$

$$\mathcal{U}\_{\mathcal{I}} = \frac{\tau\_{0} \text{sgn}(S)}{\sum\_{i=1}^{n} S\_{i}^{2}} \tag{32}$$

Define the Lyapunov function as:

$$V = \frac{1}{2}S^T S + \frac{1}{2\eta\_1} tr\Big(\widetilde{W}\widetilde{W}^T\Big) + \frac{1}{2\eta\_2} tr\Big(\widetilde{\mathbf{x}\mathbf{x}^T}\Big) + \frac{1}{2\eta\_3} tr\Big(\widetilde{\sigma}\overline{\sigma}^T\Big) \tag{33}$$

Find the derivation:

$$=\frac{1}{2}S^{T}\dot{S}-\frac{1}{\eta\_{1}}\text{tr}\Big(\widetilde{\boldsymbol{W}}\widetilde{\boldsymbol{W}}^{T}\Big)-\frac{1}{\eta\_{2}}\text{tr}\Big(\widetilde{\boldsymbol{\text{xx}}}^{T}\Big)-\frac{1}{\eta\_{3}}\text{tr}\Big(\widetilde{\boldsymbol{\sigma}\boldsymbol{\sigma}}^{T}\Big)\tag{34}$$

Since . *S* = *U*, we introduce Formulas (6)–(15) and (6)–(17) and find:

$$\begin{array}{c} \dot{\mathsf{V}} = \mathsf{S}^{T} \Big[ \widetilde{\mathsf{W}} \mathsf{g} + \mathsf{W}\_{\mathcal{G}\mathbf{x}} \overline{\mathsf{x}} \mathsf{e} + \mathsf{W}\_{\mathcal{G}\mathbf{o}} \overline{\mathsf{o}} \mathsf{e} - \mathsf{U}\_{r} + P \Big] \\ - \frac{1}{\eta\_{1}} \mathsf{tr} \Big( \widetilde{\mathsf{W}} \mathsf{W}^{T} \Big) - \frac{1}{\eta\_{2}} \mathsf{tr} \Big( \overline{\mathsf{x}} \mathsf{x}^{T} \Big) - \frac{1}{\eta\_{3}} \mathsf{tr} \Big( \overline{\sigma} \sigma^{T} \Big) \\ = \mathsf{tr} \Big( \widetilde{\mathsf{W}} \Big( \mathcal{G} \mathcal{S}^{T} - \frac{1}{\eta\_{1}} \widetilde{\mathsf{W}}^{T} \Big) \Big) + \mathsf{tr} \Big( \mathsf{e} \mathcal{S}^{T} \dot{\mathcal{W}} \mathsf{g}\_{\mathcal{X}} - \frac{1}{\eta\_{2}} \overline{\mathsf{x}}^{T} \Big) + \mathsf{tr} \Big( \mathsf{e} \mathcal{S}^{T} \dot{\mathcal{W}} \mathsf{g}\_{\mathcal{o}} - \frac{1}{\eta\_{3}} \overline{\sigma}^{T} \Big) + \mathcal{S}^{T} \left( -\mathcal{U}\_{r} + P \right) \end{array} \tag{35}$$

According to Formulas (24), (27) and (30), we transform **. V** to:

$$\dot{\mathbf{V}} = \mathbf{S}^T(-\mathcal{U}\_I + P) \le \|\mathbf{S}^T\| \|\times\| - \mathcal{U}\_I + P\| = -\alpha \|\mathbf{S}^T\| \le 0 \tag{36}$$

Herein, α is a positive, sliding-mode control system based on the improved fuzzy neural network, and is asymptotically stable according to the Lyapunov stability principle. When *t* → ∞, s → 0.

#### **5. Simulation Analysis and Experiment**

The adaptive sliding-mode controller is simulated by matlab. The AC servo motor is used in the robot joint. The corresponding parameters are set according to the physical parameters of the picking robot arm as shown in Table 2.

$$H = \begin{bmatrix} \alpha + 2\varepsilon \cos(q\_2) + 2\eta \sin(q\_2) & \beta + \varepsilon \cos(q\_2) + \eta \sin(q\_2) \\ \beta + \varepsilon \cos(q\_2) + \eta \sin(q\_2) & \beta \end{bmatrix}$$

$$\mathcal{C} = \begin{bmatrix} (-2\varepsilon \sin(q\_2) + 2\eta \cos(q\_2))\dot{q}\_2 & (-\varepsilon \sin(q\_2) + \eta \cos(q\_2))\dot{q}\_2 \\ (\varepsilon \sin(q\_2) - \eta \cos(q\_2))\dot{q}\_1 & 0 \end{bmatrix}$$

$$\mathcal{G} = \begin{bmatrix} \varepsilon e\_2 \cos(q\_1 + q\_2) + \eta e\_2 \sin(q\_1 + q\_2) + (\alpha - \beta + e\_1)e\_1 \cos q\_1 \\ \varepsilon e\_2 \cos(q\_1 + q\_2) + \eta e\_2 \sin(q\_1 + q\_2) \end{bmatrix}$$


**Table 2.** Structure parameters of picking robot.

Herein, α, β, η, ε is constant. α = *I*<sup>1</sup> + *m*1*l* 2 *<sup>c</sup>*<sup>1</sup> + *Ie* + *mel* 2 *ce* + *mel* 2 <sup>1</sup>, β = *I*<sup>1</sup> + *mel* 2 *ce*, ε = *mel*1*lcecos*(δ*e*), <sup>η</sup> <sup>=</sup> *mel*1*lcesin*(δ*e*). Take *<sup>a</sup>* <sup>=</sup> αβεη <sup>T</sup> , *<sup>a</sup>*<sup>ˆ</sup> is an estimation of *<sup>a</sup>*, set *<sup>a</sup>* <sup>=</sup> *<sup>a</sup>*<sup>ˆ</sup> <sup>−</sup> *<sup>a</sup>*, and *<sup>a</sup>* is constant, so . *<sup>a</sup>* <sup>≈</sup> . *a*ˆ.

Set the angle instruction of joint 2 as *qd*<sup>1</sup> = *sin*(2π*t*), *qd*<sup>2</sup> = *sin*(2π*t*), . *qd*<sup>1</sup> and . *qd*<sup>2</sup> are respectively the angular velocities of joints 1 and 2. Take **Λ** = 15 0 0 15 , **Λ**<sup>1</sup> = 5 0 0 5 , **Λ**<sup>2</sup> = 50 0 0 50 , *K*<sup>d</sup> = 100 0 0 100 . Set the gain of the adaptive fuzzy neural network to η<sup>1</sup> = 20; η<sup>2</sup> = 5; η<sup>3</sup> = 2.

Firstly, the tracking curves of angular position and velocity of joints 1 and 2 are obtained by using sliding mode control without adding the fuzzy neural network algorithm, as shown in Figures 10–12.

From Figure 10, it can be seen that the angular position switching motion of common SMC joints without the fuzzy neural network is continuous and smooth. From Figure 12, it can be seen that joint 2 is more likely to be disturbed due to the need to deal with the conductive force from joint 1, so the output buffeting is relatively large.

The simulation results of a manipulator system with an improved fuzzy neural network sliding-mode control algorithm are shown in Figures 13–15. From Figure 15, it can be seen that compared with the traditional SMC scheme, the proposed fuzzy neural network sliding-mode control algorithm effectively eliminates the control chattering phenomenon. Meanwhile, the angular velocity tracking of joint 1 and joint 2 is more robust. It realizes high-precision control without detailed system information, and embodies the superiority of the improved fuzzy neural sliding-mode control algorithm.

**Figure 10.** Angle tracking of joints 1 and 2.

**Figure 11.** Angular velocity tracking of joints 1 and 2.

**Figure 12.** Joint 1 and 2 control inputs without neural network sliding mode.

In order to further verify the effectiveness of the sliding-mode algorithm of the fuzzy neural network system, the self-developed picking robot system is shown in Figure 16. The control system of the manipulator that is used in the experiment is an industrial computer with an Intel processor. The motion control system of the manipulator is connected with the host computer through serial communication, and the C control code generated by Matlab is imported into the control system.

The control instruction is generated according to the results of image recognition, so we process the control instructions by the sliding-mode algorithm of the fuzzy neural network and dynamic joint angle, output the dynamic joint angle, and bring into the kinematics equations related to the manipulator and output manipulator motor control instruction so as to control the manipulator gripper to reach the target position. Intensive experiments are performed on the apple-picking robot using the conventional PID algorithm, the sliding-mode control algorithm, and the improved fuzzy neural network sliding mode control on the robot arm servo control in the same experimental environment. Comparing the above three different algorithms on the picking-robot system on the grab speed and the grab accuracy, Table 3 is the data comparison using different algorithms.

**Figure 14.** Angular velocity tracking of joints 1 and 2.


**Figure 15.** Control input of joints 1 and 2 after adopting the intelligent control scheme.

**Figure 16.** (**a**) Representing the movement process of the robot arm of apple-picking robot; (**b**) Representing that the picking machine recognizes the target and controls the robot arm to grasp the target through the algorithm presented in this paper.

Through the comparison of experimental data, it can be known that the conventional PID algorithm captures a long time, while the sliding mode control algorithm can effectively reduce the grab time, but there is a lower catch success rate, and the improved fuzzy neural network algorithm is adopted. The sliding mode control algorithm not only improves the grabbing efficiency, but also greatly improves the success rate of the capture. It is shown in the result of the visual servo motion of the manipulator that the fruit-picking robot based on the improved fuzzy neural network sliding-mode algorithm has good stability and robustness. Figure 14 shows that the fruit-picking robot completes the visual servo control process of the manipulator using an improved algorithm.

#### **6. Conclusions**

In this paper, based on the self-developed apple-picking robot, the kinematics and dynamics equations of the robot are established. The image-based visual servo control method is adopted to control the manipulator in order to improve the grasping accuracy in the picking process. Then, the joint control performance of the control system has been improved by the adaptive fuzzy neural network sliding-mode control algorithm. Finally, the superiority of the algorithm is verified by simulation and experiment. The results show that the method improves the shortcomings of the common sliding mode control for the training control jitter in the actual robotic arm visual servo system to a certain extent, and improves the stability of the servo control.

Due to the low stability of the mechanical design of the current picking robot end effector, the existing mechanical structure of the picking robot will be improved in future work, and continue to verify the superiority of the current algorithm on the basis of the new terminal actuator, and complete the picking experiment in the real environment of the apple-picking robot.

**Author Contributions:** W.C. conceived and designed the experiments; W.C. performed the experiments; T.X. contributed reagents/materials/analyzed the data; T.X. and J.L. contributed reagents/materials/analysis tools; T.X. and M.W. wrote the paper; D.Z. analyzed the data.

**Funding:** This research is supported by the "Six talent peaks" high-level talents projection Jiangsu Province (2016-GDZB-021), Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX18\_2332), Project funded by China Postdoctoral Science Foundation (2016M601691), Industrial Foresight Project of Zhenjiang City (GY2018018).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Optimized Combination of Spray Painting Trajectory on 3D Entities**

## **Wei Chen 1,2,\*, Xinxin Wang 1, Hao Liu 1, Yang Tang <sup>3</sup> and Junjie Liu <sup>1</sup>**


Received: 27 November 2018; Accepted: 4 January 2019; Published: 9 January 2019

**Abstract:** In this research, a novel method of space spraying trajectory optimization is proposed for 3D entity spraying. According to the particularity of the three-dimensional entity, the finite range model is set up, and the 3D entity is patched by the surface modeling method based on FPAG (flat patch adjacency graph). After planning the spray path on each patch, the variance of the paint thickness of the discrete point and the ideal paint thickness is taken as the objective function and the trajectory on each patch is optimized. The improved GA (genetic algorithm), ACO (ant colony optimization), and PSO (particle swarm optimization) are used to solve the TTOI (tool trajectory optimal integration) problem. The practicability of the three algorithms is verified by simulation experiments. Finally, the trajectory optimization algorithm of the 3D entity spraying robot can improve the spraying efficiency.

**Keywords:** spray painting robot; FPAG; GA; ACO; PSO; TTOI problem

#### **1. Introduction**

With the development of the social economy and the improvement of life, people have higher requirements for product quality and product appearance. Moreover, some products with higher surface spraying quality, such as furniture, automobiles, and artworks, determine the quality of the product appearance and the competitiveness of products in the market. Therefore, people pay more and more attention to surface spraying technology [1]. The traditional spraying technology is that the spraying workers with spraying guns directly spray the workpiece to be sprayed. During spraying, the paint mist diffuses into the surrounding environment, which not only pollutes the environment, but also seriously harms the physical and mental health of spraying workers. To solve the problems caused by traditional spraying, automatic spraying systems have been developed. As a typical table of spraying automation equipment, a spraying robot has many advantages, such as good uniformity of the coating thickness, high repetitive positioning accuracy, wide applicability, and high efficiency. At the same time, spraying robots can free workers from a toxic, flammable, and explosive working environment [2].

In many fields of industrial production, the surface spraying of the workpiece is an essential step, and the types of workpiece modeling are becoming more and more abundant. During the spraying operation, the workpiece may have multiple spraying surfaces, and the curvature of each spraying surface is different. Usually, a three-dimensional (3D) entity modeling method is needed to form these parts, as shown in the 3D entity diagram in Figure 1. In recent years, due to the wide application of 3D solid modeling in various fields, scholars have made remarkable achievements in the research of 3D

solid modeling [3–7]. The spraying of 3D solid modeling has a certain novelty because the spraying is more interested in the surface shape of the workpiece. For the research of this paper, we use the finite spraying modeling technology and the surface modeling method based on the plane patch adjacency graph (FPAG) for modeling. It should be pointed out that the 3D entities mentioned in this paper are not the same as polyhedra. A polygon is a spatial geometry surrounded by several planar polygons, each of which is a plane. However, each face of a 3D entity can be a free-form curved surface. In this sense, the polyhedron is only a subset of the 3D entity.

**Figure 1.** (**a**) 3D entities with concave surfaces. (**b**) 3D entities with convex surfaces.

In addition, research on trajectory optimization for spray painting robots oriented to 3D entities is immature. Existing trajectory optimization methods can only be applied to 3D entities with convex surfaces, as shown in Figure 1b. For a 3D entity with concave surfaces (Figure 1a), the research on the trajectory optimization for the robot in this field is still a blank as the shape of the entity is complex and the robot is required to be extremely flexible in automatic spray painting operation. The trajectory optimization method for the spray painting robot introduced in this paper can only be applied to 3D entities with convex surfaces.

The idea of trajectory optimization for a spray painting robot oriented to a 3D entity is: First, a simple mathematical model of the paint deposition rate is established by the experimental method and the 3D entity is sliced by using the FPAG surface modeling method [8–10]. Secondly, after planning the painting path on each patch, the spray painting trajectory is optimized on each patch with the objective function of the variance of the paint thickness at discrete points and the ideal paint thickness. Finally, the spray painting trajectories on the individual patches are optimized, and the optimized trajectories of the spray painting robot on the 3D entities are formed eventually.

#### **2. The Establishment of the Mathematical Model**

The spraying mathematical model mainly includes the position and direction of the end-effector and the paint deposition rate model.

The establishment of the paint deposition rate model is an important problem in trajectory optimization for spray painting robots. Considering that the expressions of these models are rather complex, it is not applicable to build the mathematical model of the paint deposition rate on 3D entities. Therefore, in this paper, the specific mathematical expression of the finite range model takes the following factors into account: The distance and direction from the gun to the working surface, the curvature of the workpiece surface, and the angle of the paint gun (i.e., the cone angle corresponding to the paint flow).

Before building a finite-scope model, we made the following assumptions: The paint particles sprayed by the paint gun form a cone in space. Suppose the opening angle, ϕ, is half of the angle, *ϕ*, and the opening angle, *ϕ* < 90◦. The definition of the opening angle, *ϕ*, can be seen in Figure 2.

**Figure 2.** The finite range model.

Considering that the amount of paint to be obtained is smaller when the distance, *L*(*L* > 0), from the point, *s*(*x*, *y*, *z*) ∈ *S*, on the surface to the gun increases. Additionally, the amount of paint obtained at the points on the surface is also decreased when the angle, *θ*(*θ* < *ϕ*), is gradually increasing. The amount of the paint obtained at the point, *s*, on the workpiece surface can be expressed by the following equation:

$$\frac{c(\theta,\phi)}{L^2} \tag{1}$$

Among which:

$$\mathfrak{c}(\theta,\mathfrak{q}) \begin{cases} > 0, & \theta < \mathfrak{q} \\ = 0, & \theta \ge \mathfrak{q} \end{cases} \tag{2}$$

$$L = \sqrt{\left(x - \mathbf{p}\_x(t)\right)^2 + \left(y - \mathbf{p}\_y(t)\right)^2 + \left(z - \mathbf{p}\_z(t)\right)^2} \tag{3}$$

**p***x*, **p***y*, and **p***<sup>z</sup>* express, respectively, the coordinates of the gun on the X-axis, Y-axis, and Z-axis. Assuming that the workpiece surface is a curved surface, the paint deposition rate of a point, *s*(*x*, *y*, *z*) ∈ *S*, on the surface is proportional to the inner product of the following two vectors (shown in Figure 2): First, the unit normal vector, *n*(*s*), of the point. Second, the unit direction vector, *d*(*p*(*t*),*s*), of the gun and the point, *s*. The function, *d*(*p*(*t*),*s*), is defined as follows:

$$d(p(t),s) = \frac{(x - p\_x(t))i + \left(y - p\_y(t)\right)j + (z - p\_z(t))k}{L} \tag{4}$$

where, *i*, *j*, and *k* denote unit vectors in the positive direction of the *X*, *Y*, and *Z* axis, respectively.

Based on the above assumptions, we can derive the paint deposition rate function of a point, *s*, on the surface of the workpiece:

$$\stackrel{\bullet}{f}(s, p(t), t) = \left(\frac{c(\theta, q)}{\left(x - p\_x(t)\right)^2 + \left(y - p\_y(t)\right)^2 + \left(z - p\_z(t)\right)^2}\right) \cdot d(p(t), s) \cdot n(s) \tag{5}$$

The variable, *θ*, in Equation (5) is not significant. In the equation, *θ* is the angle of the gun and a point, *s*, on the workpiece surface and the axis of the gun. It is related to the coordinates of the point, *s*, on the workpiece surface, the position, *p*(*t*), and direction, *o*(*t*), of the paint gun, which can be defined as:

$$\theta = \cos^{-1}(d(\mathfrak{p}(t), \mathfrak{s}) \cdot \mathfrak{o}(t))\tag{6}$$

The selection of function, *c*(*θ*, *ϕ*), depends on the basic spray characteristics of the gun and some parameter settings, such as the air pressure of the gun and the velocity of the paint flow. In the usual case, the function, *c*(*θ*, *ϕ*), reaches its maximum when *θ* = 0 (that is, it just below the paint gun). When *θ* → *ϕ*, *c*(*θ*, *ϕ*) → 0. The following gives a concrete model of function, *c*(*θ*, *ϕ*):

$$\mathcal{L}(\theta, \mathfrak{p}) = \begin{cases} a \frac{\cos(\theta) - \cos(\mathfrak{p})}{\left(1 - \cos(\mathfrak{p})\right)^2} & \theta \le \mathfrak{p} \\ 0 & \text{otherwise} \end{cases} \tag{7}$$

The painting velocity of the paint gun and air pressure can be adjusted by changing the values of the parameters, *α* and *ϕ*, in the formula above, respectively. It can also be seen from the formula above that the values of the parameters, *α* and *ϕ*, are related to the maximum of the function, *c*(*θ*, *ϕ*).

The finite-range model here is circular on the plane corresponding to the paint distribution model. When the workpiece surface has a certain curvature, the paint distribution should become oval. For surfaces with small curvatures, the paint distribution, which is actually elliptical, can be approximated as a circular. In addition, the paint deposition rate model also illustrates that the total amount of paint sprayed to the workpiece surface (the total amount of paint sprayed from the gun) has nothing to do with the surface shape of the workpiece and the distance from the gun to the workpiece surface, which is in line with the actual situation.

#### **3. Segmentation for 3D Entity**

To obtain the optimal trajectory on the 3D solid surface, the first step is to model the workpiece surface. Second, due to the particularity of the spraying surface, trajectory optimization of the spraying robot is relatively difficult in a practical application. To obtain a good spraying effect, the finite range model modeling method is adopted. The third step is to use the FPAG surface modeling method to simplify the plane patch adjacency graph. As shown in Figure 3, the specific steps are as follows: (1) Divide the triangular grid of the 3D solid surface. (2) Set the maximum normal vector threshold and connect the triangle surface into a smaller flat area according to the triangle connection algorithm. (3) Each patch is approximately planar, and at least one side of each patch is part of the 3D solid ridge.

**Figure 3.** Step diagram of the surface modeling method based on the flat patch adjacency graph (FPAG).

#### **4. Trajectory Optimization on Each Patch**

The geometric properties of 3D entity surfaces are more complex compared with two-dimensional planar surfaces and regular curved surfaces. Therefore, to simplify the problem, we will describe a relatively simple and practical trajectory optimization method in the following. This method is fast and the process is simple. It can meet the actual needs completely.

The trajectory of a spray painting robot mainly consists of two factors: Path and velocity. In the spray painting process, the painting path can be obtained by determining the width of the overlapping areas formed by two paint strokes. Therefore, to determine the trajectory of a spray painting robot, we only need to determine the velocity of the paint gun and the width of the overlapping area formed by two paint strokes. *x* represents the distance from a point, *s*, to the first path in the painting radius, *d* represents the width of the overlapping area formed by two paint strokes, *R* represents the distance from the surface point to the spray direction, *o*(*t*), O is the TCP (tool center point). Then, the paint thickness at point, *s*, is:

$$q\_3(\mathbf{x}) = \begin{cases} \, \, lq\_1(\mathbf{x}) & 0 \le \mathbf{x} \le R - d \\\, q\_1(\mathbf{x}) + q\_2(\mathbf{x}) & R - d < \mathbf{x} \le R \\\, q\_2(\mathbf{x}) & R < \mathbf{x} \le 2R - d \end{cases} \tag{8}$$

*q*1(*x*) and *q*2(*x*) are the paint thickness at point *s* when spray painting on two adjacent paths, respectively. The formulas of *q*1(*x*) and *q*2(*x*) are:

$$q\_1(\mathbf{x}) = 2 \int\_0^{t\_1} f(r\_1) dt,\ 0 \le x \le R \tag{9}$$

$$q\_2(\mathbf{x}) = 2 \int\_0^{t\_2} f(r\_2) dt,\ \mathcal{R} - d \le \mathbf{x} \le 2\mathcal{R} - d \tag{10}$$

among which, *<sup>t</sup>*<sup>1</sup> <sup>=</sup> <sup>√</sup>*R*<sup>2</sup> <sup>−</sup> *<sup>x</sup>*2/*v*, *<sup>t</sup>*<sup>2</sup> <sup>=</sup> *R*<sup>2</sup> − (2*R* − *d* − *x*) 2 /*v*, *r*<sup>1</sup> = (*vt*) <sup>2</sup> + *x*2, *r*<sup>2</sup> = (*vt*) <sup>2</sup> + (2*<sup>R</sup>* <sup>−</sup> *<sup>d</sup>* <sup>−</sup> *<sup>x</sup>*) 2 . *t*<sup>1</sup> and *t*<sup>2</sup> represent half of the spray painting time that the gun paints in two adjacent painting paths at point *s*, respectively. *r*<sup>1</sup> and *r*<sup>2</sup> represent the distance from point *s* to the central projection point of TCP in two adjacent painting paths, respectively, *t* is the time that the gun moves from point *O* to point *s'*. *s'* is the projection of the point, *s*, on the painting path. The following expression can be obtained from (9) and (10):

$$q\_s(\mathbf{x}, d, v) = \frac{1}{v} f(\mathbf{x}, d) \tag{11}$$

where, *J* is a function of *x* and *d*. To make the paint thickness on the surface as uniform as possible, the difference between the actual paint thickness and the variance of ideal paint thickness at point *s* are taken as the optimization objective function:

$$\min\_{d \in [0,R], \upsilon} E\_1(d, \upsilon) = \int\_0^{2R - d} (q\_d - q\_\sharp(\mathbf{x}, d, \upsilon))^2 d\mathbf{x} \tag{12}$$

In the equation above, *qd* is the ideal paint thickness. Since the maximum paint thickness, *q*max, and the minimum paint thickness, *q*min, determine the uniformity of the paint thickness on the workpiece surface, and *q*max and *q*min also need to be optimized:

$$\min\_{d \in [0, R], v} E\_2(d, v) = (q\_{\max} - q\_d)^2 + (q\_d - q\_{\min})^2 \tag{13}$$

It can be obtained from Equations (11)–(13) that:

$$\min\_{d \in [0, R], \upsilon} E(d, \upsilon) = \frac{1}{2R - d} E\_1(d, \upsilon) + E\_2(d, \upsilon) \tag{14}$$

Additionally, from Equations (9) and (10), the expressions of the maximum paint thickness and minimum paint thickness can be described as:

$$q\_{\text{max}} = \frac{1}{\upsilon} f\_{\text{max}}(d) \tag{15}$$

$$q\_{\rm min} = \frac{1}{v} f\_{\rm min}(d) \tag{16}$$

Let *<sup>∂</sup>E*(*d*,*v*) *<sup>∂</sup><sup>v</sup>* = 0, it can be obtained from Equations (11), (14)–(16) that:

$$v = \frac{\frac{1}{2R - d} \int\_0^{2R - d} f^2(x, d) dx - f\_{\text{max}}^2(d) - f\_{\text{min}}^2(d)}{q\_d \left[ \frac{1}{2R - d} \int\_0^{2R - d} f(x, d) dx + f\_{\text{max}}(d) + f\_{\text{min}}(d) \right]} \tag{17}$$

It can be seen that the spray painting rate, *v*, can be expressed as a function of the width, *d*, of the paint overlapping area formed by two painting strokes. Therefore, the minimum value of *E*(*d*, *v*) is only related to *d*. The golden section method [11] can be used to obtain the optimization value, *d*, so that the optimized trajectory on each patch can be obtained too.

#### **5. Tool Trajectory Optimal Integration on 3D Entity**

After the track optimization of each patch, the optimal combination of trajectories connecting each patch should also be considered to speed up the spraying speed of the spraying robot. The first step is to transform and model the TTOI problem. TTOI problem is represented by the Hamiltonian diagram. In the second step, the corresponding optimization algorithm is used to solve the TTOI problem. The third step, through simulation and spray painting experiments, verification, and comparison, identifies the advantages and effectiveness of the algorithm.

#### *5.1. The Transformation and Modeling of Tool Trajectory Optimal Integration*

As is shown in the Figure 4, the TTOI (tool trajectory optimal integration) on each patch after the 3D entity segmentation is expressed [12]. To make the problem less complicated, the trajectories are considered as an edge. The ultimate purpose of the TTOI problem is to spray patches on the workpiece surface to make the spraying path of the robot the shortest. According to graph theory, a non-directional connection graph, G, is assumed (*V*, *<sup>E</sup>*, *<sup>R</sup>*, *<sup>ω</sup>*: *<sup>E</sup>*→*Z+*), among which *<sup>V</sup>* denotes the vertex set, *E* denotes the edge set, *R* denotes any subset of *E*, and *ω* denotes the weight of the edge (the length of the actual spray path). The problem of TTOI is to find a path passing all edges only once with the shortest distance in graph, G. Similar to the traveling salesman problem (TSP), which is a common problem in the optimization problem, the TTOI problem is also a typical NP (non-polynomial) problem.

**Figure 4.** Tool trajectory optimal integration on each patch.

Suppose that D = {*dij*} (*i, j* = 1, 2, ... , *n*), the shortest distance between vertex *i* and vertex *j*, which are not on the same edge in graph G, the distance between the vertices can be calculated according to the Floyd algorithm. To make the problem less complicated, the TTOI problem can be expressed by the Hamiltonian method. As shown in Figure 5, a vertex is used to represent an edge of the original graph G to form a complete Hamiltonian [12]: *g* (*VH*, *EH*, *ωH*), among which *VH* denotes the vertex set, *EH* denotes the edge set, and *ωH* denotes the weight of the edge and *ωH*∈D. In the graph, *g*, the weight of each edge is not fixed. Its value is determined by the order of vertices on the same edge in the original graph, G. Suppose that the order of the vertices set, *VH* = {*v*1, *v*<sup>2</sup> ... ... *vn*}, in graph g is *T* = (*t*1, *t*<sup>2</sup> ... ... *tn*) *ti*∈*VH* (*i* = 1, 2, . . . , *n*), and the TTOI problem can be defined as Equation (18):

$$\text{minim}\overline{L} = \sum\_{i=1}^{n} \omega\_{i} + \sum\_{j=1}^{n-1} \omega\_{j}^{\text{H}} \tag{18}$$

where *ω<sup>i</sup>* is the weight of the edges in the primitive graph, G, corresponding to vertices, *t*1, *t*<sup>2</sup> ... ... *tn*, in graph *g* and *ω<sup>H</sup> <sup>j</sup>* denotes the weight of the edge in graph *g*. Since the weight, *ωi*, of each edge in the original graph, G, is considered to be a fixed value in this problem, the above optimization problem can be reduced to Equation (12):

$$\text{minim}L = \sum\_{j=1}^{n-1} \omega\_j^{\mathbb{H}} \tag{19}$$

**Figure 5.** Transformation of the original graph, *G*, into the Hamiltonian graph, *g*.

The spraying robot is the most complex one in the control of the industrial robot because of its many parameters. Especially, the trajectory optimization of the spraying robot on the complex surface makes the actual operation difficult. Therefore, finding the arrangement of all vertices in the Hamilton diagram makes the path, L, of the painting robot the shortest, which becomes a TTOI problem. Because of the large number of parameters, spray robots are the most complex control of industrial robots. Especially, the trajectory optimization of the spraying robot on the complex surface makes the actual operation difficult. Therefore, finding the arrangement of all vertices in the Hamilton diagram makes the shortest path, L, of the painting robot a TTOI problem. To solve the TTOI problem, the improved genetic algorithm, ant colony algorithm, and particle swarm optimization (PSO) proposed in this paper can be used to optimize the trajectory of a spraying robot on complex surfaces. In the process of fragmentation, these intelligent algorithms can solve the trajectory optimization problem between patches. For the first time, these intelligent algorithms have been used for spraying complex surfaces. The previous links between patches are random combinations. Finally, the advantages and disadvantages of each algorithm are illustrated by experiments.

#### *5.2. Solving the TTOI Problem with the Genetic Algorithm*

Genetic algorithm (GA) is a method to search for the optimal solution by simulating the natural evolution process. Therefore, this algorithm has good effects on the NP (non-polynomial) problem in combinatorial optimization and can be used to solve the TTOI problem. According to the particularity of TTOI [13–15], GA needs a special individual code and crossover, mutation, and other genetic manipulation methods.

**(1) Individual code:** The length of the individual code is |*VH*|. Since each vertex in the Hamilton graph represents one edge of the original graph, G, to distinguish the start and end points of each edge. The individual code contains the binary code, *Psi,* representing the direction of each edge in the original graph, G.

**(2) Fitness function:** The values of fitness function are used to determine which individuals can enter the next round of evolution and which individuals need to be removed from the population. To facilitate the selection operation in the genetic algorithm, the optimization of the minimum value is usually converted to optimization of the maximum value, and the fitness function can be taken as: *F* = *U* − *L*, where *U* should be selected as an appropriate number, to make the fitness of all individuals positive. In the process of population evolution, to select the individuals with high fitness, the population size is maintained as the value, *Psize*. According to the fitness function rule, the *Psize* individuals with the highest fitness are passed to the next generation.

**(3) Crossover:** Crossover is the process of exchanging the partial codes between two individuals with a certain probability to generate new individuals. Here, order crossover (OX) is used on *Pi* while two-point crossover is used on *Psi*. OX ensures that the original order of each vertex is almost the same when the effective sequence of the individual itinerary is modified [16]. The main idea of OX is: A conventional two-point crossover is performed, followed by an effective sequence modification of the individual itinerary. When modifying, the original relative access order of each point should be maintained as much as possible. Basic steps of OX are as follows:

(a) In the individual code strings, *Px* and *Py*, representing the spray painting order, the positions after the two loci, *i* and *j*, are randomly selected as the intersection. That is, each locus between the (*i* + 1)-th locus and the *j*-th locus is defined as an intersection area, and the contents of the intersection region are respectively memorized to *Wx* and *Wy*.

(b) According to the mapping relation in the intersection area, find all *Pxq* − *Pxq* (*p* = *i* + 1, *i* + 2, ... , *j*) loci *q* in the individual *Px* and set them as vacancies. Find all *Pxq* − *Pxq* (*p* = *i* + 1, *i* + 2, ... , *j*) loci, *r*, in the individual, *Py*, and set them as vacancies.

(c) The individuals, *Px*, *Py*, are left-shifted circularly until the first vacancy in the code string is moved to the left end of the intersection area. Then, all the vacancies are concentrated in the intersection area, and the original gene values in the intersection area are sequentially moved backward.

(d) Exchange the content in *Wx* and *Wy* and put them into the intersection area of individual *Px*, *Py*. The result is a new spray painting order.

**(4) Mutation operation:** *Pi* is subjected to inversion mutation to generate a new individual. A basic variation is applied to *Psi*, where one or more loci are randomly selected for individual code and the gene values of these loci are inverted.

Thus, the genetic algorithm of the TTOI problem is shown in Figure 6.

**Figure 6.** Flow chart of the GA (Genetic algorithm) of the TTOI (tool trajectory optimal integration) problem.

Taking 3D entities as spray objects, simulation experiments were carried out using genetic algorithm programming to verify the effectiveness of the TTOI problem. According to the segmentation method of the 3D entity, a 3D entity is divided into seven patches; that is, the individual code, *Pi* and *Psi*, are seven bits in the genetic algorithm. The parameters of the algorithm are as follows: Population size, *Psize* = 100; crossover probability, *xrate* = 0.20; mutation probability, *mrate* = 0.05; and the maximum number of evolutionary generation, *T* = 100. The corresponding evolutionary processes of different solutions are shown in Figure 7. As can be seen from the figure, the value of the objective function of the optimal individual decreases monotonously with the evolution process and eventually tends to be constant. After about 70 generations of evolution, the average fitness remains stable and the algorithm converges.

**Figure 7.** The simulation result of solving the TTOI problem with GA.

#### *5.3. Solving the TTOI Problem with the Ant Colony Algorithm*

Ant colony optimization (ACO) is a probabilistic intelligent algorithm for finding the optimal path. It originated from the behavior of ants to find the path in the process of searching for food. It has strong anti-interference ability, strong compatibility, and other characteristics. The algorithm initializes the following individual information: Not visited vertices (NVV), not visited edges (NVE), visited edges (VE), and tour length (TL). Through the memory function of the population, individual information is constantly updated and adjusted. Taking the connection graph, G, shown in Figure 5 as an example, if the algorithm starts when the ant is at the vertex, 1, the initialization information is:

$$\begin{array}{c} \text{NVV [1] = \{1, 2, 3, 4, 5, 6, 7, 8, 9, 10\}.}\\ \text{VV [1] = \{\}.}\\ \text{NVE [1] = \{ (1, 2), (3, 4), (5, 6), (7, 8), (9, 10) \}.}\\ \text{VE [1] = \{\}.}\\ \text{TL [1] = 0.0} \end{array}$$

After time, Δ*t*, the pheromone on trajectory (*i*, *j*) *i* adjusted as follows:

$$
\tau\_{ij}(t + \Delta t) = \rho \tau\_{ij}(t) + \Delta \tau\_{ij} \tag{20}
$$

where *ρ* represents the volatilization rate of pheromone, *τij*(*t*) represents the accumulation amount of pheromone on the track (*i*, *j*) at time *t*, Δ*τij* represents the increment of the pheromone on the trajectory (*i*, *j*) after the time, Δ*t*, which can be calculated as follows:

$$
\Delta \tau\_{ij} = \sum\_{k=1}^{m} \Delta \tau\_{ij}^{k} \tag{21}
$$

Δ*τ<sup>k</sup> ij* denotes the pheromone on the trajectory (*i*, *j*) during the searching process of the *k*-th ant, the expression of which is:

$$
\Delta \mathbf{r}\_{ij}^k = \begin{cases}
\frac{Q}{TL\langle k\rangle} & \text{ant } k \text{ pass path } (i, j) \\
0, & \text{else}
\end{cases}
\tag{22}
$$

Among which, *Q* is a constant. The pheromones on each trajectory during initialization are: Δ*τij* = 0. At the time, *t*, the transition probability of an ant, *k*, from vertex *x* to other feasible vertices is:

$$p\_{ij}^k = \begin{cases} \frac{(\tau\_{ij}(t))^a (\eta\_{ij}(t))^\beta}{\sum\_{s \in already\_k} (\tau\_{is}(t))^a (\eta\_{is}(t))^\beta}, & j \in allowed\_k\\ 0, & otherwise \end{cases} \tag{23}$$

where *ηij* denotes the visibility on track (*i*, *j*), which reflects the degree of heuristic from vertex *i* to vertex *j*. Here, let *ηij* = 1/*dij*, *dij* is the distance from vertex *i* to vertex *j*. The parameters, *α* and *β*, denote the influence weights of *τij*(*t*) and *ηij*(*t*) on the whole transition probability. *alowedk* denotes the feasible neighborhood of the ant k at vertex i (end of the edge in the list NVE). Thus, the ant colony algorithm of the TTOI problem is shown in Figure 8.

**Figure 8.** Flow chart of the ACO of the TTOI problem.

In the following, the validity of using ACO to solve the TTOI problem is verified by simulation experiments. Assuming that a 3D entity workpiece is divided into five patches, the number of edges in the connected graph, G, is five and the number of vertices is *m* = 10. The parameters of the algorithm are chosen as follows: *α* = 1, *β* = 5, *ρ* = 0.5, *Q* = 100 and the maximum number of cycles is *Nmax* = 100. Figure 9 shows the evolution of the optimal solution obtained from the algorithm. It can be seen from the figure that the length of the spray painting trajectory decreases monotonically with the evolutionary process. After about 70 generations of evolution, the length of the trajectory no longer changes and the algorithm converges.

**Figure 9.** Simulation result of the ACO algorithm.

#### *5.4. Solving the TTOI Problem with Particle Swarm Optimization*

Particle swarm optimization (PSO) is easy to implement and there is no need to adjust a lot of parameters compared with other optimization algorithms. Also, it does not need gradient information, which is an effective tool to solve the optimal combination [17–19]. In the algorithm, each individual is a particle, and each particle represents a potential solution. Assuming that *zi* = (*zi*1, *zi*2, ..., *ziD*) is the D-dimensional position vector of the i-th particle, the current fitness value of *zi* can be calculated according to the fitness function so that the position of the particle can be measured. The process of calculating the minimum length of the spray path length can be selected as the fitness function in the TTOI problem. *vi* = (*vi*1, *vi*2, ..., *viD*) is the flying speed of particle i; that is, the distance of particle movement. *pi* = (*pi*1, *pi*2, ..., *piD*) is the optimal position of the particle to date. *pg* = *pg*1, *pg*2, ..., *pgD* is the optimal position of the particle swarm to date. In each iteration, the velocity and position of the particles can be updated according to:

$$
\sigma\_{\rm id}^{k+1} = \sigma\_{\rm id}^k + c\_1 r\_1 (p\_{\rm id} - z\_{\rm id}^k) + c\_2 r\_2 (p\_{\rm gd} - z\_{\rm id}^k) \tag{24}
$$

$$z\_{id}^{k+1} = z\_{id}^k + v\_{id}^{k+1} \tag{25}$$

where *i* = 1, 2, ... , *m*, *d* = 1, 2 ... *D*, *r*<sup>1</sup> and *r*<sup>2</sup> are random numbers between [0, 1], and *c*<sup>1</sup> and *c*<sup>2</sup> are learning factors. Thus, the particle swarm algorithm of the TTOI problem is shown in the Figure 10.

Assuming that the 3D entity artifact is divided into five patches, the number of edges in the connection graph, G, shown in Figure 5 is five and the number of vertices is *m* = 10. To guarantee the precision of the algorithm, the maximum number of cycles is *Nmax* = 100. To ensure that the particle does not skip the optimal solution and can search the search space sufficiently, let *ε* = 1000. To ensure accuracy and reduce the amount of calculation, take the number of particles as 20. The learning factors, *c*<sup>1</sup> and *c*2, can make the particles have self-summary and the ability to learn from the outstanding individuals in the group, to be close to the best in their own history and the history within the group. These two parameters have little effect on the convergence of the algorithm, but if we adjust these two parameters properly, we can reach the convergence faster. After adjusting the values of *c*<sup>1</sup> and

*c*<sup>2</sup> several times and analyzing the effect of *c*<sup>1</sup> and *c*<sup>2</sup> on the optimal fitness, we can conclude that *c*<sup>1</sup> = *c*<sup>2</sup> = 2 is a better choice for the TTOI problem. Figure 11 shows the evolution of the optimal solution obtained from the algorithm. It can be seen from the figure that the length of the spray painting trajectory decreases monotonously with the evolutionary process, and finally tends to a definite value. As the spray painting robot does not consider the obstacle avoidance during the working process, the environment information is known and is relatively simple, so the particle swarm algorithm converges faster. It can be seen from Figure 11 that after about 80 generations of evolution, the length of the spray painting trajectory does not change and the algorithm converges.

**Figure 10.** Flow chart of the PSO of the TTOI problem.

**Figure 11.** Simulation result of the PSO algorithm.

#### *5.5. Comparison of Algorithms*

In this paper, according to the built coating accumulation model: The finite range model, then FPAG (flat patch adjacency graph) is used to treat spraying workpiece, and the workpiece surface is divided into numerous patches. In each patch, the trajectory optimization algorithm in part 3 is used to obtain the trajectory. Finally, the genetic algorithm, ant colony algorithm, and particle swarm optimization algorithm are respectively used to obtain the optimal path to connect all patches to complete the spraying work. Assuming the ideal paint thickness of *qd* = 50 μm, the maximum allowable deviation of paint thickness of *qw* = 10 μm, the bottom radius of the cone paint sprayed by the gun is *R* = 60 mm. The cumulative rate of the paint is obtained from the spray test data on the plate [12]:

$$f(r) = \frac{1}{15}(R^2 - r^2)\mu\text{m/s} \tag{26}$$

After generating and optimizing the spray painting trajectories on the plate, the spray painting rate (at uniform speed) and the width of the overlapping area of the paint for each of the two spray strokes are obtained, which are *v* = 256.3 mm/s, *d* = 50.2 mm, respectively.

As shown in Figure 12, taking the experimental workpiece as an example, the workpiece can be regarded as a 3D entity. In the experiment, suppose the vertical distance from the gun to the workpiece surface is H, the bias algorithm can be used to obtain the spray path. Each spray parameter setting in the optimal algorithm of the spray painting trajectory are as follows: The ideal paint thickness is *qd* = 50 μm, the maximum allowable deviation of thickness is *qw* = ±10 μm, the spray radius is *R* = 60 mm, the spray distance is *H* = 100 mm, and the velocity of the spray is *v* = 256.3 mm/s. The workpiece surface is divided into five patches after the modeling work and the path at the junction of the two patches is the PA-PA mode. GA, ACO, and PSO are used to carry out experiments when solving the TTOI. The parameters of the GA algorithm are set as: Population size is *Psize* = 100, crossover probability is *xrate* = 0.20, mutation probability is *mrate* = 0.05, and the maximum number of evolutionary generation is *T* = 100. The parameters of the ACO algorithm are set as: The number of ants is *m* = 10, parameter *α* = 1, parameter *β* = 5, the volatilization rate of the pheromone is *ρ* = 0.5, constant *Q* = 100, and the maximum number of iterations is *Nmax* = 100. The parameters of the PSO algorithm are set as: The maximum speed threshold is *ε* = 1000, the number of particles is 20, learning factors are *c*<sup>1</sup> = *c*<sup>2</sup> = 2, and the maximum number of iterations is *Nmax* = 100.

**Figure 12.** The experimental workpiece.

The self-developed offline programming system of the spray painting robot is used for the spray test [20,21]. The part of the optimized trajectories of different patches on the workpiece surface are shown in Figure 13. The paint thicknesses of 400 discrete points are measured by the paint thickness gauge after the spray painting operation. The paint thickness at the sample point after spray painting in the optimized trajectory is shown in Figure 14, among which the maximum paint thickness is *qmax* = 55.6 μm, and the minimum paint thickness is *qmin* = 45.5 μm. It can be seen that all the paint thicknesses at the sampling point are within the maximum allowable deviation of the paint thickness *qw*, which is in line with the requirements of the spray quality.

**Figure 13.** (**a**) Optimized trajectory at the side of the patch; (**b**) optimized trajectory at the top of the patch.

**Figure 14.** The paint thickness at the sampling points.

From the point of view of spray painting efficiency, the compared results of GA, ACO, PSO, and random combination are shown in Table 1. It can be seen from the results that the total length of the spray trajectory using the PSO algorithm is the shortest, the spray painting time is the least, and the execution time of the system operation is the longest, which is within the allowable range in the practical application. For the workpiece, the spray painting time was reduced by 23% compared to a random combination. The total trajectories of GA and ACO are shorter than those of the random combination, and the spray painting time are reduced by 16% and 20%, respectively. It should be noted that the workpiece used here is only divided into five patches. For the workpiece, which is more complex and has more patches, the advantages of using the PSO algorithm in saving painting time will be more obvious, but the execution time of the offline programming system will be longer. Therefore, if the real-time performance of the system operations can meet the practical application requirements, the PSO algorithm will be the best choice. Otherwise, the GA algorithm or ACO algorithm can be taken into consideration.

**Table 1.** The compared results of different algorithms.


#### *5.6. Spray Painting Experiment*

As shown in Figures 15–18, taking the automotive body of a brand is as the paint objective. The thickness of the coating on each sampling point is shown in Figure 19. Each spray parameter settings in the optimal algorithm of the spray painting trajectory are as follows: The ideal paint thickness is *qd* = 50 μm, the maximum allowable deviation of the thickness is *qw* = ±10 μm, the spray radius is *R* = 60 mm, the spray distance is *H* = 80 mm, and the velocity of the spray is *v* = 389 mm/s. The left side of the car surface is divided into 15 patches and the trajectory of each patch is in a different

color. GA, ACO, and PSO are used to carry out experiments when solving the TTOI. The parameters of the GA algorithm are set as: Population size is *Psize* = 100, crossover probability is *xrate* = 0.20, mutation probability is *mrate* = 0.05, and the maximum number of evolutionary generation is *T* = 200. The parameters of the ACO algorithm are set as: The number of ants is *m* = 10, parameter *α* = 1, parameter *β* = 5, the volatilization rate of pheromone is *ρ* = 0.5, constant *Q* = 100, and the maximum number of iterations is *Nmax* = 200. The parameters of the PSO algorithm are set as: The maximum speed threshold is *ε* = 1000, the number of particles is 20, learning factors *c*<sup>1</sup> = *c*<sup>2</sup> = 2, and the maximum number of iterations is *Nmax* = 200.

**Figure 15.** An automobile paint line composed by several spray-painting robots.

**Figure 16.** Spray painting trajectory on the roof part of the automobile.

**Figure 17.** Spray painting trajectory on the left part and the left rear part of the automobile.

**Figure 18.** Spray painting trajectory on the right part and the right rear part of the automobile.

**Figure 19.** The paint thickness at the sampling points on the automobile body.

From the point of view of spray painting efficiency, the compared results of the GA, ACO, PSO, and random combination are shown in Table 2. It can be seen from the results that the total length of the spray trajectory using the PSO algorithm is the shortest, the spray painting time is the least, and the execution time of the system operation is the longest, which is within the allowable range in the practical application.

**Table 2.** The compared results of different algorithms.


#### **6. Conclusions**

In this paper, the spatial trajectory optimization method of a spray painting robot for 3D entity objects was proposed. Firstly, the finite range model of the paint deposition rate was established, and the 3D entity was sliced by the surface modeling method according to FPAG. Then, after planning the spray path on each patch, the variance of the paint thickness of the discrete point and the ideal paint thickness was taken as the objective function and the trajectory on each patch was optimized. The path at the junction of two patches was the PA-PA (parallel-parallel) mode. The improved GA algorithm, ACO algorithm, and PSO algorithm were used to solve the TTOI problem. The practicability of the algorithms was verified by simulation experiments. Finally, spraying experiments were conducted on the off-line programming experimental platform of the spraying robot, and the results of the three algorithms were studied. The results of the automobile body spraying experiments showed that the proposed trajectory optimization of the 3D entity spraying robot can completely satisfy the requirements of uniformity of the spraying thickness. More experimental data please see the Supplementary Materials.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2079-9292/8/1/74/s1.

**Author Contributions:** W.C. and Y.T. conceived and designed the experiments; W.C. performed the experiments; H.L. and J.L. analyzed the data; X.W. contributed reagents/materials/ analyzed the data; X.W. contributed reagents/materials/analysis tools; X.W. wrote the paper.

**Funding:** This research was supported by the National Natural Science Foundation of China under grant 61503162, "Six talent peaks" high-level talents projection Jiangsu Province under grant 2016GDZB021, Project funded by China Postdoctoral Science Foundation under grant 2016M601691, Industrial Foresight Project of Zhenjiang City under grant GY2018018.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Robust Visual Compass Using Hybrid Features for Indoor Environments**

**Ruibin Guo 1,\*, Keju Peng 1, Dongxiang Zhou <sup>1</sup> and Yunhui Liu <sup>2</sup>**


Received: 15 December 2018; Accepted: 7 February 2019; Published: 16 February 2019

**Abstract:** Orientation estimation is a crucial part of robotics tasks such as motion control, autonomous navigation, and 3D mapping. In this paper, we propose a robust visual-based method to estimate robots' drift-free orientation with RGB-D cameras. First, we detect and track hybrid features (i.e., plane, line, and point) from color and depth images, which provides reliable constraints even in uncharacteristic environments with low texture or no consistent lines. Then, we construct a cost function based on these features and, by minimizing this function, we obtain the accurate rotation matrix of each captured frame with respect to its reference keyframe. Furthermore, we present a vanishing direction-estimation method to extract the Manhattan World (MW) axes; by aligning the current MW axes with the global MW axes, we refine the aforementioned rotation matrix of each keyframe and achieve drift-free orientation. Experiments on public RGB-D datasets demonstrate the robustness and accuracy of the proposed algorithm for orientation estimation. In addition, we have applied our proposed visual compass to pose estimation, and the evaluation on public sequences shows improved accuracy.

**Keywords:** visual compass; orientation estimation; hybrid features; plane tracking; vanishing direction; Manhattan World; RGB-D camera

#### **1. Introduction**

Robust orientation estimation is of great significance in robotics tasks such as motion control, autonomous navigation, and 3D mapping. Orientation can be obtained by utilizing carried sensors like the wheel encoder, inertial measurement unit (IMU) [1–4], or cameras [5–7]. Among these solutions, the visual-based method [8–11] is effective, as cameras can conveniently capture informative images to estimate orientation and position. In the past decades, many simultaneous localization and mapping (SLAM) systems [12,13] and visual odometry (VO) methods [14,15] have been proposed. Payá et al. [5] proposed a global description method based on Radon Transform to estimate robots' position and orientation with the equipped catadioptric vision sensor. These methods show good performance in estimating orientation from captured images. However, local and global maps' construction or loop detection is needed in these approaches to reduce drift error.

For most indoor environments, there exist many parallel and orthogonal lines and planes (called the Manhattan World (MW) [16]). These structural regularities are exploited in studies to estimate drift-free rotation without previous complex techniques (map reconstruction and loop closure) [17–19]. Since 3D geometric structures can easily be calculated by using the camera that provides both depth information and color image with 3 channels (red, green, and blue), called the RGB-D camera. The RGB-D camera has become a popular alternative to monocular and stereo cameras for the purpose

of rotation estimation, and estimation accuracy has been prominently improved by using the MW assumption with a RGB-D camera [20–23]. However, a major disadvantage of these MW-based methods is that the number of lines and planes used for tracking the MW axes must be no less than 2, which is the minimal sampling for 3 degrees of freedom (DoF). In practice, robots often encounter harsh environments without lines, and only one plane can be visible, resulting in failure in tracking MW axes to estimate the camera orientation. To address these issues, we select some frames as keyframes and exploit hybrid features (i.e., plane, line, and point) to compute the rotation matrix of each captured frame with respect to its reference keyframe instead of directly aligning with the global MW axes.

In this paper, we propose a robust and accurate approach for orientation estimation using RGB-D cameras. We detected and tracked the normal vectors of multiple planes from depth images, and we detected and matched the line and point features from color images. Then, by utilizing these hybrid features (i.e., plane, line, and point), we constructed a cost function to solve the rotation matrix of each captured frame. Meanwhile, we selected keyframes to reduce drift error and avoid directly aligning each frame with the global MW. Furthermore, we extracted the MW axes based on the normal vectors of orthogonal planes and the vanishing directions of parallel lines, and by aligning the current MW axes with the global MW axes, we refined the aforementioned rotation matrix of each keyframe and achieved drift-free orientation. Experiments showed that our proposed method produces lower drift error in a variety of indoor sequences compared to other state-of-the-art methods.

Our algorithm exploits hybrid features and adds a refinement step for keyframes, which can provide robust and accurate rotation estimation, even in harsh environments, as well as general indoor environments. The contributions of this work are as follows:


#### **2. Related Work**

Pose estimation obtained by VO or V-SLAM systems has been extensively studied for the purpose of meaningful applications, such as autonomous robots and augmented reality. Rotation estimation is usually considered as a subproblem of pose estimation that consists of rotational and translation components. It has been gradually recognized by researchers that the main source of VO drift is inaccurate rotation estimation [19,20]. In the following discussion, we focus on rotation-estimation methods that exploit structural regularities with RGB-D cameras. These methods utilize surface normals, vanishing points (vanishing directions), or mixed constraints to compute camera orientation.

Surface-normal vectors were exploited to estimate camera orientation because their distribution on the unit sphere (or called Gaussian sphere) is regular and more likely around plane-normal vectors in the current environment, as shown in Figure 1. The work of Straub et al. [21] introduced the Manhattan-Frame model in the surface-normal space and proposed a real-time maximum a posteriori (MAP) inference algorithm to estimate drift-free orientation. Zhou et al. [23] developed a mean-shift paradigm to extract and track planar modes in surface-normal vector distribution on the unit sphere, and achieved drift-free behavior by registering the bundle of planar modes. In the work of Kim et al. [24], orthogonal planar structures were exploited and tracked with an efficient SO(3)-constrained mean-shift algorithm to estimate drift-free rotation. These surface-normal-based methods can provide stable and accurate rotation estimation if the number of observed orthogonal planes is not less than two.

(**a**) RGB-D image pairs (**b**) 3D point cloud (**c**) Distribution of surface normals

**Figure 1.** Single RGB-D frame and distribution of surface normals corresponding to its 3D point cloud. (**a**) Frame 85 in 'LivingRoom0' sequence of ICL-NUIM dataset. (**b**) 3D point cloud obtained by back-projecting the depth information and coloring with aligned RGB pixels. (**c**) Distribution of surface normals on the unit sphere.

A vanishing point (VP) of a line is obtained by intersecting the image plane with a ray parallel to the world line and passing through the camera center, and it depends only on the direction of a line [25]. Two parallel lines determine a vanishing direction (VD), and the Euclidean 3D transformation of a VD is influenced only by rotation; the geometric relationships are shown in Figure 2, so VPs and VDs have been widely used for estimating rotation. Bazin et al. [17] proposed a three-line random-sample consensus (RANSAC) algorithm with the VP orthogonality constraint to estimate rotation. The work of Elloumi et al. [26] proposed a real-time pipeline for estimating camera orientation based on vanishing points for indoor navigation assistance on a smartphone. VP-based methods need a sufficient number of lines for estimating rotation, and accuracy performance is greatly affected by line-segment noise.

**Figure 2.** Three-dimensional geometric relationship between parallel lines and their vanishing direction. Gaussian sphere is a unit sphere on the center of a camera projection. Two parallel lines are projected onto the Gaussian sphere as two great circles, and vanishing direction is obtained by the cross project of these two great circles' normal vectors. Two parallel lines and their corresponding vanishing direction are drawn with red.

Hybrid approaches use both surface normals obtained in depth image and vanishing directions extracted in the RGB image to estimate rotation, which shows more robust performance. The method proposed by Kim et al. [22] exploited both line and plane primitives to deal with degenerate cases in surface-normal-based methods for stable and accurate zero-drift rotation estimation. In the work of Kim et al. [27], only a single line and a single plane in RANSAC were used to estimate camera orientation, and refinement is performed by minimizing the average orthogonal distance from the endpoints of the lines parallel to the MW axes once the initial rotation estimation is found. Bazin et al. [17] introduced a related one-line RANSAC for situations where the horizon plane is known.

#### **3. Proposed Method**

We propose a robust visual compass that exploits hybrid features (i.e., plane, line, and point) to estimate camera orientation with the RGB and depth-image pairs. Our proposed method has two main steps: (1) rotation matrix is estimated by tracking the hybrid features for each frame with respect to the reference keyframe (tracking step); and (2) refine the keyframe's initial rotation matrix by aligning with the global MW axes to achieve drift-free orientation (refinement step). The overview of our proposed method is shown in Figure 3.

**Figure 3.** Overview of our visual compass. We estimate camera rotation by tracking the plane, line, and point features. We refine the keyframe orientation by aligning the current Manhattan World (MW) axes with the global MW axes. Global MW axes is extracted from the first captured RGB-D frame.

#### *3.1. Rotation Estimation with Hybrid Features*

We simultaneously tracked multiple planes, lines, and points in the current environment, and we utilized tracked hybrid features to construct a cost function for estimating the current rotation matrix relative to its reference keyframe. This can provide camera rotation even in uncharacteristic scenes where there are no rich texture or no consistent visible lines.

#### 3.1.1. Multiple-Plane Detection and Tracking

We detected multiple planes from the depth image with a fast plane extraction algorithm [28]. The algorithm first constructs an initial graph that uniformly divides the depth image's point cloud into a set of nodes with size *H* × *W* in the image space. It then performs agglomerative hierarchical clustering on this graph to merge nodes belonging to the same plane. It final refines the extracted planes using pixel-wise region growing. With this approach, we obtain planes *P<sup>i</sup>* : (**n***i*, *di*), *i* = 1, ..., *m*, where **n***<sup>i</sup>* is the unit normal vector of the *i*-th plane, and *di* is the distance to the origin of the camera co-ordinate system for the current frame.

We tracked the normal vector of each detected plane with a mean shift algorithm [23] that operates based on the density distribution of initial nodes' normals on the Gaussian sphere, as shown in Figure 4. We calculated the normal vector of each initial node by the least-square fitting method, and the depth image was preprocessed by a box filter to obtain the stable vectors. We used the previous frame's tracked (detected) normal vectors as an initial value, and then performed the mean shift algorithm in the tangent plane of the Gaussian sphere to obtain the tracked results. It should be noted that the parallel planes have the same plane normal vector; we class them as the same plane cluster for rotation estimation.

If we only have plane primitives to estimate rotation, rotation matrix *R* with three degrees of freedom can be computed from no less than two such tracked norm vectors that are not parallel because each normal vector represents two independent constraints on *R*.

$$\mathcal{R} = \arg\min\_{i=1,\ldots,m} \sum\_{i=1,\ldots,m} \left\| \mathbf{R} \cdot \mathbf{n}\_i^{ref} - \mathbf{n}\_i^k \right\|\_2^2 \tag{1}$$

where **n***ref <sup>i</sup>* represents the *<sup>i</sup>*-th detected plane normal vector in the reference keyframe, and **<sup>n</sup>***<sup>k</sup> <sup>i</sup>* represents the tracked result of the *i*-th plane in frame *k*.

**Figure 4.** Result of tracking planes: (**a**) Depth image of Frame 741 in the 'LivingRoom2' sequence of the ICL-NUIM dataset. (**b**) Tracked normal vectors from the Gaussian sphere; black dots represent the normals of initial nodes. (**c**) Plane primitives in current frame. Tacked normal vectors and their corresponding planes in image domain are represented with the same color.

#### 3.1.2. Line and Point Detection and Matching

We used a linear-time Line Segment Detector (LSD) [29] to extract 2D line segments on the color image and obtain 2D line segments set **l** = {*li*, *i* = 1, 2, ..., *n*}, where *li* is the *i*-th line segment: *y* = *kix* + *bi*, with starting point **u***si* = (*usi*, *vsi*) and ending point **u***ei* = (*uei*, *vei*). Pixels belonging to the line segment *li* are: ˜ <sup>=</sup> {**u**|**<sup>u</sup>** <sup>∈</sup> *li* <sup>∧</sup> **<sup>u</sup>** <sup>∈</sup> <sup>Ω</sup>}, <sup>Ω</sup> is the image domain. Then, we reconstructed their corresponding 3D points set: **<sup>P</sup>***<sup>l</sup>* <sup>=</sup> {**p**|**<sup>p</sup>** <sup>=</sup> *<sup>π</sup>*−1(**u**, *<sup>d</sup>*(**u**)), **<sup>p</sup>** <sup>∈</sup> *<sup>R</sup>*<sup>3</sup> <sup>∧</sup> **<sup>u</sup>** <sup>∈</sup> }˜ , where *<sup>d</sup>*(**u**) represents the corresponding depth value of pixel **u** in the color domain, and *π*−1(**u**, *d*(**u**)) = *d*(**u**)( *<sup>u</sup>*−*cx fx* , *v*−*cy fy* , 1)*<sup>T</sup>* is the inverse projection function for a camera model, with *fx*, *fy* being the focal lengths on the *x* axis and *y* axis, and (*cx*, *cy*)*<sup>T</sup>* is the camera's centre co-ordinates. Finally, the RANSAC method is used to fit 3D lines *Li* : (**d***i*, **p***i*), where **d***<sup>i</sup>* represents the *i*-th line's 3D direction, and **p***<sup>i</sup>* represents a point in this 3D line.

We match the lines that were respectively extracted from the current frame and the reference keyframe based on the Line Band Descriptor (LBD) [30], and two pairs of matching lines that are not parallel are needed to estimate the rotation matrix in case that there are only line primitives obtained in the current environment.

$$\mathcal{R} = \arg\min\_{i=1\ldots n} \sum\_{i=1\ldots n} \left\| \mathbf{R} \cdot \mathbf{d}\_i^{ref} - \mathbf{d}\_i^k \right\|\_2^2 \tag{2}$$

where **d***ref <sup>i</sup>* represents the *<sup>i</sup>*-th detected 3D line direction in the reference keyframe, and **<sup>d</sup>***<sup>k</sup> <sup>i</sup>* represents the matching line direction in frame *k*.

In addition to the plane and line features, we extracted and matched the oriented fast and rotated brief (ORB) features for point tracking, as these features are extremely fast to compute and match, and they present good invariance to camera autogain, autoexposure, and illumination changes. We used the epipolar constraint method to optimize the initial matching pairs by ORB descriptors; the optimized results can provide reliable constraints to estimate the camera pose.

$$\mathcal{R} = \arg\min\_{i.\dots} \sum\_{i.\dots} \left\| \left( \mathbf{R} \cdot \boldsymbol{\pi}^{-1} \left( \mathbf{u}\_i^{\text{ref}}, d(\mathbf{u}\_i^{\text{ref}}) \right) + \mathbf{t} \right) - \boldsymbol{\pi}^{-1} (\mathbf{u}\_i^k, d(\mathbf{u}\_i^k)) \right\|\_{2}^2 \tag{3}$$

where **u***ref <sup>i</sup>* represents the 2D position of *i*-th detected ORB point feature in the reference keyframe and **u***<sup>k</sup> <sup>i</sup>* represents the 2D position of the matching point in frame *k*. It should be noted that translation component **t** could be obtained by solving Equation (3), but we did not consider it, as our visual compass mainly focused on camera orientation.

#### 3.1.3. Robust Rotation Estimation

We jointly utilized the tracked planes, lines, and points in the current environment to estimate the rotation matrix with respect to the reference keyframe. Rotation matrix *R* can be computed by solving:

$$\mathbf{R} = \arg\min\_{i=1\ldots m} \left\| \sum\_{i=1\ldots m} \lambda\_i^{\mathbb{P}} \left\| \mathbf{R} \cdot \mathbf{n}\_i^{\mathbb{P}f} - \mathbf{n}\_i^k \right\|\_2^2 + \sum\_{i=1\ldots n} \lambda\_i^L \left\| \mathbf{R} \cdot \mathbf{d}\_i^{\mathbb{P}f} - \mathbf{d}\_i^k \right\|\_2^2 + \sum\_{i=1\ldots N} \left\| \left( \mathbf{R} \cdot \mathbf{P}\_i^{\mathbb{P}f} + \mathbf{t} \right) - P\_l^k \right\|\_2^2 \tag{4}$$

where *λ<sup>P</sup> <sup>i</sup>* represents the number of pixels contained in the *<sup>i</sup>*-th tracked plane, and *<sup>λ</sup><sup>L</sup> <sup>i</sup>* represents the number of pixels contained in the *i*-th tracked line, 3D points *Pref <sup>i</sup>* <sup>=</sup> *<sup>π</sup>*−1(**u***ref <sup>i</sup>* , *<sup>d</sup>*(**u***ref <sup>i</sup>* )), and *Pk <sup>i</sup>* = *<sup>π</sup>*−1(**u***<sup>k</sup> <sup>i</sup>* , *<sup>d</sup>*(**u***<sup>k</sup> <sup>i</sup>* )).

Cost function Equation (4) contains three parts, corresponding to plane, line, and point constraints. A stable and accurate rotation matrix can be solved by minimizing Equation (4) with line and plane constraints jointly in the texture-less environment that few points tracked, and the point constraints ensure that the rotation estimation is reliable in the scenes that no consistent lines or only one plane to be visible.

**Keyframe Selection:** By the tracking step, we constantly know the number of the tracked planes, lines and points for each frame. If there is only one tracked plane with the condition that the number of normal vectors on the Gaussian sphere around this tracked normal vector is too low, and the number of tracked points is less than a threshold, we reuse the fast plane extraction method and Line Segment Detector (LSD) method to detect planes and lines in the current frame. If the number of redetected orthogonal planes and lines is larger than 2, this frame is selected as a keyframe and performs the following refinement step.

#### *3.2. Drift-Free Orientation Estimation*

The previous tracking step estimates the rotation matrix between the current frame and its reference keyframe, and it is obvious that the accuracy of the reference keyframe's orientation directly affects the accuracy of the current frame's rotation matrix. To reduce drift error, we sought global MW axes in the first frame and refined each keyframe's orientation by aligning the current extracted MW axes with the global MW axes to achieve drift-free rotation in MW scenes. We sought current and global MW axes based on the plane normal vectors, and the vanishing directions of the parallel lines. Plane normal vectors can be directly obtained by the previous fast plane extraction method, and we propose a novel vanishing direction extraction method as follows.

#### 3.2.1. Vanishing Direction Extraction

To extract accurate VDs, we need to cluster lines that are parallel in the real world. We used the simplified Expectation–Maximization (EM) clustering method to group image lines and compute their corresponding 3D direction vectors. The original EM algorithm iterates the expectation and the maximization steps. In our simplified algorithm, we skip the expectation phase and roughly cluster the lines based on the K-means method [31], with the Euclidean distance of all extracted lines' 3D directions that are represented as 3D points. In the maximization phase, direction vectors are estimated by maximizing the objective function:

$$\mathbf{d}\_k^v = \arg\max\_i \prod\_i p(\mathbf{d}\_k^v | \mathbf{l}\_i^{(k)}) \tag{5}$$

where **l** (*k*) *<sup>i</sup>* represents the *<sup>i</sup>*-th 2D line segment in the *<sup>k</sup>*-th initial classification, **<sup>d</sup>***<sup>v</sup> <sup>k</sup>* represents the VD of the *k*-th initial classification to be optimized, and the *p*(**d***<sup>v</sup> k* |**l** (*k*) *<sup>i</sup>* ) represents the posterior likelihood of the VD.

Using the Bayes formula, the posterior likelihood of the VD is expressed as:

$$p(\mathbf{d}\_k^v | \mathbf{l}\_i^{(k)}) = \frac{p(\mathbf{l}\_i^{(k)} | \mathbf{d}\_k^v) p(\mathbf{d}\_k^v)}{p(\mathbf{l}\_i^{(k)})} \tag{6}$$

where *p*(**l** (*k*) *<sup>i</sup>* |**d***<sup>v</sup> <sup>k</sup>* ) represents the prior probability of the VD, and *<sup>p</sup>*(**d***<sup>v</sup> <sup>k</sup>* ) models the potential knowledge of the VD before we obtain the measurement. If we know nothing, *p*(**d***<sup>v</sup> <sup>k</sup>* ) is defined as uniform distribution with a constant value. Therefore, the VD can be obtained by maximizing prior probability:

$$\mathbf{d}\_{k}^{\upsilon} = \arg\max\_{i} \prod\_{i} p(\mathbf{d}\_{k}^{\upsilon} | \mathbf{l}\_{i}^{(k)}) = \arg\max\_{i} \sum\_{i} \log p(\mathbf{l}\_{i}^{(k)} | \mathbf{d}\_{k}^{\upsilon}) \tag{7}$$

Prior probability *p*(**l** (*k*) *<sup>i</sup>* |**d***<sup>v</sup> <sup>k</sup>* ) is defined as:

$$p(\mathbf{l}\_i^{(k)}|\mathbf{d}\_k^v) = \frac{1}{\sqrt{2\pi\sigma\_k^2}} (\frac{-(\mathbf{l}\_i^{(k)T}\mathbf{K}\mathbf{d}\_k^v)^2}{2\sigma\_k^2})\tag{8}$$

where **K** represents the internal camera parameters. Equation (8) reflects the fact that the vanishing direction is perpendicular to the plane normal of a great circle that is determined by an image line **l** (*k*) *i* and the center of the projection of the camera, as shown in Figure 2.

Maximizing objective function Equation (7) is equivalent to solving a weighted least-squares problem for each **d***<sup>v</sup> k* :

$$\mathbf{d}\_{k}^{v} = \arg\min\_{i} \sum\_{i} \frac{length(\mathbf{l}\_{i}^{(k)})}{\max(length(\mathbf{l}^{(k)}))} \cdot (\mathbf{l}\_{i}^{(k)\mathbf{T}} \mathbf{K} \mathbf{d}\_{k}^{v})^{2} \tag{9}$$

where *length*(**l** (*k*) *<sup>i</sup>* ) represents the length of the *i*-th line, and *max*(*length*(**l** (*k*))) represents the maximum line length in rough cluster *k*. Length coefficient is considered in the term because the longer the lines are, the more reliable they are. By solving Equation (9), we can obtain the initial vanishing direction and the residual for each line. To obtain a more accurate vanishing direction, we discarded lines with a larger residual than a threshold and added additional optimization. Optimized parallel lines are used to estimate the final vanishing directions. Figure 5 shows two results of parallel-line clustering obtained by our proposed method.

**Figure 5.** Results of parallel-line clustering that are used to compute vanishing directions: (**a**) the 120-th image in 'Living Room 2' sequence. (**b**) the 64-th image in 'Office Room 1' sequence. Lines with the same color are parallel in the real world. The vanishing directions obtained by these parallel lines can provide accurate and reliable constraints for MW extraction.

#### 3.2.2. Global Manhattan World Seeking

The Manhattan World assumption is suitable in human-made indoor environments, and we sought global MW axes based on plane normal vectors and the vanishing directions from the first frame. MW axes can be expressed as columns of a 3D rotation matrix *R* = [**r** *g* 1 **r** *g* 2 **r** *g* <sup>3</sup> ] ∈ *SO*(3).

We first set the the detected plane normal vectors and the vanishing directions from the parallel lines as the candidate MW axes, which return a redundant set. In fact, most of the pixels in the frame typically belong to the planes and lines that determine the dominant MW axes, and we sought the plane that contained the largest pixels, and set its plane normal vector **r**<sup>1</sup> as the first MW axis. The remaining two axes **r**<sup>2</sup> and **r**<sup>3</sup> are determined based on the orthogonality constraint with the first axis and the number of the pixels belonging to the detected planes or the parallel lines.

If we detect three mutually orthogonal planes in the first frame, we directly set their plane-normal vectors as the MW axes. In the case that there are only two orthogonal planes, the third axis is determined by the vanishing direction from the parallel lines that is orthogonal with the two previous plane-normal vectors. Similarly, if only one plane is detected in the first frame, we sought the remaining two MW axes by the vanishing directions from the orthogonal parallel lines.

Initial MW axes [**r**<sup>1</sup> **r**<sup>2</sup> **r**3], obtained by the previous step, are not strictly orthogonal. We maintained orthogonality by reprojecting the MW axes onto the closest matrix on *SO*(3). Each axis is reweighted by a factor *λ<sup>i</sup>* that is determined by the number of pixels belong to this axis' corresponding planes or parallel lines. The final global MW axes are obtained by using singular=value decomposition (SVD):

$$\mathbb{E}\left[\mathbf{r}\_1^{\mathcal{S}} \ \mathbf{r}\_2^{\mathcal{S}} \ \mathbf{r}\_3^{\mathcal{S}}\right] = \mathbf{U} \mathbf{V}^T \tag{10}$$

where [**U**, **D**, **V**] = *SVD*([*λ*1**r**<sup>1</sup> *λ*2**r**<sup>2</sup> *λ*3**r**3]) and factor *λ<sup>i</sup>* describes how certain the observation of a direction is.

#### 3.2.3. Keyframe Orientation Refinement

For each keyframe, we refined its rotation matrix by aligning the current extracted MW axes with the global MW axes. We first used the fast plane-extraction algorithm and our proposed spatial-line direction estimation method to extract plane-normal vectors and vanishing directions in the current keyframe. We then extracted the current MW axes (**r***<sup>c</sup> <sup>i</sup>* , *i* = 1, 2, 3) by using the same method that we used to extract global MW axes. We finally determine the corresponding pairs based on the Euclidean distance between vector **r***<sup>c</sup> <sup>i</sup>* and **Rr***<sup>g</sup> <sup>j</sup>* , where **R** represents the rotation matrix obtained by tracking steps for the current keyframe. Refined rotation matrix **R***<sup>r</sup>* is computed by solving:

$$\mathbf{R}^r = \arg\min\_{i=1,2,3} \sum\_{i=1,2,3} \left\| \mathbf{R}^r \mathbf{r}\_i^S - \mathbf{r}\_i^c \right\|\_2^2 \tag{11}$$

where **r** *g <sup>i</sup>* and **<sup>r</sup>***<sup>c</sup> <sup>i</sup>* represent the *i*-th global MW axis and the current extracted corresponding MW axis.

#### **4. Results**

We evaluate our proposed approach on the synthetic dataset (ICL-NUIM [32]), real-world dataset (TUM RGB-D [33]), and pose-estimation application, respectively. All experiments were run on a desktop computer with an Intel Core i7, 16 GB memory, and Ubuntu 16.04 platform.


We compared our proposed approach with two state-of-the-art MW-based methods proposed by Zhou et al. [23] and Kim et al. [27], namely, orthogonal planes based rotation estimation (OPRE) and 1P1L. OPRE estimates absolute and drift-free rotation by exploiting orthogonal planes from depth images. 1P1L estimates 3DoF drift-free rotational motion with only a single line and plane in the Manhattan world. We used the average value of the absolute rotation error (ARE) in degrees as the performance metric for the entire sequences:

$$\text{AREAaverage} = \frac{1}{N} \sum\_{i=1}^{N} \arccos(\frac{tr(\mathbf{R}\_i^T \cdot \mathbf{R}\_i^{\mathcal{S}}) - 1}{2}) \times 57.3 \tag{12}$$

where *tr*() denotes the trace of a matrix, **<sup>R</sup>***<sup>i</sup>* and **<sup>R</sup>***<sup>g</sup> <sup>i</sup>* represent the estimated and ground rotation matrix for the *i*-th frame, respectively, and *N* represents the number of frames in the tested sequence.

#### *4.1. Evaluation on Synthetic Dataset*

We first tested the performance of our proposed algorithm on the ICL-NUIM dataset, and measured the average ARE in degrees for each sequence; evaluation results are shown in Table 1. The smallest average ARE values are bolded, which reveals that our proposed method is more accurate than the two other methods. For example, in 'Office Room 0', the average ARE of our proposed method is 0.16 degrees, while that of 1P1L and OPRE is 0.37 and 0.18 degrees, respectively. Our method outperformed the others in all cases in the ICL-NUIM benchmark. The main reason is that we jointly exploited the plane, line, and point features to estimate camera orientation even when the camera moves in scenes with no consistent lines, or where only one plane is visible; this is illustrated in Figure 6. In 'Living Room 0', the OPRE method failed to estimate the rotation for the entire sequence because only one plane can be visible in some frames; we marked the result as '×' in Table 1. The last column in Table 1 shows the number of frames in the current sequence.

**Figure 6.** Results of camera orientation estimated by our proposed method on the ICL-NUIM dataset: (**a**) 'Living Room 0' sequence. (**b**) 'Office Room 0' sequence. The estimated orientation of each frame is shown in the bottom of the RGB image. Colored thick and thin lines respectively denote current orientation and global MW axes (determined in the first frame); black lines represent ground orientation.


**Table 1.** Comparison of average absolute rotation error (ARE, degrees) on the ICL-NUIM dataset.

The MW assumption is sufficiently suitable for the ICL-NUIM benchmark and we used the refinement step to achieve a drift-free rotation matrix. To clearly show the effect of the refinement step, we measured the ARE values in degrees for all sequences by our method without a refinement step, which corresponds to the 'No Refinement' column in Table 1. We recorded the values of absolute rotation error (ARE) for each frame in the 'Living Room 0' sequence, and the final rotation drift with and without refinement was 0.34 and 1.43 degrees, respectively, as shown in Figure 7. This demonstrates that the refinement step can effectively reduce rotation drift.

**Figure 7.** Performance evaluation for refinement on the 'Living Room 0' sequence: (**a**–**c**) Roll, pitch, and yaw angles estimated by the proposed method with and without a refinement step for each frame. (**d**) Absolute rotation errors for our proposed methods with and without a refinement step. This shows that the refinement step can effectively reduce rotation drift.

#### *4.2. Evaluation on Real-World Data*

We compared the performance of our proposed algorithm with other two methods on seven real-world TUM RGB-D sequences that contained structural regularities. Comparison results are shown in Table 2. We provide the average ARE for rotation estimation, and the smallest values are indicated in bold. Our proposed method showed better performance in low-texture environments such as 'fr3\_struc\_notex' and 'fr3\_cabinet' because we used hybrid features to estimate orientation, as shown in Figure 8. Our method can also provide a more accurate rotation matrix in an environment with imperfect MW structure like 'fr3\_nostruc\_tex' and 'fr3\_nostruc\_notex', whereas OPRE fails because it requires at least two orthogonal planes to estimate camera orientation.


**Table 2.** Comparison of average ARE (degrees) on TUM RGBD Dataset.

The result of refinement performance on the 'fr3\_cabinet' sequence is shown in Figure 9. Final rotation drift with and without refinement was 1.30 and 1.62 degrees, respectively. It is clear that our refinement step can effectively reduce drift error. The average ARE values computed by our proposed algorithm with and without refinement step were the same in sequences 'fr3\_nostruc\_tex' and 'fr3\_nostruc\_notex'. The reason is that there were no perfect global MW axes extracted in the first frame, and the refinement step was not implemented.

**Figure 8.** Results of camera orientation estimated by our proposed method on the TUM RGB-D dataset: (**a**) 'fr3\_cabinet' sequence. (**b**) 'fr3\_struc\_notex' sequence. Estimated orientation of each frame is shown in the bottom of the RGB image. Colored thick and thin lines respectively denote current orientation and global MW axes (determined in the first frame); black lines represent ground orientation.

**Figure 9.** *Cont.*

**Figure 9.** Performance evaluation for refinement on the 'fr3\_cabinet' sequence: (**a**–**c**) Roll, pitch, and yaw angles estimated by our proposed method with and without a refinement step for each frame. (**d**) Absolute rotation errors for the proposed methods with and with refinement step. This shows that the refinement step can effectively reduce rotation drift.

#### *4.3. Application to Pose Estimation*

To further verify the practicability of our proposed visual compass, we used it for pose-estimation application and recorded the estimated trajectories. The three-dimensional pose has six degrees of freedom (DoF) and it consists of 3-DoF rotation and 3-DoF translation. As our proposed visual compass method can provide accurate rotation estimation, the key to performing pose localization is to estimate the translation component. We first detected and tracked ORB feature points to obtain point correspondences. Then, we recovered the 3-DoF translational motion of the images by minimizing:

$$\mathbf{t} = \arg\min\_{i=1\ldots N} \left\| \left( \mathbf{R}^{fixed} \cdot \mathbf{P}\_i^{ref} + \mathbf{t} \right) - \mathbf{P}\_i^k \right\|\_2^2 \tag{13}$$

where *Rfixed* represents the rotation matrix between the reference image and the current image, and it is obtained by our proposed visual compass, Three-dimensional points *Pref <sup>i</sup>* and *<sup>P</sup><sup>k</sup> <sup>i</sup>* are described in Equation (4).

We tested pose estimation on four datasets, "Living Room 2", "Office Room 3", "fr3\_struc\_tex", and "fr3\_nostruc\_tex". These datasets provide the ground-truth pose for each image; we measured the root mean squared error (RMSE) of the absolute translational error (ATE) and compared it with state-of-the-art approaches, namely, ORB\_SLAM [12], dense visual odometry (DVO) [15], and line-plane based visual odometry (LPVO) [22]. The comparison of ATE.RMSE is shown in Table 3; the smallest error for each sequence is indicated in bold. Estimated trajectories are shown in Figure 10.

**Figure 10.** Estimated trajectories with the proposed (red) and ground truth (black) on four sequences: (**a**) Living Room2, (**b**) Office Room 3, (**c**) fr3\_struc\_tex, (**d**) fr3\_nostruc\_tex.


**Table 3.** Comparison of ATE.RMSE (unit: m).

#### **5. Conclusions**

We proposed a visual-based method to estimate robot orientation with RGB-D cameras. We exploited hybrid features providing reliable constraints to construct cost function for solving the initial rotation matrix. We presented a vanishing direction extraction method based on parallel lines and combined it with detected plane normals to seek global and current Manhattan World axes. We refined the orientation matrix of the selected keyframe with respect to the global MW axes, and achieved drift-free orientation. The proposed algorithm was tested on both synthetic as well as real-world publicly available RGB-D datasets, and we compared it with two state-of-the-art methods for orientation estimation. The results demonstrated the accuracy and robustness of our proposed method. Furthermore, we applied the proposed algorithm to pose estimation and recovered the translational motion by giving absolute camera orientation; evaluation on the public sequences showed improved accuracy. In summary, the proposed algorithm showed good performance in Manhattan World scenes, and it has significant applications on mobile robotics. In the future, we will exploit hybrid features to perform pose location and 3D mapping that can provide maps with a geometric structure and more robust pose estimation in less-textured and -structured environments, and we will try to optimize the refinement step for not only pure Manhattan Worlds but also more general environments like Mixtures of Manhattan Worlds [34].

**Author Contributions:** Methodology, R.G.; resources, R.G.; software, R.G.; supervision, K.P. and D.Z.; validation, K.P. and Y.L.; writing—original draft, R.G.; writing—review and editing, D.Z. and Y.L.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Article*

## **Estimated Reaction Force-Based Bilateral Control between 3DOF Master and Hydraulic Slave Manipulators for Dismantlement**

#### **Karam Dad Kallu, Jie Wang, Saad Jamshed Abbasi and Min Cheol Lee \***

School of Mechanical Engineering, Pusan National University, Busandaehak-ro 63beon-gil, Geumjeong-gu, Busan 46241, Korea; karamdadkallu@gmail.com (K.D.K.); woaiwoffff@gmail.com (J.W.); saadjamshed93@gmail.com (S.J.A.)

**\*** Correspondence: mclee@pusan.ac.kr; Tel.: +82-10-3861-2688

Received: 4 August 2018; Accepted: 15 October 2018; Published: 16 October 2018

**Abstract:** This paper proposes a novel bilateral control design based on an estimated reaction force without a force sensor for a three-degree of freedom hydraulic servo system with master–slave manipulators. The proposed method is based upon sliding mode control with sliding perturbation observer (SMCSPO) using a bilateral control environment. The sliding perturbation observer (SPO) estimates the reaction force at the end effector and second link without using any sensors. The sliding mode control (SMC) is used as a bilateral controller for the robust position tracking and control of the slave device. A bilateral control strategy in a hydraulic servo system provides robust position and force tracking between master and slave. The difference between the reaction force of the slave produced by the effect of the remote environment and the operating force applied to the master by the operator is expressed in the target impedance model. The impedance model is applied to the master and allows the operator to feel the reaction force from the environment. This research experimentally verifies that the slave device can follow the trajectory of the master device using the proposed bilateral control strategy based on the estimated reaction force. This technique will be convenient for three or more degree of freedom (DOF) hydraulic servo systems used in dismantling nuclear power plants. It is worthy to mention that a camera is used for visual feedback on the safety of the environment and workspace.

**Keywords:** Hydraulic Servo System; SMCSPO; Bilateral Control; Estimated Reaction Force; Master–Slave Configuration and Nuclear Power Plant

### **1. Introduction**

This is an era for computer science and technology; thus, automation and remote operation requires time. The automated/remote system must consider, in detail, the manual operation to come up with an alternative solution that provides better control for the operator with instrumental feedback. Several conditions need to be considered when designing and implementing remote activities. These conditions include a clear workspace for designed equipment, the environmental conditions of the area of operation, handling of different types of materials, tool setup and change out among others. The crux of the story is to replace manual tools with a suitable remote system. A simple example is to modify grip operation by utilizing the end effector and implementing remote alignment and pinning methods and using self-standing bails. A successful remote design always inherits a defined workspace for equipment with minimum mechanical limitations to accomplish a task. Remote technology is a broader area that could be subdivided into different fields such as dismantling equipment, cutting tools, segmenting, sampling and workspace to handle the designed equipment. This field has some

state-of-the-art applications, e.g., dismantling of nuclear facilities which require more safety. Several articles in past research [1–8] have been published to address this problem. The Maestro robot system was developed in France to dismantle inactive nuclear reactors [9,10].

An alternative for such activities is tele-operation systems which have human control to ensure safety. In tele-operated systems, human personnel set visual/instrumental feedback from the manipulator to decide future tasks. The controller, in this case, acts according to the master–slave configuration. Bilateral control architecture is at favorable positions for such targets, and most research has presented two-channel architecture such as position–position (P-P), position–force (P-F), force–force (F-F) and force–position (F-P). Several other authors in the past [11–14] have presented bilateral controllers with more than a two-channel architecture.

Several researchers in the past have discussed the improvement in control architecture and the performance of controllers [15,16]. Yana et al. [17] have addressed the finite-time control problem for the bilateral tele-operated environment through resulting feedback. They proposed an observer to estimate a velocity profile by ensuring semi-global stability depending upon the resulting velocity error. A promising feature of this method is that it only utilizes position information which forces the master–slave synchronization error to approach null value in a defined time. Dinh et al. [18] have presented the idea of utilizing a joystick controller to be used for construction machinery control. The authors proposed a controller that is comprised of a force-reflecting gain tuner and a couple of adaptive controllers, namely, master and slave. They implemented a fuzzy logic technique to design local adaptive controllers. Additionally, they utilized an efficient optimization algorithm that provides a real feeling of interaction for an operator at a remote site. Ollin et al. [19] introduced the idea of communications delayed under-actuated mechanical systems under master–slave tele-operated control. They proposed a solution resulting in a particular compiling matrix that ensures the error dynamics refrained in the linear and non-linear parts inherit the matching condition to be satisfied. Later, these linear and non-linear parts are utilized to design discontinuous casual controllers to achieve bilateral co-ordination in position and time. Rabah et al. [20] proposed the idea of using a two adaptive fuzzy controller. The proposed algorithm not only adjusts membership function by the compensatory fuzzy controller but also implements a compensatory learning algorithm for optimal solutions. The first controller is designed under a compensatory neural–fuzzy interface whereas the other is designed under a compensatory adaptive neural–fuzzy interface system. The force–position scheme is utilized by incorporating a two-channel bilateral tele-operated architecture. Finally, the stability and transparency analysis is carried out under passivity framework [20]. Another design of a bilateral controller is presented in Umar et al. [21]. The design is based upon the state of convergence and was implemented for tele-operated systems. The authors applied Takagi–Sugeno's (TS) fuzzy model approach for approximation. The authors mentioned that SC-based bilateral controllers have several advantages in modeling and control design. The most advantageous feature is the ease to implement. The master–slave system in the modeling step can be performed by an *n-th* order differential equation. Similarly, its control design step can easily identify the gains required for the desired closed loop dynamics for a particular system.

Another novel control strategy is presented in [22], where the controller is not placed with the system under experiment. The authors analyzed the stability and transparency of the proposed tele-operators. They implemented PD-like controllers for fixed-time delays and P-like controllers for time-varying delays. Another interesting study by Kamran et al. [23], which utilizes a robust controller for master control and an adaptive back-stepping controller is designed for the slave controller. The authors analyzed the scenarios of input-delay uncertainties in the parameters and multi-objective optimization in implementing the robust master control. KD et al. [24] presented the estimating methodology of reaction force for assembly work with a robot manipulator consisting of a three-link dual-arm. They utilized sliding mode control with a sliding perturbation observer. In another study by Tayfun et al. [25], the authors presented bilateral control of a tele-operated system consisting of one master and two-slave systems. The master system was a 6DOF haptic robot while one of the slave

systems was a virtual 6DOF robot and another real industrial robot. A visual user interface was created to show the position and velocity profile for tele-operated control. They implemented Lyapunov stability methodology to analyze stability and position localization. Shafiqul et al. [26] investigated a bilateral control tele-operated scheme for a robotic system with unsymmetrical time-varying delays. The authors implemented an adaptive algorithm to determine relating factors between human and master manipulator and between slave and remote setup. Later, delayed estimated factors are fed back to the master and slave systems. The authors estimated and analyzed the impedance properties of the interaction between the human-based system and the remote environment. Further application of bilateral control are electro-pneumatic actuators and mobile robot [27,28].

Xiao et al. [29] presented the model mediated tele-operated approach. This proposed scheme has been designed to achieve stability and transparency while considering the random communication delays. Da Sun et al. [30], proposed a novel approach for tele-operated systems utilizing an extended prescribed performance control and a wave-based time domain passivity scheme. This scheme ensures synchronization of velocity, force, and localization. The stability and performance of the system are analyzed by the standard Lyapunov scheme. This methodology also ensures high tracking results of localization, velocity, and force. Similarly Azimifar et al. [31] presented a strategy to estimate the external force acting on master and slave systems. This scheme is advantageous with low-cost features and ease of implementation as the force sensors requirement is eliminated. They proposed a novel control for a nonlinear bilateral tele-operated system with time delays that estimates the force accordingly. The stability of the system is analyzed through the famous Lyapunov stability methodology. The tele-operation of remote system falls under the umbrella of bilateral control. It includes a master robot system that is operated by human operators and controlled through electric actuators whereas a slave robot system that is placed or installed at a remote location and is controlled through hydraulic actuators. Thus, to achieve an efficient bilateral control, three key features are necessary. The first is the coordinated control that is responsible for decoupling of position and force controllers. Second, is the linearization of hydraulic actuators. Third, is the agreement of system order of electric and hydraulic actuators by using a pseudo differentiator. Sho et al. [32] proposed a novel method with an efficient and accurate tracking performance and is the most stable in a contact control scenario among all three controllers.

Minou et al. [33] proposed a hybrid control algorithm for trajectory tracking with constant force. A non-linear model is implemented for the position controller to achieve predictive tracking, satisfying the input constraints. The authors analyzed the performance of the proposed controller and concluded that position tracking has a proof-mean-square-error (RMSE) of 0.89 mm whereas the catheter regulated the force with RMSE of 4.9 mN. Yaoyao et al. [34] presented a novel idea of trajectory control of an underwater vehicle-manipulator system by implementing a discrete time delay estimating methodology. Xia et al. [35] proposed a non-linear adaptive control algorithm for a non-linear manipulator system with uncertainties in its dynamics. Their algorithm guarantees accurate and efficient tracking. The proposed algorithm not only estimates the manipulator's dynamics but also determines the best fit dead-zone parameter in adaptation law. The estimated values are later used in proposed control law. The authors claimed and showed results suggesting that tracking is accurate and efficient even when both dynamics and dead-zone uncertainties appear at the same time. There are several studies in the literature which have presented the idea of utilizing force sensors to detect the outer forces in conventional bilateral control schemes, but such addition of force sensors has certain problems. A simple explanation to it is that a manipulator system with a force sensor acts as a two-mass resonant system. Thus, it is a bottleneck in realizing high-frequency force sensing [36,37]. The master and slave configuration of a manipulator system is highly applicable and suitable for nuclear power plants, because of restricted human access. In addition to this, it also requires exact control and dismantling/handling of a material/object with excessive load. Hydraulic systems are best suitable for such an application as they offer high power actuators.

In keeping a view of the above studies, in this study, we implemented the sliding mode control with sliding perturbation observer (SMCSPO). It is an efficient and robust control algorithm that not only estimates the reaction force of master and slave but also determines the bilateral control of the hydraulic servo system of a 3DOF master–slave robot. The reason for using a hydraulic servo system is that its power-to-weight ratio is better than any other type of actuated robot at the expense of positional accuracy. In this study, the sliding perturbation observer (SPO) is implemented to estimate the reaction force of the slave without using any sensor. The bilateral control scheme is implemented for efficient and accurate position and force tracking b/w master–slave configuration with visual feedback. In the bilateral controller, the difference of reaction force of the slave manipulator and operating force applied to the master manipulator is designed to target the impedance model. The reaction force of the slave is resultant of effects in the remote environment while operator force is applied by the operator (Human) at the master manipulator. The experimental results of studies endorse that the slave efficiently follows the position trajectory of the master system.

The manuscript is organized as follows: Section 2 of this paper presents the mechanical structure and dynamics of a 3DOF hydraulic servo system that could be utilized in the dismantling of nuclear power plants. Section 3 describes the sliding mode control with sliding perturbation observer (SMCSPO), and the reaction force estimation method is also presented in the same section. Section 4 is reserved for the description of mathematical details regarding bilateral control of master–slave 3DOF hydraulic servo system, and Section 5 presents its experimental setup. Section 6 presents the results of the performance of bilateral control through different experiments, and finally, Section 7 entails the concluding remarks to this study.

#### **2. Mechanical Design and Dynamics of Hydraulic Servo System**

It is a well-known fact that hydraulic servo systems have a favorable position in applications involving the machine tool industry. Some popular examples include handling hazardous material, remote equipment, steel factories, mining of materials, the exploration of oil, and the testing of automotive equipment, etc. A servo system is responsible, according to design, to control dynamical properties, such as; force, pressure, acceleration, electrical properties, etc. The attractive features of hydraulic servo systems include efficient response time, high torque and very few stroke characteristics. The hydraulic system has several advantages, but most prominent are accurate tracking of localization and acceleration, better stiffness features, null backlash, efficient response to sudden change, and less wear rate, among others. The hydraulic system has a better position among robotic systems to dismantle nuclear plants or hazardous plants. The hydraulic system, while used in nuclear plants, could be divided into two major parts, the first consists of a couple of hydraulic cylinders which formulate the vertical movement and the second consists of an AC servo motor responsible for horizontal changes. Two cylinders with a hydraulic mechanism is provided in the first part as much higher torque is required in the vertical direction. The mechanical design of shape and different sizes of a hydraulic servo system utilized in nuclear power plants is shown in Figures 1 and 2. The modelling of hydraulic actuator has been presented in Appendix A.

**Figure 1.** Structure of the 3DOF hydraulic servo system.

**Figure 2.** The 3DOF hydraulic servo system.

The schematic design of an end effector is displayed in Figure 3 and the schematic diagram of a hydraulic servo system's base and secondary link is shown in Figure 4.

**Figure 3.** Schematic diagram of end effector.

**Figure 4.** Schematic diagram of base and second link.

It is a well-known fact that a robotic system dynamic is a cross-relation between different forces, acceleration properties, and localization. Thus, the dynamical equation of a robotic manipulator in free space could be represented mathematically as,

$$T = A(\theta)\ddot{\theta} + B\left(\theta, \dot{\theta}\right) + \mathcal{g}(\theta) \tag{1}$$

where *θ*, *A*(*θ*), *B θ*, . *θ* , *g*(*θ*) and *T* are the vectors representing joints of angles, the matrix representing mass or inertia, the centrifugal/Coriolis torque, the gravity torque in joint space, and the vector of joint torques, respectively. Similarly, the equation repressing dynamical properties of different links in a hydraulic servo system could be represented as

$$(I\_{s1} + \Delta I\_{s1})\ddot{\theta}\_1 + (D\_{s1} + \Delta D\_{s1} + \beta\_1)\dot{\theta}\_1 + 0.5M\_{s1}L\_1g\sin\theta\_1 + \tau\_{t1} = T\_1 \tag{2}$$

$$\frac{1}{2}(I\_{\&2} + \Delta I\_{\&2})\ddot{\theta}\_2 + (D\_{\&2} + \Delta D\_{\&2})\dot{\theta}\_2 + M\_{\&2}L\_2g\cos\theta\_2 + \beta\_2\dot{\mathbf{x}} + \pi\_{\ell \to 2} + \lambda = T\_2\tag{3}$$

$$\frac{1}{2}(I\_{\xi3} + \Delta I\_{\xi3})\ddot{\theta}\_3 + (D\_{\xi3} + \Delta D\_{\xi3})\dot{\theta}\_3 + 0.5M\_{\xi3}L\_3g\sin\theta\_3 + \tau\_{\xi3} = T\_3\tag{4}$$

where *Js*1, *Js*<sup>2</sup> and *Js*<sup>3</sup> represent the inertia of the base, second link, and end effector respectively, similarly *Ds*1, *Ds*<sup>2</sup> and *Ds*<sup>3</sup> represents the damping characteristics of the base, second link, and end effector respectively. The uncertainty is represented by Δ, *β*1, and *β*<sup>2</sup> shows the viscosity of each cylinder in the first part. *Ms*1, *Ms*<sup>2</sup> and *Ms*<sup>3</sup> represent the masses of the base, second link and end effector, *L*<sup>1</sup> and *L*<sup>3</sup> are lengths of base and end effector, *L*<sup>2</sup> represents the length from joint to the centre of mass (COM) of the second link. *τe*1, *τe*<sup>2</sup> and *τe*<sup>3</sup> represent the reaction torque generated by contact with the environment and joints 1, 2 and 3 respectively, *λ* represents the dynamical properties regarding the base, . *<sup>θ</sup>*<sup>1</sup> is the velocity of the first cylinder and . *x* is the velocity of the second cylinder, and *T*1, *T*<sup>2</sup> and *T*<sup>3</sup> represent joint torques of the base, second link and end effector respectively.

#### **3. Sliding Mode Control with Sliding Perturbation Observer (SMCSPO)**

#### *3.1. Sliding Mode Control*

Several previous studies have shown that the implementation of a sliding perturbation observer (SPO) and a sliding mode control (SMC) at the same time provides certain attractive features which include efficient performance against perturbation by utilizing partial state feedback. The combination of these two with such attractive properties is known as sliding mode control with sliding perturbation observer [37]. In our previous work [38], we only estimated the reaction force of the slave at the end effector. In this study, three-link robotic manipulator actuators are controlled by implementing SMC, and the reaction force is determined by applying SPO. A generic mathematical representation of n-degrees of freedom system inheriting second order dynamics is as follow,

$$\ddot{\mathbf{x}}\_{j} = f\_{j}(\mathbf{x}) + \Delta f\_{j}(\mathbf{x}) + \sum\_{i=1}^{n} \left[ (b\_{ji}(\mathbf{x}) + \Delta b\_{ji}(\mathbf{x}))u\_{i} \right] + d\_{j}(t), \; j = 1, \; \dots, n \tag{5}$$

where **<sup>x</sup>** <sup>Δ</sup> = [**X**<sup>1</sup> ... **Xn**] *<sup>T</sup>* represents the state vector with **Xj** Δ = [*xj*, . *xj*] *T* , the non-linear driving force is represented by *fj*(**x**) with uncertainties Δ*fj*(**x**), the elements of the control gain matrix is represented by *bji* with corresponding uncertainties Δ*bji*, the external disturbance and control input is represented by *dj* and *uj*; respectively, and *fj*, *bji* are well known continuous functions of state in literature [39]. All of the uncertainties could be summed up, to represent perturbation as follow,

$$\psi\_{\dot{\jmath}}(\mathbf{x},t) = \Delta f\_{\dot{\jmath}}(\mathbf{x}) + \sum\_{i=1}^{n} [\Delta b\_{\dot{\jmath}i}(\mathbf{x})u\_i] + d\_{\dot{\jmath}}(t) \tag{6}$$

The objective of control is to force state x to a desired state **x***<sup>d</sup>* Δ = [**X**1*<sup>d</sup>* ... **X***nd*] *<sup>T</sup>* in the presence of perturbation. The upper bound for perturbation is defined by a known continuous function of state as follows

$$\Gamma\_j(\mathbf{x}, t) = F\_j(\mathbf{x}) + \Sigma \big| \Phi\_{ji}^{\prime}(\mathbf{x}) u\_i \big| + D\_j(t) > \left| \Psi\_j(t) \right| \tag{7}$$

where *Fj* > Δ*fj* , <sup>Φ</sup>*ji* <sup>&</sup>gt; Δ*bji* and *Dj* <sup>&</sup>gt; *dj* represent the expected upper bounds of the uncertainties. Let us suppose *fj*(**x**), defined in Equation (5), except perturbation of Equation (6) is represented as

$$f\_{\vec{f}}(\vec{\mathfrak{X}}) + \sum\_{i=1}^{n} b\_{\vec{j}\vec{i}}(\vec{\mathfrak{X}})u\_{i} = a\_{3\vec{j}}\overline{u}\_{\vec{j}} \tag{8}$$

where *α*3*<sup>j</sup>* is an arbitrary positive number, and *uj* is the new control variable. The equations of SPO are derived as [39]. .

$$\pounds\_{1\circ} = \pounds\_{2\circ} - k\_{1\circ}sat(\widecheck{x}\_{1\circ}) \tag{9}$$

$$
\dot{\mathfrak{X}}\_{2\dot{j}} = \mathfrak{a}\_3 \overline{\mathfrak{u}}\_{\dot{j}} - k\_{2\dot{j}} \text{sat}(\tilde{\mathfrak{x}}\_{1\dot{j}}) + \Psi\_{\dot{j}} \tag{10}
$$

$$
\dot{\mathfrak{x}}\_{3j} = \mathfrak{a}\_{3j}^2 \left( \mathfrak{u}\_j + \mathfrak{a}\_{3j}\mathfrak{x}\_{2j} - \mathfrak{x}\_{3j} \right) \tag{11}
$$

$$
\hat{\psi}\_{\rangle} = \mathfrak{a}\_{\mathfrak{J}\rangle} (\mathfrak{a}\_3 \mathfrak{f}\_{2\mathfrak{j}} - \mathfrak{f}\_{\mathfrak{J}\mathfrak{j}}) \tag{12}
$$

where *<sup>k</sup>*1*j*, *<sup>k</sup>*2*j*, *<sup>α</sup>*3*<sup>j</sup>* are positive numbers and *<sup>x</sup>* <sup>=</sup> *<sup>x</sup>*<sup>ˆ</sup> <sup>−</sup> *<sup>x</sup>* is the estimation error of the measurable state. *<sup>ψ</sup>*ˆ*<sup>j</sup>* is defined as the estimated perturbation of the robot manipulator. The "" and "ˆ" represent the error in estimation and quantity estimated in result of the estimation respectively.

$$sat(\check{\mathbf{x}}\_{1j}) = \begin{cases} |\check{\mathbf{x}}\_{1j} / |\check{\mathbf{x}}\_{1j}| \, & \, if \big| \check{\mathbf{x}}\_{1j} \big| \geq \varepsilon\_{0j} \\ |\check{\mathbf{x}}\_{1j} / \varepsilon\_{0j'} \quad \, if \big| \check{\mathbf{x}}\_{1j} \big| \leq \varepsilon\_{0j} \end{cases} \tag{13}$$

*Electronics* **2018**, *7*, 256

The anti-chatter properties are formulated by a saturation function defined by Slotine et al. [40] where the boundary layer of SMC control is represented by *ε*0*j*. Finally, the error dynamics of SPO are mathematically defined as [39], .

$$
\widetilde{\mathbf{x}}\_{1j} = \widetilde{\mathbf{x}}\_{2j} - k\_{1j} \text{sat}(\widetilde{\mathbf{x}}\_{1j}) \tag{14}
$$

$$
\dot{\tilde{x}}\_{2j} = -k\_{2j} \text{sat}(\tilde{x}\_{1j}) + \tilde{\Psi}\_j \tag{15}
$$

$$\dot{\tilde{\mathbf{x}}}\_{3j} = -a\_{3j}^2 (a\_{3j}\tilde{\mathbf{x}}\_{2j} - \tilde{\mathbf{x}}\_{3j}) + \dot{\Psi}/a\_{3j} \tag{16}$$

After the observer sliding mode begins, *<sup>x</sup>*<sup>2</sup>*<sup>j</sup>* dynamics become

$$
\dot{\tilde{\mathbf{x}}}\_{2j} + \left(k\_{2j}/k\_{1j}\right)\tilde{\mathbf{x}}\_{2j} = \tilde{\mathbf{\dot{Y}}}\_{j} \tag{17}
$$

The frequency domain relation between Ψ *<sup>j</sup>* and Ψ*<sup>j</sup>* is derived as

$$\Psi\_{\dot{\gamma}}(p) = \frac{p\left[p^2 + \left(k\_{1\dot{\gamma}}/\varepsilon\_{0\dot{\gamma}}\right)p + k\_{2\dot{\gamma}}/\varepsilon\_{0\dot{\gamma}}\right]}{p^3 + \left(k\_{1\dot{\gamma}}/\varepsilon\_{0\dot{\gamma}}\right)p^2 + \left(k\_{1\dot{\gamma}}/\varepsilon\_{0\dot{\gamma}}\right)p + a\_{3\dot{\gamma}}^2(k\_{2\dot{\gamma}}/\varepsilon\_{0\dot{\gamma}})} \left(-\Psi\_{\dot{\gamma}}(p)\right) \tag{18}$$

and this is equivalent to a high-pass filter. The sliding function is defined as

$$s\_{\dot{j}} = \dot{\mathbf{e}}\_{\dot{j}} + \mathbf{c}\_{1\dot{j}}\mathbf{e}\_{\dot{j}} \tag{19}$$

where *ej* = *x*1*<sup>j</sup>* − *x*1*dj* is the actual position tracking error, *c*1*<sup>j</sup>* > 0. As the sliding surface is reached, we define . *sj* = 0 and sliding control is defined as

$$
\dot{s}\_{\dot{\jmath}} = -K\_{\dot{\jmath}}sat(s\_{\dot{\jmath}}) \tag{20}
$$

where robust gain is represented by *Kj* and is supposed to be positive. The control input *uj* is mathematically defined as follows,

$$u\_{\dot{j}} = B^{-1} \mathbb{C}ol(\ddot{\mathbf{x}}\_{1d\dot{j}} + f\_{\dot{j}}(\mathbf{x}) + c\dot{e}\_{\dot{j}})\_{\dot{j}} - K\_{\dot{j}} \text{sat}(\mathbf{s}\_{\dot{j}}) \tag{21}$$

Similarly, the estimated sliding function is represented mathematically as

$$
\mathfrak{F}\_{\dot{l}} = \dot{\mathfrak{e}}\_{\dot{l}} + \mathfrak{c}\_{\dot{l}1} \mathfrak{e}\_{\dot{l}} \tag{22}
$$

where, the estimation error of localizing tracking is *e*ˆ*<sup>j</sup>* = *x*ˆ1*<sup>j</sup>* − *x*1*dj* and *j*th degree of freedom motion [*x*1*dj* . *x*1*dj*] *<sup>T</sup>* and *cj*<sup>1</sup> <sup>&</sup>gt; 0. Similarly, the estimation error in sliding function is mathematically represented by *sj* <sup>=</sup> *<sup>s</sup>*ˆ*<sup>j</sup>* <sup>−</sup> *sj*. The estimation error in sliding function can be calculated by using Equations (19) and (22) as

$$
\widetilde{\mathbf{x}}\_{\dot{\jmath}} = \dot{\widetilde{\mathbf{x}}}\_{1\dot{\jmath}} + c\_{\dot{\jmath}1} \widetilde{\mathbf{x}}\_{1\dot{\jmath}} \tag{23}
$$

The new control input *uj* is designed such that it forces . *s*ˆ*s*ˆ < 0 outside of the prescribed manifold. The desired *<sup>s</sup>*ˆ*j*-dynamics is defined as .

$$\mathfrak{s}\_{\rangle} = -\mathbb{K}\_{\mathfrak{i}} \text{sat}(\mathfrak{s}\_{\rangle}) \tag{24}$$

Thus . *s*ˆ*<sup>j</sup>* can be calculated as follows

$$\begin{aligned} \dot{\varepsilon}\_{j} &= \mathfrak{a}\_{3j}\overline{\mathfrak{u}}\_{j} - \left[ k\_{2j}/\varepsilon\_{0j} + c\_{j1} \left( k\_{1j}/\varepsilon\_{0j} \right) - \left( k\_{1j}/\varepsilon\_{0j} \right)^{2} \right] \widetilde{\mathfrak{x}}\_{1j} \\ &- \left( k\_{1j}/\varepsilon\_{0j} \right) \widetilde{\mathfrak{x}}\_{2j} - \ddot{\mathfrak{x}}\_{1jd} + c\_{j1} \left( \mathfrak{x}\_{2j} - \dot{\mathfrak{x}}\_{1jd} \right) + \widehat{\mathfrak{y}}\_{j} \end{aligned} \tag{25}$$

The control law to apply Equation (23) with *<sup>x</sup>*<sup>2</sup>*<sup>j</sup>* <sup>=</sup> 0 is defined as

$$\begin{cases} \overline{u}\_{j} = \frac{1}{\alpha\_{\overline{j}}} \{-K\_{\overline{j}} \text{sat}(\mathfrak{k}\_{\overline{j}}) + \left[\frac{k\_{2\overline{j}}}{\varepsilon\_{0\overline{j}}} + c\_{j1} \frac{k\_{1\overline{j}}}{\varepsilon\_{0\overline{j}}} - \left(\frac{k\_{1\overline{j}}}{\varepsilon\_{0\overline{j}}}\right)^{2}\right] \widetilde{\mathbf{x}}\_{1\overline{j}} \\ \mathbf{x} + \ddot{\mathbf{x}}\_{1\overline{j}d} - c\_{j1} (\dot{\mathbf{x}}\_{2\overline{j}} - \dot{\mathbf{x}}\_{1\overline{j}d}) - \boldsymbol{\Psi}\_{\overline{j}} \} \end{cases} \tag{26}$$

where *uj* represents the control input of SMCSPO. Thus, the resulting *s*ˆ*j*-dynamics including the effects of *<sup>x</sup>*<sup>2</sup>*<sup>j</sup>* could be mathematically represented as

$$
\dot{\hat{s}}\_{\hat{\jmath}} = -K\_{\hat{\jmath}} \text{sat}(\hat{s}\_{\hat{\jmath}}) - \left(k\_{1\hat{\jmath}}/\varepsilon\_{0\hat{\jmath}}\right) \tilde{\mathbf{x}}\_{2\hat{\jmath}} \tag{27}
$$

To ensure the outer part of the manifold *s*ˆ*j* <sup>≤</sup> *<sup>ε</sup>*0*<sup>j</sup>* satisfies the inequality . *s*ˆ*s*ˆ < 0, the robust control gain must be constrained to the following inequality

$$\mathbb{K}\_{\circ} \ge k\_{1j}^2 / \varepsilon\_{0j} \tag{28}$$

Finally, the actual *sj*-dynamics within the boundary layer of *s*ˆ*j* ≤ *ε*0*<sup>j</sup>* can be mathematically expressed as

$$\begin{cases} \dot{s}\_{j} + \frac{K\_{j}}{\varepsilon\_{0j}} s\_{j} = \left[ \frac{k\_{2j}}{\varepsilon\_{0j}} - \left( \frac{k\_{1j}}{\varepsilon\_{0j}} - \frac{K\_{j}}{\varepsilon\_{0j}} \right) \left( \varepsilon\_{j1} - \frac{k\_{1j}}{\varepsilon\_{0j}} \right) \right] \widetilde{\mathbf{x}}\_{1j} \\\ - \left( \varepsilon\_{j1} + \frac{K\_{j}}{\varepsilon\_{0j}} \right) \widetilde{\mathbf{x}}\_{2j} - \widetilde{\Psi}\_{j} \end{cases} \tag{29}$$

The estimation errors in state estimation and perturbation are the driving force for *sj*-dynamics. The target of the designed SMCSPO is to minimize mismatch b/w the actual and desired trajectory. The sliding perturbation observer is responsible for reducing the estimation error *<sup>x</sup><sup>j</sup>* in the observer part. Therefore, the sliding mode control is designed for error *e*ˆ*<sup>j</sup>* reduction b/w desired and actual values of the trajectory. Thus, implementation of SMC and SPO simultaneously guarantee the accurate results of trajectory tracking with a reduction of error.

The mechanical hardware limitations restrict the design procedure as described in Jairo et al. [39]. The instant when *s*ˆ*j* ≤ *ε*0*j*, the observer design and *sj*-dynamics, can be represented in mathematical form as below,

$$
\begin{bmatrix}
\dot{\tilde{\mathbf{x}}}\_{1j} \\
\dot{\tilde{\mathbf{x}}}\_{2j} \\
\dot{\tilde{\mathbf{x}}}\_{3j} \\
\dot{s}\_{j}
\end{bmatrix} = \begin{bmatrix}
0 & a^3\mathbf{j}\_{3j} & -a^2\mathbf{j}\_{3j} & 0 \\
k\_{2j}/\varepsilon\_{0j} - \left(c - k\_{1j}/\varepsilon\_{0j}\right)^2 & -\left(2c + a^2\mathbf{j}\_{3j}\right) & a\_{3j} & -c
\end{bmatrix} \cdot \begin{bmatrix}
\ddot{\mathbf{x}}\_{1j} \\
\ddot{\mathbf{x}}\_{2j} \\
\ddot{\mathbf{x}}\_{3j} \\
s\_{j}
\end{bmatrix} + \begin{bmatrix}
0 \\
0 \\
1 \\
0
\end{bmatrix} \dot{\boldsymbol{\psi}}\_{j}/a\_{3j} \tag{30}
$$

where a square matrix of order 4 in Equation (30) represents the state matrix. Further suppose, *λ* represents the eigen values of state matrix A, then its characteristic equation det |*λI* − *A*| = 0, can be expressed as,

$$\left[\lambda + \varepsilon\_{j1}\right] \left[\lambda^3 + \left(k\_{1j}/\varepsilon\_{0j}\right)\lambda^2 + \left(k\_{2j}/\varepsilon\_{0j}\right)\lambda + a^2\varepsilon\_{3j}\left(k\_{2j}/\varepsilon\_{0j}\right)\right] = 0\tag{31}$$

By implementing the pole-placement method, let us introduce a desired characteristic polynomial *p*(*λd*) = (*λ* + *λd*) <sup>4</sup> which leads to a design solution

$$k\_{1\circ}/\varepsilon\_{0\circ} = 3\lambda\_d k\_{2\circ}/k\_{1\circ} = \lambda\_d \mathfrak{a}\_{3\circ} = \sqrt{\lambda\_d/3}\mathfrak{c} = K\_\circ/\varepsilon\_{0\circ} = \lambda\_d \tag{32}$$

It is obvious that a reduction in error and an increase in the accuracy of observations could be achieved with a large value of gain *λd*. But Jairo et al. [41] has categorically shown the limitation of breakpoints of sliding function dynamics encircled in a manifold. They showed that it could not go beyond 1/5*τhw*, where *h* is a positive number, and frequency is represented by *w*. Another study by Slotine et al. [42] also showed the same results. Thus, the optimal gain value could be selected as

$$
\lambda\_d = \frac{1}{15\pi^{hw}}\tag{33}
$$

#### *3.2. Reaction Force Estimation Based upon Sliding Perturbation Observer (SPO)*

The algorithm using SPO could be utilized to set an estimate of perturbation that leads to the determination of reaction force. The perturbation's estimation includes two factors, the first disturbance that is regarded as an external force and another is the dynamic error that inherits non-linear terms and viscous friction. The 3DOF robotic-manipulator defined in Equations (2) and (3) can be used to define a perturbation estimate that would be used to determine the reaction force. The perturbation estimate of the end-effector and second link are mathematically represented by an expression defined below,

$$\hat{\psi}\_{s1} = -\frac{1}{f\_{s1}}(\mathbf{f}\_{s1}) - \frac{1}{f\_{s1}}(0.5M\_{s1}L\_{1}\mathbf{g}\sin\theta\_{1}) - \left(\frac{\Delta f\_{s1}}{f\_{s1}}\right)\bar{\theta}\_{1} - \frac{1}{f\_{s1}}(\Delta B\_{s1} + \beta\_{1})\dot{\theta}\_{1} \tag{34}$$

$$\hat{\psi}\_{s2} = -\frac{1}{\int\_{s2}} (\mathbf{f}\_{s2}) - \frac{1}{\int\_{s2}} (M\_{s2} L\_{2} \mathbf{g} \cos \theta\_{2}) - (\frac{\Delta f\_{s2}}{\int\_{s2}}) \ddot{\theta}\_{1} - \frac{1}{\int\_{s2}} (\Delta B\_{s2} \dot{\theta}\_{2}) - \frac{1}{\int\_{s2}} (\beta\_{2} \dot{\mathbf{x}}) - \frac{1}{\int\_{s2}} (\lambda) \tag{35}$$

The Equations (34) and (35) further used to calculate the reaction force as follows

$$\hat{\mathbf{r}}\_{t1} = J\_{s1}\hat{\boldsymbol{\Psi}}\_{s1} + 0.5M\_{s1}L\_{1}\mathbf{g}\sin\theta\_{1} + \Delta J\_{s1}\ddot{\boldsymbol{\theta}}\_{1} + (\Delta B\_{s1} + \beta\_{1})\dot{\boldsymbol{\theta}}\_{1} \tag{36}$$

$$\hat{\mathfrak{r}}\_{t2} = J\_{s2}\hat{\mathfrak{y}}\_{s2} + M\_{s2}L\_2\mathbf{g}\cos\theta\_2 + \Delta J\_{s2}\ddot{\theta}\_1 + \Delta B\_{s2}\dot{\theta}\_2 + \beta\_2\dot{\mathbf{x}} + \lambda \tag{37}$$

where *τ*ˆ*e*1, *τ*ˆ*e*<sup>2</sup> are the estimates of reaction torques generated as a result of contact with the environment of the end-effector and second link respectively, Δ represents the uncertainty parameter. It is worthy to mention that if the parameter is well estimated than the parameter of uncertainty could be considered as well.

#### **4. Bilateral Control Design between Master and Slave for 3DOF Hydraulic Servo System**

#### *4.1. Bilateral Control*

The design purpose of bilateral control refers to a relation b/w the master and slave system helping the system operator in a way that is as close as the actual environment. There are two important factors that need to be considered while designing bilateral control. The first is that the slave accurately follows the trajectory of the master system and the second that the system's operator gets a realistic feel of a reaction force. In this study, the master and slave system are 3DOF hydraulic servo systems. The dynamical equations for these two systems are shown in Equations (38) and (39).

$$J\_m \ddot{\theta}\_m + B\_m \dot{\theta}\_m = \mu\_m + \tau\_h\tag{38}$$

$$J\_s \ddot{\theta}\_s + B\_s \dot{\theta}\_s = \mathbf{u}\_s - \mathbf{r}\_c \tag{39}$$

where *Jm*, *Js*, *um* and *us* are inertia, position, and control input of master and slave of hydraulic servo system respectively. *τ<sup>h</sup>* represents the real/action force generated by the operator at the master device, *τ<sup>e</sup>* defines the reaction force of the slave system in a remote environment. A detailed workflow of the control algorithm for bilateral control is presented in Figure 5.

**Figure 5.** The work flow of control algorithm.

A simple fact is that as the master device is operated by a human operator, the slave system follows the trajectory of the master system. The slave system of hydraulic servos follows the master's trajectory using a sliding mode control. The impedance control is implemented to transfer reaction force during trajectory tracking. Therefore, no reaction force is observed by the operator while there are no contact between the slave and environment.

#### *4.2. Master Controller and Device*

The design of the master system has been considered under two constraints. The first one is that the reaction force is felt by the operator during the slave system and environment contact. The second constraint is that the operator uses the master system with the minimal force possible. The design of the impedance control is defined as,

$$\mathcal{J}\_d \ddot{\theta}\_m + B\_d \dot{\theta}\_m + \mathcal{K}\_d \theta\_m = \mathfrak{k}\_h - k\_f \mathfrak{k}\_c \tag{40}$$

where *Jd*, *Bd* and *Kd* are the inertia, damping, and stiffness of the impedance control model respectively, *k <sup>f</sup>* represents the scale factor of the reaction force. The scale factor *k <sup>f</sup>* scales the force ratio that is converted from the master system to the slave system. For simplicity, in this study *k <sup>f</sup>* is considered to be unity. The control input signal to the master system can be determined by replacing the estimated/observed state variable to the original ones. Thus, the control input for a dynamical system of Equation (38) and impedance control of Equation (40) can be mathematically presented as follows,

$$\mathbf{u}\_{m} = (B\_{m} - \frac{f\_{m}}{f\_{d}}B\_{d})\dot{\boldsymbol{\theta}}\_{m} + (\frac{f\_{m}}{f\_{d}} - 1)\mathbf{f}\_{h} - \frac{f\_{m}}{f\_{d}}(k\_{f}\,\mathbf{\hat{r}}\_{\mathbf{\hat{c}}} + K\boldsymbol{\hat{\theta}}\_{m}) \tag{41}$$

where *um*, . ˆ *θm*, . ˆ *θ<sup>m</sup>* are the control input, estimated speed and estimated position of the master system by utilizing SPO, *τ*ˆ*<sup>h</sup>* represents the torque estimate that is produced by the operator while operating the master device. The master device used in this research is presented in Figure 6.

**Figure 6.** Master device.

#### *4.3. Slave Controller and Device*

The slave controller is designed to track the trajectory of the master system. The slave controller is designed by utilizing SMCSPO for this study. The logic behind the SMCSPO design is to compensate sliding mode control for the perturbation estimate. It is an additive quantity consisting of uncertainty, disturbance, and non-linearity [39]. Thus, SMCSPO design efficiently removes the external disturbance and uncertainty of parameters. The sliding function that is estimated can be expressed as

$$
\delta = \dot{\theta} + \alpha \theta \tag{42}
$$

where *e*ˆ = ˆ *<sup>θ</sup>s*<sup>1</sup> <sup>−</sup> <sup>ˆ</sup> *θm*<sup>1</sup> is the tracking error between the master and slave device, *c* is a positive constant. By substituting the observed/estimated state in the sliding function defined above in Equation (42) and the sliding perturbation observer of Equation (12), the differential of the estimated sliding surface *s*ˆ can be mathematically represented as

$$\begin{aligned} \dot{\hat{s}} &= \alpha\_{s3} \overline{u}\_{s} - \frac{k\_{s2}}{\varepsilon\_{s0}} \widetilde{\theta}\_{s1} - \alpha\_{s2} \widetilde{\theta}\_{s1} + \hat{\psi}\_{\sf s} - \frac{k\_{s1}}{\varepsilon\_{s0}} \dot{\hat{\theta}}\_{s1} \\ &- \alpha\_{s1} \dot{\tilde{\theta}}\_{s1} - \ddot{\tilde{\theta}}\_{m1} + c(\dot{\tilde{\theta}}\_{s1} - \dot{\tilde{\theta}}\_{m1}) \end{aligned} \tag{43}$$

It is found that *αs*<sup>1</sup> . *θ <sup>s</sup>*1, *αs*2*θ <sup>s</sup>*<sup>1</sup> ≈ 0 when the phase is converged to the sliding surface and error estimate *θ <sup>s</sup>*<sup>1</sup> remains inside boundary layer. The impedance model Equations (40) and (43) are utilized to obtain . *s*ˆ. .

$$\begin{cases} \dot{s} = \alpha\_3 \overline{u}\_s - \left[ \frac{k\_{s2}}{\varepsilon\_{s0}} + c \left( \frac{k\_{s1}}{\varepsilon\_{s0}} \right) - \left( \frac{k\_{s1}}{\varepsilon\_{s0}} \right)^2 \right] \widetilde{\theta}\_{s1} - \int\_d^{-1} (\dot{\tau}\_{\text{lt}} \\\ -k\_f \cdot \mathbf{t}\_c - B\_d \dot{\theta}\_m - K\_d \dot{\theta}\_m) + \hat{\psi}\_s + c (\dot{\theta}\_{s1} - \dot{\theta}\_{m1}) \end{cases} \tag{44}$$

The new control variable *us* is chosen under the constraint i.e., . *<sup>s</sup>*ˆ*s*<sup>ˆ</sup> <sup>&</sup>lt; 0. Similarly, . *s*ˆ dynamics is chosen to satisfy the sliding mode condition

$$
\dot{\mathfrak{s}} = -Ksat(\mathfrak{s})\tag{45}
$$

The new control input presented in Equation (8) can be found from Equations (44) and (45)

$$\begin{aligned} \overline{u}\_{s} &= \mathfrak{a}\_{s3}^{-1} \{-\mathrm{K}sat(\mathfrak{k}) + [\frac{\dot{k}\_{s2}}{\mathfrak{c}\_{s0}} + \mathfrak{c}(\frac{\dot{\mathfrak{k}}\_{s1}}{\mathfrak{c}\_{s0}}) - (\frac{\dot{\mathfrak{k}}\_{s1}}{\mathfrak{c}\_{s0}})^2] \widetilde{\theta}\_{s1} \\ &+ f\_{d}^{-1} (B\_{d} \dot{\theta}\_{m} + \mathsf{K}\_{d} \dot{\theta}\_{m} - \mathfrak{k}\_{h} + k\_{f} \mathfrak{k}\_{c}) - \mathfrak{c}(\dot{\theta}\_{s1} - \dot{\theta}\_{m1}) - \widehat{\mathfrak{y}}\_{s} \} \end{aligned} \tag{46}$$

Final control input for the slave system is mathematically defined as

$$
\mu\_s = J\_s \overline{u}\_s + B\_s \dot{\theta}\_s \tag{47}
$$

#### **5. Experimental Environment**

The slave device for the experiment is shown in Figure 7. The hydraulic servo system consists of two hydraulic cylinders and one AC servo motor. The first base axis is actuated by the AC servo motor. The 2nd link and end effector are actuated by the hydraulic cylinder.

**Figure 7.** Slave device.

The experiments were performed on a 3DOF hydraulic servo system. Table 1, shows the specifications of a hydraulic servo system.


 **Table 1.** Hydraulic servo system.

In [43,44], the identification and robust control of a hydraulic servo system have been discussed. The unknown parameters such as the inertia and damping coefficients of the system are obtained by the signal compression method which can estimate the dynamics by obtaining an equivalent impulse response [45]. The model dynamic is given by

$$
\ddot{\mathbf{x}} = \frac{1}{I\_i} \boldsymbol{\mu} - \frac{1}{I\_i} D\_i \dot{\mathbf{x}}\_i \quad i = \ 1, \ 2, \ 3 \tag{48}
$$

where *Ji*, *Di* are the equivalent moments of inertia and damper respectively. Table 2, shows the values of the dynamics.


**Table 2.** Hydraulic servo system dynamics parameters.

The experimental setup includes a master device, a slave device and a control system as shown in Figure 8.

**Figure 8.** Experimental setup.

The master and slave device consist of three links each, in which the third link is connected with the base. We can find the reaction force at end effector and the 2nd link. SMCSPO is used to estimate the reaction force. Since the operator moves the master device to make the slave come in contact with a hard object, a visual sensor is installed with the slave device. It is not relevant to the control algorithm but provides visual feedback to the operator to avoid any inconvenience or uncertain situations. Figure 9 provides a pictorial view of visual feedback which helps enhance the safety during work.

**Figure 9.** Pictorial view of visual feedback.

#### **6. Experimental Results**

In our study, our system is non-linear, and therefore, a robust control scheme (i.e., SMCSPO) was pursued. Since PID/PD controllers are not robust, comparatively to SMCSPO, to nonlinearities. Therefore, their performance is not as good as that of the proposed SMCSPO. Figure 10 shows the comparison between PID and SMCSPO.

**Figure 10.** Comparison between PID and SMCSPO.

Table 3 shows the values of the parameters used in the experiment by using SMCSPO. The parameters used in the experiment are determined by using the Equations (32) and (33). Extensive simulations were performed to find the best value of boundary layer width (*εo*) resulting in convergence of the sliding surface to zero.


**Table 3.** Design parameters.

Six different experiments are done by using SMCSPO: (i) Bilateral (Master–Slave manipulation) control of the end effector (ii), estimated perturbation measured at the end effector of master and slave (iii), bilateral (Master–Slave manipulation) control of the 2nd link (iv), estimated perturbation measured at the 2nd link of master and slave (v), bilateral (Master–Slave manipulation) control of the base (vi), and bilateral (Master–Slave manipulation) control of the end effector, 2nd link and base at the same time.

The scenario of the experiment is set up such that the operator (human) moves the master device and the slave device follows the trajectory of the master device using SMCSPO. Figure 11 shows the experimental result of the master–slave trajectories for the end effector. The blue line shows the experimental result of the master device whereas, the red line shows the experimental result of the slave device by using SMCSPO. It can be observed that the slave device can follow the master device. The maximum value of the trajectory is 81 degree at 14 s. The end effector of the slave device can move between 0 and 90 degrees.

**Figure 11.** Master–slave trajectories for end effector.

The error between the master–slave trajectories for the end effector is shown in Figure 12. The maximum error between the master and slave trajectories is 0.62 degrees at 15.5 s. It can be seen that the errors between the master and slave trajectories are very small.

**Figure 12.** Error between master–slave trajectories for end effector.

Figures 13 and 14 show the estimated perturbation of master and slave for the end effector. The maximum estimated perturbation of the master is 67.56162 N\*m at 14 s. While the maximum estimated perturbation of the slave is 1516.7 N\*m at 14 s.

**Figure 13.** Estimated perturbation of master for end effector.

The value of the estimated perturbation for the slave is very high as compared to the master. This is because of the hydraulic system for the slave. The dynamic value of the slave (i.e., 303.26) is much bigger than the master (i.e., 1.35135). The pattern of the estimated perturbation of master and slave is the same but in an opposite direction. To compare these two results, we normalized the estimated perturbation of master and slave. The normalized estimated perturbation of master is between 0 and 1 while the slave is between −1 and 0 using Equations (49) and (50).

$$P\_{norm}(Master) = \frac{a\_i - \min(a)}{\max(a) - \min(a)}, \quad i = 1 \ldots N \tag{49}$$

$$P\_{norm}(Slave) = \frac{a\_i - \max(a)}{\max(a) - \min(a)}, \ i = 1 \dots N \tag{50}$$

In the above equations, *ai* is the current value. Figure 15 shows the normalized estimated perturbation for the master and slave of the end effector. The red line shows the normalized estimated perturbation of the master device, whereas the blue line shows the normalized estimated perturbation of the slave device.

**Figure 14.** Estimated perturbation of slave for end effector.

**Figure 15.** Normalized estimated perturbation of master and slave for end effector.

Figure 16 shows the experimental result of the master–slave trajectories for the 2nd link. The blue line shows the experimental result of the master device, whereas the red line shows the experimental result of the slave device by using SMCSPO. The slave device can also follow the master device. The maximum value of the trajectory is 60 degrees at 13.5 s. The 2nd link of the slave device can move between 0 and 90 degrees.

**Figure 16.** Master–slave trajectories for 2nd link.

The error between master–slave trajectories for the 2nd Link is shown in Figure 17. The maximum error between master and slave trajectories are 0.41 degree at 49 s.

**Figure 17.** Error between master–slave trajectories for 2nd link.

Figures 18 and 19 show the estimated perturbation of master and slave for the 2nd link. The maximum estimated perturbation of the master is 326.6367 N\*m at 38.5 s. While the maximum estimated perturbation of the slave is 1296.9 N\*m at 38.5 s. The pattern of master and slave estimated perturbation is the same but in opposite direction.

**Figure 18.** Estimated perturbation of master for 2nd link.

**Figure 19.** Estimated perturbation of slave for 2nd link.

Figure 20 shows the normalized estimated perturbation for master and slave of the 2nd link. The red line shows the normalized estimated perturbation of the master device, whereas the blue line shows the normalized estimated perturbation of the slave device.

**Figure 20.** Normalized estimated perturbation of master and slave for 2nd link.

Figure 21 shows the experimental result of master–slave trajectories for the base axis. The blue line shows the experimental result of the master device whereas, the red line shows the experimental result of the slave device by using SMCSPO. It can be seen that the slave device can follow the master device. The maximum value of the trajectory is 100 degrees at 13 s. The base of the slave device can move between 0 and 360 degrees.

**Figure 21.** Master–slave trajectories for base.

The error between master–slave trajectories for the base is shown in Figure 22. The maximum error between master and slave trajectories is 1.3 degrees at 26.5 s.

**Figure 22.** Error between master–slave trajectories for base.

Figure 23 shows the experimental result of the master–slave trajectories for the end effector, 2nd link and base by using SMCSPO. The blue line shows the experimental result of the master device whereas the red line shows the experimental result of the slave device of the end effector. The grey line shows the experimental result of the master device whereas the pink dotted line indicates the experimental result of the slave device of the 2nd link. The blue-dotted line shows the experimental result of the master device whereas the red dotted line shows the experimental result of the slave device of the base. At the beginning only, the end effector can move until 49 s. Then, the end effector and 2nd link moved simultaneously by 12 s (i.e., from 50 to 62 s). After that, the 2nd link moved separately by 18 s (i.e., from 62 to 80 s). Then, the base can move by 32 s (i.e., from 81 to 113 s). In the end, the end effector and 2nd link moved again simultaneously by 16 s (i.e., from 115 to 131 s). The slave device followed the trajectory of the master device accurately. The maximum trajectories of the end effector, 2nd link and base are 88, 27 and 52 degrees, respectively.

**Figure 23.** Master–slave trajectories for end effector, 2nd link and base.

#### **7. Conclusions**

In this paper, we have estimated the reaction force of the end effector and 2nd link for a three-degree of freedom hydraulic servo system with master–slave manipulators using SMCSPO without using any sensors. By using an SMC-based bilateral control strategy and visual feedback, the slave device followed the trajectory of the master device (human operator) with minimum error. Also, bilateral control is used to estimate the reaction force of the master device which is fed back to the operator to handle the master device.

From Figures 11, 16, 21 and 23, it is confirmed that the slave can follow the master trajectories and an operator can easily handle the slave device by feeling the estimated reaction force using the applied SPO when the slave touches an object even if a force sensor is not used.

The maximum error between master and slave for the end effector, 2nd link and base are summarized in Table 4.


**Table 4.** Error between master–slave trajectories.

This research is applied for dismantling nuclear power plants, and there are many situations where a human cannot access due to the high degree of radiation and the very long half-lives of the radioactive materials involved. Therefore, the slave device is used for such hazardous locations. It is useful for activities such as transportation of active uranium in nuclear power plants, disposal of an explosive, remote cutting for nuclear power plant dismantling, grinding, etc.

**Author Contributions:** Conceptualization, K.D.K. and M.C.L.; Formal analysis, K.D.K., J.W. and S.J.A.; Funding acquisition, M.C.L.; Methodology, K.D.K.; Software, K.D.K. and S.J.A.; Supervision, M.C.L.; Writing—original draft, K.D.K.; Writing—review & editing, K.D.K., J.W. and M.C.L.

**Funding:** This research was funded by the MOTIE (Ministry of Trade, Industry & Energy), Korea, under the Industry Convergence Liaison Robotics Creative Graduates Education Program supervised by the KIAT (N0001126). This research was funded by the Technology Innovation Program (10073147, Development of Robot Manipulation Technology by Using Artificial Intelligence) funded By the Ministry of Trade, Industry & Energy (MOTIE, Korea).

**Acknowledgments:** This research was funded by the MOTIE (Ministry of Trade, Industry & Energy), Korea, under the Industry Convergence Liaison Robotics Creative Graduates Education Program supervised by the KIAT (N0001126). This research was funded by the Technology Innovation Program(10073147, Development of Robot Manipulation Technology by Using Artificial Intelligence) funded By the Ministry of Trade, Industry & Energy(MOTIE, Korea).

**Conflicts of Interest:** The author declares no conflict of interest.

#### **Appendix**

Figure A1 is showing the schematic of hydraulic servo system. In this figure, *Ve* is the volume of the chamber of the cylinder, *Ksv* is coefficient between the input voltage and displacement of spool,*xv* is the displacement of servo valve spool and *Mp* is the mass factor of the end of cylinder.

**Figure A1.** Schematic of Hydraulic Actuator.

The fluid supplied flow rate of the forward chamber and return flow rate of the return chamber are derived by Bernoulli equation respectively and expressed as follow

$$Q\_1 = \mathbb{C}\_d \omega \ge\_v \sqrt{2(P\_s - P\_1)/\rho} \tag{A1}$$

$$Q\_2 = \mathbb{C}\_d \omega \propto\_{\upsilon} \sqrt{2(P\_2)/\rho} \tag{A2}$$

where, *xv* is displacement of servo valve spool, *Cd* is flow coefficient, *ω* is area gradient of spool valve, *ρ* is density of fluid,*Ps*, *P*1, *P*<sup>2</sup> is pressure of supplied fluid and pressure inside the two chambers of the cylinder respectively.

The load pressure *PL* and load flow rate *QL* are expressed as follow.

$$P\_L = P\_1 - P\_2 \tag{A3}$$

$$Q\_L = (Q\_1 + Q\_2) / 2 \tag{A4}$$

By Using Equations (A3) and (A4), load flow rate of servo value is defined as follow

$$\mathcal{Q}\_{L} = \kappa \mathcal{C}\_{d} \omega \ge\_{\upsilon} \sqrt{2(P\_{s} - P\_{L})/\rho} \tag{A5}$$

where

$$\kappa = \frac{1+\eta}{\sqrt{2(1+\eta^2)}} \le 1 \tag{A6}$$

Furthermore, the flux inside of hydraulic cylinder is defined as Equation (A7) by continuity equation.

$$Q\_L = A\_\varepsilon \dot{x}\_p + \frac{V\_\varepsilon}{4\beta} \dot{P}\_L \tag{A7}$$

where, *Ae* is average area between piston and piston load, *β* is the effective flow coefficient modulus and *Ve* is the volume of the chamber of the cylinder.

The dynamics equation of load system is obtained as

$$A\_c P\_L = M\_p \ddot{\mathbf{x}}\_p + B\_p \dot{\mathbf{x}}\_p + F\_p \tag{A8}$$

where *Mp* is the total mass elements, *Bp* is the total damper elements between rod and cylinder, *Fp* is the total friction elements and *xp* is the displacement of rod. To calculate the linearized load flow dynamic equation, Equation (A5) can be linearized as follow

$$Q\_L = k\_q \mathbf{x}\_v - k\_p P\_L \tag{A9}$$

where *kq* is load flow coefficient, *kp* is load flow pressure coefficient and defined as follow

$$k\_{\!} = \text{aC}\_{d}\omega \text{x}\_{\text{U}} \sqrt{(P\_{\text{s}} - P\_{L}^{\ast})/\rho} = \frac{\partial Q\_{L}}{\partial x\_{\text{U}}} \vert\_{P\_{\!} = P\_{L}} \tag{A10}$$

$$k\_p = \frac{\mathcal{C}\_d \omega \propto\_v \text{\*}}{2} \sqrt{1/\rho (P\_A - P\_L \text{\*})} = \frac{\partial Q\_L}{\partial P\_L}|\_{x\_v = x\_v \text{\*}} \tag{A11}$$

*xv* ∗ and *PL* <sup>∗</sup> in Equations (A10), (A11) are *xv* and *PL* near the operating point. Using Equation (A7), linearized load flow dynamic equation can be rewritten as Equation (A12). The effective bulk modulus of the fluid is much larger value than other constant parameters. There the pressure variation terms of Equation (A7) is negligible.

$$A\_c \dot{\mathbf{x}}\_p = k\_q \mathbf{x}\_v - k\_p P\_L \tag{A12}$$

Combing Equations (A7) and (A12), 2nd order dynamic equation can be obtained as follow

$$(M\_p \ddot{\mathbf{x}}\_p + (B\_p + \frac{A\_c^2}{k\_p})\dot{\mathbf{x}}\_p + F\_c = \frac{A\_c k\_q}{k\_p} \mathbf{x}\_v \tag{A13}$$

After separating the non-linear term and parameter error from Equation (A13), the equation of motion is derived as

$$M\_{HT}\ddot{\mathbf{x}}\_p + B\_{TH}\dot{\mathbf{x}}\_p + \boldsymbol{\psi} = K\mathcal{K}\_{\text{sv}}V\_{\text{c}} \tag{A14}$$

where *MHT* and *BTH* is the equivalent value of mass element and damper element respectively, *ψ* is the summation of non-linear term, parameter error, friction and disturbance and *K* is the linear coefficient of *Aekq kp* . The perturbation *ψ*, is defined as follow

$$\Psi = M\_{HT}^{-1} [\Delta K \Delta K\_{\text{sw}} u\_H - \left\{ \Delta M\_{HT} \ddot{\mathbf{x}}\_p + \Delta B\_{TH} \dot{\mathbf{x}}\_p + F\_\varepsilon \right\} ] \tag{A15}$$

#### **References**


© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Time Sequential Motion-to-Photon Latency Measurement System for Virtual Reality Head-Mounted Displays**

#### **Song-Woo Choi, Siyeong Lee, Min-Woo Seo and Suk-Ju Kang \***

Department of Electronic Engineering, Sogang University, Seoul 04107, Korea; songwoo602@gmail.com (S.-W.C.); siyeong@sogang.ac.kr (S.L.); yoynok08@gmail.com (M.-W.S.) **\*** Correspondence: sjkang@sogang.ac.kr; Tel.: +82-02-705-8466

Received: 12 August 2018; Accepted: 29 August 2018; Published: 1 September 2018

**Abstract:** Because the interest in virtual reality (VR) has increased recently, studies on head-mounted displays (HMDs) have been actively conducted. However, HMD causes motion sickness and dizziness to the user, who is most affected by motion-to-photon latency. Therefore, equipment for measuring and quantifying this occurrence is very necessary. This paper proposes a novel system to measure and visualize the time sequential motion-to-photon latency in real time for HMDs. Conventional motion-to-photon latency measurement methods can measure the latency only at the beginning of the physical motion. On the other hand, the proposed method can measure the latency in real time at every input time. Specifically, it generates the rotation data with intensity levels of pixels on the measurement area, and it can obtain the motion-to-photon latency data in all temporal ranges. Concurrently, encoders measure the actual motion from a motion generator designed to control the actual posture of the HMD device. The proposed system conducts a comparison between two motions from encoders and the output image on a display. Finally, it calculates the motion-to-photon latency for all time points. The experiment shows that the latency increases from a minimum of 46.55 ms to a maximum of 154.63 ms according to the workload levels.

**Keywords:** head-mounted display; virtual reality; motion-to-photon latency

#### **1. Introduction**

Because the personal computer's performance has been greatly improved, real-time rendering of high-quality images has become possible, and virtual reality (VR) technology has become a reality [1]. According to this trend, a variety of VR devices utilizing 3D rendering and sensor-based technology have been released. The head-mounted display (HMD) device, which is a system designed to improve immersion by mounting a wide field-of-view display within the user's sight, has been gaining popularity [2]. However, because of the mismatch between visual and vestibular systems, users wearing HMD devices experience motion sickness and dizziness, which can be an obstacle to the VR market. The motion-to-photon latency, which is one of the causes for this mismatch, is the time delay for a user movement to be fully reflected on a display screen [3]. Generally, three steps are required to render an image, as shown in Figure 1. When head motion occurs, the motion detection unit samples the orientation data for the view generation. After the motion detection, the visual processing unit renders a 3D image. Finally, the rendered image is outputted to a display corresponding to the head orientation of the user measured by the sensor. As these steps take time, the delay results in motion-to-photon latency. In this case, the image does not exactly correspond to the actual head orientation of the user, thereby causing the user to experience motion sickness [4]. Therefore, many HMD makers such as Oculus VR have conducted studies to minimize the mismatch caused by the motion-to-photon latency. Numerous studies such as prediction techniques based on the data acquired by inertial measurement unit (IMU) sensors [5] and asynchronous time warp (ATW) [6] have been conducted to overcome this limitation. The quantitative evaluation of the above methods required the authors to measure the motion-to-photon latency. They could then improve and evaluate these methods with reference data.

**Figure 1.** Image generation process in the HMD (head-mounted display) system.

To measure this motion-to-photon latency, the method of [7] presents a low-cost system with high accuracy which measures the latency. However, it can be used for the mobile device. The method of [8] proposes the measurement system and a technique to reduce the latency for the optical see-through display. The authors had previously proposed a new measurement system by using multiple sensors such as an optical sensor and encoders and by comparing the physical signal to the luminance signal from a VR scene reflected on the display [9]. Figure 2 shows the overall architecture of this measurement system. A rotary platform, which controls the physical motion, was proposed in that method to simulate the rotation of the neck. It measured the change of the image brightness when the physical motion commenced. The method could directly measure the motion-to-photon latency because it calculated the time difference between the point at which the brightness startedchanging and that at which the physical motion started However, although this method provided accurate measurement results, its limitation was that the latency could only be measured when the physical motion began.

**Figure 2.** Overall architecture of the previously proposed motion-to-photon latency measurement system.

In this paper, the authors propose a novel time sequential measurement system for the motion-to-photon latency by comparing the physical motion and the motion reflected on the image by 3D rendering. Therefore, it is possible to measure and record the time sequential latency more precisely in real-time, while reflecting the change in workload over time, unlike the conventional method mentioned in [9]. Specifically, the proposed method renders the rotation data with the intensity levels of pixels on the measurement area, which is specially invented for the HMD system, and it can obtain the motion-to-photon latency data in all temporal ranges. Therefore, it has a great advantage over the existing method.

This paper is organized as follows. Section 2 describes the proposed latency measurement system with five subsections. Section 3 presents the experimental environment and the results using the proposed system. Section 4 presents the paper's conclusion.

#### **2. Proposed Method**

Figure 3 shows the overall block diagram of the proposed system. The proposed method measured the actual angular change of the encoder when physical movement occurred. Concurrently, when the physical movement was measured by the IMU sensor, the rendered image was outputted to the display reflecting the measured angle, and the corresponding angle value was converted into the intensity image. The proposed method compared these two values. Specifically, a measurement area, which was constructed with multiple 2D objects mapped with an intensity level converting the rotation angle measured from the IMU, was implemented. Then, the photodetector converted the luminance reflected in the measurement area into a voltage. Finally, the oscilloscope measured this voltage value to determine the intensity level. At the same time, the pulse obtained from the encoder was converted into a position value and compared with the luminance-based position value obtained from the photodetector in real time. Finally, the motion-to-photon latency was calculated by measuring the time difference at which the angle measured from the display reached the same rotation angle of the encoder, as shown in Figure 3. The detailed processes are explained in following sections.

**Figure 3.** Overall block diagram of the proposed system.

#### *2.1. Motion Generator-Based on Human Neck Movement*

The authors used the rotary platform proposed in [9] to model the physical rotation of the user's neck. Specifically, in order to simulate a human neck, a two-degree-of-freedom rotary platform was proposed, built with joints and links based on head kinematics. Therefore, the motion at the end-effector could be estimated through the motion in each joint based on forward kinematics [10,11], as shown in Figure 4.

**Figure 4.** Kinematics of the head-model-based measurement platform.

For this paper, the authors implemented the kinematic analysis of the model using the Denavit–Hartenberg parameters [12], and the angle of the end-effector was estimated by calculating the angle of each joint [13]. In the proposed method, the rotation of the end-effector was estimated by performing rotation transformation on the displacement and rotation in the previous link, and coordinate transformation with the translation matrix. The transformation was carried out on *n* number of links with the following matrix multiplication:

$$[T] = [Z\_1][X\_1][Z\_2][X\_2] \cdots \cdots [X\_n][Z\_n] \tag{1}$$

where [*T*] denotes the transformation matrix of the end-effector, [*Zn*] denotes a transformation matrix at the *n*-th joint, and [*Xn*] denotes a transformation matrix of at the *n*-th link. The transformation matrix of the *i*-th joint is as follows:

$$\begin{aligned} \begin{bmatrix} Z\_i \end{bmatrix} = \begin{bmatrix} \cos \theta\_i & -\sin \theta\_i & 0 & 0\\ \sin \theta\_i & \cos \theta\_i & 0 & 0\\ 0 & 0 & 1 & d\_i\\ 0 & 0 & 0 & 1 \end{bmatrix} \end{aligned} \tag{2}$$

where *θ<sup>i</sup>* denotes the rotation angle between the previous joint and the next *i*-th joint along the z axis, and *di* denotes the displacement between each joint. The transformation matrix of the *i*-th link is as follows:

$$\begin{aligned} \, [X\_i] = \begin{bmatrix} 1 & 0 & 0 & r\_{i,i+1} \\ 0 & \cos \alpha\_{i,i+1} & -\sin \alpha\_{i,i+1} & 0 \\ 0 & \sin \alpha\_{i,i+1} & \cos \alpha\_{i,i+1} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \end{aligned} \tag{3}$$

where *αi*,*i*+<sup>1</sup> denotes an angle between each link along the x-axis, and *ri*,*i*+<sup>1</sup> denotes the displacement between each link. Finally, the transformation matrix for the forward kinematics from the (*n* − 1)-th link to the *n*-th link is as follows:

$$T\_{\rm nl} = \begin{bmatrix} \cos \theta\_{\rm nl} & -\sin \theta\_{\rm n} \cos a\_{\rm n} & \sin \theta\_{\rm n} \sin a\_{\rm n} & r\_{\rm n} \cos \theta\_{\rm n} \\ \sin \theta\_{\rm n} & \cos \theta\_{\rm n} \cos a\_{\rm n} & -\cos \theta\_{\rm n} \sin a\_{\rm n} & r\_{\rm n} \sin \theta\_{\rm n} \\ 0 & \sin a\_{\rm n} & \cos a\_{\rm n} & d\_{\rm n} \\ 0 & 0 & 0 & 1 \end{bmatrix}. \tag{4}$$

Therefore, the Euler angles [14] of the end-effector were calculated from the inner rotation matrix. The Euler angles were used in an interface of the proposed system.

#### *2.2. Measurement Area Design*

In the proposed method, photodetectors measured the rotation angles from the rendered image. The advantage of photodetectors is that they can be measured more accurately while modeling the human visual function owing to their fast response time. However, the photodetectors only measure the intensity of light within the specific range. Hence, the authors proposed specially invented objects that displayed the current rotation angle on a VR scene. Specifically, the objects did not have any colors but had grayscale intensities, and the intensity levels represented the rotation angle with the pre-defined conversion rule that converted the rotation angle with a float type into the intensity value with an unsigned integer type. Therefore, by measuring the luminance for the objects in the measurement area, the current rotation angle was obtained.

When designing the objects in the measurement area, the shapes of the object could influence not only the rendering performance but also the measurement performance. In 3D rendering, there is back-face culling in that the game engine does not calculate the region occluded by an object [15]. Therefore, due to occlusion in the measurement area, the rendering performance is changed. Since the back-face culling affects the latency measurement, the size of the measurement area to be projected into the display is optimally adjusted to minimize the back-face culling and to maximize the voltage that the active layer of the photodetector can measure.

The brightness of the measurement area was determined by the rotation value calculated from the IMU sensor of the HMD device when rendering the current frame. To enhance the measurement accuracy, two objects represented the current rotation angle at a single rotation axis. Arrangement of the four objects enabled both the yaw and pitch, which were the target rotation directions, to be calculated.

The scanline of the display for the area to be measured should also be considered. Since the entire image is not drawn at once but sequentially drawn along the scanline, there is a time difference depending on the arrangement of the measurement area [16]. In order to estimate the correct angle, two measurement areas must be synchronized considering the physical location.

#### *2.3. Voltage Mapping Table*

When a change in light is detected in the photodetector installed in the measurement area, the current (the output voltage as a result) emitted by the internal photodiode changes. If the experimental environment and parameters are controlled constantly, the output voltage of the photodetector to the image luminance is constant. This means that the output voltage can be converted back to the original luminance information. Therefore, if the conversion between the luminance and the voltage is known in advance, the rotation angle applied in the rendering process can be estimated. In the proposed method, the relationship between the voltage and luminance was acquired in advance, and the mapping table for the relationship was obtained by using these data. Figure 5 shows how to estimate intensity levels from the luminance of the measurement area on the display using the mapping table.

**Figure 5.** Luminance estimation from an object rendered by using the photodetector system and the voltage mapping table.

The voltage of the photodetector is proportional to the luminance, and therefore, the low luminance has a low output voltage. In this case, the amount of the voltage according to the brightness was not distinguishable from the noise when the voltage outputted from the measurement region was low at the stage of constructing the mapping table. Figure 6 illustrates the above problem. Therefore, low voltage levels are excluded from the measurement, and the mapping table is estimated from the upper level. Then, the current rotation value was inferred. There is a problem that if only the 128 levels of intensity are used for 256 gray levels with an 8-bit depth, the measurement resolution is halved. To compensate for this problem, an additional photodetector was used. An additional sensor restored the resolution to the original resolution by using double photodetectors to measure one rotation of the axis.

**Figure 6.** Measurement error and noise for an output voltage in the photodetector system.

For double photodetectors, a new formula was needed which converted a rotation value into two intensity levels. The rotation angle was split into two intensity levels, and these intensity levels were reflected on the measurement area so that the photodetectors measured each intensity level to convert them back to the rotation angle. The specific conversion formula is as follows:

$$\text{Channel} = \left\{ \begin{array}{l} MSBs\_{7bit} \left[ \frac{(\theta + M\_{\text{eular}})}{2M\_{\text{eular}}} \times 2^{14} \right] + 127\\LSBs\_{7bit} \left[ \frac{(\theta + M\_{\text{eular}})}{2M\_{\text{eular}}} \times 2^{14} \right] + 127 \end{array} \right\}, \text{ for } |\theta| \le M\_{\text{eular}} \tag{5}$$

where *Meuler* denotes the maximum Euler angle in the measurement system, *θ* denotes the current angle measured from HMD, and *MSBs*7*bit* and *LSBs*7*bit* denote the upper and lower 7 bits of the converted value, respectively. In the measurement area, 14-bit data were mapped onto the objects.

For example, as shown in Figure 7, the current rotation of the yaw axis and maximum Euler angle are given. Then, these values were calculated with a conversion formula. The current rotation angle was converted into binary data and split into two parts. Each part was the intensity level of the measurement area. After estimating these intensity levels by the photodetectors, the reverse calculation was performed with the inverse of the conversion formula. Finally, the current rotation was obtained, which is used for the generation of a VR scene.

**Figure 7.** Example of the intensity level conversion and its inverse for measuring the current rotation angle.

However, in some cases, accurate mapping tables could not be formed because of changes in the external environment during the measurement process and changes in the output voltage due to the electrical noise. To solve this problem, a polynomial regression to store the data was performed from the voltage data obtained by repeated experiments. In the proposed method, an optimal curve was obtained by repeatedly performing a polynomial regression of less than third-order to avoid overfitting. Then, the obtained curve was used for the voltage estimation for a luminance level.

The third polynomial regression used in this paper is described below. The regression model is as follows:

$$v\_i = a\_0 + a\_1 I\_1 + a\_2 I\_2^2 + a\_3 I\_3^3 + \varepsilon\_i \ (i = 1, \ 2, \ 3, \ \cdots, \ n), \tag{6}$$

where *n* denotes the number of points, *vi* denotes an output voltage from the photodetector, *Ii* denotes an 8-bit unsigned integer value of the intensity level projected on the measurement area, *ai* denotes a curvature parameter, and *ε<sup>i</sup>* denotes a random error. The polynomial curve is composed as follows.

$$
\begin{bmatrix} v\_1 \\ v\_2 \\ v\_3 \end{bmatrix} = \begin{bmatrix} 1 & I\_1 & I\_1^2 & I\_1^3 \\ 1 & I\_2 & I\_2^2 & I\_2^3 \\ 1 & I\_3 & I\_3^2 & I\_3^3 \end{bmatrix} \begin{bmatrix} a\_1 \\ a\_2 \\ a\_3 \end{bmatrix} + \begin{bmatrix} \varepsilon\_1 \\ \varepsilon\_2 \\ \varepsilon\_3 \end{bmatrix}. \tag{7}
$$

In (7), if the random error term is excluded, the curvature parameters can be estimated after the least-squares estimation [17]. The least-squares estimation of these parameters is as follows:

$$\mathbf{a} = \left(\mathbf{I}^T \mathbf{I}\right)^{-1} \mathbf{I}^T \mathbf{v},\tag{8}$$

where *v* denotes a vector composed of the output voltages from the photodetector, *a* denotes the curvature parameter vector of the voltage–luminance curve, and *I* denotes the matrix of intensity levels from the repeated measurements. Using this regression model, the motion-to-photon latency was measured and is described in the following section.

#### *2.4. Measurement of Motion-to-Photon Latency*

The output image on the display was rendered by rotating the HMD using a motion generator designed to control the rotation of the HMD device. It controlled the HMD device with the desired rotation angle using a highly accurate DC servo motor. An internal encoder in the motor, which was used to control the motor, could also be used for measuring the current physical movement. However, in order to more precisely measure the physical movement over time, the incremental type encoders, additionally mounted on the axes, detected the rotation angles and outputted them as pulses. Concurrently, by measuring the luminance of the objects, the current angle data could be obtained by the conversion rule with the mapping table described above. The motion data about the current frame outputted from the display and the motion data outputted from the motion generator were compared. Due to the motion-to-photon latency, the rotation angle from the display was temporally lagged with respect to the angle outputted from the motion generator. To quantify the motion-to-photon latency, the proposed system calculated the time at which the angle, outputted from the display, matched the angle outputted from encoders of the motion generator. Finally, it calculated the difference, which was the motion-to-photon latency between the two time points that had the same rotation angle. When calculating the time points, the authors found the indices of two rotation buffers with the same rotation angles, as shown in Figure 8. To avoid finding multiple points of the same value, the searching range was limited. There were multiple points that had the same value in all ranges. By limiting the searching range, the method allowed the buffers to be searched in the limited range so that only one same value was selected.

**Figure 8.** Example of the motion-to-photon latency calculation.

#### *2.5. Real-Time Measurement and Interface*

The proposed method used buffers to store the data from the encoders and the display. Specifically, it synchronized and stored all data in an oscilloscope, and the time delays were compared by searching points that had the same values. Once each time difference was found, it was converted to a time value to obtain the final motion-to-photon latency. In addition, a user interface was proposed to control the system and to integrate and run several modules. Figure 9 shows the configuration of the interface. The sequence-based motion generation part provided the information for controlling the

motion generator. In the data plotting part, the graphs for the rotation angles are plotted. The latency measurement part provides the current, mean, and max values for the motion-to-photon latency.

**Figure 9.** Proposed user interface of the motion-to-photon latency measurement system.

#### **3. Experimental Results**

The experimental environment is as follows. First, the authors used an Oculus Rift DK 2 HMD device (Oculus VR, Menlo park, CA, USA) for the VR implementation [18]. To drive a motion generator, two RE 40 (Maxon Motor, Sachseln, Switzerland) [19] were used as a DC motor along two axes, and the same number of EPOS2 50/5 (Maxon Motor, Sachseln, Switzerland) were used as controllers to control them [20]. However, since the position estimated from the motor itself had mechanical latency, additional encoders (EIL 580 Baumer, Frauenfeld, Switzerland), which are of the incremental type with 500 steps/turn and 300 kHz frequency, were used to measure more accurate current rotation angles [21]. SM05PDs (THORLABS, Newton, NJ, USA) were used as the photodetectors, which have a spectral range from 200 nm to 1000 nm [22]. A PicoScope 4824 (Pico Technology, Saint Neots, UK) was used to measure the voltages from encoders and photodetectors [23]. It measured the voltages of the encoder and photodetector with a sampling rate of 100,000 times per second and a bandwidth of 5.0 Gbps, thereby allowing nearly continuous data description and processing in real time. To render the VR scenes and drive the entire proposed system, a PC, which had an Intel i7-6700k @ 4.4 GHz CPU [24] and an NVIDIA Geforce GTX 1080 graphic card [25], were used. The experiment was conducted by the following two methods. First, the authors measured the changes in the latency according to the change in the position sequence while the initial workload was fixed. In this case, the workload was assigned the minimum level. The second method used only one of the motion sequences for different initial workload levels. Figure 10 shows the overall experimental process described above.

**Figure 10.** Overall experimental processes for calculating the motion-to-photon latency in (**a**) different sequences and (**b**) different workload levels.

#### *3.1. Position Sequence*

Table 1 lists the position sequences defined in the experiment for the yaw and pitch axes. Six peak points were inputted to the motion generator controlled by the input angle, and the system was controlled according to the sequence shown in Figure 11. In this case, as shown in Table 2, the average latency was up to 46.55 ms and the maximum latency was up to 63.72 ms. In this experiment, the positions and shapes of the 3D objects in a VR scene were changed according to the orientation of the HMD. By reflecting these changes of the objects, the rendering workload was changed with time, thereby changing the motion-to-photon latency. As shown in Table 2, the motion-to-photon latency was varied according to the motion sequence. These results show that the proposed system could measure the motion-to-photon latency in all temporal ranges when the position sequence changed.


**Table 1.** Peak points of simulated head orientations in yaw and pitch directions.

**Table 2.** Minimum, maximum, and average motion-to-photon latencies by different position sequence.


**Figure 11.** Sequences and peak points for simulated head orientations: the change of angles in (**a**) sequence #1 and (**b**) sequence #2.

#### *3.2. Different Workloads*

The authors also measured the latency change while varying the initial rendering workload with the fixed position sequence. The steps of each workload were determined by the number of vertices, which is the minimum unit that constitutes the mesh that has the greatest influence on the computation when rendering the image. In the workload of level 0, the minimum number of objects was two and the number of vertices was 800,000. In the workload of level 3, the number of objects was 17, and the number of vertices was 9.5 million. Table 3 summarizes the number of vertices per level. An increase in the number of vertices also increased the computation time, which directly affected the motion-to-photon latency. In the experiment, the minimum, maximum, and average motion-to-photon latency were measured according to the workload level as shown in Table 4. In this case, the average latency measured in the workload with level 0 was up to 46.55 ms, and the maximum latency was up to 63.75 ms. On the other hand, the average latency measured in the workload with level 3 was up to 154.63 ms, and the maximum latency was up to 198.24 ms. Figure 12 shows the results as a function of the workload level. In the figure, the blue points are the rotation angles from a VR scene, and the orange points are the angles from the motion generator. In the cropped image, the blue points were lagged behind orange points, which are the current angles of the motion. As shown in both of the cropped regions, the time delay of the two points was increased when the workload level was increased. Therefore, when the workload level was changed, the time delay also changed significantly.


**Table 3.** Various rendering workloads and vertices.

**Table 4.** Minimum, maximum, and average motion-to-photon latencies by different rendering workload levels.


**Figure 12.** Profiled motion data measured from the physical motion and a virtual reality (VR) output image: (**a**) workload level 0 and (**b**) workload level 3.

#### **4. Conclusions**

This paper proposed a novel measurement system to visualize the motion-to-photon latency with time-series data in real time. The proposed system rotated the HMD device using a motion generator based on the kinematics of a human neck to control its motions. In addition, we proposed a measurement area, which includes four objects with intensity levels that are converted from an IMU sensor data of the HMD, to acquire the current motion from the display. Finally, the motion data obtained from the encoder were compared with the motion data obtained from the display to calculate the motion-to-photon latency and visualize it on the dedicated interface. In the experiment, the latency was changed from a minimum of 46.55 ms to a maximum of 154.63 ms according to the workload levels.

**Author Contributions:** S.-W.C., M.-W.S. and S.-J.K. conceived and designed the experiments; S.-W.C., S.L. and S.-J.K. performed the experiments, and analyzed the data; S.-W.C., M.-W.S., S.L. and S.-J.K. contributed the equipment development; S.-W.C. and S.-J.K. wrote the paper.

**Acknowledgments:** This research was supported by the National Research Foundation of Korea (NRF) through a grant funded by the Government of Korea (MSIT) (No. 2018R1D1A1B07048421), and supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2018-0-01421) supervised by the IITP (Institute for Information & Communications Technology Promotion).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Electronics* Editorial Office E-mail: electronics@mdpi.com www.mdpi.com/journal/electronics

MDPI St. Alban-Anlage 66 4052 Basel Switzerland

Tel: +41 61 683 77 34 Fax: +41 61 302 89 18