Online Control for Biped Robot with Incremental Learning Mechanism

Yang, Liang; Lai, Guanyu; Chen, Yong; Guo, Zhihui

doi:10.3390/app11188599

¹ Zhongshan Institute, School of Computer Science, University of Electronic Science and Technology of China, Zhongshan 528402, China

² School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China

³ School of Automation, Guangdong University of Technology, Guangzhou 510006, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2021, 11(18), 8599; https://doi.org/10.3390/app11188599

Submission received: 18 June 2021 / Revised: 12 August 2021 / Accepted: 9 September 2021 / Published: 16 September 2021

(This article belongs to the Section Robotics and Automation)

Abstract

In this paper, we develop a new online walking controller for biped robots, which integrates a neural-network estimator and an incremental learning mechanism to improve control performance in dynamic environments. With the aid of an iterative updating algorithm, newly incoming data can be used directly to update the original well-trained model, which avoids a time-consuming retraining procedure. In addition, how to maintain zero-moment-point stability and counteract the effect of yaw moment simultaneously is a key technical problem to be addressed. To this end, an interval type-2 fuzzy weight identifier is developed, which assigns a weight to each walking sample to deal with the imbalanced distribution of training data. The effectiveness of the proposed control scheme has been verified through a full-dynamics simulation and a practical robot experiment.

Keywords: biped robot; zero-moment-point; neural networks; incremental learning

1. Introduction

In recent years, biped robots have received considerable attention owing to their unique bipedal movement, excellent suitability to human society, and theoretical importance. To date, a number of active control approaches have been proposed, e.g., stability-criterion-based methods [1,2,3], model-based methods [4,5,6], and optimization-based methods [7,8,9]. In addition, many real robot platforms have been successfully developed, including Atlas, MABEL, ASIMO, and NAO [10,11]. To achieve bipedal locomotion stability, the zero-moment point (ZMP) was proposed and has become the most popular stability criterion. In [12], Kajita et al. designed a ZMP tracking servo controller and proposed a bipedal walking pattern generation algorithm based on the cart-table model. In [13], a modified walking pattern method was presented that utilizes the allowable ZMP variation, so that both step length and walking period can be independently adjusted without any extra step. Subsequently, Shin et al. [14] proposed a practical gait synthesis algorithm by optimizing gait parameters, with which locomotion stability was guaranteed. Moreover, Caron et al. [15] defined the pendular support area and presented a whole-body controller for locomotion across arbitrary multicontact stances. Despite these contributions, the stability established in [12,13,14,15] depends on the assumption that the effect of yaw moment on stability can be ignored, which is in fact a restrictive condition. As pointed out in [16,17,18], yaw moment is inevitably generated by the motion of the swing leg, which may lead to slippage or falling.

To remove this limitation, much effort has been devoted to this field, and some interesting results were reported in [16,17,18,19,20,21]. In particular, Hirabayashi et al. [16] proposed a waist-rotation-based yaw moment compensation algorithm in which the biped robot was modeled as a 3D inverted pendulum. In [22], by combining a waist joint control technique with an optimized swing-leg reference generation method, a fast walking pattern generation approach was presented to counteract the effect of yaw moment. Inspired by human walking experience, Xing et al. [18] designed an arm-swing-based control scheme to cancel the factors that produce yaw moment. To further improve control performance, in [19], the angular momentum rate changes were smoothly integrated into the yaw moment equation, and locomotion stability was ensured by utilizing a Eulerian ZMP resolution approach. Moreover, Yang et al. [21] constructed a practical control scheme that compensates yaw moment by controlling the lower limbs. Although much progress has been made in the field of dynamic balance control, some challenging difficulties remain open. In most existing yaw moment compensation schemes for biped robots, such as those mentioned above, only a few joints are involved, which places a heavy burden on the driving motors and may result in unnatural gaits. In practice, it is difficult to generate natural and efficient gaits in real time in response to external disturbances from the environment.

To address this problem, a series of optimization-based methods have been proposed. In [23], a spline-based estimation of distribution algorithm was proposed by formulating gait pattern generation as a multiobjective optimization problem. In [24], besides ensuring ZMP stability, energy efficiency was also guaranteed by combining the moving ZMP criterion with the Fourier series approximation technique. By means of Newton–Raphson iteration, locomotion stability was achieved and the walking speed of the robot was successfully regulated online in [25]. Furthermore, Wang et al. [26] presented an SVM-based learning control system for biped robots, in which a novel SVM objective function with energy-related slack variables was proposed. This objective function follows the principle that the slack variables are determined by energy cost, which means that a sample with lower energy consumption contributes more to the SVM regression. This provides an interesting clue for learning biped walking locomotion. However, this method generally depends on a well-trained model, which may not always be available in practical applications.

Motivated by this observation, in this paper we make an attempt to further address the online control problem for biped robots. To remove the restrictions just mentioned, the main difficulty in designing our control scheme lies in developing a protocol that compensates yaw moment while maintaining zero-moment-point stability. To overcome this difficulty, an online walking control approach is presented. In summary, this paper makes the following contributions:

Compared with the control scheme developed in [26], ours is newly equipped with a neural-network estimator and an incremental mechanism, with which newly incoming data can be used directly to update the original well-trained model in real time. This implies that a robot can achieve better locomotion stability in dynamic environments, e.g., when moving from flat ground to uneven terrain;
Traditional optimization-based methods, such as those in [23,24,25,26], involve many adaptation laws that must be updated or computed online, which may result in a heavy computational burden during control implementation. To remove this restriction, we combine the random vector functional-link neural network with an incremental mechanism, so that retraining from scratch can be effectively avoided. Furthermore, by designing an interval type-2 fuzzy weight identifier (IT2FWI), both horizontal and vertical locomotion stabilities are taken into account in the training procedure.

The rest of this paper is organized as follows. In Section 2, the kinematics and dynamics of the biped robot are given and some preliminaries are presented. In Section 3, we propose an online control scheme based on the incremental learning algorithm (Algorithm 1), and a neural-network mechanism is established. In Section 4, a simulation and an experiment are carried out to verify the effectiveness of our scheme. In Section 5, conclusions are given.

Algorithm 1 Online updating with incremental learning algorithm

Input: incoming new samples $Q_{i} = (q_{i}, \dot{q}_{i}, \ddot{q}_{i}, Δ X_{z}^{i}, M_{z}^{i})$;
Output: $T = (t_{1}, \dots, t_{N}) = (Δ q_{1}, \dots, Δ q_{N})$
1: Randomly initialize $W_{e}$, $W_{h}$, $β_{e}$, $β_{h}$; set training error $e = 0$;
2: Calculate the matrix $A = [K^{n} \; H^{m}]$;
3: Calculate $(A^{m})^{+}$ with Equation (11);
4: while $e \leq threshold$ do
5:   Randomly initialize $W_{p}$, $β_{p}$;
6:   Set $Y^{p} = φ (Q W_{p} + β_{p})$ and $A^{m + 1} = [A^{m} | Y^{p}]$;
7:   Calculate $(A^{m + 1})^{+}$ and $W^{m + 1}$ by Equations (12)–(14);
8: end while
9: Calculate $T = (A^{m + 1})^{+} W^{m + 1}$

2. System Description and Some Preliminaries

2.1. Overview of Biped Robot BRZ-4

BRZ-4 is a half-size biped robot, which is set up as a test bed, as presented in Figure 1. BRZ-4 is 66.2 cm in height and 2.4 kg in weight, and has 17 degrees of freedom. Specifically, there are three degrees of freedom for the hip joint, one for the knee joint, and two for the ankle joint. To collect the necessary feedback information, each joint is driven by a DYNAMIXEL MX-64-T motor, and the mechanical structure of the robot is made by 3D printing. Moreover, rotary encoders are integrated in the motors to measure the joint motion. The kinematic model of BRZ-4 is depicted in Figure 1, and the physical parameters are given in Table 1.

A control system for BRZ-4, consisting of the biped robot and a ground workstation, is set up to achieve stable walking. With MATLAB integrated in the ground workstation, the proposed method is implemented to collect the real-time motion of the robot and send control signals to BRZ-4 through an RS485 bus. Moreover, to reduce the number of cables, the driving motors are connected in a daisy chain.

2.2. Kinematics and Dynamics

Usually, a biped robot can be simplified as a connected-link model, as shown in Figure 1, in which each link has a uniform mass distribution. The relationship between joint velocity and end-effector velocity is defined as

\dot{r} = J (q) \dot{q}

(1)

where $\dot{r} \in R^{m}$ is the task-space velocity, $q = [q_{1}, \dots, q_{n}] \in R^{n}$ represents the joint angles, n is the number of degrees of freedom, $\dot{q} = [\dot{q}_{1}, \dots, \dot{q}_{n}]$ denotes the joint angular velocities, and $J (q) \in R^{m \times n}$ is the Jacobian matrix from joint space to task space.
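As a small illustration of Equation (1), the sketch below evaluates the task-space velocity for a planar two-link chain. The two-link geometry, the link lengths `l1` and `l2`, and the function names are assumptions made only for this example; they are not the BRZ-4 kinematic model.

```python
import numpy as np

def planar_two_link_jacobian(q, l1=0.1, l2=0.1):
    """Jacobian of the end-point position (x, y) of a hypothetical planar two-link chain."""
    q1, q2 = q
    s1, c1 = np.sin(q1), np.cos(q1)
    s12, c12 = np.sin(q1 + q2), np.cos(q1 + q2)
    return np.array([
        [-l1 * s1 - l2 * s12, -l2 * s12],
        [l1 * c1 + l2 * c12, l2 * c12],
    ])

def task_space_velocity(q, q_dot, **link_lengths):
    """Equation (1): r_dot = J(q) q_dot."""
    J = planar_two_link_jacobian(q, **link_lengths)
    return J @ np.asarray(q_dot, dtype=float)
```

With both joints at zero and only the first joint rotating, the end point moves purely along the y-axis, as expected for a fully stretched chain.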

Following the Lagrangian approach, the dynamics of the biped robot can be expressed as follows

M (q) \ddot{q} + C (q, \dot{q}) \dot{q} + G (q) + F_{e} = τ

(2)

where $M (q) \in R^{n \times n}$ is the positive definite inertia matrix, $C (q, \dot{q}) \in R^{n \times n}$ is the Coriolis and centrifugal matrix, $G (q) \in R^{n}$ is the gravitational force, $F_{e} \in R^{n}$ denotes the external disturbance, and $τ \in R^{n}$ is the vector of joint torques.

2.3. Bipedal Locomotion Stability

The zero-moment-point (ZMP) stability criterion is one of the most popular stability criteria and has been successfully applied in real biped robot platforms including ASIMO, NAO, and HRP. By definition, the ZMP is the point on the sole at which the horizontal component of the net moment caused by inertial and gravity forces is zero, as shown in Figure 2.

Hence, the following equation holds

M_{x} = M_{y} = 0

(3)

where $M_{x}$ and $M_{y}$ are the x-axis and y-axis moments of the inertial and gravity forces, respectively.

As pointed out in [18,19], the ZMP stability criterion cannot guarantee moment equilibrium in the vertical plane, since it neglects the influence of yaw moment on locomotion stability. In fact, an undesired yaw moment along the support leg is generated by the motions of the components of the biped robot in different planes. Thus, we have

M_{z} \leq M_{R}

(4)

where $M_{z}$ denotes the yaw moment and $M_{R}$ is the moment generated by the ground reaction force. Specifically, the yaw moment is defined as below

\begin{matrix} M_{z} = \sum_{i = 1}^{n} m_{i} (r_{i} - r_{z m p}) \times ({\ddot{r}}_{i} + g) \end{matrix}

(5)

where $m_{i}$ is the mass of the ith connective link; $r_{i}$ is the position vector of the center of the ith connective link; $r_{zmp}$ denotes the ZMP position vector; $g = [0, 0, g]$ is the gravitational acceleration vector; and $M_{z}$ is the yaw moment.
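Equation (5) can be evaluated numerically as sketched below. This is a minimal sketch under stated assumptions: the function name `yaw_moment`, the array layout (one row per link), and the default value `g=9.81` are choices made for illustration, and the yaw moment is taken as the z-component of the summed cross products.

```python
import numpy as np

def yaw_moment(masses, positions, accelerations, r_zmp, g=9.81):
    """z-component of sum_i m_i (r_i - r_zmp) x (r_ddot_i + g), following Equation (5)."""
    m = np.asarray(masses, dtype=float)           # shape (n,)
    r = np.asarray(positions, dtype=float)        # shape (n, 3), link centers
    a = np.asarray(accelerations, dtype=float)    # shape (n, 3), link accelerations
    g_vec = np.array([0.0, 0.0, g])               # g = [0, 0, g] as in the text
    lever = r - np.asarray(r_zmp, dtype=float)    # position relative to the ZMP
    moments = m[:, None] * np.cross(lever, a + g_vec)
    return float(moments.sum(axis=0)[2])
```

Gravity alone contributes nothing to the z-component, so a motionless robot yields a zero yaw moment in this sketch; only horizontal link accelerations with a horizontal lever arm produce one.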

Moreover, the ZMP coordinate $r_{zmp} = (x_{zmp}, y_{zmp}, 0)$ has the following form [27]:

x_{z m p} = \frac{\sum_{i = 1}^{n} x_{i} ({\ddot{z}}_{i} + g) m_{i} - \sum_{i = 1}^{n} z_{i} m_{i} {\ddot{x}}_{i}}{\sum_{i = 1}^{n} ({\ddot{z}}_{i} + g) m_{i}}

(6)

y_{z m p} = \frac{\sum_{i = 1}^{n} y_{i} ({\ddot{z}}_{i} + g) m_{i} - \sum_{i = 1}^{n} z_{i} m_{i} {\ddot{y}}_{i}}{\sum_{i = 1}^{n} ({\ddot{z}}_{i} + g) m_{i}}

(7)

where $[x_{i}, y_{i}, z_{i}]$ is the position of the center of the ith link.
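Equations (6) and (7) translate directly into a few array operations. The sketch below is illustrative only; the function name `zmp_position`, the row-per-link array layout, and the default `g=9.81` are assumptions.

```python
import numpy as np

def zmp_position(masses, positions, accelerations, g=9.81):
    """ZMP coordinates (x_zmp, y_zmp, 0) from Equations (6) and (7)."""
    m = np.asarray(masses, dtype=float)                     # shape (n,)
    x, y, z = np.asarray(positions, dtype=float).T          # link centers, input shape (n, 3)
    ax, ay, az = np.asarray(accelerations, dtype=float).T   # link accelerations, input shape (n, 3)
    denom = np.sum((az + g) * m)
    x_zmp = (np.sum(x * (az + g) * m) - np.sum(z * m * ax)) / denom
    y_zmp = (np.sum(y * (az + g) * m) - np.sum(z * m * ay)) / denom
    return np.array([x_zmp, y_zmp, 0.0])
```

For a robot at rest (all accelerations zero), the result reduces to the ground projection of the center of mass, which is a convenient sanity check.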

To evaluate the locomotion stability in the horizontal plane, the ZMP stability margins introduced in [28] are adopted.

J_{zmp} = \begin{cases} \frac{1}{2} \frac{l_{zx}}{l_{cx}} + \frac{1}{2} \frac{l_{zy}}{d_{cy}}, & \text{if } r_{zmp} \in Ω_{zmp} \\ 0, & \text{if } r_{zmp} \notin Ω_{zmp} \end{cases}

(8)

where $l_{zx}$ and $l_{zy}$ denote the x-axis and y-axis distances between the zero-moment point and the boundaries, respectively; $l_{cx}$ and $d_{cy}$ represent the x-axis and y-axis distances between the center of the foot sole and the boundaries, respectively; and $Ω_{zmp}$ denotes the ZMP boundaries.
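A minimal sketch of Equation (8) follows. It assumes, purely for illustration, that the ZMP region is an axis-aligned rectangle centered on the foot sole with half-length `l_cx` and half-width `d_cy`, and that the distances to the boundaries are measured to the nearest edge along each axis; the source does not fix these details, and the function name is hypothetical.

```python
def zmp_stability_margin(r_zmp, foot_center, l_cx, d_cy):
    """Stability margin J_zmp of Equation (8) for an assumed rectangular support region."""
    dx = abs(r_zmp[0] - foot_center[0])   # x offset of the ZMP from the sole center
    dy = abs(r_zmp[1] - foot_center[1])   # y offset of the ZMP from the sole center
    if dx > l_cx or dy > d_cy:
        return 0.0                        # ZMP outside the region
    l_zx = l_cx - dx                      # distance to the nearest x boundary
    l_zy = d_cy - dy                      # distance to the nearest y boundary
    return 0.5 * l_zx / l_cx + 0.5 * l_zy / d_cy
```

Under these assumptions the margin equals 1 when the ZMP sits at the sole center and decreases linearly toward the edges.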

3. Online Control System Design

In this section, the control system design procedure is introduced and a walking control framework is presented, as shown in Figure 3, in which the random vector functional-link neural network (RVFLNN) [29] is adopted to approximate $f (\cdot)$ and an incremental learning mechanism is incorporated into the network. Moreover, an interval type-2 fuzzy weight identifier (IT2FWI) is designed to improve the control performance.

3.1. Weighted Neural-Network Estimator

From Equations (5)–(7), it is noted that both ZMP stability and yaw moment are related to the position, velocity, and acceleration of each link. According to [24,25] and the results in [28], locomotion stability can be ensured by regulating the robot joints. Thus, the following mapping function is considered

Δ q = f (q, \dot{q}, \ddot{q}, Δ X_{z}, M_{z})

(9)

where $f (\cdot)$ is a non-linear mapping function, $Δ q = [Δ q_{1}, \dots, Δ q_{n}]^{T}$ denotes the corrections of all joints, $Δ X_{z}$ is the ZMP error, and $M_{z}$ denotes the yaw moment. By approximating the non-linear mapping function $f (\cdot)$, $Δ q$ can be obtained.

In our scheme, the random vector functional-link neural network (RVFLNN) [29] is adopted to estimate $f (\cdot)$. Unlike traditional neural networks, the RVFLNN avoids a long training process and provides fast learning by using a flat network with randomly generated weights and biases.

Given N training samples $\{X_{i}, t_{i}\}_{i = 1}^{N}$, let $\hat{c}_{i}$ represent the appropriate weight of the ith sample. Thus, the approximation task is formulated as the following optimization problem

\arg \min_{W} : \sum_{i = 1}^{N} \hat{c}_{i} (A_{i} W_{i} - T_{i}) + λ \| W \|_{2}^{2}

(10)

where $\hat{c}_{i}$ denotes the weight of the ith sample; $A = [A_{1}, \dots, A_{N}] = [K^{n} \; H^{m}]$ represents the input matrix; $K^{n} = [K_{1}, \dots, K_{n}]$ is the input node set and $K_{i} = φ (X W_{e i} + β_{e i})$ is an input node; $X$ is the input sample data; $H^{m} = [H_{1}, \dots, H_{m}]$ is the enhancement node set and $H_{i} = φ (K^{n} W_{h i} + β_{h i})$ is an enhancement node; $β_{e i}$ and $β_{h i}$ are biases; $W_{e i}$ and $W_{h i}$ are weight matrices; $φ (\cdot)$ is the sigmoid function; $T = [T_{1}, \dots, T_{N}]$ is the desired output matrix; and $λ$ is a penalty coefficient. Moreover, $W$ is the connecting weight matrix, which can be computed by $W = [W_{1}, \dots, W_{N}] = A^{+} T$.

By applying the Moore–Penrose inverse, we have

A^{+} = \lim_{λ \to 0} (λ I + \hat{C} A A^{T})^{- 1} A^{T} \hat{C}^{T}

(11)

where $A^{+}$ is the pseudo-inverse matrix of $A$, and $\hat{C} = [\hat{c}_{1}, \hat{c}_{2}, \dots, \hat{c}_{n}]$ is the weight matrix of the sample data.
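The construction of the node matrix and the weighted, regularized solve can be sketched as follows. This is a minimal sketch, not the authors' implementation: it assumes the usual weighted least-squares reading of the objective (squared residuals weighted per sample) and uses the standard closed form with a small fixed penalty `lam` instead of the limit. The function names and argument shapes are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def build_nodes(X, W_e, b_e, W_h, b_h):
    """A = [K | H]: input nodes K and enhancement nodes H built from fixed random weights."""
    K = sigmoid(X @ W_e + b_e)          # input nodes, one row per sample
    H = sigmoid(K @ W_h + b_h)          # enhancement nodes fed by the input nodes
    return np.hstack([K, H])

def weighted_ridge_weights(A, T, c_hat, lam=1e-3):
    """Closed-form minimizer of sum_i c_i ||A_i W - T_i||^2 + lam ||W||^2 (assumed objective)."""
    AtC = A.T * np.asarray(c_hat, dtype=float)     # scales column i of A.T by c_i
    return np.linalg.solve(lam * np.eye(A.shape[1]) + AtC @ A, AtC @ T)
```

A sample with weight zero has no influence on the solution, which is the mechanism a sample-weight identifier relies on to down-weight undesirable walking samples.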

3.2. Incremental Learning Method Design

Let $A^{m + 1} = [A^{m} \; Y^{p}]$; then the pseudoinverse of $A^{m + 1}$ can be expressed as follows [29]:

(A^{m + 1})^{+} = \begin{bmatrix} (A^{m})^{+} - D B^{T} \\ B^{T} \end{bmatrix}

(12)

where $A^{m} = [A_{1}, \dots, A_{m}]$ is the original input matrix, $Y^{p} = [Y_{1}, \dots, Y_{p}]$ denotes the new input node set, $Y_{i} = φ (Q W_{p_{i}} + β_{p_{i}})$, $Q$ is the new incoming data, $D = (A^{m})^{+} Y^{p}$,

B^{T} = \begin{cases} (C)^{+}, & \text{if } C \neq 0 \\ (1 + D^{T} D)^{- 1} D^{T} (A^{m})^{+}, & \text{if } C = 0 \end{cases}

(13)

and $C = φ (K^{n} W_{p} + β_{p}) - A^{m} D$.

Thus, the new weight matrix is achieved by the following equation:

\begin{matrix} W^{m + 1} = [\begin{matrix} W^{m} - {DB}^{T} T \\ B^{T} T \end{matrix}] \end{matrix}

(14)
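The update in Equations (12)–(14) can be sketched with NumPy as below. This is a minimal sketch, not the authors' implementation: it appends one new node (one column) at a time, which is the case where the factor in the C = 0 branch of Equation (13) is a scalar; several new nodes would be handled by repeated calls. The function names `add_column` and `update_weights` and the tolerance `tol` are assumptions made for illustration.

```python
import numpy as np

def add_column(A, A_pinv, y, tol=1e-10):
    """Update the pseudoinverse when one column y is appended to A (column-wise recursion)."""
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    d = A_pinv @ y                          # D = (A^m)^+ Y
    c = y - A @ d                           # C = Y - A^m D, the part of y outside range(A)
    if np.linalg.norm(c) > tol:
        b_t = c.T / (c.T @ c)               # pseudoinverse of a nonzero column vector
    else:
        b_t = (d.T @ A_pinv) / (1.0 + d.T @ d)
    new_pinv = np.vstack([A_pinv - d @ b_t, b_t])   # Equation (12)
    return np.hstack([A, y]), new_pinv, d, b_t

def update_weights(W, d, b_t, T):
    """Equation (14): W^{m+1} = [W^m - D B^T T; B^T T]."""
    bT = b_t @ T
    return np.vstack([W - d @ bT, bT])
```

A typical use is to compute `A_pinv = np.linalg.pinv(A)` and `W = A_pinv @ T` once off-line, then call `add_column` and `update_weights` for each new node so that the full pseudoinverse is never recomputed.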

3.3. Interval Type-2 Fuzzy Identifier Design

One of the main difficulties in the development of the proposed control scheme is how to assign an appropriate learning weight to each walking sample. In this paper, an interval type-2 fuzzy weight identifier (IT2FWI) is designed to deal with the uncertainty of the walking samples.

The rules of the interval type-2 fuzzy logic system (FLS) are given as follows

\text{Rule } l: \text{ If } z \text{ is } \tilde{B}_{l, 1}, \; y \text{ is } \tilde{B}_{l, 2}, \text{ then } \hat{c} \text{ is } \tilde{O}_{l}.

where z and y represent the ZMP stability margin and the yaw moment, respectively; $\hat{c}$ denotes the weight of the sample data; $\tilde{B}_{l, j}$ and $\tilde{O}_{l}$ denote the linguistic variables of the fuzzy sets; $l = 1, 2, \dots, L$; and L is the total number of fuzzy rules.

The Gaussian membership function is adopted to map crisp inputs to fuzzy sets because of its clear physical meaning. The membership function of the ZMP stability margin is given as below

ϕ_{\tilde{B}_{i, j}} (z_{i}) = [\underline{ϕ}_{\tilde{B}_{i, j}} (z_{i}), \bar{ϕ}_{\tilde{B}_{i, j}} (z_{i})]

(15)

\bar{ϕ}_{\tilde{B}_{i, j}} (z_{i}) = \begin{cases} \mathcal{N} (\underline{c}_{i, j}^{z}, σ_{i, j}, z_{i}), & z_{i} < \underline{c}_{i, j}^{z} \\ 1, & \underline{c}_{i, j}^{z} \leq z_{i} \leq \bar{c}_{i, j}^{z} \\ \mathcal{N} (\bar{c}_{i, j}^{z}, σ_{i, j}, z_{i}), & z_{i} > \bar{c}_{i, j}^{z} \end{cases}

(16)

\underline{ϕ}_{\tilde{B}_{i, j}} (z_{i}) = \begin{cases} \mathcal{N} (\underline{c}_{i, j}^{z}, σ_{i, j}, z_{i}), & z_{i} > \frac{\underline{c}_{i, j}^{z} + \bar{c}_{i, j}^{z}}{2} \\ \mathcal{N} (\bar{c}_{i, j}^{z}, σ_{i, j}, z_{i}), & z_{i} \leq \frac{\underline{c}_{i, j}^{z} + \bar{c}_{i, j}^{z}}{2} \end{cases}

(17)

where $\mathcal{N} (c_{i, j}^{z}, σ_{i, j}, z_{i}) = \exp \{- \frac{1}{2} (\frac{z_{i} - c_{i, j}^{z}}{σ_{i, j}})^{2}\}$ and $c_{i, j}^{z} \in [\underline{c}_{i, j}^{z}, \bar{c}_{i, j}^{z}]$.
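The upper and lower membership functions of Equations (15)–(17), a Gaussian with a fixed standard deviation and an uncertain mean, can be sketched as follows. The function names are assumptions made for illustration; the piecewise conditions follow the equations above.

```python
import numpy as np

def gaussian(c, sigma, z):
    """exp(-0.5 * ((z - c) / sigma)^2), the primary Gaussian membership function."""
    return float(np.exp(-0.5 * ((z - c) / sigma) ** 2))

def upper_membership(z, c_low, c_high, sigma):
    """Upper membership function, Equation (16)."""
    if z < c_low:
        return gaussian(c_low, sigma, z)
    if z > c_high:
        return gaussian(c_high, sigma, z)
    return 1.0                                  # plateau between the two means

def lower_membership(z, c_low, c_high, sigma):
    """Lower membership function, Equation (17)."""
    if z <= 0.5 * (c_low + c_high):
        return gaussian(c_high, sigma, z)       # the farther mean gives the smaller value
    return gaussian(c_low, sigma, z)
```

For any input, the pair `(lower_membership(z, ...), upper_membership(z, ...))` gives the membership interval of Equation (15), and the lower value never exceeds the upper one.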

From [30], the output fuzzy set $ϕ_{\tilde{O}} (\hat{c})$ can be obtained by the following equation

ϕ_{\tilde{O}} (\hat{c}) = \int_{\hat{c} \in [\underline{f}_{1} \underline{ϕ}_{\tilde{G}_{1}} (\hat{c}) \lor \dots \lor \underline{f}_{L} \underline{ϕ}_{\tilde{G}_{L}} (\hat{c}), \; \bar{f}_{1} \bar{ϕ}_{\tilde{G}_{1}} (\hat{c}) \lor \dots \lor \bar{f}_{L} \bar{ϕ}_{\tilde{G}_{L}} (\hat{c})]} \frac{1}{\hat{c}}

(18)

where $\underline{f}_{i} = \prod_{j = 1}^{n} \underline{ϕ}_{\tilde{B}_{i, j}}$, $\bar{f}_{i} = \prod_{j = 1}^{n} \bar{ϕ}_{\tilde{B}_{i, j}}$, $n = 2$, and the '∨' operation denotes the maximum operation.

Utilizing the center-of-sets type reduction and the Karnik–Mendel method, we have

\hat{c} = [\hat{c}_{low}, \hat{c}_{high}]

(19)

\hat{c}_{low} = \frac{\sum_{i = 1}^{l} (Q \bar{f})_{i} \tilde{c}_{i} + \sum_{j = l + 1}^{L} (Q \underline{f})_{j} \tilde{c}_{j}}{\sum_{i = 1}^{l} (Q \bar{f})_{i} + \sum_{j = l + 1}^{L} (Q \underline{f})_{j}}

(20)

\hat{c}_{high} = \frac{\sum_{i = 1}^{r} (Q \underline{f})_{i} \tilde{c}_{i} + \sum_{j = r + 1}^{L} (Q \bar{f})_{j} \tilde{c}_{j}}{\sum_{i = 1}^{r} (Q \underline{f})_{i} + \sum_{j = r + 1}^{L} (Q \bar{f})_{j}}

(21)

where $\hat{c}$ is an interval set; $\hat{c}_{low}$ and $\hat{c}_{high}$ represent its left and right limits, respectively; $\hat{c}_{i}$ is the centroid of the type-2 interval consequent set $\tilde{O}_{i}$; $C = (\hat{c}_{1}, \dots, \hat{c}_{L})$ represents the original rule-ordered consequent values; $\tilde{c} = (\tilde{c}_{1}, \dots, \tilde{c}_{L}) = Q C$ satisfies $\tilde{c}_{1} \leq \tilde{c}_{2} \leq \dots \leq \tilde{c}_{L}$; and $Q$ is an $L \times L$ permutation matrix.

Thus, the defuzzified output is

\begin{matrix} \hat{c} = \frac{{\hat{c}}_{l o w} + {\hat{c}}_{h i g h}}{2} \end{matrix}

(22)
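The type reduction and defuzzification in Equations (19)–(22) can be sketched as below. Note the swapped-in technique: instead of the iterative Karnik–Mendel procedure, this sketch enumerates every switch point and takes the minimum and maximum, which yields the same interval end points for a small rule base. The function names are assumptions, and at least one rule is assumed to fire.

```python
import numpy as np

def type_reduce(centroids, f_lower, f_upper):
    """Return (c_low, c_high) of the center-of-sets type-reduced interval."""
    order = np.argsort(centroids)                     # plays the role of the permutation Q
    c = np.asarray(centroids, dtype=float)[order]
    fl = np.asarray(f_lower, dtype=float)[order]
    fu = np.asarray(f_upper, dtype=float)[order]
    lows, highs = [], []
    for k in range(len(c) + 1):                       # k rules lie left of the switch point
        w_low = np.concatenate([fu[:k], fl[k:]])      # upper firing on the small centroids
        w_high = np.concatenate([fl[:k], fu[k:]])     # upper firing on the large centroids
        if w_low.sum() > 0:
            lows.append(float(w_low @ c / w_low.sum()))
        if w_high.sum() > 0:
            highs.append(float(w_high @ c / w_high.sum()))
    return min(lows), max(highs)

def defuzzify(centroids, f_lower, f_upper):
    """Equation (22): midpoint of the type-reduced interval."""
    c_low, c_high = type_reduce(centroids, f_lower, f_upper)
    return 0.5 * (c_low + c_high)
```

When the lower and upper firing strengths coincide, the interval collapses to the ordinary weighted average of the consequent centroids.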

4. Experiment Results and Analysis

In this section, the effectiveness of our control scheme is discussed through simulations and experiments. Robots are required to perform two typical kinds of tasks including walking on flat ground and climbing stairs. The first one is based on a physical platform, while the second one is carried out on a simulation platform.

4.1. Experiment: Walking on Flat Ground

As illustrated in Figure 1 and Table 1, the biped robot BRZ-4 is set up as the test bed, which consists of two parts: the ground workstation and BRZ-4. We apply the proposed control algorithm to BRZ-4. Specifically, our scheme contains off-line and online learning parts. To improve efficiency, the off-line training is carried out in MATLAB, while the online learning is implemented in C. Moreover, the control commands and the robot states are transmitted through the RS485 bus. The basic components of the hardware system are shown in Figure 4.

One of the goals of the experiment is to control the biped robot to track the desired gait, such that stable walking is achieved. Under the newly constructed control framework, the desired gait is planned by using a spline-based parametric optimization technique [31] and contains a start gait, a periodic normal gait, and a stop gait. In addition, the planned gait is generated in the ground workstation, and the specific control commands are transmitted to BRZ-4 through the RS485 bus in real time. To illustrate the planned results, a stick animation is presented in Figure 5, in which the CoM trajectory is highlighted in red.

In the construction of the proposed IT2FWI, we take Gaussian functions with a fixed standard deviation $σ$ and an uncertain mean as the primary membership functions. By a trial-and-error procedure, the design parameters of the membership functions of the ZMP stability margin and the yaw moment are chosen as follows

σ_{z} = 0.12; \quad σ_{y} = 0.18

[\underline{c}_{i j}^{z}, \bar{c}_{i j}^{z}]_{j = 1}^{5} = \{[- 0.02, 0.02], [0.23, 0.27], [0.48, 0.52], [0.73, 0.77], [0.98, 1.02]\}

[\underline{c}_{i j}^{y}, \bar{c}_{i j}^{y}]_{j = 1}^{5} = \{[- 1.02, - 0.98], [- 0.52, - 0.48], [- 0.02, 0.02], [0.48, 0.52], [0.98, 1.02]\}

A comparison between the proposed control scheme and the one in [26] is carried out on the BRZ-4 platform. The ZMP response trajectories are plotted in Figure 6. As indicated in this figure, all the ZMP trajectories remain within the convex boundaries of the supporting foot, which implies that both methods can ensure locomotion stability in the horizontal direction. On the other hand, undesired yaw moment has a significant impact on the locomotion stability of a biped robot, as pointed out in [16,17,18]. We therefore test the effectiveness of our control scheme in compensating for yaw moment. The evolutions of yaw moment with our method and with the one in [26] are shown in Figure 7. As seen from the comparison, the yaw moment is successfully suppressed by both methods. In addition, the root-mean-square (RMS) errors of the x-axis and y-axis ZMP trajectories and of the yaw moment are recorded in Table 2. With the proposed scheme, these RMS errors are around 5.9%, 9.9%, and 20.7% lower than those in [26], respectively. Moreover, compared with the method in [26], the online learning time is reduced from 1.56 s to 0.23 s.
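For readers who want to reproduce this kind of comparison on their own logged trajectories, the RMS error and the relative reduction between two methods can be computed as sketched below. The function names are assumptions, and no values from Table 2 are reproduced here.

```python
import numpy as np

def rms_error(measured, reference):
    """Root-mean-square error between a measured and a reference trajectory."""
    diff = np.asarray(measured, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    return 100.0 * (baseline - proposed) / baseline
```

Here `baseline` and `proposed` would be the RMS errors of the two controllers on the same reference trajectory.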

Remark 1.

Compared with [26], which also focuses on optimization-based learning control design, the learning mechanism in our scheme is divided into two parts: off-line learning and online learning. By employing a flat network structure and deriving the weight-matrix updating Equation (14), the proposed method avoids retraining from scratch. As a result, the online learning time is dramatically reduced, as shown in Table 2.

4.2. Simulation: Climbing Stairs

In this case, we consider the new robot BRZ-5, whose basic parameters are given in Table 3. As indicated in Figure 8, BRZ-5 is 121 cm in height and 14.9 kg in weight. The whole simulation contains two kinds of gaits: a walking gait and a stair-climbing gait. Each gait includes six step cycles. In this simulation, the robot is required to climb stairs after walking six steps on flat ground. Moreover, comparative simulations between our proposed method and the one in [26] are conducted in PyBullet, a real-time physics simulation platform. To facilitate the comparison and analysis, the parameter settings and initial conditions are kept the same for both methods.

The comparative simulation results are given in Figure 9 and Figure 10. In particular, Figure 9 shows snapshots of stair climbing, while the ZMP response trajectories are plotted in Figure 10. From Figure 9, it can be seen that, with both methods, the robot maintains balance in the first six step cycles on flat ground. However, in the next six step cycles, the robot fell down with the method in [26], while the robot controlled by the proposed scheme successfully finished the stair-climbing task. Similar results are observed in Figure 10, in which the ZMP response trajectories are illustrated by a dot-dash black line and a solid blue line. As indicated in Figure 10, with the approach in [26], the ZMP response trajectory is basically within the ZMP boundaries in the first six steps, while an obvious deviation appears from the 8th to the 12th steps. Compared with the control scheme in [26], ours exhibits better generalization performance.

Remark 2.

A brief analysis of the above comparative results is given here. Strong environmental adaptability is one of the keys to realizing large-scale application of biped robots. However, it is almost impossible to handle all kinds of dynamic disturbances from the environment with only one well-trained model. Unlike the approach in [26], an incremental updating mechanism is integrated into our scheme. With the aid of an iterative updating algorithm, new incoming data can be used directly to update the original well-trained model, which avoids retraining from scratch. Thus, with this incremental updating mechanism, the adaptive capacity of the robot is further improved.

5. Conclusions

This paper presented a walking control framework for biped robots to deal with the online learning and control problems. Under the new framework, an incremental learning algorithm is constructed, such that newly incoming data can be integrated into the well-trained model in real time without a retraining process. Finally, experiment and simulation results verified the effectiveness of the proposed scheme.

Author Contributions

Writing—original draft preparation, L.Y.; writing—review, G.L. and Y.C.; Simulation, Z.G. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grants 61941301, 61803090, 11771102 and 61573108, in part by the China Postdoctoral Science Foundation under Grant 2018M633353, in part by the Special Program for Key Field of Guangdong Colleges under Grant 2019KZDZX1037, in part by the Science and Technology Foundation of Guangdong Province under Grants 2021A0101180005 and 2019B090910001, and in part by the Natural Science Foundation of Guangdong Province under Grant 2019A1515012109.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Yang, L.; Liu, Z.; Zhang, Y. Energy-efficient yaw moment control for humanoid robot utilizing arms swings. Int. J. Precis. Eng. Manuf. 2016, 17, 1121–1128.
2. Yu, Z.; Zhou, Q.; Chen, X.; Li, Q.; Meng, L.; Zhang, W.; Huang, Q. Disturbance rejection for biped walking using zero-moment point variation based on body acceleration. IEEE Trans. Ind. Inform. 2018, 15, 2265–2276.
3. Kim, S.; Hirota, K.; Nozaki, T.; Murakami, T. Human motion analysis and its application to walking stabilization with COG and ZMP. IEEE Trans. Ind. Inform. 2018, 14, 5178–5186.
4. Kim, E.; Kim, T.; Kim, J.W. Three-dimensional modelling of a humanoid in three planes and a motion scheme of biped turning in standing. IET Control Theory Appl. 2009, 3, 1155–1166.
5. Lai, X.; Zhang, A.; She, J.; Wu, M. Motion control of underactuated three-link gymnast robot based on combination of energy and posture. IET Control Theory Appl. 2011, 5, 1484–1493.
6. Tamayo, A.J.M.; Bustamante, P.V.; Ramos, J.J.M.; Cobo, A.E. Inverse models and robust parametric-step neuro-control of a Humanoid Robot. Neurocomputing 2017, 233, 90–103.
7. Winkler, A.W.; Farshidian, F.; Pardo, D.; Neunert, M.; Buchli, J. Fast trajectory optimization for legged robots using vertex-based zmp constraints. IEEE Robot. Autom. Lett. 2017, 2, 2201–2208.
8. Huan, T.T.; Van Kien, C.; Anh, H.P.H.; Nam, N.T. Adaptive gait generation for humanoid robot using evolutionary neural model optimized with modified differential evolution technique. Neurocomputing 2018, 320, 112–120.
9. Yan, L.; Zhen, T.; Kong, J.L.; Wang, L.M.; Zhou, X.L. Walking Gait Phase Detection Based on Acceleration Signals Using Voting-Weighted Integrated Neural Network. Complexity 2020, 2020, 4760297.
10. Kuindersma, S.; Deits, R.; Fallon, M.; Valenzuela, A.; Dai, H.; Permenter, F.; Koolen, T.; Marion, P.; Tedrake, R. Optimization-based locomotion planning, estimation, and control design for the atlas humanoid robot. Auton. Robot. 2016, 40, 429–455.
11. Zhang, X.; Zhang, J.; Zhong, J. Skill Learning for Intelligent Robot by Perception-Action Integration: A View from Hierarchical Temporal Memory. Complexity 2017, 2017, 7948684.
12. Kajita, S.; Kanehiro, F.; Kaneko, K.; Fujiwara, K.; Harada, K.; Yokoi, K.; Hirukawa, H. Biped walking pattern generation by using preview control of zero-moment point. In Proceedings of the IEEE International Conference on Robotics and Automation, Taipei, Taiwan, 14–19 September 2003; Volume 2, pp. 1620–1626.
13. Lee, B.J.; Stonier, D.; Kim, Y.D.; Yoo, J.K.; Kim, J.H. Modifiable Walking Pattern of a Humanoid Robot by Using Allowable ZMP Variation. IEEE Trans. Robot. 2008, 24, 917–925.
14. Shin, H.K.; Kim, B.K. Energy-Efficient Gait Planning and Control for Biped Robots Utilizing the Allowable ZMP Region. IEEE Trans. Robot. 2014, 30, 986–993.
15. Caron, S.; Pham, Q.C.; Nakamura, Y. Zmp support areas for multicontact mobility under frictional constraints. IEEE Trans. Robot. 2017, 33, 67–80.
16. Hirabayashi, T.; Ugurlu, B.; Kawamura, A.; Zhu, C. Yaw moment compensation of biped fast walking using 3D inverted pendulum. In Proceedings of the 10th IEEE International Workshop on Advanced Motion Control, Trento, Italy, 26–28 March 2008; pp. 296–300.
17. Park, J. Synthesis of natural arm swing motion in human bipedal walking. J. Biomech. 2008, 41, 1417–1426.
18. Xing, D.; Su, J. Arm/trunk motion generation for humanoid robot. Sci. China Inf. Sci. 2010, 53, 1603–1612.
19. Ugurlu, B.; Saglia, J.A.; Tsagarakis, N.G.; Caldwell, D.G. Yaw moment compensation for bipedal robots via intrinsic angular momentum constraint. Int. J. Humanoid Robot. 2012, 9, 1250033.
20. Fu, G.; Chen, J.; Yang, Y. A yaw moment counteracting method for humanoid robot based on arms swinging. Jiqiren (Robot) 2012, 34, 498–504.
21. Yang, L.; Fu, Y.; He, H. A Yaw Moment Control Method for Humanoid Robot Based on Leg Joints Control. Control Decis. 2016, 31, 79–83.
22. Yu, W.; Bao, G.; Wang, Z.; Wu, W. Fast walking pattern generation for humanoid robot using waist joint moment. Jiqiren (Robot) 2010, 32, 219–225.
23. Hu, L.; Zhou, C.; Sun, Z. Estimating biped gait using spline-based probability distribution function with Q-learning. IEEE Trans. Ind. Electron. 2008, 55, 1444–1452.
24. Erbatur, K.; Kurt, O. Natural ZMP trajectories for biped robot reference generation. IEEE Trans. Ind. Electron. 2009, 56, 835–845.
25. Tao, G. Online Regulation of the Walking Speed of a Planar Limit Cycle Walker via Model Predictive Control. IEEE Trans. Ind. Electron. 2013, 61, 2326–2333.
26. Wang, L.; Liu, Z.; Chen, C.L.P.; Zhang, Y.; Lee, S.; Chen, X. Energy-efficient SVM learning control system for biped walking robots. IEEE Trans. Neural Netw. Learn. Syst. 2013, 24, 831–837.
Yang, L.; Deng, C. Yaw Moment Compensation for Humanoid Robot via Arms Swinging. Open Autom. Control. Syst. J. 2014, 6, 1371–1377. [Google Scholar] [CrossRef] [Green Version]
Yang, L.; Liu, Z.; Zhang, Y. Online walking control system for biped robot with optimized learning mechanism: An experimental study. Nonlinear Dyn. 2016, 86, 2035–2047. [Google Scholar] [CrossRef]
Chen, C.; Liu, Z. Broad Learning System: An Effective and Efficient Incremental Learning System Without the Need for Deep Architecture. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 10–24. [Google Scholar] [CrossRef]
Liu, Z.; Zhang, Y.; Wang, Y. A Type-2 Fuzzy Switching Control System for Biped Robots. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 2007, 37, 1202–1213. [Google Scholar] [CrossRef]
Bessonnet, G.; Seguin, P.; Sardain, P. A parametric optimization approach to walking pattern synthesis. Int. J. Robot. Res. 2005, 24, 523–536. [Google Scholar] [CrossRef]

Figure 1. Kinematic model for BRZ-4.

Figure 2. Forces acting on the supporting foot.

Figure 3. Architecture of proposed control system for biped robot.

Figure 4. Basic components of hardware system.

Figure 5. Animation of planned gait.

Figure 6. ZMP trajectories for BRZ-4 with two methods.

Figure 7. Yaw moment evolutions for BRZ-4 with two methods.

Figure 8. Kinematic model for BRZ-5.

Figure 9. Snapshots of climbing stairs (simulation platform: PyBullet).

Figure 10. ZMP trajectories for BRZ-5 with two methods.

Table 1. Basic physical parameters of BRZ-4.

BRZ-4 link	Trunk	Thigh	Shank	Arm	Foot
Length (cm)	20	17.5	15.5	24.7	5
Mass (kg)	1.05	0.256	0.156	0.156	0.075

Table 2. Comparison of control performance.

Metric	Proposed Method	Method in [26]
RMS error ($x_{zmp}$)	0.0322	0.0341
RMS error ($y_{zmp}$)	0.0251	0.0276
RMS error ($M_{z}$)	0.0617	0.0745
Learning time of each cycle (s)	0.23	1.56
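
The gaps in Table 2 are easier to compare as relative reductions. The short sketch below recomputes them from the tabulated values only. The dictionary keys and the helper `relative_reduction` are illustrative names, not part of the original work, and the table does not state units for the RMS errors, so only ratios are reported. Under these assumptions the proposed method lowers the RMS errors by roughly 6%, 9%, and 17% for $x_{zmp}$, $y_{zmp}$, and $M_{z}$, and the learning time per cycle by about 85%.

```python
# Values copied from Table 2 as (proposed method, method in [26]).
# The key names are illustrative labels chosen for this sketch.
table2 = {
    "rms_x_zmp": (0.0322, 0.0341),
    "rms_y_zmp": (0.0251, 0.0276),
    "rms_M_z": (0.0617, 0.0745),
    "learning_time_s": (0.23, 1.56),
}


def relative_reduction(proposed, baseline):
    """Fraction by which the proposed value is lower than the baseline."""
    return (baseline - proposed) / baseline


# Relative reduction for every row of the table.
reductions = {
    name: relative_reduction(proposed, baseline)
    for name, (proposed, baseline) in table2.items()
}

for name, value in reductions.items():
    print(f"{name}: {100 * value:.1f}% lower than the method in [26]")
```

The tracking-error gains are modest, while the learning-time row shows the largest difference, which is consistent with the incremental update avoiding a full retraining in each cycle.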

Table 3. Basic physical parameters of BRZ-5.

BRZ-5 link	Trunk	Thigh	Shank	Foot
Length (cm)	38.3	37.1	35.0	11.0
Mass (kg)	1.38	3.58	2.18	1.048

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
