Next Article in Journal
Vibration Analysis of a Tetra-Layered FGM Cylindrical Shell Using Ring Support
Next Article in Special Issue
A Review on the Recent Development of Planar Snake Robot Control and Guidance
Previous Article in Journal
Reconstruction and Prediction of Chaotic Time Series with Missing Data: Leveraging Dynamical Correlations Between Variables
Previous Article in Special Issue
Fixed-Time Event-Triggered Control of Nonholonomic Mobile Robots with Uncertain Dynamics and Preassigned Transient Performance
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

An Improved Hierarchical Optimization Framework for Walking Control of Underactuated Humanoid Robots Using Model Predictive Control and Whole Body Planner and Controller

1
School of Optoelectronic Information and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
2
Institute of Machine Intelligence, University of Shanghai for Science and Technology, Shanghai 200093, China
3
Shanghai Droid Robot Co., Ltd., Shanghai 200093, China
4
Department of Informatics, University of Hamburg, 20146 Hamburg, Germany
*
Author to whom correspondence should be addressed.
Mathematics 2025, 13(1), 154; https://doi.org/10.3390/math13010154
Submission received: 28 November 2024 / Revised: 29 December 2024 / Accepted: 2 January 2025 / Published: 3 January 2025

Abstract

:
This paper addresses the fundamental challenge of achieving stable and efficient walking in a lightweight, underactuated humanoid robot that lacks an ankle roll degree of freedom. To tackle this relevant critical problem, we present a hierarchical optimization framework that combines model predictive control (MPC) with a tailored whole body planner and controller (WBPC). At the high level, we employ a matrix exponential (ME)-based discretization of the MPC, ensuring numerical stability across a wide range of step sizes (5 to 100 ms), thereby reducing computational complexity without sacrificing control quality. At the low level, the WBPC is specifically designed to handle the unique kinematic constraints imposed by the missing ankle roll DOF, generating feasible joint trajectories for the swing foot phase. Meanwhile, a whole body control (WBC) strategy refines ground reaction forces and joint trajectories under full-body dynamics and contact wrench cone (CWC) constraints, guaranteeing physically realizable interactions with the environment. Finally, a position–velocity–torque (PVT) controller integrates feedforward torque commands with the desired trajectories for robust execution. Validated through walking experiments on the MuJoCo simulation platform using our custom-designed lightweight robot X02, this approach not only improves the numerical stability of MPC solutions, but also provides a scientifically sound and effective method for underactuated humanoid locomotion control.

1. Introduction

Humanoid robots, as highly articulated floating-base robotic systems, have become a focal point of whole body control (WBC) research over the past decade. Underactuated leg structures lacking ankle roll joints can simplify robot design and reduce costs, but also impose significant control challenges for WBC. Whole body plan (WBP) methods based on operational space (OS) control [1], and null space (NS) techniques [2] have been widely applied to biped humanoid robots for tasks such as dynamic walking [3], jumping [4], and manipulation [5]. These methods present two main advantages for humanoid robots: First, they enable the design of NS-based multi-priority tasks, which are crucial for scenarios like walking that involve stance leg contact, swing leg planning, and center of mass (CoM) maintenance [6], thus ensuring orderly trajectory planning. Second, the mapping operations between OS and joint space Jacobians can reduce the influence of a specific task space degree of freedom (DOF) on joint space, enhancing control reliability over underactuated leg structures [7,8]. Furthermore, walking task control needs to meet the reliability requirements of high-frequency control commands, actual physical constraints, and accurate ground reaction forces [9,10]. WBC based on contact force constraints and dynamic constraints has also been widely applied in humanoid robots. A WBC approach involving ground reaction forces and whole-body dynamic constraints was proposed to calculate desired joint torques [11]. A hybrid approach combining NS kinematic planning and friction cone-constrained whole-body dynamic optimization has been shown to effectively control humanoid gait [12].
Although WBC can generate high-frequency, real-time joint control commands to drive humanoid or quadruped robots, the solutions produced by WBC are typically local optima [13]. Model predictive control (MPC), as an optimization-based approach with a finite prediction horizon, can fully exploit the robot’s kinematic range and torque limits within this horizon [14,15]. Furthermore, MPC’s optimization result adheres to the Bellman optimality principle, ensuring that each update yields a globally optimal solution with continuity [16]. However, for high-dimensional humanoid or quadruped robots, the combination of a long prediction horizon and system complexity significantly increases the dimensionality of the optimization variables, which is especially burdensome for nonlinear MPC [17], leading to computational challenges that undermine real-time performance. Common solutions include simplifying the humanoid robot model to an inverted pendulum [18,19,20] or a single rigid-body dynamic (SRBD) model [21,22,23], using linearized convex MPC models [24,25], and employing discretization techniques such as forward Euler (FE) or matrix exponential (ME) methods [26,27,28,29]. However, simplifying the robot model or MPC model, or applying discretization inevitably reduces the accuracy of MPC solutions, making it infeasible to directly use the output for robot control in complex systems.
Despite these advancements, three key challenges remain: underactuated Robot Control, optimization numerical stability, and integrated optimally planning and control. This study aims to develop an improved hierarchical optimization control strategy for underactuated humanoid robots by integrating the global optimal control capabilities of MPC with the high-frequency, precise control capabilities of WBC. First, for the high-level optimization, a convex MPC based on the SRBD model (as in MIT Cheetah 3 [21]) is employed to generate optimal CoM trajectories and estimated ground reaction forces. To ensure numerical stability in high-stiffness systems and reduce computational load, the ME method is utilized instead of the FEr method, allowing for accurate predictions even with larger discretization steps. Second, for the low-level optimization, a multi-priority WBP is designed to generate desired trajectories from operational space to joint space for an underactuated humanoid robot without ankle roll joints. Since the simplified MPC model cannot directly provide accurate ground reaction forces for WBC, further optimization of these forces is necessary. As described in [30], the foot contact surface can be modeled as a polygon, and the resultant wrench from multiple contact points must satisfy the contact wrench cone (CWC) constraints to ensure feasible contact forces and a stable contact mode. Thus, a WBC strategy is constructed with constraints on whole-body dynamics, joint torques, and CWC. This ensures that the output joint torques comply with physical constraints while achieving precise trajectory tracking. The WBC’s optimization objective focuses on incremental ground reaction forces and joint accelerations. Using the upstream planner’s output as the initial solution to the QP problem enables warm-starting, which accelerates convergence, improves continuity, and enhances smoothness of the output.
To address the challenging problem of whole-body complex planning and control for underactuated humanoid robots, the proposed method has the following three innovations:
  • The proposed convex MPC strategy employs ME discretization method, ensuring numerical stability even with large discretization steps. This approach enables globally optimal trajectory prediction while reducing computational costs.
  • The proposed complementary hierarchical optimization framework integrates the high-level MPC planner and the low-level WBPC, effectively balancing the need for globally optimal trajectories and high-frequency control commands that comply with physical constraints. This enhances system real-time performance, stability and robustness.
The proposed framework is validated on X02: a lightweight underactuated humanoid robot lacking ankle roll DOF. Walking experiments verify the effectiveness of the hierarchical optimization strategy. The subsequent sections are organized as follows: Section 2 introduces the preliminaries, including MPC, SRBD, and WBP; Section 3 presents the proposed hierarchical optimization framework; Section 4 conducts walking simulation experiments on the X02 robot based on MuJoCo; Section 5 discusses the experimental results; and Section 6 concludes with a summary and future prospects.

2. Preliminaries

2.1. Single Rigid-Body Dynamics Model

For a single rigid robot, we can use its simplified approximate linear dynamics equations as equality constraints. The dynamics equation of a single rigid robot in a world coordinate frame is as follows:
p ¨ = i = 1 n f i m g d d t ( I ω ) = i = 1 n r i × f i R ˙ = [ ω ] × R
where f i R 3 is ground reaction force, and n represents the number of contact points. The distance from the CoM to each contact point is r i R 3 . ω R 3 is the robot’s angular velocity. p R 3 is the robot’s position, m is the mass, g R 3 is the gravitational acceleration, I R 3 is the robot’s inertia tensor, and R R 3 × 3 is the orthogonal coordinate transformation matrix from the body coordinate system to the world coordinate system. [ x ] × R 3 × 3 is the skew-symmetric matrix.
According to the linearization process in the literature [21], by neglecting the effects of roll and pitch in the Euler angles and only considering the case where the robot’s yaw angle changes slowly, the angular momentum equation in Equation (1) can be approximated as
d d t ( I ω ) = I ω ˙ + ω × ( I ω ) I ω ˙
where I can be approximated by I ^ = R b I R T ,   b I is the inertia tensor of the robot in the body coordinate system, which is a constant physical property.
Thus, the approximated simplified SRBD is
d d t Θ ^ p ^ ω ^ p ˙ ^ = 0 3 0 3 R ( ψ ) 0 3 0 3 0 3 0 3 1 3 0 3 0 3 0 3 0 3 0 3 0 3 0 3 0 3 Θ ^ p ^ ω ^ p ˙ ^ + 0 3 0 3 0 3 0 3 I ^ 1 [ r 1 ] × I ^ 1 [ r n ] × 1 3 m 1 3 m f 1 f n + 0 0 0 g
The compact form is
x ˙ ( t ) = A c ( ψ ) x ( t ) + B c ( r 1 , , r n , ψ ) u ( t )
where Θ = [ ϕ , θ , ψ ] T represents the Euler angles in the Z-Y-X sequence, A c R 13 × 13 and B c R 13 × 3 n represent the state transition matrix and control matrix, respectively.

2.2. Convex MPC Problem Formulation

Convex MPC is a special case of the MPC method. This advanced control strategy is capable of solving for the control inputs over a finite time horizon based on the system’s predictive model, ensuring that the system follows the desired trajectory. The convex MPC problem for a SRBD [21], constrained by the aforementioned dynamics model, with the input being the desired trajectory of the CoM x r e f , and the output being the predicted trajectory x ( t ) , t [ 0 , T ] and the sequence of contact forces u ( t ) , t [ 0 , T ] , can be formulated as
min x ( · ) , u ( · ) 0 T | | x ( t ) x ref ( t ) | | Q + | | u ( t ) | | S d t s . t . x ˙ ( t ) = A c x ( t ) + B c u ( t ) C ( t ) u ( t ) = 0 d ̲ ( t ) D ( t ) u ( t ) d ¯ ( t )
The cost function is a convex quadratic form, which aligns with the definition of convex MPC. The term | | a | | s represents a weighted two-norm. The equality constraints consist of the linear dynamics model and the condition that the aerial (non-contact) foot does not experience any forces. The inequality constraints impose physical upper and lower bounds on the contact forces. C ( t ) is a matrix which selects forces corresponding with feet not in contact with the ground.

2.3. Whole Body Planner

Based on NS and OS theory, this section introduce a multi-task priority WBP. The input for the planner includes the desired trajectory and real-time state of task i in the corresponding OS, along with the Jacobian matrix. By computing the NS projection matrix and the mapped Jacobian matrix, incremental kinematic control commands in the joint space are obtained based on the current joint configuration.
First, it is necessary to calculate the pose error, velocity error, and acceleration error between the desired and current values for each task in the OS.
e i = x i d x i J i q i 1 d e ˙ i = x ˙ i d J i q ˙ i 1 d e ¨ i = x ¨ i d ( J ˙ i q ˙ + J i q ¨ i 1 d )
where x i d , x ˙ i d , x ¨ i d and J i are the desired trajectory and Jacobian matrix of task i in OS, respectively. q R ( n + 6 ) × ( n + 6 ) , q ˙ R ( n + 6 ) × ( n + 6 ) , q ¨ R ( n + 6 ) × ( n + 6 ) represents the position, velocity, and acceleration of the robot in the generalized coordinate system. q i d , q ˙ i d , q ¨ i d represent the joint space incremental trajectory output from WBP. J i q i 1 d , J i q ˙ i 1 d and J ˙ i q ˙ + J i q ¨ i 1 d are the influence of task i 1 with respect to task i.
Next, calculate the NS projection matrix and the Jacobian projected into the NS,
J i | p r e = J i N i 1 N i 1 = I J i 1 | p r e J i 1 | p r e
where J i 1 | p r e = W 1 J i 1 | p r e J i 1 | p r e W 1 J i 1 | p r e + λ I is the weighted right pseudo-inverse of J i 1 | p r e . W R n × n is a weighted diagonal matrix.
Finally, compute the joint control command increments from task space to joint space,
q i d = q i 1 d + J i | p r e e i q ˙ i d = q ˙ i 1 d + J i | p r e e ˙ i q ¨ i d = q ¨ i 1 d + J i | p r e e ¨ i
The above calculation process provides the method for computing the joint space control commands for task i. In practice, it is necessary to iteratively calculate for tasks i = 1 , 2 , K to obtain the total joint control command increments.
Remark 1.
When designing multi-priority tasks, it is important to ensure that tasks do not conflict with each other. According to null space theory, higher-priority tasks will take precedence in utilizing the degrees of freedom. It is crucial to carefully allocate weights among different tasks. By adjusting the elements of the weight matrix W R n × n , the influence of a particular task on the joint space degrees of freedom can be strengthened or weakened, thereby avoiding conflicts between tasks.

2.4. Joint Space PVT Controller

The joint space controller uses a hybrid position–velocity–torque (PVT) controller, taking the desired velocity q ˙ d , the desired position q d , and the feedforward torque of the joint τ f f as inputs. The specific form of the controller from reference [31] is
τ = τ ff + k p q d q + k d q ˙ d q ˙

3. Improved Hierarchical Optimization Framework

An improved hierarchical optimization framework is proposed for high-frequency real-time whole-body motion control of humanoid robots. The hierarchical optimization algorithm consists of two parts: (1) a convex MPC high-dimensional planner based on ME discretization, used for planning the centroid trajectory and desired ground contact forces within the optimization finite-time horizon; (2) a low-level WBC based on the output of MPC and WBP. The desired joint torques computed by WBC are combined with the desired joint trajectories obtained by WBP to feed to PVT controller that driving the motor moving.

3.1. High Level Planner Based on Convex MPC

This section introduces a convex MPC control strategy based on ME discretization. The method treats the entire robot as a SRBD subject to forces at the contact patches. For the high-level planning problem of multi-body robots, the influence of leg dynamics can be neglected while maintaining robust system performance. Taking the robot used in this study as an example, due to its excellent structural design, the mass of the arms and legs accounts for only 30% of the total mass. Through model simplification and ME discretization, the algorithm ensures real-time performance while maintaining the accuracy of the model approximation.
The ME method is employed for the discretization of the system, offering a robust approach to handle the inherent stiffness in dynamic models. This method ensures greater numerical stability by accurately capturing both the fast and slow dynamics of the system, even with larger time steps, compared to traditional explicit methods. In particular, humanoid robot dynamics, despite simplifications, exhibit stiffness due to rapid variations in ground reaction forces and frequent gait changes. Explicit discretization methods, such as FE, often require extremely small time steps to maintain stability in stiff systems, which can significantly reduce computational efficiency. By contrast, the ME method mitigates this challenge, enabling stable and efficient numerical solutions without the need for excessively small time steps.
The convex MPC (4) problem after discretization using the ME method is
min { x , u } k = 0 N 1 | | x k + 1 x ref , k + 1 | | Q + | | u k | | S s . t . x k + 1 = e A c Δ t x k + 0 Δ t e A c τ d τ B c u k , C k u k = 0 , d ̲ k D k u k d ¯ k ,
In a linear time-varying system, although the state matrix changes over time, it can still be regarded as locally linear at each discrete time step, and thus these matrices remain linear. The solution to the problem is repeated at every iteration, but only the control input for the first step is used for control. Therefore, the linear time-varying system can still be considered to meet the premise of convex MPC.
The above problem needs to be transformed into the standard QP form for solving. First, by concatenating the state equations at each time step, the system’s state can be expressed as a function of the initial state x 0 and the sequence of control inputs,
x k = Φ k x 0 + i = 0 k 1 Φ k i 1 Γ u i
where Φ = e A c Δ t and Γ = 0 Δ t e A c τ d τ B c . By combining all of the states { x 1 , x 2 , , x N } , we obtain the following form:
x 1 x 2 x N = Φ Φ 2 Φ N Φ q p x 0 + Γ 0 0 Φ Γ Γ 0 Φ N 1 Γ Φ N 2 Γ Γ Γ q p u 0 u 1 u N 1
where Φ q p is the condensed form of the state transition matrix, and Γ q p is the condensed form of the control input matrix.
Express all the states and control inputs within the prediction horizon in vector form as follows:
J = 1 2 X X ref Q X X ref + 1 2 U S U
where X = x 1 , x 2 , , x N , X ref = x ref , 1 , x ref , 2 , , x ref , N , and U = u 0 , u 1 , , u N 1 .
According to the method for constructing QP outlined in the reference [21], the following standard QP form can be obtained:
min U 1 2 U P U + U q s . t . d ̲ D U d ¯
where P = 2 Γ q p Q Γ q p + S , and q = 2 Γ q p Q e = 2 Γ q p Q Φ q p x 0 X ref .
Remark 2.
Compared to the FE method for discretization, the ME method yields a more accurate numerical solution and effectively addresses numerical stiffness issues. To mitigate the increased computational complexity and the additional computational burden from the integral term in the ME method, the following improvements can be made in numerical calculations: By leveraging the sparse characteristics of the state transition matrix  A c , a block matrix approach can be used to reduce the dimensionality of matrix computations and improve the efficiency of calculating the ME. Additionally, based on the form of the ME, the integral for Γ can be converted into a matrix calculation 0 Δ t e A c τ d τ = A c 1 e A c Δ t I . With these optimizations, the ME method can maintain excellent accuracy while reducing the computational load.

3.2. Low Level Controller Based on WBPC

Based on the WBP theory, reference trajectories q K d , q ˙ K d , q ¨ K d in joint space can be obtained by iteratively solving K OS tasks. The estimated contact forces F r , m p c from the MPC and the joint acceleration q ¨ K d from the WBP are used as input commands for the WBC, where an incremental form of the WBC QP problem is constructed.
Using the whole-body dynamics model as equality constraints and CWC as inequality constraints, the joint control torques optimized by WBC satisfy the actual physical constraint. The QP based WBC problem is formulated as follows:
min Δ F r ( · ) , Δ q ¨ K | | Δ F r ( t ) | | W r + | | Δ q ¨ K ( t ) | | W k
S c 1 M ( q ¨ K ( t ) + Δ q ¨ K ( t ) ) + S c 1 b + S c 1 g = S c 1 J c T ( F r , m p c + Δ F r )
S c 2 M ( q ¨ K ( t ) + Δ q ¨ K ( t ) ) + S c 2 b + S c 2 g = τ + S c 2 J c T ( F r , m p c + Δ F r )
lb ( τ ) τ ub ( τ )
μ F c , z F c , f x 2 + F c , f y 2
F c , z F c , τ x 2 + F c , τ y 2 X 2 + Y 2
ub ( F c , τ z ) F c , τ z ub ( F c , τ z )
where Δ F r ( · ) , Δ q ¨ K represent the optimized increments of contact force and generalized acceleration, respectively. W r and W k represent the weights of the two optimization variables, respectively. M , b , and g represent the generalized mass/inertia matrix, the sum of Coriolis/centrifugal forces, and gravity forces, respectively. Equation (14) indicates that the stability of the CoM can be maintained even after adjusting the CoM acceleration and contact forces. S c 1 R 6 × ( n + 6 ) is the centroidal coordinate selection matrix. Equations (15) and (16) indicate that after fine-tuning the joint acceleration and contact forces; the resulting torques must satisfy the physical upper and lower limit constraints. S c 2 R n × ( n + 6 ) is the joint space selection matrix. J c is the contact Jacobian matrix. Equations (17)–(19) jointly form the CWC. X and Y represent the half-length and half-width of the foot contact surface, respectively.
After obtaining Δ F r ( · ) , Δ q ¨ K , the torque control command for the joint actuators, τ f f R n × 1 , can be solved using the dynamic model
M ( q ¨ K ( t ) + Δ q ¨ K ( t ) ) + b + g = 0 6 × 1 τ ff + J c T ( F r , m p c + Δ F r )
The joint space controller uses the PVT controller, taking the desired velocity q ˙ K d and desired position q K d from WBP, as well as the joint feedforward torque τ f f from WBC as inputs. The specific form of the controller is
τ = τ ff + k p q K d q + k d q ˙ K d q ˙
The pseudocode for the low-level control algorithm based on WBPC and PVT is shown in Algorithm 1.
Algorithm 1 Low-Level Controller Based on WBPC and PVT
Require: 
x i d , x ˙ i d , x ¨ i d , J i , x i , q ˙ , q ¨ , x ¨ c , F r , m p c , M , b , g , J c  ▹ The inputs required for the algorithm
Ensure: 
τ                    ▹ The output of the algorithm
   1:
Initialization:  N 0 = I , q 0 d = 0 , q ˙ 0 d = 0 , q ¨ 0 d = 0
   2:
for  i = 1 , 2 , , K  do
   3:
     ( e i , e ˙ i , e ¨ i ) computeErr ( x i d , x ˙ i d , x ¨ i d , J i , x i , q ˙ , q ¨ )    ▹ Compute the error for task i
   4:
    Calculate the null space projection matrix N i 1
   5:
    Compute the mapped Jacobian matrix J i | p r e
   6:
     ( q i d , q ˙ i d , q ¨ i d ) planJointTraj ( J i | p r e , e i , e ˙ i , e ¨ i ) ▹ Get the trajectory increments for task i
   7:
end for
   8:
Save the final trajectory: q K d , q ˙ K d , q ¨ K d
   9:
Compute the dynamic equality constraints and CWC constraints
 10:
Formulate the QP problem for the objective function
 11:
Solve the QP problem to obtain increments Δ F r , Δ q ¨ K
 12:
Calculate the feedforward torque τ ff from the whole-body dynamic model
 13:
Calculate the PVT command: τ = τ ff + k p q K d q + k d q ˙ K d q ˙
 14:
Return  τ

4. Simulation Experiment

In this section, we validate the effectiveness of the proposed hierarchical optimization control algorithm on the X02 platform, which lacks the ankle roll DOF. First, we design a WBP multi-priority task adapted to the absence of the ankle roll DOF. Next, we evaluate the numerical stability and prediction accuracy of the convex MPC based on the ME method under different step sizes. Finally, we validate the effectiveness of the incremental WBC based on CWC constraints and whole-body dynamics constraints, along with an interpretation of joint trajectory tracking.

4.1. Experimental Setup

The X02 is a unique, lightweight humanoid robot designed by us, characterized by low energy consumption and long endurance. The leg degrees of freedom and sequence of X02 are: {hip_yaw, hip_roll, hip_pitch, knee_pitch, ankle_pitch}. In this experiment, the X02 robot model, shown in Figure 1, is loaded onto the MuJoCo simulation platform. The robot model parameters and MuJoCo physics engine settings are shown in Table 1. The control algorithm architecture is modified based on the OpenLoong [32] framework. To facilitate the deployment of the control algorithm on the physical robot, we have stripped the simulation portion from the original framework. The upper-level control and lower-level drives communicate using Google gRPC. The control program is deployed on a Lenovo Y9000R laptop with an Ubuntu 22.04 operating system, Intel i9-14900HX × 32 CPU, and 32 GB of memory. The upper-level optimization algorithm runs at 1k Hz, while the lower-level drive operates at 2k Hz.

4.2. Implementation Details

The multi-priority tasks based on WBP are prioritized as follows: maintaining_contact, holding_CoM_pose, tracking_swing_leg, in order from high to low priority. In the tracking_swing_leg task, due to the absence of the ankle roll DOF, we need to set the roll row in both the Jacobian J c and its derivative dJ c to 0, as well as set the roll component of the error vector e r o l l to 0. This eliminates the contribution of the roll direction to the control of each joint when using WBP for joint-space trajectory planning.

4.3. Results of Convex MPC

In this experiment, The whole parameters are designed as follows in Table 2.
In Figure 2, the first 5 s represent the simulation’s stable standing phase, and the MPC is introduced at the fifth second. The sampling time step increases sequentially from top to bottom in the figure. From the data, it can be observed that both methods exhibit large errors in the y-direction before 10 s. This is due to the first step of the robot’s gait planning based on a finite state machine, which drives the swing leg to move laterally. As the sampling time step increases, the FE method shows significant prediction errors caused by numerical instability, while the ME method does not. From the MSE metrics in the table and the trend in the figure, it can be seen that both methods achieve the smallest prediction error at d t = 10 ms, rather than at d t = 5 ms. This is because too small a time step increases the computational load and the setup of the QP solver. Table 3 also shows that the average solution time for ME is longer than for FE. Overall, under all four sampling time steps, the ME discretization method, compared to the FE discretization method, achieves smaller prediction errors and greater numerical stability, albeit at the cost of longer solving times.

4.4. Results of WBPC

The relevant parameters of the WBC optimization problem are presented in Table 4. The joint torque limiting sequence is as follows: hip yaw, hip roll, hip pitch, knee pitch, and ankle pitch.
From Figure 3, it can be observed that the WBC further optimizes the contact forces predicted by the MPC based on the CWC constraints and dynamic constraints in joint space. The figure illustrates the optimized contact forces for several complete gait cycles of both the left and right feet. The foot–ground contact coordinate system aligns with the global coordinate system, following the right-hand convention. For the force in the y-direction, the output by the WBC for the left foot is negative, while it is positive for the right foot, aligning better with the physical scenario. For the z-direction force, the WBC outputs smoother results that satisfy the CWC constraints.
Figure 4 and Figure 5 show the joint torques performance and joint trajectories response, respectively. In the figure, the abbreviations in the legend are defined as follows: LHY represents “left hip yaw”, LHP represents “left hip pitch”, LHR represents “left hip roll”, LKP represents “left knee pitch”, and LAP represents “left ankle pitch”. Similarly, R indicates the right, and the other abbreviations follow the same naming convention. From Figure 4, it can be seen that the optimized torques output by the WBC satisfy the physical constraints of the joint torque limits. The hip yaw joint contributes minimally during forward walking, requiring only a small torque for stabilization. Notably, the hip yaw motors of X02 rotate in opposite directions; to maintain stability in the yaw direction, the torque control commands exhibit a synchronized trend. During the swing phase, the knee pitch experiences nearly zero torque, while during the support phase, it generates significant torque. The pitch joints of the hip, knee, and ankle exhibit periodic contributions during the transition between the stance and swing legs. From the right subplot of Figure 5, it can be observed that since the motor angles are zero when X02 is standing upright, the knee pitch angle during walking approximately equals the sum of the hip pitch and ankle pitch angles to maintain the stability of the center of mass.

5. Discussion

In the experimental section of this study, we compare the proposed ME-based MPC algorithm with the FE-based MPC algorithm used in previous research. The experimental results show that under large discrete step sizes, the ME approach can still ensure numerical stability. The MSE metric for ME is generally an order of magnitude lower than that of FE, reaching the level of 10 2 . However, due to the integral nature of the ME approach, its computational load increases, leading to longer solution times compared to the FE approach. To reduce the computational cost of the ME method, we leverage the derivative property of the exponential function to convert the matrix exponential integration into 0 Δ t e A c τ d τ = A c 1 e A c Δ t I , thus shortening the solution time. From the average solution-time comparison data, the ME approach is only about 0.2 ms slower than the FE approach, which is fully acceptable for large discrete step sizes. Therefore, when solving MPC problems with large step sizes, the ME approach provides both numerical accuracy and computational efficiency.
In the WBC algorithm, ground reaction force is key to driving joint motion and maintaining the stability of the floating base. In the traditional WBC algorithm [9], the lack of prior estimation for foot contact forces means that the initial value is typically set to zero, often requiring multiple iterations to find the optimal solution—particularly costly when gait switches occur frequently during walking. In this study, we use the ground reaction forces computed by the MPC as the initial guess for the WBC optimization in an incremental optimization manner, allowing a rapid and accurate determination of the foot contact forces needed to drive joint motion. From the experimental data, although the ground reaction forces output by the MPC are not fully accurate or continuous, they converge quickly and become smooth and continuous after WBC optimization in each gait cycle for both feet.

6. Conclusions and Future Work

This study proposes an improved optimization control algorithm based on MPC and WBPC to address the numerical stability, computational efficiency, and real-time requirements of humanoid robot walking tasks. By comparing two discretization-based MPC algorithms under different step sizes with respect to convergence time and error accuracy, the results indicate that the ME method is more suitable for solving the MPC problem of a simplified humanoid robot model using large discrete step sizes, ensuring numerical stability and reducing computational demands. Moreover, to meet the high-frequency real-time requirements and physical constraints of humanoid walking, the ground reaction forces estimated by the MPC are utilized as the initial guess for the WBC optimization problem, combined with CWC constraints in an incremental optimization scheme for joint trajectories and feedforward torques. This reduces the number of optimization iterations while enhancing the accuracy and real-time performance of the numerical solutions. Simulation-based walking experiments verify the feasibility of the proposed algorithm. By integrating the global optimal solution from MPC with the high-frequency real-time characteristics of WBC, the proposed hierarchical optimization framework improves overall system robustness. Leveraging a discretization approach with stronger numerical stability and a hierarchical optimization strategy, this framework effectively balances computational load and solution accuracy, making it well-suited for deployment on edge computing platforms and providing theoretical support for physical implementation. In the future, we plan to further investigate gait planning algorithms with strong disturbance rejection to enhance the stability of this hierarchical optimization framework and achieve faster walking speeds. Additionally, capitalizing on the existing X02 communication framework allows rapid deployment of the algorithm from simulation to physical hardware, and we will validate its performance on a real robot in a forthcoming work.

Author Contributions

Conceptualization, Y.L. and Q.L.; Methodology, Y.L. and Q.L.; Formal analysis and investigation, Y.L.; Writing—original draft preparation, Y.L.; Writing—review and editing, Q.L.; Software, Y.L., H.M. and H.J. Resources, H.M. Supervision, Q.L. and J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China under grant 92048205 and the Pujiang Talents Plan of Shanghai under grant 2019PJD035.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

Thank you to Shanghai Droid Robot Co., Ltd. for their assistance in robot debugging.

Conflicts of Interest

Author Haiming Mou was employed by the company Shanghai Droid Robot Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Sentis, L.; Khatib, O. Control of free-floating humanoid robots through task prioritization. In Proceedings of the 2005 IEEE International Conference on Robotics and Automation, Barcelona, Spain, 18–22 April 2005; pp. 1718–1723. [Google Scholar]
  2. Dietrich, A.; Ott, C.; Albu-Schäffer, A. An overview of null space projections for redundant, torque-controlled robots. Int. J. Robot. Res. 2015, 34, 1385–1400. [Google Scholar] [CrossRef]
  3. Sadeghian, H.; Villani, L.; Keshmiri, M.; Siciliano, B. Task-space control of robot manipulators with null-space compliance. IEEE Trans. Robot. 2013, 30, 493–506. [Google Scholar] [CrossRef]
  4. Ahn, D.; Cho, B.K. Online jumping motion generation via model predictive control. IEEE Trans. Ind. Electron. 2021, 69, 4957–4965. [Google Scholar] [CrossRef]
  5. Sentis, L.; Khatib, O. A whole-body control framework for humanoids operating in human environments. In Proceedings of the 2006 IEEE International Conference on Robotics and Automation, ICRA 2006, Orlando, FL, USA, 15–19 May 2006; pp. 2641–2648. [Google Scholar]
  6. Lee, Y.; Kim, S.; Park, J.; Tsagarakis, N.; Lee, J. A whole-body control framework based on the operational space formulation under inequality constraints via task-oriented optimization. IEEE Access 2021, 9, 39813–39826. [Google Scholar] [CrossRef]
  7. Chu, X.; Lo, C.H.; Proietti, T.; Walsh, C.J.; Au, K.W.S. Opposite Treatment on Null Space: A Unified Control Framework for a Class of Underactuated Robotic Systems With Null Space Avoidance. IEEE Trans. Control Syst. Technol. 2022, 31, 193–207. [Google Scholar] [CrossRef]
  8. Tiseo, C.; Merkt, W.; Wolfslag, W.; Vijayakumar, S.; Mistry, M. Safe and compliant control of redundant robots using superimposition of passive task-space controllers. Nonlinear Dyn. 2024, 112, 1023–1038. [Google Scholar] [CrossRef]
  9. Kim, D.; Lee, J.; Ahn, J.; Campbell, O.; Hwang, H.; Sentis, L. Computationally-robust and efficient prioritized whole-body controller with contact constraints. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 1–8. [Google Scholar]
  10. Bi, Z.M.; Luo, C.; Miao, Z.; Zhang, B.; Zhang, W.J.; Wang, L. Safety assurance mechanisms of collaborative robotic systems in manufacturing. Robot. Comput.-Integr. Manuf. 2021, 67, 102022. [Google Scholar] [CrossRef]
  11. Xie, Z.; Li, L.; Luo, X. Optimization of the ground reaction force for the humanoid robot balance control. Acta Mech. 2021, 232, 4151–4167. [Google Scholar] [CrossRef]
  12. Kim, D.; Jorgensen, S.J.; Lee, J.; Ahn, J.; Luo, J.; Sentis, L. Dynamic locomotion for passive-ankle biped robots and humanoids using whole-body locomotion control. Int. J. Robot. Res. 2020, 39, 936–956. [Google Scholar] [CrossRef]
  13. Lee, J.; Bang, S.H.; Bakolas, E.; Sentis, L. Mpc-based hierarchical task space control of underactuated and constrained robots for execution of multiple tasks. In Proceedings of the 2020 59th IEEE Conference on Decision and Control (CDC), Jeju Island, Republic of Korea, 14–18 December 2020; pp. 5942–5949. [Google Scholar]
  14. Katayama, S.; Murooka, M.; Tazaki, Y. Model predictive control of legged and humanoid robots: Models and algorithms. Adv. Robot. 2023, 37, 298–315. [Google Scholar] [CrossRef]
  15. Chignoli, M.; Kim, D.; Stanger-Jones, E.; Kim, S. The MIT humanoid robot: Design, motion planning, and control for acrobatic behaviors. In Proceedings of the 2020 IEEE-RAS 20th International Conference on Humanoid Robots (Humanoids), Munich, Germany, 19–21 July 2021; pp. 1–8. [Google Scholar]
  16. Scianca, N.; De Simone, D.; Lanari, L.; Oriolo, G. MPC for humanoid gait generation: Stability and feasibility. IEEE Trans. Robot. 2020, 36, 1171–1188. [Google Scholar] [CrossRef]
  17. Kouzoupis, D.; Frison, G.; Zanelli, A.; Diehl, M. Recent advances in quadratic programming algorithms for nonlinear model predictive control. Vietnam J. Math. 2018, 46, 863–882. [Google Scholar] [CrossRef]
  18. Choe, J.; Kim, J.H.; Hong, S.; Lee, J.; Park, H.W. Seamless reaction strategy for bipedal locomotion exploiting real-time nonlinear model predictive control. IEEE Robot. Autom. Lett. 2023, 8, 5031–5038. [Google Scholar] [CrossRef]
  19. Zhu, Z. ZMP Preview Control of Lower Limb Exoskeleton Robot Based on 3D Linear Inverted Pendulum. In Proceedings of the 2022 12th International Conference on CYBER Technology in Automation, Control, and Intelligent Systems (CYBER), Baishan, China, 27–31 July 2022; pp. 554–557. [Google Scholar]
  20. Zhang, Z.; Zhang, L.; Xin, S.; Xiao, N.; Wen, X. Robust walking for humanoid robot based on divergent component of motion. Micromachines 2022, 13, 1095. [Google Scholar] [CrossRef]
  21. Di Carlo, J.; Wensing, P.M.; Katz, B.; Bledt, G.; Kim, S. Dynamic locomotion in the mit cheetah 3 through convex model-predictive control. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 1–9. [Google Scholar]
  22. Dai, H.; Tedrake, R. Planning robust walking motion on uneven terrain via convex optimization. In Proceedings of the 2016 IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids), Cancun, Mexico, 15–17 November 2016; pp. 579–586. [Google Scholar]
  23. García, G.; Griffin, R.; Pratt, J. MPC-based locomotion control of bipedal robots with line-feet contact using centroidal dynamics. In Proceedings of the 2020 IEEE-RAS 20th International Conference on Humanoid Robots (Humanoids), Munich, Germany, 19–21 July 2021; pp. 276–282. [Google Scholar]
  24. Pandala, A.G.; Ding, Y.; Park, H.W. qpSWIFT: A real-time sparse quadratic program solver for robotic applications. IEEE Robot. Autom. Lett. 2019, 4, 3355–3362. [Google Scholar] [CrossRef]
  25. Stellato, B.; Banjac, G.; Goulart, P.; Bemporad, A.; Boyd, S. OSQP: An operator splitting solver for quadratic programs. Math. Program. Comput. 2020, 12, 637–672. [Google Scholar] [CrossRef]
  26. Bándy, K.; Stumpf, P. Model Predictive Control of LC Filter Equipped Surface-Mounted PMSM Drives Using Exact Discretization. IEEE Trans. Ind. Electron. 2024. [Google Scholar] [CrossRef]
  27. Al-Ali, M.; Mancilla-David, F. Model predictive control of grid connected inverters using different discretization strategies. In Proceedings of the 2021 North American Power Symposium (NAPS), College Station, TX, USA, 14–16 November 2021; pp. 1–6. [Google Scholar]
  28. Rösmann, C.; Makarow, A.; Bertram, T. Stabilising quasi-time-optimal nonlinear model predictive control with variable discretisation. Int. J. Control 2022, 95, 2682–2694. [Google Scholar] [CrossRef]
  29. Ge, L.; Zhao, Y.; Zhong, S.; Shan, Z.; Ma, F.; Han, Z.; Guo, K. Efficient and integration stable nonlinear model predictive controller for autonomous vehicles based on the stabilized explicit integration method. Nonlinear Dyn. 2023, 111, 4325–4342. [Google Scholar] [CrossRef]
  30. Bouyarmane, K.; Caron, S.; Escande, A.; Kheddar, A. Multi-contact motion planning and control. Humanoid Robot. Ref. 2018, 1–42. [Google Scholar] [CrossRef]
  31. Sleiman, J.P.; Farshidian, F.; Minniti, M.V.; Hutter, M. A unified mpc framework for whole-body dynamic locomotion and manipulation. IEEE Robot. Autom. Lett. 2021, 6, 4688–4695. [Google Scholar] [CrossRef]
  32. Humanoid Robot (Shanghai) Co., Ltd. OpenLoong-DynamicsControl: Motion Control Framework of Humanoid Robot Based on MPC and WBC. 2024. Available online: https://atomgit.com/openloong/openloong-dyn-control.git (accessed on 1 January 2025).
Figure 1. X02 humanoid robot in MuJoCo.
Figure 1. X02 humanoid robot in MuJoCo.
Mathematics 13 00154 g001
Figure 2. Performance of convex MPC using FE and ME discretization method. (a,c,e,g): Prediction errors in the x-direction with sampling intervals of 5 ms, 10 ms, 50 ms, and 100 ms; (b,d,f,h): Prediction errors in the y-direction with sampling intervals of 5 ms, 10 ms, 50 ms, and 100 ms.
Figure 2. Performance of convex MPC using FE and ME discretization method. (a,c,e,g): Prediction errors in the x-direction with sampling intervals of 5 ms, 10 ms, 50 ms, and 100 ms; (b,d,f,h): Prediction errors in the y-direction with sampling intervals of 5 ms, 10 ms, 50 ms, and 100 ms.
Mathematics 13 00154 g002
Figure 3. Ground reaction forces optimized by MPC and WBC. (ac): Ground reaction forces predicted by MPC; (df): Ground reaction forces optimized by WBC.
Figure 3. Ground reaction forces optimized by MPC and WBC. (ac): Ground reaction forces predicted by MPC; (df): Ground reaction forces optimized by WBC.
Mathematics 13 00154 g003
Figure 4. Joint torques of both legs optimized by WBC. (a): Performance of hip yaw joint torque; (b): Performance of hip roll joint torque; (c): Performance of hip pitch joint torque; (d): Performance of knee pitch joint torque; (e): Performance of ankle pitch joint torque.
Figure 4. Joint torques of both legs optimized by WBC. (a): Performance of hip yaw joint torque; (b): Performance of hip roll joint torque; (c): Performance of hip pitch joint torque; (d): Performance of knee pitch joint torque; (e): Performance of ankle pitch joint torque.
Mathematics 13 00154 g004
Figure 5. Joint tracking performance.
Figure 5. Joint tracking performance.
Mathematics 13 00154 g005
Table 1. Parameters of simulation environment and robot.
Table 1. Parameters of simulation environment and robot.
Parameters of RobotValuesParameters of MuJoCoValues
m24.19 kgdamping0.1
g z −9.8 m/s2frictionloss0.02
μ 0.6armature0.01
Table 2. Parameters of convex MPC.
Table 2. Parameters of convex MPC.
Parameters of MPCValues
N (Finite prediction horizon)10
d t {5, 10, 50, 100} ms
State weight 1.0 1.0 1.0 1.0 1.0 10 9 10 7 10 7 10 7 100.0 10.0 10.0
Force weight 10.0 1.0 1.0 1.0 1.0 1.0 10.0 1.0 1.0 1.0 1.0 1.0
f z , m a x 500 N
f z , m i n 10 N
Table 3. Performance of convex MPC using FE and ME discretization method.
Table 3. Performance of convex MPC using FE and ME discretization method.
Sampling StepMSEAverage Solving Time (μs)
FE ME FE ME
50.1340.045688.5844.8
100.1150.029716.8982.1
500.1280.037767.81004.5
1000.1780.088777.9968.8
Table 4. Parameters of WBPC.
Table 4. Parameters of WBPC.
Parameters of WBCValues
f z , m a x 500 N
f z , m i n 10 N
X1.5 cm
Y12.5 cm
τ m a x {50, 110, 72, 150, 45} Nm
F z , τ z = 80 Nm
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Liu, Y.; Mou, H.; Jiang, H.; Li, Q.; Zhang, J. An Improved Hierarchical Optimization Framework for Walking Control of Underactuated Humanoid Robots Using Model Predictive Control and Whole Body Planner and Controller. Mathematics 2025, 13, 154. https://doi.org/10.3390/math13010154

AMA Style

Liu Y, Mou H, Jiang H, Li Q, Zhang J. An Improved Hierarchical Optimization Framework for Walking Control of Underactuated Humanoid Robots Using Model Predictive Control and Whole Body Planner and Controller. Mathematics. 2025; 13(1):154. https://doi.org/10.3390/math13010154

Chicago/Turabian Style

Liu, Yuanji, Haiming Mou, Hao Jiang, Qingdu Li, and Jianwei Zhang. 2025. "An Improved Hierarchical Optimization Framework for Walking Control of Underactuated Humanoid Robots Using Model Predictive Control and Whole Body Planner and Controller" Mathematics 13, no. 1: 154. https://doi.org/10.3390/math13010154

APA Style

Liu, Y., Mou, H., Jiang, H., Li, Q., & Zhang, J. (2025). An Improved Hierarchical Optimization Framework for Walking Control of Underactuated Humanoid Robots Using Model Predictive Control and Whole Body Planner and Controller. Mathematics, 13(1), 154. https://doi.org/10.3390/math13010154

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop