Combined Reinforcement Learning and CPG Algorithm to Generate Terrain-Adaptive Gait of Hexapod Robots
Abstract
1. Introduction
2. Rhythmic Gait Generation by the CPG Network
2.1. Hexapod Robot
2.2. CPG Network
3. Terrain-Adaptive Gait Generation
3.1. Problem Description
3.2. Reward Function
- (1) In our work, the robot has no sensors to perceive the terrain ahead, so it cannot plan in advance. However, the robot can level its tilted body by compensating the joint positions, and this behavior deserves a reward. Thus, the expression for this reward is given by
- (2) To prevent the robot from exploiting the first rule to collect a large amount of reward while behaving in an undesired way, we design a negative reward as follows:
- (3) A reward for robot motion is necessary, because we expect the robot to keep moving forward on rough terrain. The expression is
- (4) In this paper, we aim to perturb the gait trajectory as little as possible. The energy consumption penalty penalizes inefficient gait motion and excessive energy use. The expression we designed is
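The four reward terms listed above can be combined into a single per-step reward. The sketch below is only an illustration of that structure: the equations themselves are not reproduced in this text, so the functional forms, the weights (`w_level`, `w_tilt`, `w_forward`, `w_energy`), and all function and parameter names are assumptions, not the authors' definitions.

```python
def step_reward(roll, pitch, prev_roll, prev_pitch,
                forward_velocity, joint_offsets,
                w_level=1.0, w_tilt=1.0, w_forward=1.0, w_energy=0.1):
    """Hypothetical composite reward; every term is an assumed form."""
    tilt = abs(roll) + abs(pitch)
    prev_tilt = abs(prev_roll) + abs(prev_pitch)

    # (1) Reward for leveling the body: positive only when tilt decreased.
    r_level = w_level * max(prev_tilt - tilt, 0.0)
    # (2) Negative reward for being tilted, so that tilting on purpose and
    #     then leveling again does not yield a net gain.
    r_tilt = -w_tilt * tilt
    # (3) Reward for moving forward.
    r_forward = w_forward * forward_velocity
    # (4) Penalty on the size of the perturbation added to the gait
    #     trajectory, used here as a stand-in for energy consumption.
    r_energy = -w_energy * sum(q * q for q in joint_offsets)

    return r_level + r_tilt + r_forward + r_energy


# Example: a level body walking forward with no joint perturbation.
print(step_reward(0.0, 0.0, 0.0, 0.0, 0.1, [0.0] * 18))
```

With equal weights on terms (1) and (2), a tilt-then-level sequence earns nothing in total, which is the kind of exploit the negative reward is meant to rule out.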
3.3. Termination Condition
- (1) The robot is involved in a self-collision.
- (2) The pitch, roll, or yaw of the base exceeds the allowed range.
- (3) The base height is less than the set threshold.
- (4) The distance the robot moves exceeds the set threshold.
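The four termination conditions above map naturally onto a single check run at every simulation step. The following is a minimal sketch under stated assumptions: the `RobotState` container, the function name, and the threshold values (`max_angle`, `min_height`, `max_distance`) are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class RobotState:
    """Hypothetical snapshot of the quantities the conditions refer to."""
    self_collision: bool
    roll: float                # rad
    pitch: float               # rad
    yaw: float                 # rad
    base_height: float         # m
    distance_travelled: float  # m


def should_terminate(state, max_angle=0.5, min_height=0.1, max_distance=5.0):
    """Return (done, reason); thresholds are placeholder values."""
    if state.self_collision:
        return True, "self-collision"
    if max(abs(state.roll), abs(state.pitch), abs(state.yaw)) > max_angle:
        return True, "orientation out of range"
    if state.base_height < min_height:
        return True, "base too low"
    if state.distance_travelled > max_distance:
        return True, "distance threshold reached"
    return False, ""


# Example: a robot that has tipped over too far.
print(should_terminate(RobotState(False, 0.9, 0.0, 0.0, 0.2, 1.0)))
```

Returning the reason alongside the flag makes it easy to count how often each condition ends an episode, for instance to separate failures from episodes that end because the distance threshold was reached.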
3.4. Policy Training
4. Experiments and Results
4.1. Motion Verification
4.2. Motion Performance Comparison
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Gait | Tripod | Tetrapod | Wave |
---|---|---|---|
Duty factor | 1/2 | 2/3 | 5/6 |
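Reading the fractions in the table above as duty factors (the share of one gait cycle that a leg spends in stance) is an assumption, but it matches the usual values for these three gaits. Under that reading, the sketch below works out the stance and swing durations and the average number of supporting legs for a six-legged robot; the function name and the `period` parameter are hypothetical.

```python
from fractions import Fraction

# Values from the table, interpreted as duty factors (assumption).
DUTY_FACTOR = {
    "tripod": Fraction(1, 2),
    "tetrapod": Fraction(2, 3),
    "wave": Fraction(5, 6),
}


def gait_timing(gait, period=1.0, num_legs=6):
    """Return (stance time, swing time, mean number of legs in stance)."""
    beta = DUTY_FACTOR[gait]
    stance = float(beta) * period
    swing = period - stance
    mean_support_legs = beta * num_legs
    return stance, swing, mean_support_legs


for name in DUTY_FACTOR:
    print(name, gait_timing(name))
```

On average, three, four, and five legs support the body in the tripod, tetrapod, and wave gaits respectively, which is where the trade-off between speed and static stability comes from.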
Method |  |  |  |  | Died |
---|---|---|---|---|---|
RL-CPG | 2.63 | 0.41 | 0.020 | 0.019 | 15 |
CPG | 2.15 | 0.46 | 0.021 | 0.020 | 32 |
RL | 0.96 | 0.73 | 0.034 | 0.035 | 18 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Li, D.; Wei, W.; Qiu, Z. Combined Reinforcement Learning and CPG Algorithm to Generate Terrain-Adaptive Gait of Hexapod Robots. Actuators 2023, 12, 157. https://doi.org/10.3390/act12040157