Reinforcement Learning with Dynamic Movement Primitives for Obstacle Avoidance
Abstract
1. Introduction
1.1. Related Work
1.2. Contribution
- The PI2 method is employed to optimize both the planned trajectory and the obstacle-avoidance potential in a DMP;
- A well-designed reward function combining instantaneous rewards and terminal rewards is proposed so that the algorithm achieves better performance;
- Simulations and experiments on a real 7-DOF redundant manipulator are designed to validate the performance of our approach. In addition, a simulation with a specified via-point demonstrates the flexibility of the trajectory learning.
2. Obstacle Avoidance for Dynamic Movement Primitives
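Since the body of this section is not reproduced here, a minimal sketch of the standard DMP obstacle-avoidance formulation may help fix ideas: a steering coupling term in the style of Hoffmann et al. (2009) is added to the DMP's goal-attractor dynamics. The gains `gamma`, `beta`, and `alpha` below are illustrative choices for a 2-D toy case, not the paper's values:

```python
import numpy as np

def obstacle_coupling(y, dy, obs, gamma=1000.0, beta=20.0 / np.pi):
    """Steering term p(y, dy) = gamma * R @ dy * theta * exp(-beta*|theta|),
    where theta is the signed angle between the velocity dy and the
    obstacle direction (obs - y), and R is a 90-degree rotation (2-D case)."""
    if np.linalg.norm(dy) < 1e-10:
        return np.zeros(2)  # no steering when the end-effector is at rest
    d = obs - y
    theta = np.arctan2(d[1], d[0]) - np.arctan2(dy[1], dy[0])
    theta = (theta + np.pi) % (2 * np.pi) - np.pi  # wrap to (-pi, pi]
    R = np.array([[0.0, -1.0], [1.0, 0.0]])        # rotate velocity by 90 deg
    return gamma * (R @ dy) * theta * np.exp(-beta * abs(theta))

def rollout(start, goal, obs, T=2.0, dt=0.002, tau=1.0, alpha=25.0):
    """Integrate a critically damped DMP attractor (no learned forcing term)
    with the obstacle coupling added to the acceleration."""
    y, dy = np.array(start, float), np.zeros(2)
    beta_z = alpha / 4.0  # critical damping
    path = [y.copy()]
    for _ in range(int(T / dt)):
        ddy = alpha * (beta_z * (goal - y) - dy) + obstacle_coupling(y, dy, obs)
        dy += ddy * dt / tau
        y += dy * dt / tau
        path.append(y.copy())
    return np.array(path)

obs = np.array([0.5, 0.45])
path = rollout(start=[0.0, 0.0], goal=[1.0, 1.0], obs=obs)
print("final point:", path[-1])
```

Because the coupling is perpendicular to the velocity, it bends the trajectory around the obstacle without injecting energy, and it vanishes as the angle to the obstacle grows, so the goal attractor still dominates near convergence.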
3. Reinforcement Learning for Obstacle Avoidance
3.1. Reinforcement Learning: The PI2 Algorithm
3.2. Reinforcement Learning of Potential and Shape
Algorithm 1 PI2 algorithm for learning potential strength and DMP shape parameters.
Input: initial potential strength; initial DMP shape parameters; constant; terminal cost; immediate cost term; variances of the exploration noise; Gaussian basis functions from the system dynamics; number of roll-outs per update K.
Output: final potential strength; final DMP shape parameters.
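The update rule in Algorithm 1 can be sketched in black-box episodic form: perturb the stacked parameter vector (potential strength plus DMP shape weights), score each of the K roll-outs with the cost, convert costs to soft-max probabilities, and return the probability-weighted average of the noisy parameters. The toy quadratic cost, the noise scale `sigma`, and the temperature `lam` below are illustrative assumptions; the paper's actual cost combines instantaneous and terminal rewards evaluated on the manipulator trajectory:

```python
import numpy as np

rng = np.random.default_rng(0)

def pi2_update(theta, cost_fn, sigma, K=20, lam=1.0):
    """One PI2 parameter update: sample K noisy roll-outs theta + eps_k,
    score each with cost_fn, weight by P_k proportional to exp(-S_k/lam),
    and move theta by the probability-weighted noise."""
    eps = rng.normal(0.0, sigma, size=(K, theta.size))
    S = np.array([cost_fn(theta + e) for e in eps])
    S = (S - S.min()) / max(S.max() - S.min(), 1e-12)  # normalize for stability
    P = np.exp(-S / lam)
    P /= P.sum()
    return theta + P @ eps

# toy stand-in cost: quadratic distance of the stacked parameter vector
# [potential strength, DMP shape weights] from a hypothetical optimum
target = np.array([2.0, 0.5, -0.3, 1.0])
cost = lambda th: float(np.sum((th - target) ** 2))

theta = np.zeros(4)
for _ in range(200):
    theta = pi2_update(theta, cost, sigma=0.2)
print(theta)  # approaches `target` as the updates accumulate
```

Note that PI2 needs no gradient of the cost: exploration noise alone drives the update, which is what makes it suitable for optimizing both the continuous DMP shape weights and the obstacle-potential strength jointly.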
4. Simulations and Experiments
4.1. Simulations
4.2. Experiments
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Li, H.; Savkin, A.V. An algorithm for safe navigation of mobile robots by a sensor network in dynamic cluttered industrial environments. Robot. Comput.-Integr. Manuf. 2018, 54, 65–82. [Google Scholar] [CrossRef]
- Lu, Z.; Liu, Z.; Correa, G.J.; Karydis, K. Motion Planning for Collision-resilient Mobile Robots in Obstacle-cluttered Unknown Environments with Risk Reward Trade-offs. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA, 25 October–24 December 2020; pp. 7064–7070. [Google Scholar]
- Wang, W.; Zhu, M.; Wang, X.; He, S.; He, J.; Xu, Z. An improved artificial potential field method of trajectory planning and obstacle avoidance for redundant manipulators. Int. J. Adv. Robot. Syst. 2018, 15, 1–13. [Google Scholar] [CrossRef] [Green Version]
- Ijspeert, A.J.; Nakanishi, J.; Schaal, S. Learning Attractor Landscapes for Learning Motor Primitives. Adv. Neural Inf. Process. Syst. 2002, 15, 1523–1530. [Google Scholar]
- Hoffmann, H.; Pastor, P.; Park, D.H.; Schaal, S. Biologically-inspired dynamical systems for movement generation: Automatic real-time goal adaptation and obstacle avoidance. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Kobe, Japan, 12–17 May 2009; pp. 2587–2592. [Google Scholar]
- Ginesi, M.; Meli, D.; Roberti, A.; Sansonetto, N.; Fiorini, P. Dynamic movement primitives: Volumetric obstacle avoidance using dynamic potential functions. J. Intell. Robot. Syst. 2021, 101, 1–20. [Google Scholar] [CrossRef]
- Ginesi, M.; Meli, D.; Calanca, A.; Dall’Alba, D.; Sansonetto, N.; Fiorini, P. Dynamic Movement Primitives: Volumetric Obstacle Avoidance. In Proceedings of the 19th International Conference on Advanced Robotics (ICAR), Belo Horizonte, Brazil, 2–6 December 2019; pp. 234–239. [Google Scholar]
- Park, D.H.; Hoffmann, H.; Pastor, P.; Schaal, S. Movement reproduction and obstacle avoidance with dynamic movement primitives and potential fields. In Proceedings of the 8th IEEE-RAS International Conference on Humanoid Robots, Daejeon, Korea, 1–3 December 2008. [Google Scholar]
- Volpe, R.; Khosla, P. Manipulator control with superquadric artificial potential functions: Theory and experiments. IEEE Trans. Syst. Man. Cybern. 1990, 20, 1423–1436. [Google Scholar] [CrossRef] [Green Version]
- Huber, L.; Billard, A.; Slotine, J.J.E. Avoidance of convex and concave obstacles with convergence ensured through contraction. IEEE Robot. Autom. Lett. 2019, 4, 1462–1469. [Google Scholar] [CrossRef] [Green Version]
- Saveriano, M.; Lee, D. Distance based dynamical system modulation for reactive avoidance of moving obstacles. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China, 31 May–7 June 2014; pp. 5618–5623. [Google Scholar]
- Rai, A.; Meier, F.; Ijspeert, A.; Schaal, S. Learning coupling terms for obstacle avoidance. In Proceedings of the IEEE-RAS International Conference on Humanoid Robots, Madrid, Spain, 18–20 November 2014; pp. 512–518. [Google Scholar]
- Rai, A.; Sutanto, G.; Schaal, S.; Meier, F. Learning Feedback Terms for Reactive Planning and Control. In Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, 29 May–3 June 2017. [Google Scholar]
- Pairet, È.; Ardón, P.; Mistry, M.; Petillot, Y. Learning generalizable coupling terms for obstacle avoidance via low-dimensional geometric descriptors. IEEE Robot. Autom. Lett. 2019, 4, 3979–3986. [Google Scholar] [CrossRef] [Green Version]
- Ossenkopf, M.; Ennen, P.; Vossen, R.; Jeschke, S. Reinforcement learning for manipulators without direct obstacle perception in physically constrained environments. Procedia Manuf. 2017, 11, 329–337. [Google Scholar] [CrossRef]
- Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction, 2nd ed.; MIT Press: Cambridge, MA, USA; London, UK, 2018. [Google Scholar]
- Buchli, J.; Stulp, F.; Theodorou, E.; Schaal, S. Learning variable impedance control. Int. J. Robot. Res. 2011, 30, 820–833. [Google Scholar] [CrossRef]
- Theodorou, E.; Buchli, J.; Schaal, S. A generalized path integral control approach to reinforcement learning. J. Mach. Learn. Res. 2010, 11, 3137–3181. [Google Scholar]
- Stulp, F.; Theodorou, E.A.; Schaal, S. Reinforcement learning with sequences of motion primitives for robust manipulation. IEEE Trans. Robot. 2012, 28, 1360–1370. [Google Scholar] [CrossRef]
- Stulp, F.; Schaal, S. Hierarchical reinforcement learning with movement primitives. In Proceedings of the IEEE-RAS International Conference on Humanoid Robots, Bled, Slovenia, 26–28 October 2011; pp. 231–238. [Google Scholar]
- Pastor, P.; Hoffmann, H.; Asfour, T.; Schaal, S. Learning and generalization of motor skills by learning from demonstration. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Kobe, Japan, 12–17 May 2009; pp. 763–768. [Google Scholar]
- Theodorou, E.; Buchli, J.; Schaal, S. Reinforcement learning of motor skills in high dimensions: A path integral approach. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Anchorage, AK, USA, 3–8 May 2010; pp. 2397–2403. [Google Scholar]
- Oksendal, B. Stochastic Differential Equations: An Introduction with Applications, 6th ed.; Springer Science and Business Media: Berlin, Germany, 2013. [Google Scholar]
|  | 0.0051 (PI2) | 0.01 |
|---|---|---|
| SD (cm) | 73.55 | 82.08 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Li, A.; Liu, Z.; Wang, W.; Zhu, M.; Li, Y.; Huo, Q.; Dai, M. Reinforcement Learning with Dynamic Movement Primitives for Obstacle Avoidance. Appl. Sci. 2021, 11, 11184. https://doi.org/10.3390/app112311184