A Deep Reinforcement Learning Approach to Optimal Morphologies Generation in Reconfigurable Tiling Robots
Abstract
1. Introduction
- Deep RL-based morphology generation: We investigate a deep reinforcement learning (RL) framework to generate optimal morphologies for the Smorphi robot. This framework allows the robot to modify its morphology by reorienting its blocks relative to one another through changes in the hinge angles.
- MDP modeling: We model the generation of optimal Smorphi robot morphologies as a Markov decision process (MDP), which provides a structured approach to sequential decision-making for morphology optimization (the standard MDP tuple is recalled after this list).
- RL training with PPO and A3C: We train the MDP model using proximal policy optimization (PPO) and asynchronous advantage actor–critic (A3C) in simulated environments. These reinforcement learning algorithms enable efficient policy optimization for the Smorphi robot.
- Evaluation and comparison: We evaluate the learned policy in various simulated environments and assess the effectiveness of our approach by comparing it with the non-dominated sorting genetic algorithm (NSGA-II), a metaheuristic optimization scheme. This comparative analysis helps demonstrate the advantages of our deep RL-based methodology for morphology optimization of the Smorphi robot.
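For reference, the decision problem can be written as the standard MDP tuple below; the concrete interpretation of states, actions, and reward is our reading of the contribution list above (three hinge angles, coverage versus path length), not the paper's exact notation.

```latex
% Standard MDP tuple; the interpretation of S, A, and R below is an assumption
% drawn from the contribution list, not the paper's exact formulation.
\[
\mathcal{M} = (S, A, P, R, \gamma)
\]
\begin{itemize}
  \item $S$: states, e.g., the current hinge-angle configuration $(\theta_1, \theta_2, \theta_3)$ of the four-block Smorphi robot;
  \item $A$: actions, e.g., adjustments applied to one or more hinge angles;
  \item $P(s' \mid s, a)$: state transition probability;
  \item $R(s, a)$: reward, e.g., a trade-off between the area covered $P_c$ and the path length $P_{len}$ returned by the coverage planner;
  \item $\gamma \in [0, 1)$: discount factor.
\end{itemize}
```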
2. Literature Survey
3. Deep Reinforcement Learning (DRL)
4. Problem Formulation
4.1. Mechanical Design of the Agent
- Expose/Cover—Reveal or conceal a new surface to alter functionality [54] ⇒ The Smorphi robot exposes or covers its side surfaces when changing from one configuration to another.
- Generic connection—Employ internal or external connections (structural, power) that can be used by different modules to perform different functions or the same function [54] ⇒ The Smorphi robot uses hinges at the corners of each block to connect identical modules, which enables a larger number of configurations.
- Shared Drive Transmission—Transmit power from a common source to perform different functions in different configurations [54] ⇒ The Smorphi robot uses the same drive motor for locomotion and reconfiguration.
4.2. Footprint-Based Complete Coverage Path Planner (FBCPP) of the Environment
4.2.1. Footprint Generation
4.2.2. Complete Coverage Path Planning Based on the Footprint
Algorithm 1: Footprint-based complete coverage path planner (FBCPP)
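Algorithm 1 itself is not reproduced here; the following is a minimal sketch of a footprint-based, boustrophedon-style coverage sweep over a grid map. All function and variable names (fbcpp, grid, footprint) are ours, and the coverage and path-length bookkeeping is simplified relative to the paper's planner.

```python
import numpy as np

def fbcpp(grid, footprint):
    """Sketch of a footprint-based complete coverage sweep.

    grid      -- 2D array, 0 = free cell, 1 = obstacle
    footprint -- 2D binary mask of the current morphology footprint
    Returns (percentage of free area covered, path length in steps).
    """
    H, W = grid.shape
    fh, fw = footprint.shape
    covered = np.zeros_like(grid, dtype=bool)
    path_len = 0

    rows = range(0, H - fh + 1, fh)            # advance one footprint height per lane
    for lane, r in enumerate(rows):
        cols = range(0, W - fw + 1)
        if lane % 2:                           # boustrophedon: alternate sweep direction
            cols = reversed(list(cols))
        for c in cols:
            window = grid[r:r + fh, c:c + fw]
            if np.any(window[footprint == 1]):  # skip placements overlapping obstacles
                continue
            covered[r:r + fh, c:c + fw] |= footprint.astype(bool)
            path_len += 1

    free = (grid == 0)
    pc = 100.0 * np.count_nonzero(covered & free) / max(np.count_nonzero(free), 1)
    return pc, path_len
```

With a 0/1 occupancy grid and a binary footprint mask for a given morphology, this sketch returns a coverage percentage (Pc) and a step-count path length (Plen) that an RL reward can trade off.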
5. Markov Decision Process
5.1. Proximal Policy Optimization (PPO)
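For reference, the clipped surrogate objective of PPO introduced by Schulman et al. [125], using the policy ratio, advantage estimate, clipping parameter, and expectation operator listed in the symbol table:

```latex
\[
L^{\mathrm{CLIP}}(\theta) =
  \hat{\mathbb{E}}_t\!\left[
    \min\!\left( r_t(\theta)\,\hat{A}_t,\;
                 \mathrm{clip}\!\left(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\right)\hat{A}_t \right)
  \right],
\qquad
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}
\]
```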
5.2. Asynchronous Actor–Critic Agents (A3C)
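For reference, the A3C policy update of Mnih et al. [126]: each asynchronous worker follows the gradient of the log-policy weighted by an n-step advantage estimate, plus an entropy bonus that encourages exploration:

```latex
\[
\nabla_{\theta'} \log \pi(a_t \mid s_t; \theta')\, A(s_t, a_t; \theta, \theta_v)
  \;+\; \beta\, \nabla_{\theta'} H\!\big(\pi(\cdot \mid s_t; \theta')\big),
\qquad
A(s_t, a_t; \theta, \theta_v) = \sum_{i=0}^{k-1} \gamma^{i} r_{t+i}
  \;+\; \gamma^{k} V(s_{t+k}; \theta_v) \;-\; V(s_t; \theta_v)
\]
```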
6. Training and Simulation Results
- Initialization: Initializes the environment and the action and observation spaces.
- Step Function: Applies the action chosen by the agent, calculates the reward, and returns the resulting observation of the agent’s behavior.
- Reset: Resets the robot morphology, reward criteria, and iteration count at the end of each episode.
- Render: Provides a visualization of the environment (a minimal environment sketch is given after this list).
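A minimal sketch of such an environment against the OpenAI Gym interface [127]. The class name SmorphiMorphologyEnv, the angle-increment action encoding, and the reward weighting are illustrative assumptions, not the exact implementation used in the paper.

```python
import numpy as np
import gym
from gym import spaces

class SmorphiMorphologyEnv(gym.Env):
    """Illustrative Gym environment: the agent adjusts the three hinge angles
    and is rewarded for high area coverage at low path length (the reward
    weighting below is an assumption, not the paper's exact formulation)."""

    def __init__(self, grid_map, max_steps=50):
        super().__init__()
        self.grid_map = grid_map
        self.max_steps = max_steps
        # Observation: the three hinge angles in degrees.
        self.observation_space = spaces.Box(low=0.0, high=180.0, shape=(3,), dtype=np.float32)
        # Action: bounded increments applied to each hinge angle.
        self.action_space = spaces.Box(low=-10.0, high=10.0, shape=(3,), dtype=np.float32)
        self.reset()

    def step(self, action):
        self.angles = np.clip(self.angles + action, 0.0, 180.0).astype(np.float32)
        pc, plen = self._evaluate(self.angles)       # coverage and path length for this morphology
        reward = pc - 0.1 * plen                     # assumed trade-off weighting
        self.steps += 1
        done = self.steps >= self.max_steps
        return self.angles.copy(), float(reward), done, {"coverage": pc, "path_length": plen}

    def reset(self):
        self.angles = np.zeros(3, dtype=np.float32)  # reset morphology to the zero-angle configuration
        self.steps = 0
        return self.angles.copy()

    def render(self, mode="human"):
        print(f"hinge angles: {self.angles}, step: {self.steps}")

    def _evaluate(self, angles):
        # Placeholder: in the full pipeline this would rasterize the footprint for
        # the given hinge angles and run the coverage planner (e.g., the FBCPP sketch).
        return 0.0, 0
```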
6.1. Analysis of Simulation Results
6.2. Evaluation of Optimal Morphologies
6.3. Optimal vs. Sub-Optimal Morphologies
6.4. Policy Validation
6.5. Comparison with the Optimization Approach
6.6. Limitations
- The proposed framework is only applied and validated on the reconfigurable class of tiling robots. The results may differ for other classes of robots.
- The current model is trained in a static environment. It does not consider dynamic or unpredictable environments. Therefore, the effectiveness of our proposed approach in such scenarios remains an open question.
- The proposed framework is computationally expensive, as it requires training a reinforcement learning agent. This may limit its use in real-world applications with tight time constraints.
- Our simulation-based study does not explicitly consider real-world physical constraints and hardware limitations.
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
List of Symbols
| Symbol | Description |
|---|---|
| | Hinge angle |
| Pc | Percentage of area covered |
| Plen | Path length |
| | Width of the morphology footprint |
| | Length of the morphology footprint |
| M | Simulation environment |
| | Resized map |
| | Footprint matrix |
| | Bounding box |
| S | Set of all possible states of the system |
| A | Set of all possible actions that the agent can take |
| R | Reward function |
| P | State transition probability |
| γ | Discount factor |
| π | Policy function |
| r_t(θ) | Policy ratio |
| Â_t | Advantage function estimate |
| ε | Hyperparameter that controls the amount of clipping applied to the policy ratio |
| L^CLIP(θ) | Clipped surrogate objective function |
| E | Expected value operator |
References
- Tan, N.; Hayat, A.A.; Elara, M.R.; Wood, K.L. A framework for taxonomy and evaluation of self-reconfigurable robotic systems. IEEE Access 2020, 8, 13969–13986. [Google Scholar] [CrossRef]
- Ilyas, M.; Yuyao, S.; Mohan, R.E.; Devarassu, M.; Kalimuthu, M. Design of sTetro: A modular, reconfigurable, and autonomous staircase cleaning robot. J. Sens. 2018, 2018, 8190802. [Google Scholar] [CrossRef]
- Veerajagadheswar, P.; Yuyao, S.; Kandasamy, P.; Elara, M.R.; Hayat, A.A. S-Sacrr: A staircase and slope accessing reconfigurable cleaning robot and its validation. IEEE Robot. Autom. Lett. 2022, 7, 4558–4565. [Google Scholar] [CrossRef]
- Tun, T.T.; Elara, M.R.; Kalimuthu, M.; Vengadesh, A. Glass facade cleaning robot with passive suction cups and self-locking trapezoidal lead screw drive. Autom. Constr. 2018, 96, 180–188. [Google Scholar] [CrossRef]
- Vega-Heredia, M.; Mohan, R.E.; Wen, T.Y.; Siti’Aisyah, J.; Vengadesh, A.; Ghanta, S.; Vinu, S. Design and modelling of a modular window cleaning robot. Autom. Constr. 2019, 103, 268–278. [Google Scholar] [CrossRef]
- Floor Cleaning Equipment Market Size, Share & Trends Analysis Report by Product (Scrubber, Vacuum Cleaner, Sweeper), by Application (Residential, Commercial), by Region, and Segment Forecasts, 2019–2025. 2023. Available online: https://www.grandviewresearch.com/industry-analysis/floor-cleaning-equipment-market# (accessed on 15 July 2023).
- Hayat, A.A.; Yi, L.; Kalimuthu, M.; Elara, M.; Wood, K.L. Reconfigurable robotic system design with application to cleaning and maintenance. J. Mech. Des. 2022, 144, 063305. [Google Scholar] [CrossRef]
- Kwon, Y.S.; Jung, E.J.; Lim, H.; Yi, B.J. Design of a reconfigurable indoor pipeline inspection robot. In Proceedings of the 2007 International Conference on Control, Automation and Systems, Seoul, Republic of Korea, 17–20 October 2007; pp. 712–716. [Google Scholar]
- Qiao, G.; Song, G.; Wang, Y.; Zhang, J.; Wang, W. Autonomous network repairing of a home security system using modular self-reconfigurable robots. IEEE Trans. Consum. Electron. 2013, 59, 562–570. [Google Scholar] [CrossRef]
- Zhang, H.; Wang, W.; Deng, Z.; Zong, G.; Zhang, J. A novel reconfigurable robot for urban search and rescue. Int. J. Adv. Robot. Syst. 2006, 3, 48. [Google Scholar] [CrossRef]
- Liang, G.; Luo, H.; Li, M.; Qian, H.; Lam, T.L. Freebot: A freeform modular self-reconfigurable robot with arbitrary connection point-design and implementation. In Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA, 24 October 2020–24 January 2021; pp. 6506–6513. [Google Scholar]
- Prabakaran, V.; Elara, M.R.; Pathmakumar, T.; Nansai, S. hTetro: A tetris inspired shape shifting floor cleaning robot. In Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, 29 May–3 June 2017; pp. 6105–6112. [Google Scholar]
- Hayat, A.A.; Karthikeyan, P.; Vega-Heredia, M.; Elara, M.R. Modeling and assessing of self-reconfigurable cleaning robot htetro based on energy consumption. Energies 2019, 12, 4112. [Google Scholar] [CrossRef]
- Samarakoon, S.B.P.; Muthugala, M.V.J.; Le, A.V.; Elara, M.R. HTetro-infi: A reconfigurable floor cleaning robot with infinite morphologies. IEEE Access 2020, 8, 69816–69828. [Google Scholar] [CrossRef]
- Muthugala, M.V.J.; Samarakoon, S.B.P.; Elara, M.R. Tradeoff between area coverage and energy usage of a self-reconfigurable floor cleaning robot based on user preference. IEEE Access 2020, 8, 76267–76275. [Google Scholar] [CrossRef]
- Le, A.V.; Prabakaran, V.; Sivanantham, V.; Mohan, R.E. Modified a-star algorithm for efficient coverage path planning in tetris inspired self-reconfigurable robot with integrated laser sensor. Sensors 2018, 18, 2585. [Google Scholar] [CrossRef] [PubMed]
- Le, A.V.; Arunmozhi, M.; Veerajagadheswar, P.; Ku, P.C.; Minh, T.H.Q.; Sivanantham, V.; Mohan, R.E. Complete path planning for a tetris-inspired self-reconfigurable robot by the genetic algorithm of the traveling salesman problem. Electronics 2018, 7, 344. [Google Scholar] [CrossRef]
- Kyaw, P.T.; Le, A.V.; Veerajagadheswar, P.; Elara, M.R.; Thu, T.T.; Nhan, N.H.K.; Van Duc, P.; Vu, M.B. Energy-efficient path planning of reconfigurable robots in complex environments. IEEE Trans. Robot. 2022, 38, 2481–2494. [Google Scholar] [CrossRef]
- Cheng, K.P.; Mohan, R.E.; Nhan, N.H.K.; Le, A.V. Graph theory-based approach to accomplish complete coverage path planning tasks for reconfigurable robots. IEEE Access 2019, 7, 94642–94657. [Google Scholar] [CrossRef]
- Kalimuthu, M.; Pathmakumar, T.; Hayat, A.A.; Elara, M.R.; Wood, K.L. A metaheuristic approach to optimal morphology in reconfigurable tiling robots. Complex Intell. Syst. 2023, 1–20. [Google Scholar] [CrossRef]
- Furno, L.; Blanke, M.; Galeazzi, R.; Christensen, D.J. Self-reconfiguration of modular underwater robots using an energy heuristic. In Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 24–28 September 2017; pp. 6277–6284. [Google Scholar]
- Norouzi, M.; Miro, J.V.; Dissanayake, G. Planning stable and efficient paths for reconfigurable robots on uneven terrain. J. Intell. Robot. Syst. 2017, 87, 291–312. [Google Scholar] [CrossRef]
- Chaplot, D.S.; Gandhi, D.; Gupta, S.; Gupta, A.; Salakhutdinov, R. Learning to explore using active neural slam. arXiv 2020, arXiv:2004.05155. [Google Scholar]
- Zhu, D.; Li, T.; Ho, D.; Wang, C.; Meng, M.Q.H. Deep reinforcement learning supervised autonomous exploration in office environments. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, QLD, Australia, 21–25 May 2018; pp. 7548–7555. [Google Scholar]
- Wang, J.; Elfwing, S.; Uchibe, E. Modular deep reinforcement learning from reward and punishment for robot navigation. Neural Netw. 2021, 135, 115–126. [Google Scholar] [CrossRef]
- Joshi, S.; Kumra, S.; Sahin, F. Robotic grasping using deep reinforcement learning. In Proceedings of the 2020 IEEE 16th International Conference on Automation Science and Engineering (CASE), Hong Kong, China, 8 October 2020; pp. 1461–1466. [Google Scholar]
- Andrychowicz, O.M.; Baker, B.; Chociej, M.; Jozefowicz, R.; McGrew, B.; Pachocki, J.; Petron, A.; Plappert, M.; Powell, G.; Ray, A.; et al. Learning dexterous in-hand manipulation. Int. J. Robot. Res. 2020, 39, 3–20. [Google Scholar] [CrossRef]
- Foerster, J.; Assael, I.A.; De Freitas, N.; Whiteson, S. Learning to communicate with deep multi-agent reinforcement learning. arXiv 2016, arXiv:1605.06676. [Google Scholar]
- Hu, J.; Niu, H.; Carrasco, J.; Lennox, B.; Arvin, F. Voronoi-based multi-robot autonomous exploration in unknown environments via deep reinforcement learning. IEEE Trans. Veh. Technol. 2020, 69, 14413–14423. [Google Scholar] [CrossRef]
- Zhu, K.; Zhang, T. Deep reinforcement learning based mobile robot navigation: A review. Tsinghua Sci. Technol. 2021, 26, 674–691. [Google Scholar] [CrossRef]
- Low, E.S.; Ong, P.; Cheah, K.C. Solving the optimal path planning of a mobile robot using improved Q-learning. Robot. Auton. Syst. 2019, 115, 143–161. [Google Scholar] [CrossRef]
- Bing, Z.; Lemke, C.; Cheng, L.; Huang, K.; Knoll, A. Energy-efficient and damage-recovery slithering gait design for a snake-like robot based on reinforcement learning and inverse reinforcement learning. Neural Netw. 2020, 129, 323–333. [Google Scholar] [CrossRef]
- Gan, L.; Grizzle, J.W.; Eustice, R.M.; Ghaffari, M. Energy-based legged robots terrain traversability modeling via deep inverse reinforcement learning. IEEE Robot. Autom. Lett. 2022, 7, 8807–8814. [Google Scholar] [CrossRef]
- Ha, S.; Kim, J.; Yamane, K. Automated deep reinforcement learning environment for hardware of a modular legged robot. In Proceedings of the 2018 15th International Conference on Ubiquitous Robots (UR), Honolulu, HI, USA, 26–30 June 2018; pp. 348–354. [Google Scholar]
- Mitriakov, A.; Papadakis, P.; Garlatti, S. Staircase traversal via reinforcement learning for active reconfiguration of assistive robots. In Proceedings of the 2020 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), Glasgow, UK, 19–24 July 2020; pp. 1–8. [Google Scholar]
- Sun, H.; Yang, L.; Gu, Y.; Pan, J.; Wan, F.; Song, C. Bridging Locomotion and Manipulation Using Reconfigurable Robotic Limbs via Reinforcement Learning. Biomimetics 2023, 8, 364. [Google Scholar] [CrossRef]
- Yehezkel, L.; Berman, S.; Zarrouk, D. Overcoming obstacles with a reconfigurable robot using reinforcement learning. IEEE Access 2020, 8, 217541–217553. [Google Scholar] [CrossRef]
- Le, A.V.; Parween, R.; Kyaw, P.T.; Mohan, R.E.; Minh, T.H.Q.; Borusu, C.S.C.S. Reinforcement learning-based energy-aware area coverage for reconfigurable hRombo tiling robot. IEEE Access 2020, 8, 209750–209761. [Google Scholar] [CrossRef]
- Lakshmanan, A.K.; Mohan, R.E.; Ramalingam, B.; Le, A.V.; Veerajagadeshwar, P.; Tiwari, K.; Ilyas, M. Complete coverage path planning using reinforcement learning for tetromino based cleaning and maintenance robot. Autom. Constr. 2020, 112, 103078. [Google Scholar] [CrossRef]
- Le, A.V.; Veerajagadheswar, P.; Thiha Kyaw, P.; Elara, M.R.; Nhan, N.H.K. Coverage path planning using reinforcement learning-based TSP for hTetran—A polyabolo-inspired self-reconfigurable tiling robot. Sensors 2021, 21, 2577. [Google Scholar] [CrossRef] [PubMed]
- Kyaw, P.T.; Paing, A.; Thu, T.T.; Mohan, R.E.; Le, A.V.; Veerajagadheswar, P. Coverage path planning for decomposition reconfigurable grid-maps using deep reinforcement learning based travelling salesman problem. IEEE Access 2020, 8, 225945–225956. [Google Scholar] [CrossRef]
- Dussauge, T.P.; Sung, W.J.; Pinon Fischer, O.J.; Mavris, D.N. A reinforcement learning approach to airfoil shape optimization. Sci. Rep. 2023, 13, 9753. [Google Scholar] [CrossRef] [PubMed]
- Yan, X.; Zhu, J.; Kuang, M.; Wang, X. Aerodynamic shape optimization using a novel optimizer based on machine learning techniques. Aerosp. Sci. Technol. 2019, 86, 826–835. [Google Scholar] [CrossRef]
- Li, S.; Snaiki, R.; Wu, T. A knowledge-enhanced deep reinforcement learning-based shape optimizer for aerodynamic mitigation of wind-sensitive structures. Comput.-Aided Civ. Infrastruct. Eng. 2021, 36, 733–746. [Google Scholar] [CrossRef]
- Viquerat, J.; Rabault, J.; Kuhnle, A.; Ghraieb, H.; Larcher, A.; Hachem, E. Direct shape optimization through deep reinforcement learning. J. Comput. Phys. 2021, 428, 110080. [Google Scholar] [CrossRef]
- Bhola, S.; Pawar, S.; Balaprakash, P.; Maulik, R. Multi-fidelity reinforcement learning framework for shape optimization. J. Comput. Phys. 2023, 482, 112018. [Google Scholar] [CrossRef]
- Rabault, J.; Ren, F.; Zhang, W.; Tang, H.; Xu, H. Deep reinforcement learning in fluid mechanics: A promising method for both active flow control and shape optimization. J. Hydrodyn. 2020, 32, 234–246. [Google Scholar] [CrossRef]
- Ghraieb, H.; Viquerat, J.; Larcher, A.; Meliga, P.; Hachem, E. Single-step deep reinforcement learning for two-and three-dimensional optimal shape design. AIP Adv. 2022, 12, 085108. [Google Scholar] [CrossRef]
- Yonekura, K.; Hattori, H. Framework for design optimization using deep reinforcement learning. Struct. Multidiscip. Optim. 2019, 60, 1709–1713. [Google Scholar] [CrossRef]
- Wiering, M.A.; Van Otterlo, M. Reinforcement learning. Adapt. Learn. Optim. 2012, 12, 729. [Google Scholar]
- Li, Y. Deep reinforcement learning: An overview. arXiv 2017, arXiv:1701.07274. [Google Scholar]
- Montavon, G.; Samek, W.; Müller, K.R. Methods for interpreting and understanding deep neural networks. Digit. Signal Process. 2018, 73, 1–15. [Google Scholar] [CrossRef]
- Kalimuthu, M.; Pathmakumar, T.; Hayat, A.A.; Veerajagadheswar, P.; Elara, M.R.; Wood, K.L. Optimal Morphologies of n-Omino-Based Reconfigurable Robot for Area Coverage Task Using Metaheuristic Optimization. Mathematics 2023, 11, 948. [Google Scholar] [CrossRef]
- Singh, V.; Skiles, S.M.; Krager, J.E.; Wood, K.L.; Jensen, D.; Sierakowski, R. Innovations in design through transformation: A fundamental study of transformation principles. J. Mech. Des. Trans. ASME 2009, 131, 081010. [Google Scholar] [CrossRef]
- Weaver, J.; Wood, K.; Crawford, R.; Jensen, D. Transformation design theory: A meta-analogical framework. J. Comput. Inf. Sci. Eng. 2010, 10, 031012. [Google Scholar] [CrossRef]
- Hayat, A.A.; Parween, R.; Elara, M.R.; Parsuraman, K.; Kandasamy, P.S. Panthera: Design of a reconfigurable pavement sweeping robot. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 7346–7352. [Google Scholar]
- Kalimuthu, M.; Hayat, A.; Elara, M.; Wood, K. Transformation design Principles as enablers for designing Reconfigurable Robots. In Proceedings of the International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, American Society of Mechanical Engineers, Virtual, 17–19 August 2021; Volume 85420, p. V006T06A008. [Google Scholar]
- Kalimuthu, M.; Hayat, A.A.; Elara, M.R.; Wood, K.L. Reconfigurable Robot Design Aided with Design Cards. In Proceedings of the International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, American Society of Mechanical Engineers, St. Louis, MO, USA, 14–17 August 2022; Volume 86267, p. V006T06A010. [Google Scholar]
- Ong, J.H.; Hayat, A.A.; Manimuthu, M.A.A.; Elara, M.R.; Wood, K. Transforming Spherical Robots. In Proceedings of the International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, Boston, MA, USA, 20–23 August 2023; pp. 1–12. [Google Scholar]
- Zelinsky, A.; Jarvis, R.A.; Byrne, J.; Yuta, S. Planning paths of complete coverage of an unstructured environment by a mobile robot. In Proceedings of the International Conference on Advanced Robotics, Atlanta, GA, USA, 2–6 May 1993; Volume 13, pp. 533–538. [Google Scholar]
- Van Otterlo, M.; Wiering, M. Reinforcement learning and markov decision processes. In Reinforcement Learning: State-of-the-Art; Springer: Berlin/Heidelberg, Germany, 2012; pp. 3–42. [Google Scholar]
- Schulman, J.; Wolski, F.; Dhariwal, P.; Radford, A.; Klimov, O. Proximal policy optimization algorithms. arXiv 2017, arXiv:1707.06347. [Google Scholar]
- Mnih, V.; Badia, A.P.; Mirza, M.; Graves, A.; Lillicrap, T.; Harley, T.; Silver, D.; Kavukcuoglu, K. Asynchronous methods for deep reinforcement learning. In Proceedings of the International Conference on Machine Learning, New York, NY, USA, 19–24 June 2016; pp. 1928–1937. [Google Scholar]
- Brockman, G.; Cheung, V.; Pettersson, L.; Schneider, J.; Schulman, J.; Tang, J.; Zaremba, W. Openai gym. arXiv 2016, arXiv:1606.01540. [Google Scholar]
- Liang, E.; Liaw, R.; Nishihara, R.; Moritz, P.; Fox, R.; Goldberg, K.; Gonzalez, J.; Jordan, M.; Stoica, I. RLlib: Abstractions for distributed reinforcement learning. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; pp. 3053–3062. [Google Scholar]
- Long, P.; Fan, T.; Liao, X.; Liu, W.; Zhang, H.; Pan, J. Towards optimally decentralized multi-robot collision avoidance via deep reinforcement learning. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, QLD, Australia, 21–25 May 2018; pp. 6252–6259. [Google Scholar]
- Deb, K.; Pratap, A.; Agarwal, S.; Meyarivan, T. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 2002, 6, 182–197. [Google Scholar] [CrossRef]
- Van Veldhuizen, D.A.; Lamont, G.B. Evolutionary computation and convergence to a pareto front. In Late Breaking Papers at the Genetic Programming 1998 Conference; Citeseer: University Park, PA, USA, 1998; pp. 221–228. [Google Scholar]
| Hyperparameter | PPO | A3C |
|---|---|---|
| Learning Rate | | |
| Batch Size | 4096 | 32 |
| Discount Factor (γ) | 0.99 | 0.99 |
| GAE Parameter (λ) | 0.95 | 0.95 |
| PPO Epochs | 15 | - |
| Max Gradient Norm | 0.5 | 0.5 |
| Entropy Coefficient | 0.01 | 0.01 |
| Value Loss Coefficient | 0.5 | 0.5 |
| Optimization Steps | 5 | 5 |
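As an illustration of how these hyperparameters could map onto a training run, a sketch using Ray RLlib's classic config-dict API [128] (Ray versions before 2.0). The environment is the hypothetical SmorphiMorphologyEnv from the earlier sketch, and the learning rate is a placeholder because its value is not legible in the table above.

```python
import numpy as np
import ray
from ray import tune
from ray.rllib.agents.ppo import PPOTrainer  # classic RLlib API (Ray < 2.0)

# Register the (hypothetical) environment class defined in the earlier sketch.
tune.register_env("smorphi_morphology",
                  lambda cfg: SmorphiMorphologyEnv(cfg["grid_map"]))

config = {
    "env": "smorphi_morphology",
    "env_config": {"grid_map": np.zeros((50, 50))},  # toy 50x50 free-space map
    "framework": "torch",
    "lr": 5e-5,                    # placeholder -- the paper's learning rate is not legible in the table
    "train_batch_size": 4096,      # Batch Size (PPO column)
    "gamma": 0.99,                 # Discount Factor
    "lambda": 0.95,                # GAE Parameter
    "num_sgd_iter": 15,            # PPO Epochs
    "grad_clip": 0.5,              # Max Gradient Norm
    "entropy_coeff": 0.01,         # Entropy Coefficient
    "vf_loss_coeff": 0.5,          # Value Loss Coefficient
}

ray.init()
trainer = PPOTrainer(config=config)
for _ in range(100):               # number of training iterations is illustrative
    print(trainer.train()["episode_reward_mean"])
```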
| Sl. No. | Algorithm | Map | Hinge Angles (deg) | Area Covered, Pc (%) | Path Length, Plen |
|---|---|---|---|---|---|
| 1 | PPO | Map-I | (95, 59, 103) | 98.5136 | 99 |
| 2 | PPO | Map-I | (97, 61, 107) | 98.446 | 99 |
| 3 | PPO | Map-I | (96, 61, 109) | 98.1036 | 99 |
| 4 | PPO | Map-I | (105, 51, 81) | 97.422 | 92 |
| 5 | A3C | Map-I | (37, 33, 137) | 97.6692 | 99 |
| 6 | A3C | Map-I | (41, 31, 129) | 97.5516 | 99 |
| 7 | A3C | Map-I | (37, 35, 125) | 97.438 | 107 |
| 8 | A3C | Map-I | (33, 31, 125) | 97.1308 | 102 |
| Sl. No. | Algorithm | Map | Hinge Angles (deg) | Area Covered, Pc (%) | Path Length, Plen |
|---|---|---|---|---|---|
| 1 | PPO | Map-II | (15, 15, 39) | 96.9414 | 134 |
| 2 | PPO | Map-II | (15, 21, 39) | 96.6926 | 134 |
| 3 | PPO | Map-II | (41, 43, 95) | 95.2774 | 96 |
| 4 | PPO | Map-II | (43, 39, 95) | 95.203 | 96 |
| 5 | A3C | Map-II | (75, 69, 83) | 90.1382 | 72 |
| 6 | A3C | Map-II | (83, 67, 97) | 89.9714 | 79 |
| 7 | A3C | Map-II | (87, 71, 111) | 88.9814 | 79 |
| 8 | A3C | Map-II | (85, 71, 103) | 88.7038 | 72 |
| Sl. No. | Hinge Angles (deg) | Area Covered, Pc (%) | Path Length, Plen |
|---|---|---|---|
| 1 | (171, 49, 29) | 79.2364 | 92 |
| 2 | (70, 170, 86) | 88.350 | 121 |
| 3 | (120, 171, 27) | 88.6871 | 128 |
| 4 | (71, 113, 125) | 86.518 | 68 |
| 5 | (2, 55, 17) | 82.1452 | 75 |
| 6 | (146, 179, 58) | 88.0688 | 135 |
| 7 | (165, 93, 2) | 83.1008 | 116 |
| 8 | (0, 180, 0) | 100 | 224 |
| Environment | Iterations to Convergence (Trained Model) | Iterations to Convergence (Baseline Model) |
|---|---|---|
| Environment-1 | | |
| Environment-2 | | |
| Environment-3 | | |
| Sl. No. | Hinge Angles (deg) | Area Covered, Pc (%) | Path Length, Plen |
|---|---|---|---|
| 1 | (97, 12, 150) | 86.9368 | 68 |
| 2 | (159, 20, 152) | 94.9972 | 92 |
| 3 | (53, 39, 167) | 92.5272 | 85 |
| 4 | (62, 96, 65) | 82.5484 | 59 |
| 5 | (73, 90, 124) | 85.1824 | 66 |
| 6 | (51, 114, 136) | 83.6408 | 65 |
| 7 | (70, 27, 136) | 90.6232 | 75 |
| 8 | (55, 17, 3) | 95.8784 | 107 |