Reinforcement Learning and Action Space Shaping for a Humanoid Agent in a Highly Dynamic Environment

Jyun Ting Song, Guilherme Christmann, Jaesik Jeong, Jacky Baltes*

*Corresponding author for this work

Research output: Chapter in Book/Report › Conference contribution

Abstract

Reinforcement Learning (RL) is a powerful tool and has been increasingly used in continuous control tasks such as locomotion and balancing in robotics. In this paper, we tackle a balancing task in a highly dynamic environment, using a humanoid robot agent and a balancing board. This task requires complex continuous actuation in order for the agent to stay in a balanced state. We propose an RL algorithm structure based on the state-of-the-art Proximal Policy Optimization (PPO) with a GPU-based implementation; the agent achieves successful balancing in under 40 minutes of real time. We sought to examine the impact of action space shaping on sample efficiency and designed six distinct control modes. Our constrained parallel control modes outperform the naive baseline in both sample efficiency and variance with respect to the starting seed. The best-performing control mode, a parallel configuration that includes the lower-body and shoulder roll joints (named PLS-R), is 33% more sample efficient than all the other defined modes, indicating the impact of action space shaping on the sample efficiency of our approach. Our implementation is open-source and freely available at: https://github.com/NTNU-ERC/Robinion-Balance-Board-PPO.

Original language: English
Title of host publication: Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing 2022-Winter
Editor: Roger Lee
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 29-42
Number of pages: 14
ISBN (Print): 9783031261343
Publication status: Published - 2023
Event: 24th ACIS International Summer Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, SNPD 2022-Winter - Taichung, Taiwan
Duration: 7 Dec 2022 to 9 Dec 2022

Publication series

Name: Studies in Computational Intelligence
Volume: 1086 SCI
ISSN (Print): 1860-949X
ISSN (Electronic): 1860-9503

Conference

Conference: 24th ACIS International Summer Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, SNPD 2022-Winter
Country/Territory: Taiwan
City: Taichung
Period: 2022/12/07 to 2022/12/09

ASJC Scopus subject areas

  • Artificial Intelligence
