Learning locomotion skills via model-based proximal meta-reinforcement learning

Qing Xiao, Zhengcai Cao, Mengchu Zhou

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Scopus citations

Abstract

Model-based reinforcement learning methods offer a promising direction for a range of automated applications, such as autonomous vehicles and legged robots, because of their sample efficiency. However, in locomotion control domains their asymptotic performance is usually inferior to that of state-of-the-art model-free reinforcement learning methods. One main challenge of model-based reinforcement learning is learning a dynamics model that is accurate enough for planning. This paper mitigates this issue through meta-reinforcement learning from an ensemble of dynamics models. A policy learns from dynamics models that hold different beliefs about the real environment, which improves its adaptability and its tolerance of model inaccuracy. A proximal meta-reinforcement learning algorithm is introduced to improve computational efficiency and reduce the variance of higher-order gradient estimation. Heteroscedastic noise is added to the training dataset, leading to robust and efficient model learning. Subsequently, proximal meta-reinforcement learning maximizes the expected return by sampling 'imaginary' trajectories from the learned dynamics; this step requires no real environment data and can be deployed on many servers in parallel to speed up the whole learning process. The aim of this work is to reduce the sample complexity and computational cost of reinforcement learning in robot locomotion tasks. Simulation experiments show that the proposed algorithm achieves asymptotic performance comparable to that of state-of-the-art model-free reinforcement learning methods with significantly fewer samples, which confirms our theoretical results.
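As a rough illustration of three ingredients named in the abstract (an ensemble of dynamics models, heteroscedastic noise added to the training data, and 'imaginary' rollouts from the learned dynamics), the following minimal Python sketch may help. It is not the authors' implementation: the 1-D linear environment, the least-squares models, the noise schedule, the fixed linear policy, and every function name are assumptions chosen for brevity, and the proximal meta-gradient policy update itself is omitted.

```python
import random

def true_step(s, u):
    # Hypothetical "real" environment: a stable 1-D linear system (assumption).
    return 0.9 * s + 0.5 * u

def collect_real_data(n, rng):
    # Gather (state, action, next_state) transitions from the real system.
    data = []
    for _ in range(n):
        s = rng.uniform(-1.0, 1.0)
        u = rng.uniform(-1.0, 1.0)
        data.append((s, u, true_step(s, u)))
    return data

def add_heteroscedastic_noise(data, rng, base=0.01, scale=0.05):
    # Noise std grows with |s|, so it depends on the input (heteroscedastic).
    # The schedule base + scale * |s| is an illustrative assumption.
    return [(s, u, y + rng.gauss(0.0, base + scale * abs(s))) for s, u, y in data]

def fit_linear_model(data):
    # Least squares for y ~ a*s + b*u, solved via the 2x2 normal equations.
    sss = sum(s * s for s, u, y in data)
    ssu = sum(s * u for s, u, y in data)
    suu = sum(u * u for s, u, y in data)
    ssy = sum(s * y for s, u, y in data)
    suy = sum(u * y for s, u, y in data)
    det = sss * suu - ssu * ssu
    a = (ssy * suu - ssu * suy) / det
    b = (sss * suy - ssu * ssy) / det
    return a, b

def train_ensemble(data, k, rng):
    # Each member sees its own bootstrap resample with fresh noise,
    # so the members hold slightly different beliefs about the dynamics.
    models = []
    for _ in range(k):
        boot = [rng.choice(data) for _ in data]
        models.append(fit_linear_model(add_heteroscedastic_noise(boot, rng)))
    return models

def imagined_return(model, gain, s0=1.0, horizon=20):
    # Roll out the policy u = -gain * s inside a learned model only;
    # no real environment step is taken here.
    a, b = model
    s, total = s0, 0.0
    for _ in range(horizon):
        u = -gain * s
        s = a * s + b * u
        total += -(s * s) - 0.01 * u * u  # quadratic cost as negative reward
    return total

rng = random.Random(0)
models = train_ensemble(collect_real_data(200, rng), k=5, rng=rng)
returns = [imagined_return(m, gain=1.0) for m in models]
```

The spread of `returns` across ensemble members gives a simple view of how model disagreement affects the imagined objective; a meta-learning method in the spirit of the abstract would train the policy to adapt well across all members instead of evaluating one fixed gain.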

Original language: English (US)
Title of host publication: 2019 IEEE International Conference on Systems, Man and Cybernetics, SMC 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1545-1550
Number of pages: 6
ISBN (Electronic): 9781728145693
DOIs
State: Published - Oct 2019
Event: 2019 IEEE International Conference on Systems, Man and Cybernetics, SMC 2019 - Bari, Italy
Duration: Oct 6 2019 to Oct 9 2019

Publication series

Name: Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
Volume: 2019-October
ISSN (Print): 1062-922X

Conference

Conference: 2019 IEEE International Conference on Systems, Man and Cybernetics, SMC 2019
Country/Territory: Italy
City: Bari
Period: 10/6/19 to 10/9/19

All Science Journal Classification (ASJC) codes

  • Electrical and Electronic Engineering
  • Control and Systems Engineering
  • Human-Computer Interaction
