Hierarchical relative entropy policy search

Daniel, C., Neumann, G., Kroemer, O. and Peters, J. (2016) Hierarchical relative entropy policy search. Journal of Machine Learning Research, 17. pp. 1-50. ISSN 1532-4435

Full content URL: http://jmlr.org/papers/volume17/15-188/15-188.pdf


Item Type: Article
Item Status: Live Archive


Many reinforcement learning (RL) tasks, especially in robotics, consist of multiple sub-tasks that
are strongly structured. Such task structures can be exploited by incorporating hierarchical policies
that consist of gating networks and sub-policies. However, this concept has only been partially explored
for real-world settings, and complete methods, derived from first principles, are needed. Real-world
settings are challenging due to large and continuous state-action spaces that are prohibitive
for exhaustive sampling methods. We define the problem of learning sub-policies in continuous
state-action spaces as finding a hierarchical policy that is composed of a high-level gating policy to
select the low-level sub-policies for execution by the agent. In order to efficiently share experience
with all sub-policies, also called inter-policy learning, we treat these sub-policies as latent variables,
which allows for distribution of the update information between the sub-policies. We present three
different variants of our algorithm, designed to be suitable for a wide variety of real-world robot
learning tasks, and evaluate our algorithms in two real robot learning scenarios as well as several
simulations and comparisons.
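
The structure described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a one-dimensional state and action, a softmax gating policy that is linear in the state, and linear-Gaussian sub-policies; the class name HierarchicalPolicy, its methods, and all parameter values are hypothetical. The sketch shows how an action is drawn (gating first, then the selected sub-policy) and how, when the sub-policy index is treated as a latent variable, each sample can be softly assigned to every sub-policy through its responsibility p(o | s, a). The relative-entropy-bounded policy update of the paper itself is not reproduced here.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


class HierarchicalPolicy:
    """Illustrative gating policy plus sub-policies (assumed 1-D state/action)."""

    def __init__(self, gating_params, sub_policy_params):
        # gating_params: one (weight, bias) pair per sub-policy (assumption).
        # sub_policy_params: one (gain, offset, std) triple per sub-policy (assumption).
        self.gating_params = gating_params
        self.sub_policy_params = sub_policy_params

    def gating_probs(self, state):
        """High-level gating policy: probability of each sub-policy given the state."""
        return softmax([w * state + b for w, b in self.gating_params])

    def sample(self, state, rng=random):
        """Pick a sub-policy with the gating policy, then draw an action from it."""
        probs = self.gating_probs(state)
        option = rng.choices(range(len(probs)), weights=probs)[0]
        gain, offset, std = self.sub_policy_params[option]
        action = rng.gauss(gain * state + offset, std)
        return option, action

    def responsibilities(self, state, action):
        """Posterior p(o | s, a) over the latent sub-policy index via Bayes' rule."""
        probs = self.gating_probs(state)
        joint = [
            p * gaussian_pdf(action, gain * state + offset, std)
            for p, (gain, offset, std) in zip(probs, self.sub_policy_params)
        ]
        total = sum(joint)
        if total == 0.0:
            # All likelihoods underflowed; fall back to the gating probabilities.
            return probs
        return [j / total for j in joint]
```

With such responsibilities, a sample generated by one sub-policy can also contribute, with a smaller weight, to the update of the others, which is the intuition behind sharing experience across sub-policies.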

Keywords: Policy Search, Reinforcement Learning, Hierarchical Learning, JCOpen
Subjects: G Mathematical and Computer Sciences > G760 Machine Learning
Divisions: College of Science > School of Computer Science
ID Code: 25743
Deposited On: 17 Jan 2017 16:14
