Hierarchical relative entropy policy search

Daniel, Christian, Neumann, Gerhard and Peters, Jan (2012) Hierarchical relative entropy policy search. In: Proceedings of the 15th International Conference on Artificial Intelligence and Statistics (AISTATS) 2012, 21 - 23 April 2012, La Palma, Canary Islands.

Full content URL: http://jmlr.csail.mit.edu/proceedings/papers/v22/d...

Documents
AISTATS-2012-Daniel.pdf (PDF, 1MB)

Item Type: Conference or Workshop contribution (Paper)
Item Status: Live Archive

Abstract

Many real-world problems are inherently hierarchically structured. The use of this structure in an agent’s policy may well be the key to improved scalability and higher performance. However, such hierarchical structures cannot be exploited by current policy search algorithms. We will concentrate on a basic, but highly relevant hierarchy: the ‘mixed option’ policy. Here, a gating network first decides which of the options to execute and, subsequently, the option-policy determines the action.

In this paper, we reformulate learning a hierarchical policy as a latent variable estimation problem and subsequently extend the Relative Entropy Policy Search (REPS) to the latent variable case. We show that our Hierarchical REPS can learn versatile solutions while also showing an increased performance in terms of learning speed and quality of the found policy in comparison to the nonhierarchical approach.
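
The ‘mixed option’ structure described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation and does not include the REPS update. It assumes a softmax gating network that is linear in the state and one linear-Gaussian option-policy per option with a scalar action; all names (`gating_weights`, `option_gains`, `option_stds`) are hypothetical. The `responsibilities` function shows the posterior over the unobserved option, which is the kind of quantity a latent variable treatment typically relies on.

```python
import numpy as np

def softmax(scores):
    # Numerically stable softmax over a 1-D score vector.
    shifted = scores - np.max(scores)
    exps = np.exp(shifted)
    return exps / exps.sum()

def gating_probs(state, gating_weights):
    # pi(o | s): assumed softmax over linear scores, one row per option.
    return softmax(gating_weights @ state)

def sample_action(state, gating_weights, option_gains, option_stds, rng):
    # Step 1: the gating network picks an option.
    probs = gating_probs(state, gating_weights)
    option = rng.choice(len(probs), p=probs)
    # Step 2: the chosen option-policy (assumed linear-Gaussian) picks the action.
    mean = option_gains[option] @ state
    action = rng.normal(mean, option_stds[option])
    return option, action

def responsibilities(state, action, gating_weights, option_gains, option_stds):
    # p(o | s, a) is proportional to pi(o | s) * pi(a | s, o);
    # the option is treated as a latent variable of the observed (s, a) pair.
    probs = gating_probs(state, gating_weights)
    means = option_gains @ state
    likelihood = np.exp(-0.5 * ((action - means) / option_stds) ** 2) / (
        option_stds * np.sqrt(2.0 * np.pi)
    )
    joint = probs * likelihood
    return joint / joint.sum()

# Illustrative values only: two options over a two-dimensional state.
rng = np.random.default_rng(0)
state = np.array([1.0, 0.0])
gating_weights = np.zeros((2, 2))                    # uniform gating
option_gains = np.array([[1.0, 0.0], [-1.0, 0.0]])   # option means: +1 and -1
option_stds = np.array([0.1, 0.1])

option, action = sample_action(state, gating_weights, option_gains, option_stds, rng)
resp = responsibilities(state, 1.0, gating_weights, option_gains, option_stds)
```

With these illustrative values, an observed action near 1.0 is attributed almost entirely to the first option, since the second option's mean lies twenty standard deviations away.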

Keywords: Hierarchical Learning, Reinforcement Learning, Policy Search
Subjects: G Mathematical and Computer Sciences > G760 Machine Learning
Divisions: College of Science > School of Computer Science
ID Code: 25791
Deposited On: 24 Feb 2017 09:46