Tangkaratt, V., van Hoof, H., Parisi, S., Neumann, G., Peters, J. and Sugiyama, M.
(2017)
Policy search with high-dimensional context variables.
In: AAAI Conference on Artificial Intelligence (AAAI), 4 - 9 February 2017, San Francisco, California, USA.
Full content URL: http://www.ausy.tu-darmstadt.de/uploads/Site/EditP...
PDF (Whole Document): [tangkaratt2017policy.pdf](http://eprints.lincoln.ac.uk/26740/1.hassmallThumbnailVersion/tangkaratt2017policy.pdf) (365kB)
Item Type: Conference or Workshop contribution (Paper)
Item Status: Live Archive
Abstract
Direct contextual policy search methods learn to improve policy
parameters and simultaneously generalize these parameters
to different context or task variables. However, learning
from high-dimensional context variables, such as camera images,
is still a prominent problem in many real-world tasks.
A naive application of unsupervised dimensionality reduction
methods to the context variables, such as principal component
analysis, is insufficient as task-relevant input may be ignored.
In this paper, we propose a contextual policy search method in
the model-based relative entropy stochastic search framework
with integrated dimensionality reduction. We learn a model of
the reward that is locally quadratic in both the policy parameters
and the context variables. Furthermore, we perform supervised
linear dimensionality reduction on the context variables
by nuclear norm regularization. The experimental results
show that the proposed method outperforms naive dimensionality
reduction via principal component analysis and
a state-of-the-art contextual policy search method.
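The supervised linear dimensionality reduction described above can be illustrated with a minimal sketch: nuclear norm regularization encourages a low-rank linear map from context variables to reward-relevant targets, and a low-rank map factorizes into a low-dimensional projection. The sketch below is not the paper's algorithm (which embeds this in a model-based relative entropy stochastic search framework with a locally quadratic reward model); it only shows nuclear-norm-regularized least squares solved by proximal gradient descent with singular value thresholding, with all function names assumed for illustration.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: the proximal operator of tau * ||.||_*."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def low_rank_regression(S, R, lam=0.1, lr=0.05, iters=1000):
    """Minimize (1/n)||R - S W||_F^2 + lam * ||W||_* by proximal gradient.

    S: (n, d) high-dimensional context variables.
    R: (n, k) supervised targets (e.g. reward-related quantities).
    A low-rank solution W admits a factorization W = A @ B with few
    columns in A, so the contexts are effectively projected to a
    low-dimensional subspace that is relevant to the targets.
    """
    n, d = S.shape
    W = np.zeros((d, R.shape[1]))
    for _ in range(iters):
        grad = S.T @ (S @ W - R) / n      # gradient of the smooth fit term
        W = svt(W - lr * grad, lr * lam)  # proximal step shrinks singular values
    return W
```

Unlike unsupervised PCA on the contexts alone, this objective is driven by the targets, so directions of the context space that carry no predictive signal are shrunk away rather than retained for their variance.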