Learning of non-parametric control policies with high-dimensional state features

Van Hoof, Herke, Peters, Jan and Neumann, Gerhard (2015) Learning of non-parametric control policies with high-dimensional state features. Journal of Machine Learning Research: Workshop and Conference Proceedings, 38, pp. 995-1003. ISSN 1532-4435

Full content URL: http://www.jmlr.org/proceedings/papers/v38/vanhoof...

Documents
vanhoof15.pdf - Whole Document (PDF, 524kB)

Item Type:Article
Item Status:Live Archive

Abstract

Learning complex control policies from high-dimensional sensory input is a challenge for
reinforcement learning algorithms. Kernel methods that approximate value functions
or transition models can address this problem. Yet, many current approaches rely on
unstable greedy maximization. In this paper, we develop a policy search algorithm that
integrates robust policy updates and kernel embeddings. Our method can learn non-parametric
control policies for infinite-horizon continuous MDPs with high-dimensional
sensory representations. We show that our method outperforms related approaches, and
that our algorithm can learn an underpowered swing-up task directly from high-dimensional
image data.

Additional Information:Proceedings of the 18th International Conference on Artificial Intelligence and Statistics (AISTATS), 9-12 May 2015, San Diego, CA
Keywords:Reinforcement Learning, Non-Parametric, Information Theory, JCOpen
Subjects:G Mathematical and Computer Sciences > G760 Machine Learning
Divisions:College of Science > School of Computer Science
ID Code:25757
Deposited On:24 Feb 2017 09:58
