Model-free trajectory optimization for reinforcement learning

Akrour, R., Abdolmaleki, A., Abdulsamad, H. and Neumann, G. (2016) Model-free trajectory optimization for reinforcement learning. In: Proceedings of the International Conference on Machine Learning (ICML), 19-24 June 2016, New York.

Documents
full_moto_16.pdf - Whole Document (PDF, 1MB)
Item Type: Conference or Workshop contribution (Paper)
Item Status: Live Archive

Abstract

Many of the recent Trajectory Optimization algorithms alternate between local approximation of the dynamics and a conservative policy update. However, linearly approximating the dynamics in order to derive the new policy can bias the update and prevent convergence to the optimal policy. In this article, we propose a new model-free algorithm that backpropagates a local, quadratic, time-dependent Q-Function, allowing the derivation of the policy update in closed form. Our policy update ensures exact KL-constraint satisfaction without simplifying assumptions on the system dynamics, demonstrating improved performance in comparison to related Trajectory Optimization algorithms that linearize the dynamics.
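
To convey the closed-form character of such an update, the sketch below performs a KL-constrained update of a single Gaussian policy under a local quadratic Q model. This is a minimal illustration under simplifying assumptions, not the algorithm from the paper: the state dependence of the quadratic model is folded into the linear term qa, a single time step is considered, and all function names are hypothetical.

    import numpy as np

    # Illustrative sketch (not the paper's code): closed-form update of a
    # Gaussian policy pi(a) = N(m, S) under a local quadratic model of the
    # Q-Function, Q(a) = 0.5 * a' Qaa a + qa' a + const. For readability the
    # state dependence of the model is folded into qa, and Qaa is assumed
    # negative definite so the new precision matrix stays positive definite.

    def kl_regularized_update(m, S, Qaa, qa, eta):
        """Maximiser of E_pi[Q] - eta * KL(pi || N(m, S)); again Gaussian."""
        S_inv = np.linalg.inv(S)
        F = S_inv - Qaa / eta                    # new precision matrix
        S_new = np.linalg.inv(F)                 # new covariance
        m_new = S_new @ (S_inv @ m + qa / eta)   # new mean
        return m_new, S_new

    def gauss_kl(m0, S0, m1, S1):
        """KL( N(m0, S0) || N(m1, S1) )."""
        S1_inv = np.linalg.inv(S1)
        diff = m1 - m0
        return 0.5 * (np.trace(S1_inv @ S0) + diff @ S1_inv @ diff
                      - m0.size + np.log(np.linalg.det(S1) / np.linalg.det(S0)))

    def kl_constrained_update(m, S, Qaa, qa, epsilon, iters=60):
        """Bisect on the temperature eta until KL(pi_new || pi_old) = epsilon."""
        lo, hi = 1e-8, 1e8                       # larger eta => smaller step
        for _ in range(iters):
            eta = np.sqrt(lo * hi)               # geometric midpoint
            m_new, S_new = kl_regularized_update(m, S, Qaa, qa, eta)
            if gauss_kl(m_new, S_new, m, S) > epsilon:
                lo = eta                         # step too large: raise eta
            else:
                hi = eta                         # step too small: lower eta
        return kl_regularized_update(m, S, Qaa, qa, hi)

In the paper the KL constraint is enforced per time step and in expectation over the state distribution; the per-state bisection above is only meant to show that, once the Q-Function is quadratic, both the regularized update and the constraint check are available in closed form.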

Keywords: Trajectory Optimization, Reinforcement Learning
Subjects: G Mathematical and Computer Sciences > G760 Machine Learning
Divisions: College of Science > School of Computer Science
Related URLs:
ID Code: 25747
Deposited On: 31 Mar 2017 15:16
