Sample-based information-theoretic stochastic optimal control

Lioutikov, R. and Paraschos, A. and Peters, J. and Neumann, G. (2014) Sample-based information-theoretic stochastic optimal control. In: Proceedings of 2014 IEEE International Conference on Robotics and Automation, 31 May - 7 June 2014, Hong Kong.

Item Type: Conference or Workshop contribution (Paper)
Item Status: Live Archive

Abstract

Many Stochastic Optimal Control (SOC) approaches rely on samples to obtain either an estimate of the value function or a linearisation of the underlying system model. However, these approaches typically neglect the fact that the accuracy of the policy update depends on the closeness of the resulting trajectory distribution to these samples. The greedy operator does not consider such a closeness constraint to the samples. Hence, the greedy operator can lead to oscillations or even instabilities in the policy updates. Such undesired behaviour is likely to result in inferior performance of the estimated policy. We draw inspiration from the reinforcement learning community and relax the greedy operator used in SOC with an information-theoretic bound that limits the ‘distance’ between two subsequent trajectory distributions in a policy update. The introduced bound ensures a smooth and stable policy update. Our method is also well suited for model-based reinforcement learning, where we estimate the system dynamics model from data. As this model is likely to be inaccurate, it might be dangerous to exploit the model greedily. Instead, our bound ensures that we generate new data in the vicinity of the current data, such that we can improve our estimate of the system dynamics model. We show that our approach outperforms several state-of-the-art approaches on challenging simulated robot control tasks.
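
The contrast between a greedy update and an information-theoretically bounded one can be illustrated with a small sketch. This is not the paper's algorithm, whose details the abstract does not give. It is a generic relative-entropy-bounded reweighting of samples, assuming scalar returns per sample, a uniform old sample distribution, and a KL bound `epsilon`. The function names and the bisection search over the temperature `eta` are illustrative choices. A greedy update would put all weight on the best sample; here the weights stay within a KL divergence of `epsilon` from the old samples.

```python
import numpy as np


def softmax_weights(returns, eta):
    """Exponentially reweight samples; small eta approaches the greedy choice."""
    z = (returns - returns.max()) / eta  # shift by the max for numerical stability
    w = np.exp(z)
    return w / w.sum()


def kl_to_uniform(p):
    """KL divergence between weights p and the uniform old sample distribution."""
    nz = p > 0  # terms with p_i = 0 contribute nothing
    return float(np.sum(p[nz] * np.log(p.size * p[nz])))


def kl_bounded_weights(returns, epsilon, lo=1e-6, hi=1e6, iters=100):
    """Find the smallest temperature eta in [lo, hi] whose weights satisfy KL <= epsilon.

    The KL divergence decreases monotonically as eta grows, so a bisection
    in log-space is sufficient (an assumption-level sketch, not a dual solver).
    """
    returns = np.asarray(returns, dtype=float)
    for _ in range(iters):
        mid = np.sqrt(lo * hi)  # geometric midpoint
        if kl_to_uniform(softmax_weights(returns, mid)) > epsilon:
            lo = mid  # update too greedy: raise the temperature
        else:
            hi = mid  # bound satisfied: try a greedier update
    return softmax_weights(returns, hi), hi
```

With hypothetical returns such as `[1.0, 2.0, 0.5, 3.0, 2.5]` and `epsilon = 0.1`, the best sample receives the largest weight but not all of it, so the updated distribution stays close to the data that produced the estimate.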

Keywords: Stochastic Optimal Control
Subjects: G Mathematical and Computer Sciences > G760 Machine Learning
Divisions: College of Science > School of Computer Science
ID Code: 25771
Deposited On: 05 Apr 2017 09:43