Robust policy updates for stochastic optimal control

Rueckert, E., Mindt, M., Peters, J. and Neumann, G. (2014) Robust policy updates for stochastic optimal control. In: 2014 14th IEEE-RAS International Conference on Humanoid Robots (Humanoids), 18-20 November 2014, Madrid, Spain.


Item Type:Conference or Workshop contribution (Paper)
Item Status:Live Archive


For controlling high-dimensional robots, most stochastic optimal control algorithms use approximations of the system dynamics and of the cost function (e.g., using linearizations and Taylor expansions). These approximations are typically only locally correct, which might cause instabilities in the greedy policy updates, lead to oscillations, or make the algorithms diverge. To overcome these drawbacks, we add a regularization term to the cost function that punishes large policy update steps in the trajectory optimization procedure. We applied this concept to the Approximate Inference Control method (AICO), where the resulting algorithm guarantees convergence for uninformative initial solutions without complex hand-tuning of learning rates. We evaluated our new algorithm on two simulated robotic platforms. A robot arm with five joints was used for reaching multiple targets while keeping the roll angle constant. On the humanoid robot Nao, we show how complex skills like reaching and balancing can be inferred from desired center of gravity or end effector coordinates.
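The regularization idea described in the abstract can be illustrated with a minimal sketch (this is not the paper's AICO implementation; the toy cost, the damping weight `alpha`, and all function names are illustrative assumptions). Penalizing large update steps with a quadratic term `alpha * (x - x_old)**2` turns a greedy Newton-style step into a damped one, which stays stable even where the local quadratic approximation of the cost is degenerate:

```python
import math

def cost(x):
    # Toy nonlinear cost standing in for an approximated trajectory cost.
    return math.sin(x) + 0.5 * x**2

def grad(x):
    return math.cos(x) + x

def hess(x):
    # Second derivative; can approach zero, where a greedy
    # (undamped) Newton step would blow up.
    return -math.sin(x) + 1.0

def regularized_newton(x0, alpha, iters=100):
    # Each iteration minimizes the local quadratic model of the cost
    # plus the penalty alpha * (x - x_old)**2 on the update step,
    # yielding a damped step: x_new = x_old - grad / (hess + alpha).
    x = x0
    for _ in range(iters):
        x = x - grad(x) / (hess(x) + alpha)
    return x

# Starting at pi/2, where hess(x) == 0, the greedy Newton step is
# undefined, but the regularized update converges smoothly.
x_opt = regularized_newton(math.pi / 2, alpha=1.0)
```

The damping weight `alpha` plays the role of a trust-region parameter: large values keep each update close to the previous solution (slow but robust progress from uninformative initializations), while small values recover the aggressive greedy update.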

Keywords:Stochastic Optimal Control
Subjects:H Engineering > H660 Control Systems
Divisions:College of Science > School of Computer Science
ID Code:25754
Deposited On:02 Feb 2017 16:38
