Deisenroth, M. P., Neumann, G. and Peters, J. (2013) A survey on policy search for robotics. Foundations and Trends in Robotics, 2 (1-2), pp. 388-403. ISSN 1935-8253
Full text: [PolicySearchReview.pdf](http://eprints.lincoln.ac.uk/28029/1.hassmallThumbnailVersion/PolicySearchReview.pdf) (PDF, whole document, 2MB)
Item Type: Article
Item Status: Live Archive
Abstract
Policy search is a subfield in reinforcement learning which focuses on
finding good parameters for a given policy parametrization. It is well
suited for robotics as it can cope with high-dimensional state and action
spaces, one of the main challenges in robot learning. We review recent
successes of both model-free and model-based policy search in robot
learning.
Model-free policy search is a general approach to learn policies
based on sampled trajectories. We classify model-free methods based on
their policy evaluation strategy, policy update strategy, and exploration
strategy and present a unified view on existing algorithms. Learning a
policy is often easier than learning an accurate forward model, and,
hence, model-free methods are more frequently used in practice. However,
for each sampled trajectory, it is necessary to interact with the
robot, which can be time-consuming and challenging in practice. Model-based
policy search addresses this problem by first learning a simulator
of the robot’s dynamics from data. Subsequently, the simulator generates
trajectories that are used for policy learning. For both model-free
and model-based policy search methods, we review their respective
properties and their applicability to robotic systems.

* Both authors contributed equally.
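
To make the distinction concrete, below is a minimal sketch (not taken from the survey) contrasting the two families on a toy one-dimensional system: the model-free loop evaluates perturbed policy parameters directly on the system, while the model-based loop fits a simple forward model from logged transitions and then optimizes the policy on simulated rollouts only. All names here (ToyEnv, rollout, fit_forward_model, and so on) are illustrative assumptions, not notation or algorithms from the paper.

```python
# Illustrative sketch only: a toy 1-D system, a linear policy, a crude
# reward-weighted model-free update, and a least-squares forward model
# used for model-based policy search.
import numpy as np

rng = np.random.default_rng(0)

class ToyEnv:
    """1-D point mass that should be driven to the origin."""
    def reset(self):
        self.x = rng.uniform(-1.0, 1.0)
        return self.x

    def step(self, u):
        self.x = self.x + 0.1 * u            # simple linear dynamics
        return self.x, -(self.x ** 2)        # reward: negative squared distance

def policy(theta, x):
    return theta * x                         # linear state-feedback policy

def rollout(env, theta, horizon=20):
    """Run one episode on the real system and return its total reward."""
    x, ret = env.reset(), 0.0
    for _ in range(horizon):
        x, r = env.step(policy(theta, x))
        ret += r
    return ret

# Model-free policy search: perturb the parameters, evaluate each sample by
# interacting with the system, and keep a reward-weighted average.
def model_free_search(env, theta=0.0, iters=30, samples=20, sigma=0.5):
    for _ in range(iters):
        thetas = theta + sigma * rng.standard_normal(samples)      # exploration
        returns = np.array([rollout(env, t) for t in thetas])      # policy evaluation
        weights = np.exp(returns - returns.max())                   # reward weighting
        theta = float(np.sum(weights * thetas) / np.sum(weights))   # policy update
    return theta

# Model-based policy search: log real transitions once, fit a forward model,
# then evaluate candidate policies on simulated rollouts only.
def collect_transitions(env, episodes=5, horizon=20):
    data = []
    for _ in range(episodes):
        x = env.reset()
        for _ in range(horizon):
            u = rng.uniform(-1.0, 1.0)       # random exploration for model learning
            x_next, _ = env.step(u)
            data.append((x, u, x_next))
            x = x_next
    return data

def fit_forward_model(data):
    X = np.array([[x, u] for x, u, _ in data])
    y = np.array([x_next for _, _, x_next in data])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)   # x_next ~ a*x + b*u
    return coef

def simulated_return(coef, theta, x0=0.5, horizon=20):
    x, ret = x0, 0.0
    for _ in range(horizon):
        x = coef[0] * x + coef[1] * policy(theta, x)   # model prediction
        ret += -(x ** 2)
    return ret

if __name__ == "__main__":
    env = ToyEnv()
    theta_mf = model_free_search(env)
    model = fit_forward_model(collect_transitions(env))
    candidates = np.linspace(-10.0, 0.0, 101)
    theta_mb = candidates[np.argmax([simulated_return(model, t) for t in candidates])]
    print(f"model-free theta: {theta_mf:.2f}, model-based theta: {theta_mb:.2f}")
```

In this toy setup the model-free loop spends every evaluation on the real system, whereas the model-based loop reuses a small set of logged transitions for arbitrarily many simulated evaluations, which mirrors the trade-off between sample cost and model accuracy described in the abstract.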