A survey of preference-based reinforcement learning methods

Wirth, Christian, Akrour, Riad, Neumann, Gerhard and Fürnkranz, Johannes (2017) A survey of preference-based reinforcement learning methods. Journal of Machine Learning Research, 18 (136). pp. 1-46. ISSN 1532-4435

Full content URL: http://jmlr.org/papers/v18/16-634.html

16-634.pdf - Whole Document
Available under License Creative Commons Attribution 4.0 International.

Item Type: Article
Item Status: Live Archive


Reinforcement learning (RL) techniques optimize the accumulated long-term reward of a suitably chosen reward function. However, designing such a reward function often requires a lot of task-specific prior knowledge. The designer needs to consider different objectives that influence not only the learned behavior but also the learning progress. To alleviate these issues, preference-based reinforcement learning (PbRL) algorithms have been proposed that can learn directly from an expert's preferences instead of a hand-designed numeric reward. PbRL has gained traction in recent years due to its ability to resolve the reward shaping problem, its ability to learn from non-numeric rewards, and the possibility of reducing the dependence on expert knowledge. We provide a unified framework for PbRL that describes the task formally and points out the different design principles that affect the evaluation task for the human as well as the computational complexity. The design principles include the type of feedback that is assumed, the representation that is learned to capture the preferences, the optimization problem that has to be solved, and how the exploration/exploitation problem is tackled. Furthermore, we point out shortcomings of current algorithms, propose open research questions, and briefly survey practical tasks that have been solved using PbRL.

Keywords: Preference-based reinforcement learning
Subjects: G Mathematical and Computer Sciences > G760 Machine Learning
Divisions: College of Science > School of Computer Science
ID Code: 30636
Deposited On: 27 Feb 2018 12:38
