Daniel, C., van Hoof, H., Peters, J. and Neumann, G. (2016) Probabilistic inference for determining options in reinforcement learning. Machine Learning, 104 (2-3). pp. 337-357. ISSN 0885-6125
Full content URL: http://doi.org/10.1007/s10994-016-5580-x
Documents

PDF: art%3A10.1007%2Fs10994-016-5580-x.pdf - Whole Document (Restricted to Repository staff only), 1MB

PDF: 25739 Daniel2016ECML.pdf - Whole Document, 521kB
Item Type: Article
Item Status: Live Archive
Abstract
Tasks that require many sequential decisions or complex solutions are hard to solve using conventional reinforcement learning algorithms. Based on the semi-Markov decision process (SMDP) setting and the option framework, we propose a model which aims to alleviate these concerns. Instead of learning a single monolithic policy, the agent learns a set of simpler sub-policies as well as the initiation and termination probabilities for each of those sub-policies. While existing option learning algorithms frequently require manual specification of components such as the sub-policies, we present an algorithm which infers all relevant components of the option framework from data. Furthermore, the proposed approach is based on parametric option representations and works well in combination with current policy search methods, which are particularly well suited for continuous real-world tasks. We present results on SMDPs with discrete as well as continuous state-action spaces. The results show that the presented algorithm can combine simple sub-policies to solve complex tasks and can improve learning performance on simpler tasks.
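The option framework described in the abstract can be illustrated with a minimal execution sketch: each option bundles an initiation preference, a sub-policy, and a state-dependent termination probability, and the agent re-selects an option whenever the active one terminates. This is only an assumed, illustrative rendering of how options execute in an SMDP loop; the class and function names (`Option`, `run_episode`, the toy environment) are hypothetical and do not reproduce the paper's probabilistic inference algorithm.

```python
# Illustrative sketch of option execution in an SMDP-style loop.
# All names and the toy environment are assumptions for exposition,
# not the paper's inference method.
from dataclasses import dataclass
from typing import Callable
import numpy as np


@dataclass
class Option:
    """One option: initiation preference, parametric sub-policy, termination probability."""
    activation: Callable[[np.ndarray], float]    # unnormalised initiation preference for a state
    policy: Callable[[np.ndarray], np.ndarray]   # sub-policy: state -> action
    termination: Callable[[np.ndarray], float]   # P(terminate | state)


def run_episode(options, env_step, s0, horizon=100, rng=None):
    """Pick an option by its (normalised) initiation preference, follow its
    sub-policy until it terminates, then re-select; repeat until the horizon."""
    rng = rng or np.random.default_rng()
    s, total_reward, active = s0, 0.0, None
    for _ in range(horizon):
        if active is None:
            prefs = np.array([o.activation(s) for o in options])
            active = options[rng.choice(len(options), p=prefs / prefs.sum())]
        s, r = env_step(s, active.policy(s))
        total_reward += r
        if rng.random() < active.termination(s):
            active = None  # option ends; a new one is chosen on the next step
    return total_reward


if __name__ == "__main__":
    # Toy 1-D task: two constant sub-policies push the state toward the origin
    # from either side; reward penalises distance from the origin.
    def env_step(s, a):
        s_next = s + a
        return s_next, -abs(s_next[0])

    options = [
        Option(lambda s: np.exp(-s[0]), lambda s: np.array([0.5]),
               lambda s: float(s[0] > 0.0)),
        Option(lambda s: np.exp(s[0]), lambda s: np.array([-0.5]),
               lambda s: float(s[0] < 0.0)),
    ]
    print(run_episode(options, env_step, np.array([3.0])))
```

In this sketch each sub-policy is a fixed constant action purely for readability; in the paper's setting the sub-policies are parametric and all components, including the termination probabilities, are inferred from data.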
Keywords: Hierarchical Reinforcement Learning, Option Discovery, Semi-Markov Decision Processes, NotOAChecked
Subjects: G Mathematical and Computer Sciences > G760 Machine Learning
Divisions: College of Science > School of Computer Science
ID Code: 25739
Deposited On: 17 Jan 2017 16:21