Deep reinforcement learning of dialogue policies with less weight updates

Cuayahuitl, Heriberto and Yu, Seunghak (2017) Deep reinforcement learning of dialogue policies with less weight updates. In: International Conference of the Speech Communication Association (INTERSPEECH), 20-24 August 2017, Stockholm, Sweden.

Documents
multids-interspeech2017.pdf - Whole Document (PDF, 711kB)
Item Type: Conference or Workshop contribution (Paper)
Item Status: Live Archive

Abstract

Deep reinforcement learning dialogue systems are attractive because they can jointly learn their feature representations and policies without manual feature engineering, but their application is challenging due to slow learning. We propose a two-stage method for accelerating the induction of single or multi-domain dialogue policies. The first stage reduces the number of weight updates over time, and the second stage uses very limited minibatches (of at most two learning experiences) sampled from experience replay memories. The first stage updates the weights of the neural nets frequently at early stages of training and decreases the number of updates as training progresses, by performing updates during exploration and skipping them during exploitation. The learning process is thus accelerated through fewer weight updates in both stages. An empirical evaluation in three domains (restaurants, hotels and TV guide) confirms that the proposed method trains policies five times faster than a baseline without the proposed method. Our findings are useful for training larger-scale neural-based spoken dialogue systems.
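
The update schedule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes an epsilon-greedy policy with linearly decaying epsilon, uses random numbers in place of a neural network's Q-values, and only counts where a weight update would occur. The names select_action, run_schedule and MINIBATCH_SIZE, and all parameter values, are hypothetical.

```python
import random
from collections import deque

# Assumption: a fixed minibatch of two experiences, reflecting the
# "very limited minibatches" mentioned in the abstract.
MINIBATCH_SIZE = 2


def select_action(q_values, epsilon, rng):
    """Epsilon-greedy choice; returns (action, explored)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values)), True  # exploratory action
    greedy = max(range(len(q_values)), key=q_values.__getitem__)
    return greedy, False  # exploitative action


def run_schedule(num_steps, epsilon_start=1.0, epsilon_end=0.05, seed=0):
    """Count the weight updates triggered when updates are tied to exploration."""
    rng = random.Random(seed)
    replay = deque(maxlen=1000)  # experience replay memory (placeholder contents)
    updates = 0
    batch_sizes = set()
    for step in range(num_steps):
        # Linear epsilon decay (an assumed schedule): exploration, and
        # therefore updating, becomes rarer as training progresses.
        frac = step / max(1, num_steps - 1)
        epsilon = epsilon_start + frac * (epsilon_end - epsilon_start)
        q_values = [rng.random() for _ in range(4)]  # stand-in for network output
        action, explored = select_action(q_values, epsilon, rng)
        replay.append((step, action))  # stand-in for a learning experience
        # Stage 1: update only after exploratory actions, skip otherwise.
        if explored and len(replay) >= MINIBATCH_SIZE:
            # Stage 2: sample a very small minibatch from replay memory.
            batch = rng.sample(list(replay), MINIBATCH_SIZE)
            batch_sizes.add(len(batch))
            updates += 1  # a real system would take a gradient step on `batch`
    return updates, batch_sizes


if __name__ == "__main__":
    updates, _ = run_schedule(1000)
    print(f"weight updates performed: {updates} of 1000 steps")
```

Because updates happen only on exploratory steps, the update count falls below the step count and the gap widens as epsilon decays; a baseline that updates on every step would perform one update per step.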

Keywords: spoken dialogue systems, deep reinforcement learning, multi-domain dialogue management
Subjects: G Mathematical and Computer Sciences > G700 Artificial Intelligence
G Mathematical and Computer Sciences > G760 Machine Learning
G Mathematical and Computer Sciences > G710 Speech and Natural Language Processing
Divisions: College of Science > School of Computer Science
Related URLs:
ID Code: 27676
Deposited On: 16 Jun 2017 11:00
