Lopez Zorrilla, Asier, Torres, M. Ines and Cuayahuitl, Heriberto (2022) Audio Embedding-Aware Dialogue Policy Learning. IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, 31. pp. 525-538. ISSN 1558-7916, 2329-9290
Full content URL: https://doi.org/10.1109/TASLP.2022.3225658
Documents: Audio_Embedding-Aware_Dialogue_Policy_Learning.pdf (PDF, 14MB)
Item Type: | Article |
---|---|
Item Status: | Live Archive |
Abstract
Following the success of Natural Language Processing (NLP) transformers pretrained via self-supervised learning, similar models have recently been proposed for speech processing, such as Wav2Vec2, HuBERT and UniSpeech-SAT. An interesting yet unexplored area of application of these models is Spoken Dialogue Systems, where the users’ audio signals are typically just mapped to word-level features derived from an Automatic Speech Recogniser (ASR), and then processed using NLP techniques to generate system responses. This paper reports a comprehensive comparison of dialogue policies trained using ASR-based transcriptions and extended with the aforementioned audio processing transformers in the DSTC2 task. Whilst our dialogue policies are trained with supervised and policy-based deep reinforcement learning, they are assessed using both automatic task completion metrics and a human evaluation. Our results reveal that using audio embeddings is more beneficial than detrimental in most of our trained dialogue policies, and that the benefits are stronger for supervised learning than reinforcement learning.
Keywords: | spoken dialogue systems, audio embeddings, transformer neural networks, deep reinforcement learning |
---|---|
Subjects: | G Mathematical and Computer Sciences > G700 Artificial Intelligence; G Mathematical and Computer Sciences > G710 Speech and Natural Language Processing; G Mathematical and Computer Sciences > G760 Machine Learning |
Divisions: | College of Science > School of Computer Science |
ID Code: | 52689 |
Deposited On: | 16 Jan 2023 15:42 |