Classifying Speech Acts using Multi-channel Deep Attention Network for Task-oriented Conversational Search Agents

Document Type

Conference Proceeding

Publication Title

CHIIR 2021 - Proceedings of the 2021 Conference on Human Information Interaction and Retrieval




Abstract

Understanding human spoken dialogues in an information-seeking scenario is a significant challenge for IR researchers. Prior literature on intelligent systems suggests that by identifying the speech acts in spoken dialogues, we can identify a user's search intent and information needs. In this paper, we therefore use speech acts to address the problem of natural language understanding in conversational search systems. First, we collected human-system interaction data through a Wizard-of-Oz study. Next, we developed a gold-standard dataset in which the human-system conversations are labeled with the corresponding speech acts. Finally, we built a Multi-channel Deep Attention Network (MDAN) to identify speech acts in information-seeking dialogues. The best-performing model predicts speech acts with 90.2% accuracy. MDAN outperforms not only all traditional machine learning baselines but also the state-of-the-art single-channel BERT, by 3.3 absolute points. An ablation analysis shows the impact of MDAN's three channels, individually and in combination; the best performance is achieved using all three channels for speech act prediction.
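The abstract describes a multi-channel architecture whose channel representations are combined via attention before classification. The paper's actual channels and layers are not given here, so the following is only a minimal NumPy sketch of the general idea: three assumed channel vectors are attention-weighted, fused, and passed to a softmax over speech-act labels. All dimensions, weights, and channel names are illustrative placeholders, not the authors' model.

```python
import numpy as np

# Hypothetical sketch of multi-channel attention fusion (not the paper's
# architecture): three channel vectors for one utterance are weighted by
# a learned attention score, fused, then classified over speech acts.

rng = np.random.default_rng(0)
d, n_classes = 8, 5          # illustrative hidden size and label count

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# One representation per channel (placeholder random features)
channels = np.stack([rng.standard_normal(d) for _ in range(3)])  # (3, d)

# Attention over channels; random weights stand in for trained ones
w_att = rng.standard_normal(d)      # scoring vector
scores = channels @ w_att           # one score per channel, shape (3,)
alpha = softmax(scores)             # channel weights, sum to 1
fused = alpha @ channels            # weighted combination, shape (d,)

# Linear classifier over the fused representation
W, b = rng.standard_normal((n_classes, d)), np.zeros(n_classes)
probs = softmax(W @ fused + b)      # distribution over speech-act labels
```

Dropping a channel here amounts to masking its attention weight, which mirrors the kind of per-channel ablation the abstract reports.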


Keywords

conversational information retrieval, conversational search systems, deep neural network, intelligent personal assistants, speech acts, spoken search