Challenges in Building Highly Interactive Dialogue Systems
Abstract
Over the past decade, dialogue researchers have built several systems that go beyond rigid turn-taking and robotic interactions. In research-driven systems, we are beginning to see demonstrations of humanlike sensitivity to the user’s behavior and state, swift and natural timing, and appropriately tailored behaviors. We are learning that the careful design of interactive skills in our systems can lead to improvements in naturalness, efficiency, feelings of rapport, and task-related outcomes. Research systems are providing a vision of what is possible. However, much work remains before such abilities are robust, widely useful, and generally available. This article identifies 10 key challenges, relating to modeling, systems architecture, and development methods. Of pressing importance for dialogue systems, these challenges are also relevant for intelligent and interactive systems more generally.
Given Siri’s broad deployment and popular salience, one might imagine that it solved the problems of interacting in dialogue: we often meet people who are unaware how cleverly Siri and her sisters avoid dialogue. While they do use speech, their preferred interaction style is to map one user input to one system output, avoiding any of that messy interaction stuff. While this helps them do well on simple tasks such as command recognition or question answering, many user needs are too complex to address in a single input-output exchange. These systems are missing half the promise of speech-based interaction. We envision the creation of much more highly interactive systems. In broad strokes, these systems will be characterized by low latency and natural timing, a deft sensitivity to the multifunctional nature of communication, and flexibility about how any given interaction unfolds. Their skill with interaction timing will be manifest in the way they are attuned to and continuously respond to their users with an array of real-time communicative signals.
Their skill at understanding the multifunctional effects of utterances will mean, for example, that they decide not only what dialogue act to perform, but also with what prosody and nonverbal behavior, with what expected effects on the turn-taking state, and with what expected social effects such as implied attitudes, emotions, and potential for rapport building. Their flexibility about the interaction itself will mean users feel less constrained by obvious limitations in turn-taking protocols, supported dialogue flows, and expected speech patterns. The structure of their interactions will emerge more as a creative process than as a simple instantiation of a preplanned interaction template. As we develop the technology to support such interactive skills, we believe dialogue will become the interface of choice for a much broader range of applications. The challenges in providing such interactivity are many. Our survey here is based on our experiences as researchers and developers, and on analysis of other recent advances in spoken dialogue systems, intelligent virtual agents, and human-robot interaction, including Gratch et al. (2007); DeVault, Sagae, and Traum (2009); Bohus and Horvitz (2011); Forbes-Riley and Litman (2011); Acosta and Ward (2011); Raux and Eskenazi (2012); Andrist, Mutlu, and Gleicher (2013); Meena, Skantze, and Gustafson (2014); Skantze, Hjalmarsson, and Oertel (2014); Ghigi et al. (2014); and Paetzel, Manuvinakurike, and DeVault (2015).
Similar resources
Challenges in Building Highly-Interactive Dialog Systems
Spoken dialog researchers have recently demonstrated highly-interactive systems in several domains. This paper considers how to build on these advances to make systems more robust, easier to develop, and more scientifically significant. We identify key challenges whose solution would lead to improvements in dialog systems and beyond.
How was your day? An architecture for multimodal ECA systems
Multimodal conversational dialogue systems consisting of numerous software components create challenges for the underlying software architecture and development practices. Typically, such systems are built on separate, often preexisting components developed by different organizations and integrated in a highly iterative way. The traditional dialogue system pipeline is not flexible enough to add...
Dialogue Management as Interactive Tree Building
We introduce a new dialogue model and a formalism for limited-domain dialogue systems, which works by interactively building dialogue trees. The model borrows its fundamental ideas from type theoretical grammars and Dynamic Syntax. The resulting dialogue theory is a simple and light-weight formalism, which is still capable of advanced dialogue behaviour.
Supporting K-5 Learners with Dialogue Systems
Interactive learning environments have been built to support various audiences from preschool to university students. However, it is not yet known how to bring the great promise of tutorial dialogue systems, which engage students in rich natural language, to bear for young learners such as those in grades K-5. This doctoral consortium paper presents our goal of developing a dialogue system in t...
From single word to natural dialogue
Spoken language dialogue systems represent the peak of achievement in speech technologies in the 20th century and appear set to form the basis for the increasingly natural interactive systems to follow in the coming decades. This chapter first presents a model of the task-oriented spoken dialogue system, its multiple aspects and some of the remaining research challenges. In the context of this...