The Dialog State Tracking Challenge
Authors
Abstract
In a spoken dialog system, dialog state tracking deduces information about the user’s goal as the dialog progresses, synthesizing evidence such as dialog acts over multiple turns with external data sources. Recent approaches have been shown to overcome ASR and SLU errors in some applications. However, there are currently no common testbeds or evaluation measures for this task, hampering progress. The dialog state tracking challenge seeks to address this by providing a heterogeneous corpus of 15K human-computer dialogs in a standard format, along with a suite of 11 evaluation metrics. The challenge received a total of 27 entries from 9 research groups. The results show that the suite of performance metrics cluster into 4 natural groups. Moreover, the dialog systems that benefit most from dialog state tracking are those with less discriminative speech recognition confidence scores. Finally, generalization is a key problem: in 2 of the 4 test sets, fewer than half of the entries out-performed simple baselines.
∗Most of the work for the challenge was performed when the second and third authors were with Honda Research Institute, Mountain View, CA, USA.
1 Overview and motivation
Spoken dialog systems interact with users via natural language to help them achieve a goal. As the interaction progresses, the dialog manager maintains a representation of the state of the dialog in a process called dialog state tracking (DST). For example, in a bus schedule information system, the dialog state might indicate the user’s desired bus route, origin, and destination. Dialog state tracking is difficult because automatic speech recognition (ASR) and spoken language understanding (SLU) errors are common, and can cause the system to misunderstand the user’s needs. At the same time, state tracking is crucial because the system relies on the estimated dialog state to choose actions – for example, which bus schedule information to present to the user.
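To make the idea of a dialog state concrete, here is a minimal Python sketch of a tracker for the bus-schedule example. It is only an illustration under stated assumptions, not a method from the challenge. The function names `update_state` and `top_hypothesis`, the noisy-OR style update rule, and the route, stop, and confidence values are all hypothetical. The sketch keeps a score for every hypothesized value of each slot and folds in each turn's SLU hypotheses, so that evidence repeated over several turns can outweigh a single high-confidence recognition error.

```python
from collections import defaultdict


def update_state(state, slu_hypotheses):
    """Fold one turn's SLU hypotheses into the running dialog state.

    state: dict mapping slot -> {value: accumulated score in [0, 1]}
    slu_hypotheses: list of (slot, value, confidence) triples for this turn
    """
    for slot, value, conf in slu_hypotheses:
        prev = state[slot].get(value, 0.0)
        # Noisy-OR style accumulation: an illustrative assumption,
        # not the update rule of any particular challenge entry.
        state[slot][value] = 1.0 - (1.0 - prev) * (1.0 - conf)
    return state


def top_hypothesis(state, slot):
    """Return the highest-scoring value for a slot, or None if unseen."""
    values = state.get(slot)
    if not values:
        return None
    return max(values, key=values.get)


# Hypothetical three-turn dialog; all values and confidences are made up.
turns = [
    [("route", "61C", 0.6), ("route", "61A", 0.3)],          # likely misrecognition
    [("route", "61A", 0.5), ("origin", "downtown", 0.8)],
    [("route", "61A", 0.4), ("destination", "airport", 0.7)],
]

state = defaultdict(dict)
for hypotheses in turns:
    update_state(state, hypotheses)

for slot in ("route", "origin", "destination"):
    print(slot, "->", top_hypothesis(state, slot))
```

In this toy dialog the first turn on its own favors route 61C, but after three turns the accumulated score for 61A is higher, so the tracker's best guess for the route changes as evidence builds up.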
Most commercial systems use hand-crafted heuristics for state tracking, selecting the SLU result with the highest confidence score, and discarding alternatives. In contrast, statistical approaches compute scores for many hypotheses for the dialog state (Figure 1). By exploiting correlations between turns and information from external data sources – such as maps, bus timetables, or models of past dialogs – statistical approaches can overcome some SLU errors. Numerous techniques for dialog state tracking have been proposed, including heuristic scores (Higashinaka et al., 2003), Bayesian networks (Paek and Horvitz, 2000; Williams and Young, 2007), kernel density estimators (Ma et al., 2012), and discriminative models (Bohus and Rudnicky, 2006). Techniques have been fielded which scale to realistically sized dialog problems and operate in real time (Young et al., 2010; Thomson and Young, 2010; Williams, 2010; Mehta et al., 2010). In end-to-end dialog systems, dialog state tracking has been shown to improve overall system performance (Young et al., 2010; Thomson and Young, 2010).
Despite this progress, direct comparisons between methods have not been possible because past studies use different domains and system components, for speech recognition, spoken language understanding, dialog control, etc. Moreover, there is little agreement on how to evaluate dialog state tracking. Together these issues limit progress in this research area. The Dialog State Tracking Challenge (DSTC) provides a first common testbed and evaluation
Similar resources
The Dialog State Tracking Challenge Series
Conversational systems are increasingly becoming a part of daily life, with examples including Apple's Siri, Google Now, Nuance Dragon Go, Xbox, and Cortana from Microsoft, and those from numerous startups. In the core of a conversation system is a key component called a dialog state tracker, which estimates the user's goal given all of the dialog history so far. For example, in a tourist info...
The Dialog State Tracking Challenge Series: A Review
In a spoken dialog system, dialog state tracking refers to the task of correctly inferring the state of the conversation – such as the user’s goal – given all of the dialog history up to that turn. Dialog state tracking is crucial to the success of a dialog system, yet until recently there were no common resources, hampering progress. The Dialog State Tracking Challenge series of 3 tasks introd...
Extrinsic Evaluation of Dialog State Tracking and Predictive Metrics for Dialog Policy Optimization
During the recent Dialog State Tracking Challenge (DSTC), a fundamental question was raised: “Would better performance in dialog state tracking translate to better performance of the optimized policy by reinforcement learning?” Also, during the challenge system evaluation, another nontrivial question arose: “Which evaluation metric and schedule would best predict improvement in overall dialog p...
The Second Dialog State Tracking Challenge
A spoken dialog system, while communicating with a user, must keep track of what the user wants from the system at each step. This process, termed dialog state tracking, is essential for a successful dialog system as it directly informs the system’s actions. The first Dialog State Tracking Challenge allowed for evaluation of different dialog state tracking techniques, providing common testbeds ...
Introduction to the Special Issue on Dialogue State Tracking
In the core of most task-oriented conversation systems is a component called a dialog state tracker, which estimates the user’s goal given all of the dialog history so far. For example, in a weather information system, the dialog state might indicate the location the user is interested in (Seattle, London, Beijing), and for which date (today, tomorrow, this Saturday). Dialog state tracking is d...