Training a real-world POMDP-based Dialogue System

Authors

  • Blaise Thomson
  • Jost Schatzmann
  • Karl Weilhammer
  • Hui Ye
  • Steve Young
Abstract

Partially Observable Markov Decision Processes provide a principled way to model uncertainty in dialogues. However, traditional algorithms for optimising policies are intractable except for cases with very few states. This paper discusses a new approach to policy optimisation based on grid-based Q-learning with a summary of belief space. We also present a technique for bootstrapping the system using a novel agenda-based user model. An implementation of a policy trained using this system was tested with human subjects in an extensive trial. The policy gave highly competitive results, with a 90.6% task completion rate.
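
To make the idea of grid-based Q-learning over a summary of belief space more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the summary is the two largest belief probabilities, that grid points are added whenever no existing point lies within a fixed radius, and that a one-step Q-learning update is used. The names GridQLearner and summarise, and all parameter values, are hypothetical.

```python
import math
import random


def summarise(belief):
    """Hypothetical summary mapping: keep only the two largest probabilities."""
    top = sorted(belief, reverse=True)[:2]
    return (top[0], top[1] if len(top) > 1 else 0.0)


class GridQLearner:
    """Q-learning over a growing grid of points in a summary belief space."""

    def __init__(self, n_actions, radius=0.2, alpha=0.1, gamma=0.95, seed=0):
        self.n_actions = n_actions
        self.radius = radius  # assumed threshold for adding a new grid point
        self.alpha = alpha    # learning rate
        self.gamma = gamma    # discount factor
        self.points = []      # grid points, each a tuple of floats
        self.q = []           # q[i][a]: value estimate for grid point i, action a
        self.rng = random.Random(seed)

    def nearest(self, summary):
        """Index of the closest grid point; add a new point if none is within radius."""
        best_i, best_d = None, float("inf")
        for i, point in enumerate(self.points):
            d = math.dist(point, summary)
            if d < best_d:
                best_i, best_d = i, d
        if best_i is None or best_d > self.radius:
            self.points.append(tuple(summary))
            self.q.append([0.0] * self.n_actions)
            return len(self.points) - 1
        return best_i

    def act(self, summary, epsilon=0.1):
        """Epsilon-greedy action choice at the nearest grid point."""
        i = self.nearest(summary)
        if self.rng.random() < epsilon:
            return self.rng.randrange(self.n_actions)
        row = self.q[i]
        return row.index(max(row))

    def update(self, summary, action, reward, next_summary=None):
        """One-step Q-learning update; next_summary=None marks the end of a dialogue."""
        i = self.nearest(summary)
        target = reward
        if next_summary is not None:
            j = self.nearest(next_summary)
            target += self.gamma * max(self.q[j])
        self.q[i][action] += self.alpha * (target - self.q[i][action])
```

In a training loop, each simulated dialogue turn would call summarise on the current belief, pick an action with act, and then call update with the observed reward. The grid keeps the table finite even though the underlying belief space is continuous.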

Similar resources

Gaussian processes for POMDP-based dialogue manager optimisation

A partially observable Markov decision process (POMDP) has been proposed as a dialogue model that enables automatic optimisation of the dialogue policy and provides robustness to speech understanding errors. Various approximations allow such a model to be used for building real-world dialogue systems. However, they require a large number of dialogues to train the dialogue policy and hence they t...

Bayesian update of dialogue state: A POMDP framework for spoken dialogue systems

This paper describes a statistically motivated framework for performing real-time dialogue state updates and policy learning in a spoken dialogue system. The framework is based on the partially observable Markov decision process (POMDP), which provides a well-founded, statistical model of spoken dialogue management. However, exact belief state updates in a POMDP model are computationally intrac...

Training and Evaluation of the HIS POMDP Dialogue System in Noise

This paper investigates the claim that a dialogue manager modelled as a Partially Observable Markov Decision Process (POMDP) can achieve improved robustness to noise compared to conventional state-based dialogue managers. Using the Hidden Information State (HIS) POMDP dialogue manager as an exemplar, and an MDP-based dialogue manager as a baseline, evaluation results are presented for both simu...

Gaussian Processes for Fast Policy Optimisation of POMDP-based Dialogue Managers

Modelling dialogue as a Partially Observable Markov Decision Process (POMDP) enables a dialogue policy robust to speech understanding errors to be learnt. However, a major challenge in POMDP policy learning is to maintain tractability, so the use of approximation is inevitable. We propose applying Gaussian Processes in Reinforcement learning of optimal POMDP dialogue policies, in order (1) to m...

Agenda-Based User Simulation for Bootstrapping a POMDP Dialogue System

This paper investigates the problem of bootstrapping a statistical dialogue manager without access to training data and proposes a new probabilistic agenda-based method for simulating user behaviour. In experiments with a statistical POMDP dialogue system, the simulator was realistic enough to successfully test the prototype system and train a dialogue policy. An extensive study with human subj...

On-Line Learning of a Persian Spoken Dialogue System Using Real Training Data

The first spoken dialogue system developed for the Persian language is introduced. This is a ticket reservation system with Persian ASR and NLU modules. The focus of the paper is on learning the dialogue management module. In this work, real on-line training data are used during the learning process. For on-line learning, the effect of the variations of discount factor (γ) on the learning speed...

Publication date: 2007