KPT: Keyword-Guided Pre-training for Grounded Dialog Generation
Authors
Abstract
Incorporating external knowledge into the response generation process is essential to building more helpful and reliable dialog agents. However, collecting knowledge-grounded conversations is often costly, calling for a better pre-trained model for grounded dialog generation that generalizes well with respect to different types of knowledge. In this work, we propose KPT (Keyword-guided Pre-Training), a novel self-supervised pre-training method for grounded dialog generation that does not rely on extra knowledge annotation. Specifically, we use a pre-trained language model to extract the most uncertain tokens in the dialog as keywords. With these keywords, we construct two kinds of knowledge and pre-train a knowledge-grounded response generation model, aiming at handling two different scenarios: (1) the knowledge should be faithfully grounded; (2) it can be selectively used. For the former, the grounding knowledge consists of keywords extracted from the response. For the latter, the grounding knowledge is additionally augmented with keywords extracted from other utterances in the same dialog. Since the knowledge is extracted from the dialog itself, KPT can be easily performed on a large volume and variety of dialogue data. We considered three data sources (open-domain, task-oriented, conversational QA) with a total of 2.5M dialogues. We conduct extensive experiments on various few-shot knowledge-grounded generation tasks, including grounding on dialog acts, knowledge graphs, persona descriptions, and Wikipedia passages. Our comprehensive experiments and analyses demonstrate that KPT consistently outperforms state-of-the-art methods on these tasks with diverse grounding knowledge.
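The two grounding scenarios described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract says keywords are the most uncertain tokens under a language model, whereas this sketch substitutes a smoothed unigram model built from the dialog itself, and scores uncertainty as surprisal (negative log-probability). The function names `token_surprisal`, `extract_keywords`, and `build_grounding`, and the parameter `k`, are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def token_surprisal(dialog):
    """Score each token by -log p(token) under an add-one-smoothed unigram model.

    ASSUMPTION: this unigram model is only a stand-in for the language model
    mentioned in the abstract; rarer tokens count as "more uncertain".
    """
    tokens = [t for utt in dialog for t in utt.lower().split()]
    counts = Counter(tokens)
    total, vocab = len(tokens), len(counts)
    return {t: -math.log((c + 1) / (total + vocab)) for t, c in counts.items()}


def extract_keywords(utterance, surprisal, k=2):
    """Return the k highest-surprisal distinct tokens, in order of appearance."""
    tokens = list(dict.fromkeys(utterance.lower().split()))
    # sorted() is stable, so tied tokens keep their order of appearance.
    top = set(sorted(tokens, key=lambda t: -surprisal[t])[:k])
    return [t for t in tokens if t in top]


def build_grounding(dialog, response_index, k=2):
    """Build the two kinds of grounding knowledge for one response.

    faithful:  keywords from the response only (scenario 1).
    selective: the same keywords plus keywords from the other utterances
               of the same dialog (scenario 2).
    """
    surprisal = token_surprisal(dialog)
    faithful = extract_keywords(dialog[response_index], surprisal, k)
    selective = list(faithful)
    for i, utt in enumerate(dialog):
        if i == response_index:
            continue
        for kw in extract_keywords(utt, surprisal, k):
            if kw not in selective:
                selective.append(kw)
    return faithful, selective


# Toy dialog (invented for illustration, not taken from the paper's data).
example_dialog = [
    "i like the movie inception",
    "i like the director nolan",
    "i like the actor dicaprio",
]
faithful, selective = build_grounding(example_dialog, response_index=1, k=2)
print(faithful)
print(selective)
```

In the faithful case every keyword comes from the target response, so a model trained on it is pushed to use all the knowledge it is given. In the selective case the extra keywords act as distractors the model must learn to ignore or use only when relevant.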
Similar resources
Keyword Generation for Lyrics
This paper proposes a scheme for content-based keyword generation of song lyrics. Syntactic as well as semantic similarity is used for sentence-level clustering to separate the topic from the background of a song. A method is proposed to search for a center in the semantic graph of WordNet for generating keywords not contained in the original text.
Dialog Annotation for Stochastic Generation
Individuals who successfully make their livelihood by talking with others, for example travel agents, can be presumed to have optimized their language for the task at hand in terms of conciseness and intelligibility. It makes sense to exploit this effort for the purpose of building better generation components for a spoken dialog system. The Stochastic Generation technique, introduced by Oh and...
A Conditional Variational Framework for Dialog Generation
Deep latent variable models have been shown to facilitate the response generation for open-domain dialog systems. However, these latent variables are highly randomized, leading to uncontrollable generated responses. In this paper, we propose a framework allowing conditional response generation based on specific attributes. These attributes can be either manually assigned or automatically detect...
Text Generation Methods for Dialog Systems
Text generation systems are typically more powerful than generation components of dialog systems. In order to exploit their advanced capabilities for dialog purposes, we discuss the extension potential of NL generation components of dialog systems on the basis of methods embedded in text generation system. We investigate architectural concerns and crucial system features in a comparison, and we...
Visually Grounded Learning of Keyword Prediction from Untranscribed Speech
During language acquisition, infants have the benefit of visual cues to ground spoken language. Robots similarly have access to audio and visual sensors. Recent work has shown that images and spoken captions can be mapped into a meaningful common space, allowing images to be retrieved using speech and vice versa. In this setting of images paired with untranscribed spoken captions, we consider w...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2023
ISSN: ['2159-5399', '2374-3468']
DOI: https://doi.org/10.1609/aaai.v37i11.26646