The PIT Corpus of German Multi-Party Dialogues
Abstract
The PIT corpus is a German multi-media corpus of multi-party dialogues recorded in a Wizard-of-Oz environment at the University of Ulm. The scenario involves two human dialogue partners interacting with a multi-modal dialogue system in the domain of restaurant selection. In this paper, we present the characteristics of the data, which was recorded in three sessions, resulting in a total of 75 dialogues and about 14 hours of audio and video data. The corpus is available at http://www.uni-ulm.de/in/pit.
Similar resources
Discourse Structure and Dialogue Acts in Multiparty Dialogue: the STAC Corpus
This paper describes the STAC resource, a corpus of multi-party chats annotated for discourse structure in the style of SDRT (Asher and Lascarides, 2003; Lascarides and Asher, 2009). The main goal of the STAC project is to study the discourse structure of multi-party dialogues in order to understand the linguistic strategies adopted by interlocutors to achieve their conversational goals, especi...
The Teams Corpus and Entrainment in Multi-Party Spoken Dialogues
When interacting individuals entrain, they begin to speak more like each other. To support research on entrainment in cooperative multi-party dialogues, we have created a corpus where teams of three or four speakers play two rounds of a cooperative board game. We describe the experimental design and technical infrastructure used to collect our corpus, which consists of audio, video, transcripti...
A corpus for studying addressing behavior in multi-party dialogues
This paper describes a multi-modal corpus of hand-annotated meeting dialogues that was designed for studying addressing behavior in face-to-face conversations. The corpus contains annotated dialogue acts, addressees, adjacency pairs and gaze direction. First, we describe the corpus design where we present the annotation schema, annotation tools and annotation process itself. Then, we analyze th...
Term-Weighting for Summarization of Multi-party Spoken Dialogues
This paper explores the issue of term-weighting in the genre of spontaneous, multi-party spoken dialogues, with the intent of using such term-weights in the creation of extractive meeting summaries. The field of text information retrieval has yielded many term-weighting techniques to import for our purposes; this paper implements and compares several of these, namely tf.idf, Residual IDF and Ga...
Evaluation of the PIT Corpus Or What a Difference a Face Makes?
This paper presents the evaluation of the PIT Corpus of multi-party dialogues recorded in a Wizard-of-Oz environment. An evaluation has been performed with two different foci: First, a usability evaluation was used to take a look at the overall ratings of the system. A shortened version of the SASSI questionnaire, namely the SASSISV, and the well established AttrakDiff questionnaire assessing t...