An Annotated Corpus of Film Dialogue for Learning and Characterizing Character Style

Authors

  • Marilyn A. Walker
  • Grace I. Lin
  • Jennifer Sawyer
Abstract

Interactive story systems often involve dialogue with virtual dramatic characters. However, to date most character dialogue is written by hand. One way to ease the authoring process is to (semi-)automatically generate dialogue based on film characters. We extract features from dialogue of film characters in leading roles. Then we use these character-based features to drive our language generator to produce interesting utterances. This paper describes a corpus of film dialogue that we have collected from the IMSDb archive and annotated for linguistic structures and character archetypes. We extract different sets of features using external sources such as LIWC and SentiWordNet as well as using our own written scripts. The automation of feature extraction also eases the process of acquiring additional film scripts. We briefly show how film characters can be represented by models learned from the corpus, how the models can be distinguished based on different categories such as gender and film genre, and how they can be applied to a language generator to generate utterances that can be perceived as being similar to the intended character model.
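
As a rough illustration of the per-character feature extraction described in the abstract, the following minimal sketch groups utterances by speaker and computes a few simple style features. It is not the authors' pipeline: the function names, the input format of (character, utterance) pairs, and the tiny `TOY_LEXICON` are all assumptions made for this example, and the lexicon merely stands in for the word-category resources (LIWC, SentiWordNet) that the paper names.

```python
import re
from collections import defaultdict

# Hypothetical toy lexicon (an assumption for illustration only).
# It stands in for LIWC / SentiWordNet categories, which are not reproduced here.
TOY_LEXICON = {
    "positive": {"good", "great", "love"},
    "negative": {"bad", "hate", "never"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def character_features(dialogue):
    """Compute simple style features per character.

    dialogue: iterable of (character, utterance) pairs (assumed format).
    Returns a dict mapping each character to a dict of feature values.
    """
    tokens_by_char = defaultdict(list)
    turns_by_char = defaultdict(int)
    for character, utterance in dialogue:
        tokens_by_char[character].extend(tokenize(utterance))
        turns_by_char[character] += 1

    features = {}
    for character, tokens in tokens_by_char.items():
        total = len(tokens) or 1  # avoid division by zero for empty turns
        row = {
            "turns": turns_by_char[character],
            "mean_tokens_per_turn": len(tokens) / turns_by_char[character],
        }
        # Rate of tokens falling in each lexicon category.
        for category, words in TOY_LEXICON.items():
            row[category + "_rate"] = sum(t in words for t in tokens) / total
        features[character] = row
    return features


if __name__ == "__main__":
    sample = [
        ("A", "I love good food"),
        ("A", "Great"),
        ("B", "I hate this"),
    ]
    for name, row in character_features(sample).items():
        print(name, row)
```

Feature dictionaries of this kind could then be compared across groups such as gender or film genre, or passed as parameters to a generator, which is the use the abstract describes.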

Similar resources

Perceived or Not Perceived: Film Character Models for Expressive NLG

This paper presents a method for learning models of character linguistic style from a corpus of film dialogues and tests the method in a perceptual experiment. We apply our method in the context of SpyFeet, a prototype role playing game. In previous work, we used the PERSONAGE engine to produce restaurant recommendations that varied according to the speaker’s personality [14, 12]. Here we show ...

Automatic annotation of context and speech acts for dialogue corpora

Richly annotated dialogue corpora are essential for new research directions in statistical learning approaches to dialogue management, context-sensitive interpretation, and context-sensitive speech recognition. In particular, large dialogue corpora annotated with contextual information and speech acts are urgently required. We explore how existing dialogue corpora (usually consisting of utteranc...

Characterizing the Effectiveness of Tutorial Dialogue with Hidden Markov Models

Identifying effective tutorial dialogue strategies is a key issue for intelligent tutoring systems research. Human-human tutoring offers a valuable model for identifying effective tutorial strategies, but extracting them is a challenge because of the richness of human dialogue. This paper addresses that challenge through a machine learning approach that 1) learns tutorial strategies from a corp...

Dialogue-Learning Correlations in Spoken Dialogue Tutoring

We examine correlations between dialogue characteristics and learning in two corpora of spoken tutoring dialogues: a human-human corpus and a human-computer corpus, both of which have been manually annotated with dialogue acts relative to the tutoring domain. The results from our human-computer corpus show that the presence of student utterances that display reasoning, as well as the presence of...

Publication date: 2012