REX-J: Japanese referring expression corpus of situated dialogs

Authors

  • Philipp Spanger
  • Masaaki Yasuhara
  • Ryu Iida
  • Takenobu Tokunaga
  • Asuka Terai
  • Naoko Kuriyama
Abstract

Identifying objects in conversation is a fundamental human capability necessary to achieve efficient collaboration on any real world task. Hence the deepening of our understanding of human referential behaviour is indispensable for the creation of systems that collaborate with humans in a meaningful way. We present the construction of REX-J, a multi-modal Japanese corpus of referring expressions in situated dialogs, based on the collaborative task of solving the Tangram puzzle. This corpus contains 24 dialogs with over 4 hours of recordings and over 1400 referring expressions. We outline the characteristics of the collected data and point out the important differences from previous corpora. The corpus records extra-linguistic information during the interaction (e.g. the position of pieces, the actions on the pieces) in synchronization with the participants’ utterances. This in turn allows us to discuss the importance of creating a unified model of linguistic and extralinguistic information from a new perspective. Demonstrating the potential uses of this corpus, we present the analysis of a specific type of referring expression (“action-mentioning expression”) as well as the results of research into the generation of demonstrative pronouns. Furthermore, we discuss some perspectives on potential uses of this corpus as well as our planned future work, underlining how it is a valuable addition to the existing databases in the community for the study and modeling of referring expressions in situated dialog.

Similar resources

First Workshop on Language Resources and Technologies for Turkic Languages Workshop Programme

In this paper we report on the preliminary findings of our ongoing study on Turkish referring expressions used in situated dialogs. Situated dialogs of pairs of Turkish speakers were collected while they were engaged in a collaborative Tangram puzzle-solving task, which was designed by Spanger et al. (2011) in an effort to build a corpus of referring expressions in Japanese and English. The pa...

Towards an Extrinsic Evaluation of Referring Expressions in Situated Dialogs

In the field of referring expression generation, while in the static domain both intrinsic and extrinsic evaluations have been considered, extrinsic evaluation in the dynamic domain, such as in a situated collaborative dialog, has not been discussed in depth. In a dynamic domain, a crucial problem is that referring expressions do not make sense without an appropriate preceding dialog context. I...

A Japanese Corpus of Referring Expressions Used in a Situated Collaboration Task

In order to pursue research on generating referring expressions in a situated collaboration task, we set up a data-collection experiment based on the Tangram puzzle. For a pair of participants we recorded every utterance in synchronisation with the current state of the puzzle as well as all operations by the participants. Referring expressions were annotated with their referents in order to bui...

SCARE: a Situated Corpus with Annotated Referring Expressions

In this paper we report on the release of a corpus of English spontaneous instruction giving situated dialogs. The corpus was collected using the Quake environment, a first-person virtual reality game, and consists of pairs of participants completing a direction giver-direction follower scenario. The corpus contains the collected audio and video, as well as word-aligned transcriptions and the p...

The REX corpora: A collection of multimodal corpora of referring expressions in collaborative problem solving dialogues

This paper describes a collection of multimodal corpora of referring expressions, the REX corpora. The corpora have two notable features, namely (1) they include time-aligned extra-linguistic information such as participant actions and eye-gaze on top of linguistic information, (2) dialogues were collected with various configurations in terms of the puzzle type, hinting and language. After desc...

Journal:
  • Language Resources and Evaluation

Volume 46  Issue 

Pages  -

Publication date: 2012