Using Discourse Information for Paraphrase Extraction

Authors

  • Michaela Regneri
  • Rui Wang
Abstract

Previous work on paraphrase extraction using parallel or comparable corpora has generally not considered the documents’ discourse structure as a useful information source. We propose a novel method for collecting paraphrases that relies on the sequential event order in the discourse, using multiple sequence alignment with a semantic similarity measure. We show that adding discourse information boosts the performance of sentence-level paraphrase acquisition, which in turn gives a tremendous advantage for extracting phrase-level paraphrase fragments from matched sentences. Our system beats an informed baseline by a margin of 50%.
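
The idea of matching sentences by their order in the discourse can be illustrated with a small sketch. It is not the authors' system: it aligns only two documents with pairwise Needleman-Wunsch alignment rather than multiple sequence alignment, and it uses Jaccard word overlap as a stand-in for the semantic similarity measure. The names `jaccard` and `align_sentences` and the parameters `gap` and `min_sim` are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity in [0, 1] (assumed stand-in for a semantic measure)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def align_sentences(doc_a: List[str], doc_b: List[str],
                    gap: float = -0.1,
                    min_sim: float = 0.3) -> List[Tuple[str, str]]:
    """Order-preserving pairwise alignment of two sentence sequences.

    Returns aligned sentence pairs whose similarity is at least min_sim.
    """
    n, m = len(doc_a), len(doc_b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sim = jaccard(doc_a[i - 1], doc_b[j - 1])
            options = [
                (score[i - 1][j - 1] + sim - min_sim, "diag"),  # align the pair
                (score[i - 1][j] + gap, "up"),                  # skip a sentence of doc_a
                (score[i][j - 1] + gap, "left"),                # skip a sentence of doc_b
            ]
            score[i][j], back[i][j] = max(options, key=lambda o: o[0])

    # Trace back from the bottom-right cell, keeping sufficiently similar pairs.
    pairs: List[Tuple[str, str]] = []
    i, j = n, m
    while i > 0 and j > 0:
        move = back[i][j]
        if move == "diag":
            if jaccard(doc_a[i - 1], doc_b[j - 1]) >= min_sim:
                pairs.append((doc_a[i - 1], doc_b[j - 1]))
            i, j = i - 1, j - 1
        elif move == "up":
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs
```

Because the alignment is monotone, a sentence near the end of one document cannot be paired with a sentence near the start of the other, which is the kind of constraint that sequential event order contributes. Calling `align_sentences(doc_a, doc_b)` on two lists of sentences describing the same events returns the candidate sentence-level paraphrase pairs.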

Similar resources

Squibs: On Paraphrase and Coreference

Paraphrase extraction and coreference resolution have applications in Question Answering, Information Extraction, Machine Translation, and so forth. Paraphrase pairs might be coreferential, and coreference relations are sometimes paraphrases. The two overlap considerably (Hirst 1981), but their definitions make them significantly different in essence: Paraphrasing concerns meaning, whereas core...

On Paraphrase and Coreference

Paraphrase extraction and coreference resolution have applications in Question Answering, Information Extraction, Machine Translation, and so forth. Paraphrase pairs might be coreferential, and coreference relations are sometimes paraphrases. The two overlap considerably (Hirst 1981), but their definitions make them significantly different in essence: Paraphrasing concerns meaning, whereas coref...

CUSAT_NLP@DPIL-FIRE2016: Malayalam Paraphrase Detection

This paper describes an approach for paraphrase detection in Malayalam sentences developed as part of the FIRE 2016 Shared Task on Paraphrase Detection in Indian Languages. The task of paraphrase detection is to find a sentence with the same meaning as another sentence, expressed using the same or different words. This detection is done by a semantic approach which is language dependent. Individual words...

On-Demand Information Extraction

At present, adapting an Information Extraction system to new topics is an expensive and slow process, requiring some knowledge engineering for each new topic. We propose a new paradigm of Information Extraction which operates 'on demand' in response to a user's query. On-demand Information Extraction (ODIE) aims to completely eliminate the customization effort. Given a user’s query, the system ...

Using Multiple Metrics in Automatically Building Turkish Paraphrase Corpus

Paraphrasing is expressing similar meanings with different words in different order. In this sense it is viewed as translation in the same language. It is an important issue in natural language processing for automatic machine translation, question answering, text summarization and language generation. Studies in paraphrasing can be classified as paraphrase extraction, paraphrase generation, pa...

Publication date: 2012