Mining Paraphrasal Typed Templates from a Plain Text Corpus

نویسندگان

  • Or Biran
  • Terra Blevins
  • Kathleen McKeown
چکیده

Finding paraphrases in text is an important task with implications for generation, summarization and question answering, among other applications. Of particular interest to those applications is the specific formulation of the task where the paraphrases are templated, which provides an easy way to lexicalize one message in multiple ways by simply plugging in the relevant entities. Previous work has focused on mining paraphrases from parallel and comparable corpora, or mining very short sub-sentence synonyms and paraphrases. In this paper we present an approach which combines distributional and KB-driven methods to allow robust mining of sentence-level paraphrasal templates, utilizing a rich type system for the slots, from a plain text corpus.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A language-independent method for the extraction of RDF verbalization templates

With the rise of the Semantic Web more and more data become available encoded using the Semantic Web standard RDF. RDF is faced towards machines: designed to be easily processable by machines it is difficult to be understood by casual users. Transforming RDF data into human-comprehensible text would facilitate non-experts to assess this information. In this paper we present a languageindependen...

متن کامل

Proof Mining with Dependent Types

Several approaches exist to data-mining big corpora of formal proofs. Some of these approaches are based on statistical machine learning, and some – on theory exploration. However, most are developed for either untyped or simply-typed theorem provers. In this paper, we present a method that combines statistical data mining and theory exploration in order to analyse and automate proofs in depend...

متن کامل

Text mining: intermediate forms on knowledge representation

In this paper we review the main intermediate forms proposed in text mining, and we briefly study some fuzzy counterparts. The concept of intermediate form applies to any knowledge representation employed to represent in a structured way the semantic content of a text corpus. Intermediate forms play a central role in the text mining process since it is necessary to transform plain text into a f...

متن کامل

Template Induction over Unstructured Email Corpora

Unsupervised template induction over email data is a central component in applications such as information extraction, document classification, and auto-reply. The benefits of automatically generating such templates are known for structured data, e.g. machine generated HTML emails. However much less work has been done in performing the same task over unstructured email data. We propose a techni...

متن کامل

Akshaya: A Framework for Mining General Knowledge Semantics From Unstructured Text

We report a tool called Akshaya, which implements a framework to mine four types of “general knowledge semantics” (analytical semantics) from unstructured text. The semantics being mined are semantic siblings, topical anchors, topic expansion and topical markers. The framework provides options to embed more such general knowledge semantic mining algorithms into it. We use a term co-occurrence g...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016