Adding Semantics to Data-Driven Paraphrasing
Authors
Abstract
We add an interpretable semantics to the paraphrase database (PPDB). To date, the relationship between the phrase pairs in the database has been weakly defined as approximately equivalent. We show that in fact these pairs represent a variety of relations, including directed entailment (little girl/girl) and exclusion (nobody/someone). We automatically assign semantic entailment relations to entries in PPDB using features derived from past work on discovering inference rules from text and semantic taxonomy induction. We demonstrate that our model assigns these entailment relations with high accuracy. In a downstream RTE task, our labels rival relations from WordNet and improve the coverage of a proof-based RTE system by
Similar resources
Adding Context to Semantic Data-Driven Paraphrasing
Recognizing lexical inferences between pairs of terms is a common task in NLP applications, which should typically be performed within a given context. Such context-sensitive inferences have to consider both term meaning in context as well as the fine-grained relation holding between the terms. Hence, to develop suitable lexical inference methods, we need datasets that are annotated with fine-g...
Adding Semantics to Data-Driven Paraphrasing: Supplementary Material
We use Amazon Mechanical Turk (MTurk) to collect labels for our phrase pairs. We show each pair to 5 independent workers, and ask each worker to use their best judgement to label the relationship that holds between the words. The workers were asked to choose one of 7 relations, or to mark that “I cannot tell.” The exact options given to the workers are shown in Figure 1. These options are based...
ParaMetric: An Automatic Evaluation Metric for Paraphrasing
We present ParaMetric, an automatic evaluation metric for data-driven approaches to paraphrasing. ParaMetric provides an objective measure of quality using a collection of multiple translations whose paraphrases have been manually annotated. ParaMetric calculates precision and recall scores by comparing the paraphrases discovered by automatic paraphrasing techniques against gold standard alignm...
SemEval-2010 Task 9: The Interpretation of Noun Compounds Using Paraphrasing Verbs and Prepositions
We present a brief overview of the main challenges in understanding the semantics of noun compounds and consider some known methods. We introduce a new task to be part of SemEval-2010: the interpretation of noun compounds using paraphrasing verbs and prepositions. The task is meant to provide a standard testbed for future research on noun compound semantics. It should also promote paraphrase-ba...