Extracting Lexically Divergent Paraphrases from Twitter
Similar resources
Extracting Lexically Divergent Paraphrases from Twitter
We present MULTIP (Multi-instance Learning Paraphrase Model), a new model suited to identifying paraphrases within the short messages on Twitter. We jointly model paraphrase relations between word and sentence pairs and assume only sentence-level annotations during learning. Using this principled latent variable model alone, we achieve performance competitive with a state-of-the-art method whi...
Extracting Paraphrases from Aligned Corpora
The Problem: The expressiveness of human language allows people to express the same idea in many different ways; they may use different words to refer to the same entity or employ different phrases to describe the same concept. Thus, an effective information retrieval (IR) and question answering (QA) system must be equipped to handle these variations, both when processing documents and when fie...
Extracting Paraphrases from a Parallel Corpus
While paraphrasing is critical both for interpretation and generation of natural language, current systems use manual or semi-automatic methods to collect paraphrases. We present an unsupervised learning algorithm for identification of paraphrases from a corpus of multiple English translations of the same source text. Our approach yields phrasal and single word lexical paraphrases as well as sy...
Extracting Structural Paraphrases from Aligned Monolingual Corpora
We present an approach for automatically learning paraphrases from aligned monolingual corpora. Our algorithm works by generalizing the syntactic paths between corresponding anchors in aligned sentence pairs. Compared to previous work, structural paraphrases generated by our algorithm tend to be much longer on average, and are capable of capturing long-distance dependencies. In addition to a st...
Extracting Semantic Knowledge from Twitter
Twitter is the second largest social network after Facebook, and currently 140 million Tweets are posted on average each day. Tweets are messages with a maximum of 140 characters and cover all imaginable stories, ranging from simple activity updates and news coverage to opinions on arbitrary topics. In this work we argue that Twitter is a valuable data source for e-Participation related ...
Journal
Journal title: Transactions of the Association for Computational Linguistics
Year: 2014
ISSN: 2307-387X
DOI: 10.1162/tacl_a_00194