Data-driven Paraphrasing and Stylistic Harmonization

Author

  • Gerold Hintz
Abstract

This thesis proposal outlines the use of unsupervised data-driven methods for paraphrasing tasks. We motivate the development of knowledge-free methods with the guiding use case of multi-document summarization, which requires a domain-adaptable system for both the detection and generation of sentential paraphrases. First, we define a number of guiding research questions that will be addressed in the scope of this thesis. We then present ongoing work in unsupervised lexical substitution. An existing supervised approach is first adapted to a new language and dataset. We observe that supervised lexical substitution relies heavily on lexical semantic resources, and present an approach to overcome this dependency. We describe a method for unsupervised relation extraction, which we aim to leverage in lexical substitution as a replacement for knowledge-based resources.

Similar resources

Paraphrasing for Style

We present initial investigation into the task of paraphrasing language while targeting a particular writing style. The plays of William Shakespeare and their modern translations are used as a testbed for evaluating paraphrase systems targeting a specific style of writing. We show that even with a relatively small amount of parallel training data, it is possible to learn paraphrase models which...

A Controlled Language Approach to Text Optimisation in Technical Documentation

In this paper we propose a controlled language approach to text optimisation in the field of technical documentation. Within this approach, we use stylistic paraphrases as an instrument in the optimisation process. We present various categories of paraphrasing principles and describe their implementation in the corrector component of a controlled language checker.

ParaMetric: An Automatic Evaluation Metric for Paraphrasing

We present ParaMetric, an automatic evaluation metric for data-driven approaches to paraphrasing. ParaMetric provides an objective measure of quality using a collection of multiple translations whose paraphrases have been manually annotated. ParaMetric calculates precision and recall scores by comparing the paraphrases discovered by automatic paraphrasing techniques against gold standard alignm...

Discovering User Attribute Stylistic Differences via Paraphrasing

User attribute prediction from social media text has proven successful and useful for downstream tasks. In previous studies, differences in user trait language use have been limited primarily to the presence or absence of words that indicate topical preferences. In this study, we aim to find linguistic style distinctions across three different user attributes: gender, age and occupational class...

Tuning of a Knowledge-Driven Harmonization Model for Tonal Music

The paper presents and discusses direct and indirect tuning of a knowledge-driven harmonization model for tonal music. Automatic harmonization is a data analysis problem: an algorithm processes a music notation document and generates specific meta-data (harmonic functions). The proposed model could be seen as an Expert System with manually selected weights, based largely on the music theory. It...

Publication date: 2016