Using Grammatical Markov Models for Stylometric Analysis (CS224N: Natural Language Processing)

Authors

  • Erik Goldman
  • Abel Allison
Abstract

Researchers have tackled the problem of authorship attribution in several different ways, using various metrics to identify the author of an anonymous document given a set of writing samples from potential candidates. A common criticism of modern methodologies is content bias, which occurs when quantitative models identify similar content rather than similar styles. Content bias artificially inflates accuracy: a model produces good results on test data yet fails to identify authors in real-world applications. We examine several quantitative methods that isolate style by using grammar-based features rather than relying on models of word shape and frequency.
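
As a rough illustration of what a grammar-based feature can look like, the sketch below trains one first-order Markov model per candidate author over part-of-speech tag sequences and attributes a document to the author whose model gives it the highest log-likelihood. This is not the authors' implementation: the function names, the add-one smoothing, and the assumption that sentences arrive already tagged (here with Penn Treebank style tags) are all choices made for this example. Because the models see only tags and never words, they cannot pick up on shared vocabulary or topic.

```python
import math
from collections import Counter

START, END = "<s>", "</s>"  # hypothetical sentence-boundary symbols


def train_tag_bigram_model(tag_sequences):
    """Count tag bigrams and their left contexts for one author's tagged sentences."""
    bigrams, contexts, vocab = Counter(), Counter(), {START, END}
    for tags in tag_sequences:
        padded = [START] + list(tags) + [END]
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts, vocab


def log_likelihood(model, tags, vocab_size):
    """Log-probability of one tag sequence under a model, with add-one smoothing."""
    bigrams, contexts, _ = model
    padded = [START] + list(tags) + [END]
    return sum(
        math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))
        for prev, cur in zip(padded, padded[1:])
    )


def attribute(models, tag_sequences):
    """Return the best-scoring author and the per-author log-likelihoods."""
    # Use one shared tag vocabulary so every model is smoothed the same way.
    vocab = {START, END}
    for _, _, model_vocab in models.values():
        vocab |= model_vocab
    for tags in tag_sequences:
        vocab.update(tags)
    scores = {
        author: sum(log_likelihood(m, tags, len(vocab)) for tags in tag_sequences)
        for author, m in models.items()
    }
    return max(scores, key=scores.get), scores


# Toy, made-up training data: each sentence is a list of POS tags.
models = {
    "author_a": train_tag_bigram_model([["DT", "NN", "VBZ"], ["DT", "NN", "VBZ", "RB"]]),
    "author_b": train_tag_bigram_model([["PRP", "VBD", "DT", "JJ", "NN"], ["PRP", "VBD", "RB"]]),
}
best, scores = attribute(models, [["DT", "NN", "VBZ"]])
print(best)
```

In practice the tag sequences would come from a part-of-speech tagger or parser run over each author's writing samples, and a higher-order model or a different smoothing scheme may work better than this bigram version.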


Similar resources

Attention-based Recurrent Neural Networks for Question Answering

Machine Comprehension (MC) of text is an important problem in Natural Language Processing (NLP) research, and the task of Question Answering (QA) is a major way of assessing MC outcomes. One QA dataset that has gained immense popularity recently is the Stanford Question Answering Dataset (SQuAD). Successful models for SQuAD have all involved the use of Recurrent Neural Networks (RNNs), and most o...

Pattern Recognition Applied To The Acquisition Of A Grammatical Classification System From Unrestricted English Text

Within computational linguistics, the use of statistical pattern matching is generally restricted to speech processing. We have attempted to apply statistical techniques to discover a grammatical classification system from a Corpus of 'raw' English text. A discovery procedure is simpler for a simpler language model; we assume a first-order Markov model, which (surprisingly) is shown elsewhere t...


Statistical Language Modeling Using Grammatical Information

We propose to investigate the use of grammatical information to build improved statistical language models. Until recently, language models were primarily influenced by local lexical constraints. Today, language models often utilize longer range lexical information to aid in their predictions. All of these language models ignore grammatical considerations other than those induced by the statisti...


CS224N Assignment 4: Reading Comprehension

Building question answering systems with deep learning is a significant application of solving the complex natural language problem of reading comprehension. In our approach, we have analyzed literature on previous work, implemented and improved on specific models described, and compared the various models to analyze the effects of certain aspects of models on performance on the question answer...


Publication date: 2009