Modeling Valence Effects in Unsupervised Grammar Induction

نویسنده

  • David McClosky
چکیده

We extend the dependency grammar induction model of Klein and Manning (2004) to incorporate further valence information. Our extensions achieve significant improvements in the task of unsupervised dependency grammar induction. We use an expanded grammar which tracks higher orders of valence and allows each valence slot to be filled by a separate distribution rather than using one distribution for all slots. Additionally, we show that our performance improves if our grammar restricts the maximum number of attachments in each direction, forcing our system to focus on the common case. Taken together, these techniques constitute a 23.4% error reduction in dependency grammar induction over the model by Klein and Manning (2004) on English.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improving Unsupervised Dependency Parsing with Richer Contexts and Smoothing

Unsupervised grammar induction models tend to employ relatively simple models of syntax when compared to their supervised counterparts. Traditionally, the unsupervised models have been kept simple due to tractability and data sparsity concerns. In this paper, we introduce basic valence frames and lexical information into an unsupervised dependency grammar inducer and show how this additional in...

متن کامل

Concavity and Initialization for Unsupervised Dependency Grammar Induction

We examine models for unsupervised learning with concave log-likelihood functions. We begin with the most well-known example, IBM Model 1 for word alignment (Brown et al., 1993), and study its properties, discussing why other models for unsupervised learning are so seldom concave. We then present concave models for dependency grammar induction and validate them experimentally. Despite their sim...

متن کامل

Memory-Bounded Left-Corner Unsupervised Grammar Induction on Child-Directed Input

This paper presents a new memory-bounded left-corner parsing model for unsupervised raw-text syntax induction, using unsupervised hierarchical hidden Markov models (UHHMM). We deploy this algorithm to shed light on the extent to which human language learners can discover hierarchical syntax through distributional statistics alone, by modeling two widely-accepted features of human language acqui...

متن کامل

Punctuation: Making a Point in Unsupervised Dependency Parsing

We show how punctuation can be used to improve unsupervised dependency parsing. Our linguistic analysis confirms the strong connection between English punctuation and phrase boundaries in the Penn Treebank. However, approaches that naively include punctuation marks in the grammar (as if they were words) do not perform well with Klein and Manning’s Dependency Model with Valence (DMV). Instead, w...

متن کامل

Unsupervised NLP and Human Language Acquisition: Making Connections to Make Progress

Natural language processing and cognitive science are two fields in which unsupervised language learning is an important area of research. Yet there is often little crosstalk between the two fields. In this talk, I will argue that considering the problem of unsupervised language learning from a cognitive perspective can lead to useful insights for the NLP researcher, while also showing how tool...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008