Modeling Valence Effects in Unsupervised Grammar Induction
Abstract
We extend the dependency grammar induction model of Klein and Manning (2004) to incorporate further valence information. Our extensions achieve significant improvements in the task of unsupervised dependency grammar induction. We use an expanded grammar which tracks higher orders of valence and allows each valence slot to be filled by a separate distribution rather than using one distribution for all slots. Additionally, we show that our performance improves if our grammar restricts the maximum number of attachments in each direction, forcing our system to focus on the common case. Taken together, these techniques constitute a 23.4% error reduction in dependency grammar induction over the model by Klein and Manning (2004) on English.
Similar resources
Improving Unsupervised Dependency Parsing with Richer Contexts and Smoothing
Unsupervised grammar induction models tend to employ relatively simple models of syntax when compared to their supervised counterparts. Traditionally, the unsupervised models have been kept simple due to tractability and data sparsity concerns. In this paper, we introduce basic valence frames and lexical information into an unsupervised dependency grammar inducer and show how this additional in...
Concavity and Initialization for Unsupervised Dependency Grammar Induction
We examine models for unsupervised learning with concave log-likelihood functions. We begin with the most well-known example, IBM Model 1 for word alignment (Brown et al., 1993), and study its properties, discussing why other models for unsupervised learning are so seldom concave. We then present concave models for dependency grammar induction and validate them experimentally. Despite their sim...
Memory-Bounded Left-Corner Unsupervised Grammar Induction on Child-Directed Input
This paper presents a new memory-bounded left-corner parsing model for unsupervised raw-text syntax induction, using unsupervised hierarchical hidden Markov models (UHHMM). We deploy this algorithm to shed light on the extent to which human language learners can discover hierarchical syntax through distributional statistics alone, by modeling two widely-accepted features of human language acqui...
Punctuation: Making a Point in Unsupervised Dependency Parsing
We show how punctuation can be used to improve unsupervised dependency parsing. Our linguistic analysis confirms the strong connection between English punctuation and phrase boundaries in the Penn Treebank. However, approaches that naively include punctuation marks in the grammar (as if they were words) do not perform well with Klein and Manning’s Dependency Model with Valence (DMV). Instead, w...
Unsupervised NLP and Human Language Acquisition: Making Connections to Make Progress
Natural language processing and cognitive science are two fields in which unsupervised language learning is an important area of research. Yet there is often little crosstalk between the two fields. In this talk, I will argue that considering the problem of unsupervised language learning from a cognitive perspective can lead to useful insights for the NLP researcher, while also showing how tool...