Optimizing Grammars for Minimum Dependency Length
Abstract
We examine the problem of choosing word order for a set of dependency trees so as to minimize total dependency length. We present an algorithm for computing the optimal layout of a single tree as well as a numerical method for optimizing a grammar of orderings over a set of dependency types. A grammar generated by minimizing dependency length in unordered trees from the Penn Treebank is found to agree surprisingly well with English word order, suggesting that dependency length minimization has influenced the evolution of English.
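To make the objective concrete, the sketch below computes total dependency length for a toy tree and finds a minimum-length word order by exhaustive search. It is an illustration only, not the algorithm described in the abstract: the brute-force search, the function names, the example sentence, and the choice to measure length as the distance in word positions between a head and its dependent are all assumptions made here, and the search does not restrict orderings in any way.

```python
from itertools import permutations


def total_dependency_length(order, heads):
    """Sum of |position(head) - position(dependent)| over all arcs.

    `order` is a sequence of words; `heads` maps each dependent to its head.
    (Assumed definition: distance is counted in word positions.)
    """
    position = {word: i for i, word in enumerate(order)}
    return sum(abs(position[dep] - position[head]) for dep, head in heads.items())


def brute_force_min_order(words, heads):
    """Try every permutation of the words; feasible only for very small trees."""
    return min(
        permutations(words),
        key=lambda order: total_dependency_length(order, heads),
    )


# Hypothetical toy tree: "saw" is the root; every other word maps to its head.
heads = {"John": "saw", "dog": "saw", "the": "dog", "big": "dog"}
words = ["saw", "John", "dog", "the", "big"]

english = ["John", "saw", "the", "big", "dog"]
english_length = total_dependency_length(english, heads)

best = brute_force_min_order(words, heads)
best_length = total_dependency_length(best, heads)

print(english_length)  # 7
print(best_length)     # 5
```

In this toy case the English order has total length 7, while the best achievable is 5. Exhaustive search grows factorially with sentence length, which is why a dedicated algorithm for laying out a single tree is needed in practice.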
Similar resources
Evolving Stochastic Context-Free Grammars from Examples Using a Minimum Description Length Principle
This paper describes an evolutionary approach to the problem of inferring stochastic context-free grammars from finite language samples. The approach employs a genetic algorithm, with a fitness function derived from a minimum description length principle. Solutions to the inference problem are evolved by optimizing the parameters of a covering grammar for a given language sample. We provide details...
Automatic Learning of Parallel Dependency Treelet Pairs
Induction of synchronous grammars from empirical data has long been an unsolved problem, even though generative synchronous grammars theoretically suit the machine translation task very well. This is mainly due to pervasive structural divergences between languages. This paper presents a statistical approach to learning dependency structure mappings from parallel corpora. The algorithm introdu...
Writing Weighted Constraints for Large Dependency Grammars
Implementing dependency grammar as a set of defeasible declarative rules has fundamental advantages such as expressiveness, automatic disambiguation, and robustness. Although an implementation and a successful large-scale grammar of German are available, so far the construction of constraint dependency grammars has not been described at length. We report on techniques that were used to write th...
Learning Stochastic Categorial Grammars
Stochastic categorial grammars (SCGs) are introduced as a more appropriate formalism for statistical language learners to estimate than stochastic context-free grammars. As a vehicle for demonstrating SCG estimation, we show, in terms of crossing rates and in coverage, that when training material is limited, SCG estimation using the Minimum Description Length Principle is preferable to SCG est...
Underspecified Semantics for Dependency Grammars
We link generative dependency grammars meeting natural modularity requirements with underspecified semantics of Discourse Plans intended to account for exactly those meaning components that grammars of languages mark for. We complete this link with a natural compilation of the modular dependency grammars into strongly equivalent efficiently analysed categorial dependency grammars.