A Left Corner Parser for Tree Adjoining Grammars
Abstract
Tabular parsers can be defined as deduction systems where formulas, called items, are sets of complete or incomplete constituents (Sikkel, 1997; Shieber, Schabes and Pereira, 1995). Formally, given an input string w = a1 … an with n ≥ 0 and a grammar G, a parser ℙ is a tuple (I, H, D) where I is a set of items, H is a set of hypotheses ([ai, i − 1, i] with 1 ≤ i ≤ n) that encodes the input string, and D is a set of deduction steps that determines how items are combined to deduce new items. The deductive approach allows us to establish relations between two parsers in a formal way. Among the most interesting relations between parsers are filters, because they can be used to improve the performance of tabular parsers in practice. Applying a filter to a parser yields a new parser that performs fewer deductions or contracts sequences of deductions into single deduction steps. One well-known example of a filter is the relation between Earley and Left Corner (LC) parsers for Context-Free Grammars (CFGs). An LC parser reduces the number of items deduced by Earley’s parser using the left corner relation. Given a CFG, the left corner of a non-terminal symbol A is the terminal or non-terminal symbol X if and only if there exists a production A → Xν in the grammar, where ν is a sequence of symbols. In the case of A → ε, we consider ε the left corner of A. The left corner relation allows us to rule out the prediction performed on X by an Earley parser. Most tabular parsers for Tree Adjoining Grammars (TAGs) are extensions of well-known tabular parsers for CFGs. For example, a number of tabular parsers for TAGs have been defined on the basis of Earley’s algorithm (Alonso Pardo et al., 1999; Lang, 1990; Joshi and Schabes, 1997; Nederhof, 1999).
Although several approaches have been described to improve the performance of TAG parsers, most of them based on restrictions of the formalism (Schabes and Waters, 1995) or on compilation into finite-state automata (Evans and Weir, 1998), to the best of our knowledge no attempt has been made to improve the practical performance of Earley-based parsers for TAGs by introducing the left-corner relation.
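To make the (I, H, D) view concrete, the following is a minimal sketch of an Earley recognizer for CFGs written as an agenda-driven deduction system. It is an illustration only, not the parser proposed in the paper: the toy grammar, the tuple encoding of items, and the names `hypotheses`, `earley_chart` and `recognizes` are all assumptions introduced here.

```python
from collections import deque

# Hypothetical toy grammar (an assumption): productions as (lhs, rhs) pairs.
GRAMMAR = [
    ("S", ("NP", "VP")),
    ("NP", ("det", "n")),
    ("VP", ("v", "NP")),
]


def hypotheses(words):
    """H: one hypothesis [a_i, i-1, i] per input symbol, stored as (a, i-1, i)."""
    return {(a, k, k + 1) for k, a in enumerate(words)}


def earley_chart(words, grammar=GRAMMAR, start="S"):
    """Close the item set under Earley's deduction steps (D).

    An item (lhs, rhs, dot, i, j) encodes [lhs -> rhs[:dot] . rhs[dot:], i, j].
    """
    hyps = hypotheses(words)
    agenda = deque((lhs, rhs, 0, 0, 0) for lhs, rhs in grammar if lhs == start)
    chart = set(agenda)

    def add(item):
        # Record a newly deduced item once and schedule it for processing.
        if item not in chart:
            chart.add(item)
            agenda.append(item)

    while agenda:
        lhs, rhs, dot, i, j = agenda.popleft()
        if dot < len(rhs):
            nxt = rhs[dot]
            # Scanner: combine the item with the hypothesis [nxt, j, j+1].
            if (nxt, j, j + 1) in hyps:
                add((lhs, rhs, dot + 1, i, j + 1))
            # Predictor: start every production of the symbol after the dot.
            for l2, r2 in grammar:
                if l2 == nxt:
                    add((l2, r2, 0, j, j))
            # Completer: use complete items for nxt that start at position j.
            for l2, r2, d2, i2, j2 in list(chart):
                if l2 == nxt and d2 == len(r2) and i2 == j:
                    add((lhs, rhs, dot + 1, i, j2))
        else:
            # Completer: advance items that were waiting for lhs at position i.
            for l2, r2, d2, i2, j2 in list(chart):
                if d2 < len(r2) and r2[d2] == lhs and j2 == i:
                    add((l2, r2, d2 + 1, i2, j))
    return chart


def recognizes(words, grammar=GRAMMAR, start="S"):
    """The input is accepted if a complete start item spans the whole string."""
    n = len(words)
    return any(
        lhs == start and dot == len(rhs) and i == 0 and j == n
        for lhs, rhs, dot, i, j in earley_chart(words, grammar, start)
    )
```

In this sketch the predictor fires on every symbol after the dot, which is the step an LC filter is meant to restrict.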
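The left corner relation for CFGs defined above can be sketched in a few lines. The grammar encoding and the function names below are assumptions made for illustration. The transitive closure is also an addition, commonly used when filtering predictions, and it simply chains the direct relation without looking past nullable symbols.

```python
EPSILON = "ε"

# Hypothetical toy grammar (an assumption): productions as (lhs, rhs) pairs.
GRAMMAR = [
    ("S", ("NP", "VP")),
    ("NP", ("det", "n")),
    ("NP", ()),            # NP -> ε
    ("VP", ("v", "NP")),
]


def left_corners(grammar):
    """Map each non-terminal A to every X with A -> X ν (ε for A -> ε)."""
    direct = {}
    for lhs, rhs in grammar:
        direct.setdefault(lhs, set()).add(rhs[0] if rhs else EPSILON)
    return direct


def left_corner_closure(grammar):
    """Transitive closure of the direct relation (an added assumption)."""
    closure = {a: set(xs) for a, xs in left_corners(grammar).items()}
    changed = True
    while changed:
        changed = False
        for xs in closure.values():
            for x in list(xs):
                # Anything that is a left corner of x is also reachable here.
                new = closure.get(x, set()) - xs
                if new:
                    xs |= new
                    changed = True
    return closure
```

For the toy grammar, `left_corners(GRAMMAR)["S"]` contains only `NP`, while the closure for `S` also contains `det` and `ε`.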
Similar resources
A Predictive Left-Corner Parser for Tree Adjoining Grammars
Tree Adjoining Grammar (TAG) is a formalism that has become very popular for the description of natural languages. However, the parsers for TAG that have been defined on the basis of Earley’s algorithm entail important computational costs. In this article, we propose to extend the left corner relation from Context Free Grammar (CFG) to TAG in order to define an efficient left corner parser ...
Head-Corner Parsing for TAG
This paper describes a bidirectional head-corner parser for (unification-based versions of) Lexicalized Tree Adjoining Grammars.
Practical experiments in parsing using Tree Adjoining Grammars
We present an implementation of a chart-based head-corner parsing algorithm for lexicalized Tree Adjoining Grammars. We report on some practical experiments where we parse 2250 sentences from the Wall Street Journal using this parser. In these experiments the parser is run without any statistical pruning; it produces all valid parses for each sentence in the form of a shared derivation forest. ...
Deterministic Left to Right Parsing of Tree Adjoining Languages
We define a set of deterministic bottom-up left to right parsers which analyze a subset of Tree Adjoining Languages. The LR parsing strategy for Context Free Grammars is extended to Tree Adjoining Grammars (TAGs). We use a machine, called Bottom-up Embedded Push Down Automaton (BEPDA), that recognizes in a bottom-up fashion the set of Tree Adjoining Languages (and exactly this set). Each parser...
Fast LR parsing Using Rich (Tree Adjoining) Grammars
We describe an LR parser of parts-of-speech (and punctuation labels) for Tree Adjoining Grammars (TAGs), that solves table conflicts in a greedy way, with a limited amount of backtracking. We evaluate the parser using the Penn Treebank, showing that the method yields very fast parsers with at least reasonable accuracy, confirming the intuition that LR parsing benefits from the use of rich grammars.