Discontinuous Constituents In Trees, Rules, And Parsing
Abstract
This paper discusses the consequences of allowing discontinuous constituents in syntactic representations and phrase-structure rules, and the resulting complications for a standard parser of phrase-structure grammar. It is argued, first, that discontinuous constituents seem inevitable in a phrase-structure grammar which is acceptable from a semantic point of view. It is shown that tree-like constituent structures with discontinuities can be given a precise definition which makes them just as acceptable for syntactic representation as ordinary trees. However, the formulation of phrase-structure rules that generate such structures entails quite intricate problems. The notions of linear precedence and adjacency are reexamined, and the concept of "n-place adjacency sequence" is introduced. Finally, the resulting form of phrase-structure grammar, called "Discontinuous Phrase-Structure Grammar", is shown to be parsable by an algorithm for context-free parsing with relatively minor adaptations. The paper describes the adaptations in the chart parser which was implemented as part of the TENDUM dialogue system.

I. Phrase-structure grammar and discontinuity

Context-free phrase-structure grammars (PSGs) have always been popular in computational linguistics and in the theory of programming languages because of their technical and conceptual simplicity and their well-established efficient parsability (Sheil, 1976; Tomita, 1985). In theoretical linguistics, it was generally believed until recently that natural language competence cannot be characterized adequately by a context-free grammar, especially in view of agreement phenomena and discontinuities (see e.g. Postal, 1964). However, in the early eighties Gazdar and others revived an idea, due to Harman (1963), of formulating phrase-structure rules not in terms of monadic category symbols, but in terms of feature bundles. With this richer conception of PSG it is not at all obvious whether natural languages can be described by context-free grammars (see e.g. Pullum, 1984). Generalized Phrase-Structure Grammar (GPSG; Gazdar et al., 1985) represents a recent attempt to provide a theoretically acceptable account of natural-language syntax in the form of a phrase-structure grammar.

Apart from being important in its own right, phrase-structure grammar also plays an important part in more complex grammar formalisms that have been developed in linguistics: in classical Transformational-Generative Grammar the base component was assumed to be a PSG; in Lexical-Functional Grammar a PSG is supposed to generate c-structures; and in Functional Unification Grammar context-free rules generate the input structures for the unification operation (Kay, 1979).

Phrase-structure grammar has one more attractive side, apart from its technical/conceptual simplicity and its computational efficiency, namely that it seems to fit the semantic requirement of compositionality very well. The compositionality principle is the thesis that the meaning of a natural-language expression is determined by the combination of (a) the meanings of its parts and (b) its syntactic structure. For a grammar which associates meanings with the expressions of the language, this entails the requirement that the syntactic rules should characterize the internal structure of every expression in a "meaningful" way, one which allows the computation of its meaning. In this way, semantic considerations can be used to prefer one syntactic analysis to another. PSGs are a useful tool for the formulation of syntactic rules that meet this requirement, as phrase-structure rules by their very nature provide a recursive description of the constituent structure
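The compositionality principle stated above can be illustrated with a minimal sketch. It is not taken from the paper: the `Node` class, the toy `LEXICON`, the rule names, and the `SEMANTICS` table are all hypothetical and chosen only for illustration. The sketch pairs each phrase-structure rule with one semantic operation, so that the meaning of a constituent is computed from (a) the meanings of its parts and (b) the rule that combined them.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Node:
    """A constituent: the rule that built it plus its sub-constituents or words."""
    rule: str
    children: List[Union["Node", str]]


# Hypothetical toy lexicon: the meaning of each word.
LEXICON = {
    "John": "john",
    "sleeps": lambda subject: f"sleep({subject})",
}

# Hypothetical pairing of each syntactic rule with a semantic operation.
SEMANTICS = {
    "S -> NP VP": lambda np, vp: vp(np),  # apply the predicate to the subject
    "NP -> Name": lambda name: name,      # pass the meaning up unchanged
    "VP -> V": lambda verb: verb,         # pass the meaning up unchanged
}


def meaning(constituent):
    """Compute a meaning recursively from the parts and the rule that joins them."""
    if isinstance(constituent, str):
        return LEXICON[constituent]
    parts = [meaning(child) for child in constituent.children]
    return SEMANTICS[constituent.rule](*parts)


tree = Node("S -> NP VP", [
    Node("NP -> Name", ["John"]),
    Node("VP -> V", ["sleeps"]),
])
print(meaning(tree))  # sleep(john)
```

Because the semantic operation is looked up by rule, two different syntactic analyses of the same string would in general yield different meanings, which is the sense in which semantic considerations can favour one analysis over another.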
Similar resources
Discontinuous Data-Oriented Parsing: A mildly context-sensitive all-fragments grammar
Recent advances in parsing technology have made treebank parsing with discontinuous constituents possible, with parser output of competitive quality (Kallmeyer and Maier, 2010). We apply Data-Oriented Parsing (DOP) to a grammar formalism that allows for discontinuous trees (LCFRS). Decisions during parsing are conditioned on all possible fragments, resulting in improved performance. Despite the...
Discontinuous Data-Oriented Parsing through Mild Context-Sensitivity
It has long been argued that incorporating a notion of discontinuity in phrase-structure is desirable, given phenomena such as topicalization and extraposition, and particular features of languages such as cross-serial dependencies in Dutch and the German Mittelfeld. Up until recently this was mainly a theoretical topic, but advances in parsing technology have made treebank parsing with discont...
Parsing as Reduction
We reduce phrase-representation parsing to dependency parsing. Our reduction is grounded on a new intermediate representation, “head-ordered dependency trees,” shown to be isomorphic to constituent trees. By encoding order information in the dependency labels, we show that any off-the-shelf, trainable dependency parser can be used to produce constituents. When this parser is non-projective, we ...
Improving the Efficiency of Parsing Discontinuous Constituents
A prominent tradition within the framework of Head-Driven Phrase Structure Grammar (HPSG, Pollard and Sag 1994) has argued on linguistic grounds for analyses which license so-called discontinuous constituents (Reape 1993; Kathol 1995; Richter and Sailer 2001; Müller 1999a; Penn 1999; Donohue and Sag 1999; Bonami et al. 1999), joining researchers in other linguistic frameworks, including Depende...
Computational semantics in type theory
This paper aims to show how Montague-style grammars can be completely formalized and thereby declaratively implemented by using the Grammatical Framework GF. The implementation covers the fundamental operations of Montague’s PTQ model: the construction of analysis trees, the linearization of trees into strings, and the interpretation of trees as logical formulas. Moreover, a parsing algorithm i...