Guiding an HPSG Parser using Semantic and Pragmatic Expectations
Abstract
Efficient natural language generation has been successfully demonstrated using highly compiled knowledge about speech acts and their related social actions. A design and prototype implementation of a parser which utilizes this same pragmatic knowledge to efficiently guide parsing is presented. Such guidance is shown to prune the search space and thus avoid needless processing of pragmatically unlikely constituent structures.

INTRODUCTION

The use of purely syntactic knowledge during the parse phase of natural language understanding yields considerable local ambiguity (consideration of impossible subconstituents) as well as global ambiguity (construction of syntactically valid parses not applicable to the socio-pragmatic context). This research investigates bringing socio-pragmatic knowledge to bear during the parse, while maintaining a domain-independent grammar and parser. The particular technique explored uses knowledge about the pragmatic context to order the consideration of proposed parse constituents, thus guiding the parser to consider the best solutions (with respect to the expectations) first. Such a search may be classified as a best-first search. The theoretical models used to represent the pragmatic knowledge in this study are based on Halliday's Systemic Grammar and a model of the pragmatics of conversation. The model used to represent the syntax and domain-independent semantic knowledge is Head-driven Phrase Structure Grammar (HPSG).

BACKGROUND

Patten, Geis and Becker (1992) demonstrate the application of knowledge compilation to achieve the rapid generation of natural language. Their mechanism is based on Halliday's systemic networks and on Geis' theory of the pragmatics of conversation. A model of conversation using principled compilation of pragmatic knowledge and other linguistic knowledge is used to permit the application of pragmatic inference without expensive computation.
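The best-first ordering described in the introduction can be pictured as a priority agenda in a chart parser. The following is only a minimal sketch under stated assumptions, not the implementation reported in the paper: the `Edge` and `Agenda` classes, the representation of pragmatic expectations as a set of feature labels, and the overlap-count scoring function are all hypothetical choices made for illustration.

```python
import heapq
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class Edge:
    """A proposed parse constituent (hypothetical structure)."""
    start: int
    end: int
    category: str
    features: FrozenSet[str] = frozenset()


def expectation_score(edge: Edge, expected: FrozenSet[str]) -> int:
    """Assumed scoring: number of expected pragmatic features the edge carries."""
    return len(edge.features & expected)


class Agenda:
    """Priority agenda that yields the best-matching edges first."""

    def __init__(self, expected: Iterable[str]) -> None:
        self._expected = frozenset(expected)
        self._heap: List[Tuple[int, int, Edge]] = []
        self._counter = 0  # tie-breaker: equal scores pop in insertion order

    def push(self, edge: Edge) -> None:
        score = expectation_score(edge, self._expected)
        # heapq is a min-heap, so the score is negated to pop the highest first.
        heapq.heappush(self._heap, (-score, self._counter, edge))
        self._counter += 1

    def pop(self) -> Edge:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


# Usage: three competing analyses of the same span, scored against the context.
agenda = Agenda(expected={"polite", "request"})
agenda.push(Edge(0, 3, "S", frozenset({"statement"})))
agenda.push(Edge(0, 3, "S", frozenset({"polite", "request"})))
agenda.push(Edge(0, 3, "S", frozenset({"request"})))
order = [agenda.pop().features for _ in range(len(agenda))]
```

In this sketch the edge that matches both expected features is considered first and the edge that matches none is considered last, so a parser that stops at the first complete analysis never has to expand the pragmatically unlikely candidates.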
A pragmatic component is used to model social action, including speech acts, and to utilize conventions of usage involving such features of context as politeness, register, and stylistic features. These politeness features are critical to the account of indirect speech acts. This pragmatic knowledge is compiled into coarse-grained knowledge in the form of a classification hierarchy. A planner component uses knowledge about conditions which need to be satisfied (discourse goals) to produce a set of pragmatic features which characterize a desired utterance. These features are mapped into the systemic grammar (using compiled knowledge), which is then used to realize the actual utterance.

The syntactic/semantic component used in this study is a parser based on the HPSG (Head-driven Phrase Structure Grammar) theory of grammar (Pollard and Sag, 1992). HPSG models all linguistic constituents in terms of partial information structures called feature structures. Linguistic signs incorporate simultaneous representation of phonological, syntactic, and semantic attributes of grammatical constituents. HPSG is a lexicalized theory, with the lexical definitions, rather than phrase structure rules, specifying most configurational constraints. Control (such as subcategorization, for example) is asserted by the use of HPSG constraints: partially filled-in feature structures, called feature descriptions, which constrain possible HPSG feature structures by asserting specific attributes and/or labels. An HPSG-based chart parser, under development at the author's university, was used for the implementation part of

Research funded by The Ohio State Center for Cognitive Science and The Ohio State Departments of Computer and Information Science and Linguistics.
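The role of feature descriptions as constraints on feature structures, discussed in the BACKGROUND section, can be illustrated with a small unification routine. This is a simplified sketch and not the parser used in the study: feature structures are assumed to be plain nested dictionaries with string atoms, structure sharing (reentrancy) and typing are ignored, and the attribute names (`HEAD`, `CAT`, `AGR`, `PER`, `NUM`, `GEN`, `PHON`) are illustrative rather than the actual HPSG feature geometry.

```python
from typing import Any, Dict, Optional

FeatureStructure = Dict[str, Any]


def unify(a: Any, b: Any) -> Optional[Any]:
    """Return the unification of a and b, or None if they conflict.

    Assumes atomic values are non-None strings and that there is no reentrancy.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)  # shallow copy so the inputs are left unchanged
        for key, b_value in b.items():
            if key in result:
                merged = unify(result[key], b_value)
                if merged is None:
                    return None  # conflicting values under the same attribute
                result[key] = merged
            else:
                result[key] = b_value  # information present only in b is added
        return result
    # Atoms (or an atom against a structure) unify only if they are equal.
    return a if a == b else None


# A partial description, e.g. what a verb might require of its subject.
subject_description: FeatureStructure = {
    "HEAD": {"CAT": "noun", "AGR": {"PER": "3", "NUM": "sg"}},
}

she: FeatureStructure = {
    "PHON": "she",
    "HEAD": {"CAT": "noun", "AGR": {"PER": "3", "NUM": "sg", "GEN": "fem"}},
}
they: FeatureStructure = {
    "PHON": "they",
    "HEAD": {"CAT": "noun", "AGR": {"PER": "3", "NUM": "pl"}},
}

compatible = unify(subject_description, she)
incompatible = unify(subject_description, they)
```

Here `compatible` is a structure combining the information from both inputs, while `incompatible` is `None` because the `NUM` values clash. A description thus admits any structure that supplies at least the asserted attributes with compatible values.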
Related resources
Corpus-Oriented Development of Japanese HPSG Parsers
This paper reports the corpus-oriented development of a wide-coverage Japanese HPSG parser. We first created an HPSG treebank from the EDR corpus by using heuristic conversion rules, and then extracted lexical entries from the treebank. The grammar developed using this method attained wide coverage that could hardly be obtained by conventional manual development. We also trained a statistical p...
Semantic Role Labeling of Persian Sentences Using a Memory-Based Learning Approach
Extracting semantic roles is one of the major steps in representing text meaning. It refers to finding the semantic relations between a predicate and syntactic constituents in a sentence. In this paper we present a semantic role labeling system for Persian, using a memory-based learning model and standard features. Our proposed system implements a two-phase architecture to first identify...
Feature Engineering in Persian Dependency Parser
A dependency parser is one of the most important fundamental tools in natural language processing; it extracts the structure of sentences and determines the relations between words based on dependency grammar. Dependency parsing is well suited to free-word-order languages such as Persian. In this paper, a data-driven dependency parser has been developed with the help of a phrase-structure parser fo...
Towards efficient probabilistic HPSG parsing: integrating semantic and syntactic preference to guide the parsing
We present a framework for efficient parsing with probabilistic Head-driven Phrase Structure Grammars (HPSG). The parser can integrate semantic and syntactic preference into figures-of-merit (FOMs) with the equivalence class function during parsing, and reduce the search space by using the integrated FOMs. This paper presents a CKY algorithm with this function and experimental results of beam t...
Linking Flat Predicate Argument Structures
This report presents an approach to enriching flat and robust predicate argument structures with more fine-grained semantic information, extracted from underspecified semantic representations and encoded in Minimal Recursion Semantics (MRS). Such representations are provided by a hand-built HPSG grammar with a wide linguistic coverage. A specific semantic representation, called linked predicate...