Using a Meta-Grammar for LTAG Korean Grammar
Abstract
Generating elementary trees for wide-coverage Lexicalized Tree Adjoining Grammars (LTAG) is one of the major concerns of the TAG project. The Korean LTAG developed in (Han C.-H. et al., 2000) is not sufficient to handle various syntactic structures. We therefore propose a Korean Meta-Grammar (KMG) to generate and maintain a large number of elementary tree schemata. Describing the Korean MG with more precise tree families and with classes encoding Korean syntactic properties leads to a larger coverage capacity for Korean LTAG.

1 Motivations for this work

The first LTAG grammar for Korean (KTAG) was proposed in (Han C.-H. et al., 2000). Few grammars for Korean exist, and the one for TAG is quite small, with limited coverage. Our goal is to generate a wider-coverage KTAG using a now well-established grammar development technique. We propose using the Meta-Grammar (MG) for KTAG for the following reasons:

a) The MG was successfully used to generate wide-coverage grammars for French and a medium-size TAG for Italian (Candito 1996; Candito 1999), within the FTAG project at the Univ. of Paris 7 (Abeillé, 2002). The use of the MG to generate real-size grammars has therefore already been established [1].

b) The MG has also been used to generate wide-coverage grammars for frameworks such as LFG (Clement and Kinyon, 2003). This strongly suggests that the MG is portable to non-TAG frameworks, unlike other approaches such as Metarules. For French, the MG is also used for the syntax of nouns and adjectives (see (Barrier and Barrier, 2003); (Barrier and Barrier, 2004)).

c) The MG was also used to generate test-suite sentences for German (Kinyon and Rambow, 2003), as well as a medium-size grammar for German (Gerdes, 2002). This specific use of the MG for text generation shows that over-generation is not a real issue, i.e., not more than for any standard grammar development technique [2].

Moreover, the MG is particularly well suited to relatively “free word order” languages such as Korean and German because it supports underspecification, a mechanism used to handle phenomena such as scrambling.

[1] Even the 5000-tree FTAG was successfully used in the GTAG text generation project.

TAG+7: Seventh International Workshop on Tree Adjoining Grammar and Related Formalisms. May 20-22, 2004, Vancouver, BC, CA. Pages 211-218.

2 What is a Meta Grammar

The notion of Meta Grammar (MG) was originally introduced to automatically generate wide-coverage TAGs for French and Italian. It adds a compact, hierarchical layer of linguistic description that imposes a general organization on the syntactic information shared by the different elementary tree families, in a three-dimensional inheritance network. The elementary structures of an MG are classes organized in an inheritance graph. The classes in the graph are ordered from more general to more specific; e.g., the class TRANSITIVE-VERB inherits information from the more general class VERB. The three dimensions of the hierarchy in an MG represent the following information (Candito, 1999):

• In Dimension 1, each terminal class encodes an initial subcategorization, i.e., a list of arguments associated with a given head, with an initial syntactic function for each, e.g., a subject and an object for a transitive verbal anchor.
• In Dimension 2, each terminal class encodes a list of final functions, i.e., a possible change in the initial grammatical functions from Dimension 1, including the possibility of increasing or decreasing the number of syntactic functions to be realized, e.g., adding an argument for the causative, or erasing an argument for the agentless passive.
• In Dimension 3, each terminal class encodes the surface realization of a final syntactic function. The category and the word order are selected.

Each class in the hierarchy is associated with a partial description of a tree.
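The inheritance of information from general to specific classes can be illustrated with a minimal Python sketch. It is not the MG compiler: the `MGClass` type, its fields, and the string "anchor category is V" are assumptions made for illustration. Only the class names VERB and TRANSITIVE-VERB and the subject/object functions of a transitive verbal anchor come from the text.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MGClass:
    """A node of the inheritance graph (hypothetical representation)."""

    name: str
    parents: list[MGClass] = field(default_factory=list)
    # Information stated locally by this class, as free-form strings.
    local_info: frozenset[str] = frozenset()

    def all_info(self) -> frozenset[str]:
        """Return local information plus everything inherited from ancestors."""
        inherited: frozenset[str] = frozenset()
        for parent in self.parents:
            inherited |= parent.all_info()
        return inherited | self.local_info


# Assumed content for VERB; the text only says that it is a general class.
verb = MGClass("VERB", local_info=frozenset({"anchor category is V"}))

# A transitive verbal anchor has a subject and an object (from the text).
transitive_verb = MGClass(
    "TRANSITIVE-VERB",
    parents=[verb],
    local_info=frozenset({"initial function: subject", "initial function: object"}),
)

print(sorted(transitive_verb.all_info()))
```

Sets are used here so that information reached through several inheritance paths is only counted once; an actual MG class carries a partial tree description rather than strings.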
The partial descriptions of trees associated with MG classes, called quasi-trees, encode father, dominance, equality and precedence relations between tree nodes. A well-formed tree is generated by inheriting information from exactly one terminal class from Dimension 1, one terminal class from Dimension 2, and n terminal classes from Dimension 3. For instance, in order to generate the elementary tree for "By whom will Mary be accompanied?", an MG compiler creates one crossing class which inherits from a strict-transitive class in Dimension 1, from a personal-full-passive class in Dimension 2, and from a Wh-questioned-By-complement class in Dimension 3.

3 Hierarchical Descriptions in Korean Meta-Grammar (KMG) for LTAG

The Korean LTAG (Han C.-H. et al., 2000) consists of 15 tree families (see Fig. 1). In total, 289 elementary trees have been created.

Tree Families
  8 for Verbs:      Tnx0V, Tnx0nx1V, Tnx0nxp1V, Tnx0nxp1nx2V, Tnx0s1V, Tnx0nxp1s2V, Tnx0nxNOM1V, Tnx0nx1CO
  3 for Adjectives: Tnx0A, Tnx0nxp1A, Tnx0nxNOM1A
  4 for Structures: Declarative and Relative Constructions, Gerund and Adverbial Clauses
Figure 1: Tree Families in (Han C.-H. et al., 2000)

Han C.-H. et al. (2000) noted that the number of elementary trees was expected to increase in order to handle more syntactic phenomena: passive, causative, resultative, light verb constructions, coordination constructions, and scrambling. In particular, the most important concern for the coverage of a Korean grammar is its ability to handle scrambling, because free word order is likely to cause an enormous expansion in the number of elementary trees due to permutations of arguments.

3.1 Initial Syntactic Functions in KMG

Lexicalized TAG elementary trees represent extended projections of lexical items and encapsulate all syntactic arguments of a lexical anchor. We describe the initial subcategorization frames for Korean verbs, which will be encoded in each elementary tree. The tree families proposed here cover those of (Han C.-H. et al., 2000).

Before presenting the initial subcategorization frames, we explain the linguistic choices made for KMG. As defective verbs, auxiliary verbs, causative and/or passive auxiliary verbs, and raising verbs are not represented by sentential structures; i.e., they have a reduced projection to VP and not to S. We use the syntactic category SNP (sentential noun phrase) for the complex noun phrase, and the syntactic category GNP (gerund noun phrase) for the gerund construction. When sentential clauses appear in an argument position, they become either complex noun phrases, as in (1), or gerund noun phrases, as in (2). Head items in SNP and GNP take a case marker, as a lexical head noun in NP does [3]. SNP and GNP behave as nouns as a whole, but in contrast to NP nodes, modifiers for nouns cannot adjoin at an SNP (complex NP) or a GNP (gerund NP) node. We have specified SNP↓ and GNP↓ nodes in the tree families of predicates that subcategorize for them.

Complex NPs are represented by an initial tree whose root node is SNP. It is anchored by the head dependent noun and has a substitution node S for the clause that modifies the head noun (Fig. 2(a)). Gerund NPs are represented by an initial tree whose root node is GNP and which is anchored by the head verb that represents the appropriate subcategorization frames (Fig. 2(b)).

(1) Minho-ga   [snp yaksok-e       neujossda-n-sasil-eul]  arassda.
    Minho-nom       appointment-pp be.late-adn.FACT.acc    realize
    ‘Minho realizes that he is late for the appointment.’

(2) Minho-ga   [gnp sakwa-reul  meok-gi-reul]         silheohanda.
    Minho-nom       apple-acc   eat-nominalizer.acc   dislike
    ‘Minho does not like to eat apples.’
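Returning to the class crossing described in Section 2, the following is a minimal Python sketch of the rule that a crossing class inherits from exactly one terminal class of Dimension 1, exactly one of Dimension 2, and n of Dimension 3. The `TerminalClass` type and the `cross` function are hypothetical names introduced for illustration; they are not part of any MG compiler, and the sketch only checks the combination rule without building quasi-trees.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TerminalClass:
    """A terminal class of the hierarchy (hypothetical representation)."""

    name: str
    dimension: int  # 1, 2 or 3


def cross(classes: list[TerminalClass]) -> tuple[str, ...]:
    """Return the parent names of a crossing class, ordered by dimension.

    Enforces the rule from the text: exactly one terminal class from
    Dimension 1, exactly one from Dimension 2, and any number n from
    Dimension 3.
    """
    by_dim: dict[int, list[TerminalClass]] = {1: [], 2: [], 3: []}
    for cls in classes:
        if cls.dimension not in by_dim:
            raise ValueError(f"unknown dimension: {cls.dimension}")
        by_dim[cls.dimension].append(cls)
    if len(by_dim[1]) != 1 or len(by_dim[2]) != 1:
        raise ValueError("need exactly one class each from dimensions 1 and 2")
    return tuple(c.name for d in (1, 2, 3) for c in by_dim[d])


# The crossing for "By whom will Mary be accompanied?" (class names from the text).
crossing = cross([
    TerminalClass("strict-transitive", 1),
    TerminalClass("personal-full-passive", 2),
    TerminalClass("Wh-questioned-By-complement", 3),
])
print(crossing)
```

A real compiler would additionally merge the quasi-trees of the parent classes and reject crossings whose constraints are inconsistent; that step is omitted here.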
Similar resources
Statistical Morphological Tagging and Parsing of Korean with an LTAG Grammar
This paper describes a lexicalized tree adjoining grammar (LTAG) based parsing system for Korean which combines corpus-based morphological analysis and tagging with a statistical parser. Part of the challenge of statistical parsing for Korean comes from the fact that Korean has free word order and a complex morphological system. The parser uses an LTAG grammar which is automatically extracted u...
The Interaction of Gender with Text Enhancement and Meta-cognitive Grammar Instruction on Learning and Recall of English Grammar
The current research was an effort to study the interaction of gender with text enhancement and meta-cognitive grammar instruction on learning and recall of English grammar. To this end, two groups of students consisting of 51 learners from both genders were formed. The participants were 51 male and 51 female learners. The 51 participants of each gender were further divided into two groups. The...
Grammar conversion from LTAG to HPSG
We propose an algorithm for the conversion of grammars from an arbitrary FB-LTAG grammar into a strongly equivalent HPSG-style grammar. Our algorithm converts LTAG elementary trees into HPSG feature structures by encoding the tree structures in stacks. A set of pre-determined rules manipulate the stack to emulate substitution and adjunction. We have used our algorithm to obtain HPSG-style gramm...
Resource sharing among HPSG and LTAG communities by a method of grammar conversion from FB-LTAG to HPSG
This paper describes the RenTAL system, which enables sharing resources in LTAG and HPSG formalisms by a method of grammar conversion from an FB-LTAG grammar to a strongly equivalent HPSG-style grammar. The system is applied to the latest version of the XTAG English grammar. Experimental results show that the obtained HPSG-style grammar successfully worked with an HPSG parser, and achieved a dr...