Constraint-Based Lexica
Abstract
As the field of generative linguistics has developed, the lexicon has taken on an increasingly important role in the description of both idiosyncratic and regular properties of language. Always viewed as a natural home for exceptions, the lexicon was given relatively little work in the early years of transformational grammar. Then Chomsky (1970) proposed that similarities in the structure of deverbal noun phrases and sentences could be expressed in terms of a lexical relationship between the verb and its nominalization. Jackendoff (1975) characterized further lexical regularities in both morphology and semantics, and Bresnan (1976, 1982) pioneered the development of a syntactic framework (Lexical Functional Grammar) in which central grammatical phenomena such as passivization could be explained within the lexicon. A parallel line of work by Gazdar (1981), Generalized Phrase Structure Grammar (GPSG), sought to provide a nontransformational syntactic framework by employing metarules over a context-free grammar. Gazdar et al. (1985) constrained the power of those metarules by restricting them to lexically headed phrase structure rules. Pollard and Sag (1987, 1994) built on the work in GPSG, outlining the more radically lexicalist framework of Head-driven Phrase Structure Grammar (HPSG). HPSG abandons construction-specific phrase structure rules in favor of a small number of rule schemata that interact with a more richly articulated lexicon to capture the relevant syntactic generalizations. As a well-known and widely used constraint-based grammar formalism, HPSG will serve us well in this chapter by providing a precise linguistic framework within which we can organize the relevant data and examine the technical devices available for analyzing that data. For those who are not familiar with the notation, we first provide a brief introduction to this framework.