Enriching Morphological Lexica through Unsupervised Derivational Rule Acquisition

Authors

  • Géraldine Walther
  • Lionel Nicolas
Abstract

In a morphological lexicon, each entry combines a lemma with a specific inflection class, often defined by a set of inflection rules. Such lexica therefore usually give a satisfactory account of inflectional operations. Derivational information, however, is usually poorly covered. In this paper we introduce a novel approach for enriching morphological lexica with derivational links between entries and with new entries derived from existing ones and attested in large-scale corpora, without relying on prior knowledge of possible derivational processes. To achieve this goal, we adapt the unsupervised morphological rule acquisition tool MorphAcq (Nicolas et al., 2010) so that it can take into account an existing morphological lexicon developed in the Alexina framework (Sagot, 2010), such as the Lefff for French and the Leffe for Spanish. We apply this tool to large corpora, thus uncovering morphological rules that model derivational operations in these two lexica. We use these rules to generate derivation links between existing entries, as well as to derive new entries from existing ones, adding those that are best attested in a large corpus. In addition to lexicon development and NLP applications that benefit from rich lexical data, such derivational information will be particularly valuable to linguists who rely on vast amounts of data to describe and analyse these specific morphological phenomena.
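The enrichment step described in the abstract (linking existing entries and adding well-attested new ones) can be illustrated with a minimal sketch. This is not MorphAcq or the Alexina tooling: the suffix-rewrite rule format, the function names `apply_rule` and `enrich`, the `min_count` threshold, and the toy French data are all assumptions made for illustration. The sketch also assumes the rules have already been acquired, and it checks corpus attestation on lemmas only rather than on inflected forms.

```python
def apply_rule(lemma, rule):
    """Apply a hypothetical suffix-rewrite rule (old_suffix, new_suffix).

    Returns the derived form, or None if the lemma does not end in old_suffix.
    """
    old, new = rule
    if not lemma.endswith(old):
        return None
    return lemma[: len(lemma) - len(old)] + new


def enrich(lexicon, rules, corpus_counts, min_count=2):
    """Split rule outputs into links between existing entries and new entries.

    lexicon: set of known lemmas
    rules: list of (old_suffix, new_suffix) pairs, assumed already acquired
    corpus_counts: dict mapping word forms to corpus frequencies
    min_count: illustrative attestation threshold (an assumption)
    """
    links = []        # (base, derived, rule), both lemmas already in the lexicon
    new_entries = []  # (derived, base, rule, count), derived lemma is new
    for lemma in sorted(lexicon):
        for rule in rules:
            derived = apply_rule(lemma, rule)
            if derived is None or derived == lemma:
                continue
            if derived in lexicon:
                links.append((lemma, derived, rule))
            else:
                count = corpus_counts.get(derived, 0)
                if count >= min_count:
                    new_entries.append((derived, lemma, rule, count))
    return links, new_entries


# Toy data, invented for this example.
lexicon = {"laver", "lavage", "coller"}
rules = [("er", "age")]
corpus_counts = {"collage": 5, "lavage": 12}

links, new_entries = enrich(lexicon, rules, corpus_counts)
```

Here `links` records that "lavage" can be derived from "laver", since both are already in the lexicon, while `new_entries` proposes "collage" as a new entry because it is produced by the rule and attested often enough in the toy counts.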

Similar resources

Derivational Morphology in Inheritance-based Lexica: Insights from Pāṇini

This paper demonstrates that the treatment of nominal derivational morphology in Pāṇini’s grammar of the Sanskrit language (ca. 500 BC) is based on an architecture strikingly similar to that of modern inheritance-based lexica. Specifically, Pāṇini adopts a single inheritance network with defaults to account concisely for intricate cases of affix homonymy and affix synonymy with minimal redun...

DerivBase.hr: A High-Coverage Derivational Morphology Resource for Croatian

Knowledge about derivational morphology has proven useful for a number of natural language processing (NLP) tasks. We describe the construction and evaluation of DERIVBASE.HR, a large-coverage morphological resource for Croatian. DERIVBASE.HR groups 100k lemmas from the web corpus hrWaC into 56k clusters of derivationally related lemmas, so-called derivational families. We focus on suffixal de...

Towards a Malay Derivational Lexicon: Learning Affixes Using Expectation Maximization

We propose an unsupervised training method to guide the learning of Malay derivational morphology from a set of morphological segmentations produced by a naïve morphological analyzer. Using a morphology-based language model, we first estimate the probability of a given segmentation. We train the model with EM to find the segmentation that maximizes the probability of each morpheme. We extract t...

Morphology Based Automatic Acquisition of Large-coverage Lexica

In this article, we introduce a new technique for constructing wide-coverage morphological lexica from large corpora and morphological knowledge, with an application to French. Basically, it relies on the idea that the existence of a hypothetical lemma can be guessed if several different words found in the corpus are best interpreted as morphological variants of this lemma. We first validated o...

Derivational Relations in Czech WordNet

In this paper we describe the enrichment of the Czech WordNet with derivational relations, which in highly inflectional languages like Czech form typical derivational nests (or subnets). Derivational relations are mostly of a semantic nature, and their regularity in Czech allows us to add them to the WordNet almost automatically. For this purpose we have used the derivational version of morphological analyze...

Publication date: 2011