Compiling a Partition-Based Two-Level Formalism

Authors

  • Edmund Grimley-Evans
  • George Anton Kiraz
  • Stephen G. Pulman
Abstract

This paper describes an algorithm for the compilation of a two (or more) level orthographic or phonological rule notation into finite state transducers. The notation is an alternative to the standard one deriving from Koskenniemi's work: it is believed to have some practical descriptive advantages, and is quite widely used, but has a different interpretation. Efficient interpreters exist for the notation, but until now it has not been clear how to compile to equivalent automata in a transparent way. The present paper shows how to do this, using some of the conceptual tools provided by Kaplan and Kay's regular relations calculus.

* Supported by SERC studentship no. 92313384.
† Supported by a Benefactors' Studentship from St John's College.

1 Introduction

Two-level formalisms based on that introduced by (Koskenniemi, 1983) (see also (Ritchie et al., 1992) and (Kaplan and Kay, 1994)) are widely used in practical NLP systems, and are deservedly regarded as something of a standard. However, there is at least one serious rival two-level notation in existence, developed in response to practical difficulties encountered in writing large-scale morphological descriptions using Koskenniemi's notation. The formalism was first introduced in (Black et al., 1987), was adapted by (Ruessink, 1989), and an extended version of it was proposed for use in the European Commission's ALEP language engineering platform (Pulman, 1991). A further extension to the formalism was described in (Pulman and Hepple, 1993).

The alternative partition formalism was motivated by several perceived practical disadvantages to Koskenniemi's notation. These are detailed more fully in (Black et al., 1987, pp. 13-15), and in (Ritchie et al., 1992, pp. 181-9). In brief:

(1) Koskenniemi rules are not easily interpretable (by the grammarian) locally, for the interpretation of 'feasible pairs' depends on other rules in the set.
(2) There are frequently interactions between rules: whenever the lexical/surface pair affected by a rule A appears in the context of another rule B, the grammarian must check that its appearance in rule B will not conflict with the requirements of rule A.
(3) Contexts may conflict: the same lexical character may obligatorily have multiple realisations in different contexts, but it may be impossible to state the contexts in ways that do not block a desired application.
(4) Restriction to single character changes: whenever a change affecting more than one adjacent character occurs, multiple rules must be written. At best this prompts the interaction problem, and at worst can require the rules to be formulated with under-restrictive contexts to avoid mutual blocking.
(5) There is no mechanism for relating particular rules to specific classes of morpheme. This has to be achieved indirectly by introducing special abstract triggering characters in lexical representations. This is clumsy, and sometimes descriptively inadequate (Trost, 1990).

Some of these problems can be alleviated by the use of a rule compiler that detects conflicts such as that described in (Karttunen and Beesley, 1992). Others could be overcome by simple extensions to the formalism. But several of these problems arise from the interpretation of Koskenniemi rules: each rule corresponds to a transducer, and the two-level description of a language consists of the intersection of these transducers. Thus somehow or other it must be arranged that every rule accepts every two-level correspondence. We refer to this class of formalisms as 'parallel': every rule, in effect, is applied in parallel at each point in the input.
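The 'parallel' interpretation described above can be illustrated with a small sketch. It is not the compilation algorithm of the paper, and it does not build real transducers: each rule is modelled as a plain Python predicate over a sequence of lexical:surface pairs, and intersection is modelled as "every predicate accepts". The two rules, the symbol `0` for an empty surface character, and the example word are all assumptions made up for illustration.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]              # (lexical character, surface character)
Rule = Callable[[List[Pair]], bool]  # stand-in for "the rule's transducer accepts"


def y_to_i_only_before_boundary(pairs: List[Pair]) -> bool:
    """Hypothetical rule A: the pair y:i may occur only directly before lexical '+'."""
    for k, pair in enumerate(pairs):
        if pair == ("y", "i"):
            if k + 1 >= len(pairs) or pairs[k + 1][0] != "+":
                return False
    return True


def boundary_is_deleted(pairs: List[Pair]) -> bool:
    """Hypothetical rule B: lexical '+' must be realised as '0' (nothing on the surface)."""
    return all(surface == "0" for lexical, surface in pairs if lexical == "+")


def accepts_in_parallel(rules: List[Rule], pairs: List[Pair]) -> bool:
    """Intersection of the rules: the correspondence must be accepted by every rule."""
    return all(rule(pairs) for rule in rules)


RULES: List[Rule] = [y_to_i_only_before_boundary, boundary_is_deleted]

# Lexical "try+ed" paired with surface "tri0ed".
good = [("t", "t"), ("r", "r"), ("y", "i"), ("+", "0"), ("e", "e"), ("d", "d")]
# Same pairs without the boundary: y:i now lacks its required right context.
bad_context = [("t", "t"), ("r", "r"), ("y", "i"), ("e", "e"), ("d", "d")]
# Boundary kept on the surface: rule A is satisfied, rule B is not.
bad_boundary = [("t", "t"), ("r", "r"), ("y", "i"), ("+", "+"), ("e", "e"), ("d", "d")]

print(accepts_in_parallel(RULES, good))
print(accepts_in_parallel(RULES, bad_context))
print(accepts_in_parallel(RULES, bad_boundary))
```

Note that each predicate has to let through every pair it has nothing to say about (for example t:t), which is the sense in which every rule must accept every two-level correspondence; a single dissenting rule rejects the whole string.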


Related resources

Using Mazurkiewicz Trace Languages for Partition-Based Morphology

Partition-based morphology is an approach of finite-state morphology where a grammar describes a special kind of regular relations, which split all the strings of a given tuple into the same number of substrings. They are compiled in finite-state machines. In this paper, we address the question of merging grammars using different partitionings into a single finite-state machine. A morphological...


An Extended Partition Model for Generalized Multidimensional Data

Multidimensional databases have been playing a significant role in the database field. However, some cases may need a more flexible model than the rigid structure of a data cube where all cells in each dimension resolution combination must be filled. Generally, a data cube may be viewed as a multiple partition over a population. We propose a two-level formalism for modeling multiple partitions...


Designing and Compiling a Friendship-Oriented Leadership Model

Purpose: The purpose of this research is to design and compile a model of friendship-oriented leadership. The statistical population included a group of experts in the field of leadership and management and broadcasting experts in the field of broadcasting. Methodology: In this research, a total of 20 people were selected as participants using a targeted sampling approach. The data was collect...


Optimizing Teleportation Cost in Multi-Partition Distributed Quantum Circuits

There are many obstacles in quantum circuits implementation with large scales, so distributed quantum systems are appropriate solution for these quantum circuits. Therefore, reducing the number of quantum teleportation leads to improve the cost of implementing a quantum circuit. The minimum number of teleportations can be considered as a measure of the efficiency of distributed quantum systems....


Improved Affine Partition Algorithm for Compile-Time and Runtime Performance

The Affine partitioning framework, which unifies many useful program transforms such as unimodular transformations, loop fusion, fission, scaling, reindexing, and statement reordering, has been proved to be successful in automatic discovery of the loop-level parallelization in programs. The affine partition algorithm was improved from the aspects of compile-time and runtime efficiency in this p...


Exploring Spatial Partition for Parallel Simulation of DEVS-FIRE

DEVS-FIRE is a cellular space model for simulating wildfire spread based on the DEVS formalism. To apply parallel simulation of the DEVS-FIRE model, we need a way to divide the simulation tasks and assign them to multiple parallel processing nodes. One way to divide the simulation tasks is based on the spatial partition of the cellular space model. Two spatial partition ideas, i.e., a uniform p...




Publication date: 1996