Commentary on Kaplan and Kay

Author

  • Mark Liberman
Abstract

Anyone with a fundamental interest in morphology and phonology, either from a scientific or a computational perspective, will want to study this long-awaited paper carefully. Kaplan and Kay (henceforth K&K) announce two goals: "to provide the core of a mathematical framework for phonology" and "to establish a solid basis for computation in the domain of phonological and orthographic systems." They show how the algebra of regular relations, with their corresponding automata, can be used to compile systems of phonological rules in the style of SPE, including directionality, optionality, and ordering. They sketch mechanisms for incorporating a lexicon and for dealing with exceptional forms, thus providing a complete treatment in a unified framework. This accomplishment in itself will not compel the attention of many working phonologists, who have found good reasons to replace the SPE framework (see Kenstowicz [1994] for a survey of modern practice), and whose efforts since 1975 have been aimed mainly at finding representational primitives to explain typological generalizations, support accounts of learning, generalization and change, and provide one end of the mapping between symbols and speech. In this effort, there has been little emphasis on SPE's goal of giving phonological descriptions an algorithmically specified denotation. Perhaps this paper, despite its superficial lack of connection to contemporary work in phonology, will set in motion a discussion that will ultimately redress the balance. On the computational side, practitioners of practical NLP will be happy to make extensive use of the algebra of regular relations, since it provides a truly elegant engineering solution to a wide range of problems. However, although direct interpretation of some simple FSTs can be efficient (e.g., Feigenbaum et al. 1991), and although Koskenniemi has documented efficient implementation techniques for his two-level systems, the overall architecture presented in this paper is not practically usable as written, because of either the size of the resulting automata or the time needed for (unwisely implemented) nondeterminism, or both. A range of well-known techniques enable programs based on the algebraic combination of (unary) FSAs to make efficient use of both time and space. Although these methods do not apply to FSTs in general, we may presume that K&K have developed analogous techniques for the crucial range of cases. With the growing interest in this technology, we can expect that either K&K will publish their work or others will recapitulate it, so that the algebra of regular relations can take its proper and prominent place in the toolkit of computational linguistics.
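The idea that ordered and optional rewrite rules combine as a composition of relations can be illustrated with a minimal sketch. It is not K&K's transducer compilation: each rule is modeled as a Python function from one string to a set of strings, and the helper names (obligatory, optional, compose) and the two toy rules are assumptions made for illustration only. The optional rule is also simplified to apply either everywhere or nowhere.

```python
import re
from typing import Callable, Set

# A relation is modeled as a function from one input string to a set of outputs.
Relation = Callable[[str], Set[str]]

def obligatory(pattern: str, replacement: str) -> Relation:
    """Hypothetical helper: a rule that must apply wherever it matches."""
    return lambda s: {re.sub(pattern, replacement, s)}

def optional(pattern: str, replacement: str) -> Relation:
    """Hypothetical helper: a rule that may or may not apply.
    Simplification: it applies at all matching sites or at none."""
    return lambda s: {s, re.sub(pattern, replacement, s)}

def compose(first: Relation, second: Relation) -> Relation:
    """Relational composition: feed every output of `first` into `second`."""
    return lambda s: {out for mid in first(s) for out in second(mid)}

# Toy rules (illustrative): N -> m before p, then optionally p -> m after m.
nasal_assimilation = obligatory(r"N(?=p)", "m")
p_to_m = optional(r"(?<=m)p", "m")

# Rule ordering corresponds to the order of composition.
cascade = compose(nasal_assimilation, p_to_m)
reversed_cascade = compose(p_to_m, nasal_assimilation)

print(sorted(cascade("iNpractical")))
print(sorted(reversed_cascade("iNpractical")))
```

In this sketch the two orders give different output sets for the same input, which is the sense in which rule ordering matters. A finite-state implementation would represent each rule as a transducer and compose the machines themselves rather than enumerating strings.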

Similar resources

Integrating Finite-state Technology with Deep LFG Grammars

Researchers at PARC were pioneers in developing finite-state methods for applications in computational linguistics, and one of the original motivations was to provide a coherent architecture for the integration of lower-level lexical processing with higher-level syntactic analysis (Kaplan and Kay, 1981; Karttunen et al., 1992; Kaplan and Kay, 1994). Finite-state methods for tokenizing and morph...

Commentary on Kaplan and Kay

To appreciate this article fully, it is essential to understand the historical context into which it fits, and which it has to some extent created. Although formally published for the first time here, it is already an extremely influential and classic piece of work. Finite-state machines, in one form or another, have been used for the description of natural language since the early 1950s, with ...

Twenty-five years of finite-state morphology

Twenty-five years ago in the early 1980s, morphological analysis of natural language was a challenge to computational linguists. Simple cut-and-paste programs could be and were written to analyze strings in particular languages, but there was no general language-independent method available. Furthermore, cut-and-paste programs for analysis were not reversible, they could not be used to generate ...

A General Computational Model for Word-Form Recognition and Production

A language independent model for recognition and production of word forms is presented. This "two-level model" is based on a new way of describing morphological alternations. All rules describing the morphophonological variations are parallel and relatively independent of each other. Individual rules are implemented as finite state automata, as in an earlier model due to Martin Kay and Ron Kapl...

Journal:
  • Computational Linguistics

Volume: 20

Publication date: 1994