Weighted Automata in Text and Speech Processing

Authors

  • Mehryar Mohri
  • Fernando Pereira
  • Michael Riley
Abstract

AT&T Research, 600 Mountain Avenue, Murray Hill, NJ 07974

Finite-state automata are a very effective tool in natural language processing. However, in a variety of applications, and especially in speech processing, it is necessary to consider more general machines in which arcs are assigned weights or costs. We briefly describe some of the main theoretical and algorithmic aspects of these machines. In particular, we describe an efficient composition algorithm for weighted transducers, and give examples illustrating the value of determinization and minimization algorithms for weighted automata.
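To make the idea of arcs carrying weights or costs concrete, here is a minimal sketch of a weighted acceptor. It is not taken from the paper: the toy automaton, the names `ARCS`, `FINAL`, and `string_cost`, and the choice of the tropical semiring (costs add along a path, and the minimum is taken over alternative paths) are all assumptions made for illustration.

```python
import math

# Hypothetical toy automaton (not from the paper).
# ARCS[state][symbol] -> list of (next_state, arc_cost)
ARCS = {
    0: {"a": [(1, 1.0), (2, 3.0)]},
    1: {"b": [(3, 4.0)]},
    2: {"b": [(3, 1.0)]},
}
FINAL = {3: 0.5}  # final state -> final cost


def string_cost(word, start=0):
    """Return the minimum total cost of accepting `word`, or inf if rejected."""
    # frontier maps each reachable state to the cheapest cost of reaching it
    frontier = {start: 0.0}
    for symbol in word:
        nxt = {}
        for state, cost in frontier.items():
            for target, arc_cost in ARCS.get(state, {}).get(symbol, []):
                candidate = cost + arc_cost  # costs add along a path
                if candidate < nxt.get(target, math.inf):
                    nxt[target] = candidate  # keep the cheapest alternative
        frontier = nxt
    return min(
        (cost + FINAL[s] for s, cost in frontier.items() if s in FINAL),
        default=math.inf,
    )
```

In this toy machine the string "ab" has two accepting paths, and `string_cost("ab")` returns the cost of the cheaper one. A string with no accepting path gets `math.inf`, which plays the role of rejection in this cost setting.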

Related resources

Weighted Finite-State Transducer Algorithms: An Overview

Weighted finite-state transducers are used in many applications such as text, speech and image processing. This chapter gives an overview of several recent weighted transducer algorithms, including composition of weighted transducers, determinization of weighted automata, a weight pushing algorithm, and minimization of weighted automata. It briefly describes these algorithms, discusses their ru...

Definable Transductions and Weighted Logics for Texts

A text is a word together with an additional linear order on it. We study quantitative models for texts, i.e. text series which assign to texts elements of a semiring. We introduce an algebraic notion of recognizability following Reutenauer and Bozapalidis as well as weighted automata for texts combining an automaton model of Lodaya and Weil with a model of Ésik and Németh. After that we show t...

Improving the Operation of Text Categorization Systems with Selecting Proper Features Based on PSO-LA

With the explosive growth in amount of information, it is highly required to utilize tools and methods in order to search, filter and manage resources. One of the major problems in text classification relates to the high dimensional feature spaces. Therefore, the main goal of text classification is to reduce the dimensionality of features space. There are many feature selection methods. However...

M. Droste and P. Gastin. Weighted automata and weighted logics. Research Report LSV-05-13, July 2005

Weighted automata are used to describe quantitative properties in various areas such as probabilistic systems, image compression, speech-to-text processing. The behaviour of such an automaton is a mapping, called a formal power series, assigning to each word a weight in some semiring. We generalize Büchi’s and Elgot’s fundamental theorems to this quantitative setting. We introduce a weighted ve...

What's Decidable about Weighted Automata?

Weighted automata map input words to numerical values. Applications of weighted automata include formal verification of quantitative properties, as well as text, speech, and image processing. In the 90’s, Krob studied the decidability of problems on rational series, which strongly relate to weighted automata. In particular, it follows from Krob’s results that the universality problem (that is, ...

Statistical Language Models within the Algebra of Weighted Rational Languages

Statistical language models are an important tool in natural language processing. They represent prior knowledge about a certain language which is usually gained from a set of samples called a corpus. In this paper, we present a novel way of creating N-gram language models using weighted finite automata. The construction of these models is formalised within the algebra underlying weighted fini...

Publication date: 1996