An optimal pre-determinization algorithm for weighted transducers
Abstract
We present a general algorithm, pre-determinization, that makes an arbitrary weighted transducer over the tropical semiring or an arbitrary unambiguous weighted transducer over a cancellative commutative semiring determinizable by inserting in it transitions labeled with special symbols. After determinization, the special symbols can be removed or replaced with ε-transitions. The resulting transducer can be significantly more efficient to use. We report empirical results showing that our algorithm leads to a substantial speed-up in large-vocabulary speech recognition. Our pre-determinization algorithm makes use of an efficient algorithm for testing a general twins property, a sufficient condition for the determinizability of all weighted transducers over the tropical semiring and unambiguous weighted transducers over cancellative commutative semirings. Based on the transitions marked by this test of the twins property, our pre-determinization algorithm inserts new transitions just when needed to guarantee that the resulting transducer has the twins property and thus is determinizable. It also uses a single-source shortest-paths algorithm over the min-max semiring for carefully selecting the positions for insertion of new transitions to benefit from the subsequent application of determinization. These positions are proved to be optimal in a sense that we describe.
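To make the phrase "single-source shortest paths over the min-max semiring" concrete, here is a minimal Python sketch. It is only an illustration of that semiring, not the paper's insertion-position procedure. It assumes the min-max semiring in which the "sum" of alternatives is `min` and the "product" along a path is `max`, so the distance to a state is the smallest achievable maximum edge weight over all paths from the source. The function name `minimax_distances`, the adjacency-list graph format, and the example graph are hypothetical.

```python
import heapq
from math import inf


def minimax_distances(graph, source):
    """Single-source distances over the (min, max) semiring.

    graph: dict mapping a node to a list of (neighbor, weight) pairs
           (hypothetical input format chosen for this sketch).
    Returns a dict mapping each reachable node to the minimum, over all
    paths from source, of the maximum edge weight on the path.
    """
    # The semiring "one" (identity for max) is -inf, so the empty path
    # from the source to itself has weight -inf.
    dist = {source: -inf}
    heap = [(-inf, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, inf):
            continue  # stale queue entry
        for v, w in graph.get(u, []):
            cand = max(d, w)             # semiring "times": extend the path
            if cand < dist.get(v, inf):  # semiring "plus": keep the minimum
                dist[v] = cand
                heapq.heappush(heap, (cand, v))
    return dist


# Hypothetical example graph, not taken from the paper.
example = {
    "s": [("a", 4), ("b", 1)],
    "b": [("a", 2)],
    "a": [("t", 3)],
}
distances = minimax_distances(example, "s")
```

A Dijkstra-style queue suffices in this sketch because `max` never decreases a path's weight when the path is extended, so the first time a node is popped its distance is final.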
Related resources
A New Epsilon Filter for Efficient Composition of Weighted Finite-State Transducers
In this paper we propose a new composition algorithm for weighted finite-state transducers, which are increasingly used for speech and pattern recognition applications. Composition joins multiple transducers into one. We have implemented an embedded speech-based dialog system for steering applications. Regular grammars are therefore very useful, but they may grow considerably under determinization....
Weighted Finite-State Transducer Algorithms: An Overview
Weighted finite-state transducers are used in many applications such as text, speech and image processing. This chapter gives an overview of several recent weighted transducer algorithms, including composition of weighted transducers, determinization of weighted automata, a weight pushing algorithm, and minimization of weighted automata. It briefly describes these algorithms, discusses their ru...
Efficient Algorithms for Testing the Twins Property
Weighted automata and transducers are powerful devices used in many large-scale applications. The efficiency of these applications is substantially increased when the automata or transducers used are deterministic. There exists a general determinization algorithm for weighted automata and transducers that is an extension of the classical subset construction used in the case of unweighted finite...
Weighted Automata Algorithms
This chapter presents several fundamental algorithms for weighted automata and transducers. While the mathematical counterparts of weighted transducers, rational power series, have been extensively studied in the past [22, 54, 13, 36], several essential weighted transducer algorithms, e.g., composition, determinization, minimization, have been devised only in the last decade [38, 43], in part ...
Weighted Automata in Text
Processing Mehryar Mohri, Fernando Pereira and Michael Riley AT&T Research 600 Mountain Avenue Murray Hill, 07974 NJ fmohri,pereira,[email protected] Abstract. Finite-state automata are a very effective tool in natural language processing. However, in a variety of applications and especially in speech processing, it is necessary to consider more general machines in which arcs are assigned ...
Journal:
- Theor. Comput. Sci.
Volume 328
Publication date: 2004