Prefix Probabilities from Stochastic Tree Adjoining Grammars

Authors

  • Mark-Jan Nederhof
  • Anoop Sarkar
  • Giorgio Satta
Abstract

Language models for speech recognition typically use a probability model of the form $\Pr(a_n \mid a_1, a_2, \ldots, a_{n-1})$. Stochastic grammars, on the other hand, are typically used to assign structure to utterances. A language model of the above form is constructed from such grammars by computing the prefix probability $\sum_{w \in \Sigma^*} \Pr(a_1 \cdots a_n w)$, where $w$ represents all possible terminations of the prefix $a_1 \cdots a_n$. The main result in this paper is an algorithm to compute such prefix probabilities given a stochastic Tree Adjoining Grammar (TAG). The algorithm achieves the required computation in $O(n^6)$ time. The probabilities of subderivations that do not derive any words in the prefix, but contribute structurally to its derivation, are precomputed to achieve termination. This algorithm enables existing corpus-based estimation techniques for stochastic TAGs to be used for language modelling.
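The link between prefix probabilities and the conditional language model can be illustrated with a minimal sketch. It is not the paper's algorithm: instead of a stochastic TAG it assumes a small, finite, made-up distribution over complete strings (`STRING_PROBS`), and the function names are hypothetical. It only shows that the next-symbol probability is the ratio of two prefix probabilities.

```python
from fractions import Fraction

# Hypothetical toy distribution over complete strings (an assumption for
# illustration only; a real stochastic TAG defines an infinite language).
STRING_PROBS = {
    ("a", "b"): Fraction(1, 2),
    ("a", "b", "b"): Fraction(1, 4),
    ("a", "c"): Fraction(1, 8),
    ("b",): Fraction(1, 8),
}


def prefix_probability(prefix):
    """Sum Pr(prefix + w) over all completions w, including the empty one."""
    prefix = tuple(prefix)
    return sum(p for s, p in STRING_PROBS.items() if s[:len(prefix)] == prefix)


def next_symbol_probability(history, symbol):
    """Pr(symbol | history), computed as a ratio of two prefix probabilities."""
    denominator = prefix_probability(history)
    if denominator == 0:
        raise ValueError("history has zero prefix probability")
    return prefix_probability(tuple(history) + (symbol,)) / denominator


if __name__ == "__main__":
    print(prefix_probability(("a",)))
    print(next_symbol_probability(("a",), "b"))
```

Brute-force enumeration like this only works for a finite set of strings; for a grammar, the sum over completions has to be computed symbolically, which is the problem the abstract describes.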


Similar resources

Prefix probabilities for linear indexed grammars

We show how prefix probabilities can be computed for stochastic linear indexed grammars (SLIGs). Our results apply as well to stochastic tree-adjoining grammars (STAGs), due to their equivalence to SLIGs.


Prefix Probabilities for Linear Context-Free Rewriting Systems

We present a novel method for the computation of prefix probabilities for linear context-free rewriting systems. Our approach streamlines previous procedures to compute prefix probabilities for context-free grammars, synchronous context-free grammars and tree adjoining grammars. In addition, the methodology is general enough to be used for a wider range of problems involving, for example, sever...


Prefix Probabilities for Linear Indexed Grammars

We show how prefix probabilities can be computed for stochastic linear indexed grammars (SLIGs). Our results apply as well to stochastic tree-adjoining grammars (STAGs), due to their equivalence to SLIGs.


Prefix Probabilities from Stochastic Tree Adjoining Grammars

Language models for speech recognition typically use a probability model of the form $\Pr(a_n \mid a_1, a_2, \ldots, a_{n-1})$. Stochastic grammars, on the other hand, are typically used to assign structure to utterances. A language model of the above form is constructed from such grammars by computing the prefix probability $\sum_{w \in \Sigma^*} \Pr(a_1 \cdots a_n w)$, where $w$ represents all possible terminations of the prefix $a_1 \cdots a_n$. The...


Proceedings of the 9th International Workshop Finite State Methods and Natural Language Processing

The paradigm of parsing as intersection has been used throughout the literature to obtain elegant and general solutions to numerous problems involving grammars and automata. The paradigm has its origins in (Bar-Hillel et al., 1964), where a general construction was used to prove closure of context-free languages under intersection with regular languages. It was pointed out by (Lang, 1994) that ...




Publication date: 1998