Context-Sensitive Statistics for Improved Grammatical Language Models

Authors

  • Eugene Charniak
  • Glenn Carroll
Abstract

We develop a language model using probabilistic context-free grammars (PCFGs) that is "pseudo context-sensitive" in that the probability that a non-terminal N expands using a rule r depends on N's parent. We derive the equations for estimating the necessary probabilities using a variant of the inside-outside algorithm. We give experimental results showing that, beginning with a high-performance PCFG, one can develop a pseudo PCSG that yields significant performance gains. Analysis shows that the benefits from the context-sensitive statistics are localized, suggesting that we can use them to extend the original PCFG. Experimental results confirm that this is feasible and that the resulting grammar retains the performance gains. This implies that our scheme may be useful as a novel method for PCFG induction.
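The parent-conditioned statistic described in the abstract can be illustrated with a minimal sketch. This is not the authors' estimation procedure: the abstract describes a variant of the inside-outside algorithm, whereas the sketch below simply counts rule uses in two hand-built toy trees. The trees, the `ROOT` pseudo-parent, and the function names `count_rules` and `estimate` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# A tree is (label, children); a leaf word is a plain string.
# These toy trees are invented for illustration only.
TREES = [
    ("S", [("NP", [("Pron", ["she"])]),
           ("VP", [("V", ["saw"]),
                   ("NP", [("Det", ["the"]), ("N", ["dog"])])])]),
    ("S", [("NP", [("Pron", ["he"])]),
           ("VP", [("V", ["fed"]),
                   ("NP", [("Det", ["a"]), ("N", ["cat"])])])]),
]

def count_rules(tree, parent, counts):
    """Count each expansion, keyed by (non-terminal, parent label)."""
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    counts[(label, parent)][rhs] += 1
    for child in children:
        if not isinstance(child, str):
            count_rules(child, label, counts)

def estimate(trees):
    """Relative-frequency estimate of P(rule | non-terminal, parent)."""
    counts = defaultdict(Counter)
    for tree in trees:
        count_rules(tree, "ROOT", counts)  # "ROOT" is a hypothetical pseudo-parent
    return {
        ctx: {rhs: n / sum(c.values()) for rhs, n in c.items()}
        for ctx, c in counts.items()
    }

probs = estimate(TREES)
```

In this toy data, an NP whose parent is S always expands to Pron, while an NP whose parent is VP always expands to Det N. A plain PCFG, which ignores the parent, would assign each of those two NP rules probability 0.5.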


Similar resources

Context-Sensitive Statistics For Improved Grammatical Language Models

We develop a language model using probabilistic context-free grammars (PCFGs) that is “pseudo context-sensitive” in that the probability that a nonterminal N expands using a rule r depends on N’s parent. We give the equations for estimating the necessary probabilities using a variant of the inside-outside algorithm. We give experimental results showing that, beginning with a high-performance PC...


Model agglomeration for context-dependent acoustic modeling

This work describes a method for generating back-off models for context-dependent unit modeling. The main characteristic of the approach is that it builds generic models by gathering the statistics of detailed models, collected during Baum-Welch reestimation. The construction of back-off models does not require additional processing of the training data, allowing different models to be built quickly ...


Language and Identity in the Iranian Context: The Impact of Identity Aspects on EFL Learners' Achievement

Identity orientations refer to the relative importance that individuals place on various identity attributes or characteristics such as race, religion, culture and language when constructing their self-definitions (Chew, 2007; Cheek, 1989). Accordingly, the present study aims at identifying the impact of identity aspects on the Iranian learners' English language achievements at Shiraz Universit...


Parsing with Context-Free Grammars and Word Statistics

We present a language model in which the probability of a sentence is the sum of the individual parse probabilities, and these are calculated using a probabilistic context-free grammar (PCFG) plus statistics on individual words and how they fit into parses. We have used the model to improve syntactic disambiguation. After training on Wall Street Journal (WSJ) text we tested on about 200 WSJ sente...


CAMAC: a context-aware mandatory access control model

Mandatory access control models have traditionally been employed as a robust security mechanism in multilevel security environments such as military domains. In traditional mandatory models, the security classes associated with entities are context-insensitive. However, context-sensitivity of security classes and flexibility of access control mechanisms may be required especially in pervasive c...




Publication date: 1994