A Maximum Entropy Language Model with Topic Sensitive Features

نویسندگان

  • Jun Wu
  • Sanjeev Khudanpur
چکیده

We present a maximum entropy approach to topic sensitive language modeling. By classifying the training data into di erent parts according to topic, extracting topic sensitive unigram features, and combining these new features with conventional N-grams in language modeling, we build a topic sensitive bigram model. This model improves both perplexity and word error rate. keywords: maximum entropy, language modeling, topic dependency

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

رفع ابهام معنایی واژگان مبهم فارسی با مدل موضوعی LDA

Word sense disambiguation is the task of identifying the correct sense for the word in a given context among a finite set of possible sense. In this paper a model for farsi word sense disambiguation is presented. The model use two group of features: first, all word and stop words around target word and topic models as second features. We extract topics from a farsi corpus with Latent Dirichlet ...

متن کامل

Maximum Entropy Translation Model in Dependency-Based MT Framework

Maximum Entropy Principle has been used successfully in various NLP tasks. In this paper we propose a forward translation model consisting of a set of maximum entropy classifiers: a separate classifier is trained for each (sufficiently frequent) source-side lemma. In this way the estimates of translation probabilities can be sensitive to a large number of features derived from the source senten...

متن کامل

Maximum Entropy Language Modeling with Non-Local and Syntactic Dependencies

Standard N -gram language models exploit information only from the immediate past to predict the future word. To improve the performance of a language model, two di erent kinds of long-range dependence, the syntactic structure and the topic of sentences are taken into consideration. The likelihood of many words varies greatly with the topic of discussion and topics capture this di erence. Synta...

متن کامل

A Combination of Topic Models with Max-margin Learning for Relation Detection

This paper proposes a novel application of a supervised topic model to do entity relation detection (ERD). We adapt Maximum Entropy Discriminant Latent Dirichlet Allocation (MEDLDA) with mixed membership for relation detection. The ERD task is reformulated to fit into the topic modeling framework. Our approach combines the benefits of both, maximum-likelihood estimation (MLE) and max-margin est...

متن کامل

A maximum entropy language model integrating N-grams and topic dependencies for conversational speech recognition

A compact language model which incorporates local dependencies in the form of N-grams and long distance dependencies through dynamic topic conditional constraints is presented. These constraints are integrated using the maximum entropy principle. Issues in assigning a topic to a test utterance are investigated. Recognition results on the Switchboard corpus are presented showing that with a very...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1998