Compact Maximum Entropy Language Models
Authors
Abstract
In language modeling we are always confronted with a sparse data problem. The Maximum Entropy formalism allows one to fully integrate complementary statistical properties of limited corpora. The focus of the present paper is twofold. First, the new smoothing technique of LM-induced marginals is introduced and discussed. Second, we highlight the advantages of combining robust features and show that the brute-force inclusion of too many constraints may degrade performance due to overtraining effects. Very good LMs may be trained on the basis of pair correlations supplemented by heavily pruned N-grams. This is especially true if word-based and class-based features are combined. Tests were carried out on the German Verbmobil task and on WSJ data. The test-set perplexities were reduced by 3-7%, and the number of free parameters was reduced by 60-75%. At the same time, overtraining effects are considerably reduced.
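The kind of model the abstract describes can be sketched as a log-linear distribution over the vocabulary, with robust pair (bigram) correlations supplemented by a few surviving pruned trigram constraints. The following is a minimal illustrative sketch, not the authors' implementation; the vocabulary, feature tests, and weights are invented for demonstration, and the weights would normally be fit by an iterative scaling or gradient procedure rather than set by hand.

```python
import math

VOCAB = ["the", "cat", "sat", "mat"]

# Binary feature functions over (history, word) with illustrative weights
# lambda_i. Pair correlations (bigrams) plus one heavily pruned trigram.
FEATURES = [
    (lambda h, w: h[-1:] == ("the",) and w == "cat", 1.5),        # bigram: the -> cat
    (lambda h, w: h[-1:] == ("the",) and w == "mat", 1.0),        # bigram: the -> mat
    (lambda h, w: h[-2:] == ("cat", "sat") and w == "the", 2.0),  # pruned trigram
]

def score(history, word):
    """Log-linear score: sum of the weights of all active features."""
    return sum(lam for f, lam in FEATURES if f(history, word))

def prob(history, word):
    """Maximum entropy conditional probability p(word | history)."""
    z = sum(math.exp(score(history, v)) for v in VOCAB)  # partition function Z(h)
    return math.exp(score(history, word)) / z

if __name__ == "__main__":
    h = ("the",)
    print({w: round(prob(h, w), 3) for w in VOCAB})
```

Pruning here simply means that most candidate trigram features never enter `FEATURES`, which is how the parameter count drops while the bigram features keep the model robust.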
Similar Papers
Maximum Entropy Modeling Toolkit
The Maximum Entropy Modeling Toolkit supports parameter estimation and prediction for statistical language models in the maximum entropy framework. The maximum entropy framework provides a constructive method for obtaining the unique conditional distribution p*(y|x) that satisfies a set of linear constraints and maximizes the conditional entropy H(p|f) with respect to the empirical distribution...
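The unique distribution referred to above has the standard closed form of a textbook maximum entropy solution (stated here for reference, not taken from the toolkit itself): a log-linear model whose weights \(\lambda_i\) are chosen so that the model's expected feature counts match the empirical ones,

```latex
p^{*}(y \mid x) \;=\; \frac{1}{Z_{\lambda}(x)}\,
\exp\!\Big(\sum_{i}\lambda_{i}\,f_{i}(x,y)\Big),
\qquad
Z_{\lambda}(x) \;=\; \sum_{y'} \exp\!\Big(\sum_{i}\lambda_{i}\,f_{i}(x,y')\Big),
```

subject to the linear constraints \(\mathbb{E}_{p^{*}}[f_i] = \mathbb{E}_{\tilde p}[f_i]\) for each feature \(f_i\), where \(\tilde p\) is the empirical distribution.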
A maximum entropy language model integrating N-grams and topic dependencies for conversational speech recognition
A compact language model which incorporates local dependencies in the form of N-grams and long distance dependencies through dynamic topic conditional constraints is presented. These constraints are integrated using the maximum entropy principle. Issues in assigning a topic to a test utterance are investigated. Recognition results on the Switchboard corpus are presented showing that with a very...
Fast parameter estimation for joint maximum entropy language models
This paper discusses efficient parameter estimation methods for joint (unconditional) maximum entropy language models such as whole-sentence models. Such models are a sound framework for formalizing arbitrary linguistic knowledge in a consistent manner. It has been shown that general-purpose gradient-based optimization methods are among the most efficient algorithms for estimating parameters of...
Cluster Expansions and Iterative Scaling for Maximum Entropy Language Models
The maximum entropy method has recently been successfully introduced to a variety of natural language applications. In each of these applications, however, the power of the maximum entropy method is achieved at the cost of a considerable increase in computational requirements. In this paper we present a technique, closely related to the classical cluster expansion from statistical mechanics, f...
A Maximum Entropy Method for Language Modelling
The language models used for automatic speech recognition (ASR) are often based on very simple Markov models. This paper presents an overview of a more powerful modelling technique, Maximum Entropy (ME), and its application in language modelling. Preliminary results indicate that ME models are viable for this task and perform slightly better than the traditional models.
The Maximum Entropy Relaxation Path
The relaxed maximum entropy problem is concerned with finding a probability distribution on a finite set that minimizes the relative entropy to a given prior distribution, while satisfying relaxed max-norm constraints with respect to a third observed multinomial distribution. We study the entire relaxation path for this problem in detail. We show existence and a geometric description of the rel...