Discriminative Training of N-gram Classifi
Abstract
We present a method for conditional maximum likelihood estimation of N-gram models used for text or speech utterance classification. The method employs a well-known technique that relies on a generalization of the Baum-Eagon inequality from polynomials to rational functions. The best performance is achieved by the 1-gram classifier, for which conditional maximum likelihood training reduces the class error rate by 45% relative compared with a maximum likelihood classifier.
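To make the distinction between the two training criteria concrete, the following is a minimal sketch of a 1-gram (unigram) classifier together with the conditional log-likelihood that conditional maximum likelihood training would maximize. The toy data, the add-alpha smoothing, and all function names are assumptions made for illustration and are not taken from the paper. The sketch only evaluates the objective for maximum likelihood parameters; it does not implement the rational-function extension of the Baum-Eagon update mentioned above.

```python
import math
from collections import Counter

def train_unigram_ml(docs, labels, vocab, alpha=1.0):
    """Maximum likelihood unigram model per class (add-alpha smoothing is an assumption)."""
    counts = {c: Counter() for c in set(labels)}
    for words, c in zip(docs, labels):
        counts[c].update(words)
    prior = {c: labels.count(c) / len(labels) for c in counts}
    theta = {}
    for c, cnt in counts.items():
        total = sum(cnt.values()) + alpha * len(vocab)
        theta[c] = {w: (cnt[w] + alpha) / total for w in vocab}
    return prior, theta

def log_joint(words, c, prior, theta):
    """log P(c) + sum over words of log P(w | c); words must be in the vocabulary."""
    return math.log(prior[c]) + sum(math.log(theta[c][w]) for w in words)

def log_posterior(words, c, prior, theta):
    """log P(c | words), normalised over all classes with log-sum-exp."""
    scores = {k: log_joint(words, k, prior, theta) for k in prior}
    m = max(scores.values())
    log_z = m + math.log(sum(math.exp(s - m) for s in scores.values()))
    return scores[c] - log_z

def conditional_log_likelihood(docs, labels, prior, theta):
    """Objective of conditional ML training: sum of log P(label | utterance)."""
    return sum(log_posterior(w, c, prior, theta) for w, c in zip(docs, labels))

def classify(words, prior, theta):
    """Pick the class with the highest joint score (same argmax as the posterior)."""
    return max(prior, key=lambda c: log_joint(words, c, prior, theta))

# Hypothetical toy utterances and class labels.
docs = [["pay", "bill"], ["pay", "invoice"],
        ["reset", "password"], ["reset", "login"]]
labels = ["billing", "billing", "support", "support"]
vocab = sorted({w for d in docs for w in d})

prior, theta = train_unigram_ml(docs, labels, vocab)
print(classify(["pay", "bill"], prior, theta))
print(conditional_log_likelihood(docs, labels, prior, theta))
```

Maximum likelihood training fits each class model to its own data, whereas conditional maximum likelihood training would adjust `theta` to increase the value returned by `conditional_log_likelihood`, which directly rewards separating the correct class from its competitors.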
Similar resources
Discriminative training of language model classifiers
We show how discriminative training methods, namely the Maximum Mutual Information and Maximum Discrimination approaches, can be adopted for the training of N-gram language models used as classifiers working on symbol strings. By estimating the model parameters according to a discriminative objective function instead of Maximum Likelihood, the emphasis is not put on the exact modeling of each cla...
N-gram Parsing for Jointly Training a Discriminative Constituency Parser
Syntactic parsers are designed to detect the complete syntactic structure of grammatically correct sentences. In this paper, we introduce the concept of n-gram parsing, which corresponds to generating the constituency parse tree of n consecutive words in a sentence. We create a stand-alone n-gram parser derived from a baseline full discriminative constituency parser and analyze the characteri...
Minimum rank error training for language modeling
Discriminative training techniques have been successfully developed for many pattern recognition applications. In speech recognition, discriminative training aims to minimize the metric of word error rate. However, in an information retrieval system, the best performance should be achieved by maximizing the average precision. In this paper, we construct the discriminative n-gram language model ...
Perceptron Reranking for CCG Realization
This paper shows that discriminative reranking with an averaged perceptron model yields substantial improvements in realization quality with CCG. The paper confirms the utility of including language model log probabilities as features in the model, which prior work on discriminative training with log linear models for HPSG realization had called into question. The perceptron model allows the co...