Performance Guarantees for Regularized Maximum Entropy Density Estimation
Authors
Abstract
We consider the problem of estimating an unknown probability distribution from samples using the principle of maximum entropy (maxent). To alleviate overfitting with a very large number of features, we propose applying the maxent principle with relaxed constraints on the expectations of the features. By convex duality, this turns out to be equivalent to finding the Gibbs distribution minimizing a regularized version of the empirical log loss. We prove nonasymptotic bounds showing that, with respect to the true underlying distribution, this relaxed version of maxent produces density estimates that are almost as good as the best possible. These bounds are in terms of the deviation of the feature empirical averages relative to their true expectations, a number that can be bounded using standard uniform-convergence techniques. In particular, this leads to bounds that drop quickly with the number of samples, and that depend very moderately on the number or complexity of the features. We also derive and prove convergence for both sequential-update and parallel-update algorithms. Finally, we briefly describe experiments on data relevant to the modeling of species geographical distributions.
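The duality the abstract describes — maxent with relaxed constraints |q[f_j] − π̃[f_j]| ≤ β_j being equivalent to minimizing an ℓ1-regularized empirical log loss over Gibbs distributions — can be sketched numerically. The following is a minimal illustrative sketch, not the authors' implementation: it assumes a finite sample space given by a feature matrix `features` and solves the regularized dual with proximal gradient (ISTA) steps, where the soft-threshold handles the ℓ1 term.

```python
import numpy as np

def regularized_maxent(features, emp_avg, beta, lr=0.1, iters=2000):
    """Fit a Gibbs distribution q(x) proportional to exp(sum_j lam_j * f_j(x))
    over a finite domain by minimizing the l1-regularized empirical log loss,
    the convex dual of maxent with relaxed constraints |q[f_j] - emp_avg_j| <= beta.

    features : (n_points, n_feats) array, row i holds f_1(x_i), ..., f_d(x_i)
    emp_avg  : (n_feats,) empirical feature averages pi~[f_j]
    beta     : relaxation width / l1 penalty weight
    """
    n, d = features.shape
    lam = np.zeros(d)
    for _ in range(iters):
        logits = features @ lam
        logits -= logits.max()               # numerical stability
        q = np.exp(logits)
        q /= q.sum()                         # current Gibbs distribution
        grad = features.T @ q - emp_avg      # gradient of the (unregularized) log loss
        lam -= lr * grad                     # gradient step
        # proximal (soft-threshold) step for the l1 penalty beta * ||lam||_1
        lam = np.sign(lam) * np.maximum(np.abs(lam) - lr * beta, 0.0)
    return lam, q
```

At the optimum the subgradient condition forces the model's feature expectations to lie within β of the empirical averages, which is exactly the relaxed-constraint form of maxent described above.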
Similar Resources
Maximum Entropy Density Estimation with Generalized Regularization and an Application to Species Distribution Modeling
We present a unified and complete account of maximum entropy density estimation subject to constraints represented by convex potential functions or, alternatively, by convex regularization. We provide fully general performance guarantees and an algorithm with a complete convergence proof. As special cases, we easily derive performance guarantees for many known regularization types, including ℓ1...
Maximum Entropy Density Estimation with Incomplete Data
We propose a natural generalization of Regularized Maximum Entropy Density Estimation (maxent) to handle input data with unknown values. While standard approaches to handling missing data usually involve estimating the actual unknown values, then using the estimated, complete data as input, our method avoids the two-step process and handles unknown values directly in the maximum entropy formula...
Similarity Based Smoothing In Language Modeling
In this paper, we improve our previously proposed Similarity Based Smoothing (SBS) algorithm. The idea of SBS is to map words or parts of sentences to a Euclidean space and approximate the language model in that space. The bottleneck of the original algorithm was training a regularized logistic regression model, which was incapable of dealing with real-world data. We replace the logistic regr...
Correcting sample selection bias in maximum entropy density estimation
We study the problem of maximum entropy density estimation in the presence of known sample selection bias. We propose three bias correction approaches. The first one takes advantage of unbiased sufficient statistics which can be obtained from biased samples. The second one estimates the biased distribution and then factors the bias out. The third one approximates the second by only using sample...
Maximum Entropy Density Estimation and Modeling Geographic Distributions of Species
The maximum entropy (maxent) approach, formally equivalent to maximum likelihood, is a widely used density-estimation method. When input datasets are small, maxent is likely to overfit. Overfitting can be eliminated by various smoothing techniques, such as regularization and constraint relaxation, but theory explaining their properties is often missing or needs to be derived for each case separatel...