Minimum KL-Divergence on Complements of $L_{1}$ Balls
Authors
Abstract
Pinsker's widely used inequality upper-bounds the total variation distance $\|P-Q\|_1$ in terms of the Kullback-Leibler divergence $D(P\|Q)$. Although in general a bound in the reverse direction is impossible, in many applications the quantity of interest is actually $D^*(v,Q)$, defined, for an arbitrary fixed $Q$, as the infimum of $D(P\|Q)$ over all distributions $P$ that are at least $v$-far away from $Q$ in total variation. We show that $D^*(v,Q) \le Cv^2 + O(v^3)$, where $C = C(Q) = 1/2$ for "balanced" distributions, thereby providing a kind of reverse Pinsker inequality. Some of the structural results obtained in the course of the proof may be of independent interest. An application to large deviations is given.
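The stated bound invites a quick sanity check. For the balanced binary case $Q=(1/2,1/2)$ and $P=(\tfrac{1+v}{2},\tfrac{1-v}{2})$, one has $\|P-Q\|_1 = v$ and $D(P\|Q) = \tfrac{v^2}{2} + O(v^4)$, consistent with the constant $C=1/2$ reported for balanced distributions. The sketch below is not from the paper: it estimates $D^*(v,Q)$ by directly minimizing $D(P\|Q)$ over the complement of the $L_1$ ball of radius $v$; the example $Q$, the SLSQP optimizer, and the number of random restarts are assumptions made purely for illustration.

```python
# Minimal numerical sketch (not from the paper): estimate
#   D*(v, Q) = inf { D(P||Q) : ||P - Q||_1 >= v }
# on a small alphabet and compare with the leading term v^2 / 2.
# The example Q, the SLSQP optimizer, and the restart count are assumptions.
import numpy as np
from scipy.optimize import minimize


def kl(p, q):
    """KL divergence D(p||q) in nats, with the convention 0 * log 0 = 0."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def d_star(v, q, restarts=20, seed=0):
    """Estimate inf D(p||q) over distributions p with ||p - q||_1 >= v."""
    rng = np.random.default_rng(seed)
    n = len(q)
    constraints = [
        {"type": "eq", "fun": lambda p: np.sum(p) - 1.0},              # p sums to 1
        {"type": "ineq", "fun": lambda p: np.sum(np.abs(p - q)) - v},  # ||p - q||_1 >= v
    ]
    best = np.inf
    # The feasible set (complement of an L1 ball) is non-convex, so use restarts.
    for _ in range(restarts):
        p0 = rng.dirichlet(np.ones(n))
        res = minimize(kl, p0, args=(q,), method="SLSQP",
                       bounds=[(0.0, 1.0)] * n, constraints=constraints)
        if res.success:
            best = min(best, res.fun)
    return best


if __name__ == "__main__":
    q = np.array([0.5, 0.5])  # a balanced distribution
    for v in (0.1, 0.2, 0.4):
        print(f"v={v}: D*(v,Q) ~ {d_star(v, q):.5f}, v^2/2 = {0.5 * v * v:.5f}")
```

For this balanced $Q$ the estimates stay within $O(v^4)$ of $v^2/2$, illustrating that Pinsker's constant cannot be improved in this direction.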
Similar Articles
Unifying Non-Maximum Likelihood Learning Objectives with Minimum KL Contraction
When used to learn high dimensional parametric probabilistic models, the classical maximum likelihood (ML) learning often suffers from computational intractability, which motivates the active developments of non-ML learning methods. Yet, because of their divergent motivations and forms, the objective functions of many non-ML learning methods are seemingly unrelated, and there lacks a unified fr...
From ε-entropy to KL-entropy: Analysis of Minimum Information Complexity Density Estimation
We consider an extension of ε-entropy to a KL-divergence-based complexity measure for randomized density estimation methods. Based on this extension, we develop a general information-theoretic inequality that measures the statistical complexity of some deterministic and randomized density estimators. Consequences of the new inequality will be presented. In particular, we show that this techniq...
Stochastic Mirror Descent with Inexact Prox-Mapping in Density
Appendix A: Strong convexity. As we discussed, the posterior from Bayes's rule can be viewed as the optimum of the optimization problem in Eq. (1). We will show that the objective function is strongly convex w.r.t. the KL-divergence. Proof of Lemma 1. The lemma follows directly from the generalized Pythagoras theorem for Bregman divergences. In particular, for the KL-divergence, we have KL(q_1 || q) = KL(q ...
Information Theoretic Kernel Integration
In this paper we consider a novel information-theoretic approach to multiple kernel learning based on minimising a Kullback-Leibler (KL) divergence between the output kernel matrix and the input kernel matrix. There are two formulations, which we refer to as MKLdiv-dc and MKLdiv-conv. We propose to solve MKLdiv-dc by a difference of convex (DC) programming method and MKLdiv-conv by a projected gr...
Minimum Entropy Rate Simplification of Stochastic Processes: Supplemental Material
A.1 Gaussian MERS Solution. We first consider the purely nondeterministic case; the result is easily extended to arbitrary stationary and ergodic Gaussian processes using the Wold decomposition. Let X and X̃ be two discrete-time stationary and ergodic purely nondeterministic univariate Gaussian processes, with spectral power density functions $R_X(e^{j\omega})$ and $R_{\tilde{X}}(e^{j\omega})$, respectively. These are by neces...
Journal: IEEE Trans. Information Theory
Volume: 60, Issue: -
Pages: -
Publication date: 2014