An Information-Theoretic Route from Generalization in Expectation to Generalization in Probability

Author

  • Ibrahim M. Alabdulmohsin
Abstract

One fundamental goal of any learning algorithm is to mitigate its risk of overfitting. Mathematically, this requires that the learning algorithm enjoy a small generalization risk, which is defined either in expectation or in probability. Both types of generalization are commonly used in the literature. For instance, generalization in expectation has been used to analyze algorithms such as ridge regression and SGD, whereas generalization in probability is used in VC theory, among others. Recently, a third notion of generalization has been studied, called uniform generalization, which requires that the generalization risk vanish uniformly in expectation across all bounded parametric losses. It has been shown that uniform generalization is, in fact, equivalent to an information-theoretic stability constraint, and that it recovers classical results in learning theory. It is achievable under various settings, such as sample compression schemes, finite hypothesis spaces, finite domains, and differential privacy. However, the relationship between uniform generalization and concentration remained unknown. In this paper, we answer this question by proving that, while generalization in expectation does not imply generalization in probability, uniform generalization in expectation does imply concentration. We establish a chain rule for the uniform generalization risk of the composition of hypotheses and use it to derive a large deviation bound. Finally, we prove that the bound is tight.

Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS) 2017, Fort Lauderdale, Florida, USA. JMLR: W&CP volume 54. Copyright 2017 by the author(s).
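
The three notions of generalization contrasted in the abstract can be written side by side. The following is a sketch under assumed notation, none of which appears on this page: a sample $S = (Z_1, \dots, Z_m)$ drawn i.i.d. from a distribution $p(z)$, a hypothesis $H$ returned by the learning algorithm on $S$, and a loss $\ell(\cdot\,; H)$ taking values in $[0, 1]$. The paper's own definitions may differ in detail.

```latex
% Assumed notation (not taken from this page):
%   empirical risk and true risk of the hypothesis H on the sample S
\[
  \hat{L}_S(H) = \frac{1}{m} \sum_{i=1}^{m} \ell(Z_i; H),
  \qquad
  L(H) = \mathbb{E}_{Z \sim p(z)} \, \ell(Z; H).
\]

% Generalization in expectation: the expected gap vanishes.
\[
  \Bigl| \, \mathbb{E}_{S, H} \bigl[ \hat{L}_S(H) - L(H) \bigr] \Bigr|
  \;\longrightarrow\; 0
  \quad \text{as } m \to \infty.
\]

% Generalization in probability: the gap concentrates around zero.
\[
  \Pr\Bigl( \bigl| \hat{L}_S(H) - L(H) \bigr| > \epsilon \Bigr)
  \;\longrightarrow\; 0
  \quad \text{as } m \to \infty, \ \text{for every } \epsilon > 0.
\]

% Uniform generalization: the expected gap vanishes uniformly
% over all bounded parametric losses.
\[
  \sup_{\ell \,:\, \mathcal{Z} \times \mathcal{H} \to [0, 1]}
  \Bigl| \, \mathbb{E}_{S, H} \bigl[ \hat{L}_S(H) - L(H) \bigr] \Bigr|
  \;\longrightarrow\; 0
  \quad \text{as } m \to \infty.
\]
```

In this notation, the abstract's claim is that the second condition does not follow from the first, but does follow from the third.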


Related resources

Uniform Generalization, Concentration, and Adaptive Learning

One fundamental goal in any learning algorithm is to mitigate its risk for overfitting. Mathematically, this requires that the learning algorithm enjoys a small generalization risk, which is defined either in expectation or in probability. Both types of generalization are commonly used in the literature. For instance, generalization in expectation has been used to analyze algorithms, such as ri...


AN INTEGRAL DEPENDENCE IN MODULES OVER COMMUTATIVE RINGS

In this paper, we give a generalization of the integral dependence from rings to modules. We study the stability of the integral closure with respect to various module theoretic constructions. Moreover, we introduce the notion of integral extension of a module and prove the Lying over, Going up and Going down theorems for modules.


FUZZY INFORMATION AND STOCHASTICS

In applications there occur different forms of uncertainty. The two most important types are randomness (stochastic variability) and imprecision (fuzziness). In modelling, the dominating concept to describe uncertainty is using stochastic models which are based on probability. However, fuzziness is not stochastic in nature and therefore it is not considered in probabilistic models. Since many years t...


A Study on the Commentary of Historical Verses with an Emphasis on the Rule of Al-Ibrah

One of the prevalent rules for the commentary of historical verses that have a specific occasion of revelation and refer to a specific time and place is the rule of Al-Ibrah, stated as: consider the universality of the wording, not the particularity of the occasion. The source of this rule is the verses that have universal wording and a particular occasion. The referent of the...


A generalization of the probability that the commutator of two group elements is equal to a given element

The probability that the commutator of two group elements is equal to a given element was introduced in the literature a few years ago. Several authors have investigated this notion with methods of representation theory and with combinatorial techniques. Here we illustrate that a wider context may be considered and show some structural restrictions on the group.


Publication date: 2017