On the Effective VC Dimension

Author

  • Corinna Cortes
Abstract

The very idea of an "Effective Vapnik-Chervonenkis (VC) dimension" (Vapnik, Levin and Le Cun, 1993) relies on the hypothesis that the relation between the generalization error and the number of training examples can be expressed by a formula algebraically similar to the VC bound. This hypothesis calls for a serious discussion since the traditional VC bound widely overestimates the generalization error. In this paper we describe an algorithm- and data-dependent measure of capacity. We derive a confidence interval on the difference between the training error and the generalization error. This confidence interval is much tighter than the traditional VC bound. A simple change in the formulation of the problem yields this extra accuracy: our confidence interval bounds the error difference between a training set and a test set, rather than the error difference between a training set and some hypothetical ground truth. This "transductive" approach allows for deriving a data- and algorithm-dependent confidence interval.
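
For context (this formula is not part of the abstract above), the "traditional VC bound" referred to here is usually quoted in the following standard form due to Vapnik; the bound derived in the paper itself instead compares a training set with a test set:

% Standard VC bound: with probability at least 1 - \eta, for a hypothesis
% class of VC dimension h trained on l examples, the generalization error
% R(\alpha) is bounded in terms of the empirical error R_emp(\alpha) by
\[
R(\alpha) \;\le\; R_{\mathrm{emp}}(\alpha)
  \;+\; \sqrt{\frac{h\left(\ln\frac{2l}{h} + 1\right) - \ln\frac{\eta}{4}}{l}} .
\]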


Related Articles

Bayesian Classifiers Are Large Margin Hyperplanes in a Hilbert Space

It is often claimed that one of the main distinctive features of Bayesian Learning Algorithms for neural networks is that they don't simply output one hypothesis, but rather an entire distribution of probability over an hypothesis set: the Bayes posterior. An alternative perspective is that they output a linear combination of classifiers, whose coefficients are given by Bayes' theorem. This can be ...
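
As a brief aside (not taken from the cited abstract), the "linear combination of classifiers" reading can be written as the Bayes-averaged decision rule, with the combination coefficients given by the posterior obtained from Bayes' theorem:

% Bayes voting over a hypothesis set H of binary classifiers h(x) in {-1,+1}:
% the output is a weighted majority vote whose weights are posterior probabilities.
\[
f_{\mathrm{Bayes}}(x) \;=\; \operatorname{sign}\!\Big(\sum_{h \in \mathcal{H}} P(h \mid D)\, h(x)\Big),
\qquad
P(h \mid D) \;=\; \frac{P(D \mid h)\, P(h)}{\sum_{h' \in \mathcal{H}} P(D \mid h')\, P(h')} .
\]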


Bayesian Voting Schemes as Large Margin Classifiers

It is often claimed that one of the main distinctive features of Bayesian Learning Algorithms for neural networks is that they don't simply output one hypothesis, but rather an entire distribution of probability over an hypothesis set: the Bayes posterior. An alternative perspective is that they output a linear combination of classifiers, whose coefficients are given by Bayes' theorem. This can be ...


Error Bounds for Real Function Classes Based on Discretized Vapnik-Chervonenkis Dimensions

The Vapnik-Chervonenkis (VC) dimension plays an important role in statistical learning theory. In this paper, we propose the discretized VC dimension obtained by discretizing the range of a real function class. Then, we point out that Sauer's Lemma is valid for the discretized VC dimension. We group the real function classes having infinite VC dimension into four categories by using the dis...
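
For reference (an aside, not part of the cited abstract), Sauer's Lemma in its standard form bounds the growth function of a class with finite VC dimension d; this is the property being carried over to the discretized setting:

% Sauer's Lemma: the growth function of a class of VC dimension d is
% polynomial in the sample size m once m exceeds d.
\[
\Pi_{\mathcal{H}}(m) \;\le\; \sum_{i=0}^{d} \binom{m}{i} \;\le\; \left(\frac{e m}{d}\right)^{d}
\quad \text{for } m \ge d .
\]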


On the VC-Dimension of Univariate Decision Trees

In this paper, we give and prove lower bounds of the VC-dimension of the univariate decision tree hypothesis class. The VC-dimension of the univariate decision tree depends on the VC-dimension values of its subtrees and the number of inputs. In our previous work (Aslan et al., 2009), we proposed a search algorithm that calculates the VC-dimension of univariate decision trees exhaustively. Using...


COS 511: Theoretical Machine Learning

The dot sign means inner product. If b is forced to be 0, the VC-dimension reduces to n. It is often the case that the VC-dimension is equal to the number of free parameters of a concept (for example, a rectangle's parameters are its topmost, bottommost, leftmost and rightmost bounds, and its VC-dimension is 4). However, this is not always true; there exist concepts with 1 parameter but an infin...
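
A standard illustration of that last point (an aside, not from the lecture notes themselves) is the one-parameter family of sine classifiers, which nevertheless has infinite VC dimension:

% A single real parameter \omega, yet the class shatters arbitrarily large
% point sets: for any m and any labels y_i in {-1,+1}, the points
% x_i = 10^{-i}, i = 1, ..., m, are shattered by choosing
% \omega = \pi (1 + \sum_i ((1 - y_i)/2) 10^i).
\[
\mathcal{H} \;=\; \big\{\, x \mapsto \operatorname{sign}(\sin(\omega x)) : \omega \in \mathbb{R} \,\big\},
\qquad \mathrm{VCdim}(\mathcal{H}) = \infty .
\]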




Publication date: 1994