On the Effective VC Dimension
Author
Abstract
The very idea of an "Effective Vapnik-Chervonenkis (VC) dimension" (Vapnik, Levin and Le Cun, 1993) relies on the hypothesis that the relation between the generalization error and the number of training examples can be expressed by a formula algebraically similar to the VC bound. This hypothesis calls for a serious discussion, since the traditional VC bound widely overestimates the generalization error. In this paper we describe an algorithm- and data-dependent measure of capacity. We derive a confidence interval on the difference between the training error and the generalization error. This confidence interval is much tighter than the traditional VC bound. A simple change in the formulation of the problem yields this extra accuracy: our confidence interval bounds the error difference between a training set and a test set, rather than the error difference between a training set and some hypothetical ground truth. This "transductive" approach allows a data- and algorithm-dependent confidence interval to be derived.
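To make the looseness of the traditional bound concrete, here is a minimal sketch that evaluates one standard form of the classical VC bound; the values of h, n, and eta below are illustrative assumptions, not figures from the paper.

```python
import math

def vc_bound(train_error, h, n, eta=0.05):
    """One standard form of the classical Vapnik bound: with probability
    at least 1 - eta over the training sample,
        generalization_error <= train_error
            + sqrt((h * (ln(2n/h) + 1) - ln(eta / 4)) / n),
    where h is the VC dimension and n the number of training examples."""
    eps = math.sqrt((h * (math.log(2 * n / h) + 1) - math.log(eta / 4)) / n)
    return train_error + eps

# Illustrative numbers only: VC dimension 100, 10,000 examples, 5% training error.
print(vc_bound(train_error=0.05, h=100, n=10_000))  # ~0.30
```

Even in this favorable regime the bound concedes roughly 25 extra percentage points of error, which is the kind of overestimation the transductive formulation is designed to avoid.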
Similar Resources
Bayesian Classifiers Are Large Margin Hyperplanes in a Hilbert Space
It is often claimed that one of the main distinctive features of Bayesian learning algorithms for neural networks is that they don't simply output one hypothesis, but rather an entire probability distribution over a hypothesis set: the Bayes posterior. An alternative perspective is that they output a linear combination of classifiers, whose coefficients are given by Bayes' theorem. This can be ...
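A minimal sketch of this alternative perspective, assuming a toy hypothesis set of threshold classifiers and an error-based stand-in for the likelihood (both illustrative assumptions, not the construction of the paper): the prediction is a linear combination of the individual classifiers weighted by their posterior probabilities.

```python
import numpy as np

def posterior_weights(hypotheses, X, y):
    """Posterior P(h | D) proportional to P(D | h) P(h), with a uniform prior
    and, as an illustrative assumption, exp(-n * training_error) standing in
    for the likelihood P(D | h)."""
    errors = np.array([np.mean(h(X) != y) for h in hypotheses])
    post = np.exp(-len(y) * errors)  # uniform prior cancels after normalization
    return post / post.sum()

def bayes_vote(hypotheses, weights, x):
    """The prediction is a posterior-weighted linear combination of classifiers."""
    return np.sign(sum(w * h(x) for w, h in zip(weights, hypotheses)))

# Toy data on the real line, labels in {-1, +1}, threshold classifiers sign(x - t).
X = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([-1, -1, 1, 1])
hypotheses = [lambda x, t=t: np.sign(x - t) for t in (-1.5, 0.0, 1.5)]
weights = posterior_weights(hypotheses, X, y)
print(bayes_vote(hypotheses, weights, 0.5))  # 1.0: the weighted vote is positive
```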
Bayesian Voting Schemes as Large Margin Classifiers
It is often claimed that one of the main distinctive features of Bayesian learning algorithms for neural networks is that they don't simply output one hypothesis, but rather an entire probability distribution over a hypothesis set: the Bayes posterior. An alternative perspective is that they output a linear combination of classifiers, whose coefficients are given by Bayes' theorem. This can be ...
Error Bounds for Real Function Classes Based on Discretized Vapnik-Chervonenkis Dimensions
The Vapnik-Chervonenkis (VC) dimension plays an important role in statistical learning theory. In this paper, we propose the discretized VC dimension, obtained by discretizing the range of a real function class. We then point out that Sauer's Lemma remains valid for the discretized VC dimension. We group real function classes with infinite VC dimension into four categories by using the dis...
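The snippet above leans on Sauer's Lemma, which is worth stating concretely: a class of VC dimension d can realize at most the sum over i = 0..d of C(n, i) distinct labelings of n points. The sketch below simply evaluates that combinatorial bound; it does not reproduce the discretization machinery of the paper.

```python
from math import comb

def sauer_bound(n, d):
    """Sauer's Lemma: a class of VC dimension d realizes at most
    sum_{i=0}^{d} C(n, i) distinct labelings of n points, which grows
    only polynomially (like n**d), versus 2**n for an unrestricted class."""
    return sum(comb(n, i) for i in range(d + 1))

# For n <= d the bound equals 2**n; beyond that it falls behind exponentially.
print(sauer_bound(10, 3))  # 176, well below 2**10 = 1024
```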
On the VC-Dimension of Univariate Decision Trees
In this paper, we give and prove lower bounds on the VC-dimension of the univariate decision tree hypothesis class. The VC-dimension of a univariate decision tree depends on the VC-dimension values of its subtrees and on the number of inputs. In our previous work (Aslan et al., 2009), we proposed a search algorithm that calculates the VC-dimension of univariate decision trees exhaustively. Using...
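The exhaustive algorithm of Aslan et al. is not reproduced here; as a stand-in, the sketch below applies the definition directly: a point set is shattered if every labeling of it is realized by some hypothesis, and the VC-dimension is the size of the largest shattered subset. Decision stumps (depth-1 univariate trees) are used as an assumed, easily enumerable hypothesis class.

```python
from itertools import combinations, product

def stump_labelings(points):
    """All labelings of `points` realizable by a univariate decision stump,
    i.e. a depth-1 tree splitting on x > t with either leaf labeling."""
    xs = sorted(points)
    thresholds = [xs[0] - 1] + [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    realized = set()
    for t in thresholds:
        lab = tuple(1 if x > t else 0 for x in points)
        realized.add(lab)
        realized.add(tuple(1 - b for b in lab))  # swap the two leaf labels
    return realized

def shattered(points):
    """A set is shattered if every one of the 2**n labelings is realized."""
    realized = stump_labelings(points)
    return all(lab in realized for lab in product((0, 1), repeat=len(points)))

def vc_dimension(sample, max_d=4):
    """Exhaustive search: size of the largest shattered subset of `sample`."""
    best = 0
    for d in range(1, max_d + 1):
        if any(shattered(list(s)) for s in combinations(sample, d)):
            best = d
        else:
            break  # shattering is monotone: larger sets cannot succeed either
    return best

print(vc_dimension([0.0, 1.0, 2.0, 3.0]))  # 2: stumps shatter pairs, never triples
```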
COS 511: Theoretical Machine Learning
The dot sign means inner product. If b is forced to be 0, the VC-dimension reduces to n. It is often the case that the VC-dimension equals the number of free parameters of a concept (for example, a rectangle's parameters are its topmost, bottommost, leftmost, and rightmost bounds, and its VC-dimension is 4). However, this is not always true: there exist concepts with one parameter but an infinite VC-dimension ...
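The rectangle claim above can be checked by brute force. The sketch below verifies that a diamond of four points is shattered by axis-aligned rectangles, so their VC-dimension is at least 4; the standard argument that no five points can be shattered (label the four extremal points positive and the remaining point negative) is a separate step not shown here. The point set is an assumed illustration.

```python
from itertools import product

def rectangle_realizes(points, labeling):
    """An axis-aligned rectangle realizes a labeling iff the bounding box
    of the positive points contains no negative point (the bounding box is
    the smallest rectangle containing all positives)."""
    pos = [p for p, lab in zip(points, labeling) if lab]
    if not pos:
        return True  # an empty rectangle labels everything negative
    x0, x1 = min(x for x, _ in pos), max(x for x, _ in pos)
    y0, y1 = min(y for _, y in pos), max(y for _, y in pos)
    return not any(x0 <= x <= x1 and y0 <= y <= y1
                   for (x, y), lab in zip(points, labeling) if not lab)

# A diamond of 4 points: every one of the 2**4 labelings is realizable,
# so axis-aligned rectangles shatter it and their VC-dimension is >= 4.
diamond = [(0, 1), (0, -1), (1, 0), (-1, 0)]
print(all(rectangle_realizes(diamond, lab)
          for lab in product((0, 1), repeat=4)))  # True
```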