Statistical Query Learning (1993; Kearns)
Authors
Abstract
The problem deals with learning {−1, +1}-valued functions from random labeled examples in the presence of random noise in the labels. In the random classification noise model of Angluin and Laird [1], the label of each example given to the learning algorithm is flipped randomly and independently with some fixed probability η called the noise rate. This model extends Valiant's PAC model [14] and formalizes the simplest type of white label noise. Robustness to this relatively benign noise is an important goal in the design of learning algorithms. Kearns defined a powerful and convenient framework for constructing noise-tolerant algorithms based on statistical queries. Statistical query (SQ) learning is a natural restriction of PAC learning that models algorithms which use statistical properties of a data set rather than individual examples. Kearns demonstrated that any learning algorithm based on statistical queries can be automatically converted into one that learns in the presence of random classification noise of any rate smaller than the information-theoretic barrier of 1/2. This result was used to give the first noise-tolerant algorithms for a number of important learning problems. In fact, virtually all known noise-tolerant PAC algorithms were either obtained from SQ algorithms or can easily be cast into the SQ model.
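The noise-correction idea behind this conversion can be sketched briefly. For a query χ(x, ℓ), write A = E[χ(x, f(x))] for the clean value and N = E[χ(x, y)] for its value under labels flipped with rate η. Then N = (1 − η)A + ηB where B = E[χ(x, −f(x))], and since S = A + B = E[χ(x, +1) + χ(x, −1)] does not depend on the true labels, A = (N − ηS)/(1 − 2η). A minimal Python sketch of this correction, on a toy target (the function names and the example target are illustrative assumptions, not from the source):

```python
import random

def corrected_query(chi, noisy_examples, eta):
    """Estimate the clean value A = E[chi(x, f(x))] from eta-noisy examples.

    Uses the correction N = (1 - eta) * A + eta * B, where
    S = A + B is label-independent, so A = (N - eta * S) / (1 - 2 * eta).
    """
    n = len(noisy_examples)
    N = sum(chi(x, y) for x, y in noisy_examples) / n
    S = sum(chi(x, +1) + chi(x, -1) for x, _ in noisy_examples) / n
    return (N - eta * S) / (1 - 2 * eta)

# Toy demo: target f(x) = x[0], query chi(x, l) = x[0] * l,
# whose clean value E[x[0] * f(x)] is exactly 1.
random.seed(0)
eta = 0.3
data = []
for _ in range(100_000):
    x = [random.choice([-1, 1]) for _ in range(3)]
    y = x[0] if random.random() >= eta else -x[0]  # flip label with prob. eta
    data.append((x, y))

est = corrected_query(lambda x, l: x[0] * l, data, eta)
```

For η = 0.3 the uncorrected average of χ over the noisy sample concentrates around 1 − 2η = 0.4, while the corrected estimate recovers the clean value 1 up to sampling error.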
Similar resources
Computational Bounds on Statistical Query Learning
We study the complexity of learning in Kearns’ well-known statistical query (SQ) learning model (Kearns, 1993). A number of previous works have addressed the definition and estimation of the information-theoretic bounds on the SQ learning complexity, in other words, bounds on the query complexity. Here we give the first strictly computational upper and lower bounds on the complexity of several ...
General Bounds on Statistical Query Learning and PAC Learning with Noise via Hypothesis Bounding
We derive general bounds on the complexity of learning in the Statistical Query model and in the PAC model with classification noise. We do so by considering the problem of boosting the accuracy of weak learning algorithms which fall within the Statistical Query model. This new model was introduced by Kearns [12] to provide a general framework for efficient PAC learning in the presence of class...
Noise-tolerant learning, the parity problem, and the ...
We describe a slightly sub-exponential time algorithm for learning parity functions in the presence of random classification noise. By applying this algorithm to the restricted case of parity functions that depend on only the first O(log n log log n) bits of input, we achieve the first known instance of a polynomial-time noise-tolerant learning algorithm for a concept class that is provably not learn...
Learning from Positive and Unlabeled Examples
In many machine learning settings, labeled examples are difficult to collect while unlabeled data are abundant. Also, for some binary classification problems, positive examples which are elements of the target concept are available. Can these additional data be used to improve accuracy of supervised learning algorithms? We investigate in this paper the design of learning algorithms from positiv...
Statistical Active Learning Algorithms
We describe a framework for designing efficient active learning algorithms that are tolerant to random classification noise and differentially-private. The framework is based on active learning algorithms that are statistical in the sense that they rely on estimates of expectations of functions of filtered random examples. It builds on the powerful statistical query framework of Kearns [30]. We...