PAC Learning with Irrelevant Attributes
Authors
Abstract
We consider the problem of learning in the presence of irrelevant attributes in Valiant's PAC model [V84]. In the PAC model, the goal of the learner is to produce an approximately correct hypothesis from random sample data. If the number of relevant attributes in the target function is small, it may be desirable to produce a hypothesis that also depends on only a small number of variables. Haussler [H88] previously considered the problem of learning monomials of a small number of variables. He showed that the greedy set cover approximation algorithm can be used as a polynomial-time Occam algorithm for learning monomials on r of n variables. It outputs a monomial on $r(\ln q + 1)$ variables, where q is the number of negative examples in the sample. We extend this result by showing that there is a polynomial-time Occam algorithm for learning k-term DNF formulas depending on r of n variables that outputs a DNF formula depending on $O(r^k \log^k q)$ variables, where q is the number of negative examples in the sample. We also give a polynomial-time Occam algorithm for learning decision lists (sometimes called 1-decision lists) with k alternations. It outputs a decision list with k alternations depending on $O(r^k \log^k m)$ variables, where m is the size of the sample. Using recent non-approximability techniques, ... and Tromp [HJLT94] have shown that, unless $NP \subseteq DTIME[2^{\mathrm{poly}(\log n)}]$, decision lists with k alternations cannot be approximated within a multiplicative factor of $\log^k n$, and decision lists with an unbounded number of alternations cannot be approximated in polynomial time within a multiplicative factor of $2^{\log^{\delta} n}$ for any $\delta < 1$.
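The greedy set cover idea attributed to Haussler above can be illustrated with a minimal sketch. It is not taken from the paper: the function names `greedy_monomial` and `evaluate`, the representation of examples as tuples of booleans, and the tie-breaking rule are all assumptions made for illustration. Each literal that is true on every positive example is treated as a "set" that covers the negative examples it falsifies, and literals are picked greedily until every negative example is covered.

```python
from typing import List, Sequence, Tuple

# A literal is (variable index, required value); hypothetical representation.
Literal = Tuple[int, bool]


def greedy_monomial(positives: Sequence[Sequence[bool]],
                    negatives: Sequence[Sequence[bool]],
                    n: int) -> List[Literal]:
    """Return a conjunction of literals consistent with the sample.

    Greedy set cover: the elements to cover are the negative examples,
    and each candidate literal covers the negatives on which it is false.
    """
    # Only literals satisfied by every positive example may be used.
    candidates = [(i, v) for i in range(n) for v in (False, True)
                  if all(x[i] == v for x in positives)]
    uncovered = set(range(len(negatives)))
    chosen: List[Literal] = []
    while uncovered:
        best, best_cov = None, set()
        for i, v in candidates:
            cov = {j for j in uncovered if negatives[j][i] != v}
            if len(cov) > len(best_cov):  # ties keep the earliest literal
                best, best_cov = (i, v), cov
        if best is None:
            raise ValueError("no monomial is consistent with the sample")
        chosen.append(best)
        uncovered -= best_cov
    return chosen


def evaluate(monomial: Sequence[Literal], x: Sequence[bool]) -> bool:
    """Evaluate the conjunction on a single example."""
    return all(x[i] == v for i, v in monomial)
```

On a sample labeled by a monomial over r relevant variables, every chosen literal rejects at least one remaining negative example, so the loop terminates; the $r(\ln q + 1)$ size bound quoted in the abstract is the standard greedy set cover guarantee and is not checked by this sketch.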
Similar resources
Notes on Learning with Irrelevant Attributes in the PAC Model
In these notes, we sketch some of our work on learning with irrelevant attributes in Valiant’s PAC model [V84]. In the PAC model, the goal of the learner is to produce an approximately correct hypothesis from random sample data. If the number of relevant attributes in the target function is small, it may be desirable to produce a hypothesis that also depends on only a small number of variables....
Knowing What Doesn't Matter: Exploiting the Omission of Irrelevant Data
Most learning algorithms work most effectively when their training data contain completely specified labeled samples. In many diagnostic tasks, however, the data will include the values of only some of the attributes; we model this as a blocking process that hides the values of those attributes from the learner. While blockers that remove the values of critical attributes can handicap a learner, this p...
Exploiting the Omission of Irrelevant Data
Most learning algorithms work most effectively when their training data contain completely specified labeled samples. In many diagnostic tasks, however, the data will include the values of only some of the attributes; we model this as a blocking process that hides the values of those attributes from the learner. While blockers that remove the values of critical attributes can handicap a learner, ...
Open Problem: The Statistical Query Complexity of Learning Sparse Halfspaces
We consider the long-open problem of attribute-efficient learning of halfspaces. In this problem the learner is given random examples labeled by an unknown halfspace function f on $\mathbb{R}^n$. Further, f is r-sparse, that is, it depends on at most r out of n variables. An attribute-efficient learning algorithm is an algorithm that can output a hypothesis close to f using a polynomial in r and log n number ...
PAC-Bayes Learning of Conjunctions and Classification of Gene-Expression Data
We propose a “soft greedy” learning algorithm for building small conjunctions of simple threshold functions, called rays, defined on single real-valued attributes. We also propose a PAC-Bayes risk bound which is minimized for classifiers achieving a non-trivial tradeoff between sparsity (the number of rays used) and the magnitude of the separating margin of each ray. Finally, we test the soft g...