Communication Efficient Distributed Agnostic Boosting
Abstract
We consider the problem of learning from distributed data in the agnostic setting, i.e., in the presence of arbitrary forms of noise. Our main contribution is a general distributed boosting-based procedure for learning an arbitrary concept space that is simultaneously noise tolerant, communication efficient, and computationally efficient. This improves significantly over prior works that were either communication efficient only in noise-free scenarios or computationally prohibitive. Empirical results on large synthetic and real-world datasets demonstrate the effectiveness and scalability of the proposed approach.
Related resources
Distribution-Specific Agnostic Boosting
We consider the problem of boosting the accuracy of weak learning algorithms in the agnostic learning framework of Haussler (1992) and Kearns et al. (1992). Known algorithms for this problem (Ben-David et al., 2001; Gavinsky, 2002; Kalai et al., 2008) follow the same strategy as boosting algorithms in the PAC model: the weak learner is executed on the same target function but over different dis...
Distributed Learning, Communication Complexity and Privacy
We consider the problem of PAC-learning from distributed data and analyze fundamental communication complexity questions involved. We provide general upper and lower bounds on the amount of communication needed to learn well, showing that in addition to VC-dimension and covering number, quantities such as the teaching dimension and mistake bound of a class play an important role. We also present ...
Optimally-Smooth Adaptive Boosting and Application to Agnostic Learning
We describe a new boosting algorithm that is the first such algorithm to be both smooth and adaptive. These two features make possible performance improvements for many learning tasks whose solutions use a boosting technique. The boosting approach was originally suggested for the standard PAC model; we analyze possible applications of boosting in the context of agnostic learning, which is more ...
Agnostic Boosting
We extend the boosting paradigm to the realistic setting of agnostic learning, that is, to a setting where the training sample is generated by an arbitrary (unknown) probability distribution over examples and labels. We define a β-weak agnostic learner with respect to a hypothesis class F as follows: given a distribution P, it outputs some hypothesis h ∈ F whose error is at most er_P(F) + β, where er...
On Communication Complexity of Classification Problems
This work introduces a model of distributed learning in the spirit of Yao’s communication complexity model. We consider a two-party setting, where each of the players gets a list of labelled examples and they communicate in order to jointly perform some learning task. To fit naturally into the framework of learning theory, we allow the players to send each other labelled examples, where each ex...