The Rate of Convergence of AdaBoost

Authors

  • Indraneel Mukherjee
  • Cynthia Rudin
  • Robert E. Schapire
Abstract

The AdaBoost algorithm of Freund and Schapire (1997) was designed to combine many “weak” hypotheses that perform slightly better than a random guess into a “strong” hypothesis that has very low error. We study the rate at which AdaBoost iteratively converges to the minimum of the “exponential loss”. Our proofs do not require a weak-learning assumption, nor do they require that minimizers of the exponential loss are finite. Specifically, our first result shows that the exponential loss of AdaBoost’s computed parameter vector will be at most ε more than that of any parameter vector of ℓ1-norm bounded by B, within a number of rounds that is polynomial in B and 1/ε. We also provide lower-bound examples showing that a polynomial dependence on these parameters is necessary. Our second result is that within C/ε iterations, AdaBoost achieves a value of the exponential loss that is at most ε more than the best possible value, where C depends on the dataset. We show that this dependence of the rate on ε is optimal up to constant factors, i.e., at least Ω(1/ε) rounds are necessary to achieve within ε of the optimal exponential loss.
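
The exponential loss referred to throughout is the objective that AdaBoost can be viewed as minimizing by greedy coordinate descent. The following is a minimal sketch of that view (not the authors' code; the data layout, the weak-hypothesis matrix H, and all names are illustrative assumptions). It records the exponential loss after each round, i.e., the quantity whose convergence rate the paper bounds.

    # Minimal AdaBoost sketch (illustrative only): coordinate descent on the
    # exponential loss over a fixed, finite pool of weak hypotheses.
    import numpy as np

    def exponential_loss(margins):
        """Average exponential loss (1/m) * sum_i exp(-y_i * f(x_i))."""
        return np.mean(np.exp(-margins))

    def adaboost(H, y, n_rounds):
        """H[i, j] = h_j(x_i) in {-1, +1}; y in {-1, +1}^m. Returns per-round losses."""
        m, n = H.shape
        margins = np.zeros(m)              # y_i * f(x_i) for the current combination f
        losses = []
        for _ in range(n_rounds):
            w = np.exp(-margins)
            w /= w.sum()                   # AdaBoost's distribution D_t over examples
            edges = (w * y) @ H            # edge of each weak hypothesis under D_t
            j = int(np.argmax(np.abs(edges)))
            gamma = np.clip(edges[j], -1 + 1e-12, 1 - 1e-12)   # numerical guard only
            alpha = 0.5 * np.log((1 + gamma) / (1 - gamma))    # AdaBoost's step size
            margins += alpha * y * H[:, j]
            losses.append(exponential_loss(margins))
        return losses

    # Toy usage: 4 examples, 3 candidate weak hypotheses.
    H = np.array([[+1, -1, +1],
                  [+1, +1, -1],
                  [-1, +1, +1],
                  [-1, -1, -1]], dtype=float)
    y = np.array([+1, +1, -1, -1], dtype=float)
    print(adaboost(H, y, n_rounds=5))      # exponential loss decreases each round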

Related Articles

Precise Statements of Convergence for AdaBoost and arc-gv

We present two main results, the first concerning Freund and Schapire’s AdaBoost algorithm, and the second concerning Breiman’s arc-gv algorithm. Our discussion of AdaBoost revolves around a circumstance called the case of “bounded edges”, in which AdaBoost’s convergence properties can be completely understood. Specifically, our first main result is that if AdaBoost’s “edge” values fall into a ...
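
For context, the “edge” of a weak hypothesis is the standard AdaBoost quantity measuring its advantage over random guessing under the current example weights. In typical notation (assumed here, not quoted from the article), with distribution D_t over the m examples,

    \gamma_t = \sum_{i=1}^{m} D_t(i)\, y_i\, h_t(x_i),
    \qquad
    \alpha_t = \tfrac{1}{2} \ln\frac{1 + \gamma_t}{1 - \gamma_t},

and the bounded-edges circumstance concerns the interval into which these γ_t values fall.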


Boosting Based on a Smooth Margin

We study two boosting algorithms, Coordinate Ascent Boosting and Approximate Coordinate Ascent Boosting, which are explicitly designed to produce maximum margins. To derive these algorithms, we introduce a smooth approximation of the margin that one can maximize in order to produce a maximum margin classifier. Our first algorithm is simply coordinate ascent on this function, involving a line se...
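
As a rough sketch of the kind of smooth margin function involved (this particular form follows standard boosting notation and is an assumption here, not a quotation from the article): for a combination f_λ(x) = Σ_j λ_j h_j(x) of weak hypotheses with nonnegative weights λ, one smooth surrogate for the minimum margin is

    G(\lambda) = \frac{-\ln\!\Big(\sum_{i=1}^{m} \exp\big(-y_i f_\lambda(x_i)\big)\Big)}{\|\lambda\|_1},

which never exceeds the minimum normalized margin min_i y_i f_λ(x_i) / ‖λ‖_1 and approaches it as ‖λ‖_1 grows, so coordinate ascent on G pushes the margin up smoothly.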


The Convergence Rate of AdaBoost

We pose the problem of determining the rate of convergence at which AdaBoost minimizes exponential loss. Boosting is the problem of combining many “weak,” high-error hypotheses to generate a single “strong” hypothesis with very low error. The AdaBoost algorithm of Freund and Schapire (1997) is shown in Figure 1. Here we are given m labeled training examples (x1, y1), ..., (xm, ym) where the ...
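
To complete the notation the excerpt begins to set up (under standard assumptions about AdaBoost's setup, not quoted from the article): with weak hypotheses h_1, ..., h_N taking values in {-1, +1} and a combined hypothesis f_λ(x) = Σ_j λ_j h_j(x), the exponential loss whose minimization rate is at issue is

    L(\lambda) = \frac{1}{m} \sum_{i=1}^{m} \exp\big(-y_i f_\lambda(x_i)\big),

and AdaBoost can be read as greedy coordinate descent on L, updating one coordinate λ_j per round.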


SelfieBoost: A Boosting Algorithm for Deep Learning

We describe and analyze a new boosting algorithm for deep learning called SelfieBoost. Unlike other boosting algorithms, such as AdaBoost, which construct ensembles of classifiers, SelfieBoost boosts the accuracy of a single network. We prove a log(1/ε) convergence rate for SelfieBoost under an “SGD success” assumption which seems to hold in practice.


Some Open Problems in Optimal AdaBoost and Decision Stumps

The significance of the study of the theoretical and practical properties of AdaBoost is unquestionable, given its simplicity, wide practical use, and effectiveness on real-world datasets. Here we present a few open problems regarding the behavior of “Optimal AdaBoost,” a term coined by Rudin, Daubechies, and Schapire in 2004 to label the simple version of the standard AdaBoost algorithm in whi...




Publication date: 2011