11:00 am, Monday 19 June 2006
Three Results on the Convergence of Boosting Algorithms
Cynthia Rudin
In Caruana and Niculescu-Mizil's 2005 empirical survey of
classification algorithms, boosted decision trees were rated as the
best of the state-of-the-art algorithms. Boosting algorithms are
clearly among the most competitive and most successful algorithms for
statistical learning, yet there is still much to be understood about
the convergence properties of these algorithms.
I will discuss three things: first, a circumstance called the case
of "bounded edges", in which AdaBoost's convergence properties can be
completely determined. As far as I know, this is only one of two
possible known circumstances where AdaBoost's convergence, with
respect to the margin, can be analytically understood in the separable
case (where the training error vanishes). Second, I will present a new
result, namely a convergence rate for Breiman's Arc-Gv algorithm to a
maximum margin solution. Meir and Ratsch's 2003 review paper discusses
convergence rates for Arc-Gv as an interesting open problem. Both of
these results (the result on AdaBoost's convergence and the
convergence rate for Arc-Gv) are derived from an important tool called
the "smooth margin", which is a differentiable approximation of the
true margin for boosting algorithms. The "smooth margin" obeys a
useful recursion relation which helps greatly in the analyses.
Thirdly, I will discuss AdaBoost's ability as an algorithm for
bipartite ranking. Caruana and Niculescu-Mizil have noted that
AdaBoost seems surprisingly useful as a ranking algorithm. We present
the analytical reason for their observation, namely that AdaBoost is
just as useful for ranking as its counterpart, RankBoost.
The first two results are joint work with Rob Schapire and Ingrid
Daubechies, and the third result is joint with Corinna Cortes, Mehryar
Mohri, and Rob Schapire.
Back