11:00 am, Monday 19 June 2006
Three Results on the Convergence of Boosting Algorithms

Cynthia Rudin

In Caruana and Niculescu-Mizil's 2005 empirical survey of classification algorithms, boosted decision trees were rated as the best of the state-of-the-art algorithms. Boosting algorithms are clearly among the most competitive and most successful algorithms for statistical learning, yet there is still much to be understood about the convergence properties of these algorithms. I will discuss three things: first, a circumstance called the case of "bounded edges", in which AdaBoost's convergence properties can be completely determined. As far as I know, this is only one of two possible known circumstances where AdaBoost's convergence, with respect to the margin, can be analytically understood in the separable case (where the training error vanishes). Second, I will present a new result, namely a convergence rate for Breiman's Arc-Gv algorithm to a maximum margin solution. Meir and Ratsch's 2003 review paper discusses convergence rates for Arc-Gv as an interesting open problem. Both of these results (the result on AdaBoost's convergence and the convergence rate for Arc-Gv) are derived from an important tool called the "smooth margin", which is a differentiable approximation of the true margin for boosting algorithms. The "smooth margin" obeys a useful recursion relation which helps greatly in the analyses. Thirdly, I will discuss AdaBoost's ability as an algorithm for bipartite ranking. Caruana and Niculescu-Mizil have noted that AdaBoost seems surprisingly useful as a ranking algorithm. We present the analytical reason for their observation, namely that AdaBoost is just as useful for ranking as its counterpart, RankBoost. The first two results are joint work with Rob Schapire and Ingrid Daubechies, and the third result is joint with Corinna Cortes, Mehryar Mohri, and Rob Schapire.

Back