boosting as optimization

(1.4 hours to learn)

Summary

AdaBoost can be interpreted as a sequential procedure for minimizing the exponential loss on the training set with respect to the coefficients of a particular basis function expansion. This interpretation leads to generalizations of the algorithm to other loss functions.
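For orientation, the usual argument runs roughly as follows (this is our own sketch of the standard derivation, not a substitute for the resources below). The ensemble is an additive expansion in base classifiers h_j(x) taking values in {-1, +1}, and each round adds one term while the earlier terms are held fixed:

    L(f_m) = \sum_{i=1}^{N} \exp\big(-y_i f_m(x_i)\big),
    \qquad f_m(x) = f_{m-1}(x) + \alpha_m h_m(x).

Writing w_i^{(m)} = \exp\big(-y_i f_{m-1}(x_i)\big), the round-m objective becomes
\sum_i w_i^{(m)} \exp\big(-\alpha_m y_i h_m(x_i)\big). Minimizing over h_m selects the
base classifier with the smallest weighted error \epsilon_m, and minimizing over
\alpha_m gives \alpha_m = \tfrac{1}{2}\log\tfrac{1-\epsilon_m}{\epsilon_m}, which is
exactly the AdaBoost update.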

Context

This concept has the prerequisites:

Goals

  • Derive AdaBoost as a sequential procedure to minimize the exponential loss on the training set.
  • Based on this analysis, explain why AdaBoost might be especially sensitive to mislabeled training examples.
  • Understand how the basic boosting procedure can be generalized to other loss functions (a code sketch of the exponential-loss case follows this list).
    • Why do we often re-estimate the weights of the base classifiers for more general boosting algorithms, but not for AdaBoost?
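The same stagewise view is easy to see in code. The sketch below is a minimal, self-contained AdaBoost with decision stumps (an illustration of our own; the names fit_stump, adaboost, and predict are made up for this example), written so that the exponential-loss weight update is explicit. Note how misclassified points have their weights multiplied by exp(alpha) each round, which is why persistently misclassified (e.g. mislabeled) examples can come to dominate training.

    import numpy as np

    def fit_stump(X, y, w):
        """Return (error, feature, threshold, sign) of the best weighted stump."""
        best = None
        for j in range(X.shape[1]):
            for thresh in np.unique(X[:, j]):
                for sign in (1, -1):
                    pred = sign * np.where(X[:, j] <= thresh, 1, -1)
                    err = np.sum(w * (pred != y))     # w sums to 1, so this is the weighted error
                    if best is None or err < best[0]:
                        best = (err, j, thresh, sign)
        return best

    def adaboost(X, y, n_rounds=50):
        """AdaBoost as stagewise minimization of the exponential loss (y in {-1, +1})."""
        n = len(y)
        w = np.full(n, 1.0 / n)                       # proportional to exp(-y_i * f(x_i))
        ensemble = []
        for _ in range(n_rounds):
            err, j, thresh, sign = fit_stump(X, y, w)
            alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))   # closed-form coefficient
            ensemble.append((alpha, j, thresh, sign))
            pred = sign * np.where(X[:, j] <= thresh, 1, -1)
            # Exponential-loss weight update: misclassified points are multiplied by
            # exp(alpha), so examples the ensemble keeps getting wrong (including
            # mislabeled ones) receive exponentially growing weight.
            w = w * np.exp(-alpha * y * pred)
            w = w / np.sum(w)
        return ensemble

    def predict(ensemble, X):
        """Sign of the weighted vote f(x) = sum_m alpha_m * h_m(x)."""
        f = sum(alpha * sign * np.where(X[:, j] <= thresh, 1, -1)
                for alpha, j, thresh, sign in ensemble)
        return np.sign(f)

    # Toy usage (hypothetical data):
    #   X = np.random.randn(200, 2)
    #   y = np.sign(X[:, 0] + X[:, 1])
    #   model = adaboost(X, y, n_rounds=20)
    #   train_acc = np.mean(predict(model, X) == y)

One design note, relevant to the sub-question above: the exponential loss admits the closed-form coefficient alpha_m computed in the sketch, whereas for most other losses no such closed form exists, which is one reason more general boosting procedures typically re-estimate the base classifiers' coefficients numerically at each stage.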

Core resources (read/watch one of the following)

-Free-

-Paid-

See also

-No Additional Notes-