decision trees

(2.9 hours to learn)


Decision trees are a kind of tree-structured model used in machine learning and data mining. Each leaf node corresponds to a prediction, and each internal node divides the data points into two or more sets depending on the value of one of the input variables. Decision trees are widely used because of their simplicity and their ability to handle heterogeneous input features.
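
As a rough illustration of the structure described above, here is a minimal sketch of a decision tree as a data structure. The class names (Leaf, Split), the predict function, and the "age"/"income" features are all made up for this example, and it assumes binary splits on numeric thresholds, which is only one of several possible split types.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    prediction: str  # value returned for every point that reaches this leaf


@dataclass
class Split:
    feature: str      # name of the input variable tested at this node
    threshold: float  # points with value <= threshold go left, others go right
    left: "Node"
    right: "Node"


Node = Union[Leaf, Split]


def predict(node, x):
    """Walk from the root to a leaf, testing one input variable per internal node."""
    while isinstance(node, Split):
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.prediction


# Hypothetical hand-built tree: first test "age", then "income".
tree = Split("age", 30.0,
             Leaf("no"),
             Split("income", 50.0, Leaf("no"), Leaf("yes")))

print(predict(tree, {"age": 25, "income": 80}))
print(predict(tree, {"age": 40, "income": 80}))
```

Each prediction only looks at the variables along one root-to-leaf path, which is part of why trees cope well with inputs of mixed types and scales.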


-this concept has no prerequisites-


  • Know what a decision tree is.
  • Give examples of functions which decision trees can't represent compactly (e.g. majority, parity).
  • Be able to fit a decision tree using a recursive greedy strategy.
  • Know what the information gain criterion is, and why it produces better splits than classification accuracy.
  • Be aware that decision trees can be unstable, in that small changes in the training data can change the structure of the tree dramatically.
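
The goals on greedy fitting and information gain can be sketched in a few lines. The example below is a minimal sketch and not a reference implementation: all function names are hypothetical, features are assumed to be binary (0/1), and the class counts in the two candidate splits are chosen purely for illustration. Both candidate splits reduce the misclassification rate by the same amount, but information gain prefers the one that produces a pure child, which leaves less work for later splits.

```python
import math
from collections import Counter


def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


def error_rate(labels):
    # Fraction misclassified when predicting the majority class.
    return 1 - Counter(labels).most_common(1)[0][1] / len(labels)


def gain(impurity, parent, children):
    # Reduction in impurity from parent to the size-weighted average of the children.
    n = len(parent)
    return impurity(parent) - sum(len(c) / n * impurity(c) for c in children)


# Illustrative counts: 400 positives and 400 negatives, split two ways.
parent = [1] * 400 + [0] * 400
split_a = [[1] * 300 + [0] * 100, [1] * 100 + [0] * 300]
split_b = [[1] * 200 + [0] * 400, [1] * 200]  # second child is pure

for name, split in [("A", split_a), ("B", split_b)]:
    print(name, "error reduction:", round(gain(error_rate, parent, split), 4),
          "information gain:", round(gain(entropy, parent, split), 4))


def fit(rows, labels, depth=0, max_depth=3):
    """Recursive greedy fit: pick the feature with the highest information gain, then recurse."""
    if len(set(labels)) == 1 or depth == max_depth:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority label
    best = None
    for j in range(len(rows[0])):
        left = [i for i, r in enumerate(rows) if r[j] == 0]
        right = [i for i, r in enumerate(rows) if r[j] == 1]
        if not left or not right:
            continue  # this feature does not separate the points
        g = gain(entropy, labels, [[labels[i] for i in left], [labels[i] for i in right]])
        if best is None or g > best[0]:
            best = (g, j, left, right)
    if best is None:
        return Counter(labels).most_common(1)[0][0]
    _, j, left, right = best
    return (j,
            fit([rows[i] for i in left], [labels[i] for i in left], depth + 1, max_depth),
            fit([rows[i] for i in right], [labels[i] for i in right], depth + 1, max_depth))


def predict(tree, row):
    # Internal nodes are (feature_index, subtree_if_0, subtree_if_1); leaves are labels.
    while isinstance(tree, tuple):
        j, if_zero, if_one = tree
        tree = if_one if row[j] == 1 else if_zero
    return tree


# Tiny usage example: learn the AND of two binary features.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]
tree = fit(rows, labels)
print(tree)
```

The greedy strategy never revisits a split once it is chosen, so the fitted tree depends heavily on which feature happens to win at the root.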

Core resources (read/watch one of the following)


The Elements of Statistical Learning
A graduate-level statistical learning textbook with a focus on frequentist methods.
Authors: Trevor Hastie, Robert Tibshirani, Jerome Friedman
Other notes:
  • Read the introductory chapters if you're not familiar with the basic machine learning setup.
Coursera: Machine Learning
An online machine learning course aimed at advanced undergraduates.
Author: Pedro Domingos
Other notes:
  • The rest of the "decision tree induction" section is optional but useful.
  • Watch the Week One lectures if you're not familiar with the basic machine learning setup.
  • Click on "Preview" to see the videos.


Supplemental resources (the following are optional, but you may find them useful)


See also

  • Two common algorithms for combining decision trees are:
    • bagging, which reduces variance by averaging trees fit to randomly resampled training sets
    • boosting, which makes the model class more expressive