VC dimension

(45 minutes to learn)

Summary

In statistical learning theory, VC dimension is a measure of the complexity of a continuous hypothesis class (such as linear classifiers). While it often corresponds to the number of parameters in a model, this isn't always the case. There is a general bound on the generalization error of a classifier in terms of its error on a training set and its VC dimension.

Context

This concept has the prerequisites:

Goals

  • Know the definition of VC dimension.
  • What is the VC dimension of a linear classifier in D dimensions?
  • Give an example of a hypothesis class whose VC dimension is different from the number of parameters. (Hint: you can find one with a single parameter but infinite VC dimension.)

Core resources (read/watch one of the following)

-Free-

Coursera: Machine Learning
An online machine learning course aimed at advanced undergraduates.
Author: Pedro Domingos
Other notes:
  • Click on "Preview" to see the videos.
Stanford's Machine Learning lecture notes
Lecture notes for Stanford's machine learning course, aimed at graduate and advanced undergraduate students.
Author: Andrew Y. Ng
The Elements of Statistical Learning
A graudate-level statistical learning textbook with a focus on frequentist methods.
Authors: Trevor Hastie,Robert Tibshirani,Jerome Friedman

-Paid-

See also