(45 minutes to learn)
In statistical learning theory, the VC dimension is a measure of the complexity (capacity) of a hypothesis class, such as the set of linear classifiers. While it often corresponds to the number of parameters in a model, this isn't always the case. There is a general bound on the generalization error of a classifier in terms of its error on a training set and the VC dimension of its hypothesis class.
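For concreteness, one common form of this bound (notation introduced here, not taken from the resources below) states that, with probability at least 1 − δ over n i.i.d. training examples, every hypothesis h in a class of VC dimension d satisfies

$$
R(h) \;\le\; \widehat{R}(h) \;+\; O\!\left(\sqrt{\frac{d}{n}\log\frac{n}{d} \;+\; \frac{1}{n}\log\frac{1}{\delta}}\right),
$$

where R(h) is the true (generalization) error and \widehat{R}(h) is the training error. The key point is that the penalty term grows with the VC dimension d, not with the raw parameter count.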
Goals:
- Know the definition of VC dimension.
- What is the VC dimension of a linear classifier in D dimensions? (A brute-force shattering check for the two-dimensional case is sketched after this list.)
- Give an example of a hypothesis class whose VC dimension is different from the number of parameters. (Hint: you can find one with a single parameter but infinite VC dimension.)
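To make the definition of shattering concrete, here is a minimal Python sketch (not from any of the resources below) that brute-forces whether linear classifiers can shatter a given point set. It tests every +1/−1 labeling for linear separability with an LP feasibility check via scipy.optimize.linprog; the function names linearly_separable and shattered are illustrative only.

```python
import itertools

import numpy as np
from scipy.optimize import linprog


def linearly_separable(X, y):
    """Return True if some affine classifier sign(w.x + b) realizes labels y.

    Solves an LP feasibility problem: find (w, b) with y_i * (w.x_i + b) >= 1
    for every point, which holds iff the labeling is linearly separable.
    """
    n, d = X.shape
    # Decision variables are [w_1, ..., w_d, b]; each constraint reads
    # -y_i * (w . x_i + b) <= -1.
    A_ub = -y[:, None] * np.hstack([X, np.ones((n, 1))])
    b_ub = -np.ones(n)
    res = linprog(c=np.zeros(d + 1), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * (d + 1), method="highs")
    return res.success


def shattered(X):
    """A point set is shattered if every +1/-1 labeling is realizable."""
    n = X.shape[0]
    return all(linearly_separable(X, np.array(labels))
               for labels in itertools.product([-1.0, 1.0], repeat=n))


# Three points in general position in the plane can be shattered ...
print(shattered(np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])))  # True
# ... but no four points in the plane can be (here, the XOR configuration fails).
print(shattered(np.array([[0.0, 0.0], [1.0, 1.0],
                          [1.0, 0.0], [0.0, 1.0]])))              # False
```

Running this reports that three non-collinear points in the plane can be shattered while the four-point XOR configuration cannot, which is consistent with the answer to the linear-classifier question above.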
Core resources (read/watch one of the following)
→ Coursera: Machine Learning
→ Stanford's Machine Learning lecture notes
Lecture notes for Stanford's machine learning course, aimed at graduate and advanced undergraduate students.
→ The Elements of Statistical Learning
A graduate-level statistical learning textbook with a focus on frequentist methods.
→ An Introduction to Computational Learning Theory (1994)
A graduate-level textbook on computational learning theory.
- Section 3.2, "The Vapnik-Chervonenkis dimension," pages 50-51
- Section 3.3, "Examples of the VC dimension," pages 51-54