(30 minutes to learn)
The main advantage of the SVM as a linear classifier is that it can be kernelized to represent complex nonlinear decision boundaries. Conveniently, the learned classifier depends only on the support vectors, a (hopefully) sparse subset of the training examples, so at prediction time the kernel only needs to be evaluated against a small fraction of the training set. Kernel SVMs are among the most widely used classifiers in machine learning because off-the-shelf tools often perform very well.
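As a rough illustration of the kernelized decision rule, the sketch below evaluates f(x) = b + sum_i alpha_i y_i k(x_i, x) with an RBF kernel on the XOR problem, which no linear boundary can separate. Everything in it is an assumption made for illustration: the names `rbf_kernel` and `decision_value`, the choice gamma = 1, and the dual coefficients, which are set by hand using the symmetry of the four points rather than found by solving the SVM optimization problem.

```python
import math


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel: exp(-gamma * ||a - b||^2)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


def decision_value(x, points, labels, alphas, bias=0.0, gamma=1.0):
    """Kernelized decision function f(x) = bias + sum_i alpha_i * y_i * k(x_i, x).

    Training points with alpha_i == 0 are skipped, so only the
    support vectors cost a kernel evaluation.
    """
    return bias + sum(
        alpha * y * rbf_kernel(p, x, gamma)
        for p, y, alpha in zip(points, labels, alphas)
        if alpha > 0.0
    )


# XOR data: not linearly separable in the original 2-D space.
points = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
labels = [-1, -1, +1, +1]

# Hand-picked coefficients (an assumption, not a trained model): by symmetry
# all four alphas are equal, scaled so that y_i * f(x_i) = 1 at every point.
gamma = 1.0
a = 1.0 / (1.0 - math.exp(-gamma)) ** 2
alphas = [a, a, a, a]

values = [decision_value(p, points, labels, alphas, gamma=gamma) for p in points]
predictions = [1 if v > 0 else -1 for v in values]

for p, v, pred in zip(points, values, predictions):
    print(p, round(v, 6), pred)
```

In this toy problem every point ends up with a nonzero coefficient, so nothing is skipped. On larger datasets many coefficients are typically zero, and the `if alpha > 0.0` filter is where the sparsity saves kernel evaluations.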
This concept has the prerequisites:
Core resources (read/watch one of the following)
→ The Elements of Statistical Learning
A graduate-level statistical learning textbook with a focus on frequentist methods.
Location: Section 12.3, "Support vector machines and kernels," up through and including 12.3.1, "Computing the SVM for classification," pages 423-425
Supplemental resources (the following are optional, but you may find them useful)
→ Coursera: Machine Learning (2013)
An online machine learning course aimed at a broad audience.
Location: Lecture "Kernels II"
- Click on "Preview" to see the videos.
→ Machine Learning: a Probabilistic Perspective
A very comprehensive graduate-level machine learning textbook.
Location: Section 14.5, pages 496-502
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location: Section 7.1, up to 7.1.1, pages 326-331