(2.2 hours to learn)
The kernel trick allows us to reformulate linear machine learning models in terms of a kernel function which defines a notion of similarity between data points. A few simple rules allow us to construct kernels which capture a wide variety of similarity functions.
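To make the idea concrete, here is a minimal sketch (our own illustration, not taken from the resources below) of kernel ridge regression in Python: the model is fit and evaluated entirely through kernel evaluations, with no explicit feature vectors.

```python
# Minimal kernel-trick sketch: kernel ridge regression never builds feature
# vectors; it only evaluates a kernel k(x, x') between pairs of data points.
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Gram matrix K[i, j] = exp(-||A[i] - B[j]||^2 / (2 * lengthscale^2))."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2.0 * lengthscale ** 2))

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))                   # training inputs
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)    # noisy targets

lam = 0.1                                              # ridge regularizer
K = rbf_kernel(X, X)
alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)   # dual weights

X_test = np.linspace(-3, 3, 5)[:, None]
y_pred = rbf_kernel(X_test, X) @ alpha                 # predict via kernel only
print(y_pred)
```

Swapping in a different kernel function changes the model's notion of similarity without touching the rest of the code.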
This concept has the prerequisites:
Core resources (read/watch one of the following)
→ Gaussian Processes for Machine Learning
A graduate-level machine learning textbook focusing on Gaussian processes.
Location: Sections 4-4.2, pages 79-95
- Don't worry about the parts about spectral density if you're not familiar with Fourier techniques. Section 4.2.4, on constructing new kernels from simpler kernels, is especially useful.
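As a hedged illustration of those closure rules (the helper names below are our own, not the book's): sums, products, and positive scalings of kernels are again kernels, so complex similarity functions can be assembled from simple pieces.

```python
# Composing kernels from simpler kernels: sums, products, and positive
# scalings of valid kernels are again valid kernels.
import numpy as np

def rbf(x, z, ls=1.0):
    return np.exp(-np.sum((x - z) ** 2) / (2 * ls ** 2))

def linear(x, z):
    return float(np.dot(x, z))

def k_sum(k1, k2):
    return lambda x, z: k1(x, z) + k2(x, z)   # k1 + k2 is a kernel

def k_prod(k1, k2):
    return lambda x, z: k1(x, z) * k2(x, z)   # k1 * k2 is a kernel

def k_scale(c, k):
    return lambda x, z: c * k(x, z)           # c * k is a kernel for c > 0

# e.g. a "linear trend plus smooth local wiggles" kernel:
k = k_sum(linear, k_scale(2.0, rbf))
x, z = np.array([1.0, 2.0]), np.array([0.5, 1.5])
print(k(x, z))
```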
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location: Section 6.2, pages 294-299
Supplemental resources (the following are optional, but you may find them useful)
→ Bayesian Reasoning and Machine Learning
A textbook for a graduate machine learning course.
- The Schur product theorem justifies the surprising fact that the product of kernels is a kernel; see the numerical check after this list.
- Kernels can be defined on a variety of mathematical objects, allowing us to extend linear machine learning models to those cases: Fisher kernels, for example, are a general recipe for obtaining a kernel from a generative model (a small sketch follows this list).
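Here is the numerical check referenced above (our own illustration): the Schur product theorem says the elementwise (Hadamard) product of two positive semidefinite Gram matrices is again positive semidefinite, which is why the pointwise product of two kernels is a valid kernel.

```python
# Numerical check of the Schur product theorem: the elementwise product of
# two PSD Gram matrices is again PSD (up to numerical error).
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((30, 2))

def gram(kernel, X):
    return np.array([[kernel(a, b) for b in X] for a in X])

K1 = gram(lambda a, b: np.exp(-np.sum((a - b) ** 2)), X)   # RBF kernel
K2 = gram(lambda a, b: (1.0 + a @ b) ** 2, X)              # polynomial kernel
K = K1 * K2                                                # Hadamard product

# all eigenvalues of a PSD matrix are >= 0 (allow tiny numerical error)
print(np.linalg.eigvalsh(K).min() >= -1e-10)
```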
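And here is a small sketch of the Fisher kernel recipe, assuming the simplest possible generative model, a 1-D Gaussian with unknown mean: the score is the gradient of the log-likelihood with respect to the model parameter, and the kernel compares points through their scores, weighted by the inverse Fisher information.

```python
# Fisher kernel sketch for a toy generative model (our own illustration):
# given p(x | theta), define the score g(x) = d/dtheta log p(x | theta) and
# set k(x, x') = g(x)^T F^{-1} g(x'), where F is the Fisher information.
import numpy as np

mu, sigma = 0.0, 1.0       # model parameters; the mean plays the role of theta

def score(x):
    # d/dmu log N(x | mu, sigma^2) = (x - mu) / sigma^2
    return (x - mu) / sigma ** 2

fisher_info = 1.0 / sigma ** 2   # F = E[score(x)^2] for this model

def fisher_kernel(x, x_prime):
    return score(x) * (1.0 / fisher_info) * score(x_prime)

print(fisher_kernel(1.5, -0.5))  # similarity under the generative model
```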