the kernel trick

(50 minutes to learn)


We can use linear models to model complex nonlinear functions by mapping the original data to a basis function representation. Such a representation can get unwieldy, however. The kernel trick allows us to implicitly map the data to a very high (possibly infinite) dimensional space by replacing the dot product with a more general inner product, or kernel.
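As a minimal sketch of the idea (the function names here are illustrative, not from any particular library): for the degree-2 polynomial kernel k(x, y) = (x · y)^2 on 2-D inputs, there is an explicit 3-D feature map phi such that phi(x) · phi(y) equals k(x, y). The kernel computes that inner product without ever constructing phi.

```python
import math

def kernel(x, y):
    # Degree-2 polynomial kernel: k(x, y) = (x . y)^2
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def feature_map(x):
    # Explicit basis-function representation phi(x) for 2-D inputs,
    # chosen so that phi(x) . phi(y) = (x . y)^2
    return [x[0] ** 2, x[1] ** 2, math.sqrt(2) * x[0] * x[1]]

x, y = (1.0, 2.0), (3.0, 0.5)
explicit = sum(a * b for a, b in zip(feature_map(x), feature_map(y)))
implicit = kernel(x, y)
# explicit and implicit agree: the kernel evaluates the inner product
# in feature space without building the feature vectors.
```

For a degree-d polynomial kernel on n-dimensional inputs the explicit feature space has O(n^d) dimensions, while the kernel still costs only O(n) to evaluate; for kernels like the Gaussian (RBF) kernel the implicit feature space is infinite-dimensional, so the explicit map is not even available.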


This concept has the prerequisites:

Core resources (read/watch one of the following)


Gaussian Processes for Machine Learning
A graduate-level machine learning textbook focusing on Gaussian processes.
Authors: Carl E. Rasmussen, Christopher K. I. Williams


Supplemental resources (the following are optional, but you may find them useful)


Coursera: Machine Learning (2013)
An online machine learning course aimed at a broad audience.
Author: Andrew Y. Ng
Other notes:
  • Click on "Preview" to see the videos.
Bayesian Reasoning and Machine Learning
A textbook for a graduate machine learning course.
Author: David Barber

See also