the kernel trick

(50 minutes to learn)

Summary

Linear models can capture complex nonlinear functions if we first map the data into a basis function representation. Such a representation can get unwieldy, however. The kernel trick lets us implicitly map the data into a very high (possibly infinite) dimensional space by replacing the dot product with a more general inner product, or kernel.
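As a small illustration of the idea, the sketch below (using NumPy, with an explicit feature map written out by hand) checks that the degree-2 polynomial kernel (1 + x·y)² computes the same value as the dot product of explicit 6-dimensional feature vectors, without ever forming those vectors:

```python
import numpy as np

def phi(x):
    # Explicit degree-2 polynomial feature map for a 2-D input:
    # phi(x) = (1, sqrt(2)*x1, sqrt(2)*x2, x1^2, sqrt(2)*x1*x2, x2^2)
    x1, x2 = x
    return np.array([1.0,
                     np.sqrt(2) * x1,
                     np.sqrt(2) * x2,
                     x1 ** 2,
                     np.sqrt(2) * x1 * x2,
                     x2 ** 2])

def poly_kernel(x, y):
    # The same inner product, computed directly in the original
    # 2-D space -- this is the kernel trick.
    return (1.0 + np.dot(x, y)) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

# Both expressions give the same number: the kernel implicitly
# evaluates an inner product in the 6-dimensional feature space.
print(np.dot(phi(x), phi(y)))  # -> 4.0
print(poly_kernel(x, y))       # -> 4.0
```

For a kernel like the Gaussian (RBF) kernel, the corresponding feature space is infinite-dimensional, so the explicit map could not be written down at all; the kernel evaluation remains just as cheap.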

Context

This concept has the prerequisites:

Core resources (read/watch one of the following)

-Free-

Gaussian Processes for Machine Learning
A graduate-level machine learning textbook focusing on Gaussian processes.
Authors: Carl E. Rasmussen, Christopher K. I. Williams

-Paid-

Supplemental resources (the following are optional, but you may find them useful)

-Free-

Coursera: Machine Learning (2013)
An online machine learning course aimed at a broad audience.
Author: Andrew Y. Ng
Other notes:
  • Click on "Preview" to see the videos.
Bayesian Reasoning and Machine Learning
A textbook for a graduate machine learning course.
Author: David Barber

See also