constructing kernels

(2.2 hours to learn)

Summary

The kernel trick allows us to reformulate linear machine learning models in terms of a kernel function which defines a notion of similarity between data points. A few simple rules allow us to construct kernels which capture a wide variety of similarity functions.
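As a rough illustration of such rules, the sketch below combines two standard kernels using three closure properties: scaling by a nonnegative constant, sums, and products. The helper names (`k_sum`, `k_prod`, `k_scale`, `looks_psd`) are hypothetical and not taken from any of the resources listed here. The positive-semidefiniteness check is only a randomized spot check on one Gram matrix, not a proof.

```python
import math
import random

def linear(x, y):
    """Linear kernel: the ordinary dot product."""
    return sum(a * b for a, b in zip(x, y))

def rbf(x, y, lengthscale=1.0):
    """Squared-exponential (RBF) kernel."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * lengthscale ** 2))

# Closure rules: each combinator takes kernels and returns a new kernel.
def k_sum(k1, k2):
    return lambda x, y: k1(x, y) + k2(x, y)

def k_prod(k1, k2):
    return lambda x, y: k1(x, y) * k2(x, y)

def k_scale(c, k):
    if c < 0:
        raise ValueError("scaling constant must be nonnegative")
    return lambda x, y: c * k(x, y)

def gram(k, points):
    """Gram matrix K[i][j] = k(points[i], points[j])."""
    return [[k(x, y) for y in points] for x in points]

def looks_psd(K, trials=200, tol=1e-9, seed=0):
    """Spot check: z^T K z should be nonnegative for random vectors z."""
    rng = random.Random(seed)
    n = len(K)
    for _ in range(trials):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        q = sum(z[i] * K[i][j] * z[j] for i in range(n) for j in range(n))
        if q < -tol:
            return False
    return True

# Illustrative data: six random points in the plane.
rng = random.Random(1)
points = [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(6)]

# 0.5 * linear + rbf * linear, built only from the closure rules above.
composite = k_sum(k_scale(0.5, linear), k_prod(rbf, linear))
K = gram(composite, points)
print(looks_psd(K))
```

Writing the combinators as functions that return functions keeps the composite kernel a kernel by construction, so it can be passed to any routine that expects a kernel function.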

Context

This concept has the prerequisites:

Core resources (read/watch one of the following)

-Free-

Gaussian Processes for Machine Learning
A graduate-level machine learning textbook focusing on Gaussian processes.
Authors: Carl E. Rasmussen, Christopher K. I. Williams
Other notes:
  • Don't worry about the parts about spectral density if you're not familiar with Fourier techniques. Section 4.2.4, on constructing new kernels from simpler kernels, is especially useful.

Supplemental resources (the following are optional, but you may find them useful)

-Free-

Bayesian Reasoning and Machine Learning
A textbook for a graduate machine learning course.
Author: David Barber

See also

  • The Schur product theorem justifies the surprising fact that the product of kernels is a kernel.
  • Kernels can be defined on a variety of mathematical objects, allowing us to extend linear machine learning models to those cases. For example, Fisher kernels are a general recipe for obtaining a kernel from a generative model.
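One way to build intuition for the product rule mentioned above is a small special case: the product of two linear kernels equals an ordinary inner product in a larger feature space made of all pairwise coordinate products. The sketch below checks this numerically for one pair of points. It illustrates only this special case and is not a proof of the Schur product theorem; the function names are hypothetical.

```python
import math

def linear(x, y):
    """Linear kernel: the ordinary dot product."""
    return sum(a * b for a, b in zip(x, y))

def product_of_linear(x, y):
    """Product of two linear kernels: (x . y) * (x . y)."""
    return linear(x, y) * linear(x, y)

def tensor_features(x):
    """Explicit feature map: all pairwise products x_i * x_j."""
    return [a * b for a in x for b in x]

x = (1.0, 2.0)
y = (3.0, 0.5)

lhs = product_of_linear(x, y)                          # kernel evaluated directly
rhs = linear(tensor_features(x), tensor_features(y))   # inner product of features
print(lhs, rhs, math.isclose(lhs, rhs))
```

Because the product kernel is an inner product of explicit features, its Gram matrices are positive semidefinite, which is the property the theorem guarantees for products of kernels in general.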