# Constructing kernels

(2.2 hours to learn)

## Summary

The kernel trick allows us to reformulate linear machine learning models in terms of a kernel function, which defines a notion of similarity between data points. A few simple closure rules (for example, sums, products, and non-negative scalings of kernels are again kernels) allow us to construct kernels which capture a wide variety of similarity functions.
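As an illustration of these construction rules, here is a minimal sketch in Python with NumPy. The helper names (`rbf`, `linear`, `k_sum`, `k_prod`, `k_scale`, `min_eigenvalue`) are hypothetical and chosen for this example only; they do not come from any of the resources listed on this page. The sketch builds a composite kernel from two base kernels and checks numerically that its Gram matrix is symmetric and positive semidefinite on a random sample.

```python
import numpy as np

def rbf(length_scale=1.0):
    """Squared-exponential (RBF) kernel; returns a function of two data matrices."""
    def k(X, Y):
        # Pairwise squared Euclidean distances, clipped at 0 against round-off.
        sq = (np.sum(X**2, axis=1)[:, None]
              + np.sum(Y**2, axis=1)[None, :]
              - 2.0 * X @ Y.T)
        return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)
    return k

def linear():
    """Linear kernel k(x, y) = x . y."""
    return lambda X, Y: X @ Y.T

def k_sum(k1, k2):
    """The sum of two kernels is a kernel."""
    return lambda X, Y: k1(X, Y) + k2(X, Y)

def k_prod(k1, k2):
    """The pointwise product of two kernels is a kernel."""
    return lambda X, Y: k1(X, Y) * k2(X, Y)

def k_scale(c, k1):
    """Scaling a kernel by a non-negative constant gives a kernel."""
    if c < 0:
        raise ValueError("scale must be non-negative")
    return lambda X, Y: c * k1(X, Y)

def min_eigenvalue(K):
    """Smallest eigenvalue of the symmetrized Gram matrix."""
    return float(np.linalg.eigvalsh((K + K.T) / 2.0).min())

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))

# Composite kernel: 2 * RBF(1.5) + linear * RBF(0.5)
combined = k_sum(k_scale(2.0, rbf(1.5)), k_prod(linear(), rbf(0.5)))
K = combined(X, X)

# Up to floating-point error, this should not be negative.
print("smallest eigenvalue:", min_eigenvalue(K))
```

A numerical check like this cannot prove that a function is a valid kernel, since it only tests one finite sample, but a clearly negative eigenvalue would show that a candidate is not one.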

## Context

This concept has the prerequisites:

## Core resources (read/watch one of the following)

## -Free-

→ Gaussian Processes for Machine Learning

A graduate-level machine learning textbook focusing on Gaussian processes.

Location:
Sections 4-4.2, pages 79-95

Other notes:

- Don't worry about the parts about spectral density if you're not familiar with Fourier techniques. Section 4.2.4, on constructing new kernels from simpler kernels, is especially useful.

## -Paid-

→ Pattern Recognition and Machine Learning

A textbook for a graduate machine learning course, with a focus on Bayesian methods.

Location:
Section 6.2, pages 294-299

## Supplemental resources (the following are optional, but you may find them useful)

## -Free-

→ Bayesian Reasoning and Machine Learning

A textbook for a graduate machine learning course.

## See also

- The Schur product theorem justifies the surprising fact that the product of kernels is a kernel.
- Kernels can be defined on a variety of mathematical objects, allowing us to extend linear machine learning models to those cases. For example, Fisher kernels are a general recipe for obtaining a kernel from a generative model.
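
For reference, the Schur product theorem mentioned in the first item can be stated as follows. The symbols $A$, $B$, $K_1$, $K_2$ are notation introduced here for illustration, not taken from the listed resources. For symmetric positive semidefinite matrices of the same size,

$$
A \succeq 0,\quad B \succeq 0 \;\Longrightarrow\; A \circ B \succeq 0,
\qquad (A \circ B)_{ij} = A_{ij} B_{ij}.
$$

If $k_1$ and $k_2$ are kernels with Gram matrices $K_1$ and $K_2$ on some finite set of points, then the Gram matrix of $k(x, x') = k_1(x, x')\,k_2(x, x')$ on those points is $K_1 \circ K_2$. By the theorem this matrix is positive semidefinite for every such set of points, which is exactly the condition for $k$ to be a kernel.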