Gaussian processes
(1.1 hours to learn)
Summary
Gaussian processes are distributions over functions such that the joint distribution at any finite set of points is a multivariate Gaussian. They are commonly used in probabilistic modeling when we want to put a prior over functions without reference to an underlying parametric representation. Usually they express fairly weak beliefs about the function, such as smoothness, but more structured versions are also possible. The most common use cases are nonparametric regression and classification.
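The defining property above can be sketched in a few lines of NumPy: evaluate a kernel on a finite grid of points to get a covariance matrix, then draw function values from the resulting multivariate Gaussian. (The squared-exponential kernel and the grid below are illustrative choices, not part of the definition.)

```python
import numpy as np

def rbf_kernel(xs, ys, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel: k(x, y) = s^2 * exp(-(x - y)^2 / (2 l^2)).
    # This encodes the "weak" smoothness belief mentioned above.
    d = xs[:, None] - ys[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

# A GP evaluated at any finite set of points is just a multivariate Gaussian,
# so sampling from the prior reduces to sampling from N(0, K).
x = np.linspace(-3, 3, 50)
K = rbf_kernel(x, x)
rng = np.random.default_rng(0)
# Add a small jitter to the diagonal for numerical stability.
samples = rng.multivariate_normal(np.zeros_like(x), K + 1e-8 * np.eye(len(x)), size=3)
# Each row of `samples` is one draw from the prior over functions,
# evaluated on the grid x.
```

Plotting the rows of `samples` against `x` gives the familiar picture of smooth random functions drawn from a GP prior.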
Context
This concept has the prerequisites:
- the kernel trick (Gaussian processes are defined in terms of kernels.)
- multivariate Gaussian distribution (Gaussian processes are a generalization of the multivariate Gaussian distribution to possibly infinite spaces.)
Core resources (read/watch one of the following)
-Free-
→ Gaussian Processes for Machine Learning
A graduate-level machine learning textbook focusing on Gaussian processes.
-Paid-
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location:
Section 6.4-6.4.2, pages 303-311
Supplemental resources (the following are optional, but you may find them useful)
-Free-
→ Bayesian Reasoning and Machine Learning
A textbook for a graduate machine learning course.
-Paid-
→ Machine Learning: a Probabilistic Perspective
A very comprehensive graduate-level machine learning textbook.
Location:
Section 15.1-15.2.3, pages 515-521
See also
- Gaussian processes have a variety of uses in machine learning, including:
- regression
- classification
- black-box optimization (where we only get to evaluate the function, and doing so is expensive)
- reinforcement learning
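For the regression use case, the GP posterior at test points has a closed form: given noisy observations y at inputs X, the predictive mean is K*ᵀ(K + σ²I)⁻¹y and the predictive covariance is K** − K*ᵀ(K + σ²I)⁻¹K*. A minimal sketch, assuming an RBF kernel and a toy dataset chosen purely for illustration:

```python
import numpy as np

def rbf_kernel(xs, ys, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel k(x, y) = s^2 * exp(-(x - y)^2 / (2 l^2)).
    d = xs[:, None] - ys[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=0.1):
    # Standard GP regression equations:
    #   mean = K*^T (K + sigma^2 I)^-1 y
    #   cov  = K** - K*^T (K + sigma^2 I)^-1 K*
    K = rbf_kernel(x_train, x_train) + noise**2 * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    K_ss = rbf_kernel(x_test, x_test)
    alpha = np.linalg.solve(K, y_train)
    mean = K_s.T @ alpha
    cov = K_ss - K_s.T @ np.linalg.solve(K, K_s)
    return mean, cov

# Toy data (hypothetical): three noisy observations of a smooth function.
x_train = np.array([-2.0, 0.0, 1.5])
y_train = np.sin(x_train)
x_test = np.linspace(-3, 3, 20)
mean, cov = gp_posterior(x_train, y_train, x_test)
# diag(cov) shrinks near the training inputs and grows away from them,
# which is what makes GPs useful for black-box optimization.
```

In practice one would use a Cholesky factorization instead of `np.linalg.solve` for efficiency and stability, but the equations are the same.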