(45 minutes to learn)
Exponential families are a broad class of probability distributions which includes many basic distributions such as Bernoullis and Gaussians, as well as Markov random fields. What they have in common is that the distributions can be represented in terms of log-linear functions of sufficient statistics.
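For reference, the log-linear form mentioned above is conventionally written as:

```latex
p(x \mid \eta) = h(x)\,\exp\!\left(\eta^\top T(x) - A(\eta)\right)
```

where \(\eta\) is the natural parameter, \(T(x)\) is the sufficient statistic, \(h(x)\) is the base measure, and \(A(\eta)\) is the log-partition function, which normalizes the distribution.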
Goals:
- Know the basic definitions: exponential family, natural parameter, sufficient statistic
- Derive the exponential family representations of some simple distributions, e.g. the Bernoulli or Gaussian distributions.
- Give an example of a family of distributions which is not an exponential family.
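As a sketch of the kind of derivation the goals above ask for, here is the Bernoulli distribution rewritten in exponential family form and checked numerically against the standard parameterization (the names below are illustrative, not from any particular resource):

```python
import math

def bernoulli_pmf(x, p):
    """Standard Bernoulli pmf: p^x * (1 - p)^(1 - x)."""
    return p**x * (1 - p)**(1 - x)

def bernoulli_expfam_pmf(x, p):
    """The same pmf in exponential family form h(x) * exp(eta * T(x) - A(eta)),
    with natural parameter eta = log(p / (1 - p)) (the log-odds),
    sufficient statistic T(x) = x,
    log-partition function A(eta) = log(1 + exp(eta)),
    and base measure h(x) = 1."""
    eta = math.log(p / (1 - p))       # natural parameter
    A = math.log(1 + math.exp(eta))   # log-partition function
    return math.exp(eta * x - A)

# The two parameterizations agree for all x in {0, 1}:
for p in (0.1, 0.5, 0.9):
    for x in (0, 1):
        assert abs(bernoulli_pmf(x, p) - bernoulli_expfam_pmf(x, p)) < 1e-12
```

The same recipe works for other members of the family: identify T(x) and the natural parameter, and everything else is determined by the normalization.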
Core resources (read/watch one of the following)
→ Mathematical Monk: Machine Learning (2011)
→ Stanford's Machine Learning lecture notes
Lecture notes for Stanford's machine learning course, aimed at graduate and advanced undergraduate students.
Location: Lecture notes 1, "Supervised learning, discriminative algorithms," section 8, "The exponential family," pages 22-24
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location: Section 2.4, "The exponential family," not counting subsections, pages 113-116
→ Probabilistic Graphical Models: Principles and Techniques
A very comprehensive textbook for a graduate-level course on probabilistic AI.
Location: Section 8.2, "Exponential families," pages 261-266
Supplemental resources (the following are optional, but you may find them useful)
→ Machine Learning: a Probabilistic Perspective
A very comprehensive graduate-level machine learning textbook.
Location: Section 9.2, "The exponential family," up to 9.2.4, "MLE for the exponential family," pages 281-283
- Exponential families can be parameterized in terms of sufficient statistics.
- Generalized linear models are a generalization of linear regression where the observation model is an exponential family distribution.
- Many aspects of statistical modeling turn out to have convenient forms for exponential families:
- conjugate priors
- maximum likelihood estimation
- variational inference, a kind of approximate probabilistic inference algorithm
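To illustrate the point about maximum likelihood estimation: for an exponential family, the MLE is obtained by moment matching, i.e., setting the model's expected sufficient statistics equal to their empirical averages. A minimal sketch for the Gaussian case, where T(x) = (x, x²) (the data values here are made up for illustration):

```python
# For a Gaussian, matching E[x] and E[x^2] to their empirical averages
# recovers the sample mean and the (biased) sample variance in closed form.

data = [2.0, 3.5, 1.0, 4.5, 3.0]
n = len(data)

t1_bar = sum(data) / n                 # empirical average of T_1(x) = x
t2_bar = sum(x * x for x in data) / n  # empirical average of T_2(x) = x^2

mu_hat = t1_bar                        # MLE of the mean
sigma2_hat = t2_bar - t1_bar**2        # MLE of the variance: E[x^2] - E[x]^2

print(mu_hat, sigma2_hat)              # prints the sample mean and variance
```

The same moment-matching principle gives the MLE for any exponential family, which is one reason the family is so convenient to work with.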