variational inference
(55 minutes to learn)
Summary
In most probabilistic models of interest, it's intractable to compute posterior marginals and/or normalizing constants exactly. Variational inference is a framework for approximating both. Variational inference treats inference as an optimization problem: we try to find a distribution (or a compact representation resembling a distribution) which is as close as possible to the true posterior, according to some measure of closeness, typically KL divergence.
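As a minimal illustration of this optimization view, the sketch below fits a Gaussian approximation q(z) = N(m, s²) to a toy conjugate model (the model, parameter values, and learning rate are all assumptions chosen for this example, not taken from any of the resources below). Because the model is conjugate, the variational objective (the ELBO) and its gradients have closed forms, and gradient ascent recovers the exact posterior:

```python
# Toy model (assumed for illustration): prior z ~ N(0, 1),
# likelihood x | z ~ N(z, 1), one observation x = 2.
# The exact posterior is N(1, 0.5), so variational inference with
# q(z) = N(m, s^2) should recover it by maximizing the ELBO:
#   ELBO(m, s) = E_q[log p(x | z) + log p(z)] + H(q).
x = 2.0
m, s = 0.0, 1.0  # variational parameters: mean and std of q
lr = 0.05        # step size (an arbitrary choice)

for _ in range(2000):
    # Closed-form ELBO gradients for this conjugate Gaussian model:
    grad_m = (x - m) - m         # d/dm of the expected log joint
    grad_s = -2.0 * s + 1.0 / s  # d/ds, including the entropy term
    m += lr * grad_m
    s += lr * grad_s

print(m, s * s)  # approaches the exact posterior mean 1.0 and variance 0.5
```

In non-conjugate models the expectations in the ELBO are no longer available in closed form, which is where the approximation schemes covered by the resources below (mean field, expectation propagation, etc.) come in.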
Context
This concept has the prerequisites:
- multivariate distributions (Marginalization is the operation we most often want to perform using variational inference.)
- KL divergence (KL divergence is part of the variational objective function.)
- entropy (Entropy is part of the variational objective function.)
- Lagrange multipliers (Lagrange multipliers are necessary for analyzing variational inference algorithms.)
Core resources (read/watch one of the following)
-Paid-
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location:
Sections 10.1-10.1.2
Additional dependencies:
- multivariate Gaussian distribution
→ Machine Learning: a Probabilistic Perspective
A very comprehensive graduate-level machine learning textbook.
Location:
Sections 21.1-21.2
Supplemental resources (the following are optional, but you may find them useful)
-Paid-
→ Probabilistic Graphical Models: Principles and Techniques
A very comprehensive textbook for a graduate-level course on probabilistic AI.
Location:
Sections 8.5-8.5.1 and 11.1
Additional dependencies:
- junction trees
See also
- Some examples of variational inference algorithms:
- Mean field approximation
- Structured variational approximations in graphical models
- Expectation propagation, which is slower but often considerably more accurate than mean field
- Variational Bayes is the application of variational inference to fitting Bayesian models.
- Markov chain Monte Carlo (MCMC) is another versatile set of techniques for performing inference in probabilistic models.
- In the case of graphical models, belief propagation is another inference algorithm with a [variational interpretation](loopy_bp_as_variational).