(55 minutes to learn)
In most probabilistic models of interest, it's intractable to compute posterior marginals or normalizing constants exactly. Variational inference is a framework for approximating both: it treats inference as an optimization problem, where we search for a distribution (or a representation resembling a distribution) that is as close as possible to the true posterior according to some divergence measure.
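Concretely, the standard formulation developed in the core readings below chooses a tractable family $\mathcal{Q}$ and minimizes the KL divergence from the approximation $q$ to the true posterior. Since that KL divergence itself involves the intractable posterior, one instead maximizes the equivalent evidence lower bound (ELBO), which depends only on the joint distribution:

$$
q^\star = \operatorname*{arg\,min}_{q \in \mathcal{Q}} \mathrm{KL}\big(q(\mathbf{z}) \,\big\|\, p(\mathbf{z} \mid \mathbf{x})\big),
\qquad
\mathcal{L}(q) = \mathbb{E}_{q}[\log p(\mathbf{x}, \mathbf{z})] - \mathbb{E}_{q}[\log q(\mathbf{z})] = \log p(\mathbf{x}) - \mathrm{KL}\big(q \,\big\|\, p(\mathbf{z} \mid \mathbf{x})\big).
$$

Because $\log p(\mathbf{x})$ does not depend on $q$, maximizing $\mathcal{L}(q)$ over $\mathcal{Q}$ is the same optimization as minimizing the KL divergence, and the optimized ELBO also serves as a lower bound on the log normalizing constant $\log p(\mathbf{x})$.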
This concept has the prerequisites:
Core resources (read/watch one of the following)
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location: Sections 10.1-10.1.2
Additional dependencies:
- multivariate Gaussian distribution
→ Machine Learning: a Probabilistic Perspective
A very comprehensive graduate-level machine learning textbook.
Location: Sections 21.1-21.2
Supplemental resources (the following are optional, but you may find them useful)
→ Probabilistic Graphical Models: Principles and Techniques
A very comprehensive textbook for a graduate-level course on probabilistic AI.
Location: Sections 8.5-8.5.1 and 11.1
Additional dependencies:
- junction trees
- Some examples of variational inference algorithms:
- Mean field approximation (a minimal worked sketch appears after this list)
- Structured variational approximations in graphical models
- Expectation propagation, which is slower but often considerably more accurate than mean field
- Variational Bayes is the application of variational inference to fitting Bayesian models.
- Markov chain Monte Carlo (MCMC) is another versatile set of techniques for performing inference in probabilistic models.
- In the case of graphical models, belief propagation is another inference algorithm with a [variational interpretation](loopy_bp_as_variational).
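To make the mean field idea concrete, here is a minimal NumPy sketch, mirroring the correlated-Gaussian example in Bishop Section 10.1.2: coordinate ascent on a fully factorized Gaussian approximation to a correlated bivariate Gaussian target. The specific mean and precision values are arbitrary, chosen only for illustration.

```python
import numpy as np

# Target "posterior": a correlated bivariate Gaussian N(mu, inv(Lam)),
# parameterized by its precision (inverse covariance) matrix Lam.
mu = np.array([0.0, 0.0])
Lam = np.array([[2.0, 1.2],
                [1.2, 2.0]])

# Mean-field family: q(z1, z2) = q1(z1) * q2(z2), each factor Gaussian.
# For a Gaussian target, the optimal factor q_i is Gaussian with fixed
# precision Lam[i, i]; only the means m[i] need updating, each computed
# while holding the other factor fixed.
m = np.zeros(2)
for _ in range(50):  # coordinate ascent on the ELBO
    m[0] = mu[0] - (Lam[0, 1] / Lam[0, 0]) * (m[1] - mu[1])
    m[1] = mu[1] - (Lam[1, 0] / Lam[1, 1]) * (m[0] - mu[0])

print("mean-field means:    ", m)                   # converge to mu
print("mean-field variances:", 1.0 / np.diag(Lam))  # 1 / Lam[i, i]
print("true marginal vars:  ", np.diag(np.linalg.inv(Lam)))
```

The factor means converge to the true mean, but each factor's variance is $1/\Lambda_{ii}$, smaller than the true marginal variance $[\Lambda^{-1}]_{ii}$, illustrating mean field's well-known tendency to underestimate posterior uncertainty.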