# variational inference

(55 minutes to learn)

## Summary

In most probabilistic models of interest, computing posterior marginals or normalizing constants exactly is intractable. Variational inference is a framework for approximating both. It treats inference as an optimization problem: we search for a distribution (or a representation resembling a distribution) that is as close as possible to the true posterior according to some measure of dissimilarity, such as the KL divergence.
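To make the "inference as optimization" idea concrete, here is a minimal sketch on a toy problem. Everything in it is an illustrative assumption rather than something taken from the resources below: the four-state unnormalized posterior `p_tilde`, the softmax parameterization of `q`, and the step size and iteration count. Because `q` is left unrestricted here, maximizing the lower bound recovers the exact posterior; in realistic models `q` is confined to a tractable family and only approximates it.

```python
import math

# Hypothetical unnormalized posterior over a latent variable z with four states.
p_tilde = [2.0, 1.0, 0.5, 0.5]
log_Z = math.log(sum(p_tilde))                    # exact log normalizing constant
posterior = [w / sum(p_tilde) for w in p_tilde]   # exact posterior, for comparison


def softmax(theta):
    """Map unconstrained parameters to a probability vector."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]


def elbo(q):
    """Lower bound on log_Z: E_q[log p_tilde(z)] + H[q]."""
    return sum(qk * (math.log(pk) - math.log(qk)) for qk, pk in zip(q, p_tilde))


def fit(steps=2000, lr=0.5):
    """Gradient ascent on the bound with respect to the softmax parameters."""
    theta = [0.0] * len(p_tilde)                  # start from a uniform q
    for _ in range(steps):
        q = softmax(theta)
        h = [math.log(pk) - math.log(qk) for qk, pk in zip(q, p_tilde)]
        baseline = sum(qk * hk for qk, hk in zip(q, h))   # equals the current bound
        # d(bound)/d(theta_k) = q_k * (h_k - baseline)
        theta = [t + lr * qk * (hk - baseline) for t, qk, hk in zip(theta, q, h)]
    return softmax(theta)


q_star = fit()
print("bound at uniform q:", elbo([0.25] * 4))
print("bound at fitted q: ", elbo(q_star))
print("exact log Z:       ", log_Z)
```

The bound starts strictly below `log_Z` and approaches it as `q` approaches the posterior, which is how a single optimization yields approximations to both the posterior and the normalizing constant.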

## Context

This concept has the prerequisites:

- multivariate distributions (Marginalization is the operation we most often want to perform using variational inference.)
- KL divergence (KL divergence is part of the variational objective function.)
- entropy (Entropy is part of the variational objective function.)
- Lagrange multipliers (Lagrange multipliers are necessary for analyzing variational inference algorithms.)
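
As a rough guide to how these prerequisites fit together, the objective can be written as below. This is a sketch assuming the common KL(q || p) formulation; the symbols are assumptions introduced here, not notation from the listed resources: x denotes the observed variables, z the latent variables, and q the approximating distribution.

```latex
\log p(x) = \mathcal{L}(q) + \mathrm{KL}\big(q(z) \,\|\, p(z \mid x)\big),
\qquad
\mathcal{L}(q) = \mathbb{E}_{q}\big[\log p(x, z)\big] + H[q]
```

Since the KL term is nonnegative, the quantity L(q) is a lower bound on log p(x), and maximizing it over q is equivalent to minimizing the KL divergence to the posterior. Lagrange multipliers typically enter when L(q) is maximized subject to the constraint that q sums or integrates to one.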

## Core resources (read/watch one of the following)

## -Paid-

→ Pattern Recognition and Machine Learning

A textbook for a graduate machine learning course, with a focus on Bayesian methods.

Location:
Sections 10.1-10.1.2

Additional dependencies:

- multivariate Gaussian distribution

→ Machine Learning: a Probabilistic Perspective

A very comprehensive graduate-level machine learning textbook.

Location:
Sections 21.1-21.2

## Supplemental resources (the following are optional, but you may find them useful)

## -Paid-

→ Probabilistic Graphical Models: Principles and Techniques

A very comprehensive textbook for a graduate-level course on probabilistic AI.

Location:
Sections 8.5-8.5.1 and 11.1

Additional dependencies:

- junction trees

## See also

- Some examples of variational inference algorithms:
  - Mean field approximation
  - Structured variational approximations in graphical models
  - Expectation propagation, which is slower, but often considerably more accurate, than mean field

- Variational Bayes is the application of variational inference to fitting Bayesian models.
- Markov chain Monte Carlo (MCMC) is another versatile set of techniques for performing inference in probabilistic models.
- In the case of graphical models, belief propagation is another inference algorithm with a [variational interpretation](loopy_bp_as_variational).