sum-product on trees
(2 hours to learn)
Summary
Sum-product is an algorithm for marginalization and partition function computation in graphical models. It is based on dynamic programming, and has the advantage that it reuses computations to compute marginals for all nodes in the graph. It is a generalization of the forward-backward algorithm for hidden Markov models.
Context
This concept has the prerequisites:
- inference in MRFs (Sum-product can be thought of as an MRF inference algorithm.)
- factor graphs (The sum-product algorithm is typically defined on factor graphs.)
Core resources (read/watch one of the following)
-Paid-
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location:
Section 8.4.4, pages 402-411
Supplemental resources (the following are optional, but you may find them useful)
-Free-
→ Information Theory, Inference, and Learning Algorithms
A graudate-level textbook on machine learning and information theory.
→ Coursera: Probabilistic Graphical Models (2013)
An online course on probabilistic graphical models.
Other notes:
- These lectures describe the junction tree (or clique tree) algorithm, a way of applying BP to general graphs. The notation differs significantly from the more traditional formulation of BP on factor graphs.
- Click on "Preview" to see the videos.
-Paid-
→ Probabilistic Graphical Models: Principles and Techniques
A very comprehensive textbook for a graduate-level course on probabilistic AI.
Location:
Sections 10.1-10.2, pages 345-364
Other notes:
- This section describe the junction tree (or clique tree) algorithm, a way of applying BP to general graphs. The notation differs significantly from the more traditional formulation of BP on factor graphs.
→ Machine Learning: a Probabilistic Perspective
A very comprehensive graudate-level machine learning textbook.
Location:
Section 20.2-20.2.2, pages 707-710
See also
- If we apply the BP update rules in a non-tree graph, it often still works; this is known as loopy BP .
- We can apply (exact) BP to any MRF using the junction tree representation , although possibly with a large increase in complexity
- The special case of Gaussian graphical models
- Some HMM inference algorithms can be interpreted as belief propagation
- If we have a polytree or chordal MRF, we can convert it to a factor graph and then [run BP](factor_graph_bp) .
- BP can be interpreted as fixed point updates to an optimization problem involving Bethe free energy.