MIT 6.438: Algorithms for Inference

Created by: Roger Grosse
Intended for: MIT 6.438 students

Here is an overview of the topics covered in MIT's probabilistic graphical models course, 6.438. If you're a student taking the class, you may find this a helpful source of additional readings. If you're not taking the class, but want to learn about graphical models, this can help you identify some of the key topics.

This roadmap corresponds to how the class was taught in the fall of 2011 (the semester I TA'ed it), and the class has probably changed since then.

Lecture 1: Introduction, overview, preliminaries

  • Nothing specifically for this lecture, but you may want to learn about conditional independence now, since it gets used a lot early on in the course; a small worked example follows this list.
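
To make this concrete, here is a minimal Python sketch (not from the course materials; the probability tables are made up). It uses the operational definition: X is conditionally independent of Y given Z exactly when p(x, y | z) = p(x | z) p(y | z) for every z, checked here by brute force on a tiny joint distribution built so that X and Y interact only through Z.

    import itertools
    import numpy as np

    # X and Y interact only through Z, so X is conditionally independent of Y
    # given Z. All numbers below are made up for illustration.
    p_z = np.array([0.3, 0.7])
    p_x_given_z = np.array([[0.9, 0.1], [0.2, 0.8]])   # rows indexed by z
    p_y_given_z = np.array([[0.6, 0.4], [0.5, 0.5]])   # rows indexed by z

    joint = np.zeros((2, 2, 2))                        # indexed [x, y, z]
    for x, y, z in itertools.product([0, 1], repeat=3):
        joint[x, y, z] = p_z[z] * p_x_given_z[z, x] * p_y_given_z[z, y]

    for z in [0, 1]:
        cond = joint[:, :, z] / joint[:, :, z].sum()   # p(x, y | z)
        p_x = cond.sum(axis=1)                         # p(x | z)
        p_y = cond.sum(axis=0)                         # p(y | z)
        print(np.allclose(cond, np.outer(p_x, p_y)))   # True for both values of z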

Lecture 2: Directed probabilistic graphical models

  • Bayesian networks, or Bayes nets, known in 438-land as directed graphical models
  • d-separation, a way of analyzing conditional independence structure in Bayes nets
  • Bayes Ball, an efficient algorithm for determining conditional independencies in a Bayes net. Note that while the course uses Bayes Ball to find conditional independencies, you may find it more intuitive to think directly in terms of the d-separation rules, as in the previous item; a small d-separation checker is sketched after this list.
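
For concreteness, here is a minimal Python sketch of a d-separation check. It is not the Bayes Ball algorithm itself but the equivalent moralized-ancestral-graph test: keep only the ancestors of the query variables, moralize (marry co-parents and drop edge directions), delete the conditioning set, and see whether the two query sets are still connected. The network at the bottom is a made-up v-structure.

    def d_separated(parents, xs, ys, zs):
        """Check whether the sets xs and ys are d-separated given zs in a Bayes net.

        `parents` maps each node to a list of its parents. Uses the moralized
        ancestral graph: keep only ancestors of xs | ys | zs, marry co-parents,
        drop edge directions, delete zs, and test reachability from xs to ys.
        """
        xs, ys, zs = set(xs), set(ys), set(zs)

        # 1. Restrict attention to the ancestors of the variables in the query.
        ancestors, stack = set(), list(xs | ys | zs)
        while stack:
            node = stack.pop()
            if node not in ancestors:
                ancestors.add(node)
                stack.extend(parents.get(node, []))

        # 2. Moralize: connect each node to its parents, marry co-parents,
        #    and forget edge directions.
        undirected = {node: set() for node in ancestors}
        for node in ancestors:
            ps = [p for p in parents.get(node, []) if p in ancestors]
            for p in ps:
                undirected[node].add(p)
                undirected[p].add(node)
            for i, p in enumerate(ps):
                for q in ps[i + 1:]:
                    undirected[p].add(q)
                    undirected[q].add(p)

        # 3. Delete the conditioning set and test whether any x can reach any y.
        visited, stack = set(), [x for x in xs if x not in zs]
        while stack:
            node = stack.pop()
            if node in ys:
                return False          # found an active path: not d-separated
            if node in visited:
                continue
            visited.add(node)
            stack.extend(undirected[node] - zs)
        return True

    # A v-structure A -> C <- B:
    parents = {"A": [], "B": [], "C": ["A", "B"]}
    print(d_separated(parents, {"A"}, {"B"}, set()))    # True: A and B independent
    print(d_separated(parents, {"A"}, {"B"}, {"C"}))    # False: observing C couples them

The two queries illustrate the standard explaining-away story: A and B are marginally independent but become dependent once their common child C is observed.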

Lecture 3: Undirected graphs

Lecture 4: Factor graphs; generating and converting graphs

  • factor graphs. Note that factor graphs and undirected graphical models are two different ways to represent the structure of Boltzmann distributions, and the only real difference is that factor graphs are a more fine-grained notation; see the small sketch after this list.
  • converting between graphical models
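
As a concrete illustration of the "more fine-grained notation" point, here is a minimal Python sketch (the factor tables are made-up numbers) treating a factor graph as nothing more than a list of factors, each with a scope and a table. The unnormalized probability of a full assignment is the product of the factors, and the partition function Z is computed by brute-force enumeration, which is only feasible because the example is tiny.

    import itertools
    import numpy as np

    # A tiny factor graph over binary variables x1, x2, x3 with two factors,
    # f_a(x1, x2) and f_b(x2, x3), so p(x) is proportional to f_a * f_b.
    factors = [
        (("x1", "x2"), np.array([[4.0, 1.0], [1.0, 4.0]])),   # f_a
        (("x2", "x3"), np.array([[3.0, 1.0], [1.0, 3.0]])),   # f_b
    ]
    variables = ["x1", "x2", "x3"]

    def unnormalized_prob(assignment):
        """Product of all factors at a full assignment (dict: variable -> {0, 1})."""
        p = 1.0
        for scope, table in factors:
            p *= table[tuple(assignment[v] for v in scope)]
        return p

    # Brute-force partition function Z, then the probability of one configuration.
    Z = sum(unnormalized_prob(dict(zip(variables, x)))
            for x in itertools.product([0, 1], repeat=len(variables)))
    print(unnormalized_prob({"x1": 0, "x2": 0, "x3": 1}) / Z)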

Lecture 5: Perfect maps, chordal graphs, Markov chains, trees

  • Nothing to go with this lecture, sorry.

Lecture 6: Gaussian graphical models

Lecture 7: Inference on graphs: elimination algorithm

Lecture 8: Inference on trees: sum-product algorithm

  • sum-product algorithm. Unfortunately, different sources differ in which version of this algorithm they present. Most of them use the factor graph version, which is covered in a later lecture. Koller and Friedman jump straight to the junction tree (clique tree) version, which is the most general, but it can be a lot to take in all at once. Start with whichever you like, and it should make the other versions easier to understand. A small sketch of sum-product on a tree appears below.
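
Here is a minimal Python sketch of sum-product on a small tree, in the undirected pairwise flavor (node potentials phi and edge potentials psi) rather than the factor graph version of Lecture 10; the graph and the potential tables are made up for illustration. Messages are computed recursively with memoization, each node marginal is the node potential times all incoming messages, and the answer is checked against brute-force enumeration.

    import itertools
    import numpy as np

    # Sum-product on the tree 0 - 1 - 2 (a chain, the simplest tree) with binary
    # variables, node potentials phi and pairwise edge potentials psi.
    K = 2
    phi = {0: np.array([1.0, 2.0]),
           1: np.array([2.0, 1.0]),
           2: np.array([1.0, 1.0])}
    psi = {(0, 1): np.array([[3.0, 1.0], [1.0, 3.0]]),
           (1, 2): np.array([[2.0, 1.0], [1.0, 2.0]])}
    neighbors = {0: [1], 1: [0, 2], 2: [1]}

    def edge_potential(i, j):
        """psi_ij(x_i, x_j), looking the table up in either orientation."""
        return psi[(i, j)] if (i, j) in psi else psi[(j, i)].T

    messages = {}
    def message(i, j):
        """m_{i->j}(x_j) = sum_{x_i} phi_i(x_i) psi_ij(x_i, x_j)
        times the product of messages into i from its other neighbors."""
        if (i, j) not in messages:
            incoming = np.ones(K)
            for k in neighbors[i]:
                if k != j:
                    incoming = incoming * message(k, i)
            messages[(i, j)] = edge_potential(i, j).T @ (phi[i] * incoming)
        return messages[(i, j)]

    def marginal(i):
        """Node marginal p(x_i): phi_i times all incoming messages, normalized."""
        b = phi[i].copy()
        for k in neighbors[i]:
            b = b * message(k, i)
        return b / b.sum()

    print(marginal(1))

    # Sanity check against brute-force enumeration of the joint distribution.
    joint = np.zeros((K, K, K))
    for x in itertools.product(range(K), repeat=3):
        p = phi[0][x[0]] * phi[1][x[1]] * phi[2][x[2]]
        joint[x] = p * psi[(0, 1)][x[0], x[1]] * psi[(1, 2)][x[1], x[2]]
    joint /= joint.sum()
    print(joint.sum(axis=(0, 2)))   # marginal of x1, should match marginal(1)

The same message update runs unchanged on any tree; only the message schedule (here, recursion with memoization) needs to respect the tree structure.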

Lecture 9: Example: forward-backward algorithm

Lecture 10: Sum-product algorithm with factor graphs

  • See the references for lecture 8, since some of them use factor graphs.

Lecture 11: MAP estimation and min-sum algorithm

Lecture 12: Inference with Gaussian graphical models

Lecture 13: Example: Kalman filtering and smoothing

Lecture 14: Junction tree algorithm

Lecture 15: Loopy belief propagation, part 1

Lecture 16: Loopy belief propagation, part 2

Lecture 17: Variational inference

Lecture 18: Sampling by Markov chain Monte Carlo

Lecture 19: Approximate inference by particle methods

Lecture 20: Parameter estimation in directed graphs

Lecture 21: Learning structure in directed graphs

Lecture 22: Modeling from partial observations

Lecture 23: Learning undirected graphical models

Lecture 24: Learning exponential family models