Here is an overview of the topics covered in MIT's probabilistic graphical models course, 6.438. If you're a student taking the class, you may find this a helpful source of additional readings. If you're not taking the class, but want to learn about graphical models, this can help you identify some of the key topics.
This roadmap corresponds to how the class was taught in the fall of 2011 (the semester I TA'ed it), and the class has probably changed since then.
Lecture 1: Introduction, overview, preliminaries
- Nothing specifically for this lecture, but you may want to learn about conditional independence now, since that gets used a lot early on in the course.
Lecture 2: Directed probabilistic graphical models
- Bayesian networks, or Bayes nets, known in 438-land as directed graphical models
- d-separation, a way of analyzing conditional independence structure in Bayes nets
- Bayes Ball, an efficient algorithm for computing Bayes net conditional independencies. Note that while the course uses Bayes Ball to find conditional independencies, you may find it more intuitive to think directly in terms of the d-separation rules, as in the previous item.
Lecture 3: Undirected graphs
- Markov random fields (MRFs), also known as undirected graphical models
Lecture 4: Factor graphs; generating and converting graphs
- factor graphs. Note that factor graphs and undirected graphical models are two different ways to represent the structure of Boltzmann distributions, and the only real difference is that factor graphs are a more fine-grained notation.
- converting between graphical models
Lecture 5: Perfect maps, chordal graphs, Markov chains, trees
- Nothing to go with this lecture, sorry.
Lecture 6: Gaussian graphical models
- multivariate Gaussian distribution
- information form for multivariate Gaussians
- Gaussian MRFs
- linear-Gaussian models, or Gaussian Bayes nets
Lecture 7: Inference on graphs: elimination algorithm
Lecture 8: Inference on trees: sum-product algorithm
- sum-product algorithm. Unfortunately, different sources differ in which version of this algorithm they present. Most of them use the factor graph version, which is covered in a later lecture. Koller and Friedman jump straight to the junction tree (clique tree) version, which is the most general, but it can be a lot to take in all at once. Start with whichever you like, and it should make the other versions easier to understand.
Lecture 9: Example: forward-backward algorithm
- hidden Markov models
- forward-backward algorithm
- HMM inference as a special case of belief propagation. This one covers MAP inference as well, which doesn't appear until a later lecture.
Lecture 10: Sum-product algorithm with factor graphs
- See the references for lecture 8, since some of them use factor graphs.
Lecture 11: MAP estimation and min-sum algorithm
- the max-product algorithm (Note that max-product, max-sum, and min-sum are all basically the same algorithm.)
- the Viterbi algorithm, the special case of max-product applied to HMMs
- HMM inference as a special case of belief propagation
- If you're feeling rusty on linear algebra, now is a good time to brush up since the Gaussian inference lectures will make heavy use of it.
Lecture 12: Inference with Gaussian graphical models
- Gaussian belief propagation
- connection between Gaussian inference and variable elimination
- Note that these nodes have quite a few linear algebra dependencies. You may want to review those before the lecture, so that the derivations will make sense.
Lecture 13: Example: Kalman filtering and smoothing
- Kalman filter, and derivation
- Kalman smoother
- Viewing Kalman smoothing as a special case of forward-backward
Lecture 14: Junction tree algorithm
Lecture 15: Loopy belief propagation, part 1
Lecture 16: Loopy belief propagation, part 2
Lecture 17: Variational inference
Lecture 18: Sampling by Markov chain Monte Carlo
Lecture 19: Approximate inference by particle methods
Lecture 20: Parameter estimation in directed graphs
- maximum likelihood
- learning Bayes net parameters
- Bayesian parameter estimation
- Bayesian estimation of Bayes net parameters