variational interpretation of EM
(50 minutes to learn)
The expectation-maximization (EM) algorithm can be interpreted as coordinate ascent on a variational lower bound of the log-likelihood: the E step maximizes the bound with respect to the distribution q over the latent variables (making the bound tight), and the M step maximizes it with respect to the model parameters. This view connects EM with variational inference algorithms and justifies various generalizations of and approximations to the algorithm.
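As an illustrative sketch (not drawn from the resources below), the following fits a two-component 1-D Gaussian mixture by EM and tracks the variational lower bound L(q, theta) = E_q[log p(x, z | theta)] - E_q[log q(z)] at each iteration; because each step is a coordinate ascent update on L, the bound never decreases. All variable names (responsibilities `r`, etc.) are our own choices.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data: two well-separated Gaussian clusters
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 200)])

pi = np.array([0.5, 0.5])      # mixing weights
mu = np.array([-1.0, 1.0])     # component means
var = np.array([1.0, 1.0])     # component variances

def log_joint(x, pi, mu, var):
    # log pi_k + log N(x_n | mu_k, var_k), shape (N, K)
    return (np.log(pi)
            - 0.5 * np.log(2 * np.pi * var)
            - 0.5 * (x[:, None] - mu) ** 2 / var)

bounds = []
for _ in range(30):
    # E step: set q(z_n = k) to the responsibilities r_nk,
    # the optimal q, which makes the bound equal log p(x | theta).
    lj = log_joint(x, pi, mu, var)
    log_norm = np.logaddexp.reduce(lj, axis=1, keepdims=True)
    r = np.exp(lj - log_norm)

    # Lower bound: sum_n sum_k r_nk * (log joint - log r_nk)
    L = np.sum(r * (lj - np.log(np.maximum(r, 1e-300))))
    bounds.append(L)

    # M step: maximize the bound over theta with q held fixed
    Nk = r.sum(axis=0)
    pi = Nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / Nk
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk

# Coordinate ascent guarantee: the bound is non-decreasing.
assert all(b2 >= b1 - 1e-9 for b1, b2 in zip(bounds, bounds[1:]))
```

Replacing the exact E step with a restricted or approximate family for q (as in variational EM) still yields a valid lower bound, which is the generalization the coordinate-ascent view justifies.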
This concept has the prerequisites:
Core resources (read/watch one of the following)
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location: Section 9.4, pages 450-455
→ Machine Learning: a Probabilistic Perspective
A very comprehensive graduate-level machine learning textbook.
Location: Section 11.4.7, pages 363-365
-No Additional Notes-