Expectation-Maximization algorithm

(2.1 hours to learn)


Expectation-Maximization (EM) is an algorithm for maximum likelihood estimation in models with hidden variables (usually missing data or latent variables). It involves iteratively computing expectations of terms in the log-likelihood function under the current posterior, and then solving for the maximum likelihood parameters. Common applications include fitting mixture models, learning Bayes net parameters with latent data, and learning hidden Markov models.
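The two alternating steps described above can be sketched for the most common application, fitting a mixture of Gaussians. This is an illustrative sketch, not drawn from any of the resources below: `em_gmm_1d` and its initialization scheme are hypothetical choices, and a practical implementation would add a convergence check and guard against degenerate (near-zero) variances.

```python
import numpy as np

def em_gmm_1d(x, n_components=2, n_iters=50):
    """Fit a 1-D Gaussian mixture by EM (illustrative sketch)."""
    n = len(x)
    # Initialize mixing weights, means, and variances.
    pi = np.full(n_components, 1.0 / n_components)
    mu = np.linspace(x.min(), x.max(), n_components)  # spread means over the data range
    var = np.full(n_components, np.var(x))
    for _ in range(n_iters):
        # E-step: responsibilities r[i, k] = p(z_i = k | x_i) under current parameters.
        log_p = (-0.5 * (x[:, None] - mu) ** 2 / var
                 - 0.5 * np.log(2 * np.pi * var) + np.log(pi))
        log_p -= log_p.max(axis=1, keepdims=True)  # stabilize before exponentiating
        r = np.exp(log_p)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: closed-form maximizers of the expected complete-data log-likelihood.
        nk = r.sum(axis=0)
        pi = nk / n
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, var

# Toy data: two well-separated clusters.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-5, 1, 200), rng.normal(5, 1, 200)])
pi, mu, var = em_gmm_1d(x)
```

On this toy data the recovered means land near the true cluster centers of -5 and 5; with poorly separated components or unlucky initialization, EM converges only to a local maximum of the likelihood.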


This concept has the prerequisites:

Core resources (read/watch one of the following)


Supplemental resources (the following are optional, but you may find them useful)


Mathematical Monk: Machine Learning (2011)
Online videos on machine learning.
Other notes:
  • The 16.4-16.5 sequence justifies EM by attempting to maximize the likelihood analytically, rather than via the more typical argument that EM maximizes a lower bound on the likelihood.
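For reference, the more typical lower-bound argument mentioned in the note above can be stated in one line. For any distribution $q(z)$ over the latent variables,

$$\log p(x \mid \theta) \;=\; \underbrace{\mathbb{E}_{q}\!\left[\log \frac{p(x, z \mid \theta)}{q(z)}\right]}_{\mathcal{L}(q,\,\theta)} \;+\; \mathrm{KL}\!\left(q(z) \,\|\, p(z \mid x, \theta)\right) \;\ge\; \mathcal{L}(q, \theta),$$

so the E-step sets $q(z) = p(z \mid x, \theta^{\mathrm{old}})$, which makes the bound tight, and the M-step maximizes $\mathcal{L}(q, \theta)$ over $\theta$; each iteration therefore cannot decrease $\log p(x \mid \theta)$.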
Bayesian Reasoning and Machine Learning
A textbook for a graduate machine learning course.
Author: David Barber
Additional dependencies:
  • KL divergence
  • Lagrange multipliers


See also