learning Bayes net parameters with missing data
(2.4 hours to learn)
Summary
There is no closed-form solution for the maximum likelihood parameters of a Bayes net when some of the variables are unobserved. However, it is possible to apply the EM algorithm, where the E step involves computing marginals and the M step involves computing the maximum likelihood parameters with fully observed data.
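As a concrete sketch of this recipe (not part of the original page), here is EM on a toy two-node net Z → X, both binary, where Z is missing in some records. The E step computes the posterior marginal P(Z | x) for each incomplete record; the M step applies the usual closed-form maximum likelihood estimates to the resulting expected counts. All names and data below are made up for illustration:

```python
# Toy Bayes net Z -> X, both binary. Some records have Z missing
# (None); X is always observed. (Hypothetical data, for illustration.)

def em_step(data, pi, theta):
    """One EM iteration. pi = P(Z=1); theta[z] = P(X=1 | Z=z)."""
    n = len(data)
    nz = [0.0, 0.0]    # expected count of records with Z=z
    exz = [0.0, 0.0]   # expected count of records with Z=z and X=1
    for z, x in data:
        if z is None:
            # E step: posterior marginal P(Z=1 | x)
            p1 = pi * (theta[1] if x else 1 - theta[1])
            p0 = (1 - pi) * (theta[0] if x else 1 - theta[0])
            r = p1 / (p1 + p0)
        else:
            # Z observed: the "posterior" is just 0 or 1
            r = float(z)
        nz[1] += r
        nz[0] += 1 - r
        exz[1] += r * x
        exz[0] += (1 - r) * x
    # M step: same closed-form ML estimates as in the fully observed
    # case, applied to expected rather than observed counts
    pi_new = nz[1] / n
    theta_new = [exz[0] / nz[0], exz[1] / nz[1]]
    return pi_new, theta_new

# usage: iterate to a (local) maximum of the likelihood
data = [(1, 1), (1, 1), (0, 0), (None, 1), (None, 0), (0, 0), (1, 0), (None, 1)]
pi, theta = 0.5, [0.3, 0.7]
for _ in range(50):
    pi, theta = em_step(data, pi, theta)
```

For larger nets the structure is the same: the inference toolbox supplies the posterior marginals over each family (the E step), and the fully-observed parameter learner consumes the expected counts (the M step).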
Context
This concept has the prerequisites:
- Bayes net parameter learning
- Expectation-Maximization algorithm (We can use the EM algorithm to fill in missing data.)
- maximum likelihood (Maximum likelihood is used in the M step.)
- inference in MRFs (The E step requires solving an inference problem.)
Goals
- Be able to use the EM algorithm to learn Bayes net parameters when some of the variables are unobserved.
- Know how to derive the update rules.
- Know how you would implement it given a toolbox for inference and one for parameter learning with fully observed data. Which outputs do you need from the inference algorithm?
- What is the missing at random assumption, and why is it needed to apply EM?
- In the fully observed case, maximum likelihood estimation decomposes into separate estimation problems for each clique. Why doesn't this decomposition happen when there is missing data?
- Why does the decomposition nevertheless hold in the M step?
- Give an example where the likelihood function is multimodal (and therefore you shouldn't always expect to find the global optimum).
- Give an example where the model is unidentifiable, i.e. multiple parameter settings are equally good.
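On the last goal, one standard source of unidentifiability is label switching: when Z is fully latent in the net Z → X, relabeling the latent states (and permuting the parameters accordingly) leaves the marginal distribution over X unchanged. A minimal sketch, with toy numbers chosen only for illustration:

```python
import math

def log_lik(xs, pi, theta):
    """Log-likelihood of observed binary X values with Z marginalized out.
    pi = P(Z=1); theta[z] = P(X=1 | Z=z)."""
    ll = 0.0
    for x in xs:
        px = pi * (theta[1] if x else 1 - theta[1]) \
           + (1 - pi) * (theta[0] if x else 1 - theta[0])
        ll += math.log(px)
    return ll

xs = [1, 0, 1, 1, 0]
a = log_lik(xs, 0.3, [0.2, 0.9])
b = log_lik(xs, 0.7, [0.9, 0.2])   # swap the latent labels
# a and b agree (up to floating point): two distinct parameter
# settings are equally good, so the model is unidentifiable
```

The same symmetry also makes the likelihood multimodal, which is why EM runs from different initializations can converge to different answers.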
Core resources (read/watch one of the following)
-Free-
→ Coursera: Probabilistic Graphical Models (2013)
An online course on probabilistic graphical models.
Other notes:
- The lecture "EM in practice" has good practical advice about using EM, and "Latent variables" talks about some cool applications.
- Click on "Preview" to see the videos.
-Paid-
→ Probabilistic Graphical Models: Principles and Techniques
A very comprehensive textbook for a graduate-level course on probabilistic AI.
- Section 19.1, "Foundations," pages 849-862
- Section 19.2.2.3, "The EM algorithm for Bayesian networks," pages 872-875
Other notes:
- The part of 19.2.2.3 about exponential families is optional.
Supplemental resources (the following are optional, but you may find them useful)
-Free-
→ Coursera: Machine Learning
An online machine learning course aimed at advanced undergraduates.
Other notes:
- Click on "Preview" to see the videos.
See also
-No Additional Notes-