# Learning Bayes net parameters with missing data

(2.4 hours to learn)

## Summary

There is no closed-form solution for the maximum likelihood parameters of a Bayes net when some of the variables are unobserved. However, we can apply the EM algorithm: the E step computes posterior marginals over the unobserved variables, and the M step computes the maximum likelihood parameters from the resulting expected counts, just as in the fully observed case.
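To see how the two steps fit together, here is a minimal sketch of EM on a toy Bayes net with one hidden binary variable H and two observed binary children X1 and X2. All names and numbers are illustrative assumptions, not taken from the course materials:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Bayes net: hidden H -> observed X1, X2 (all binary).
# True parameters are used only to generate synthetic data.
true_pH = np.array([0.3, 0.7])                  # p(H)
true_pX1 = np.array([[0.9, 0.1], [0.2, 0.8]])   # p(X1 | H), rows indexed by H
true_pX2 = np.array([[0.8, 0.2], [0.1, 0.9]])   # p(X2 | H)

N = 2000
H = rng.choice(2, size=N, p=true_pH)
X1 = np.array([rng.choice(2, p=true_pX1[h]) for h in H])
X2 = np.array([rng.choice(2, p=true_pX2[h]) for h in H])

# Random initialization; H is never observed during learning.
pH = np.array([0.5, 0.5])
pX1 = rng.dirichlet([1, 1], size=2)
pX2 = rng.dirichlet([1, 1], size=2)

logliks = []
for _ in range(200):
    # E step: posterior marginal q(h | x1, x2) for every data point.
    joint = pH[None, :] * pX1[:, X1].T * pX2[:, X2].T   # shape (N, 2)
    logliks.append(np.log(joint.sum(axis=1)).sum())
    q = joint / joint.sum(axis=1, keepdims=True)

    # M step: maximum likelihood from expected counts, as if H were observed.
    pH = q.mean(axis=0)
    for x in (0, 1):
        pX1[:, x] = q[X1 == x].sum(axis=0)
        pX2[:, x] = q[X2 == x].sum(axis=0)
    pX1 /= pX1.sum(axis=1, keepdims=True)
    pX2 /= pX2.sum(axis=1, keepdims=True)
```

Note that the only outputs needed from inference are the posterior marginals `q`, and the M step is exactly the fully-observed estimator applied to fractional (expected) counts. The log-likelihood is guaranteed to be non-decreasing across iterations, which is a useful sanity check in any implementation.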

## Context

This concept has the prerequisites:

- Bayes net parameter learning
- Expectation-Maximization algorithm (We can use the EM algorithm to fill in missing data.)
- maximum likelihood (Maximum likelihood is used in the M step.)
- inference in MRFs (The E step requires solving an inference problem.)

## Goals

- Be able to use the EM algorithm to learn Bayes net parameters when some of the variables are unobserved.
- Know how to derive the update rules.
- Know how you would implement it if you're given toolboxes for inference and for parameter learning with fully observed data. What outputs are needed from the inference algorithm?

- What is the missing at random assumption, and why is it needed to apply EM?

- In the fully observed case, maximum likelihood decomposes into separate estimation problems for each clique. Why doesn't that happen when there is missing data?
- And why does the decomposition hold in the M step?

- Give an example where the likelihood function is multimodal (and therefore you shouldn't always expect to find the global optimum).

- Give an example where the model is unidentifiable, i.e. multiple parameter settings are equally good.
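One standard source of unidentifiability, offered here as an illustrative sketch (the numbers are made up): relabeling the states of a hidden variable, and permuting its children's CPTs to match, leaves the likelihood of the observed data unchanged, so two distinct parameter settings fit equally well.

```python
import numpy as np

# Hidden binary H with one observed binary child X.
pH = np.array([0.3, 0.7])
pX = np.array([[0.9, 0.1], [0.2, 0.8]])  # p(X | H), rows indexed by H

def marginal(x, pH, pX):
    # Observed-data likelihood: p(x) = sum_h p(h) p(x | h).
    return float(pH @ pX[:, x])

# Swap the labels of the two hidden states.
pH_swapped = pH[::-1]
pX_swapped = pX[::-1]

# The observed-data likelihood is identical under both parameter settings.
for x in (0, 1):
    assert np.isclose(marginal(x, pH, pX), marginal(x, pH_swapped, pX_swapped))
```

This "label switching" symmetry also implies the likelihood surface has at least two equally good modes, which connects to the multimodality question above.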

## Core resources (read/watch one of the following)

## -Free-

→ Coursera: Probabilistic Graphical Models (2013)

An online course on probabilistic graphical models.

Other notes:

- The lecture "EM in practice" has good practical advice about using EM, and "Latent variables" talks about some cool applications.
- Click on "Preview" to see the videos.

## -Paid-

→ Probabilistic Graphical Models: Principles and Techniques

A very comprehensive textbook for a graduate-level course on probabilistic AI.

- Section 19.1, "Foundations," pages 849-862
- Section 19.2.2.3, "The EM algorithm for Bayesian networks," pages 872-875

Other notes:

- The part of 19.2.2.3 about exponential families is optional.

## Supplemental resources (the following are optional, but you may find them useful)

## -Free-

→ Coursera: Machine Learning

An online machine learning course aimed at advanced undergraduates.

Other notes:

- Click on "Preview" to see the videos.

## See also

-No Additional Notes-