MRF parameter learning

(1.1 hours to learn)


The parameters of a Markov random field (MRF) can be fit to data using maximum likelihood. The optimal parameters have an interesting interpretation: they are exactly the parameters for which certain sufficient statistics of the model match the corresponding statistics of the empirical distribution.
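Concretely, for a log-linear parameterization (a standard choice, assumed here for illustration) the moment-matching statement can be written as:

```latex
% Log-linear MRF with feature vector \phi and partition function Z:
%   p(x; \theta) = \frac{1}{Z(\theta)} \exp\big(\theta^\top \phi(x)\big)
% Average log-likelihood of data x^{(1)}, \dots, x^{(N)}:
\ell(\theta) = \frac{1}{N} \sum_{n=1}^{N} \theta^\top \phi(x^{(n)}) - \log Z(\theta)
% Its gradient is data statistics minus model statistics:
\nabla_\theta \ell(\theta) = \mathbb{E}_{\text{data}}[\phi(x)] - \mathbb{E}_{p(x;\theta)}[\phi(x)]
% Setting the gradient to zero gives the moment-matching condition:
\mathbb{E}_{p(x;\hat\theta)}[\phi(x)] = \mathbb{E}_{\text{data}}[\phi(x)]
```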


Goals:


  • Consider the maximum likelihood objective function for learning the parameters of an MRF given fully observed data.
    • Why doesn't the optimization problem decompose into separate optimization problems for each variable?
    • Why is it hard even to compute the objective function?
    • Derive the gradient of the objective function.
    • By setting the gradient to zero, show that for the maximum likelihood parameters, the model statistics must match the data statistics.
  • Gradient ascent on the likelihood (equivalently, gradient descent on the negative log-likelihood) requires performing inference in the MRF. Which quantities need to be computed?
  • Optional: show that the maximum likelihood optimization problem is convex (which implies there are no non-global local optima). You may first want to read about covariance matrices and [convex optimization](convex_optimization).
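The steps above can be sketched on a toy problem. The following is a minimal illustration (not from the course): a log-linear MRF over three binary variables on a chain, with one indicator feature per edge, fit by gradient ascent using exact inference by brute-force enumeration, which is only feasible for tiny models. At convergence, the model's feature expectations match the empirical ones.

```python
# Toy sketch: maximum likelihood learning for a tiny binary MRF.
# Model (assumed log-linear form): p(x) ∝ exp(theta · phi(x)), x in {0,1}^3.
import itertools
import math

EDGES = [(0, 1), (1, 2)]  # chain 0 - 1 - 2

def phi(x):
    # One feature per edge: 1 if both endpoints are "on".
    return [x[i] * x[j] for (i, j) in EDGES]

def model_expectations(theta):
    # Exact inference by enumerating all 2^3 joint states.
    states = list(itertools.product([0, 1], repeat=3))
    scores = [math.exp(sum(t * f for t, f in zip(theta, phi(x)))) for x in states]
    Z = sum(scores)  # partition function
    probs = [s / Z for s in scores]
    return [sum(p * phi(x)[k] for p, x in zip(probs, states))
            for k in range(len(EDGES))]

def fit(data, lr=0.5, steps=2000):
    # Gradient ascent on the average log-likelihood:
    # gradient = E_data[phi] - E_model[phi].
    n_feat = len(EDGES)
    data_stats = [sum(phi(x)[k] for x in data) / len(data) for k in range(n_feat)]
    theta = [0.0] * n_feat
    for _ in range(steps):
        model_stats = model_expectations(theta)
        theta = [t + lr * (d - m)
                 for t, d, m in zip(theta, data_stats, model_stats)]
    return theta, data_stats

data = [(1, 1, 0), (1, 1, 1), (0, 1, 1), (0, 0, 0)]
theta, data_stats = fit(data)
model_stats = model_expectations(theta)
# At the optimum, model statistics match the data statistics.
print([round(m, 3) for m in model_stats], [round(d, 3) for d in data_stats])
```

Note that each gradient step requires computing the model's expected statistics, i.e. performing inference; here that is done by enumeration, but in realistic MRFs it is the expensive part.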

Core resources (read/watch one of the following)


Coursera: Probabilistic Graphical Models (2013)
An online course on probabilistic graphical models.
Author: Daphne Koller
Other notes:
  • Click on "Preview" to see the videos.


See also