Dirichlet process

(45 minutes to learn)

Summary

The Dirichlet process (DP) is a stochastic process that defines a probability distribution over infinite-dimensional discrete distributions, meaning that a draw from a DP is itself a distribution (with a countably infinite number of parameters). Its name stems from the fact that, for any finite partition of the underlying space, the probabilities that a DP draw assigns to the cells of the partition are jointly Dirichlet distributed. While the DP is often discussed alongside the Chinese Restaurant Process (CRP), the two are not the same object. The DP is the de Finetti mixing measure for the CRP, meaning that sampling i.i.d. from a draw of a DP, with that draw marginalized out, is equivalent to sequentially drawing samples from the CRP.
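The sequential side of this equivalence can be made concrete with a minimal sketch of the CRP seating rule. The function name `crp_assignments` and the concentration parameter `alpha` are illustrative assumptions and do not come from the resources listed below. Customer i (counting from 0) joins an existing table with probability proportional to the number of customers already seated there, or opens a new table with probability proportional to `alpha`.

```python
import random


def crp_assignments(n, alpha, rng):
    """Seat n customers one at a time under a CRP with concentration alpha.

    Returns (assignments, counts), where assignments[i] is the table index of
    customer i and counts[k] is the number of customers seated at table k.
    All names here are hypothetical and chosen for illustration only.
    """
    counts = []
    assignments = []
    for i in range(n):
        # i customers are already seated, so the total unnormalized mass is
        # i (existing tables) + alpha (new table).
        u = rng.random() * (i + alpha)
        table = len(counts)  # default: open a new table
        cumulative = 0.0
        for k, c in enumerate(counts):
            cumulative += c
            if u < cumulative:
                table = k  # join existing table k
                break
        if table == len(counts):
            counts.append(0)
        counts[table] += 1
        assignments.append(table)
    return assignments, counts


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, assumed only for reproducibility
    assignments, counts = crp_assignments(100, alpha=1.5, rng=rng)
    print("number of tables:", len(counts))
    print("table sizes:", counts)
```

Larger values of `alpha` tend to produce more tables. The table indices play the role of cluster labels; in a DP mixture, each table would additionally be paired with a parameter drawn from the base distribution, which this sketch omits.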

Context

This concept has the prerequisites:

Core resources (read/watch one of the following)

-Free-

Graphical Models for Visual Object Recognition and Tracking (2006)
Erik Sudderth's Ph.D. thesis, which includes readable overviews of a variety of topics.
Author: Erik Sudderth

Supplemental resources (the following are optional, but you may find them useful)

-Free-

Dirichlet Process
Author: Yee Whye Teh
Other notes:
  • assumes some familiarity with measure theory

Bayesian Nonparametrics (2011)
Location: part 2, from 0:00 - 10:33
Author: Yee Whye Teh
Other notes:
  • DP is further discussed throughout the entire lecture

See also

  • The beta process is an analogue of the Dirichlet process that is useful for probabilistic models representing binary attributes.
  • Dirichlet diffusion trees are a hierarchical clustering model based on the same ideas as the Dirichlet process.