Dirichlet distribution
(45 minutes to learn)
Summary
The Dirichlet distribution is a distribution over n-dimensional vectors whose entries are nonnegative and sum to 1, so it can be viewed as a probability distribution on the (n-1)-dimensional simplex (a simplex is a generalization of a triangle to arbitrary dimensions). Its parameters determine how mass is distributed over this simplex. The Dirichlet distribution is a conjugate prior to the categorical and multinomial distributions, which is why it is common in Bayesian statistics. It also generalizes the beta distribution to higher dimensions (for n=2 it is the beta distribution).
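For reference, the density can be written as follows. The symbols are notational assumptions, not taken from this page: x = (x_1, ..., x_n) is a point on the simplex and alpha_1, ..., alpha_n > 0 are the parameters.

```latex
p(x_1, \dots, x_n \mid \alpha_1, \dots, \alpha_n)
  = \frac{\Gamma\!\left(\sum_{i=1}^{n} \alpha_i\right)}{\prod_{i=1}^{n} \Gamma(\alpha_i)}
    \prod_{i=1}^{n} x_i^{\alpha_i - 1},
\qquad x_i \ge 0, \quad \sum_{i=1}^{n} x_i = 1 .
```

Setting n=2 and writing x_2 = 1 - x_1 gives the Beta(alpha_1, alpha_2) density in x_1, which is the sense in which the Dirichlet distribution generalizes the beta distribution.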
Context
This concept has the prerequisites:
- beta distribution (The Dirichlet distribution is a multivariate generalization of the beta distribution.)
- gamma function (The gamma function is part of the definition of the Dirichlet distribution.)
- multinomial distribution (The Dirichlet distribution is most commonly used as the conjugate prior for the multinomial distribution.)
Core resources (read/watch one of the following)
-Free-
→ Introduction to the Dirichlet Distribution and Related Processes
→ Mathematical Monk: Machine Learning (2011)
Supplemental resources (the following are optional, but you may find them useful)
-Paid-
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location:
Section 2.2.2 (page 76)
See also
- We can define the Dirichlet distribution in terms of the gamma distribution.
- The Dirichlet process is a generalization of the Dirichlet distribution to possibly infinite spaces, and is useful in mixture modeling.
- The Dirichlet distribution is a conjugate prior to the categorical and multinomial distributions.
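
Two of the relationships listed above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation. The function names (sample_dirichlet, posterior_alpha, dirichlet_mean), the uniform prior, and the example counts are all hypothetical and chosen for illustration. The sketch relies on two standard facts: normalizing independent Gamma(alpha_i, 1) draws yields a Dirichlet(alpha) sample, and a Dirichlet prior combined with observed category counts yields a Dirichlet posterior whose parameters are the prior parameters plus the counts.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet(alpha) sample by normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def posterior_alpha(prior_alpha, counts):
    """Conjugate update: add the observed count for each category to its prior parameter."""
    if len(prior_alpha) != len(counts):
        raise ValueError("prior_alpha and counts must have the same length")
    return [a + c for a, c in zip(prior_alpha, counts)]


def dirichlet_mean(alpha):
    """Mean of Dirichlet(alpha): each parameter divided by the sum of all parameters."""
    total = sum(alpha)
    return [a / total for a in alpha]


# Hypothetical example: a uniform prior over 3 categories and made-up counts.
rng = random.Random(0)       # seeded so repeated runs draw the same sample
prior = [1.0, 1.0, 1.0]
counts = [5, 2, 0]

post = posterior_alpha(prior, counts)    # [6.0, 3.0, 1.0]
post_mean = dirichlet_mean(post)         # posterior mean of the category probabilities
sample = sample_dirichlet(post, rng)     # one probability vector drawn from the posterior
```

Each sample is a vector of nonnegative entries that sum to 1, that is, a point on the simplex. Category 3 has a zero count but still receives positive posterior mass because of the prior parameter.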