# Dirichlet distribution

(45 minutes to learn)

## Summary

The Dirichlet distribution is a distribution over n-dimensional vectors whose entries are nonnegative and sum to 1, so it can be viewed as a probability distribution on the (n-1)-dimensional simplex (a simplex is a generalization of a triangle to arbitrary dimensions). Its parameters determine how the mass is distributed over this simplex. The Dirichlet distribution is a conjugate prior to the categorical and multinomial distributions, and for this reason it is common in Bayesian statistics. It is also a generalization of the beta distribution to higher dimensions (for n=2 it is the beta distribution).
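To make the simplex picture concrete, here is a minimal sketch that draws samples with NumPy's `Generator.dirichlet`. The parameter vector `alpha` and the sample count are illustrative choices, not values from any of the resources below. Each sample should be a nonnegative vector summing to 1, and the sample mean should approach `alpha / alpha.sum()`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (an assumption for this sketch): n = 3 components.
alpha = np.array([2.0, 3.0, 5.0])

# Each row is one draw, i.e. a point on the 2-dimensional simplex.
samples = rng.dirichlet(alpha, size=10_000)

# Every draw is a valid probability vector.
row_sums = samples.sum(axis=1)

# The mean of a Dirichlet is alpha normalized to sum to 1.
empirical_mean = samples.mean(axis=0)
expected_mean = alpha / alpha.sum()

print(empirical_mean, expected_mean)
```

Scaling all entries of `alpha` up by the same factor keeps the mean fixed and concentrates the samples more tightly around it.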

## Context

This concept has the prerequisites:

- beta distribution (The Dirichlet distribution is a multivariate generalization of the beta distribution.)
- gamma function (The gamma function is part of the definition of the Dirichlet distribution.)
- multinomial distribution (The Dirichlet distribution is most commonly used as the conjugate prior for the multinomial distribution.)
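The conjugacy mentioned in the last prerequisite can be sketched in a few lines: with a Dirichlet prior and observed category counts, the posterior is again a Dirichlet whose parameters are the prior parameters plus the counts. The function name `dirichlet_posterior` and the example numbers are hypothetical, chosen only for illustration.

```python
import numpy as np

def dirichlet_posterior(alpha_prior, counts):
    """Return posterior Dirichlet parameters given multinomial counts.

    Hypothetical helper for illustration: conjugacy means the update
    is simply elementwise addition of the counts to the prior parameters.
    """
    alpha_prior = np.asarray(alpha_prior, dtype=float)
    counts = np.asarray(counts, dtype=float)
    if alpha_prior.shape != counts.shape:
        raise ValueError("alpha_prior and counts must have the same shape")
    return alpha_prior + counts

# Uniform prior over three categories, then 10 observations.
alpha_prior = [1.0, 1.0, 1.0]
counts = [3, 0, 7]

alpha_post = dirichlet_posterior(alpha_prior, counts)

# Posterior mean estimate of the category probabilities.
posterior_mean = alpha_post / alpha_post.sum()
print(alpha_post, posterior_mean)
```

Note that the category with zero observations still receives nonzero posterior mean, because the prior parameters act like pseudo-counts.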

## Core resources (read/watch one of the following)

## -Free-

→ Introduction to the Dirichlet Distribution and Related Processes

→ Mathematical Monk: Machine Learning (2011)

## Supplemental resources (the following are optional, but you may find them useful)

## -Paid-

→ Pattern Recognition and Machine Learning

A textbook for a graduate machine learning course, with a focus on Bayesian methods.

Location:
Section 2.2.2 (page 76)

## See also

- We can define the Dirichlet distribution in terms of the gamma distribution.
- The Dirichlet process is a generalization of the Dirichlet distribution to possibly infinite spaces, and is useful in mixture modeling.
- The Dirichlet distribution is a conjugate prior to the categorical and multinomial distributions.
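The gamma-distribution connection in the first item above can be sketched as follows: draw independent gamma variables with shape parameters equal to the Dirichlet parameters and a common scale, then normalize them to sum to 1. The helper name `dirichlet_via_gamma` and the parameter values are assumptions made for this sketch.

```python
import numpy as np

def dirichlet_via_gamma(alpha, size, rng):
    """Sample Dirichlet(alpha) by normalizing independent gamma draws.

    Hypothetical helper for illustration. Column i holds draws from
    Gamma(shape=alpha[i], scale=1); dividing each row by its sum
    gives a Dirichlet(alpha) sample.
    """
    alpha = np.asarray(alpha, dtype=float)
    g = rng.gamma(shape=alpha, scale=1.0, size=(size, alpha.size))
    return g / g.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)
alpha = np.array([0.5, 1.5, 4.0])  # illustrative parameters

draws = dirichlet_via_gamma(alpha, size=20_000, rng=rng)
mean_estimate = draws.mean(axis=0)
print(mean_estimate, alpha / alpha.sum())
```

The common scale cancels in the normalization, so any positive scale shared by all components would give the same distribution.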