Bayesian parameter estimation: multinomial distribution

(1.1 hours to learn)


Suppose we observe a set of draws from a multinomial distribution with unknown parameters, and we want to predict the distribution over subsequent draws. If we put a Dirichlet prior over the category probabilities, we can analytically integrate out the parameters to get the posterior predictive distribution. This has a very simple form: add the prior's pseudo-counts ("fake counts") to the observed counts and normalize. These ideas are used more generally in Bayesian models involving discrete variables.
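As a sketch of the "add fake counts and normalize" rule, here is a minimal Python example (the helper name `posterior_predictive` is illustrative, not from any particular library):

```python
import numpy as np

def posterior_predictive(counts, alpha):
    """Probability of each category on the next draw, with the multinomial
    parameters integrated out under a Dirichlet(alpha) prior: add the prior
    pseudo-counts to the observed counts and normalize."""
    counts = np.asarray(counts, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    return (counts + alpha) / (counts.sum() + alpha.sum())

# e.g. a three-category distribution observed 10 times,
# with a uniform Dirichlet(1, 1, 1) prior:
probs = posterior_predictive([6, 3, 1], [1.0, 1.0, 1.0])
print(probs)  # entry k is (n_k + alpha_k) / (N + sum(alpha))
```

Note that with all observed counts at zero this reduces to the prior mean, and as the counts grow it approaches the empirical frequencies.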


Goals:


  • Know how the Dirichlet distribution is defined and what the parameters represent
  • Know what the Dirichlet-multinomial model is
  • Derive the posterior distribution and posterior predictive distribution
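The results referenced in the last goal can be summarized as follows. With a Dirichlet prior with parameters α₁, …, α_K and observed counts n₁, …, n_K (N = Σ_k n_k), conjugacy gives:

```latex
p(\theta \mid \mathbf{x}) = \mathrm{Dirichlet}(\alpha_1 + n_1, \ldots, \alpha_K + n_K)
\qquad
p(x_{N+1} = k \mid \mathbf{x}) = \frac{n_k + \alpha_k}{N + \sum_{j=1}^{K} \alpha_j}
```

The second equation is the "fake counts" rule from the summary above: the prior parameters act as pseudo-counts added to the observed counts before normalizing.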

Core resources (read/watch one of the following)


Supplemental resources (the following are optional, but you may find them useful)


Coursera: Probabilistic Graphical Models (2013)
An online course on probabilistic graphical models.
Author: Daphne Koller
Additional dependencies:
  • maximum likelihood
  • Bayesian networks
Other notes:
  • Click on "Preview" to see the videos.
