Bayesian parameter estimation: multinomial distribution
(1.1 hours to learn)
Suppose we observe a set of draws from a multinomial distribution with unknown parameters and we're trying to predict the distribution over subsequent draws. If we put a Dirichlet prior over the probabilities, we can analytically integrate out the parameters to get the posterior predictive distribution. This has a very simple form: add the prior's pseudo-counts to the observed counts and normalize. These ideas are used more generally in Bayesian models involving discrete variables.
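As a minimal sketch of the pseudo-count idea (the counts and prior values below are made up for illustration):

```python
import numpy as np

# Hypothetical example: posterior predictive for a Dirichlet-multinomial model.
# We observe counts n_k over K = 3 categories, with a Dirichlet(alpha) prior.
counts = np.array([3.0, 0.0, 7.0])   # observed draws per category
alpha = np.array([1.0, 1.0, 1.0])    # uniform Dirichlet prior (pseudo-counts)

# The posterior over the category probabilities is Dirichlet(alpha + counts),
# and the posterior predictive for the next draw is just the normalized sum
# of the real counts and the prior pseudo-counts:
#   p(next draw = k | data) = (n_k + alpha_k) / (N + sum_j alpha_j)
predictive = (counts + alpha) / (counts + alpha).sum()
print(predictive)
```

Note that even the category with zero observed counts gets nonzero predictive probability, since the prior pseudo-counts smooth the estimates.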
This concept has the prerequisites:
- Know how the Dirichlet distribution is defined and what the parameters represent
- Know what the Dirichlet-multinomial model is
- Be able to derive the posterior distribution and posterior predictive distribution
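As a sketch of the derivation the last prerequisite refers to: with observed counts n = (n_1, ..., n_K), N = sum_k n_k, and a Dirichlet(alpha) prior, conjugacy gives

```latex
p(\theta \mid \mathbf{n})
  \propto \underbrace{\prod_{k} \theta_k^{n_k}}_{\text{likelihood}}
          \underbrace{\prod_{k} \theta_k^{\alpha_k - 1}}_{\text{prior}}
  = \prod_{k} \theta_k^{n_k + \alpha_k - 1},
\qquad
\theta \mid \mathbf{n} \sim \mathrm{Dirichlet}(\alpha_1 + n_1, \ldots, \alpha_K + n_K),
\qquad
p(x_{N+1} = k \mid \mathbf{n})
  = \mathbb{E}[\theta_k \mid \mathbf{n}]
  = \frac{n_k + \alpha_k}{N + \sum_j \alpha_j}.
```

The last expression is the "add pseudo-counts and normalize" rule from the summary above.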
Core resources (read/watch one of the following)
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location: Section 2.2, pages 74-77
→ Machine Learning: a Probabilistic Perspective
A very comprehensive graduate-level machine learning textbook.
Location: Sections 2.5.4 (pages 47-49) and 3.4 (pages 78-82)