Bayesian parameter estimation: multivariate Gaussians
(1 hour to learn)
Summary
Using the Bayesian framework, we can infer the posterior over the mean vector of a multivariate Gaussian, the covariance matrix, or both. Since multivariate Gaussians are widely used in probabilistic modeling, the computations that go into this are common motifs in Bayesian machine learning more generally.
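For instance, in the simplest case (unknown mean, known covariance), a Gaussian prior on the mean is conjugate: with prior \mu \sim \mathcal{N}(m_0, V_0) and N observations with sample mean \bar{x}, the posterior is again Gaussian. This is a standard result; the notation (m_0, V_0) here is generic rather than taken from the readings below.

    \mu \mid \mathcal{D} \sim \mathcal{N}(m_N, V_N), \qquad
    V_N^{-1} = V_0^{-1} + N\Sigma^{-1}, \qquad
    m_N = V_N\!\left(V_0^{-1} m_0 + N\Sigma^{-1}\bar{x}\right).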
Context
This concept has the prerequisites:
- Bayesian parameter estimation
- information form for multivariate Gaussians (The prior over covariance is easiest to express in information form.)
- Wishart distribution (The Wishart distribution is the conjugate prior for the precision matrix of a multivariate Gaussian.)
- Student-t distribution (The predictive distribution is a Student-t distribution.)
- computations on multivariate Gaussians (We need to perform various operations with multivariate Gaussians to obtain the posterior over the mean vector.)
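To make the Wishart conjugacy mentioned above concrete: if the mean \mu is known and the precision \Lambda = \Sigma^{-1} has a Wishart prior \mathcal{W}(W_0, \nu_0), the posterior is again Wishart. The update below uses the scale-matrix convention for the Wishart; other texts parameterize it differently.

    \Lambda \mid \mathcal{D} \sim \mathcal{W}\!\left(\Big(W_0^{-1} + \textstyle\sum_{i=1}^{N}(x_i-\mu)(x_i-\mu)^\top\Big)^{-1},\; \nu_0 + N\right).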
Goals
- Derive the conjugate priors for the multivariate Gaussian distribution in three cases:
- unknown mean, but known covariance
- known mean, but unknown covariance
- unknown mean and unknown covariance
- Derive the posterior distributions for each of these cases. (The code sketch after this list illustrates the fully unknown case.)
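As a minimal numerical sketch of the third case (unknown mean and covariance), the following Python snippet implements the standard Normal-inverse-Wishart conjugate update, \mu \mid \Sigma \sim \mathcal{N}(m_0, \Sigma/\kappa_0), \Sigma \sim \mathcal{IW}(\Psi_0, \nu_0). The function name niw_posterior and the hyperparameter names are illustrative, and the (m_0, \kappa_0, \nu_0, \Psi_0) parameterization is one common convention, not necessarily the one used in the readings.

    import numpy as np

    def niw_posterior(X, m0, kappa0, nu0, Psi0):
        """Conjugate update for a multivariate Gaussian with unknown mean
        and covariance under a Normal-inverse-Wishart prior:
            mu | Sigma ~ N(m0, Sigma / kappa0),   Sigma ~ IW(Psi0, nu0).
        Returns the posterior hyperparameters (mN, kappaN, nuN, PsiN)."""
        N, d = X.shape
        xbar = X.mean(axis=0)
        S = (X - xbar).T @ (X - xbar)      # scatter matrix about the sample mean

        kappaN = kappa0 + N
        nuN = nu0 + N
        mN = (kappa0 * m0 + N * xbar) / kappaN
        diff = (xbar - m0).reshape(-1, 1)
        PsiN = Psi0 + S + (kappa0 * N / kappaN) * (diff @ diff.T)
        return mN, kappaN, nuN, PsiN

    # Toy usage: weakly informative prior, 50 draws of 2-D synthetic data.
    rng = np.random.default_rng(0)
    X = rng.multivariate_normal([1.0, -2.0], [[2.0, 0.5], [0.5, 1.0]], size=50)
    mN, kappaN, nuN, PsiN = niw_posterior(X, m0=np.zeros(2), kappa0=1.0,
                                          nu0=4.0, Psi0=np.eye(2))
    print("posterior mean of mu:   ", mN)
    print("posterior mean of Sigma:\n", PsiN / (nuN - 2 - 1))  # E[Sigma] = PsiN / (nuN - d - 1)

From these posterior hyperparameters, the predictive distribution for a new observation is a multivariate Student-t, which is why the Student-t distribution appears as a prerequisite above.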
Core resources (read/watch one of the following)
-Paid-
→ Machine Learning: a Probabilistic Perspective
A very comprehensive graduate-level machine learning textbook.
Location:
Sections 4.6-4.6.2, pgs. 127-131
Supplemental resources (the following are optional, but you may find them useful)
-Paid-
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location:
Section 2.3.6, pgs. 97-102
See also
- These techniques are used in various models; for example, the Wishart process allows us to model dependencies between different Wishart-distributed random variables.
- When there's not enough data to estimate a full covariance matrix, related models that impose more structure on the covariance can help.