Bayesian model comparison
(2.5 hours to learn)
Summary
The framework of Bayesian model comparison evaluates probabilistic models based on the marginal likelihood, i.e. the probability they assign to a dataset with all the parameters marginalized out. The marginalization of model parameters implements a sort of "Occam's razor" effect. Marginal likelihoods can also be used to compute a posterior over model classes using Bayes' rule.
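As a concrete illustration of a marginal likelihood, here is a minimal sketch for the beta-Bernoulli model (coin flips with a Beta prior on the heads probability), where the integral over the parameter has a closed form. The function names are made up for this example.

```python
from math import exp, lgamma

def log_beta(a, b):
    # Log of the Beta function: B(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b)
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_marginal_likelihood(heads, tails, alpha=1.0, beta=1.0):
    """Log marginal likelihood of coin-flip data under a beta-Bernoulli model.

    Integrating the Bernoulli likelihood against a Beta(alpha, beta) prior
    on the heads probability theta gives
        p(D) = B(alpha + heads, beta + tails) / B(alpha, beta).
    """
    return log_beta(alpha + heads, beta + tails) - log_beta(alpha, beta)

# With a uniform prior (alpha = beta = 1), the model spreads its probability
# mass over every possible dataset, which is the source of the Occam penalty
# paid by more flexible models.
print(exp(log_marginal_likelihood(heads=3, tails=3)))  # 1/140 ~ 0.00714
```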
Context
This concept has the prerequisites:
- Bayesian parameter estimation (Most techniques for Bayesian model comparison involve estimating the parameters as well.)
- Bayes' rule (Bayes' rule is used to compute a posterior over models from the prior and marginal likelihoods.)
Goals
- Know what the marginal likelihood of a model refers to
- Motivate the marginal likelihood in terms of Bayes factors
- Understand the basis for the "Bayesian Occam's razor" effect (hint: it's not primarily a result of assigning lower prior probability to models with more parameters, as many people believe)
- Derive the Bayes factor for a simple example (e.g. a beta-Bernoulli model)
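The last goal can be checked numerically. Below is a hedged sketch comparing a beta-Bernoulli model against a fixed fair-coin model; the Bayes factor is the ratio of their marginal likelihoods. The model choices and function names are assumptions for this example, not a prescribed exercise.

```python
from math import exp, lgamma, log

def log_beta(a, b):
    # Log of the Beta function: B(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b)
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_bayes_factor(heads, tails, alpha=1.0, beta=1.0):
    """Log Bayes factor for M1 (Beta(alpha, beta) prior on theta)
    versus M0 (theta fixed at 0.5, i.e. a fair coin)."""
    # Marginal likelihood of M1: B(alpha + heads, beta + tails) / B(alpha, beta)
    log_ml_m1 = log_beta(alpha + heads, beta + tails) - log_beta(alpha, beta)
    # Marginal likelihood of M0: every sequence of N flips has probability 0.5^N
    log_ml_m0 = (heads + tails) * log(0.5)
    return log_ml_m1 - log_ml_m0

# Balanced data favors the simpler fair-coin model (Bayes factor < 1)...
print(exp(log_bayes_factor(5, 5)))
# ...while lopsided data favors the more flexible model (Bayes factor > 1).
print(exp(log_bayes_factor(9, 1)))
```

Note that the flexible model loses on balanced data even though both models get equal prior probability: the Occam penalty comes from the marginalization, not from the prior over models.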
Core resources (read/watch one of the following)
-Free-
→ Information Theory, Inference, and Learning Algorithms
A graduate-level textbook on machine learning and information theory.
-Paid-
→ Machine Learning: a Probabilistic Perspective
A very comprehensive graduate-level machine learning textbook.
Location:
Section 5.3, pages 155-165
Supplemental resources (the following are optional, but you may find them useful)
-Paid-
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location:
Section 3.4, pages 161-165
See also
- From a Bayesian standpoint, it's better to average predictions over many models than to select just one. This is known as Bayesian model averaging.
- Several general classes of methods exist for estimating Bayes factors.