the evidence approximation
(1 hour to learn)
The evidence approximation is an approximation to Bayesian parameter estimation and model comparison. Rather than integrating out the model hyperparameters, the hyperparameters are chosen to maximize the marginal likelihood (evidence) of the data.
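As a concrete illustration (not taken from the resources below, and using hypothetical toy data), the following sketch applies the evidence approximation to Bayesian linear regression: the prior precision alpha and noise precision beta are chosen by maximizing the log marginal likelihood, following the form of the evidence given in PRML Section 3.5, instead of being integrated out.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical toy data: noisy 1-D linear regression with a bias feature.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 30)
Phi = np.column_stack([np.ones_like(x), x])       # design matrix
t = 0.5 + 2.0 * x + rng.normal(scale=0.2, size=x.shape)

def neg_log_evidence(log_params, Phi, t):
    """Negative log marginal likelihood p(t | alpha, beta) for Bayesian
    linear regression, with the prior precision alpha and noise precision
    beta parameterized on the log scale to keep them positive."""
    alpha, beta = np.exp(log_params)
    N, M = Phi.shape
    A = alpha * np.eye(M) + beta * Phi.T @ Phi    # posterior precision
    m = beta * np.linalg.solve(A, Phi.T @ t)      # posterior mean
    E = 0.5 * beta * np.sum((t - Phi @ m) ** 2) + 0.5 * alpha * m @ m
    log_ev = (0.5 * M * np.log(alpha) + 0.5 * N * np.log(beta) - E
              - 0.5 * np.linalg.slogdet(A)[1] - 0.5 * N * np.log(2 * np.pi))
    return -log_ev

# Evidence approximation: pick the hyperparameters that maximize the evidence.
res = minimize(neg_log_evidence, x0=np.log([1.0, 1.0]), args=(Phi, t))
alpha_hat, beta_hat = np.exp(res.x)
print(alpha_hat, beta_hat)
```

Here the gradient-free route of handing the log evidence to a generic optimizer is only a sketch; the textbook treatments derive fixed-point update equations for alpha and beta that serve the same purpose.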
This concept has the prerequisites:
Core resources (read/watch one of the following)
→ Machine Learning: a Probabilistic Perspective
A very comprehensive graduate-level machine learning textbook.
Location: Section 5.6, pages 172-176
- Bayesian parameter estimation: Gaussian distribution
Supplemental resources (the following are optional, but you may find them useful)
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location: Sections 3.5-3.5.2, pages 165-169
- Bayesian linear regression
- The evidence approximation requires integrating out the model parameters. Variational Bayes gives a way of doing this.
- An alternative to the evidence approximation is to define a prior over the hyperparameters and integrate them out. This is an instance of hierarchical Bayesian modeling.
- The evidence approximation can be computed exactly in the case of Gaussian processes; a minimal sketch follows this list.
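To make the Gaussian process case concrete, here is a minimal sketch (with toy data and an RBF-plus-noise kernel assumed, not taken from the resources above) that evaluates the exact GP log marginal likelihood and maximizes it over the kernel hyperparameters, i.e., type-II maximum likelihood. Unlike the linear-regression example, no approximation of the evidence is needed: the function values are integrated out in closed form.

```python
import numpy as np
from scipy.optimize import minimize

def gp_log_marginal_likelihood(log_hyp, X, y):
    """Exact log marginal likelihood of a GP with an RBF kernel plus
    Gaussian observation noise; the hyperparameters (signal variance,
    lengthscale, noise variance) are on the log scale."""
    sig2, ell, noise2 = np.exp(log_hyp)
    d2 = (X[:, None] - X[None, :]) ** 2
    K = sig2 * np.exp(-0.5 * d2 / ell**2) + noise2 * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))   # K^{-1} y
    return (-0.5 * y @ alpha - np.sum(np.log(np.diag(L)))
            - 0.5 * len(X) * np.log(2 * np.pi))

# Hypothetical toy data: noisy samples from a sine curve.
rng = np.random.default_rng(0)
X = np.linspace(0, 5, 40)
y = np.sin(X) + rng.normal(scale=0.1, size=X.shape)

# Type-II maximum likelihood: choose the kernel hyperparameters by
# maximizing the exact evidence.
res = minimize(lambda h: -gp_log_marginal_likelihood(h, X, y),
               x0=np.zeros(3))
print(np.exp(res.x))   # fitted signal variance, lengthscale, noise variance
```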