# Bayesian model comparison

(2.5 hours to learn)

## Summary

The framework of Bayesian model comparison evaluates probabilistic models based on the marginal likelihood, i.e. the probability they assign to a dataset with all the parameters marginalized out. The marginalization of model parameters implements a sort of "Occam's razor" effect. Marginal likelihoods can also be used to compute a posterior over model classes using Bayes' rule.
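To make the "Occam's razor" effect concrete, here is a small sketch (my own illustration, not taken from the resources below) comparing a coin-flip dataset under a simple model M0 (a fixed fair coin) and a more flexible model M1 (unknown bias with a uniform prior, marginalized out by grid integration):

```python
# M0: fixed fair coin, theta = 0.5.
# M1: theta unknown, uniform prior on [0, 1]; marginalize theta numerically.

def likelihood(theta, heads, tails):
    return theta ** heads * (1 - theta) ** tails

def marginal_likelihood_m1(heads, tails, grid=10000):
    # Approximate the integral of the likelihood against the uniform prior
    # with a midpoint rule on a fine grid.
    return sum(likelihood((i + 0.5) / grid, heads, tails)
               for i in range(grid)) / grid

heads, tails = 5, 5                    # data a fair coin explains well
m0 = likelihood(0.5, heads, tails)     # M0 has no parameters to marginalize
m1 = marginal_likelihood_m1(heads, tails)
print(m0 > m1)                         # True: the simpler model wins here
```

M1 can fit any dataset, but because it spreads its predictions over many possible datasets, it assigns less probability to this particular "unsurprising" one than M0 does. That is the Bayesian Occam's razor: the razor comes from the marginalization, not from penalizing the prior probability of M1 itself.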

## Context

This concept has the prerequisites:

- Bayesian parameter estimation (Most techniques for Bayesian model comparison involve estimating the parameters as well.)
- Bayes' rule (Bayes' rule is used to compute a posterior over models from the prior and marginal likelihoods.)

## Goals

- Know what the marginal likelihood of a model refers to

- Motivate the marginal likelihood in terms of Bayes factors

- Understand the basis for the "Bayesian Occam's razor" effect (hint: it's not primarily a result of assigning lower prior probability to models with more parameters, as many people believe)

- Derive the Bayes factor for a simple example (e.g. a beta-Bernoulli model)
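As a hedged sketch of the last goal (the specific model pair is my choice): for a Beta(a, b) prior on the bias of a Bernoulli model, integrating out the bias gives the closed-form marginal likelihood B(a + h, b + t) / B(a, b) for h heads and t tails, so the Bayes factor against a fixed fair coin can be computed directly:

```python
from math import lgamma, log, exp

def log_beta(a, b):
    # log of the Beta function B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_marginal_likelihood(heads, tails, a=1.0, b=1.0):
    # Beta-Bernoulli marginal likelihood: p(D) = B(a + h, b + t) / B(a, b),
    # from integrating theta^h (1 - theta)^t against a Beta(a, b) prior.
    return log_beta(a + heads, b + tails) - log_beta(a, b)

def bayes_factor(heads, tails):
    # Bayes factor for M1 (unknown bias, uniform prior) vs M0 (fair coin).
    log_m0 = (heads + tails) * log(0.5)
    return exp(log_marginal_likelihood(heads, tails) - log_m0)

print(bayes_factor(9, 1))   # > 1: lopsided data favors the flexible model
print(bayes_factor(5, 5))   # < 1: balanced data favors the fair coin
```

Working in log space with `lgamma` avoids the overflow that direct Gamma-function ratios would hit for larger datasets.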

## Core resources (read/watch one of the following)

## -Free-

→ Information Theory, Inference, and Learning Algorithms

A graduate-level textbook on machine learning and information theory.

## -Paid-

→ Machine Learning: a Probabilistic Perspective

A very comprehensive graduate-level machine learning textbook.

Location:
Section 5.3, pages 155-165

## Supplemental resources (the following are optional, but you may find them useful)

## -Paid-

→ Pattern Recognition and Machine Learning

A textbook for a graduate machine learning course, with a focus on Bayesian methods.

Location:
Section 3.4, pages 161-165

## See also

- From a Bayesian standpoint, it's better to average predictions over many models than to select just one. This is known as Bayesian model averaging.
- Some general classes of methods for estimating Bayes factors include: