rejection sampling
(30 minutes to learn)
Summary
Rejection sampling (RS) is a monte carlo method for sampling from a potentially complex distribution p(x) given a simpler distribution q(x). RS is applicable when we can evaluate p(x) easily and sample from q(x) easily. The basic idea is to scale q(x) by some constant K such that K*q(x) >= p*(x) for all x, where p(x) = 1/Z p*(x) for some constant Z (that is, we only need to know p(x) up to a normalization factor, which is common in practice). RS operates by sampling a value x0 from q(x) and then generating a uniform random number, u, between [0, K*q(x0)]. If u <= p*(x0) we keep x0 as a valid sample from p(x), else we discard x0.
Context
This concept has the prerequisites:
- Monte Carlo estimation
- conditional distributions (Rejection sampling commonly used to estimate conditional expectations.)
Core resources (read/watch one of the following)
-Free-
→ Machine learning summer school: Markov chain Monte Carlo (2009)
→ Information Theory, Inference, and Learning Algorithms
A graudate-level textbook on machine learning and information theory.
Additional dependencies:
- multivariate Gaussian distribution
-Paid-
→ Machine Learning: a Probabilistic Perspective
A very comprehensive graudate-level machine learning textbook.
Location:
Sections 23.3-23.3.3, pages 817-819
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location:
Section 11.1.2
Supplemental resources (the following are optional, but you may find them useful)
-Free-
→ Mathematical Monk Tutorials
→ Bayesian Reasoning and Machine Learning
A textbook for a graudate machine learning course.
See also
- Importance sampling is a way of getting weighted samples from a distribution and is useful in many of the same situations.
- Adaptive rejection sampling is a way to improve the sampler based on the rejected samples [ (go to concept)](adaptive_rejection_sampling)
- Other commonly used sampling methods include:
- Gibbs sampling , a generic and widely applicable sampling algorithm
- particle filters , which are useful for time series modeling
- Metropolis-Hastings algorithm , which is very general