the bootstrap
(4.8 hours to learn)
Summary
The bootstrap is a Monte Carlo technique for estimating variances or confidence intervals of statistical estimators. It uses the empirical distribution as a proxy for the true distribution, and measures the accuracy of the estimator on datasets resampled from the empirical distribution. It is widely applicable and doesn't require assuming a parametric form for the true distribution.
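The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `bootstrap_replicates` and `bootstrap_variance`, the default of 2000 resamples, and the fixed seed are all assumptions made for the example.

```python
import numpy as np

def bootstrap_replicates(data, estimator, n_resamples=2000, seed=0):
    """Evaluate `estimator` on datasets resampled from the empirical distribution.

    Hypothetical helper for illustration; `n_resamples` and `seed` are
    arbitrary choices, not values taken from the text.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = len(data)
    replicates = np.empty(n_resamples)
    for b in range(n_resamples):
        # Sampling n indices with replacement is the same as drawing
        # n points i.i.d. from the empirical distribution.
        idx = rng.integers(0, n, size=n)
        replicates[b] = estimator(data[idx])
    return replicates

def bootstrap_variance(data, estimator, n_resamples=2000, seed=0):
    """Estimate the variance of `estimator` from its bootstrap replicates."""
    return bootstrap_replicates(data, estimator, n_resamples, seed).var(ddof=1)

# Example: variance of the sample median, which has no simple closed form.
sample = np.random.default_rng(1).exponential(scale=2.0, size=200)
print(bootstrap_variance(sample, np.median))
```

The estimator is passed in as a function, so the same code covers a mean, a median, or any other statistic computed from a one-dimensional sample.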
Context
This concept has the prerequisites:
- expectation and variance (The bootstrap is often used for estimating the variance of an estimator.)
- Monte Carlo estimation (The bootstrap is a Monte Carlo estimator.)
Goals
- Know the procedures for both the parametric and nonparametric bootstrap
  - When would you choose one over the other?
  - Note: for the parametric bootstrap, it may help to know about a point estimator such as maximum likelihood, but you can treat this as a black box.
- Be able to use the bootstrap to:
  - estimate the variance of an estimator
  - compute a confidence interval for an estimator
- The nonparametric bootstrap introduces two sources of error: using the empirical distribution as a proxy for the true distribution, and repeatedly simulating from the empirical distribution. Which of these would you expect to be a larger source of error?
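As a rough sketch of the goals above, the following Python compares the two variants and builds a confidence interval from each. Several things here are assumptions made for illustration: the exponential model used for the parametric version (its maximum likelihood scale estimate is the sample mean), the percentile interval (only one of several bootstrap interval constructions), and all function names.

```python
import numpy as np

def nonparametric_replicates(data, estimator, n_resamples, rng):
    """Resample the observed data with replacement (the empirical distribution)."""
    n = len(data)
    return np.array([estimator(data[rng.integers(0, n, size=n)])
                     for _ in range(n_resamples)])

def parametric_replicates_exponential(data, estimator, n_resamples, rng):
    """Fit a model, then simulate new datasets from the fitted model.

    Assumption for this sketch: the data are modeled as exponential, so the
    maximum likelihood estimate of the scale is the sample mean.
    """
    scale_hat = data.mean()
    n = len(data)
    return np.array([estimator(rng.exponential(scale_hat, size=n))
                     for _ in range(n_resamples)])

def percentile_interval(replicates, alpha=0.05):
    """Percentile interval: the alpha/2 and 1 - alpha/2 quantiles of the replicates."""
    lo, hi = np.quantile(replicates, [alpha / 2, 1 - alpha / 2])
    return lo, hi

rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=200)

np_reps = nonparametric_replicates(data, np.median, 2000, rng)
p_reps = parametric_replicates_exponential(data, np.median, 2000, rng)

print("nonparametric variance:", np_reps.var(ddof=1))
print("nonparametric 95% interval:", percentile_interval(np_reps))
print("parametric variance:", p_reps.var(ddof=1))
print("parametric 95% interval:", percentile_interval(p_reps))
```

Rerunning with a larger `n_resamples` is one way to explore the last question: it reduces only the simulation error, while the error from substituting the empirical (or fitted) distribution for the true one depends on the size of the original sample.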
Core resources (read/watch one of the following)
-Free-
→ CMU 36-402, Advanced data analysis: the bootstrap
- Section 1, "Stochastic models, uncertainty, sampling distributions," pages 2-4
- Section 2, "The bootstrap principle," pages 4-15
- Section 3, "Non-parametric bootstrapping," pages 15-18
-Paid-
→ All of Statistics
A very concise introductory statistics textbook.
- Chapter 8, "The bootstrap," pages 107-115
- Section 9.11, "The parametric bootstrap," pages 134-135
Supplemental resources (the following are optional, but you may find them useful)
-Paid-
→ Probability and Statistics
An introductory textbook on probability theory and statistics.
- Section 12.6, "The bootstrap," pages 839-849