asymptotics of maximum likelihood
(3.4 hours to learn)
Under certain regularity conditions, the maximum likelihood estimator is consistent, i.e. it converges to the true parameter value as the sample size grows. Its sampling distribution (when rescaled appropriately) approaches a normal distribution whose variance is determined by the Fisher information. By the Cramér-Rao bound, this is asymptotically the best achievable variance. The asymptotic analysis is also useful for constructing confidence intervals for parameter estimates.
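As a quick sanity check of the asymptotic normality claim (this simulation is my own illustration, not from the readings, and assumes NumPy is available), the sketch below repeatedly computes the MLE of an exponential rate parameter and compares its empirical spread to the standard deviation predicted by the inverse Fisher information:

```python
import numpy as np

rng = np.random.default_rng(0)
lam, n, reps = 2.0, 5000, 2000

# For Exponential(lam) data, the MLE of the rate is 1 / sample mean.
mles = np.array(
    [1.0 / rng.exponential(1.0 / lam, n).mean() for _ in range(reps)]
)

# Fisher information for one observation is I(lam) = 1 / lam**2,
# so the asymptotic std of the MLE is sqrt(1 / (n * I(lam))) = lam / sqrt(n).
asymptotic_std = lam / np.sqrt(n)

print("empirical mean of MLEs:", mles.mean())      # close to lam (consistency)
print("empirical std:", mles.std())
print("predicted std:", asymptotic_std)
```

The empirical standard deviation of the 2000 simulated MLEs should closely match lam / sqrt(n), and the histogram of the MLEs would look approximately Gaussian.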
This concept has the prerequisites:
- maximum likelihood
- Fisher information (The asymptotic covariance of the maximum likelihood estimate is determined by Fisher information.)
- Gaussian distribution (The maximum likelihood estimate of the parameters is approximately Gaussian in the large data limit.)
The goal is to understand basic properties of maximum likelihood estimators:
- they are consistent (they approach the correct value in the limit)
- asymptotically, their sampling distribution (rescaled appropriately) approaches a normal distribution whose variance is the inverse Fisher information
- they are efficient (no unbiased estimator has smaller variance asymptotically)
- Note: for the multivariate version of the asymptotic normality result, you'll want to know about the Fisher information matrix.
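A sketch of how the asymptotic normality result is used in practice (my own illustration, with NumPy as an assumed dependency): a Wald-style 95% confidence interval for a Bernoulli parameter, plugging the MLE into the Fisher information I(p) = 1 / (p(1 - p)):

```python
import numpy as np

rng = np.random.default_rng(1)
p_true, n = 0.3, 10000
x = rng.binomial(1, p_true, n)

p_hat = x.mean()  # MLE of the Bernoulli parameter

# Asymptotic variance of the MLE is 1 / (n * I(p)); estimate it at p_hat.
se = np.sqrt(p_hat * (1.0 - p_hat) / n)

# 1.96 is the standard normal 97.5% quantile, giving a 95% interval.
lo_, hi_ = p_hat - 1.96 * se, p_hat + 1.96 * se
print(f"95% CI for p: ({lo_:.4f}, {hi_:.4f})")
```

By the asymptotic normality of the MLE, an interval constructed this way covers the true parameter with probability approaching 95% as n grows.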
Core resources (read/watch one of the following)
→ Mathematical Statistics and Data Analysis
An undergraduate statistics textbook.
- Section 8.5.2, "Large sample theory for maximum likelihood estimators," pages 274-279
- Section 8.5.3, "Confidence intervals for maximum likelihood estimates," pages 279-285
- Section 8.7, "Efficiency and the Cramér-Rao lower bound," not including 8.7.1, pages 298-302
→ All of Statistics
A very concise introductory statistics textbook.
- Section 9.4, "Properties of maximum likelihood estimators," pages 124-126
- Section 9.5, "Consistency of maximum likelihood estimators," pages 126-127
- Section 9.7, "Asymptotic normality," pages 128-130
- Section 9.8, "Optimality," pages 130-131
- Section 9.10 gives the multivariate generalizations of these results.
Supplemental resources (the following are optional, but you may find them useful)
→ Probability and Statistics
An introductory textbook on probability theory and statistics.
- Section 7.6, "Properties of maximum likelihood," subsection "Consistency," pages 427-428
- Section 8.8, "Fisher information," pages 514-528
- Because of the Cramér-Rao bound, no unbiased estimator can asymptotically achieve lower variance than the ML estimator.
- The ML estimator depends only on sufficient statistics of the distribution.