asymptotics of maximum likelihood
(3.4 hours to learn)
Summary
Under certain regularity conditions, the maximum likelihood estimator is consistent: it converges to the true parameter value as the sample size grows. Its sampling distribution (when rescaled appropriately) approaches a normal distribution whose variance is the inverse of the Fisher information. By the Cramér-Rao bound, this is the best any unbiased estimator can do asymptotically. The asymptotic analysis is also useful for constructing confidence intervals for parameter estimates.
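Stated precisely (a standard formulation of the result, not tied to any one resource below; here $n$ is the sample size, $\theta_0$ the true parameter, $\hat{\theta}_n$ the MLE, and $I(\theta)$ the Fisher information of a single observation):

```latex
% Asymptotic normality of the MLE:
\sqrt{n}\,\bigl(\hat{\theta}_n - \theta_0\bigr) \xrightarrow{d} \mathcal{N}\bigl(0,\; I(\theta_0)^{-1}\bigr)

% which justifies the approximate (Wald) confidence interval
\hat{\theta}_n \;\pm\; z_{\alpha/2} \big/ \sqrt{n\, I(\hat{\theta}_n)}
```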
Context
This concept has the prerequisites:
- maximum likelihood
- Fisher information (The asymptotic covariance of the maximum likelihood estimate is determined by Fisher information.)
- Gaussian distribution (The maximum likelihood estimate of the parameters is approximately Gaussian in the large data limit.)
Goals
- Understand basic properties of maximum likelihood estimators:
- they are consistent (they converge to the true parameter value as the sample size grows)
- asymptotically, their sampling distribution (rescaled appropriately) approaches a normal distribution whose variance is the inverse Fisher information (see the simulation sketch after this list)
- they are efficient (no unbiased estimator has smaller variance asymptotically)
- Note: for the multivariate version of the asymptotic normality result, you'll want to know about the Fisher information matrix.
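To make the consistency and normality claims concrete, here is a minimal simulation sketch (my own illustration, not taken from the resources below) for an exponential model, where the MLE of the rate is $1/\bar{x}$ and the per-observation Fisher information is $I(\lambda) = 1/\lambda^2$:

```python
# Illustrative sketch: for X_i ~ Exponential(rate = lam), the MLE is
# lam_hat = 1 / x_bar, and sqrt(n) * (lam_hat - lam) should be approximately
# N(0, 1 / I(lam)) = N(0, lam**2) for large n.
import numpy as np

rng = np.random.default_rng(0)
true_rate = 2.0
n, reps = 1000, 5000

# Draw `reps` independent datasets of size n (NumPy parameterizes by scale = 1/rate).
samples = rng.exponential(scale=1.0 / true_rate, size=(reps, n))
mle = 1.0 / samples.mean(axis=1)          # MLE of the rate for each dataset

# Rescaled errors; their spread should match the asymptotic standard deviation.
rescaled = np.sqrt(n) * (mle - true_rate)
print(f"empirical sd:  {rescaled.std():.3f}")
print(f"asymptotic sd: {true_rate:.3f}")  # sqrt(1 / I(lam)) = lam
```

With n = 1000 the two printed values should nearly agree, illustrating that the sampling variance is governed by the inverse Fisher information.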
Core resources (read/watch one of the following)
-Paid-
→ Mathematical Statistics and Data Analysis
An undergraduate statistics textbook.
- Section 8.5.2, "Large sample theory for maximum likelihood estimators," pages 274-279
- Section 8.5.3, "Confidence intervals for maximum likelihood estimates," pages 279-285
- Section 8.7, "Efficiency and the Cramér-Rao lower bound," not including 8.7.1, pages 298-302
→ All of Statistics
A very concise introductory statistics textbook.
- Section 9.4, "Properties of maximum likelihood estimators," pages 124-126
- Section 9.5, "Consistency of maximum likelihood estimators," pages 126-127
- Section 9.7, "Asymptotic normality," pages 128-130
- Section 9.8, "Optimality," pages 130-131
Other notes:
- Section 9.10 gives the multivariate generalizations of these results.
Supplemental resources (the following are optional, but you may find them useful)
-Paid-
→ Probability and Statistics
An introductory textbook on probability theory and statistics.
- Section 7.6, "Properties of maximum likelihood," subsection "Consistency," pages 427-428
- Section 8.8, "Fisher information," pages 514-528
See also
- Because of the Cramér-Rao bound (stated after this list), no unbiased estimator can achieve lower variance than the ML estimator attains asymptotically.
- The ML estimator depends on the data only through the sufficient statistics of the distribution.
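For reference, the bound in question, in its standard form for an unbiased estimator $\hat{\theta}$ based on $n$ i.i.d. observations:

```latex
\operatorname{Var}(\hat{\theta}) \;\ge\; \frac{1}{n\, I(\theta)}
```

The MLE is not unbiased in general, but it attains this variance asymptotically, which is the sense in which it is efficient.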