# asymptotics of maximum likelihood

(3.4 hours to learn)

## Summary

Under certain regularity conditions, the maximum likelihood estimator is consistent, i.e. it converges to the true parameter value as the sample size grows. Its sampling distribution (when rescaled appropriately) approaches a normal distribution whose variance is the inverse Fisher information. By the Cramer-Rao bound, no unbiased estimator can achieve a smaller asymptotic variance. This asymptotic analysis is also useful for constructing confidence intervals for parameter estimates.
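As a quick illustration of both claims in the summary, here is a small simulation sketch (the exponential distribution, sample sizes, and variable names are illustrative choices, not from the resources below). For an exponential with rate lambda, the MLE is 1/(sample mean) and the Fisher information is 1/lambda^2, so the rescaled estimation error should have standard deviation close to lambda, and Wald intervals built from the plug-in standard error should cover the truth about 95% of the time:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 2.0    # true rate of the exponential (an assumed example value)
n = 5000     # observations per replication
reps = 2000  # number of simulated replications

# MLE of the exponential rate is 1 / (sample mean).
samples = rng.exponential(scale=1.0 / lam, size=(reps, n))
mle = 1.0 / samples.mean(axis=1)

# Asymptotic normality: sqrt(n) * (mle - lam) -> N(0, 1 / I(lam)).
# For the exponential, I(lam) = 1 / lam**2, so the limiting
# standard deviation is lam itself.
z = np.sqrt(n) * (mle - lam)
print("empirical std of rescaled error:", z.std())  # should be near lam

# Wald confidence intervals: mle +/- 1.96 * (estimated standard error),
# with plug-in standard error mle / sqrt(n).
se = mle / np.sqrt(n)
covered = (mle - 1.96 * se <= lam) & (lam <= mle + 1.96 * se)
print("coverage of nominal 95% intervals:", covered.mean())
```

The same recipe works for any regular model: replace the closed-form MLE and Fisher information with the model's own.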

## Context

This concept has the prerequisites:

- maximum likelihood
- Fisher information (The asymptotic covariance of the maximum likelihood estimate is determined by Fisher information.)
- Gaussian distribution (The maximum likelihood estimate of the parameters is approximately Gaussian in the large data limit.)

## Goals

- Understand basic properties of maximum likelihood estimators:
  - they are consistent (they approach the correct value in the limit)
  - asymptotically, their sampling distribution (rescaled appropriately) approaches a normal distribution whose variance is the inverse Fisher information
  - they are efficient (no unbiased estimator has smaller variance asymptotically)

- Note: for the multivariate version of the asymptotic normality result, you'll want to know about the Fisher information matrix.
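The asymptotic normality goal above is usually written compactly as follows (a standard statement, with $\hat{\theta}_n$ the MLE from $n$ i.i.d. observations and $\theta_0$ the true parameter):

```latex
\sqrt{n}\,\bigl(\hat{\theta}_n - \theta_0\bigr) \xrightarrow{\,d\,} \mathcal{N}\!\bigl(0,\; I(\theta_0)^{-1}\bigr)
```

In the multivariate case, $I(\theta_0)$ is the Fisher information matrix and $I(\theta_0)^{-1}$ its matrix inverse.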

## Core resources (read/watch one of the following)

### -Paid-

→ Mathematical Statistics and Data Analysis

An undergraduate statistics textbook.

- Section 8.5.2, "Large sample theory for maximum likelihood estimators," pages 274-279
- Section 8.5.3, "Confidence intervals for maximum likelihood estimates," pages 279-285
- Section 8.7, "Efficiency and the Cramer-Rao lower bound," not including 8.7.1, pages 298-302

→ All of Statistics

A very concise introductory statistics textbook.

- Section 9.4, "Properties of maximum likelihood estimators," pages 124-126
- Section 9.5, "Consistency of maximum likelihood estimators," pages 126-127
- Section 9.7, "Asymptotic normality," pages 128-130
- Section 9.8, "Optimality," pages 130-131

Other notes:

- Section 9.10 gives the multivariate generalizations of these results.

## Supplemental resources (the following are optional, but you may find them useful)

### -Paid-

→ Probability and Statistics

An introductory textbook on probability theory and statistics.

- Section 7.6, "Properties of maximum likelihood," subsection "Consistency," pages 427-428
- Section 8.8, "Fisher information," pages 514-528

## See also

- Because of the Cramer-Rao bound, no unbiased estimator can achieve lower variance than the ML estimator.
- The ML estimator depends only on sufficient statistics of the distribution.