Jensen's inequality

Summary

Jensen's inequality states that the expectation of a convex function of a random variable is at least as large as the function applied to the expectation: E[f(X)] ≥ f(E[X]) for convex f. It is used to prove the Rao-Blackwell theorem in statistics, and is the basis behind many algorithms for probabilistic inference, including Expectation-Maximization (EM) and variational inference.
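To make the statement concrete, here is a minimal numerical sketch (not from the resources below; the convex function f(x) = x² and the Gaussian distribution are chosen purely for illustration) that checks E[f(X)] ≥ f(E[X]) by Monte Carlo sampling:

```python
import numpy as np

# Minimal numerical check of Jensen's inequality for the convex
# function f(x) = x**2: E[f(X)] >= f(E[X]).
rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=2.0, size=100_000)  # samples of X ~ N(1, 4)

lhs = np.mean(x ** 2)   # Monte Carlo estimate of E[f(X)] = E[X^2]
rhs = np.mean(x) ** 2   # f applied to the estimate of E[X]

# For this choice of f, the gap lhs - rhs estimates Var(X) = 4,
# which is exactly the slack in Jensen's inequality.
print(f"E[f(X)] = {lhs:.3f}, f(E[X]) = {rhs:.3f}")
assert lhs >= rhs
```

For f(x) = x², the difference E[X²] − (E[X])² is the variance, so in this case Jensen's inequality is simply the statement that variance is nonnegative.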

Context

This concept has the prerequisites:

Core resources (we're sorry, we haven't finished tracking down resources for this concept yet)

Supplemental resources (the following are optional, but you may find them useful)

-Free-

Information Theory, Inference, and Learning Algorithms
A graduate-level textbook on machine learning and information theory.
Location: Section 2.7, "Jensen's inequality for convex functions," pages 35-36
Author: David MacKay

See also

  • Some uses of Jensen's inequality:
    • showing that the KL divergence, a measure of dissimilarity between probability distributions, is nonnegative (see the numerical sketch after this list)
    • showing that the EM algorithm [increases the likelihood function](expectation_maximization_variational_interpretation)
    • variational Bayes, a general framework for approximate inference in probabilistic models
    • the Rao-Blackwell theorem, which shows that estimators should depend on the data only through sufficient statistics if the loss function is convex
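As an illustration of the first item above, the following sketch (the distributions p and q are illustrative values, not from any of the resources) verifies numerically that the KL divergence between two discrete distributions is nonnegative; the standard proof applies Jensen's inequality to the convex function −log:

```python
import numpy as np

# KL(p || q) = sum_i p_i * log(p_i / q_i) for discrete distributions.
# Nonnegativity follows from Jensen's inequality with the convex
# function -log:
#   KL(p || q) = E_p[-log(q(X)/p(X))]
#             >= -log E_p[q(X)/p(X)] = -log(sum_i q_i) = -log 1 = 0.
p = np.array([0.2, 0.5, 0.3])  # illustrative distribution p
q = np.array([0.4, 0.4, 0.2])  # illustrative distribution q

kl = np.sum(p * np.log(p / q))
print(f"KL(p || q) = {kl:.4f}")  # approx 0.0946; zero only when p == q
assert kl >= 0
```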