Jensen's inequality


Jensen's inequality states that, for a convex function, the expectation of the function applied to a random variable is at least as large as the function applied to the expectation. It is used to prove the Rao-Blackwell theorem in statistics, and it is the basis for many algorithms for probabilistic inference, including Expectation-Maximization (EM) and variational inference.
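For quick reference, one common statement of the inequality (a minimal sketch, glossing over the exact conditions on the random variable $X$):

$$f(\mathbb{E}[X]) \le \mathbb{E}[f(X)] \quad \text{for any convex function } f,$$

with the inequality reversed for concave functions. In the discrete case this reads $f\!\left(\sum_i p_i x_i\right) \le \sum_i p_i f(x_i)$ for weights $p_i \ge 0$ that sum to 1.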


This concept has the prerequisites:

Core resources (we're sorry, we haven't finished tracking down resources for this concept yet)

Supplemental resources (the following are optional, but you may find them useful)


Information Theory, Inference, and Learning Algorithms
A graduate-level textbook on machine learning and information theory.
Location: Section 2.7, "Jensen's inequality for convex functions," pages 35-36
Author: David MacKay


See also

  • Some uses of Jensen's inequality:
    • showing that the KL divergence, a measure of the difference between probability distributions, is nonnegative (see the sketch after this list)
    • showing that the EM algorithm [increases the likelihood function](expectation_maximization_variational_interpretation)
    • variational Bayes, a general framework for approximate inference in probabilistic models
    • the Rao-Blackwell theorem, which shows that estimators should depend on the data only through sufficient statistics if the loss function is convex
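
As an illustration of the first use listed above, here is the standard short derivation of the nonnegativity of the KL divergence from Jensen's inequality, sketched for discrete distributions $p$ and $q$ with $p(x) > 0$ and $q(x) > 0$ for every $x$ (so both sums run over the same support):

$$D_{\mathrm{KL}}(p \,\|\, q) = \sum_x p(x) \log \frac{p(x)}{q(x)} = -\sum_x p(x) \log \frac{q(x)}{p(x)} \;\ge\; -\log \sum_x p(x)\, \frac{q(x)}{p(x)} = -\log \sum_x q(x) = -\log 1 = 0,$$

where the inequality step applies Jensen's inequality to the convex function $-\log$.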