Jensen's inequality states that the expectation of a convex function of a random variable is at least as large as the function evaluated at the expectation. It is used to prove the Rao-Blackwell theorem in statistics, and it underlies many algorithms for probabilistic inference, including Expectation-Maximization (EM) and variational inference.
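Written out compactly, for a random variable $X$ with finite expectation and a convex function $f$:

```latex
% Jensen's inequality: the function of the mean is bounded by the
% mean of the function when f is convex.
\[
  f\bigl(\mathbb{E}[X]\bigr) \;\le\; \mathbb{E}\bigl[f(X)\bigr],
  \qquad f \text{ convex}.
\]
% For a concave function (e.g. the logarithm) the inequality reverses:
%   f(E[X]) >= E[f(X)].
```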
This concept has the prerequisites:
Core resources (we're sorry, we haven't finished tracking down resources for this concept yet)
Supplemental resources (the following are optional, but you may find them useful)
→ Information Theory, Inference, and Learning Algorithms
A graduate-level textbook on machine learning and information theory.
Location: Section 2.7, "Jensen's inequality for convex functions," pages 35-36
→ Elements of Information Theory
A graduate-level textbook on information theory.
Location: Section 2.6, "Jensen's inequality and its consequences," up to Theorem 2.6.2, pages 25-27
→ A First Course in Probability
An introductory probability textbook.
Location: Section 8.5, "Other inequalities," page 453
- Some uses of Jensen's inequality:
- showing that the KL divergence, a measure of distance between probability distributions, is nonnegative (see the sketch after this list)
- showing that the EM algorithm [increases the likelihood function](expectation_maximization_variational_interpretation)
- variational Bayes, a general framework for approximate inference in probabilistic models
- the Rao-Blackwell theorem, which shows that estimators should depend on the data only through sufficient statistics if the loss function is convex
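As a sketch of the first use above: applying Jensen's inequality to the concave logarithm (so the inequality reverses) gives the nonnegativity of the KL divergence between two distributions $p$ and $q$ (summing over the support of $p$):

```latex
% Nonnegativity of KL divergence via Jensen's inequality.
% Since log is concave, E[log Y] <= log E[Y]:
\[
  -\mathrm{KL}(p \,\|\, q)
  = \sum_x p(x)\,\log\frac{q(x)}{p(x)}
  \;\le\; \log\!\sum_x p(x)\,\frac{q(x)}{p(x)}
  = \log\!\sum_x q(x)
  = \log 1
  = 0,
\]
% hence KL(p || q) >= 0, with equality exactly when p = q.
```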