Mutual information

(2.1 hours to learn)


Mutual information is a measure of the amount of information one random variable conveys about another. It is one of the fundamental quantities of information theory, and determines the rate at which information can be conveyed over a noisy channel.


Goals:


  • Know the definition of mutual information (in terms of the difference between entropy and conditional entropy)
  • Derive some basic properties of mutual information:
    • that it is symmetric
    • that the mutual information of a random variable with itself is the entropy
    • that it is nonnegative
    • that it is zero for independent random variables
  • Know various ways joint entropy decomposes into sums of conditional entropies and mutual information
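The properties listed above can be checked numerically from the identity I(X;Y) = H(X) + H(Y) − H(X,Y). Here is a minimal sketch (function names are illustrative, not from any particular library) that computes mutual information for a discrete joint distribution and verifies two of the properties: independence gives zero, and perfect correlation gives the full entropy.

```python
import math

def entropy(dist):
    """Shannon entropy (in bits) of a distribution given as a dict of probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), for a joint distribution
    given as a dict mapping (x, y) pairs to probabilities.
    Symmetry in X and Y is immediate from this formula."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return entropy(px) + entropy(py) - entropy(joint)

# Two independent fair coins: I(X;Y) = 0
indep = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
print(mutual_information(indep))   # 0.0

# Perfectly correlated fair coins (Y = X): I(X;Y) = H(X) = 1 bit
corr = {(0, 0): 0.5, (1, 1): 0.5}
print(mutual_information(corr))    # 1.0
```

The second example also illustrates the self-information property: when Y is a copy of X, the mutual information equals the entropy of X.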

Core resources (read/watch one of the following)


Supplemental resources (the following are optional, but you may find them useful)


Information Theory, Inference, and Learning Algorithms
A graduate-level textbook on machine learning and information theory.
Location: Section 8.1, "More about entropy," pages 138-140
Author: David MacKay


See also

  • Mutual information can also be defined in terms of [KL divergence](kl_divergence).
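
For reference, the KL-divergence formulation says that mutual information is the divergence between the joint distribution and the product of the marginals:

\[
I(X;Y) \;=\; D_{\mathrm{KL}}\bigl(p(x,y)\,\|\,p(x)\,p(y)\bigr) \;=\; \sum_{x,y} p(x,y)\,\log \frac{p(x,y)}{p(x)\,p(y)}.
\]

This form makes the nonnegativity and independence properties immediate: KL divergence is always nonnegative, and it is zero exactly when p(x,y) = p(x)p(y).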