entropy

(1.9 hours to learn)

Summary

Entropy is a measure of the average information content of a random variable, and one of the fundamental quantities of information theory. It gives the minimum expected code length needed to losslessly encode samples of the random variable.
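To make the definition concrete, here is a minimal Python sketch (the helper name `entropy` is mine, not taken from the resources below) that computes H(X) = -Σ p(x) log₂ p(x) for a discrete distribution; by Shannon's source coding theorem this value, in bits, is the minimum expected code length per sample.

```python
import math

def entropy(probs, base=2):
    """Illustrative sketch: Shannon entropy H(X) = -sum(p * log p) of a
    discrete distribution. `probs` is a sequence of probabilities summing
    to 1; zero-probability outcomes contribute nothing (0 log 0 = 0)."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A fair coin carries 1 bit of information per flip ...
print(entropy([0.5, 0.5]))    # 1.0
# ... while a heavily biased coin carries much less.
print(entropy([0.9, 0.1]))    # ~0.469
# A uniform distribution over 4 outcomes needs 2 bits per sample.
print(entropy([0.25] * 4))    # 2.0
```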

Context

This concept has the prerequisites:

Goals

  • Understand the notion of entropy of a discrete random variable.
  • Determine the largest possible entropy of a discrete random variable that takes on r possible values.
  • Know the definitions of joint entropy and conditional entropy.
  • Derive the chain rule for writing joint entropy as a sum of conditional entropies (a derivation is sketched after this list).
  • Show that the entropy of a set of independent random variables is the sum of the individual entropies.
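The chain rule and additivity goals follow from the factorization p(x, y) = p(x) p(y | x); the following is a brief sketch in my own notation, not MacKay's presentation (his Section 2.5 treats the decomposition in detail).

```latex
% Chain rule for joint entropy, using p(x,y) = p(x)\,p(y \mid x):
\begin{align*}
H(X,Y) &= -\sum_{x,y} p(x,y) \log p(x,y) \\
       &= -\sum_{x,y} p(x,y) \bigl[\log p(x) + \log p(y \mid x)\bigr] \\
       &= -\sum_{x} p(x) \log p(x) \;-\; \sum_{x,y} p(x,y) \log p(y \mid x) \\
       &= H(X) + H(Y \mid X).
\end{align*}
% When X and Y are independent, p(y \mid x) = p(y), so H(Y \mid X) = H(Y) and
% the joint entropy is the sum of the individual entropies. For a variable with
% r possible values, Jensen's inequality gives H(X) \le \log r, with equality
% exactly when the distribution is uniform.
```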

Core resources (read/watch one of the following)

-Free-

Information Theory, Inference, and Learning Algorithms
A graduate-level textbook on machine learning and information theory.
  • Section 2.4, "Definition of entropy and related functions," pages 32-33
  • Section 2.5, "Decomposability of the entropy," pages 33-34
  • Section 4.1, "How to measure the information content of a random variable," pages 67-73
Author: David MacKay

-Paid-

Supplemental resources (the following are optional, but you may find them useful)

-Free-

Course on Information Theory, Pattern Recognition, and Neural Networks
Video lectures on machine learning and information theory.
Author: David MacKay

-Paid-

See also